PathTrace

PathTrace is an efficient algorithm for parsimony-based reconstructions of the evolutionary history of individual metabolic pathways. DOI

Requirements

Perl (verified to work in v5.24.1)
- Requires the Tree::Simple module for successful execution.

Compatible with all major Operating Systems (verified to work with Linux Ubuntu 16.04 LTS and Windows 10)

Installing and Running PathTrace on the sample input

Download or clone the repository.
Go to the directory you cloned (or uncompressed) the application.
Run the command:

perl pathTrace.pl input/pathTraceInput

Creating the BLAST-DB

In order to run pathTrace with the provided demo data, the BLAST-DB necessary for detecting the homologs must be initially constructed.

The FASTA file with all target sequences used in the Case Study are available on FigShare here. After downloading and uncompressing the data, the BLAST-able database can be constructed with the following command.

makeblastdb -in PathTrace-Demo-Target.fasta -parse_seqids -dbtype prot -title bacteria_ensembl_DB -out bacteria_ensembl_DB

The path of the final DB should be listed within the input file of PathTrace.

Description of the Sample input

The demo folder of PathTrace comprises of the following four files:

pathTraceInput

This is the main input file, and essentially points to the individual files necessary for a successful pathTrace execution, ordered as follows:

Query Pathway in BioPAX format
A file containing the target genomes
The BLAST-able database that will be used as the basis of the homology
A tree (phylogenetic or taxonomic) of the target genomes

An example of the pathTrace input file is the following:

input/Sample_Pathway_Lysine_BioPAX_L3.owl
input/genomeList
BLAST-DB/bacteria_ensembl_DB
input/genomeTree.nodes

The BLAST-able database entry should correspond to the location defined in the previous step (Creating the BLAST-DB)

genomeList

This is a list of the target genomes, i.e. the genomes against which the inference of presence or absence will be performed. The file has 4 tab-delimited columns that correspond to (a) the incremental number of the genome, (b) the full name of the genome (c) the CoGENT-like code of the genome and (d) the grouping based on common pangenome.

An example of the genomeList input file is the following:

1  P_abyssi                   PABY-XXX  1
2  P_horikoshii               PHOR-XXX  1
3  S_pneumoniae_70585         SPNE-705  2
4  S_pyogenes_sf370           SPYO-SF3  2
5  B_anthracis_ames_ancestor  BANT-AMA  3
6  B_subtilis                 BSUB-XXX  3
7  B_aphidicola_5a            BAPH-5AX  4
8  B_aphidicola_schizaphis    BAPH-SCH  4
9  E_coli_dh10b               ECOL-DH1  5
10 E_coli_k12                 ECOL-K12  5

genomeTree.nodes

The tree file format is historical/legacy, and can be easily be converted from a Newick tree format input file; the first column is node name, the second is parent name, and leaf nodes of the tree are marked with the third column with undef value in it. The root node does not have a parent and there should obviously be only one root. As an example, see the tree structure below:

node8                    	                         	      
node0       node8                    	       
PABY-XXX    node0    undef  
PHOR-XXX    node0    undef

Sample_Pathway_Lysine_BioPAX_L3.owl

A level-3 BioPAX file, containing the target pathway in the analysis. This can be downloaded directly from BioCyc.

List of relevant FigShare files

PathTrace Demo (Query): A FASTA-formated file containing 86 enzymes that participate in the nine metabolic pathways in the Case Study of the PathTrace tool.
- DOI: 10.6084/m9.figshare.5426710.v1
PathTrace Demo (Target): This is a snapshot of the Bacteria Ensembl database release 12, containing a total of 937,529 protein sequences from 249 species, organized into ten major collections. This data is used to construct the target BLAST database for the demo Case Study of the PathTrace tool
- DOI: 10.6084/m9.figshare.5422852.v1
PathTrace Demo (BLAST output): This is the BLAST output file, produced by running BLAST for the PathTrace Demo (Query) FASTA file against PathTrace Demo (Target) sequences.
- DOI: 10.6084/m9.figshare.5426725.v1

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
figures-data		figures-data
geneTrace		geneTrace
input		input
lib		lib
output		output
BiopaxClient.jar		BiopaxClient.jar
CreateMatrix.pm		CreateMatrix.pm
LICENSE		LICENSE
PopulateNodes.pm		PopulateNodes.pm
ProcessNodeFiles.pm		ProcessNodeFiles.pm
README.md		README.md
pathTrace.pl		pathTrace.pl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PathTrace

Requirements

Installing and Running PathTrace on the sample input

Creating the BLAST-DB

Description of the Sample input

List of relevant FigShare files

About

Releases

Packages

Contributors 2

Languages

License

CGU-CERTH/PathTrace

Folders and files

Latest commit

History

Repository files navigation

PathTrace

Requirements

Installing and Running PathTrace on the sample input

Creating the BLAST-DB

Description of the Sample input

List of relevant FigShare files

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages