The term is an acronym of BIOlogical Scripts for PHylogeny Analyses.
It consists of six separate scripts:
** BuildDB ** – uses the NCBI taxonomy database to classify and separate sequences from a given list of FASTA-formatted sequences. Any rank from the NCBI taxonomy database can be used to filter the desired sequences.
** DUPWIPE ** – a shell script that uses scripts available at Scriptome to remove duplicate sequences from a file.
** SEARCH ** – searches for a complete taxonomic classification using a GI number, TAXID or scientific name.
** FASTAHDR ** – uses Bioperl components to rebuild the FASTA sequence header into a friendlier view, allowing custom fields to be inserted. It can also replace the FASTA header with a taxonomic classification.
The last two scripts are intended to build all the information necessary for a character-tracing study using Mesquite (http://mesquiteproject.org/).
** BUILDCHAR ** – takes a list of sequences in FASTA format as input, taxonomically classifies all sequences and uses the classifications as character states. It builds the NEXUS block used as input to the Mesquite software for simulations of character evolution on a given tree.
** BUILDPTP ** – has the same function as Buildchar, but it shuffles the character states n times to build the NEXUS file used for the modified PTP test.
You can update all databases using our automated script.
Download it, save it as an executable shell script, and run:
LINUX: > ./update_databases.sh
The duration of the update depends on your connection speed and your computer's power. On average it takes about 2-3 hours to download all files and generate the local database.
BuildDB uses the NCBI taxonomy database to classify and separate sequences from a given list of FASTA-formatted sequences. Any rank from the NCBI taxonomy database can be used to filter the desired sequences.
To use it, save it as a Perl script and call:
LINUX:> builddb.pl input.file format
BuildDB accepts all sequence formats supported by Bioperl.
You should customize the following part of the script to fit your needs.
if (($node->rank eq "superkingdom") && ($node->scientific_name eq "Bacteria"))
{
    # Uncomment the next line to print the rank and name of each match
    #print $node->rank,"\t", $node->scientific_name, "\n";
    # Write the matching sequence to the output file
    $seq_out->write_seq($seq);
    # Store the name in an array for counting purposes
    push(@in, $node->scientific_name);
}
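The rank/name test above can be illustrated outside Bioperl. The following is a minimal Python sketch, not the BuildDB code itself: the lineage table and the function names `matches` and `filter_ids` are hypothetical, and the lineages are toy values rather than data read from the NCBI taxonomy database.

```python
# Toy lineages (illustrative values only): each sequence ID maps to a list
# of (rank, scientific_name) pairs, from the highest rank downwards.
LINEAGES = {
    "seq1": [("superkingdom", "Bacteria"), ("phylum", "Proteobacteria")],
    "seq2": [("superkingdom", "Eukaryota"), ("phylum", "Chordata")],
    "seq3": [("superkingdom", "Bacteria"), ("phylum", "Firmicutes")],
}


def matches(lineage, rank, name):
    """Return True if any node in the lineage has the given rank and name."""
    return any(r == rank and n == name for r, n in lineage)


def filter_ids(lineages, rank, name):
    """Return the IDs whose lineage contains the requested rank/name pair."""
    return [seq_id for seq_id, lineage in lineages.items()
            if matches(lineage, rank, name)]


kept = filter_ids(LINEAGES, "superkingdom", "Bacteria")
print(kept)
```

Changing the two arguments (for example to "phylum" and "Chordata") plays the same role as editing the rank and scientific name in the Perl condition.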
TaxSearch is a Perl script that searches for a complete taxonomic classification using a GI number, TAXID or scientific name.
It runs from a shell and needs some additional local databases to run correctly.
To run TaxSearch:
To search for the taxonomic classification of Homo sapiens:
LINUX:> perl search.pl -n "homo sapiens"
To search for the classification of a GI number, you may use:
LINUX:> perl search.pl -g 220941669 -t -c
The result will be the complete taxonomic classification (due to the -t option) and the list of common names (due to the -c option).
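A complete classification is typically assembled by following parent links from a taxon up to the root. The Python sketch below shows that idea under stated assumptions: the `NODES` table is a hand-written, abbreviated toy (intermediate ranks are omitted), and `find_taxid` and `lineage` are hypothetical helpers, not functions of search.pl.

```python
# Toy taxonomy table: taxid -> (parent_taxid, rank, scientific_name).
# Abbreviated for illustration; the real NCBI tree has many more levels.
NODES = {
    1: (1, "no rank", "root"),
    9604: (1, "family", "Hominidae"),
    9605: (9604, "genus", "Homo"),
    9606: (9605, "species", "Homo sapiens"),
}


def find_taxid(name, nodes):
    """Look up a taxid by scientific name, ignoring case."""
    wanted = name.lower()
    for taxid, (_parent, _rank, sci_name) in nodes.items():
        if sci_name.lower() == wanted:
            return taxid
    return None


def lineage(taxid, nodes):
    """Walk parent links up to the root and return (rank, name) pairs,
    ordered from the root down to the requested taxon."""
    path = []
    while True:
        parent, rank, name = nodes[taxid]
        path.append((rank, name))
        if parent == taxid:  # the root is its own parent
            break
        taxid = parent
    return list(reversed(path))


taxid = find_taxid("homo sapiens", NODES)
for rank, name in lineage(taxid, NODES):
    print(f"{rank}\t{name}")
```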
FASTAHDR uses Bioperl components to rebuild the FASTA sequence header into a friendlier view, allowing custom fields to be inserted. It can also replace the FASTA header with a taxonomic classification.
FASTAHDR was primarily written to be used with NCBI FASTA input, but this can be easily customized.
Linux> script.pl file.fasta > outfile.fasta
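To make the header rewriting concrete, here is a minimal Python sketch. It assumes the old-style NCBI header layout `>gi|number|db|accession| description`; the function names, the template syntax and the example header are hypothetical and do not reflect how FASTAHDR itself is implemented.

```python
def parse_ncbi_header(header):
    """Split an old-style NCBI header such as
    '>gi|12345|ref|NP_000001.1| some description' into named fields.
    Returns None when the header does not follow that layout."""
    fields = header.lstrip(">").split("|")
    if len(fields) < 5 or fields[0] != "gi":
        return None
    return {
        "gi": fields[1],
        "db": fields[2],
        "accession": fields[3],
        "description": "|".join(fields[4:]).strip(),
    }


def rebuild_header(header, template="{accession} {description}"):
    """Rebuild a header from a template; unrecognized headers pass through."""
    parsed = parse_ncbi_header(header)
    if parsed is None:
        return header
    return ">" + template.format(**parsed)


original = ">gi|12345|ref|NP_000001.1| hypothetical protein"
print(rebuild_header(original))
print(rebuild_header(original, template="{gi}_{accession}"))
```

The template argument stands in for the custom fields mentioned above; headers that are not in the assumed NCBI layout are returned unchanged.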
DUPWIPE is a shell script that uses scripts available at Scriptome (http://archive.sysbio.harvard.edu/csb/resources/computational/scriptome/) to remove duplicate sequences from a file.
The script generates two files: .uni contains the unique FASTA sequences, and .diff lists all differences found.
To use it, save it as a shell script and call:
Linux> ./script.sh file.fasta
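The idea of separating unique sequences from duplicates can be sketched in a few lines of Python. This is not the Scriptome-based shell script: it assumes that "duplicate" means an identical sequence string (compared case-insensitively), and the function names are hypothetical. The exact contents of the real .diff file are not specified here, so the sketch simply returns the duplicate records.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_duplicates(records):
    """Keep the first record of each distinct sequence; collect the rest."""
    seen = set()
    unique, duplicates = [], []
    for header, seq in records:
        key = seq.upper()  # assumption: comparison ignores letter case
        if key in seen:
            duplicates.append((header, seq))
        else:
            seen.add(key)
            unique.append((header, seq))
    return unique, duplicates


example = ">a\nACGT\n>b\nacgt\n>c\nAC\nGG\n"
unique, duplicates = split_duplicates(read_fasta(example))
print([h for h, _ in unique], [h for h, _ in duplicates])
```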
Using a list of sequences in FASTA format as input, Buildchar taxonomically classifies all sequences and uses the classifications as character states.
It builds the NEXUS block used as input to the Mesquite software for simulations of character evolution on a given tree.
To use Buildchar:
LINUX:> perl rebuildfasta.pl infile > outfile
infile = input file in FASTA format
outfile = altered FASTA file
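As an illustration of turning classifications into character states, the Python sketch below writes a small NEXUS file with one standard-type character. It is an assumption-laden example, not Buildchar's actual output: the function name `build_nexus`, the choice of digits as state symbols and the bracketed comments recording the state mapping are all hypothetical.

```python
def build_nexus(states_by_taxon):
    """Build NEXUS text with one character whose states are the
    classification labels, encoded as the symbols 0, 1, 2, ..."""
    # NEXUS taxon labels may not contain bare spaces, so use underscores.
    taxa = [t.replace(" ", "_") for t in states_by_taxon]
    labels = sorted(set(states_by_taxon.values()))
    if len(labels) > 10:
        raise ValueError("this sketch supports at most 10 states (0-9)")
    symbol = {label: str(i) for i, label in enumerate(labels)}
    width = max(len(t) for t in taxa)

    lines = [
        "#NEXUS",
        "BEGIN TAXA;",
        f"  DIMENSIONS NTAX={len(taxa)};",
        "  TAXLABELS " + " ".join(taxa) + ";",
        "END;",
        "BEGIN CHARACTERS;",
        "  DIMENSIONS NCHAR=1;",
        '  FORMAT DATATYPE=STANDARD MISSING=? SYMBOLS="'
        + " ".join(symbol.values()) + '";',
    ]
    # Record the symbol-to-label mapping as NEXUS comments (square brackets).
    for label, sym in symbol.items():
        lines.append(f"  [{sym} = {label}]")
    lines.append("  MATRIX")
    for taxon, label in zip(taxa, states_by_taxon.values()):
        lines.append(f"  {taxon.ljust(width)} {symbol[label]}")
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"


states = {"seq1": "Bacteria", "seq2": "Eukaryota", "seq3": "Bacteria"}
nexus_text = build_nexus(states)
print(nexus_text)
```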
Buildptp has the same function as Buildchar, but it shuffles the character states n times to build the NEXUS file used for the modified PTP test (1 and 2).
If you have a slow computer, it may take a while to open the file generated by Buildptp.
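The shuffling step can be sketched as follows in Python. This only illustrates permuting character states across taxa n times; the function name and the `seed` parameter are hypothetical, and how Buildptp writes the shuffled characters into the NEXUS file is not covered.

```python
import random


def shuffled_replicates(states, n, seed=None):
    """Return n shuffled copies of the character states.
    The original list is left untouched; each replicate keeps the same
    state frequencies but assigns them to taxa in a random order."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    replicates = []
    for _ in range(n):
        copy = list(states)
        rng.shuffle(copy)
        replicates.append(copy)
    return replicates


observed = ["0", "0", "1", "1", "2"]
replicates = shuffled_replicates(observed, n=100, seed=42)
print(len(replicates), replicates[0])
```

Because every replicate becomes an extra character in the output, large values of n produce large files, which is consistent with the slow loading noted above.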
- Wahlberg N: The phylogenetics and biochemistry of host-plant specialization in Melitaeine butterflies (Lepidoptera: Nymphalidae). Evolution 2001, 55:522-537.
- Faith DP, Cranston PS: Could a cladogram this short have arisen by chance alone?: On permutation tests for cladistic structure. Cladistics 1991, 5:235-258.