Home
Welcome to the barapost wiki!
"Barapost" is a command-line toolkit for binning FASTA, FASTQ and FAST5 files (i.e. separating their sequences into different files) according to the taxonomic classification of the nucleotide sequences stored in them. Classification is performed by finding the most similar reference sequence in a nucleotide database, either remotely using the NCBI BLAST web service or on a local machine with the BLAST+ toolkit.
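To make the "most similar reference sequence" idea concrete, here is a minimal sketch in Python. It is not Barapost's own code: it assumes alignment results in the default BLAST+ tabular layout (`-outfmt 6`) and assumes that "most similar" means the hit with the highest bit score, which may differ from the criteria Barapost actually applies. The function name `best_hits` and all read and reference identifiers are hypothetical.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6) with default fields.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text):
    """Return {query_id: (subject_id, bitscore)}, keeping the top-scoring hit.

    Assumption: the best hit is the one with the highest bit score.
    """
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        query = row["qseqid"]
        score = float(row["bitscore"])
        if query not in best or score > best[query][1]:
            best[query] = (row["sseqid"], score)
    return best


# Hypothetical alignment results for two reads (all identifiers are made up).
example = (
    "read_1\tREF_A\t98.5\t1200\t18\t0\t1\t1200\t10\t1209\t0.0\t2100\n"
    "read_1\tREF_B\t91.0\t1150\t100\t3\t1\t1150\t5\t1154\t0.0\t1500\n"
    "read_2\tREF_B\t99.1\t900\t8\t0\t1\t900\t1\t900\t0.0\t1620\n"
)

for query, (subject, score) in best_hits(example).items():
    print(query, subject, score)
```

Each read ends up labelled with a single reference sequence, and the taxonomy of that reference is what the binning step would then use.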
Possible applications:
- Demultiplexing whole-genome sequencing reads (mostly "long" reads) without barcoding. This demultiplexing relies on taxonomic annotation, so the organisms of interest should be taxonomically distant (bear in mind the naive classification algorithm and lateral gene transfer). Usually it is enough for the organisms to belong to different genera.
- Genome assembly: Barapost can detect and remove contigs assembled from cross-talk reads. You might have seen such contigs: they are short, have low coverage, do not belong to the genome of interest, and should be removed.
- Comparing different sets of contigs of the same genome (see Example 7 on the barapost-local page).
- I dare to surmise that this list isn't complete :-)
- barapost-prober.py -- this script submits several sequences (i.e. only a part of your data set) to the NCBI BLAST server in order to determine which taxa are "present" in the data set. "barapost-prober.py" saves the accession numbers of the best hit(s) for each submitted sequence. Processing all sequences this way would take too much time, which leads us to "barapost-local.py".
- barapost-local.py -- this script first downloads the best hits "discovered" by "barapost-prober.py" from GenBank, then builds a database from the downloaded reference sequences on the local machine, and finally classifies the major part of the data against this database. "barapost-local.py" creates the database and "BLASTs" input sequences with the "BLAST+" toolkit.
- barapost-binning.py -- this script bins (divides into separate files) nucleotide sequences according to the results of "barapost-prober.py" and/or "barapost-local.py".
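The binning step can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of "barapost-binning.py": the helpers `read_fasta` and `bin_fasta`, the one-file-per-label naming scheme, and the `unclassified` fallback label are all assumptions made for this example, and the reads and labels are made up. It handles plain FASTA only.

```python
import os
import tempfile
from collections import defaultdict


def read_fasta(path):
    """Yield (header, sequence) pairs from a plain-text FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def bin_fasta(fasta_path, classification, outdir, unknown="unclassified"):
    """Append each record to <outdir>/<label>.fasta; return counts per label.

    `classification` maps a sequence ID (first word of the header) to a label.
    Records without a label go to the hypothetical `unknown` bin.
    """
    os.makedirs(outdir, exist_ok=True)
    counts = defaultdict(int)
    for header, seq in read_fasta(fasta_path):
        seq_id = header.split()[0]
        label = classification.get(seq_id, unknown)
        with open(os.path.join(outdir, label + ".fasta"), "a") as out:
            out.write(">" + header + "\n" + seq + "\n")
        counts[label] += 1
    return dict(counts)


# Toy demonstration with made-up reads and labels.
workdir = tempfile.mkdtemp()
reads = os.path.join(workdir, "reads.fasta")
with open(reads, "w") as handle:
    handle.write(">read_1 sample\nACGT\nACGT\n>read_2\nGGCC\n>read_3\nTTAA\n")

labels = {"read_1": "Escherichia", "read_2": "Bacillus"}
counts = bin_fasta(reads, labels, os.path.join(workdir, "bins"))
print(counts)
```

In this sketch the labels come from a plain dictionary; in the real workflow they would be derived from the classification results produced by the two scripts above.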