VTAM is a metabarcoding package whose commands process high-throughput sequencing (HTS) amplicon data in FASTQ format, from one or several metabarcoding markers, and produce a table of amplicon sequence variants (ASVs) assigned to taxonomic groups. If you use VTAM in scientific work, please cite the following article:
González, A., Dubut, V., Corse, E., Mekdad, R., Dechatre, T. and Meglécz, E. VTAM: A robust pipeline for processing metabarcoding data using internal controls. bioRxiv: 10.1101/2020.11.06.371187v1.
Commands for a quick installation:
conda create --name vtam python=3.9 -y
conda activate vtam
Then install the dependencies and VTAM itself:
python3 -m pip install cutadapt
conda install -c bioconda blast -y
conda install -c bioconda vsearch -y
python3 -m pip install vtam
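After installation it can be useful to confirm that the tools are reachable from the activated environment. The following is a minimal sketch, not part of VTAM: `check_tools` is a hypothetical helper, and the executable names (`vtam`, `cutadapt`, `blastn`, `vsearch`) are assumed to be the ones provided by the packages installed above.

```shell
# Hypothetical helper (not part of VTAM): report tools that are not on PATH.
# Prints one "missing: NAME" line per absent tool and returns 1 if any is absent.
check_tools() {
    local missing tool
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names for the packages installed above.
check_tools vtam cutadapt blastn vsearch \
    || echo "Some tools are missing; re-run the installation commands."
```

If nothing is printed, all four executables were found in the current environment.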
Commands for a quick working example:
vtam example
cd example
snakemake --printshellcmds --resources db=1 --snakefile snakefile.yml --cores 4 --configfile asper1/user_input/snakeconfig_mfzr.yml --until asvtable_taxa
The table of amplicon sequence variants (ASVs) is here:
(vtam) user@host:~/vtam/example$ head -n4 asper1/run1_mfzr/asvtable_default_taxa.tsv
run marker variant sequence_length read_count tpos1_run1 tnegtag_run1 14ben01 14ben02 clusterid clustersize chimera_borderline ltg_tax_id ltg_tax_name ltg_rank identity blast_db phylum class order family genus species sequence
run1 MFZR 25 181 478 478 0 0 0 25 1 False 131567 cellular organisms no rank 80 coi_blast_db_20200420 ACTATACCTTATCTTCGCAGTATTCTCAGGAATGCTAGGAACTGCTTTTAGTGTTCTTATTCGAATGGAACTAACATCTCCAGGTGTACAATACCTACAGGGAAACCACCAACTTTACAATGTAATCATTACAGCTCACGCATTCCTAATGATCTTTTTCATGGTTATGCCAGGACTTGTT
run1 MFZR 51 181 165 0 0 0 165 51 1 False coi_blast_db_20200420 ACTATATTTAATTTTTGCTGCAATTTCTGGTGTAGCAGGAACTACGCTTTCATTGTTTATTAGAGCTACATTAGCGACACCAAATTCTGGTGTTTTAGATTATAATTACCATTTGTATAATGTTATAGTTACGGGTCATGCTTTTTTGATGATCTTTTTTTTAGTAATGCCTGCTTTATTG
run1 MFZR 88 175 640 640 0 0 0 88 1 False 1592914 Caenis pusilla species 100 coi_blast_db_20200420 Arthropoda Insecta Ephemeroptera Caenidae Caenis Caenis pusilla ACTATATTTTATTTTTGGGGCTTGATCCGGAATGCTGGGCACCTCTCTAAGCCTTCTAATTCGTGCCGAGCTGGGGCACCCGGGTTCTTTAATTGGCGACGATCAAATTTACAATGTAATCGTCACAGCCCATGCTTTTATTATGATTTTTTTCATGGTTATGCCTATTATAATC
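To pull specific rows out of the ASV table, the columns can be looked up by header name, since the sample columns (here `tpos1_run1`, `14ben01`, and so on) differ between datasets. The sketch below assumes the file is tab-separated, as its `.tsv` extension suggests; `asv_by_rank` is a hypothetical helper, not a VTAM command.

```shell
# Hypothetical helper (not part of VTAM).
# Usage: asv_by_rank TABLE.tsv RANK
# Prints variant ID, read count and taxon name (tab-separated) for every
# row whose ltg_rank column equals RANK. Assumes a tab-separated table.
asv_by_rank() {
    awk -F'\t' -v rank="$2" '
        # Header line: remember the position of each column name.
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data lines: keep rows assigned at the requested rank.
        $(col["ltg_rank"]) == rank {
            print $(col["variant"]) "\t" $(col["read_count"]) "\t" $(col["ltg_tax_name"])
        }
    ' "$1"
}

# Example call, run from the example directory if the table exists.
table=asper1/run1_mfzr/asvtable_default_taxa.tsv
if [ -f "$table" ]; then
    asv_by_rank "$table" species
fi
```

Rows without a taxonomic assignment have an empty `ltg_rank` and are therefore skipped.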
The database of intermediate data is here:
(vtam) user@host:~/vtam/example$ sqlite3 asper1/db.sqlite '.tables'
FilterChimera Sample
FilterChimeraBorderline SampleInformation
FilterCodonStop SortedReadFile
FilterIndel TaxAssign
FilterLFN Variant
FilterLFNreference VariantReadCount
FilterMinReplicateNumber wom_Execution
FilterMinReplicateNumber2 wom_FileInputOutputInformation
FilterMinReplicateNumber3 wom_Option
FilterPCRerror wom_TableInputOutputInformation
FilterRenkonen wom_TableModificationTime
Marker wom_ToolWrapper
ReadCountAverageOverReplicates wom_TypeInputOrOutput
Run
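The intermediate tables can be inspected with ordinary SQL. As a rough sketch, the hypothetical helper below (`count_rows`, not part of VTAM) prints the number of rows in each named table; it relies only on the `sqlite3` command-line client and on the table names listed above, and makes no assumption about the columns inside them.

```shell
# Hypothetical helper (not part of VTAM).
# Usage: count_rows DATABASE TABLE [TABLE...]
# Prints "TABLE<tab>ROW_COUNT" for each table name given.
count_rows() {
    local db table
    db="$1"
    shift
    for table in "$@"; do
        printf '%s\t%s\n' "$table" \
            "$(sqlite3 "$db" "SELECT COUNT(*) FROM \"$table\";")"
    done
}

# Example call, run from the example directory if the database exists.
if [ -f asper1/db.sqlite ] && command -v sqlite3 >/dev/null 2>&1; then
    count_rows asper1/db.sqlite Variant FilterLFN FilterChimera TaxAssign
fi
```

Comparing the counts of successive `Filter*` tables gives a quick impression of how much data remains after each step.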
The VTAM documentation is hosted at ReadTheDocs.
VTAM is maintained by Aitor González (aitor dot gonzalez at univ-amu dot fr) and Emese Meglécz (emese dot meglecz at univ-amu dot fr).