ctmrbio/optivag

OptiVag:

Tools and databases for annotating vaginal communities

Citation

If you use any part of this repo, please cite:

Luisa W. Hugerth, Marcela Pereira, Yinghua Zha, Maike Seifert, Vilde Kaldhusdal, Fredrik Boulund, Maria C. Krog, Zahra Bashir, Marica Hamsten, Emma Fransson, Henriette Svarre Nielsen, Ina Schuppe-Koistinen, and Lars Engstrand (2020). Assessment of In Vitro and In Silico Protocols for Sequence-Based Characterization of the Human Vaginal Microbiome. mSphere, 5(6): e00448-20

Contents

Database

db

16S:

  • optivag_db.aln.fasta.gz: aligned file with all 16S sequences used to simulate amplicons

  • optivag_db.fasta.gz: unaligned file with all 16S sequences used to simulate amplicons

  • optivag_seqinfo.csv: information on each of these sequences, including accession ID and taxonomy
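The two FASTA files are gzip-compressed, so they can be read without unpacking them first. The following is a minimal sketch using only the Python standard library; `read_fasta` and `ungap` are illustrative helper names, not part of this repo, and the file path in the usage note is an assumption about the folder layout.

```python
import gzip
from typing import Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a plain or gzipped FASTA file."""
    opener = gzip.open if path.endswith(".gz") else open
    header, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ungap(aligned_seq: str) -> str:
    """Remove alignment gap characters ('-' and '.') from an aligned sequence."""
    return aligned_seq.replace("-", "").replace(".", "")
```

For example, `for header, seq in read_fasta("db/16S/optivag_db.aln.fasta.gz"): print(header, len(ungap(seq)))` would list each record with its ungapped length, assuming the file sits at that path.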

genome_info:

  • bacteria_list.tsv: list of bacteria, needed for creating a database locally

  • updated_taxonomy.tsv: taxon names that have changed since their inclusion in the database

tools

Three scripts required to recreate the shotgun database from the files in genome_info

For instructions on how to create your local database, look here

Amplicon simulation

A single script that extracts amplicons, and reads of a given length, from the 16S sequences, given forward and reverse primer sequences
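The general idea of in silico amplicon extraction can be sketched as follows. This is not the repo's script: it is a simplified illustration that assumes exact primer matches (degenerate IUPAC bases allowed, no mismatches), returns the amplicon without the primers, and simulates reads by trimming both ends to a fixed length. All function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# IUPAC nucleotide codes expanded to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence, treating U as T."""
    return seq.upper().replace("U", "T").translate(COMPLEMENT)[::-1]


def primer_regex(primer: str) -> str:
    """Turn a (possibly degenerate) primer into a regular expression."""
    return "".join(IUPAC[base] for base in primer.upper().replace("U", "T"))


def extract_amplicon(seq: str, fwd: str, rev: str) -> Optional[str]:
    """Return the region between the primers, or None if either is missing."""
    seq = seq.upper().replace("U", "T")
    fwd_hit = re.search(primer_regex(fwd), seq)
    if fwd_hit is None:
        return None
    start = fwd_hit.end()
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement downstream of the forward primer
    rev_hit = re.search(primer_regex(reverse_complement(rev)), seq[start:])
    if rev_hit is None:
        return None
    return seq[start:start + rev_hit.start()]


def simulate_reads(amplicon: str, read_length: int) -> Tuple[str, str]:
    """Return (forward read, reverse read), each at most read_length bases."""
    return amplicon[:read_length], reverse_complement(amplicon)[:read_length]
```

A call such as `extract_amplicon(sequence, forward_primer, reverse_primer)` followed by `simulate_reads(amplicon, 250)` would then give a paired-end style read pair; the read length of 250 is only an example value.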

Shotgun tools

Two scripts:

  • is_it_human.py: classifies reads in a FASTA file as mapped or unmapped, given a reference file in UC format

  • make_roc_curve.py: classifies reads in one or more FASTA files as correctly mapped, incorrectly mapped, correctly unmapped or incorrectly unmapped, given a reference file in UC or SAM format
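To illustrate the kind of classification these scripts perform, here is a minimal sketch that parses UC records and sorts reads into the four categories. It is not taken from the repo: it assumes the standard tab-separated UC layout (record type in field 1, query label in field 9, target label in field 10, with `H` for a hit and `N` for no hit), and it assumes the expected target of each read is known in advance. How the actual scripts decide what counts as correct may differ.

```python
from typing import Dict, Iterable, Optional


def parse_uc(lines: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each query label to its target label, or None if it had no hit."""
    hits: Dict[str, Optional[str]] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        record_type, query, target = fields[0], fields[8], fields[9]
        if record_type == "H":
            hits[query] = target
        elif record_type == "N":
            hits.setdefault(query, None)
    return hits


def classify_read(read_id: str,
                  hits: Dict[str, Optional[str]],
                  expected_target: Optional[str]) -> str:
    """Classify one read; expected_target is None if it should stay unmapped."""
    observed = hits.get(read_id)  # reads absent from the UC file count as unmapped
    if expected_target is None:
        return "correctly unmapped" if observed is None else "incorrectly mapped"
    if observed is None:
        return "incorrectly unmapped"
    return "correctly mapped" if observed == expected_target else "incorrectly mapped"
```

Counting the four categories over all reads gives the confusion matrix from which ROC-style statistics can be computed.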
