Snakemake workflow: Assembly of near full length 16S rRNA sequences

The workflow assembles near full length 16S rRNA sequences from Illumina reads using SPAdes or MEGAHIT, drops chimeric contigs and contigs shorter than 1000 bp, and annotates the remaining 16S sequences with BLAST against a chosen 16S database, e.g. SILVA.

I will be happy to fix any bug that you find, so please feel free to reach out to me at obadbotanist@yahoo.com or open a pull request.

Please do not forget to cite the authors of the tools used.

The pipeline does the following:

Quality checks, summarizes and counts the input reads using FastQC, MultiQC and seqkit
Trims off primers and adapters using a combination of Trimmomatic and Trim Galore
Quality checks, summarizes and counts the trimmed reads using FastQC, MultiQC and seqkit
Assembles the clean reads using either SPAdes or MEGAHIT
Detects chimeric contigs using USEARCH
Drops chimeric sequences and sequences shorter than 1000 bp, since full length 16S rRNA sequences should be longer
Annotates the full length 16S rRNA sequences using BLAST
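
The filtering step in the list above (drop chimeric contigs and contigs shorter than 1000 bp) can be illustrated with a short Python sketch. This is not the code used in the Snakefile; the function names `read_fasta` and `keep_contigs` are hypothetical, and the sketch assumes the chimeric contig IDs have already been collected into a set from the chimera detection output.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

MIN_LENGTH = 1000  # length cutoff in bp, as stated in the README


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_contigs(
    records: Iterable[Tuple[str, str]],
    chimeric_ids: Set[str],
    min_length: int = MIN_LENGTH,
) -> Dict[str, str]:
    """Keep contigs that are not flagged as chimeric and are at least min_length bp."""
    kept: Dict[str, str] = {}
    for header, seq in records:
        fields = header.split()
        if not fields:
            continue  # skip records with an empty header
        contig_id = fields[0]  # assumption: the ID is the first token of the header
        if contig_id in chimeric_ids:
            continue
        if len(seq) < min_length:
            continue
        kept[contig_id] = seq
    return kept
```

With an open FASTA file handle `fh` and a set of chimeric IDs `chimeras`, `keep_contigs(read_fasta(fh), chimeras)` would return only the contigs that pass both checks.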

Authors

Olabiyi Obayomi (@olabiyi)

Before you start, make sure you have all the required software installed. You can optionally install my bioinfo environment, which contains Snakemake and many other useful bioinformatics tools.

miniconda
snakemake
multiqc
fastqc
parallel
trim_galore
cutadapt
trimmomatic
seqkit
blast
spades
megahit
usearch
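
One quick way to confirm that the software listed above is installed is to check that each executable is on your PATH. The Python sketch below is only an illustration and is not part of the workflow. The executable names are assumptions based on how these tools are commonly installed (for example `conda` for miniconda, `blastn` for BLAST and `spades.py` for SPAdes), so adjust them to match your setup.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names for the required software; edit as needed.
REQUIRED_TOOLS = [
    "conda",        # miniconda
    "snakemake",
    "multiqc",
    "fastqc",
    "parallel",
    "trim_galore",
    "cutadapt",
    "trimmomatic",
    "seqkit",
    "blastn",       # BLAST
    "spades.py",    # SPAdes
    "megahit",
    "usearch",
]


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools whose executables cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Run the script inside the environment you plan to use for the workflow, since the result depends on the PATH of the active environment.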

Please see the README file here for more details on how to install miniconda, snakemake and my bioinfo environment.

Repository contents:

01.raw_data
config
database
images
LICENSE
README.md
Snakefile
find_seqs.sh
rulegraph.png
