Directly align reads to a universal reference database. Use the whole community and uniquely mapping reads -> unique genes to clean-up matching. Get quantitative and comparable community phylogenies, genes, and functions.

TealFurnholm/Strain-Level_Metatranscriptome_Analysis

The Complete Metatranscriptome Pipeline

This is a direct read alignment pipeline. When your data include multiple conditions, it can also run an optional community-level differential analysis: differentially expressed genes (DEGs), differentially abundant functions (DAFs), and differentially abundant organisms (DAOs).

Purpose:

This pipeline was created so scientists can run a complete microbiome RNA-seq analysis in a single, simplified process. I have a Universal Reference Database of genes from all sequenced organisms (Eukaryotes, Viruses, Archaea, Bacteria, Plasmids...), each carrying both functional (KEGG, COG, Pfam, GO, InterPro, MetaCyc, metal binding, etc.) and phylogenetic annotations. After QC and alignment of your reads, several scripts are run to output:
1. Community Phylogenetic Tree(s) - which include quantitative and comparative (if specified) data
2. Community Functional Analysis - summary of reads, genes, top organism, and lowest common ancestor for each identified function
3. Gene Info Matrix - contains each matched gene from the universal database, with gene info, alignment score, various read counts (RPM, unique RPM, and RPKM), functional annotations and phylogeny. This info matrix can be used in R for comparative analysis.
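The read-count columns in the Gene Info Matrix can be illustrated with a minimal sketch. It assumes the conventional definitions of RPM (reads per million mapped reads) and RPKM (reads per kilobase of gene per million mapped reads); the pipeline's own scripts may normalize differently, and the function names `rpm` and `rpkm` are hypothetical. Unique RPM would presumably follow the same formula using only uniquely mapping reads.

```python
def rpm(gene_reads, total_reads):
    """Reads per million: scale a gene's read count by library size.

    Assumes the conventional definition; not taken from the pipeline's code.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return gene_reads * 1_000_000 / total_reads


def rpkm(gene_reads, total_reads, gene_length_bp):
    """Reads per kilobase per million: RPM further divided by gene length in kb."""
    if total_reads <= 0 or gene_length_bp <= 0:
        raise ValueError("total_reads and gene_length_bp must be positive")
    return gene_reads * 1_000_000_000 / (total_reads * gene_length_bp)


if __name__ == "__main__":
    # Hypothetical numbers: 500 reads on a 2,000 bp gene, 2 million mapped reads.
    print(rpm(500, 2_000_000))
    print(rpkm(500, 2_000_000, 2000))
```

RPM makes samples with different sequencing depths comparable, while RPKM additionally corrects for longer genes collecting more reads.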

Direct Alignment:

trimming -> read cleaning -> direct gene alignment -> differential analysis

OR

Metagenome Alignment:

You can first analyze your contig genes: https://github.com/TealFurnholm/Teals_Strain-Level_Metagenome_Pipeline/wiki/Contig-Analysis

trimming -> read cleaning -> align to annotated contigs -> differential analysis

Requirements:

Metatranscriptomic datasets are very large and require substantial computing power. Ideally you have server access and some familiarity with working on a Linux/Unix system.
If you don't already have them on your system, install:
- trimmomatic: http://www.usadellab.org/cms/?page=trimmomatic
- bbtools: https://sourceforge.net/projects/bbmap/
- diamond: https://github.com/bbuchfink/diamond
- perl: https://www.perl.org/get.html
- R (if doing differential expression analysis): https://www.r-project.org/
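Before starting, it can help to confirm that the tools listed above are reachable from your shell. The following is a small sketch, not part of the pipeline. The executable names are assumptions: trimmomatic is often installed as a `.jar` rather than a `trimmomatic` wrapper, bbtools is represented here by its `bbduk.sh` and `bbmap.sh` scripts, and R by `Rscript`. Adjust them to match your installation.

```python
import shutil

# Assumed executable names; they depend on how each tool was installed.
REQUIRED_TOOLS = {
    "trimmomatic": ["trimmomatic"],
    "bbtools": ["bbduk.sh", "bbmap.sh"],
    "diamond": ["diamond"],
    "perl": ["perl"],
    "R": ["Rscript"],  # only needed for differential expression analysis
}


def find_tools(required=REQUIRED_TOOLS):
    """Map each tool to the first of its executables found on PATH, or None."""
    found = {}
    for tool, names in required.items():
        paths = (shutil.which(name) for name in names)
        found[tool] = next((p for p in paths if p), None)
    return found


if __name__ == "__main__":
    for tool, path in find_tools().items():
        print(f"{tool:12s} {path or 'NOT FOUND on PATH'}")
```

A tool reported as not found may still be installed somewhere outside your `PATH`, so treat the result as a hint rather than a definitive check.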

How to Use:

https://github.com/TealFurnholm/Metatranscriptome/wiki
