GitHub - TealFurnholm/Strain-Level_Metatranscriptome_Analysis: Directly align reads to a universal reference database. Use the whole community and uniquely mapping reads -> unique genes to clean up matching. Get quantitative and comparable community phylogenies, genes, and functions.

The Complete Metatranscriptome Pipeline

This is a direct read alignment pipeline. When there are multiple conditions, it also includes optional community analysis of differentially expressed genes (DEGs), differentially abundant functions (DAFs), and differentially abundant organisms (DAOs).

Purpose:

This pipeline was created so scientists can run a complete microbiome RNA-seq analysis in a single, simplified process. I have a Universal Reference Database with genes from all sequenced organisms (Eukaryotes, Viruses, Archaea, Bacteria, Plasmids...) that have both functional (KEGG, COG, Pfam, GO, InterPro, MetaCyc, metal binding, etc.) and phylogenetic annotations. After QC and alignment of your reads, various scripts are run to output:

1. Community Phylogenetic Tree(s) - including quantitative and, if specified, comparative data

2. Community Functional Analysis - summary of reads, genes, top organism, and lowest common ancestor for each identified function

3. Gene Info Matrix - contains each matched gene from the universal database, with gene info, alignment score, various read counts (RPM, unique RPM, and RPKM), functional annotations and phylogeny. This info matrix can be used in R for comparative analysis.
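The read counts in the Gene Info Matrix can be illustrated with a minimal sketch. It uses the standard definitions of RPM (reads per million mapped reads) and RPKM (reads per kilobase of gene per million mapped reads); whether the pipeline's scripts compute them exactly this way is an assumption, as is treating "unique RPM" as RPM computed from uniquely mapping reads only. The gene names, counts, and lengths are hypothetical.

```python
def rpm(read_count, total_reads):
    """Reads per million mapped reads (standard definition)."""
    return read_count * 1_000_000 / total_reads


def rpkm(read_count, gene_length_bp, total_reads):
    """Reads per kilobase of gene per million mapped reads (standard definition)."""
    return read_count * 1_000_000_000 / (gene_length_bp * total_reads)


# Hypothetical input: gene -> (all mapped reads, uniquely mapped reads, gene length in bp)
genes = {
    "geneA": (500, 400, 1000),
    "geneB": (1500, 1500, 3000),
}
total_reads = 2_000_000  # hypothetical number of mapped reads in the sample

# Build one row of normalized counts per gene
matrix = {}
for gene_id, (reads, unique_reads, length_bp) in genes.items():
    matrix[gene_id] = {
        "RPM": rpm(reads, total_reads),
        "unique_RPM": rpm(unique_reads, total_reads),  # assumed meaning of "unique RPM"
        "RPKM": rpkm(reads, length_bp, total_reads),
    }

for gene_id, row in matrix.items():
    print(gene_id, row)
```

In this example geneB has three times the reads of geneA but is also three times as long, so the two genes end up with the same RPKM. That length correction is why RPKM is reported alongside RPM.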

Direct Alignment:

trimming -> read cleaning -> direct gene alignment -> differential analysis

OR

Metagenome Alignment:

You can first analyze your contig genes: https://github.com/TealFurnholm/Teals_Strain-Level_Metagenome_Pipeline/wiki/Contig-Analysis

trimming -> read cleaning -> align to annotated contigs -> differential analysis

Requirements:

Metatranscriptomics data is very large and requires substantial computing power. Hopefully you have server access and some familiarity working on a Linux/Unix system.

If you don't already have them on your system, install:
- trimmomatic: http://www.usadellab.org/cms/?page=trimmomatic
- bbtools: https://sourceforge.net/projects/bbmap/
- diamond: https://github.com/bbuchfink/diamond
- perl: https://www.perl.org/get.html
- R (if doing differential expression analysis): https://www.r-project.org/
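Before running anything, it may help to check which of the tools above are already on your `PATH`. The sketch below does that with Python's standard library. The executable names are assumptions and depend on how each tool was installed: trimmomatic is often distributed as a `.jar` with no `trimmomatic` wrapper, `bbduk.sh` is used here as a stand-in for the bbtools suite, and `Rscript` as a stand-in for R.

```python
import shutil

# Assumed executable names; adjust to match your installation.
REQUIRED = {
    "trimmomatic": "trimmomatic",
    "bbtools": "bbduk.sh",
    "diamond": "diamond",
    "perl": "perl",
}
# Only needed for differential expression analysis.
OPTIONAL = {"R": "Rscript"}


def find_missing(tools):
    """Return the names of tools whose executable is not found on PATH."""
    return [name for name, exe in tools.items() if shutil.which(exe) is None]


missing_required = find_missing(REQUIRED)
missing_optional = find_missing(OPTIONAL)

if missing_required:
    print("Missing required tools:", ", ".join(missing_required))
else:
    print("All required tools were found on PATH.")

if missing_optional:
    print("Missing optional tools:", ", ".join(missing_optional))
```

A tool reported as missing may still be installed somewhere outside `PATH`, so treat the output as a hint, not a verdict.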

How to Use:

https://github.com/TealFurnholm/Metatranscriptome/wiki

Repository files:

- BOSS.pl
- Get_Info_Matrix.pl
- LICENSE
- Metatranscriptome_Limma_R_Analysis.ipynb
- README.md
- RemovePoly.pl
- contrast.matrix
- design.matrix
