metaphlan1
MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.
MetaPhlAn is available as a Galaxy module and as a GitHub repository. For additional information, please refer to the MetaPhlAn paper.
We provide support for MetaPhlAn users. Please join our Google group designated specifically for MetaPhlAn users. Feel free to ask questions by posting to the group directly or by emailing metaphlan-users@googlegroups.com.
The following figure shows the workflow of MetaPhlAn.
MetaPhlAn accepts metagenomic shotgun sequencing data for metagenome profiling (input formats include .fasta, .fastq, .tar.bz2, etc.).
Follow the instructions below to run MetaPhlAn on a sample dataset using the Galaxy module.
- Go to the Huttenhower Galaxy server.
- Click on the Get Data link on the left pane.
  * Click Browse to select the metagenome sequencing file, OR paste the link to the dataset in the URL/Text textbox. (For the purpose of this tutorial we will be using the dataset available here: https://bitbucket.org/nsegata/metaphlan/wiki/LC1.fna)
  * Click the Execute button to upload the file.
- Click on the MetaPhlAn link on the left pane, select the uploaded dataset from the Input metagenome drop-down menu, and press Execute. (You may change the sensitivity settings according to your preferences.)
- Once completed, the icon representing the result on the right pane will turn green. The result is a microbial abundance table: the first column lists the microbial species and the second column lists their abundances. The data can then be downloaded by clicking on the save button on the right pane.
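As a rough illustration of working with the downloaded result, the sketch below reads a two-column, tab-delimited abundance table into a dictionary. The helper name `read_abundance_table` is hypothetical, the example rows are made up, and the assumption that lines starting with `#` are header comments may not hold for every MetaPhlAn output.

```python
import csv
import io


def read_abundance_table(handle):
    """Parse a two-column, tab-delimited table: clade name, abundance."""
    abundances = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and (assumed) comment/header lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 2:
            continue
        try:
            abundances[row[0]] = float(row[1])
        except ValueError:
            # Non-numeric second column: treat the row as a header and skip it.
            continue
    return abundances


if __name__ == "__main__":
    # Tiny made-up table used only to exercise the parser.
    example = io.StringIO("k__Bacteria\t100.0\nk__Bacteria|p__Firmicutes\t62.5\n")
    table = read_abundance_table(example)
    # Print clades from most to least abundant.
    for clade, value in sorted(table.items(), key=lambda item: -item[1]):
        print(clade, value)
```

To read a real download, open the saved file and pass the file handle to the function in place of the `io.StringIO` object.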
Please refer to the MetaPhlAn documentation for the pre-requisites/dependencies and installation instructions.
MetaPhlAn accepts metagenomic shotgun sequencing data for metagenome profiling (input formats include .fasta, .fastq, .tar.bz2, etc.).
For the purpose of this tutorial we will use the following 8 samples as inputs (downloaded from the Human Microbiome Project website).
Input Samples: Buccal mucosa (SRS063417, SRS022158, SRS052620, SRS019379), Posterior fornix (SRS016297, SRS014575, SRS019024, SRS058186)
- Create an input directory under the metaphlan repository as /metaphlan/input, and place your input files in it.
- Create a directory under metaphlan as /metaphlan/profiled_samples to hold the output from MetaPhlAn.
- Run the following command from the terminal to save the list of sample names in a variable (note that the shell does not allow spaces around the = sign):
$ samples="SRS014575 SRS016297 SRS019024 SRS051868 SRS019379 SRS052620 SRS022158 SRS063417"
- Run the following command to run MetaPhlAn over all the input samples (this might take a while):
$ for s in ${samples}
> do
>     tar xjf input/${s}.tar.bz2 --to-stdout | ./metaphlan.py --bowtie2db bowtie2db/mpa --bt2_ps very-sensitive --input_type multifastq > profiled_samples/mp_${s}.txt
> done
- The saved output files will appear in the directory /metaphlan/profiled_samples.
- The output microbial abundance tables (tab-delimited) contain the microbial species (column 1) and their associated relative abundances (column 2) per sample.
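For readers who prefer to drive the loop from a script, the following is a minimal sketch of the same pipeline in Python. It only reuses the paths and flags from the shell command above; the helper names `build_metaphlan_command` and `profile_sample` are hypothetical, and the sketch assumes it is run from the metaphlan directory with `tar` available on the PATH.

```python
import subprocess


def build_metaphlan_command():
    """Return the metaphlan.py invocation used in this tutorial as an argument list."""
    return [
        "./metaphlan.py",
        "--bowtie2db", "bowtie2db/mpa",
        "--bt2_ps", "very-sensitive",
        "--input_type", "multifastq",
    ]


def profile_sample(sample, input_dir="input", output_dir="profiled_samples"):
    """Stream one sample's archive through metaphlan.py and save the profile."""
    archive = f"{input_dir}/{sample}.tar.bz2"
    output = f"{output_dir}/mp_{sample}.txt"
    # Equivalent of: tar xjf ARCHIVE --to-stdout | ./metaphlan.py ... > OUTPUT
    tar = subprocess.Popen(
        ["tar", "xjf", archive, "--to-stdout"], stdout=subprocess.PIPE
    )
    with open(output, "w") as out:
        result = subprocess.run(
            build_metaphlan_command(), stdin=tar.stdout, stdout=out
        )
    tar.stdout.close()
    tar_status = tar.wait()
    if tar_status != 0 or result.returncode != 0:
        raise RuntimeError(f"profiling failed for {sample}")
    return output
```

Calling `profile_sample(name)` for each sample name in the list reproduces the loop; nothing is executed until the function is called.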
To visualize the MetaPhlAn results in the form of a heatmap, please follow the instructions below. The heatmap can be plotted for any, some or all of the microbial abundance table results. For the purpose of this tutorial we will plot the heatmap for all of the samples.
- Create an output folder as /metaphlan/output/.
- Run the following command to merge all the microbial abundance tables in the profiled_samples directory:
$ python utils/merge_metaphlan_tables.py profiled_samples/*.txt > output/merged_abundance_table.txt
- Create an output_images directory as /metaphlan/output_images/ to store all the output images.
- Run the following command to generate the heatmap:
$ python plotting_scripts/metaphlan_hclust_heatmap.py -c bbcry --top 25 --minv 0.1 -s log --in output/merged_abundance_table.txt --out output_images/abundance_heatmap.png
The resulting heatmap is shown below:
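Conceptually, the merge step produces one row per clade and one column per sample. The sketch below illustrates that idea on in-memory dictionaries; it is not the implementation of `merge_metaphlan_tables.py`. The function name `merge_profiles`, the choice to fill a missing clade with zero, the `ID` header label, and the example values are all assumptions made for illustration.

```python
def merge_profiles(profiles):
    """Merge {sample: {clade: abundance}} into (header, rows), filling gaps with 0.0."""
    samples = sorted(profiles)
    # Union of all clades seen in any sample, in a stable order.
    clades = sorted({clade for table in profiles.values() for clade in table})
    header = ["ID"] + samples
    rows = [
        [clade] + [profiles[sample].get(clade, 0.0) for sample in samples]
        for clade in clades
    ]
    return header, rows


if __name__ == "__main__":
    # Made-up abundances for two hypothetical samples.
    profiles = {
        "sampleA": {"k__Bacteria": 100.0, "k__Bacteria|p__Firmicutes": 80.0},
        "sampleB": {"k__Bacteria": 100.0, "k__Bacteria|p__Actinobacteria": 35.0},
    }
    header, rows = merge_profiles(profiles)
    print("\t".join(header))
    for row in rows:
        print("\t".join(str(value) for value in row))
```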
You may use GraPhlAn either as a Galaxy module (see Section 3.2.2 for instructions) or from its GitHub repository (see Section 3.2.1 for instructions). For information on dependencies and installation for the GraPhlAn GitHub repository, please refer to the GraPhlAn tutorial.
GraPhlAn requires two inputs: (i) a tree structure to represent and (ii) graphical annotation options for the tree. MetaPhlAn includes the functionality to generate these files. Follow the instructions below to generate the GraPhlAn input files.
- Create a temporary directory (e.g. /metaphlan/tmp) to store these files.
- Run the following command from the terminal (current directory: metaphlan) to generate the two input files for GraPhlAn (Tree: merged_abundance.tree.txt, Annotation: merged_abundance.annot.txt):
$ python plotting_scripts/metaphlan2graphlan.py output/merged_abundance_table.txt --tree_file tmp/merged_abundance.tree.txt --annot_file tmp/merged_abundance.annot.txt
Once generated, you can use these files to visualize the results using either the GraPhlAn GitHub repository (Section 3.2.1) or the GraPhlAn Galaxy module (Section 3.2.2).
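To give a feel for what a plain-text tree file can look like, the sketch below turns MetaPhlAn-style lineage strings into dot-separated paths, one clade per line. This is an illustration under stated assumptions, not the logic of `metaphlan2graphlan.py`: the helper `lineage_to_tree_path` is hypothetical, the dot-separated layout is assumed from GraPhlAn's text tree input, and whether rank prefixes such as `k__` are stripped in the real output is not confirmed by this tutorial.

```python
def lineage_to_tree_path(lineage, strip_prefixes=True):
    """Convert 'k__A|p__B' into 'A.B' (or 'k__A.p__B' if prefixes are kept)."""
    levels = lineage.split("|")
    if strip_prefixes:
        # Drop rank prefixes such as 'k__' or 's__' when present.
        levels = [
            level.split("__", 1)[1] if "__" in level else level
            for level in levels
        ]
    return ".".join(levels)


if __name__ == "__main__":
    for lineage in ["k__Bacteria", "k__Bacteria|p__Firmicutes|c__Bacilli"]:
        print(lineage_to_tree_path(lineage))
```

Comparing such paths with the lines of tmp/merged_abundance.tree.txt is a quick way to check which convention the generated file actually follows.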
To visualize using the GraPhlAn GitHub repository, please ensure that the PATH environment variable is set so that the scripts in the graphlan repository can be found (for more information please see the documentation for GraPhlAn).
- Run the following commands to (i) create a PhyloXML file from the two inputs (merged_abundance.tree.txt, merged_abundance.annot.txt) and (ii) generate the cladogram:
$ graphlan_annotate.py --annot tmp/merged_abundance.annot.txt tmp/merged_abundance.tree.txt tmp/merged_abundance.xml
$ graphlan.py --dpi 200 tmp/merged_abundance.xml output_images/merged_abundance.png
The generated cladogram is shown below:
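The two GraPhlAn steps can also be chained from a script. The sketch below only wraps the two commands shown above with Python's `subprocess` module; the function names `graphlan_commands` and `run_graphlan` are hypothetical, and the sketch assumes `graphlan_annotate.py` and `graphlan.py` are reachable through PATH.

```python
import subprocess


def graphlan_commands(tree, annot, xml, image, dpi=200):
    """Build the annotate and plot commands from the tutorial as argument lists."""
    annotate = ["graphlan_annotate.py", "--annot", annot, tree, xml]
    plot = ["graphlan.py", "--dpi", str(dpi), xml, image]
    return [annotate, plot]


def run_graphlan(tree, annot, xml, image, dpi=200):
    """Run both steps in order; check=True stops at the first failing command."""
    for command in graphlan_commands(tree, annot, xml, image, dpi):
        subprocess.run(command, check=True)
```

For this tutorial the call would be `run_graphlan("tmp/merged_abundance.tree.txt", "tmp/merged_abundance.annot.txt", "tmp/merged_abundance.xml", "output_images/merged_abundance.png")`. Keeping the command construction separate from execution makes it easy to print and inspect the commands before running them.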
You can also use the GraPhlAn Galaxy module to visualize the results. Follow the instructions below.
- Go to the Huttenhower Galaxy server.
- Click on Upload File under the Get Data link on the left pane.
  * Select the input tree file (merged_table.tree. for this tutorial)
  * Select the File Format as circl
  * Click on the Execute button to upload the tree, as shown below:
- Click on Annotate tree under the GraPhlAn link on the left pane.
  * From Input Tree, select the tree you uploaded (merged_abundance.tree.txt in this tutorial).
  * From the Select Clade(s) list, select the clades you want to be displayed on the figure.
  * In the text field Annotation Label, enter *
  * From the Annotation Label Clade Selector drop-down menu, select the Clade and its leaf nodes option.
  * Click on the Execute button.
- Click on the Get Data link on the left pane, under the LOAD DATA MODULE, and upload the annotation file (merged_table.annot) as shown below:
- Click on the Add rings to the tree link under the GraPhlAn module on the left pane.
  * Select the annotated tree (produced by the Annotate tree step) from the Input Tree drop-down menu, and from the Ring Input File drop-down menu select the annotation file (merged_abundance.annot.txt) that you just uploaded, as shown below:
- Click on the Plot Tree link on the left pane, select the result produced in the step above, and click on the Execute button, as shown below:
The resulting image will be the same as the cladogram shown above.
For further analysis, please refer to the tutorials for LEfSe and MaAsLin.
For more information on MetaPhlAn, please refer to the following wiki pages:
- HUMAnN 2.0
- HUMAnN 3.0
- MetaPhlAn 2.0
- MetaPhlAn 3.0
- MetaPhlAn 4.0
- MetaPhlAn 4.1
- PhyloPhlAn 3
- PICRUSt 2.0
- ShortBRED
- PPANINI
- StrainPhlAn 3.0
- StrainPhlAn 4.0
- MelonnPan
- WAAFLE
- MetaWIBELE
- MACARRoN
- FUGAsseM
- HAllA
- HAllA Legacy
- ARepA
- CCREPE
- LEfSe
- MaAsLin 2.0
- MMUPHin
- microPITA
- SparseDOSSA
- SparseDOSSA2
- BAnOCC
- anpan
- MTXmodel
- PARATHAA