The MetaFora (Metagenomics For All) pipeline analyzes metagenomic data based on 16S rRNA gene (16S rDNA) sequences obtained from next-generation sequencing (NGS) platforms. The input files must already be demultiplexed.
Stage 1. Pre-process the raw data
- Parse the raw reads
- Trim based on quality and adaptors
- Quality control
Stage 2. Quantification of metagenomic data
- Submodule I: Taxonomy profile
- Submodule II: Diversity profile
- Submodule III: Comparative integrative analysis
A schematic of the workflow:
Software requirements:
- FastQC v.0.11.7 (Andrews, 2010)
- Java 7 or higher (we used Java 8)
- Trimmomatic v.0.36 (Bolger et al., 2014)
- PEAR v.0.9.11 (Zhang et al., 2014)
- QIIME v.1.9.1 (Caporaso et al., 2010)
- Vsearch v.2.9.1 (Rognes et al., 2016)
- The R Project for Statistical Computing (Team, 2000)
- dplyr R package (Wickham et al., 2018)
- purrr R package (Henry and Wickham, 2018)
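Before installing anything, it can help to see which of these tools are already reachable from your shell. The snippet below is a minimal sketch and not part of MetaFora: `missing_tools` is a hypothetical helper, and the executable names in the example call are assumptions about a typical install (Trimmomatic, for instance, is often distributed as a .jar file rather than as a command on PATH).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MetaFora): print each named command
# that cannot be found on PATH, and return non-zero if any is missing.
missing_tools() {
    local status=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example call; the executable names are assumptions, adjust them to your system.
# missing_tools fastqc java pear vsearch Rscript
```

An empty output together with exit status 0 means every listed command was found.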
Installing QIIME
QIIME is a set of Python scripts that are run from the terminal. The QIIME authors recommend installing it through Miniconda. The following is a guide for Linux.
- Install Miniconda.
- Create your qiime1 environment and install QIIME.
$ conda create -n qiime1 python=2.7 qiime matplotlib=1.4.3 mock nose -c bioconda
Install the additional components
Add additional channels for required packages
$ conda config --add channels defaults
$ conda config --add channels r
$ conda config --add channels bioconda
$ conda config --add channels biocore
$ conda config --add channels hcc
Install additional packages
$ conda install --yes r-essentials blast-legacy cd-hit cdbtools cogent ea-utils infernal microbiomeutil muscle mothur r-vegan sourcetracker sortmerna sumaclust rdp-classifier swarm tax2tree
- Activate your qiime1 environment and test your QIIME installation.
$ source activate qiime1
$ print_qiime_config.py -t
NOTE: If QIIME does not work properly, check the versions of these packages:
NumPy version: 1.15.0
pandas version: 0.23.1
matplotlib version: 1.4.3
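The version check in the note above can be scripted against the report printed by print_qiime_config.py. This is a minimal sketch under the assumption that the report contains lines of the form "label: value", possibly indented; `check_version` is a hypothetical helper, not part of QIIME or MetaFora.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a QIIME configuration report on stdin and compare
# the reported version of one package against an expected value.
# $1 = label as printed in the report (e.g. "NumPy version"), $2 = expected version.
check_version() {
    local label="$1" expected="$2"
    awk -v label="$label" -v expected="$expected" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop leading indentation
            if (index(line, label ":") == 1) {
                found = 1
                value = substr(line, length(label) + 2)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                if (value == expected) { ok = 1 }
                else { printf "%s: found %s, expected %s\n", label, value, expected }
            }
        }
        END {
            if (!found) { printf "%s: not reported\n", label }
            exit (ok ? 0 : 1)
        }'
}

# Example use inside the activated qiime1 environment:
# print_qiime_config.py > report.txt
# check_version "NumPy version" 1.15.0 < report.txt
# check_version "pandas version" 0.23.1 < report.txt
# check_version "matplotlib version" 1.4.3 < report.txt
```

Each call prints nothing and returns 0 when the version matches; otherwise it prints the mismatch and returns a non-zero status.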
Installing vsearch v2.9.1
- Download and install vsearch:
$ wget https://github.com/torognes/vsearch/archive/v2.9.1.tar.gz
$ tar xzf v2.9.1.tar.gz
$ cd vsearch-2.9.1
$ ./autogen.sh
$ ./configure
$ make
$ make install # as root or sudo make install
- Rename vsearch to usearch61:
$ mv vsearch-2.9.1 usearch61
- If necessary, set executable permissions:
$ chmod a+x usearch61
- Move the file to the bin folder of your qiime1 environment:
$ mv usearch61 /$path/miniconda3/envs/qiime1/bin
- Fastq files must be placed in a folder called “Reads”.
- A mapping file must be created with information about the samples and their metadata. For details on the file format, see: http://qiime.org/documentation/file_formats.html?highlight=mapping.
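As an illustration of the mapping file layout, a QIIME 1 mapping file is tab-separated, with a header that starts with #SampleID, BarcodeSequence and LinkerPrimerSequence and ends with Description. The sketch below writes a small example and checks its header. The sample IDs, the Treatment column and both helper names are hypothetical, and leaving the barcode and primer columns empty is an assumption made here because the reads are already demultiplexed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that the header line of a QIIME 1 mapping file
# is tab-separated, starts with the three required columns and ends with Description.
check_mapping_header() {
    head -n 1 "$1" | awk -F '\t' '
        {
            ok = (NF >= 4 && $1 == "#SampleID" && $2 == "BarcodeSequence" &&
                  $3 == "LinkerPrimerSequence" && $NF == "Description")
        }
        END { exit (ok ? 0 : 1) }'
}

# Hypothetical helper: write a two-sample example mapping file to the path in $1.
# Sample IDs and the Treatment metadata column are made up for illustration.
write_example_mapping() {
    {
        printf '#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tTreatment\tDescription\n'
        printf 'Sample1\t\t\tControl\tSample1\n'
        printf 'Sample2\t\t\tTreated\tSample2\n'
    } > "$1"
}

# Example:
# write_example_mapping example_map.txt
# check_mapping_header example_map.txt && echo "header looks fine"
```

This only inspects the header line. For a full check, QIIME 1 ships its own validate_mapping_file.py script.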
Before starting, you should check that your working directory includes:
- Mapping file
- “Reads” folder
- Pipeline files: metafora1.pl, metafora2.pl and metagenomics_mod.pm
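The checklist above can be turned into a small pre-flight check. This is a sketch, not part of MetaFora: `check_workdir` is a hypothetical helper, and the mapping file name is passed in because its name and extension depend on your project.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of MetaFora): confirm that the
# working directory holds everything listed above before running the pipeline.
# $1 = working directory, $2 = mapping file name (including its extension).
check_workdir() {
    local dir="$1" mapping="$2" status=0 f
    [ -d "$dir/Reads" ] || { echo "missing folder: Reads"; status=1; }
    for f in "$mapping" metafora1.pl metafora2.pl metagenomics_mod.pm; do
        [ -f "$dir/$f" ] || { echo "missing file: $f"; status=1; }
    done
    return "$status"
}

# Example; "mapping.txt" is a placeholder for your own mapping file name.
# check_workdir . mapping.txt
```

The function lists every missing item and returns a non-zero status if anything is absent, so it can guard the pipeline commands in a script.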
Run the pipeline in the terminal:
$ perl metafora1.pl -t <data type, PE = Paired-End or SE = Single-End> -q -tr
$ perl metafora2.pl -db <database: gg, RDP or SILVA> -jobs -m <Filename of mapping file (without extension)> -level <taxonomic level (L2, L3…L7)> -count -samples
Some examples of graphics generated using this pipeline:
Taxonomic profile
Taxonomy profile from an analysis using the SILVA database:
Diversity profile
2D representation of PCoA results from the unweighted UniFrac, weighted UniFrac and Bray-Curtis indices, based on the Greengenes database. The "Description" title refers to the samples: