Update QIIME2 2018.6 to 2019.7 #97

Merged
merged 15 commits into from Dec 5, 2019
2 changes: 1 addition & 1 deletion .travis.yml
@@ -13,7 +13,7 @@ before_install:
# Pull the docker image first so the test doesn't wait for this
- docker pull nfcore/ampliseq:dev
# Fake the tag locally so that the pipeline runs properly
- docker tag nfcore/ampliseq:dev nfcore/ampliseq:1.0.0
- docker tag nfcore/ampliseq:dev nfcore/ampliseq:dev

install:
# Install Nextflow
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,13 @@
# nf-core/ampliseq

## nf-core/ampliseq version 1.1.1

#### Pipeline updates
* Update from QIIME2 v2018.6 to v2019.7, including DADA2 v1.6 to DADA2 v1.10

#### Bug fixes
* [#78](https://github.com/nf-core/ampliseq/issues/78) - All sequences classified to the same species

## nf-core/ampliseq version 1.1.0 "Silver Lime Bee" - 2019

#### Pipeline updates
2 changes: 1 addition & 1 deletion Dockerfile
@@ -2,7 +2,7 @@ FROM nfcore/base
LABEL description="Docker image containing all requirements for nf-core/ampliseq pipeline"
COPY environment.yml /
RUN conda env create -f /environment.yml && conda clean -a
ENV PATH /opt/conda/envs/nf-core-ampliseq-1.0.0/bin:$PATH
ENV PATH /opt/conda/envs/nf-core-ampliseq-dev/bin:$PATH
## Required to build the container properly
RUN mkdir -p /root/.config/matplotlib
RUN echo "backend : Agg" > /root/.config/matplotlib/matplotlibrc
5 changes: 2 additions & 3 deletions README.md
@@ -1,7 +1,6 @@
# ![nf-core/ampliseq](docs/images/nfcore-ampliseq_logo.png)

[![Build Status](https://travis-ci.org/nf-core/ampliseq.svg?branch=master)](https://travis-ci.org/nf-core/ampliseq)
[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A518.10.1-brightgreen.svg)](https://www.nextflow.io/)
[![Build Status](https://travis-ci.com/nf-core/ampliseq.svg?branch=master)](https://travis-ci.com/nf-core/ampliseq)[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A518.10.1-brightgreen.svg)](https://www.nextflow.io/)

[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](http://bioconda.github.io/)
[![Docker](https://img.shields.io/docker/automated/nfcore/ampliseq.svg)](https://hub.docker.com/r/nfcore/ampliseq)
@@ -12,7 +11,7 @@ https://img.shields.io/badge/singularity-available-7E4C74.svg)

**nf-core/ampliseq** is a bioinformatics analysis pipeline used for 16S rRNA amplicon sequencing data.

The workflow processes raw data from FastQ inputs ([FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)), trims primer sequences from the reads ([Cutadapt](https://journal.embnet.org/index.php/embnetjournal/article/view/200)), imports data into [QIIME2](https://qiime2.org/), generates amplicon sequencing variants (ASV, [DADA2](https://www.nature.com/articles/nmeth.3869)), classifies features against the [SILVA](https://www.arb-silva.de/) [v132](https://www.arb-silva.de/documentation/release-132/) database, excludes unwanted taxa, produces absolute and relative feature/taxa count tables and plots, plots alpha rarefaction curves, computes alpha and beta diversity indices and plots thereof, and finally calls differentially abundant taxa ([ANCOM](https://www.ncbi.nlm.nih.gov/pubmed/26028277)). See the [output documentation](docs/output.md) for more details of the results.
The workflow processes raw data from FastQ inputs ([FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)), trims primer sequences from the reads ([Cutadapt](https://journal.embnet.org/index.php/embnetjournal/article/view/200)), imports data into [QIIME2](https://www.nature.com/articles/s41587-019-0209-9), generates amplicon sequencing variants (ASV, [DADA2](https://www.nature.com/articles/nmeth.3869)), classifies features against the [SILVA](https://www.arb-silva.de/) [v132](https://www.arb-silva.de/documentation/release-132/) database, excludes unwanted taxa, produces absolute and relative feature/taxa count tables and plots, plots alpha rarefaction curves, computes alpha and beta diversity indices and plots thereof, and finally calls differentially abundant taxa ([ANCOM](https://www.ncbi.nlm.nih.gov/pubmed/26028277)). See the [output documentation](docs/output.md) for more details of the results.

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with docker / singularity containers making installation trivial and results highly reproducible.

3 changes: 3 additions & 0 deletions conf/base.config
@@ -64,6 +64,9 @@ process {
memory = { check_max (16.GB * task.attempt, 'memory' ) }
time = { check_max (2.h * task.attempt, 'time' ) }
}
withName: alpha_diversity {
errorStrategy = { task.exitStatus in [143,137] ? 'retry' : 'ignore' }
}
}

params {
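As a reading aid for the `alpha_diversity` block added above, here is a minimal Python paraphrase of the `errorStrategy` closure. The function and constant names are hypothetical and appear nowhere in the pipeline. The comments assume the usual shell convention that an exit status of 128 + N means the process was terminated by signal N, so 137 corresponds to SIGKILL and 143 to SIGTERM, which on a cluster typically points to an exceeded memory or time limit.

```python
# Hypothetical paraphrase of the Nextflow closure:
#   errorStrategy = { task.exitStatus in [143,137] ? 'retry' : 'ignore' }

# 137 = 128 + 9 (SIGKILL), 143 = 128 + 15 (SIGTERM): the task was most
# likely killed from outside, so running it again may succeed.
RETRYABLE_EXIT_STATUSES = {137, 143}


def error_strategy(exit_status: int) -> str:
    """Return 'retry' for kill-signal exit statuses, otherwise 'ignore'."""
    return "retry" if exit_status in RETRYABLE_EXIT_STATUSES else "ignore"


if __name__ == "__main__":
    for status in (1, 137, 143):
        print(status, error_strategy(status))
```

With `ignore`, any other failure of `alpha_diversity` is skipped and the rest of the pipeline keeps running instead of aborting.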
2 changes: 1 addition & 1 deletion conf/test.config
@@ -17,7 +17,7 @@ params {
// Input data
FW_primer = "GTGYCAGCMGCCGCGGTAA"
RV_primer = "GGACTACNVGGGTWTCTAAT"
classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-classifier.qza"
classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-gg_13_8-85-qiime2_2019.7-classifier.qza"
metadata = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/Metadata.tsv"
outdir = "./results"
temp_dir = "./results/tmp_dir"
2 changes: 1 addition & 1 deletion conf/test_multi.config
@@ -17,7 +17,7 @@ params {
// Input data
FW_primer = "GTGYCAGCMGCCGCGGTAA"
RV_primer = "GGACTACNVGGGTWTCTAAT"
classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-classifier.qza"
classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-gg_13_8-85-qiime2_2019.7-classifier.qza"
metadata = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/Metadata_multi.tsv"
outdir = "./results"
temp_dir = "./results/tmp_dir"
6 changes: 3 additions & 3 deletions docs/output.md
@@ -56,7 +56,7 @@ The pipeline has special steps which allow the software versions used to be repo
For more information about how to use MultiQC reports, see [http://multiqc.info](http://multiqc.info)

### QIIME2
**Quantitative Insights Into Microbial Ecology 2** ([QIIME2](https://qiime2.org/)) is a next-generation microbiome bioinformatics platform and the successor of the widely used [QIIME1](https://www.nature.com/articles/nmeth.f.303). QIIME2 is currently **under heavy development** and often updated; this version of ampliseq uses QIIME2 2018.6. QIIME2 has a wide variety of analysis tools available and has excellent support in its [forum](https://docs.qiime2.org/2018.6/).
**Quantitative Insights Into Microbial Ecology 2** ([QIIME2](https://qiime2.org/)) is a next-generation microbiome bioinformatics platform and the successor of the widely used [QIIME1](https://www.nature.com/articles/nmeth.f.303). QIIME2 is currently **under heavy development** and often updated; this version of ampliseq uses QIIME2 2019.7. QIIME2 has a wide variety of analysis tools available and has excellent support in its [forum](https://docs.qiime2.org/2019.7/).

At this point of the analysis the trimmed reads are imported into QIIME2 and an interactive quality plot is made.

@@ -205,15 +205,15 @@ ANCOM is applied to each suitable or specified metadata column for 6 taxonomic l
* taxonomic level: level-2 (phylum), level-3 (class), level-4 (order), level-5 (family), level-6 (genus), ASV

## More help
QIIME2 is currently **under heavy development** and often updated; this version of ampliseq uses QIIME2 2018.6. QIIME2 has excellent support in its [forum](https://docs.qiime2.org/2018.6/).
QIIME2 is currently **under heavy development** and often updated; this version of ampliseq uses QIIME2 2019.7. QIIME2 has excellent support in its [forum](https://docs.qiime2.org/2019.7/).

## Citations
All tools inside the pipeline have to be cited in a publication properly:

* FastQC, "Andrews, Simon. "FastQC: a quality control tool for high throughput sequence data." (2010)."
* Cutadapt, "Martin, Marcel. "Cutadapt removes adapter sequences from high-throughput sequencing reads." EMBnet. journal 17.1 (2011): pp-10."
* MultiQC, "Ewels, Philip, et al. "MultiQC: summarize analysis results for multiple tools and samples in a single report." Bioinformatics 32.19 (2016): 3047-3048."
* QIIME2, "Bolyen, Evan, et al. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. No. e27295v1. PeerJ Preprints, 2018."
* QIIME2, "Bolyen, Evan, et al. "Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2." Nature Biotechnology 37 (2019): 852–857."
* DADA2, "Callahan, Benjamin J., et al. "DADA2: high-resolution sample inference from Illumina amplicon data." Nature methods 13.7 (2016): 581."
* Matplotlib, "Hunter, John D. "Matplotlib: A 2D graphics environment." Computing in science & engineering 9.3 (2007): 90-95."
* Feature-classifier, "Bokulich, Kaehler, et al. "Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin." Microbiome 6 (2018): 90."
6 changes: 3 additions & 3 deletions docs/usage.md
@@ -184,7 +184,7 @@ This is optional, but for performing downstream analysis such as barplots, diver
Please note the following requirements:

1. The path must be enclosed in quotes
2. The metadata file has to follow the [QIIME2 specifications](https://docs.qiime2.org/2018.6/tutorials/metadata/)
2. The metadata file has to follow the [QIIME2 specifications](https://docs.qiime2.org/2019.7/tutorials/metadata/)
3. In case of multiple sequencing runs, specific naming of samples is required, see [here](#--multipleSequencingRuns)

## Other input options
@@ -358,9 +358,9 @@ If you have trained a compatible classifier before, from sources such as [SILVA]
Please note the following requirements:

1. The path must be enclosed in quotes
2. The classifier is a Naive Bayes classifier produced by "qiime feature-classifier fit-classifier-naive-bayes" (e.g. by this pipeline or from [QIIME2 resources](https://docs.qiime2.org/2018.6/data-resources/))
2. The classifier is a Naive Bayes classifier produced by "qiime feature-classifier fit-classifier-naive-bayes" (e.g. by this pipeline or from [QIIME2 resources](https://docs.qiime2.org/2019.7/data-resources/))
3. The primer pair used for the amplicon PCR and for training the classifier must be exactly the same
4. The classifier has to be trained by the same version of scikit-learn as this version of the pipeline uses (0.19.1)
4. The classifier has to be trained by the same version of scikit-learn as this version of the pipeline uses (0.21.2)
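Requirement 4 can be checked before a custom classifier is passed to the pipeline. The following is a minimal sketch, assuming the environment that trained the classifier is still available. The helper names are hypothetical, the required version `0.21.2` is taken from the updated line above, and the exact string comparison is an assumption about how strict the match has to be.

```python
from importlib import metadata
from typing import Optional

# Version named in docs/usage.md after this change (assumed to require an exact match).
REQUIRED_SKLEARN = "0.21.2"


def installed_version(package: str = "scikit-learn") -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def classifier_compatible(trained_with: str, required: str = REQUIRED_SKLEARN) -> bool:
    """True only when the training version equals the required version exactly."""
    return trained_with.strip() == required


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("scikit-learn is not installed in this environment")
    else:
        print(found, "compatible" if classifier_compatible(found) else "incompatible")
```

Run inside the environment that trained the classifier, this reports whether its scikit-learn version matches the one this pipeline version expects.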

### `--classifier_removeHash`
