Merge pull request #97 from d4straub/dev

Update QIIME2 2018.6 to 2019.7
nf-core · Dec 5, 2019 · d3221ab
2 parents 095e7ab + 8ab3512

Showing 12 changed files with 409 additions and 322 deletions.
diff --git a/.travis.yml b/.travis.yml
@@ -13,7 +13,7 @@ before_install:
   # Pull the docker image first so the test doesn't wait for this
   - docker pull nfcore/ampliseq:dev
   # Fake the tag locally so that the pipeline runs properly
-  - docker tag nfcore/ampliseq:dev nfcore/ampliseq:1.0.0
+  - docker tag nfcore/ampliseq:dev nfcore/ampliseq:dev
 
 install:
   # Install Nextflow

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,13 @@
 # nf-core/ampliseq
 
+## nf-core/ampliseq version 1.1.1
+
+#### Pipeline updates
+* Update from QIIME2 v2018.6 to v2019.7, including DADA2 v1.6 to DADA2 v1.10
+
+#### Bug fixes
+* [#78](https://github.com/nf-core/ampliseq/issues/78) - All sequenced classifed to the same species
+
 ## nf-core/ampliseq version 1.1.0 "Silver Lime Bee" - 2019
 
 #### Pipeline updates

diff --git a/Dockerfile b/Dockerfile
@@ -2,7 +2,7 @@ FROM nfcore/base
 LABEL description="Docker image containing all requirements for nf-core/ampliseq pipeline"
 COPY environment.yml /
 RUN conda env create -f /environment.yml && conda clean -a
-ENV PATH /opt/conda/envs/nf-core-ampliseq-1.0.0/bin:$PATH
+ENV PATH /opt/conda/envs/nf-core-ampliseq-dev/bin:$PATH
 ## Required to build the container properly
 RUN mkdir -p /root/.config/matplotlib
 RUN echo "backend : Agg" > /root/.config/matplotlib/matplotlibrc

diff --git a/README.md b/README.md
@@ -1,7 +1,6 @@
 # ![nf-core/ampliseq](docs/images/nfcore-ampliseq_logo.png)
 
-[![Build Status](https://travis-ci.org/nf-core/ampliseq.svg?branch=master)](https://travis-ci.org/nf-core/ampliseq)
-[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A518.10.1-brightgreen.svg)](https://www.nextflow.io/)
+[![Build Status](https://travis-ci.com/nf-core/ampliseq.svg?branch=master)](https://travis-ci.com/nf-core/ampliseq)[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A518.10.1-brightgreen.svg)](https://www.nextflow.io/)
 
 [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](http://bioconda.github.io/)
 [![Docker](https://img.shields.io/docker/automated/nfcore/ampliseq.svg)](https://hub.docker.com/r/nfcore/ampliseq)
@@ -12,7 +11,7 @@ https://img.shields.io/badge/singularity-available-7E4C74.svg)
 
 **nfcore/ampliseq** is a bioinformatics analysis pipeline used for 16S rRNA amplicon sequencing data.
 
-The workflow processes raw data from FastQ inputs ([FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)), trims primer sequences from the reads ([Cutadapt](https://journal.embnet.org/index.php/embnetjournal/article/view/200)), imports data into [QIIME2](https://qiime2.org/), generates amplicon sequencing variants (ASV, [DADA2](https://www.nature.com/articles/nmeth.3869)), classifies features against the [SILVA](https://www.arb-silva.de/) [v132](https://www.arb-silva.de/documentation/release-132/) database, excludes unwanted taxa, produces absolute and relative feature/taxa count tables and plots, plots alpha rarefaction curves, computes alpha and beta diversity indices and plots thereof, and finally calls differentially abundant taxa ([ANCOM](https://www.ncbi.nlm.nih.gov/pubmed/26028277)). See the [output documentation](docs/output.md) for more details of the results.
+The workflow processes raw data from FastQ inputs ([FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)), trims primer sequences from the reads ([Cutadapt](https://journal.embnet.org/index.php/embnetjournal/article/view/200)), imports data into [QIIME2](https://www.nature.com/articles/s41587-019-0209-9), generates amplicon sequencing variants (ASV, [DADA2](https://www.nature.com/articles/nmeth.3869)), classifies features against the [SILVA](https://www.arb-silva.de/) [v132](https://www.arb-silva.de/documentation/release-132/) database, excludes unwanted taxa, produces absolute and relative feature/taxa count tables and plots, plots alpha rarefaction curves, computes alpha and beta diversity indices and plots thereof, and finally calls differentially abundant taxa ([ANCOM](https://www.ncbi.nlm.nih.gov/pubmed/26028277)). See the [output documentation](docs/output.md) for more details of the results.
 
 The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with docker / singularity containers making installation trivial and results highly reproducible.
 

diff --git a/conf/base.config b/conf/base.config
@@ -64,6 +64,9 @@ process {
     memory = { check_max (16.GB * task.attempt, 'memory' ) }
     time = { check_max (2.h * task.attempt, 'time' ) }
   }
+  withName: alpha_diversity {
+    errorStrategy = { task.exitStatus in [143,137] ? 'retry' : 'ignore' }
+  }
 }
 
 params {
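
Note on the `alpha_diversity` hunk above: the added closure retries the task only when it exits with status 143 or 137, and ignores any other failure so the rest of the run can continue. By shell convention these statuses are 128 + 15 (SIGTERM) and 128 + 9 (SIGKILL), which is typically what a job reports when a scheduler or the out-of-memory killer stops it. The Python sketch below only illustrates that decision; the name `error_strategy` is hypothetical and not part of the pipeline.

```python
# Signal numbers as defined on Linux; written as literals so the sketch
# does not depend on the platform's `signal` module.
SIGKILL = 9
SIGTERM = 15

# A process killed by signal N conventionally reports exit status 128 + N.
RETRYABLE_STATUSES = {128 + SIGTERM, 128 + SIGKILL}  # {143, 137}


def error_strategy(exit_status: int) -> str:
    """Mirror the ternary in the config closure (hypothetical helper).

    Returns 'retry' for the two kill/terminate statuses and 'ignore'
    for every other exit status.
    """
    return "retry" if exit_status in RETRYABLE_STATUSES else "ignore"


if __name__ == "__main__":
    for status in (137, 143, 1):
        print(status, error_strategy(status))
```

Retrying only on these two statuses pairs with the `task.attempt` multipliers used elsewhere in `conf/base.config`: a retried task requests more memory and time, while failures unrelated to resources are not repeated.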

diff --git a/conf/test.config b/conf/test.config
@@ -17,7 +17,7 @@ params {
   // Input data
   FW_primer = "GTGYCAGCMGCCGCGGTAA"
   RV_primer = "GGACTACNVGGGTWTCTAAT"
-  classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-classifier.qza"
+  classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-gg_13_8-85-qiime2_2019.7-classifier.qza"
   metadata = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/Metadata.tsv"
   outdir = "./results"
   temp_dir = "./results/tmp_dir"

diff --git a/conf/test_multi.config b/conf/test_multi.config
@@ -17,7 +17,7 @@ params {
   // Input data
   FW_primer = "GTGYCAGCMGCCGCGGTAA"
   RV_primer = "GGACTACNVGGGTWTCTAAT"
-  classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-classifier.qza"
+  classifier = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/GTGYCAGCMGCCGCGGTAA-GGACTACNVGGGTWTCTAAT-gg_13_8-85-qiime2_2019.7-classifier.qza"
   metadata = "https://github.com/nf-core/test-datasets/raw/ampliseq/testdata/Metadata_multi.tsv"
   outdir = "./results"
   temp_dir = "./results/tmp_dir"

diff --git a/docs/output.md b/docs/output.md
@@ -56,7 +56,7 @@ The pipeline has special steps which allow the software versions used to be repo
 For more information about how to use MultiQC reports, see [http://multiqc.info](http://multiqc.info)
 
 ### QIIME2
-**Quantitative Insights Into Microbial Ecology 2** ([QIIME2](https://qiime2.org/)) is a next-generation microbiome bioinformatics platform and the successor of the widely used [QIIME1](https://www.nature.com/articles/nmeth.f.303). QIIME2 is currently **under heavy development** and often updated, this version of ampliseq uses QIIME2 2018.6. QIIME2 has a wide variety of analysis tools available and has excellent support in its [forum](https://docs.qiime2.org/2018.6/).
+**Quantitative Insights Into Microbial Ecology 2** ([QIIME2](https://qiime2.org/)) is a next-generation microbiome bioinformatics platform and the successor of the widely used [QIIME1](https://www.nature.com/articles/nmeth.f.303). QIIME2 is currently **under heavy development** and often updated, this version of ampliseq uses QIIME2 2019.7. QIIME2 has a wide variety of analysis tools available and has excellent support in its [forum](https://docs.qiime2.org/2019.7/).
 
 At this point of the analysis the trimmed reads are imported into QIIME2 and an interactive quality plot is made.
 
@@ -205,15 +205,15 @@ ANCOM is applied to each suitable or specified metadata column for 6 taxonomic l
   * taxonomic level: level-2 (phylum), level-3 (class), level-4 (order), level-5 (family), level-6 (genus), ASV
 
 ## More help
-QIIME2 is currently **under heavy development** and often updated, this version of ampliseq uses QIIME2 2018.6. QIIME2 has excellent support in its [forum](https://docs.qiime2.org/2018.6/).
+QIIME2 is currently **under heavy development** and often updated, this version of ampliseq uses QIIME2 2019.7. QIIME2 has excellent support in its [forum](https://docs.qiime2.org/2019.7/).
 
 ## Citations
 All tools inside the pipeline have to be cited in a publication properly:
 
 * FastQC, "Andrews, Simon. "FastQC: a quality control tool for high throughput sequence data." (2010)."
 * Cutadapt "Martin, Marcel. "Cutadapt removes adapter sequences from high-throughput sequencing reads." EMBnet. journal 17.1 (2011): pp-10."
 * MultiQC, "Ewels, Philip, et al. "MultiQC: summarize analysis results for multiple tools and samples in a single report." Bioinformatics 32.19 (2016): 3047-3048."
-* QIIME2, "Bolyen, Evan, et al. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. No. e27295v1. PeerJ Preprints, 2018."
+* QIIME2, "Bolyen, Evan, et al. "Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2." Nature Biotechnology 37 (2019): 852–857."
 * DADA2, "Callahan, Benjamin J., et al. "DADA2: high-resolution sample inference from Illumina amplicon data." Nature methods 13.7 (2016): 581."
 * Matplotlib, "Hunter, John D. "Matplotlib: A 2D graphics environment." Computing in science & engineering 9.3 (2007): 90-95."
 * Feature-classifier, "Bokulich, Kaehler, et al. "Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin." Microbiome 6 (2018): 90.

diff --git a/docs/usage.md b/docs/usage.md
@@ -184,7 +184,7 @@ This is optional, but for performing downstream analysis such as barplots, diver
 Please note the following requirements:
 
 1. The path must be enclosed in quotes
-2. The metadata file has to follow the [QIIME2 specifications](https://docs.qiime2.org/2018.6/tutorials/metadata/)
+2. The metadata file has to follow the [QIIME2 specifications](https://docs.qiime2.org/2019.7/tutorials/metadata/)
 3. In case of multiple sequencing runs, specific naming of samples are required, see [here](#--multipleSequencingRuns)
 
 ## Other input options
@@ -358,9 +358,9 @@ If you have trained a compatible classifier before, from sources such as [SILVA]
 Please note the following requirements:
 
 1. The path must be enclosed in quotes
-2. The cassifier is a Naive Bayes classifier produced by "qiime feature-classifier fit-classifier-naive-bayes" (e.g. by this pipeline or from [QIIME2 resources](https://docs.qiime2.org/2018.6/data-resources/))
+2. The cassifier is a Naive Bayes classifier produced by "qiime feature-classifier fit-classifier-naive-bayes" (e.g. by this pipeline or from [QIIME2 resources](https://docs.qiime2.org/2019.7/data-resources/))
 3. The primer pair for the amplicon PCR and the computing of the classifier are exactly the same
-4. The classifier has to be trained by the same version of scikit-learn as this version of the pipeline uses (0.19.1)
+4. The classifier has to be trained by the same version of scikit-learn as this version of the pipeline uses (0.21.2)
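
Note on requirement 4 above: a pre-trained classifier is only usable when the scikit-learn version it was trained with matches the one shipped with the pipeline (0.21.2 after this change, 0.19.1 before). A minimal Python sketch of such a check follows, assuming the training version is known to the user as a plain string; the helper names are hypothetical and this is not how the pipeline or QIIME2 performs the check internally.

```python
from importlib import metadata
from typing import Optional

# scikit-learn version named in the updated requirement.
PIPELINE_SKLEARN_VERSION = "0.21.2"


def installed_sklearn_version() -> Optional[str]:
    """Return the installed scikit-learn version, or None if it is absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


def classifier_is_compatible(trained_with: str,
                             required: str = PIPELINE_SKLEARN_VERSION) -> bool:
    """True only when the training version equals the required version exactly."""
    return trained_with.strip() == required.strip()


if __name__ == "__main__":
    print("installed scikit-learn:", installed_sklearn_version())
    # A classifier trained under the previous release (0.19.1) no longer matches.
    print(classifier_is_compatible("0.19.1"))
    print(classifier_is_compatible("0.21.2"))
```

The comparison is deliberately exact rather than "greater or equal", since the requirement asks for the same version, not a minimum one.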
 
 ### `--classifier_removeHash`