From 0f4af2c50c533413178a9f77668fd288d44fa44e Mon Sep 17 00:00:00 2001
From: JoseEspinosa
Date: Wed, 19 Apr 2023 12:07:02 +0200
Subject: [PATCH 1/5] Fix samplesheet control column in docs examples

---
 CHANGELOG.md  |  1 +
 docs/usage.md | 38 +++++++++++++++++++-------------------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index b6d20d60d..88152e18f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [[#311](https://github.com/nf-core/chipseq/issues/311)] Add back `--skip_spp` parameter which was unintentionally removed from the code.
 - Install available nf-core subworkflows and refactor code accordingly
 - [[#318](https://github.com/nf-core/chipseq/issues/318)] Update `bowtie2/align` module to fix issue when downloading its singularity image.
+- [[#320](https://github.com/nf-core/chipseq/issues/320)] Fix samplesheet control column in documentation examples.
 
 ## [[2.0.0](https://github.com/nf-core/chipseq/releases/tag/2.0.0)] - 2022-10-03
 
diff --git a/docs/usage.md b/docs/usage.md
index 0bdce0c37..74143398f 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -18,11 +18,11 @@ The `sample` identifiers have to be the same when you have re-sequenced the same
 
 ```console
 sample,fastq_1,fastq_2,antibody,control
-WT_BCATENIN_IP_REP1,BLA203A1_S27_L006_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-WT_BCATENIN_IP_REP2,BLA203A25_S16_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-WT_BCATENIN_IP_REP2,BLA203A25_S16_L002_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-WT_BCATENIN_IP_REP2,BLA203A25_S16_L003_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-WT_BCATENIN_IP_REP3,BLA203A49_S40_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT
+WT_BCATENIN_IP_REP1,BLA203A1_S27_L006_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP1
+WT_BCATENIN_IP_REP2,BLA203A25_S16_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP2
+WT_BCATENIN_IP_REP2,BLA203A25_S16_L002_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP2
+WT_BCATENIN_IP_REP2,BLA203A25_S16_L003_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP2
+WT_BCATENIN_IP_REP3,BLA203A49_S40_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP3
 WT_INPUT_REP1,BLA203A6_S32_L006_R1_001.fastq.gz,,,
 WT_INPUT_REP2,BLA203A30_S21_L001_R1_001.fastq.gz,,,
 WT_INPUT_REP2,BLA203A30_S21_L002_R1_001.fastq.gz,,,
@@ -41,20 +41,20 @@ A final design file may look something like the one below. This is for two antib
 
 ```console
 sample,fastq_1,fastq_2,antibody,control
-WT_BCATENIN_IP_REP1,BLA203A1_S27_L006_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-WT_BCATENIN_IP_REP2,BLA203A25_S16_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-WT_BCATENIN_IP_REP2,BLA203A25_S16_L002_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-WT_BCATENIN_IP_REP3,BLA203A49_S40_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT
-NAIVE_BCATENIN_IP_REP1,BLA203A7_S60_L001_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT
-NAIVE_BCATENIN_IP_REP2,BLA203A43_S34_L001_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT
-NAIVE_BCATENIN_IP_REP2,BLA203A43_S34_L002_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT
-NAIVE_BCATENIN_IP_REP3,BLA203A64_S55_L001_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT
-WT_TCF4_IP_REP1,BLA203A3_S29_L006_R1_001.fastq.gz,,TCF4,WT_INPUT
-WT_TCF4_IP_REP2,BLA203A27_S18_L001_R1_001.fastq.gz,,TCF4,WT_INPUT
-WT_TCF4_IP_REP3,BLA203A51_S42_L001_R1_001.fastq.gz,,TCF4,WT_INPUT
-NAIVE_TCF4_IP_REP1,BLA203A9_S62_L001_R1_001.fastq.gz,,TCF4,NAIVE_INPUT
-NAIVE_TCF4_IP_REP2,BLA203A45_S36_L001_R1_001.fastq.gz,,TCF4,NAIVE_INPUT
-NAIVE_TCF4_IP_REP3,BLA203A66_S57_L001_R1_001.fastq.gz,,TCF4,NAIVE_INPUT
+WT_BCATENIN_IP_REP1,BLA203A1_S27_L006_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP1
+WT_BCATENIN_IP_REP2,BLA203A25_S16_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP2
+WT_BCATENIN_IP_REP2,BLA203A25_S16_L002_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP2
+WT_BCATENIN_IP_REP3,BLA203A49_S40_L001_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP3
+NAIVE_BCATENIN_IP_REP1,BLA203A7_S60_L001_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT_REP1
+NAIVE_BCATENIN_IP_REP2,BLA203A43_S34_L001_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT_REP2
+NAIVE_BCATENIN_IP_REP2,BLA203A43_S34_L002_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT_REP2
+NAIVE_BCATENIN_IP_REP3,BLA203A64_S55_L001_R1_001.fastq.gz,,BCATENIN,NAIVE_INPUT_REP3
+WT_TCF4_IP_REP1,BLA203A3_S29_L006_R1_001.fastq.gz,,TCF4,WT_INPUT_REP1
+WT_TCF4_IP_REP2,BLA203A27_S18_L001_R1_001.fastq.gz,,TCF4,WT_INPUT_REP2
+WT_TCF4_IP_REP3,BLA203A51_S42_L001_R1_001.fastq.gz,,TCF4,WT_INPUT_REP3
+NAIVE_TCF4_IP_REP1,BLA203A9_S62_L001_R1_001.fastq.gz,,TCF4,NAIVE_INPUT_REP1
+NAIVE_TCF4_IP_REP2,BLA203A45_S36_L001_R1_001.fastq.gz,,TCF4,NAIVE_INPUT_REP2
+NAIVE_TCF4_IP_REP3,BLA203A66_S57_L001_R1_001.fastq.gz,,TCF4,NAIVE_INPUT_REP3
 WT_INPUT_REP1,BLA203A6_S32_L006_R1_001.fastq.gz,,,
 WT_INPUT_REP2,BLA203A30_S21_L001_R1_001.fastq.gz,,,
 WT_INPUT_REP3,BLA203A31_S21_L003_R1_001.fastq.gz,,,

From 683428a6760b192cd46edd35879fb0cbb6260125 Mon Sep 17 00:00:00 2001
From: JoseEspinosa
Date: Wed, 19 Apr 2023 12:07:57 +0200
Subject: [PATCH 2/5] Try to clarify read_length docs

---
 CHANGELOG.md         | 1 +
 nextflow_schema.json | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 88152e18f..b0266c0d1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Install available nf-core subworkflows and refactor code accordingly
 - [[#318](https://github.com/nf-core/chipseq/issues/318)] Update `bowtie2/align` module to fix issue when downloading its singularity image.
 - [[#320](https://github.com/nf-core/chipseq/issues/320)] Fix samplesheet control column in documentation examples.
+- [[#328](https://github.com/nf-core/chipseq/issues/328)] Modify documentation to clarify that is necessary to provide the `--read_length` when `--genome` is set and `--macs_gsize` has not provided.
 
 ## [[2.0.0](https://github.com/nf-core/chipseq/releases/tag/2.0.0)] - 2022-10-03
 
diff --git a/nextflow_schema.json b/nextflow_schema.json
index 77a094e3d..7cadb7c4d 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -37,6 +37,7 @@
     "type": "integer",
     "description": "Read length used to calculate MACS2 genome size for peak calling if `--macs_gsize` isn't provided.",
     "fa_icon": "fas fa-chart-area",
+    "help_text": "Read length is used to calculate MACS2 genome size using the following [approach](https://deeptools.readthedocs.io/en/develop/content/feature/effectiveGenomeSize.html#effective-genome-size). For all the genomes present in the `igenomes.config` the genome size has been precomputed and the read length is then used to retrieve the corresponding value",
     "enum": [50, 75, 100, 150, 200]
 },
 "outdir": {
@@ -132,7 +133,7 @@
 "macs_gsize": {
     "type": "number",
     "description": "Effective genome size parameter required by MACS2.",
-    "help_text": "[Effective genome size](https://github.com/taoliu/MACS#-g--gsize) parameter required by MACS2. If using an iGenomes reference these have been provided when `--genome` is set as *GRCh37*, *GRCh38*, *GRCm38*, *WBcel235*, *BDGP6*, *R64-1-1*, *EF2*, *hg38*, *hg19* and *mm10*. For other genomes, if this parameter is not specified then the MACS2 peak-calling and differential analysis will be skipped.",
+    "help_text": "[Effective genome size](https://github.com/taoliu/MACS#-g--gsize) parameter required by MACS2. If using an iGenomes reference these have been provided for any of the genomes available in the igenomes.config, and for the following read lengths (50,75,100,150,200) that should be set using the `--read_length` parameter. For other genomes, if this parameter is not specified then the MACS2 peak-calling and differential analysis will be skipped.",
     "fa_icon": "fas fa-arrows-alt-h"
 },
 "blacklist": {

From 58bd1f6bc3dd4f1440abfaacc419d4d30a0e8dbd Mon Sep 17 00:00:00 2001
From: JoseEspinosa
Date: Wed, 19 Apr 2023 22:20:33 +0200
Subject: [PATCH 3/5] Fix macs_gsize help text

---
 nextflow_schema.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/nextflow_schema.json b/nextflow_schema.json
index 7cadb7c4d..3e876f042 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -133,7 +133,7 @@
 "macs_gsize": {
     "type": "number",
     "description": "Effective genome size parameter required by MACS2.",
-    "help_text": "[Effective genome size](https://github.com/taoliu/MACS#-g--gsize) parameter required by MACS2. If using an iGenomes reference these have been provided for any of the genomes available in the igenomes.config, and for the following read lengths (50,75,100,150,200) that should be set using the `--read_length` parameter. For other genomes, if this parameter is not specified then the MACS2 peak-calling and differential analysis will be skipped.",
+    "help_text": "[Effective genome size](https://github.com/taoliu/MACS#-g--gsize) parameter required by MACS2. If using an iGenomes reference these have been provided for any of the genomes available in the igenomes.config, and for the following read lengths (50,75,100,150,200) that should be set using the `--read_length` parameter. For other genomes, if this parameter is not specified it will be inferred using the provided `--read_length` or otherwise the pipeline execution will stop with an error.",
     "fa_icon": "fas fa-arrows-alt-h"
 },
 "blacklist": {

From f4cbf2a53ebff87395ec41d9d01b63e5d15b7b64 Mon Sep 17 00:00:00 2001
From: JoseEspinosa
Date: Wed, 19 Apr 2023 22:25:17 +0200
Subject: [PATCH 4/5] Refine read_length help text

---
 nextflow_schema.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/nextflow_schema.json b/nextflow_schema.json
index 3e876f042..ee2b1ba8a 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -37,7 +37,7 @@
     "type": "integer",
     "description": "Read length used to calculate MACS2 genome size for peak calling if `--macs_gsize` isn't provided.",
     "fa_icon": "fas fa-chart-area",
-    "help_text": "Read length is used to calculate MACS2 genome size using the following [approach](https://deeptools.readthedocs.io/en/develop/content/feature/effectiveGenomeSize.html#effective-genome-size). For all the genomes present in the `igenomes.config` the genome size has been precomputed and the read length is then used to retrieve the corresponding value",
+    "help_text": "Read length together with the genome fasta are used to calculate MACS2 genome size using the `khmer` program as explained [here](https://deeptools.readthedocs.io/en/develop/content/feature/effectiveGenomeSize.html#effective-genome-size). For all the genomes present in the `igenomes.config` the genome size has been already precomputed and the read length is then used to retrieve the corresponding value",
     "enum": [50, 75, 100, 150, 200]
 },
 "outdir": {

From 23efb256ff8145c1fc22f2fe819bbbd15df98d0c Mon Sep 17 00:00:00 2001
From: JoseEspinosa
Date: Wed, 19 Apr 2023 22:27:44 +0200
Subject: [PATCH 5/5] Update readme since the steps below mapping can be performed when chromap is used with PE reads

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 232ef8b8a..b6c474ab4 100644
--- a/README.md
+++ b/README.md
@@ -32,7 +32,7 @@ You can find numerous talks on the [nf-core events page](https://nf-co.re/events
 2. Adapter trimming ([`Trim Galore!`](https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/))
 3. Choice of multiple aligners
    1.([`BWA`](https://sourceforge.net/projects/bio-bwa/files/))
-   2.([`Chromap`](https://github.com/haowenz/chromap)). **For paired-end reads only working until mapping steps, see [here](https://github.com/nf-core/chipseq/issues/291)**
+   2.([`Chromap`](https://github.com/haowenz/chromap))
    3.([`Bowtie2`](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml))
    4.([`STAR`](https://github.com/alexdobin/STAR))
 4. Mark duplicates ([`picard`](https://broadinstitute.github.io/picard/))
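
Reviewer note on PATCH 1/5: the corrected examples suggest that each non-empty `control` value is expected to match a `sample` identifier exactly (for example `WT_INPUT_REP1` rather than `WT_INPUT`). The sketch below is not part of this series or of the pipeline; `find_unmatched_controls` is a hypothetical helper, written under that assumption, that flags control values with no matching sample row.

```python
import csv
import io


def find_unmatched_controls(samplesheet_text):
    """Return sorted (sample, control) pairs whose control is not a sample name.

    Assumption: a non-empty `control` must equal some `sample` value exactly.
    """
    rows = list(csv.DictReader(io.StringIO(samplesheet_text)))
    samples = {row["sample"] for row in rows}
    return sorted(
        {
            (row["sample"], row["control"])
            for row in rows
            if row["control"] and row["control"] not in samples
        }
    )


# Trimmed-down versions of the docs example before and after PATCH 1/5.
OLD_EXAMPLE = """sample,fastq_1,fastq_2,antibody,control
WT_BCATENIN_IP_REP1,BLA203A1_S27_L006_R1_001.fastq.gz,,BCATENIN,WT_INPUT
WT_INPUT_REP1,BLA203A6_S32_L006_R1_001.fastq.gz,,,
"""

NEW_EXAMPLE = """sample,fastq_1,fastq_2,antibody,control
WT_BCATENIN_IP_REP1,BLA203A1_S27_L006_R1_001.fastq.gz,,BCATENIN,WT_INPUT_REP1
WT_INPUT_REP1,BLA203A6_S32_L006_R1_001.fastq.gz,,,
"""

print(find_unmatched_controls(OLD_EXAMPLE))  # [('WT_BCATENIN_IP_REP1', 'WT_INPUT')]
print(find_unmatched_controls(NEW_EXAMPLE))  # []
```

Returning the offending pairs, rather than a boolean, makes it easier to point at the exact rows of a samplesheet that need the replicate suffix.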