Change chromap arguments and more fixes #290

Merged
merged 23 commits into from
Aug 9, 2022
Commits
8a5f546
Fix path to igv files
JoseEspinosa Jul 25, 2022
ea4b0cf
Fix consensus peaks path for IGV
JoseEspinosa Jul 25, 2022
190939e
Fix consensus peaks path for IGV
JoseEspinosa Jul 27, 2022
f3f3a70
Switch --preset chip by its argument to not include the removal of du…
JoseEspinosa Jul 27, 2022
61ecb07
Add chromap back to ci tests
JoseEspinosa Jul 27, 2022
4a191ab
Suggested fix for macs
JoseEspinosa Jul 29, 2022
f5e46f6
Filter paired end data when using chromap
JoseEspinosa Aug 2, 2022
d013f66
Merge branch 'morefixes' of https://github.com/JoseEspinosa/chipseq i…
JoseEspinosa Aug 2, 2022
f57d8da
Fix indexes publication
JoseEspinosa Aug 3, 2022
17aac1d
Change deeptools by deepTools (results dir)
JoseEspinosa Aug 3, 2022
a28cd28
Add annotate_boolean_peaks process
JoseEspinosa Aug 3, 2022
53f2673
Emit output of annotate_boolean_peaks
JoseEspinosa Aug 4, 2022
0bcdb84
Fix publish folder for annotate_boolean_peaks
JoseEspinosa Aug 4, 2022
b627caa
Emit rds file form deseq2
JoseEspinosa Aug 4, 2022
2aff26a
Update docs
JoseEspinosa Aug 4, 2022
f845951
Update changelog
JoseEspinosa Aug 4, 2022
e309bc1
Fix indents
JoseEspinosa Aug 4, 2022
7a2b89f
Merge branch 'morefixes' of https://github.com/JoseEspinosa/chipseq i…
JoseEspinosa Aug 4, 2022
c31fb69
Make prettier happy
JoseEspinosa Aug 4, 2022
9c5c5ff
Update changelog
JoseEspinosa Aug 4, 2022
fd59088
Update readme and usage
JoseEspinosa Aug 4, 2022
1b91929
Add bytesize talk to readme
JoseEspinosa Aug 5, 2022
b87af07
Make prettier happy
JoseEspinosa Aug 5, 2022
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -76,7 +76,7 @@ jobs:
matrix:
aligner:
- "bowtie2"
# - "chromap"
- "chromap"
- "star"
steps:
- name: Check out pipeline code
11 changes: 8 additions & 3 deletions CHANGELOG.md
@@ -3,10 +3,12 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unpublished Version / DEV]
## [1.2.2] - 2022-08-22

### Enhancements & fixes

- Pipeline has been re-implemented in [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html)
- All software containers are now exclusively obtained from [Biocontainers](https://biocontainers.pro/#/registry)
- Updated pipeline template to [nf-core/tools 2.4.1](https://github.com/nf-core/tools/releases/tag/2.4.1)
- [[#128](https://github.com/nf-core/chipseq/issues/128)] - Filter files with no peaks to avoid errors in downstream processes
- [[#220](https://github.com/nf-core/chipseq/issues/220)] - Fix `phantompeakqualtools` protection stack overflow error
@@ -20,9 +22,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[228](https://github.com/nf-core/chipseq/issues/228)] - Update blacklist bed files.
- [nf-core/tools#1415](https://github.com/nf-core/tools/issues/1415) - Make `--outdir` a mandatory parameter
- [[282](https://github.com/nf-core/chipseq/issues/282)] - Fix `genome.fa` publication for IGV.
- [[280](https://github.com/nf-core/chipseq/issues/280)] - Update `macs_gsize` in `igenomes.config`, create a new `--read_length` parameter and implement the logic to calculate `--macs_gsize` when the parameter is missing.
- Eliminate `if`s conditions from `deseq2_qc` and `macs2_consensus` {local module and use `ext.when` instead.
- [[280](https://github.com/nf-core/chipseq/issues/280)] - Update `macs_gsize` in `igenomes.config`, create a new `--read_length` parameter and implement the logic to calculate `--macs_gsize` when the parameter is missing
- Eliminate `if`s conditions from `deseq2_qc` and `macs2_consensus` (local module and use `ext.when` instead)
- Remove `deseq2` differential binding analysis of consensus peaks.
- Filter paired-end files produced by `chromap` due to [this](https://github.com/nf-core/chipseq/issues/291) issue
- Remove `<ANTIBODY>` from the `macs2` consensus publish directory, since it cannot be referenced as input by the IGV process (`meta.id` is not resolved at execution time)
- Add bytesize link to readme.

### Parameters

24 changes: 16 additions & 8 deletions README.md
@@ -23,18 +23,28 @@ On release, automated continuous integration tests run the pipeline on a [full-s

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

## Online videos

A short talk about the history, current status and functionality on offer in this pipeline was given by Jose Espinosa-Carrasco ([@joseespinosa](https://github.com/joseespinosa)) on [26th July 2022](https://nf-co.re/events/2022/bytesize-chipseq) as part of the nf-core/bytesize series.

You can find numerous talks on the [nf-core events page](https://nf-co.re/events) covering various topics, including writing pipelines/modules in Nextflow DSL2, using nf-core tooling, running nf-core pipelines, as well as more generic content like contributing to GitHub. Please check them out!

## Pipeline summary

1. Raw read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
2. Adapter trimming ([`Trim Galore!`](https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/))
3. Alignment ([`BWA`](https://sourceforge.net/projects/bio-bwa/files/))
3. Choice of multiple aligners
1.([`BWA`](https://sourceforge.net/projects/bio-bwa/files/))
2.([`Chromap`](https://github.com/haowenz/chromap)). **For paired-end reads, this only works up to the mapping steps; see [here](https://github.com/nf-core/chipseq/issues/291)**
3.([`Bowtie2`](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml))
4.([`STAR`](https://github.com/alexdobin/STAR))
4. Mark duplicates ([`picard`](https://broadinstitute.github.io/picard/))
5. Merge alignments from multiple libraries of the same sample ([`picard`](https://broadinstitute.github.io/picard/))
1. Re-mark duplicates ([`picard`](https://broadinstitute.github.io/picard/))
2. Filtering to remove:
- reads mapping to blacklisted regions ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/), [`BEDTools`](https://github.com/arq5x/bedtools2/))
- reads that are marked as duplicates ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that arent marked as primary alignments ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that are not marked as primary alignments ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that are unmapped ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that map to multiple locations ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads containing > 4 mismatches ([`BAMTools`](https://github.com/pezmaster31/bamtools))
@@ -47,11 +57,11 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
5. Generate gene-body meta-profile from bigWig files ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotProfile.html))
6. Calculate genome-wide IP enrichment relative to control ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html))
7. Calculate strand cross-correlation peak and ChIP-seq quality measures including NSC and RSC ([`phantompeakqualtools`](https://github.com/kundajelab/phantompeakqualtools))
8. Call broad/narrow peaks ([`MACS2`](https://github.com/taoliu/MACS))
8. Call broad/narrow peaks ([`MACS2`](https://github.com/macs3-project/MACS))
9. Annotate peaks relative to gene features ([`HOMER`](http://homer.ucsd.edu/homer/download.html))
10. Create consensus peakset across all samples and create tabular file to aid in the filtering of the data ([`BEDTools`](https://github.com/arq5x/bedtools2/))
11. Count reads in consensus peaks ([`featureCounts`](http://bioinf.wehi.edu.au/featureCounts/))
12. Differential binding analysis, PCA and clustering ([`R`](https://www.r-project.org/), [`DESeq2`](https://bioconductor.org/packages/release/bioc/html/DESeq2.html))
12. PCA and clustering ([`R`](https://www.r-project.org/), [`DESeq2`](https://bioconductor.org/packages/release/bioc/html/DESeq2.html))
6. Create IGV session file containing bigWig tracks, peaks and differential sites for data visualisation ([`IGV`](https://software.broadinstitute.org/software/igv/)).
7. Present QC for raw read, alignment, peak-calling and differential binding results ([`MultiQC`](http://multiqc.info/), [`R`](https://www.r-project.org/))

@@ -63,7 +73,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool

3. Download the pipeline and test it on a minimal dataset with a single command:

```console
```bash
nextflow run nf-core/chipseq -profile test,YOURPROFILE --outdir <OUTDIR>
```

@@ -76,9 +86,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool

4. Start running your own analysis!

<!-- TODO nf-core: Update the example "typical command" below used to run the pipeline -->

```console
```bash
nextflow run nf-core/chipseq --input samplesheet.csv --outdir <OUTDIR> --genome GRCh37 -profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
```

75 changes: 46 additions & 29 deletions conf/modules.config
@@ -54,9 +54,15 @@ process {

withName: 'UNTAR_.*' {
ext.args2 = '--no-same-owner'
publishDir = [
path: { "${params.outdir}/genome/index" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.save_reference
]
}

withName: 'UNTAR_.*|BWA_INDEX|BOWTIE2_BUILD|STAR_GENOMEGENERATE' {
withName: 'BWA_INDEX|BOWTIE2_BUILD|STAR_GENOMEGENERATE' {
publishDir = [
path: { "${params.outdir}/genome/index" },
mode: params.publish_dir_mode,
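
The selector split in this hunk can be checked with a small Groovy sketch. It assumes that `withName` selectors are matched as regular expressions against the whole process name; the name `UNTAR_EXAMPLE` is hypothetical and only stands in for whatever `UNTAR_.*` matches in the pipeline.

```groovy
// Selector before the change: one block covered untar and index-building processes.
def oldSelector = 'UNTAR_.*|BWA_INDEX|BOWTIE2_BUILD|STAR_GENOMEGENERATE'

// Selectors after the change: untar processes get their own block.
def newSelectors = ['UNTAR_.*', 'BWA_INDEX|BOWTIE2_BUILD|STAR_GENOMEGENERATE']

// 'UNTAR_EXAMPLE' is a made-up process name used only for illustration.
def names = ['UNTAR_EXAMPLE', 'BWA_INDEX', 'BOWTIE2_BUILD', 'STAR_GENOMEGENERATE']

names.each { name ->
    // Every name was covered by the old combined selector ...
    assert name ==~ oldSelector
    // ... and is covered by exactly one of the new selectors.
    assert newSelectors.count { name ==~ it } == 1
}

// Number of new selectors matching each name.
def matchCounts = names.collect { name -> newSelectors.count { name ==~ it } }
println matchCounts
```

With the split, settings such as `enabled: params.save_reference` can be attached to the untar block without touching the index-building block.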
@@ -253,7 +259,7 @@ if (params.aligner == 'chromap') {
]
}
withName: CHROMAP_CHROMAP {
ext.args = '--preset chip --SAM'
ext.args = '-l 2000 --low-mem --SAM'
ext.prefix = { "${meta.id}.Lb" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/library" },
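
The commit message for this change says the preset was replaced by its arguments so that duplicate removal is not included. A minimal Groovy sketch of that derivation follows. The expansion of `--preset chip` used here is an assumption taken from the Chromap README and should be checked against the Chromap version in use.

```groovy
// Assumption: flags that `--preset chip` expands to, per the Chromap README.
def presetChip = ['-l 2000', '--remove-pcr-duplicates', '--low-mem', '--BED']

// Drop duplicate removal (duplicates are marked and filtered later in the
// pipeline) and the BED output format, then request SAM output instead.
def dropped = ['--remove-pcr-duplicates', '--BED']
def chromapArgs = (presetChip.findAll { !(it in dropped) } + ['--SAM']).join(' ')

// The result matches the new `ext.args` value in the hunk above.
assert chromapArgs == '-l 2000 --low-mem --SAM'
println chromapArgs
```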
@@ -466,7 +472,7 @@ if (!params.skip_plot_profile) {
ext.args = 'scale-regions --regionBodyLength 1000 --beforeRegionStartLength 3000 --afterRegionStartLength 3000 --skipZeros --smartLabels'
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotProfile" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotProfile" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -475,7 +481,7 @@ if (!params.skip_plot_profile) {
withName: 'DEEPTOOLS_PLOTPROFILE' {
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotProfile" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotProfile" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -484,7 +490,7 @@ if (!params.skip_plot_profile) {
withName: 'DEEPTOOLS_PLOTHEATMAP' {
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotProfile" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotProfile" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -508,7 +514,7 @@ if (!params.skip_plot_fingerprint) {
].join(' ').trim() }
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotFingerprint" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotFingerprint" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -520,16 +526,17 @@ process {
withName: 'MACS2_CALLPEAK' {
ext.args = [
'--keep-dup all',
params.narrow_peak ? '' : "--broad --broad-cutoff ${params.broad_cutoff}",
params.save_macs_pileup ? '--bdg --SPMR' : '',
params.macs_fdr ? "--qvalue ${params.macs_fdr}" : '',
params.macs_pvalue ? "--pvalue ${params.macs_pvalue}" : ''
params.narrow_peak ? '' : "--broad --broad-cutoff ${params.broad_cutoff}",
params.save_macs_pileup ? '--bdg --SPMR' : '',
params.macs_fdr ? "--qvalue ${params.macs_fdr}" : '',
params.macs_pvalue ? "--pvalue ${params.macs_pvalue}" : '',
params.aligner == "chromap" ? "--format BAM" : ''
].join(' ').trim()
publishDir = [
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
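
How the `MACS2_CALLPEAK` argument list collapses into one string can be sketched in plain Groovy. The parameter values below are hypothetical; in the pipeline they come from the `params` scope.

```groovy
// Hypothetical parameter values, for illustration only.
def params = [
    narrow_peak     : false,
    broad_cutoff    : 0.1,
    save_macs_pileup: false,
    macs_fdr        : null,
    macs_pvalue     : null,
    aligner         : 'chromap'
]

// Same pattern as in the config: each entry is either a flag string or ''.
def macsArgs = [
    '--keep-dup all',
    params.narrow_peak      ? '' : "--broad --broad-cutoff ${params.broad_cutoff}",
    params.save_macs_pileup ? '--bdg --SPMR' : '',
    params.macs_fdr         ? "--qvalue ${params.macs_fdr}" : '',
    params.macs_pvalue      ? "--pvalue ${params.macs_pvalue}" : '',
    params.aligner == 'chromap' ? '--format BAM' : ''
].join(' ').trim()

// Empty entries leave repeated spaces inside the string (trim() only strips
// the ends), so compare whitespace-separated tokens rather than the raw string.
assert macsArgs.tokenize() ==
    ['--keep-dup', 'all', '--broad', '--broad-cutoff', '0.1', '--format', 'BAM']
println macsArgs.tokenize().join(' ')
```

The repeated spaces are harmless on a command line. If a tidy string is wanted, filtering the list with `findAll { it }` before `join(' ')` is one option.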
@@ -542,7 +549,7 @@ process {
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/qc'
].join('') },
].join('') },
enabled: false
]
}
@@ -569,11 +576,25 @@ if (!params.skip_peak_annotation) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: 'ANNOTATE_BOOLEAN_PEAKS' {
ext.prefix = { "${meta.id}_peaks" }
publishDir = [
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]

}
}

if (!params.skip_peak_qc) {
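
The `publishDir` pattern used for `ANNOTATE_BOOLEAN_PEAKS` and the other peak processes can be exercised outside Nextflow. In this sketch the parameter values and the output file name are hypothetical.

```groovy
// Hypothetical parameter values, for illustration only.
def params = [outdir: 'results', aligner: 'bwa', narrow_peak: true]

// Path fragments are concatenated without a separator, so each fragment
// after the first carries its own leading '/'.
def publishPath = [
    "${params.outdir}/${params.aligner}/mergedLibrary/macs2",
    params.narrow_peak ? '/narrowPeak' : '/broadPeak',
    '/consensus'
].join('')

// Returning null from saveAs tells Nextflow not to publish that file.
def saveAs = { filename -> filename.equals('versions.yml') ? null : filename }

assert publishPath == 'results/bwa/mergedLibrary/macs2/narrowPeak/consensus'
assert saveAs('versions.yml') == null
assert saveAs('example_peaks.txt') == 'example_peaks.txt'   // made-up file name
println publishPath
```

Because the path no longer contains `${meta.id}`, it depends only on pipeline parameters, which is what the changelog entry about the IGV process relies on.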
@@ -598,7 +619,7 @@ if (!params.skip_peak_annotation) {
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/qc'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -616,9 +637,8 @@ if (!params.skip_consensus_peaks) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}"
].join('') },
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -630,9 +650,8 @@ if (!params.skip_consensus_peaks) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}"
].join('') },
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -648,9 +667,8 @@ if (!params.skip_consensus_peaks) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}"
].join('') },
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -673,9 +691,8 @@ if (!params.skip_consensus_peaks) {
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}",
'/deseq2'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -691,7 +708,7 @@ if (!params.skip_igv) {
path: { [
"${params.outdir}/igv",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -705,9 +722,9 @@ if (!params.skip_multiqc) {
ext.args = params.multiqc_title ? "--title \"$params.multiqc_title\"" : ''
publishDir = [
path: { [
"${params.outdir}/multiqc",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
"${params.outdir}/multiqc",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]