Deduplication for the ext.args of the STAR module #934

Merged

MatthiasZepper

Potential for parameter clashes

The --extra_star_align_args parameter, added in version 3.10 of the pipeline (#907), introduced potential parameter clashes in the STAR module. If the same parameter is specified in both params.extra_star_align_args and ext.args, both copies are retained in the module's command line and STAR aborts:

Error executing process > 'NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN_IGENOMES (S101)'


Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN_IGENOMES (S101)` terminated with an error exit status (102)

Command executed:

  STAR \
      --genomeDir star \
      --readFilesIn S101_trimmed.fq.gz  \
      --runThreadN 12 \
      --outFileNamePrefix S101. \
       \
      --sjdbGTFfile GRCh37_ERCC92.gtf \
      --outSAMattrRGline ID:S101 'SM:S101'   \
      --quantMode TranscriptomeSAM --twopassMode Basic --outSAMtype BAM Unsorted --readFilesCommand zcat --runRNGseed 0 --outFilterMultimapNmax 20 --alignSJDBoverhangMin 1 --outSAMattributes NH HI AS NM MD --quantTranscriptomeBan Singleend --outSAMstrandField intronMotif  --alignIntronMax 1000000 --alignIntronMin 20 --alignMatesGapMax 1000000 --alignSJoverhangMin 8 --outFilterMismatchNmax 999 --outFilterMultimapNmax 20 --outFilterType BySJout --outFilterMismatchNoverLmax 0.1 --clip3pAdapterSeq AAAAAAAA 
      
  EXITING: FATAL INPUT ERROR: duplicate parameter "outFilterMultimapNmax" in input "Command-Line"
  SOLUTION: keep only one definition of input parameters in each input source

Approach

For this error to occur, a user must provide a STAR parameter to --extra_star_align_args that has already been specified in the modules.config or a custom config.

rnaseq/conf/modules.config

Lines 534 to 549 in 6e1e448

process {
    withName: '.*:ALIGN_STAR:STAR_ALIGN|.*:ALIGN_STAR:STAR_ALIGN_IGENOMES' {
        ext.args = [
            '--quantMode TranscriptomeSAM',
            '--twopassMode Basic',
            '--outSAMtype BAM Unsorted',
            '--readFilesCommand zcat',
            '--runRNGseed 0',
            '--outFilterMultimapNmax 20',
            '--alignSJDBoverhangMin 1',
            '--outSAMattributes NH HI AS NM MD',
            '--quantTranscriptomeBan Singleend',
            '--outSAMstrandField intronMotif',
            params.save_unaligned ? '--outReadsUnmapped Fastx' : '',
            params.extra_star_align_args ?: ''
        ].join(' ').trim()
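
To illustrate how the clash arises, here is a minimal sketch that mimics the config excerpt in plain Groovy. The `params` map is only a stand-in for the Nextflow params object, the argument list is shortened, and `--alignIntronMax 1000000` is just an example of a user-supplied value; none of this is the pipeline's actual code.

```groovy
// Stand-in for Nextflow's params object (assumption, for illustration only).
def params = [
    save_unaligned       : false,
    extra_star_align_args: '--outFilterMultimapNmax 20 --alignIntronMax 1000000'
]

// Same construction as in the config excerpt, with a shortened argument list.
def args = [
    '--outFilterMultimapNmax 20',
    '--alignSJDBoverhangMin 1',
    params.save_unaligned ? '--outReadsUnmapped Fastx' : '',
    params.extra_star_align_args ?: ''
].join(' ').trim()

// Count how often the flag appears in the final argument string.
def occurrences = args.findAll(/--outFilterMultimapNmax\b/).size()

println args
println "occurrences of --outFilterMultimapNmax: ${occurrences}"
```

Because the user-supplied string is simply appended, the flag ends up in the argument string twice, which is the condition STAR rejects in the log above.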

Now, the union of params.extra_star_align_args and ext.args is passed to STAR as its arguments.

$$ \text{ext.args} \cup \text{extra\_star\_align\_args}$$

Identical duplicates are removed silently when needed.

Minimal reproducible example

Minimal reproducible example for the Groovy Web Console:

def extra_star_align_args =  '--outSAMstrandField intronMotif --outFilterMultimapNmax 50 --outFilterMultimapNmax 20'  
def extargs   = ['--quantMode TranscriptomeSAM','--outFilterMultimapNmax 20','--outSAMstrandField intronMotif', extra_star_align_args.split("\\s(?=--)") ?: ''].flatten().unique(false).join(' ').trim()  
System.out.println(extargs)
  1. .split("\\s(?=--)") splits the value of params.extra_star_align_args at each whitespace character that is followed by two dashes. The dashes are retained because (?=) is a zero-width positive lookahead and does not consume them.

  2. The resulting nested list is flattened with .flatten(), and .unique(false) returns a new list containing the union without modifying the original.
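
The two steps above can be traced one at a time. The following sketch reuses the values from the minimal reproducible example; the intermediate variable names (`extraList`, `nested`, `flat`, `deduped`) are introduced here for illustration only.

```groovy
def extra = '--outSAMstrandField intronMotif --outFilterMultimapNmax 50 --outFilterMultimapNmax 20'

// Step 1: split at whitespace followed by "--"; the lookahead keeps the dashes.
def extraList = extra.split("\\s(?=--)") as List
assert extraList == [
    '--outSAMstrandField intronMotif',
    '--outFilterMultimapNmax 50',
    '--outFilterMultimapNmax 20'
]

// Step 2a: the split result sits as a nested element inside the outer list.
def nested = [
    '--quantMode TranscriptomeSAM',
    '--outFilterMultimapNmax 20',
    '--outSAMstrandField intronMotif',
    extraList
]
def flat = nested.flatten()
assert flat.size() == 6

// Step 2b: unique(false) returns a new list and keeps the first occurrence of each entry.
def deduped = flat.unique(false)
assert deduped == [
    '--quantMode TranscriptomeSAM',
    '--outFilterMultimapNmax 20',
    '--outSAMstrandField intronMotif',
    '--outFilterMultimapNmax 50'
]
assert flat.size() == 6   // the flattened list itself is left untouched

println deduped.join(' ')
```

Note that `--outFilterMultimapNmax 50` survives, since only entries that are identical as whole strings are treated as duplicates.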

Limitations

This approach does not resolve a duplicated parameter whose specifications differ.

Providing --outFilterMultimapNmax 40 to --extra_star_align_args would still cause a parameter clash, because the config sets the value to 20 and the two entries are therefore not identical.

I presume this is the desired behaviour, since --extra_star_align_args is likely intended as a convenience parameter that should never take precedence over the module configuration.
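
As a rough illustration of this limitation, the sketch below shows that an entry with a different value survives the deduplication, and how such leftovers could be detected by grouping entries on their flag name. The grouping check is hypothetical and not part of this PR.

```groovy
def configArgs = ['--outFilterMultimapNmax 20', '--outSAMstrandField intronMotif']
def extra      = '--outFilterMultimapNmax 40 --outSAMstrandField intronMotif'

// Same deduplication as in the minimal reproducible example.
def merged = [configArgs, extra.split("\\s(?=--)")].flatten().unique(false)

// Hypothetical check: group entries by flag name (first token) and keep
// only flags that still occur more than once after deduplication.
def clashes = merged
    .groupBy { it.tokenize(' ')[0] }
    .findAll { flag, entries -> entries.size() > 1 }

println merged.join(' ')
println "clashing flags: ${clashes.keySet()}"
```

Here the identical `--outSAMstrandField intronMotif` entry is collapsed, while `--outFilterMultimapNmax` remains present with both 20 and 40, so STAR would still report a duplicate parameter.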

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@MatthiasZepper MatthiasZepper added the bug Something isn't working label Feb 1, 2023
@MatthiasZepper MatthiasZepper self-assigned this Feb 1, 2023
github-actions bot commented Feb 1, 2023

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 595ca66

+| ✅ 153 tests passed       |+
#| ❔   3 tests were ignored |#
!| ❗   2 tests had warnings |!

❗ Test warnings:

  • nextflow_config - Config manifest.version should end in dev: '3.10.1'
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your prefered methods description, e.g. add publication citation for this pipeline

❔ Tests ignored:

  • files_unchanged - File ignored due to lint config: assets/email_template.html
  • files_unchanged - File ignored due to lint config: assets/email_template.txt
  • files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy

✅ Tests passed:

Run details

  • nf-core/tools version 2.7.2
  • Run at 2023-02-08 12:59:59


@drpatelh drpatelh left a comment

Awesome! Thanks @MatthiasZepper
