Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved ext.args consolidation for STAR and TRIMGALORE #1248

Merged
merged 5 commits into from
Apr 18, 2024

Conversation

MatthiasZepper
Copy link
Member

@MatthiasZepper MatthiasZepper commented Mar 8, 2024

Occasionally, people were surprised that parameters provided to extra_star_align_args could still conflict with the default configuration (e.g. #1046).

As a convenience parameter, I felt that it should never take precedence over the actual module configuration, because it seemed more error-prone to silently overwrite the module config with something that (as the parameter name suggests) is meant to be extra. Hence, I preferred that a custom module config should be used.

However, I said that I would be open to reconsider and since now also @tdanhorn and @CharlotteAnne suggested that change, I gave in. This PR now improves the extra_args parameter consolidation so it can override presets from the config.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • CHANGELOG.md is updated.

Copy link

github-actions bot commented Mar 8, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 6ab2e44

+| ✅ 167 tests passed       |+
#| ❔   9 tests were ignored |#
!| ❗   7 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • files_exist - File not found: .github/workflows/awstest.yml
  • files_exist - File not found: .github/workflows/awsfulltest.yml
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

  • files_exist - File is ignored: conf/modules.config
  • nextflow_config - Config default ignored: params.ribo_database_manifest
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/email_template.html
  • files_unchanged - File ignored due to lint config: assets/email_template.txt
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore or pyproject.toml
  • actions_ci - actions_ci
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/rnaseq/rnaseq/.github/workflows/awstest.yml
  • multiqc_config - 'assets/multiqc_config.yml' not found

✅ Tests passed:

Run details

  • nf-core/tools version 2.13.1
  • Run at 2024-04-17 10:26:52

@MatthiasZepper
Copy link
Member Author

I did manually test the change with the test profile and providing extra_star_align_args = '--outFilterMultimapNmax 50', but it felt excessive to write a separate nf-test only for this one parameter and wasn't sure if I can muddle it through with an existing test without derailing those?

Screen Shot 2024-03-08 at 21 59 49

@tdanhorn
Copy link

Thank your for doing this, @MatthiasZepper. I think it follows the principle of "least surprise". Don't know how easy it would be, but if you wanted to avoid people inadvertently overwriting preset options, you could test for that and issue a warning. I don't consider it strictly necessary, though -- people who use this option should know what they are doing.

params.save_unaligned ? '--outReadsUnmapped Fastx' : '',
params.extra_star_align_args ? params.extra_star_align_args.split("\\s(?=--)") : ''
].flatten().unique(false).join(' ').trim() }
ext.args = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Principle makes sense.

Would it make more sense to factor out maps and merge from there? Like the following (untested).

ext.args = {
    // Function to convert argument strings into a map
    def argsToMap(String args) {
        args.split("\\s(?=--)").collectEntries {
            def (key, value) = it.trim().split(/\s+/, 2)
            [(key): value]
        }
    }

    // Initialize the map with preset arguments
    def preset_args_map = argsToMap('''
        --quantMode TranscriptomeSAM
        --twopassMode Basic
        --outSAMtype BAM Unsorted
        --readFilesCommand zcat
        --runRNGseed 0
        --outFilterMultimapNmax 20
        --alignSJDBoverhangMin 1
        --outSAMattributes NH HI AS NM MD
        --quantTranscriptomeBan Singleend
        --outSAMstrandField intronMotif
        ${params.save_unaligned ? '--outReadsUnmapped Fastx' : ''}
    '''.trim())

    // Convert extra arguments into a map
    def extra_args_map = params.extra_star_align_args ? argsToMap(params.extra_star_align_args) : [:]

    // Merge maps: extra arguments override preset arguments
    def final_args_map = preset_args_map + extra_args_map

    // Convert the map back to a list and then to a single string
    final_args_map.collect { key, value -> "${key} ${value}" }.join(' ').trim()
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way is fine with me.

Since I do know neither Java nor Groovy (why is Nextflow not pythonic :-/ ?!?) I was just happy that I could nudge the AI into coming up with something that worked. But it is true, using a map is an elegant solution, because it avoids repeatedly looping over the preset arguments for each extra argument.

@MatthiasZepper
Copy link
Member Author

Seems the tests are failing. Before I try to troubleshoot code in a language I do not know... do you, Jonathan, right away know where the problem is?

Copy link
Member

@pinin4fjords pinin4fjords left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my bad. These should do it.

But otherwise just flip back to your original, this is only a marginal improvement

subworkflows/local/align_star/nextflow.config Outdated Show resolved Hide resolved
Co-authored-by: Jonathan Manning <jonathan.manning@seqera.io>
@MatthiasZepper
Copy link
Member Author

Sorry, my bad. These should do it.

That looks much better! Thank you very much!

There are two tests failing, but to me, it doesn't seem that this is anyway related to this PR.

@MatthiasZepper MatthiasZepper merged commit dcd5659 into nf-core:dev Apr 18, 2024
184 checks passed
@MatthiasZepper MatthiasZepper deleted the ExtraArgsConsolidation branch April 18, 2024 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants