Reorganise arguments for clearer syntax #1091

adamrtalbot · 2023-10-10T08:54:16Z

Changes:

Grouped arguments into sections based on what they do
Reordered slightly to go in chronological order of the pipeline and setup
I think it's clearer for a new user
Note that no actual text has changed, just the order. Git thinks this is a full rewrite.
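
Since Git reports the reorder as a full rewrite, one way to check the "only the order changed" claim is to compare the parameter definitions themselves. The following is a minimal sketch, assuming the schema keeps its parameter groups under a top-level `definitions` key, each with a `properties` mapping. The helper names and the two tiny schemas are hypothetical and not taken from the PR.

```python
def collect_params(schema: dict) -> dict:
    """Map each parameter name to its definition, ignoring group and order."""
    # Assumption: parameters live in top-level "properties" and/or in
    # "definitions" -> <group> -> "properties".
    params = dict(schema.get("properties", {}))
    for group in schema.get("definitions", {}).values():
        params.update(group.get("properties", {}))
    return params


def same_params(old_schema: dict, new_schema: dict) -> bool:
    """True when both schemas define identical parameters, however grouped."""
    return collect_params(old_schema) == collect_params(new_schema)


# Hypothetical schemas: same parameters, different grouping and order.
old = {"definitions": {"alignment_options": {"properties": {
    "aligner": {"type": "string", "default": "star_salmon"},
    "save_unaligned": {"type": "boolean"},
}}}}
new = {"definitions": {
    "optional_outputs": {"properties": {"save_unaligned": {"type": "boolean"}}},
    "alignment_options": {"properties": {
        "aligner": {"type": "string", "default": "star_salmon"}}},
}}

print(same_params(old, new))
```

To run this against real files, the two versions of the schema could be loaded with `json.load` and passed to `same_params`. Dictionary comparison in Python ignores key order, so a pure regrouping compares equal while any edited description or default does not.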

Fixes nf-core#1090

github-actions · 2023-10-10T08:56:18Z

`nf-core lint` overall result: Failed ❌

Posted for pipeline commit dd45b04

✅ 141 tests passed
❔ 6 tests were ignored
❗ 3 tests had warnings
❌ 3 tests failed

❌ Test failures:

files_unchanged - CODE_OF_CONDUCT.md does not match the template
files_unchanged - .github/CONTRIBUTING.md does not match the template
files_unchanged - .github/workflows/linting.yml does not match the template

❗ Test warnings:

files_exist - File not found: .github/workflows/awstest.yml
files_exist - File not found: .github/workflows/awsfulltest.yml
pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline

❔ Tests ignored:

files_unchanged - File ignored due to lint config: assets/email_template.html
files_unchanged - File ignored due to lint config: assets/email_template.txt
files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy
files_unchanged - File ignored due to lint config: .gitignore or .prettierignore or pyproject.toml
actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/rnaseq/rnaseq/.github/workflows/awstest.yml
multiqc_config - multiqc_config

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-rnaseq_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-rnaseq_logo_light.png
files_exist - File found: docs/images/nf-core-rnaseq_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: lib/WorkflowRnaseq.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-rnaseq_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 3.13.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-rnaseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-rnaseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-rnaseq_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
readme - README Zenodo placeholder was replaced with DOI.
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (237 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: cloud_tests_small.yml
actions_schema_validation - Workflow validation passed: cloud_tests_full.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

Run details

nf-core/tools version 2.10
Run at 2023-10-11 19:17:23
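
Each result in the lint report above has the shape `test_name - message`, so the results can be tallied per test. This is a minimal sketch under that assumption. The function name `tally_lint_lines` is hypothetical and is not part of nf-core/tools. The input lines are the three failures copied from the report.

```python
from collections import Counter


def tally_lint_lines(lines):
    """Count results per lint test from lines shaped 'test_name - message'."""
    counts = Counter()
    for line in lines:
        # Split on the first ' - ' only, since messages may contain dashes.
        name, sep, _message = line.partition(" - ")
        if sep:  # skip lines that do not follow the 'name - message' shape
            counts[name.strip()] += 1
    return counts


failures = [
    "files_unchanged - CODE_OF_CONDUCT.md does not match the template",
    "files_unchanged - .github/CONTRIBUTING.md does not match the template",
    "files_unchanged - .github/workflows/linting.yml does not match the template",
]
print(tally_lint_lines(failures))
```

Tallied this way, all three failures fall under the single `files_unchanged` test.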

adamrtalbot · 2023-10-10T11:13:32Z

Blocked until #1078 or #1088 is merged.

adamrtalbot · 2023-10-10T11:17:24Z

Here are the new docs from `--help`:

nextflow run . -profile test,docker  --help --validationShowHiddenParams
N E X T F L O W  ~  version 23.09.2-edge
Launching `./main.nf` [naughty_joliot] DSL2 - revision: 37b7a7de80


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/rnaseq v3.13.0dev
------------------------------------------------------
Typical pipeline command:

  nextflow run nf-core/rnaseq --input samplesheet.csv --genome GRCh37 -profile docker

Input/output options
  --input                            [string]  Path to comma-separated file containing information about the samples in the experiment.
  --outdir                           [string]  The output directory where the results will be saved. You have to use absolute paths to storage on Cloud 
                                               infrastructure. 
  --email                            [string]  Email address for completion summary.
  --multiqc_title                    [string]  MultiQC report title. Printed as page header, used for filename if not otherwise specified.

Reference genome options
  --genome                           [string]  Name of iGenomes reference.
  --fasta                            [string]  Path to FASTA genome file.
  --gtf                              [string]  Path to GTF annotation file.
  --gff                              [string]  Path to GFF3 annotation file.
  --gene_bed                         [string]  Path to BED file containing gene intervals. This will be created from the GTF file if not specified.
  --transcript_fasta                 [string]  Path to FASTA transcriptome file.
  --additional_fasta                 [string]  FASTA file to concatenate to genome FASTA file e.g. containing spike-in sequences.
  --splicesites                      [string]  Splice sites file required for HISAT2.
  --star_index                       [string]  Path to directory or tar.gz archive for pre-built STAR index.
  --hisat2_index                     [string]  Path to directory or tar.gz archive for pre-built HISAT2 index.
  --rsem_index                       [string]  Path to directory or tar.gz archive for pre-built RSEM index.
  --salmon_index                     [string]  Path to directory or tar.gz archive for pre-built Salmon index.
  --hisat2_build_memory              [string]  Minimum memory required to use splice sites and exons in the HiSAT2 index build process. [default: 
                                               200.GB] 
  --gencode                          [boolean] Specify if your GTF annotation is in GENCODE format.
  --gtf_extra_attributes             [string]  By default, the pipeline uses the `gene_name` field to obtain additional gene identifiers from the input GTF file 
                                               when running Salmon. [default: gene_name] 
  --gtf_group_features               [string]  Define the attribute type used to group features in the GTF file when running Salmon. [default: gene_id]
  --featurecounts_group_type         [string]  The attribute type used to group feature types in the GTF file when generating the biotype plot with 
                                               featureCounts. [default: gene_biotype] 
  --featurecounts_feature_type       [string]  By default, the pipeline assigns reads based on the 'exon' attribute within the GTF file. [default: exon]
  --igenomes_base                    [string]  Directory / URL base for iGenomes references. [default: s3://ngi-igenomes/igenomes]
  --igenomes_ignore                  [boolean] Do not load the iGenomes reference config.

Read trimming options
  --trimmer                          [string]  Specifies the trimming tool to use - available options are 'trimgalore' and 'fastp'. (accepted: trimgalore, 
                                               fastp) [default: trimgalore] 
  --extra_trimgalore_args            [string]  Extra arguments to pass to Trim Galore! command in addition to defaults defined by the pipeline.
  --extra_fastp_args                 [string]  Extra arguments to pass to fastp command in addition to defaults defined by the pipeline.
  --min_trimmed_reads                [integer] Minimum number of trimmed reads below which samples are removed from further processing. Some downstream steps in 
                                               the pipeline will fail if this threshold is too low. [default: 10000] 

Read filtering options
  --bbsplit_fasta_list               [string]  Path to comma-separated file containing a list of reference genomes to filter reads against with BBSplit. You 
                                               have to also explicitly set `--skip_bbsplit false` if you want to use BBSplit. 
  --bbsplit_index                    [string]  Path to directory or tar.gz archive for pre-built BBSplit index.
  --remove_ribo_rna                  [boolean] Enable the removal of reads derived from ribosomal RNA using SortMeRNA.
  --ribo_database_manifest           [string]  Text file containing paths to fasta files (one per line) that will be used to create the database for 
                                               SortMeRNA. [default: ${projectDir}/assets/rrna-db-defaults.txt] 

UMI options
  --with_umi                         [boolean] Enable UMI-based read deduplication.
  --umitools_extract_method          [string]  UMI pattern to use. Can be either 'string' (default) or 'regex'. [default: string]
  --umitools_bc_pattern              [string]  The UMI barcode pattern to use e.g. 'NNNNNN' indicates that the first 6 nucleotides of the read are from the 
                                               UMI. 
  --umitools_bc_pattern2             [string]  The UMI barcode pattern to use if the UMI is located in read 2.
  --umi_discard_read                 [integer] After UMI barcode extraction discard either R1 or R2 by setting this parameter to 1 or 2, respectively.
  --umitools_umi_separator           [string]  The character that separates the UMI in the read name. Most likely a colon if you skipped the extraction with 
                                               UMI-tools and used other software. 
  --umitools_grouping_method         [string]  Method to use to determine read groups by subsuming those with similar UMIs. All methods start by identifying the 
                                               reads with the same mapping position, but treat similar yet nonidentical UMIs differently. (accepted: unique, 
                                               percentile, cluster, adjacency, directional) [default: directional] 
  --umitools_dedup_stats             [boolean] Generate output stats when running "umi_tools dedup".

Alignment options
  --aligner                          [string]  Specifies the alignment algorithm to use - available options are 'star_salmon', 'star_rsem' and 'hisat2'. 
                                               (accepted: star_salmon, star_rsem, hisat2) [default: star_salmon] 
  --pseudo_aligner                   [string]  Specifies the pseudo aligner to use - available options are 'salmon'. Runs in addition to '--aligner'. 
                                               (accepted: salmon) 
  --bam_csi_index                    [boolean] Create a CSI index for BAM files instead of the traditional BAI index. This will be required for genomes with 
                                               larger chromosome sizes. 
  --star_ignore_sjdbgtf              [boolean] When using pre-built STAR indices do not re-extract and use splice junctions from the GTF file.
  --salmon_quant_libtype             [string]   Override Salmon library type inferred based on strandedness defined in meta object.
  --min_mapped_reads                 [number]  Minimum percentage of uniquely mapped reads below which samples are removed from further processing. 
                                               [default: 5] 
  --seq_center                       [string]  Sequencing center information to be added to read group of BAM files.
  --stringtie_ignore_gtf             [boolean] Perform reference-guided de novo assembly of transcripts using StringTie i.e. dont restrict to those in GTF 
                                               file. 
  --extra_star_align_args            [string]  Extra arguments to pass to STAR alignment command in addition to defaults defined by the pipeline. Only available 
                                               for the STAR-Salmon route. 
  --extra_salmon_quant_args          [string]  Extra arguments to pass to Salmon quant command in addition to defaults defined by the pipeline.

Optional outputs
  --save_merged_fastq                [boolean] Save FastQ files after merging re-sequenced libraries in the results directory.
  --save_umi_intermeds               [boolean] If this option is specified, intermediate FastQ and BAM files produced by UMI-tools are also saved in the results 
                                               directory. 
  --save_non_ribo_reads              [boolean] If this option is specified, intermediate FastQ files containing non-rRNA reads will be saved in the results 
                                               directory. 
  --save_bbsplit_reads               [boolean] If this option is specified, FastQ files split by reference will be saved in the results directory.
  --save_reference                   [boolean] If generated by the pipeline save the STAR index in the results directory.
  --save_trimmed                     [boolean] Save the trimmed FastQ files in the results directory.
  --save_align_intermeds             [boolean] Save the intermediate BAM files from the alignment step.
  --save_unaligned                   [boolean] Where possible, save unaligned reads from either STAR, HISAT2 or Salmon to the results directory.

Quality Control
  --deseq2_vst                       [boolean] Use vst transformation instead of rlog with DESeq2. [default: true]
  --rseqc_modules                    [string]  Specify the RSeQC modules to run. [default: 
                                               bam_stat,inner_distance,infer_experiment,junction_annotation,junction_saturation,read_distribution,read_duplication] 

Process skipping options
  --skip_bbsplit                     [boolean] Skip BBSplit for removal of non-reference genome reads. [default: true]
  --skip_umi_extract                 [boolean] Skip the UMI extraction from the read in case the UMIs have been moved to the headers in advance of the pipeline 
                                               run. 
  --skip_trimming                    [boolean] Skip the adapter trimming step.
  --skip_alignment                   [boolean] Skip all of the alignment-based processes within the pipeline.
  --skip_pseudo_alignment            [boolean] Skip all of the pseudo-alignment-based processes within the pipeline.
  --skip_markduplicates              [boolean] Skip picard MarkDuplicates step.
  --skip_bigwig                      [boolean] Skip bigWig file creation.
  --skip_stringtie                   [boolean] Skip StringTie.
  --skip_fastqc                      [boolean] Skip FastQC.
  --skip_preseq                      [boolean] Skip Preseq. [default: true]
  --skip_dupradar                    [boolean] Skip dupRadar.
  --skip_qualimap                    [boolean] Skip Qualimap.
  --skip_rseqc                       [boolean] Skip RSeQC.
  --skip_biotype_qc                  [boolean] Skip additional featureCounts process for biotype QC.
  --skip_deseq2_qc                   [boolean] Skip DESeq2 PCA and heatmap plotting.
  --skip_multiqc                     [boolean] Skip MultiQC.
  --skip_qc                          [boolean] Skip all QC steps except for MultiQC.

Institutional config options
  --custom_config_version            [string]  Git commit id for Institutional configs. [default: master]
  --custom_config_base               [string]  Base directory for Institutional configs. [default: 
                                               https://raw.githubusercontent.com/nf-core/configs/master] 
  --config_profile_name              [string]  Institutional config name.
  --config_profile_description       [string]  Institutional config description.
  --config_profile_contact           [string]  Institutional config contact information.
  --config_profile_url               [string]  Institutional config URL link.
  --test_data_base                   [string]  Base path / URL for data used in the test profiles [default: 
                                               https://raw.githubusercontent.com/nf-core/test-datasets/rnaseq3] 

Max job request options
  --max_cpus                         [integer] Maximum number of CPUs that can be requested for any single job. [default: 16]
  --max_memory                       [string]  Maximum amount of memory that can be requested for any single job. [default: 128.GB]
  --max_time                         [string]  Maximum amount of time that can be requested for any single job. [default: 240.h]

Generic options
  --help                             [boolean] Display help text.
  --version                          [boolean] Display version and exit.
  --publish_dir_mode                 [string]  Method used to save pipeline results to output directory. (accepted: symlink, rellink, link, copy, 
                                               copyNoFollow, move) [default: copy] 
  --email_on_fail                    [string]  Email address for completion summary, only when pipeline fails.
  --plaintext_email                  [boolean] Send plain-text email instead of HTML.
  --max_multiqc_email_size           [string]  File size limit when attaching MultiQC reports to summary emails. [default: 25.MB]
  --monochrome_logs                  [boolean] Do not use coloured log outputs.
  --hook_url                         [string]  Incoming hook URL for messaging service
  --multiqc_config                   [string]  Custom config file to supply to MultiQC.
  --multiqc_logo                     [string]  Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
  --multiqc_methods_description      [string]  Custom MultiQC yaml file containing HTML including a methods description.
  --validate_params                  [boolean] Boolean whether to validate parameters against the schema at runtime [default: true]
  --validationShowHiddenParams       [boolean] Show all params when using `--help`
  --validationFailUnrecognisedParams [boolean] Validation of parameters fails when an unrecognised parameter is found.
  --validationLenientMode            [boolean] Validation of parameters in lenient more.

------------------------------------------------------
If you use nf-core/rnaseq for your analysis please cite:

* The pipeline
  https://doi.org/10.5281/zenodo.1400710

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/rnaseq/blob/master/CITATIONS.md
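
To review the new grouping without reading the whole listing, the help text can be reduced to a mapping from section title to parameter names. This is a rough sketch that assumes the layout shown above: unindented section titles, followed by indented `--name [type]` lines, with wrapped description lines indented further. The function name `group_help_params` is hypothetical, and the sample is a short excerpt of the output above.

```python
import re

# An indented line that starts with --name followed by a [type] tag.
PARAM_RE = re.compile(r"^\s+--(\w+)\s+\[\w+\]")


def group_help_params(help_text):
    """Return {section title: [parameter names]} from --help style text."""
    sections = {}
    current = None
    for line in help_text.splitlines():
        match = PARAM_RE.match(line)
        if match and current is not None:
            sections[current].append(match.group(1))
        elif line and not line[0].isspace() and not line.startswith("-"):
            # Unindented, non-rule line: treat it as a candidate section title.
            current = line.strip()
            sections[current] = []
    # Drop candidates that turned out to hold no parameters (banners, notes).
    return {title: names for title, names in sections.items() if names}


sample = """\
Optional outputs
  --save_trimmed                     [boolean] Save the trimmed FastQ files in the results directory.
  --save_unaligned                   [boolean] Where possible, save unaligned reads from either STAR, HISAT2 or Salmon to the results directory.

Quality Control
  --deseq2_vst                       [boolean] Use vst transformation instead of rlog with DESeq2. [default: true]
"""
print(group_help_params(sample))
```

Wrapped description lines and the dashed rules are ignored, so only the section titles and the parameters listed under them remain.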

drpatelh

LGTM!

Reorganise arguments for clearer syntax

b6f2d69

Changes:
- Grouped arguments into sections based on _what they do_
- Reordered slightly to go in chronological order of the pipeline and set up
- I think it's clearer for a new user

Fixes nf-core#1090

drpatelh added 2 commits October 11, 2023 21:14

Update CHANGELOG.md

dd45b04

Merge branch 'dev' into 1090_tidy_cli_args

7f3f928

drpatelh approved these changes Oct 11, 2023

drpatelh merged commit ae2dea7 into nf-core:dev Oct 11, 2023
6 of 26 checks passed

drpatelh mentioned this pull request Oct 15, 2023

Tidy up 'skipping tools' and 'alignment' options #1090

Closed
