add VSEARCH cluster #622

a4000 · 2023-08-18T07:20:21Z

Addresses issue: #609
It's not LULU, but I figured VSEARCH cluster would be easier to add for ASV post-clustering because there is already an nf-core module (with biocontainer and bioconda).

I've added a test profile, but I haven't added a .test.snap file yet because I'm not sure what tool I should be using to get the md5 value.

PR checklist

This comment contains a description of changes (with reason).
If you've fixed a bug or added code that should be tested, add tests!
If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
Make sure your code lints (nf-core lint).
Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
Output Documentation in docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).

github-actions · 2023-08-18T07:22:39Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 8ec859e

✅ 152 tests passed
❔ 3 tests were ignored
❗ 2 tests had warnings

❗ Test warnings:

readme - README did not have a Nextflow minimum version badge.
schema_lint - Parameter input is not defined in the correct subschema (input_output_options)

❔ Tests ignored:

files_exist - File is ignored: conf/igenomes.config
files_unchanged - File ignored due to lint config: .gitattributes
actions_ci - actions_ci

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-ampliseq_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-ampliseq_logo_light.png
files_exist - File found: docs/images/nf-core-ampliseq_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: lib/WorkflowAmpliseq.groovy
files_exist - File found: modules.json
files_exist - File found: pyproject.toml
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-ampliseq_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: 2.7.0dev
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-ampliseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-ampliseq_logo_light.png matches the template
files_unchanged - docs/images/nf-core-ampliseq_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - lib/NfcoreTemplate.groovy matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
files_unchanged - pyproject.toml matches the template
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Zenodo placeholder was replaced with DOI.
pipeline_todos - No TODO strings found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (246 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: branch.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains a matching 'report_comment'.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'

Run details

nf-core/tools version 2.9
Run at 2023-08-25 01:08:03

d4straub

Looks great!
Could you update the CHANGELOG as well?
I'm not sure myself when the clustering should take place: directly after DADA2 ASV generation, or after the filters. Do you know of any advantages or disadvantages of either order?
About the md5sums, any program should do. But the right way to do it (I haven't done it myself yet) should be the one explained in Slack, i.e. run `nf-test test --updateSnapshot` in the cloned pipeline folder after installing https://github.com/askimed/nf-test

conf/test_vsearchcluster.config

nextflow_schema.json

docs/output.md

workflows/ampliseq.nf

conf/test_vsearchcluster.config

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

d4straub

Thanks for re-ordering! I still found a few places where the code order could be improved, though.
I also tested your branch and it seems fine to me, except that it collects ASVs per sample as `FILTER_CLUSTERS.out.stats` instead of reads per sample. Please use read count stats in `results/overall_summary.tsv` (see comment below). Edit: sorry, I think the numbers should be identical, but it is still questionable whether ASV counts and read counts should be mixed.
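
To make the two quantities concrete, here is a small Python sketch on a toy count table. The table, the sample names, and both function names are hypothetical and only illustrate the definitions; they are not taken from the pipeline or from the actual `FILTER_CLUSTERS` stats file.

```python
# Toy ASV count table (hypothetical data): rows are ASVs, columns are samples.
asv_table = {
    "ASV_1": {"sampleA": 120, "sampleB": 0},
    "ASV_2": {"sampleA": 30, "sampleB": 45},
    "ASV_3": {"sampleA": 0, "sampleB": 5},
}


def reads_per_sample(table):
    """Sum the read counts of all ASVs for each sample."""
    totals = {}
    for counts in table.values():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n
    return totals


def asvs_per_sample(table):
    """Count how many ASVs have at least one read in each sample."""
    present = {}
    for counts in table.values():
        for sample, n in counts.items():
            present[sample] = present.get(sample, 0) + (1 if n > 0 else 0)
    return present


print(reads_per_sample(asv_table))  # {'sampleA': 150, 'sampleB': 50}
print(asvs_per_sample(asv_table))   # {'sampleA': 2, 'sampleB': 2}
```

A summary table whose other columns are read counts would, under this reading, take the first of the two quantities.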

docs/output.md

README.md

nextflow.config

workflows/ampliseq.nf

nextflow_schema.json

workflows/ampliseq.nf

d4straub

Great, thanks!
Just one more small comment, but I'm approving already.

tests/pipeline/test.nf.test

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

a4000 and others added 3 commits August 18, 2023 06:48

post clustering with VSEARCH

252f573

fixed syntax mistake

e7be358

Merge branch 'nf-core:dev' into dev

33b866a

a4000 requested a review from d4straub August 18, 2023 07:20

d4straub reviewed Aug 18, 2023

d4straub changed the title ~~Dev~~ add VSEARCH cluster Aug 18, 2023

a4000 and others added 5 commits August 19, 2023 20:36

Merge branch 'nf-core:dev' into dev

74467fb

Update conf/test_vsearchcluster.config

2b2132e

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

Update nextflow_schema.json

8a96ce6

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

followed suggestions made in pull request comments

dbaf504

modified to pass lint test

142470f

d4straub reviewed Aug 22, 2023

a4000 added 2 commits August 23, 2023 00:55

changes based on pull request comments

2d5afad

changed md5sum for overall_summary.tsv

3cf764e

d4straub approved these changes Aug 24, 2023

tests/pipeline/test.nf.test (outdated, resolved)

a4000 and others added 2 commits August 25, 2023 08:16

Update tests/pipeline/test.nf.test

3d9e904

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

added match vsearch_cluster

8ec859e

a4000 merged commit d86c0b1 into nf-core:dev Aug 25, 2023
16 checks passed
