
Reorganise pipeline tests into flat structure #1280

Merged · 3 commits · Apr 10, 2024
Conversation

@adamrtalbot (Contributor) commented Apr 10, 2024

Changes:

  • Move all test files from tests/tests/ to test/*.nf.test
  • Makes the structure flatter and discovery easier, at the cost of looking untidier.
  • To be discussed in the PR!

The key question is: how do we want to organise the pipeline-level tests for pipelines with lots of tests?

Answers below please...

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@adamrtalbot adamrtalbot added this to the nf-test milestone Apr 10, 2024
@maxulysse (Member) left a comment


Loving it

github-actions bot commented Apr 10, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit c259f30

  • ✅ 167 tests passed
  • ❔ 9 tests were ignored
  • ❗ 7 tests had warnings

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • files_exist - File not found: .github/workflows/awstest.yml
  • files_exist - File not found: .github/workflows/awsfulltest.yml
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

  • files_exist - File is ignored: conf/modules.config
  • nextflow_config - Config default ignored: params.ribo_database_manifest
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/email_template.html
  • files_unchanged - File ignored due to lint config: assets/email_template.txt
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore or pyproject.toml
  • actions_ci - actions_ci
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/rnaseq/rnaseq/.github/workflows/awstest.yml
  • multiqc_config - 'assets/multiqc_config.yml' not found

✅ Tests passed:

Run details

  • nf-core/tools version 2.13.1
  • Run at 2024-04-10 13:56:56

@pinin4fjords (Member) left a comment


Looks fine, but missing most of the snaps, no?

@adamrtalbot (Contributor, Author) replied:

> Looks fine, but missing most of the snaps, no?

There are no snaps.

You're welcome to write snaps for the pipeline tests, I gave up after about 10 hours.

@MatthiasZepper (Member) commented:

> how do we want to organise the pipeline level tests for pipelines with lots of tests?

Without knowing what has been discussed before... how many different tests do you expect a pipeline to have?

Since all modules and subworkflows should already have been tested, most unit tests are done (except for local ones), and the total number of tests should be no more than a dozen or so? And do you intend to stick with having one file per test?

BTW: Don't you want to get rid of the hardcoded https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/samplesheet/v3.10/samplesheet_test.csv in the process?

@adamrtalbot (Contributor, Author) replied:

> Without knowing what has been discussed before...how many different tests do you expect a pipeline will have?

For most pipelines, 3-10. For rnaseq, 10-20; for Sarek, 200+.

> Since all modules and subworkflows should already have been tested, most unit tests are done (except for local ones) and the total number of tests should not be more than a dozen or so? And do you intend to stick with having one file per test?

Yep, I've removed a few tests on the basis that the subworkflow covers the use case. The main reason for a pipeline-level test is a big parameter like --skip_qc that is used throughout the pipeline.

> BTW: Don't you want to get rid of the hardcoded https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/samplesheet/v3.10/samplesheet_test.csv in the process?

I can't actually remember why we didn't do this in the first place 🤔

@adamrtalbot (Contributor, Author) commented:

OK, so params.pipelines_testdata_base_path doesn't work in a params block:

params {
    outdir = "$outputDir"
    input  = "${params.pipelines_testdata_base_path}/csv/samplesheet_test.csv"
}

But it turns out we can just remove it, and the input is retrieved from -profile test instead.
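A minimal sketch of what the resulting pipeline-level test might look like (the file name, test name, and tag are illustrative, not necessarily what this PR uses); with no input in the params block, the default from -profile test applies:

```groovy
// tests/default.nf.test -- illustrative file name
nextflow_pipeline {

    name "Test pipeline with default settings"
    script "../main.nf"

    test("Runs with -profile test defaults") {

        when {
            params {
                // 'input' is deliberately omitted: -profile test supplies it,
                // so no hardcoded test-datasets URL is needed here
                outdir = "$outputDir"
            }
        }

        then {
            assert workflow.success
        }
    }
}
```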

@MatthiasZepper (Member) replied:

> > Without knowing what has been discussed before...how many different tests do you expect a pipeline will have?
>
> For most pipelines 3-10. For rnaseq 10-20, for Sarek 200+

That number for Sarek sounds excessive, but it would clearly argue against your proposed flat structure, then? At least if we are talking about that many files in a single folder... one might also not want to run each of them every time, so some sort of easy subset mechanism would be desirable.

I don't know about nf-test, but with pytest -k / -m one can, for example, easily select particular tests. If nf-test has nothing of that kind, subfolders could provide a convenient way to run selected tests only?

> Yep, I've removed a few tests on the basis that the subworkflow covers the use case. The main reason for a pipeline level test is a big parameter like --skip_qc which is used throughout the pipeline.

Appreciated, and I agree that integration tests are paramount! Yet a single workflow.success is not a very specific test condition. Is it planned to extend the tests later with more precise conditions, or is this what you wasted the 10 hours on specifically?

@edmundmiller (Contributor) commented:

I like it! Not sure if there will be too many snaps one day, but I think we can cross that bridge when we get there.

> You're welcome to write snaps for the pipeline tests, I gave up after about 10 hours.

Yeah, this is how long hic took me as well.

I'm wondering if we make a compromise and just snapshot the featurecounts output or something? It's far enough downstream that any changes will get caught.

I also find myself copying the same list of things to snapshot and assert across files. I wonder if we could drop them in tests/lib/pipeline_outputs.groovy, include them with one line, and update all expected files in one go? This was a huge pain in methylseq (@sateeshperi).
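One possible shape for that shared list, assuming nf-test is configured to pick up helper classes from a lib directory; the class name and output paths below are hypothetical, for illustration only:

```groovy
// tests/lib/PipelineOutputs.groovy -- hypothetical shared helper
class PipelineOutputs {
    // One central list of files to snapshot; edit here to update every test at once
    static List<String> stable(String outdir) {
        return [
            "${outdir}/star_salmon/salmon.merged.gene_counts.tsv",  // illustrative paths
            "${outdir}/multiqc/multiqc_report.html",
        ]
    }
}

// Sketch of use in a test's `then` block:
//   assert snapshot(PipelineOutputs.stable(params.outdir)).match()
```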

@adamrtalbot (Contributor, Author) replied:

> That number for Sarek sounds excessive, but would clearly argue against your proposed flat structure, then? At least if we are talking about so many files in a single folder....one might also not want to run each of them every time, so some sort of easy subset mechanism would be desirable.
>
> I don't know about nf-test, but with pytest -k / -m once can for example easily select particular tests. If nf-test has nothing of that kind, subfolders could provide a convenient way to run selected tests only?

You can use tags, names, or paths to select nf-test files. I originally grouped them into directories (tests/tests) but it looked a bit ugly. @maxulysse suggested this layout, and I opened this PR as a starting point for a discussion. How do people want to organise their pipeline tests? Do lots of subfolders make sense, or would a flatter structure be easier to understand?
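As a sketch of the tag-based selection mentioned above (file name, tag, and test name are made up for illustration), a flat test file can carry a tag and be run either by tag or by path:

```groovy
// tests/skip_qc.nf.test -- illustrative; the `tag` lets nf-test select subsets:
//   nf-test test --tag qc                 (run only tests tagged 'qc')
//   nf-test test tests/skip_qc.nf.test    (select by path)
nextflow_pipeline {

    name "Test pipeline with --skip_qc"
    script "../main.nf"
    tag "qc"   // tag name is made up for illustration

    test("Runs with QC skipped") {
        when {
            params {
                outdir  = "$outputDir"
                skip_qc = true
            }
        }
        then {
            assert workflow.success
        }
    }
}
```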

Regarding snapshots, I'm going to work on that some other time. But yes, some form of specific snapshot will be needed. For now, we just want to recreate the existing pipeline tests.

@adamrtalbot (Contributor, Author) commented:

OK, my decision so far: let's go with this flat structure. Individual pipeline developers can add more granularity with tags/directories/naming if they like; we won't enforce a very strict structure here.

@adamrtalbot adamrtalbot merged commit 5833810 into nf-core:dev Apr 10, 2024
184 checks passed
5 participants