Adds filtering out of FASTA for tools that don't support it #60

Merged
merged 7 commits into from Apr 20, 2022


jfy133
Member

@jfy133 jfy133 commented Apr 18, 2022

PR checklist

Closes #56

Basically, if a file is marked as a 'fasta' file, it is filtered out of the input for tools that don't support FASTA, such as MetaPhlAn3, so only non-FASTA inputs are passed on to them. A warning is also emitted saying that the sample is being ignored.

Note a couple of things:

  • Centrifuge DOES support FASTA input, but the nf-core module needs to be updated
  • The .filter() operator closure does NOT seem to support the meta, reads -> syntax; it gives some weird error about call()

Also standardises warning messages
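
The filtering behaviour described above can be sketched in plain Groovy. This is a minimal model only, not the pipeline code: a List stands in for the Nextflow channel, findAll stands in for the .filter() operator, the sample names and file names are made up, and the closure is assumed to return a boolean for the non-FASTA case after logging.

```groovy
// Plain-Groovy model of the filtering step (assumption: not the actual pipeline code).
// A List stands in for the Nextflow channel; sample data below is hypothetical.
def warnings = []

def samples = [
    [[id: 'sample1', is_fasta: false], ['sample1_R1.fastq.gz', 'sample1_R2.fastq.gz']],
    [[id: 'sample2', is_fasta: true],  ['sample2.fasta']],
]

// Same closure shape as in the PR: a single implicit `it`, with the meta map at index 0.
def keepNonFasta = {
    if (it[0].is_fasta) {
        // Collect the warning instead of calling log.warn, so the sketch runs without Nextflow.
        warnings << "[nf-core/taxprofiler] MetaPhlAn3 currently does not accept FASTA files as input. Skipping MetaPhlAn3 for sample ${it[0].id}.".toString()
    }
    // Keep only entries that are NOT FASTA.
    !it[0].is_fasta
}

def forMetaphlan3 = samples.findAll(keepNonFasta)

println forMetaphlan3.collect { it[0].id }
warnings.each { println it }
```

Indexing into it[0] mirrors the workaround noted in the list above, where the closure reads the meta map by position rather than through named parameters.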

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
    • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
    • If necessary, also make a PR on the nf-core/taxprofiler branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@jfy133 jfy133 changed the title from "Adds filtering out of FASTA for tools that don't support" to "Adds filtering out of FASTA for tools that don't support it" Apr 18, 2022
@github-actions

github-actions bot commented Apr 18, 2022

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 9fd94b1

+| ✅ 146 tests passed       |+
!| ❗  50 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in README.md: Write a 1-2 sentence summary of what data the pipeline is for and what it does
  • pipeline_todos - TODO string in README.md: Add full-sized test dataset and amend the paragraph below if applicable
  • pipeline_todos - TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
  • pipeline_todos - TODO string in nextflow.config: Specify your pipeline's command line flags
  • pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in WorkflowMain.groovy: Add Zenodo DOI for pipeline after first release
  • pipeline_todos - TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in taxprofiler.nf: Add all file path parameters for the pipeline to the list below
  • pipeline_todos - TODO string in ci.yml: You can customise CI pipeline run tests as required
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • schema_description - Ungrouped param in schema: databases
  • schema_description - Ungrouped param in schema: shortread_clipmerge_excludeunmerged
  • schema_description - Ungrouped param in schema: run_malt
  • schema_description - Ungrouped param in schema: malt_mode
  • schema_description - Ungrouped param in schema: run_kraken2
  • schema_description - Ungrouped param in schema: run_centrifuge
  • schema_description - Ungrouped param in schema: centrifuge_save_unaligned
  • schema_description - Ungrouped param in schema: centrifuge_save_aligned
  • schema_description - Ungrouped param in schema: centrifuge_sam_format
  • schema_description - Ungrouped param in schema: run_metaphlan3
  • schema_description - Ungrouped param in schema: shortread_clipmerge_tool
  • schema_description - Ungrouped param in schema: shortread_clipmerge_skipadaptertrim
  • schema_description - Ungrouped param in schema: shortread_clipmerge_mergepairs
  • schema_description - Ungrouped param in schema: shortread_clipmerge_adapter1
  • schema_description - Ungrouped param in schema: shortread_clipmerge_adapter2
  • schema_description - Ungrouped param in schema: shortread_clipmerge_minlength
  • schema_description - Ungrouped param in schema: save_preprocessed_reads
  • schema_description - Ungrouped param in schema: shortread_complexityfilter_tool
  • schema_description - Ungrouped param in schema: shortread_complexityfilter_bbduk_windowsize
  • schema_description - Ungrouped param in schema: shortread_complexityfilter_bbduk_mask
  • schema_description - Ungrouped param in schema: shortread_complexityfilter_entropy
  • schema_description - Ungrouped param in schema: shortread_complexityfilter_prinseqplusplus_mode
  • schema_description - Ungrouped param in schema: shortread_complexityfilter_prinseqplusplus_dustscore
  • schema_description - Ungrouped param in schema: save_complexityfiltered_reads
  • schema_description - Ungrouped param in schema: save_runmerged_reads
  • schema_description - Ungrouped param in schema: perform_shortread_clipmerge
  • schema_description - Ungrouped param in schema: perform_longread_clip
  • schema_description - Ungrouped param in schema: perform_shortread_complexityfilter
  • schema_description - Ungrouped param in schema: perform_runmerging
  • schema_description - Ungrouped param in schema: perform_shortread_hostremoval
  • schema_description - Ungrouped param in schema: shortread_hostremoval_reference
  • schema_description - Ungrouped param in schema: shortread_hostremoval_index

✅ Tests passed:

Run details

  • nf-core/tools version 2.3.2
  • Run at 2022-04-20 07:26:52

@jfy133 jfy133 requested a review from a team April 18, 2022 05:35
Collaborator

@Midnighter Midnighter left a comment


I think we could benefit from some logging/error formatting facilities but otherwise this looks good.

subworkflows/local/profiling.nf (outdated)

ch_input_for_metaphlan3 = ch_input_for_profiling.metaphlan3
    .filter{
        if (it[0].is_fasta) log.warn "[nf-core/taxprofiler] MetaPhlAn3 currently does not accept FASTA files as input. Skipping MetaPhlAn3 for sample " + it[0].id

Similar to above

Suggested change
- if (it[0].is_fasta) log.warn "[nf-core/taxprofiler] MetaPhlAn3 currently does not accept FASTA files as input. Skipping MetaPhlAn3 for sample " + it[0].id
+ if (it[0].is_fasta) log.warn "[nf-core/taxprofiler] MetaPhlAn3 currently does not accept FASTA files as input. Skipping MetaPhlAn3 for sample ${it[0].id}."
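
The suggestion above swaps string concatenation for Groovy string interpolation and adds a trailing full stop. As a small illustration, assuming a hypothetical meta map, the two forms produce the same text apart from that full stop:

```groovy
// Hypothetical sample metadata, for illustration only.
def meta = [id: 'sample2', is_fasta: true]

// Original form: plain String built by concatenation.
def concatenated = "Skipping MetaPhlAn3 for sample " + meta.id

// Suggested form: interpolation inside a double-quoted string, with a trailing full stop.
def interpolated = "Skipping MetaPhlAn3 for sample ${meta.id}."

// A double-quoted string containing ${...} is a GString; toString() gives a plain String.
println concatenated
println interpolated.toString()
```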

jfy133 and others added 2 commits April 20, 2022 09:15
Co-authored-by: Moritz E. Beber <midnighter@posteo.net>
@jfy133
Member Author

jfy133 commented Apr 20, 2022

OK, actually, as this will require more work, I will revert the commit with the logging change and make an issue. I agree the nf-core logging stuff would be much better here, but I've not looked into it (nor the Groovy side).

Successfully merging this pull request may close these issues.

FASTA input files are not yet incorporated into the profiling workflow