Skip to content

Hifiadapterfilt#7398

Closed
Ge94 wants to merge 4 commits intonf-core:masterfrom
Ge94:hifiadapterfilt
Closed

Hifiadapterfilt#7398
Ge94 wants to merge 4 commits intonf-core:masterfrom
Ge94:hifiadapterfilt

Conversation

@Ge94
Copy link
Copy Markdown
Contributor

@Ge94 Ge94 commented Jan 30, 2025

This PR was created to introduce hifiadapterfilt as a new module. It's purpose is to remove adapters that are specific to pacbio hifi reads.

PR checklist

Closes #7390

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@Ge94 Ge94 marked this pull request as ready for review January 30, 2025 16:12
@vagkaratzas vagkaratzas self-requested a review February 3, 2025 14:14
Copy link
Copy Markdown
Contributor

@vagkaratzas vagkaratzas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just some minor comments; I understand that at the moment there is only one pacbio test dataset, that unfortunately doesn't have any adapters to filter. Would it be easy/possible to upload another pacbio minimal test dataset where adapters are getting filtered out (and probably also have fewer to non empty output files)? Also, even though the conda environment and the containers describe version 3.0.0, in the versions.yml, 3.0.1 is written out. Can you investigate why the mismatch?

Comment thread modules/nf-core/hifiadapterfilt/main.nf Outdated
Comment thread modules/nf-core/hifiadapterfilt/main.nf
Ge94 and others added 2 commits February 4, 2025 11:21
Co-authored-by: Evangelos Karatzas <32259775+vagkaratzas@users.noreply.github.com>
Co-authored-by: Evangelos Karatzas <32259775+vagkaratzas@users.noreply.github.com>
Comment on lines +4 to +6
[

]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test produces no snaps at all - maybe this can be changed?

Comment on lines +25 to +29
{ assert snapshot(process.out[0][1].findAll {
file(it).name == "test.filt.fastq.gz" &&
file(it).name == "test.contaminant.blastout" &&
file(it).name == "test.blocklist" }).match()},
{ assert path(process.out.stats[0].get(1)).text.contains('Number of adapter contaminated ccs reads: 0 (0% of total)') },
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These files seem to not be found in the snapshot?

"""
input[0] = [
[ id:'test' ],
file(params.test_data['homo_sapiens']['pacbio']['hifi'], checkIfExists: true)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you change this to the newer path description? :)

file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true)

when:
task.ext.when == null || task.ext.when

script:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
script:
script:

def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
// The tool AUTOMATICALLY detects fastq files from the input folder, hence an explicit call of "fastq" is not needed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@FloWuenne
Copy link
Copy Markdown
Contributor

Hi @Ge94,
do you think you will be able to work on this during the upcoming hackathon or do you need help getting this PR over the finish line? 😊

@FloWuenne FloWuenne added the awaiting-changes will be closed after 30 days label Mar 11, 2025
@FloWuenne FloWuenne removed the awaiting-changes will be closed after 30 days label Mar 11, 2025
@Ge94
Copy link
Copy Markdown
Contributor Author

Ge94 commented Mar 11, 2025

Hi @FloWuenne @famosab , thank you for your comments.

I paused this PR and its failing tests, as my main issue is finding or creating a meaningful sample to launch this tool on. At the moment, those that I found have no adapters to filter, hence the output is essentially the same as the input. I tried with all hifi files I could find in nf-core/test-datasets, but none suit my case. Do you have any recommendations or documentation I could follow to generate my own sample and add it to the repo?

Thank you 😊

@famosab
Copy link
Copy Markdown
Contributor

famosab commented Mar 11, 2025

Hi @FloWuenne @famosab , thank you for your comments.

I paused this PR and its failing tests, as my main issue is finding or creating a meaningful sample to launch this tool on. At the moment, those that I found have no adapters to filter, hence the output is essentially the same as the input. I tried with all hifi files I could find in nf-core/test-datasets, but none suit my case. Do you have any recommendations or documentation I could follow to generate my own sample and add it to the repo?

Thank you 😊

You can add new test data here: https://github.com/nf-core/test-datasets/blob/modules/README.md
For the creation i would suggest to see what is used on the tool developers side or maybe ask in the upcoming hackathon :) I am not familiar with the tool so thats the only pointer I can give.

@famosab famosab moved this from Bumped to Stay open in 2025 spring cleaning - modules (PRs) Mar 11, 2025
@atrigila atrigila added the awaiting-changes will be closed after 30 days label Mar 5, 2026
@SPPearce
Copy link
Copy Markdown
Contributor

Closing due to lack of response. Please reopen if you (or anyone else) have any plans to continue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting-changes will be closed after 30 days

Projects

Development

Successfully merging this pull request may close these issues.

new module: hifiadapterfilt

6 participants