add VSEARCH cluster #622
Looks great!
Could you update the CHANGELOG as well?
I am not sure myself when the clustering should take place: directly after DADA2 ASV generation, or rather after the filters. Do you know of any advantage or disadvantage of either order?
About the md5sums, any program should do. But the right way to do it (I haven't done it myself yet) should be the one explained in Slack, i.e. running nf-test test --updateSnapshot in the pipeline code clone folder after installing https://github.com/askimed/nf-test
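For a quick manual check of a single output file, the "any program should do" route could look like the sketch below. It only uses Python's standard `hashlib`; the function name `md5sum` and the chunk size are illustrative assumptions, not part of the pipeline or of nf-test, and the snapshot update through nf-test remains the route suggested above.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path` (hypothetical helper).

    The file is read in chunks so large outputs are not loaded into memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `md5sum` on a result file gives a value that can be compared by eye against a checksum obtained from any other md5 tool.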
Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>
Thanks for re-ordering! I still found a few places where the code should be re-ordered, though.
I also tested your branch and it seems fine to me, except that it collects FILTER_CLUSTERS.out.stats as ASVs per sample instead of reads per sample. Please use read count stats in results/overall_summary.tsv (see comment below).
Edit: sorry, the numbers should be identical I think, but it is still questionable whether ASV counts and read counts should be mixed.
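To make the distinction between the two kinds of per-sample statistic concrete, here is a small sketch. The table layout (ASV id mapped to per-sample read counts) and the function name `per_sample_stats` are assumptions made for illustration only; this is not how the pipeline or FILTER_CLUSTERS computes its stats.

```python
def per_sample_stats(asv_table):
    """Summarise a toy ASV table per sample (hypothetical helper).

    asv_table maps ASV id -> {sample name -> read count}.
    Returns {sample name -> {"asvs": ..., "reads": ...}} where "asvs" is the
    number of ASVs with at least one read and "reads" is the summed read count.
    """
    stats = {}
    for counts in asv_table.values():
        for sample, reads in counts.items():
            entry = stats.setdefault(sample, {"asvs": 0, "reads": 0})
            entry["reads"] += reads
            if reads > 0:
                entry["asvs"] += 1
    return stats


# Made-up example values, not taken from any test run
example = {
    "ASV1": {"sample1": 10, "sample2": 0},
    "ASV2": {"sample1": 5, "sample2": 7},
}
summary = per_sample_stats(example)
```

In this toy table, `summary["sample1"]` holds 2 ASVs and 15 reads, which shows why a column of ASV counts reads differently from the read count columns of a summary table.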
Great, thanks!
Just one more small comment, but I approve already.
Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>
Addresses issue: #609
It's not LULU, but I figured VSEARCH cluster would be easier to add for ASV post-clustering because there is already an nf-core module (with biocontainer and bioconda).
I've added a test profile, but I haven't added a .test.snap file yet because I'm not sure what tool I should be using to get the md5 value.
PR checklist
- nf-core lint
- nextflow run . -profile test,docker --outdir <OUTDIR>
- docs/output.md is updated.
- CHANGELOG.md is updated.
- README.md is updated (including new tool citations and authors/contributors).