make ASVs to close #18 #21
base: master
Current issues: The 100% matching means that fewer reads from each sample make it into OTUs. I'm not sure if this is a good thing... or just a limitation of exact sequence variants. 🤷‍♂️ Thoughts? EDIT: Also should we still be calling the output file |
exact sequence variants with no singletons, right? a drop in counts seems inevitable in that case.
could probably alter the output file names to reflect the header name change. |
That makes sense. Is there an elegant way to support |
maybe use
|
Got it. Should I include this each and every time we refer to those files? On a related note, how should I approach updating the report? |
Yes, where rules are re-used (where it makes sense). This will minimize potential issues with file naming and could make it simpler to support other strategies later.
I don't know everything that's changed, but clearly there's a large chunk of text that will be quite different. Maybe start with figuring out everything that needs to be altered then decide if we need an entirely separate script or if we can pass relevant args into the existing to set specific pieces. |
OK, here's my todo list:
Any advice on word choice? Is |
The first two bullets are related in that the second bullet solves the first (at least I think it does). I think |
I did a quick parameter sweep to explore non-exact matching. 🎯
Robert Edgar recommends using 97% for counting zOTUs, and 98% is mentioned in this vsearch thread. Joe, what do you think about counting up non-exact matches after building zOTUs? What threshold should we use? |
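To make the threshold question concrete, here is a toy sketch of why exact (100%) matching counts fewer reads than matching at 97%. It is not how vsearch or usearch compute identity: the `identity` function below is a simplified position-by-position comparison of equal-length sequences with no alignment, and `count_reads`, `zotus`, and `reads` are hypothetical names and data made up for illustration.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences.

    Assumption: no alignment and no gaps, unlike a real search tool.
    """
    if len(a) != len(b):
        raise ValueError("this toy identity only handles equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def count_reads(reads, zotus, min_identity):
    """Assign each read to its best-matching zOTU if it meets min_identity."""
    counts = Counter()
    for read in reads:
        best_name, best_score = None, 0.0
        for name, seq in zotus.items():
            score = identity(read, seq)
            if score > best_score:
                best_name, best_score = name, score
        # Reads below the threshold are left uncounted.
        if best_name is not None and best_score >= min_identity:
            counts[best_name] += 1
    return counts


# Made-up example data: one 100 bp zOTU and three reads derived from it.
reference = "ACGT" * 25
zotus = {"zotu1": reference}
reads = [
    reference,                  # exact match (identity 1.00)
    "N" * 1 + reference[1:],    # one mismatch (identity 0.99)
    "N" * 5 + reference[5:],    # five mismatches (identity 0.95)
]

exact_counts = count_reads(reads, zotus, min_identity=1.0)
relaxed_counts = count_reads(reads, zotus, min_identity=0.97)
print(exact_counts, relaxed_counts)
```

In this toy data, only the exact read is counted at 100%, while the read with one mismatch is also counted at 97%. The read with five mismatches is dropped at both thresholds.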
Joe, I'm getting ready to wrap this up. My current solution is to have two reports for the two pipelines. There is a lot of duplicate code, but it was easy to implement. This also makes it easy to add other much more divergent pipelines like
I've also changed the counting step to use
Finally, how do we update the docs? Does the |
Is there a clean way to do this?
rule build_report:
    input:
        report_script = os.path.join(
            os.path.dirname(os.path.abspath(workflow.snakefile)),
            "scripts",
            "build_report_OTU.py"
        ) if config.get("pipeline") == "OTU" else os.path.join(
            os.path.dirname(os.path.abspath(workflow.snakefile)),
            "scripts",
            "build_report_ASV.py"),
Additionally, the report scripts don't seem to be copied when hundo is installed. Am I using this section wrong? |
Yes, the docs folder is all that needs updating. RTD will rebuild when there are changes to the docs.
This is clean to me. Alternatively, if you wanted something shorter in the input block, you can write a separate function that returns the correct path. That'd move the code into a function and out of the input, but ultimately would look the same.
You just need to update the manifest (https://github.com/pnnl/hundo/blob/master/MANIFEST.in) |
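The helper-function alternative mentioned above could look roughly like this. It is a minimal sketch, not the code that was merged: `get_report_script` and `REPORT_SCRIPTS` are hypothetical names, and the script file names are taken from the rule quoted earlier in the thread.

```python
import os

# Hypothetical mapping from the `pipeline` config value to a report script.
REPORT_SCRIPTS = {
    "OTU": "build_report_OTU.py",
    "ASV": "build_report_ASV.py",
}


def get_report_script(snakefile_path, pipeline):
    """Return the path to the report script for the selected pipeline."""
    try:
        script_name = REPORT_SCRIPTS[pipeline]
    except KeyError:
        raise ValueError(f"unknown pipeline: {pipeline!r}")
    # Resolve the scripts directory relative to the Snakefile's location.
    snakefile_dir = os.path.dirname(os.path.abspath(snakefile_path))
    return os.path.join(snakefile_dir, "scripts", script_name)


# Assumed usage inside the Snakefile's input block:
#     report_script = get_report_script(workflow.snakefile, config.get("pipeline"))
```

One difference from the inline conditional: the original falls back to the ASV script for any value other than "OTU", while this sketch raises an error on an unrecognized pipeline. Adding a pipeline later would then only need a new dictionary entry.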
So this bug with json parsing is holding up testing: I'll work on docs until then. |
oof, I can attempt to get to this but no promises. do you lack permission?
On Mon, Mar 18, 2024 at 6:25 PM Colin J. Brislawn wrote:
Assigned #21 to @brwnj
|
You are good! I can include benchmarks to compare, then merge this myself. (I may need a hand distributing this on pip and conda, but we can do that later.) Sorry to 'at' you on a Monday night. |
This is working now. To test:
Install
Run
hundo download \
    --database-dir /home/cbrislawn/hundo_annotation_references \
    --reference-database silva
cd example
hundo annotate \
    --filter-adapters qc_references/adapters.fa.gz \
    --filter-contaminants qc_references/phix174.fa.gz \
    --database-dir /home/cbrislawn/hundo_annotation_references \
    --pipeline ASV \
    --reference-database silva \
    --out-dir mothur_sop_silva \
    mothur_sop_data