Suggested edits for version 2 (Adding ete3 db, updating metaeuk, custom protein files) #30
Comments
Just following up on this. Any thoughts? If you don't have the bandwidth to implement please let me know and I'll try to find an alternative. Thanks. |
The biggest limiting factor is Right now, Another alternative to ete3 is just using I've tried a few different manual edits for this but other than hardcoding my actual path, it's unusable. |
Sorry I did not get back to you sooner. I am currently quite busy as well, but should be getting to this now. Adding the ETE database is a very good idea in my opinion and is on my todo list. I will have a look at your suggestions and might come up with a solution to this issue. I am not sure about updating MetaEuk: the training was done with MetaEuk 4, and updating it might lead to wrong results. As for providing custom protein files, do you mean instead of providing the genome? That is already supported. Just pass the option
Sorry again for the delay. |
Awesome, thanks for getting back to me. I took a couple of cracks at it (hence the fork) but I wasn't successful without hardcoding it. I hope some of the notes above help out if you decide to implement this. Didn't see the -AA option but that seems like exactly what I was looking for with that note! More than willing to test out some code for you if you decide to implement. |
So, I have looked into including the ETE3 database. Thank you for the motivation. Let me know if this works. Should use of the provided ETE database be optional or mandatory? I am leaning towards mandatory, since then I do not need to add more code, but I am open to comments. |
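For readers following along, here is a minimal sketch of what pointing ete3 at a bundled taxonomy file could look like. It is not EukCC's actual implementation: the helper names and the file name `taxa.sqlite` inside the database folder are assumptions made for illustration. The only library call used is `ete3.NCBITaxa(dbfile=...)`.

```python
import os


def ete_db_path(db_root, filename="taxa.sqlite"):
    """Build the path to a taxonomy file shipped inside the database folder.

    The layout (a file called taxa.sqlite directly under db_root) is a
    hypothetical example, not the documented EukCC database layout.
    """
    return os.path.join(db_root, filename)


def load_taxonomy(db_root):
    """Open the bundled taxonomy database instead of ete3's per-user default."""
    path = ete_db_path(db_root)
    if not os.path.exists(path):
        # Fail early with a clear message rather than letting ete3 try to
        # locate or download a database elsewhere.
        raise FileNotFoundError("No ETE3 taxonomy database found at " + path)
    # Imported lazily so the path check above works without ete3 installed.
    from ete3 import NCBITaxa
    return NCBITaxa(dbfile=path)
```

Making the bundled file mandatory, as suggested above, would correspond to always calling `load_taxonomy` and never falling back to the default location in the user's home directory.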
If it's already in |
I've been taking a deep dive into eukaryotic metagenomics and EukCC seems like a great tool to have in the repertoire. From the documentation, it seems that EukCC version 2 is still in development so I thought I could make a log of suggested changes or edits:
Then the following code could be edited:
From base.py:
From __main__.py:
EDIT: The commands above won't work because I falsely assumed EUKCC2_DB was an environment variable that was propagated through the different modules, but that's not the case. I was able to modify the following:
but was unable to figure out how to get the db path from treehandler.py (this function: tax_LCA).
Example error:
Is there a way to access the database path throughout all the scripts?
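One common way to make a path available to every module is to resolve it once at startup and store it in the process environment, then read it through a small accessor wherever it is needed. The sketch below only illustrates that pattern. The helper names are hypothetical, and using EUKCC2_DB as the variable name is an assumption borrowed from this thread, not a statement about how EukCC handles it.

```python
import os

# Variable name taken from the discussion above; treating it as an
# environment variable is the assumption this sketch is built on.
ENV_VAR = "EUKCC2_DB"


def set_db_path(path):
    """Record the database path once, e.g. right after argument parsing."""
    os.environ[ENV_VAR] = os.path.abspath(path)


def get_db_path():
    """Return the database path from any module, or fail with a clear error."""
    try:
        return os.environ[ENV_VAR]
    except KeyError:
        raise RuntimeError(
            ENV_VAR + " is not set; pass a database path at startup"
        ) from None
```

With something like this in place, the entry point would call `set_db_path(...)` once, and a function such as tax_LCA could call `get_db_path()` instead of relying on a hardcoded path. Passing the path explicitly as a function argument is an equally valid alternative and is easier to test.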
Apologies for bombarding your GitHub today. Finally got to a point in my pipeline that I've been working on for over a year and the EukCC bit is a critical stage.