Re-write metagenome_contributions.py #1
Comments
Hi @gavinmdouglas, I updated my local clone today and noticed that metagenome_pipeline.py is taking a long time to run (the job has been running for 8 hours and is still going). I guess that's because it's calculating the stratified output? Is there a way to turn this option off? Thank you so much. Best,
Hey Jamie, This is definitely a problem, thanks for pointing this out. I have re-written how the stratified data is output and it is much faster now. I haven't added an option yet for non-stratified output only. Thanks, Gavin
Thanks so much Gavin! It is blazing fast now, but I noticed there are far fewer lines in pred_metagenome_unstrat.tsv than in the OUT_PREFIX.genefamilies.biom.tsv produced by a previous version run on the same data (853 lines vs 3333 lines, respectively). Also, strangely, when I check in pred_metagenome_strat.tsv which sequences are mapped to the ECs, only a few (9 out of 485 sequences) are used/output. For example:
Coincidentally, these sequences are the very first ones in my data. Could this be a bug (i.e. not all output was written), or did PICRUSt2 only map a few of my sequences to genes? Best,
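A check like the one described above (how many input sequences actually appear in the stratified table) can be sketched in a few lines of Python. This is only an illustration: it assumes the stratified file is tab-separated with a header row and a column named `sequence`, and both that column name and the toy rows below are assumptions, not taken from this thread.

```python
import csv
import io


def count_distinct(tsv_text, column):
    """Count distinct non-empty values in one column of a tab-separated table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in header")
    return len({row[column] for row in reader if row[column]})


# Toy stand-in for a stratified table (hypothetical layout and values).
toy = (
    "function\tsequence\tsample1\n"
    "EC:1.1.1.1\tseqA\t2.0\n"
    "EC:1.1.1.1\tseqB\t1.0\n"
    "EC:2.7.7.7\tseqA\t3.0\n"
)

print(count_distinct(toy, "sequence"))  # distinct contributing sequences
print(count_distinct(toy, "function"))  # distinct functions

# For a real file, read its text first, e.g.:
#   with open("pred_metagenome_strat.tsv") as fh:
#       n = count_distinct(fh.read(), "sequence")
```

Comparing that count with the number of input sequences shows quickly whether most sequences are missing from the output.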
Just to follow up, the problem is gone as of yesterday's latest clone of PICRUSt2.
metagenome_contributions.py is no longer in this repository. There could be a better way to output the contributions, one that leverages the likelihoods output in R by the discrete hidden-state prediction methods.