add option for seeding rarefaction? #55
Comments
I really like this idea. If you are doing more global analysis like PCoA plots, I don't think the results would be affected all that much. However, in recent work I've been doing on machine learning with feature tables, I've seen that correlations and p-values can be significantly affected just by re-running rarefaction, especially if the community is particularly diverse. I also think this would allow a more definitive assessment of whether rarefaction was making a difference: seed the tables two different ways, run your analysis, and then compare what differed across the tables. Perhaps more importantly, this means that you could give a user or collaborator the raw table and they could get to exactly the same rarefied table, without you having to send along intermediate files. This also seems helpful when a database is being used, for example making sure that if you pulled studies out of QIITA and rarefied them, you'd always get the same table. |
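To make the reproducibility point above concrete, here is a minimal sketch of seeded rarefaction for a single sample, using only the Python standard library. It is not the biom-format or q2-feature-table implementation; the `rarefy` helper, its parameters, and the example counts are hypothetical and only illustrate the idea that the same raw counts plus the same seed give the same rarefied counts.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=None):
    """Subsample one sample's feature counts to `depth` reads, without replacement.

    Hypothetical helper for illustration only; not a biom-format or QIIME 2 API.
    """
    # Expand the counts into one entry per observed read.
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the sample's total count")
    # A dedicated generator keeps the draw independent of global random state.
    rng = random.Random(seed)
    drawn = Counter(rng.sample(pool, depth))
    # Report every original feature, including those that dropped to zero.
    return {feature: drawn.get(feature, 0) for feature in counts}


# Made-up counts for a single sample.
sample = {"f1": 50, "f2": 30, "f3": 15, "f4": 5}

a = rarefy(sample, 20, seed=42)
b = rarefy(sample, 20, seed=42)  # same seed: identical to `a`
c = rarefy(sample, 20, seed=7)   # different seed: may differ from `a`
```

Running a downstream analysis on `a` and on `c` and comparing the results is one way to check how sensitive that analysis is to the rarefaction draw.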
Tacking on to what @lkursell mentioned, I've found that ANCOM results can also differ depending on the rarefaction iteration. |
It looks like this is now possible, as setting a random seed has been enabled in biom-format. @wasade, would you by any chance be interested in exposing that option in q2-feature-table? Or could you let us know when the next release of biom-format is planned so that we can coordinate this issue? |
Hey @nbokulich, the next release will happen as soon as I can get enough time to make it happen. I had actually intended to release a week or two ago, but it keeps getting bumped. It's relatively high on my priorities but just not yet at the top. Is this time sensitive for |
Thanks @wasade! The next release of QIIME 2 is in May (PRs must be merged by May 5), so we could add this feature to q2-feature-table in that release if you cut the new release of biom-format before then. So there's opportunity but not urgency, I'd say. |
Great, thank you, that’s helpful to know.
On Apr 25, 2023, at 10:17 PM, Nicholas Bokulich wrote:
thanks @wasade! The next release of QIIME 2 is in May (PRs must be merged by May 5) so we could add this feature to q2-feature-table in that release if you cut the new release of biom-format before then. So there's opportunity but not urgency I'd say.
|
Improvement Description
A forum user suggested that we add support for seeding rarefaction, which is an interesting idea for supporting reproducibility, though I'm not certain what the specific use cases would be.
Questions
Are there times when we would want to perfectly replicate rarefaction results? If so, we'd need the seed to be logged in the artifact's provenance.
References
suggested