correct the precision in expected taxonomy files #3

Closed
gregcaporaso opened this issue May 4, 2016 · 3 comments
@gregcaporaso (Member) commented May 4, 2016

@nbokulich, when @jairideout and I were testing this we discovered that the relative abundances are off in some of the expected taxonomy files. We wanted to confirm that the taxa abundances in each sample sum to 1.0 (to seven decimal places, the unittest.assertAlmostEqual default). We found that in some cases they're not equal even to two decimal places. For example, in mock-3, the sum of the values in sample HMPMockV1.2.Staggered1 is 1.02. Would you be able to look into this? We have a test file you can run now that will help you identify these samples (run python tests/check_data_integrity.py in this repository - this is Python 3 only).

Note that this issue is causing the current build to fail - I think it's important that this blocks people from using the data for now.
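
For reference, here is a minimal sketch of the per-sample sum check described above. It assumes the expected taxonomy files are tab-separated, with a taxonomy column first and one relative-abundance column per sample - the column layout is an assumption, and tests/check_data_integrity.py remains the authoritative check:

```python
import csv

def sample_sums(path):
    # Returns {sample_name: sum of relative abundances} for one
    # expected taxonomy file (assumed TSV, first column = taxonomy).
    with open(path) as f:
        reader = csv.reader(f, delimiter='\t')
        header = next(reader)
        sums = {name: 0.0 for name in header[1:]}
        for row in reader:
            for name, value in zip(header[1:], row[1:]):
                sums[name] += float(value)
    return sums

def check_sums(path, places=7):
    # Mirrors unittest's assertAlmostEqual: round the difference to
    # `places` decimal places and compare against zero.
    for name, total in sample_sums(path).items():
        if round(total - 1.0, places) != 0:
            print('{}: {} sums to {:.7f}'.format(path, name, total))
```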

@nbokulich (Contributor) commented

Sounds like this is an issue with the rounding... I will look into this.
Thanks for noticing!

Is this just an issue with Greengenes, SILVA, or both?
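
To illustrate the rounding explanation above with made-up numbers (not taken from mock-3): if each taxon's abundance is rounded independently before being written out, the per-sample total can drift away from 1.0.

```python
# Hypothetical abundances: three taxa at exactly one third each.
true = [1 / 3, 1 / 3, 1 / 3]

# Rounding each value independently to two decimal places...
rounded = [round(x, 2) for x in true]

print(sum(true))     # 1.0
print(sum(rounded))  # ~0.99: the total no longer sums to 1.0
```

Depending on whether values happen to round up or down, the total can land above 1.0 (as with the 1.02 seen in mock-3) or below it.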


@gregcaporaso (Member, Author) commented

It's an issue with a bunch of them, to varying degrees. I'm realizing now that a table that sums the values for all dataset/database/version combinations would help. Either @jairideout or I can generate this for you today.
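
One way such a table could be generated, sketched here under the assumption that there is one expected-taxonomy.tsv per dataset/database/version directory and that pandas is available (the path pattern and file name below are guesses about the repository layout):

```python
import glob
import pandas as pd

rows = []
# Assumed layout: <dataset>/<database>/<version>/expected-taxonomy.tsv
for path in glob.glob('*/*/*/expected-taxonomy.tsv'):
    table = pd.read_csv(path, sep='\t', index_col=0)
    # Each column is a sample; sum its relative abundances.
    for sample, total in table.sum(axis=0).items():
        rows.append({'path': path, 'sample': sample, 'sum': total})

summary = pd.DataFrame(rows)
summary['error'] = (summary['sum'] - 1.0).abs()
# Samples furthest from 1.0 sort to the top.
print(summary.sort_values('error', ascending=False).to_string(index=False))
```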

jairideout added a commit that referenced this issue May 9, 2016
ENH: fixes #8, #3. includes modifications made by @nbokulich in #7
@jairideout (Collaborator) commented

Fixed in #9.
