compatibility with bracken #2
Yes. The 5/31/2016 update on the Bracken site says:
kraken-biom expects Kraken report files, so there should be no problem using Bracken output with kraken-biom. But please let me know if you have any issues with it.
Thanks for getting back to me. I compared the report file from Bracken with the BIOM files generated by kraken-biom, and there seems to be a large issue. Bracken redistributes reads from higher to lower taxonomic levels, so the report files it generates contain zero reads directly assigned to levels above species (S); this is the third column of the report file. The second column now contains the cumulative sum of reads from S up to D. I am guessing that, to process Bracken output, your code would have to read the second column instead of the third, which it uses for Kraken output. I have pasted a sample of both Kraken and Bracken outputs for comparison:

Kraken Output:
26.12 9918653 9918653 U 0 unclassified

Bracken Output:
26.12 9918653 9918653 U 0 unclassified
The third column still contains the reads assigned directly to species, which after running Bracken are the only assigned reads. So the only change you should need to make when running kraken-biom would be to set --max to S. That way both min and max are set to Species and only the species-assigned reads will be extracted from the report file and written to the BIOM table.
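To make the column logic above concrete, here is a minimal sketch of pulling only the species-level, directly assigned reads (third column) out of a Kraken-style report. It assumes the usual six tab-separated columns (percentage, clade reads, directly assigned reads, rank code, taxid, name). The function name `species_counts` and the sample rows are made up for illustration; this is not kraken-biom's actual parser.

```python
# Hypothetical helper, not part of kraken-biom: illustrates reading column 3
# (reads assigned directly to a taxon) for species-rank rows only.

# Made-up, Bracken-like sample: ranks above species carry zero direct reads.
SAMPLE_REPORT = "\n".join([
    "50.00\t100\t100\tU\t0\tunclassified",
    "50.00\t100\t0\tR\t1\troot",
    "50.00\t100\t0\tD\t2\t  Bacteria",
    "30.00\t60\t60\tS\t562\t      Escherichia coli",
    "20.00\t40\t40\tS\t1280\t      Staphylococcus aureus",
])


def species_counts(report_text):
    """Return {taxid: direct_reads} for rows whose rank code is 'S'."""
    counts = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        _pct, _clade_reads, direct_reads, rank, taxid, _name = fields[:6]
        if rank.strip() == "S":
            counts[taxid.strip()] = int(direct_reads)
    return counts


if __name__ == "__main__":
    print(species_counts(SAMPLE_REPORT))
```

Because only rows with rank code S are kept, the zeros that Bracken leaves at higher ranks never reach the output.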
That sounds good to me. Thanks for taking the time to review it. By the way, I get the following error when I use the --max S argument:

ERROR: Max and Min ranks are out of order: S < S

The default command should still work fine without the --max argument, since taxa with zero values across all samples are discarded by kraken-biom.

Cheers,
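For illustration only, here is a sketch of a rank-order check that accepts equal max and min ranks, which is the case the error above rejects. The rank list, the function name `ranks_between`, and the error wording are assumptions made for this example and do not reflect how kraken-biom implements its validation.

```python
# Hypothetical validation sketch, not kraken-biom's code.
# Ranks are ordered from highest (Domain) to lowest (Species).
RANKS = ["D", "P", "C", "O", "F", "G", "S"]


def ranks_between(max_rank, min_rank):
    """Return the ranks from max_rank down to min_rank, inclusive.

    Equal ranks (e.g. max='S', min='S') are allowed; only a max rank that
    sits strictly below the min rank is treated as out of order.
    """
    hi = RANKS.index(max_rank)
    lo = RANKS.index(min_rank)
    if hi > lo:
        raise ValueError(
            "Max and Min ranks are out of order: %s < %s" % (max_rank, min_rank)
        )
    return RANKS[hi:lo + 1]


if __name__ == "__main__":
    print(ranks_between("S", "S"))
    print(ranks_between("O", "S"))
```

The design choice is the strict comparison `hi > lo`: a non-strict `>=` would raise for S and S, matching the behaviour reported here.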
Happy to help. And thanks for passing along the error message. Setting both to S is actually part of the test suite for the parsing, but I apparently didn't test it before it gets there. It looks like I have a
Will this tool still be compatible with the kreport output from Bracken?