KMCP's MetaPhlAn output doesn't follow the MetaPhlAn file format #34
Comments
Thanks for reporting this. I did not notice this for such a long time ...
Skipping ranks should not cause that. Can you attach a file? Another way is using taxonkit cami-filter (without setting
Sure! TAXPASTA failed at the step where it checks the composition, specifically where it checks whether all taxa within a given rank sum up to 100%. I summed up the abundances manually and saw that some ranks had summed abundances lower than that. ERR7569999.metaphlan.txt
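For anyone who wants to repeat the manual check, here is a minimal Python sketch that sums the abundance per rank. It assumes a tab-separated MetaPhlAn-style profile with the clade name in the first column and the relative abundance in the third, and it reads the rank from the prefix of the last lineage element (for example `p__`). The function name `rank_sums` and the toy profile are made up for illustration; this is not TAXPASTA's actual validation code.

```python
from collections import defaultdict


def rank_sums(lines):
    """Sum relative abundance per rank prefix (k, p, c, o, f, g, s, ...)."""
    totals = defaultdict(float)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        clade_name, abundance = fields[0], float(fields[2])
        # Assumption: the rank is the prefix of the last lineage element.
        rank = clade_name.split("|")[-1].split("__", 1)[0]
        totals[rank] += abundance
    return dict(totals)


# Toy profile (made-up values): the third taxon has no phylum in its lineage,
# so the phylum rank only sums to 60 instead of 100.
profile = [
    "#clade_name\tNCBI_tax_id\trelative_abundance",
    "k__Bacteria\t2\t100.0",
    "k__Bacteria|p__Proteobacteria\t2|1224\t60.0",
    "k__Bacteria|c__ExampleClass\t2|0\t40.0",
]
print(rank_sums(profile))  # {'k': 100.0, 'p': 60.0, 'c': 40.0}
```

Any rank whose total is clearly below 100 is a place where some lineages skip that rank.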
I see. Some ref genomes' lineages do not have all the 7 ranks, which is quite normal I think. Maybe ask taxpasta to support this?
The MetaPhlAn output generated by KMCP is not the same as the one generated by MetaPhlAn. In the KMCP output, the taxid column only contains the taxid of the lowest taxonomic rank (e.g. 1224), while the one generated by MetaPhlAn contains the full lineage, separated by | (e.g. 2|1224).
This makes the KMCP output incompatible with TAXPASTA.
Actually, the TAXPASTA error is due to ranks not summing up to 100% (because some genomes' lineages skip ranks).
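To illustrate the difference in the taxid column, here is a rough Python sketch that expands a single taxid such as 1224 into a pipe-separated lineage such as 2|1224. It assumes you already have parent and rank lookups (for example, loaded from an NCBI taxonomy dump); the `PARENT` and `RANK` dictionaries below are tiny hand-written stand-ins and `lineage_taxids` is a hypothetical helper, not part of KMCP, MetaPhlAn, or TAXPASTA.

```python
# Toy taxonomy (stand-in for a real taxonomy dump): child taxid -> parent taxid.
PARENT = {1224: 2, 2: 131567, 131567: 1}
# Toy rank table for the same nodes.
RANK = {1224: "phylum", 2: "superkingdom", 131567: "no rank", 1: "no rank"}

# Ranks kept in the lineage string; unranked nodes are dropped.
KEEP = {"superkingdom", "phylum", "class", "order", "family", "genus", "species"}


def lineage_taxids(taxid, parent=PARENT, rank=RANK, keep=KEEP):
    """Return the pipe-separated taxids from the top kept rank down to taxid."""
    path = []
    current = taxid
    # Walk up until a node without a known parent (the root in the toy data).
    while current in parent:
        if rank.get(current) in keep:
            path.append(str(current))
        current = parent[current]
    return "|".join(reversed(path))


print(lineage_taxids(1224))  # 2|1224
```

A lineage that skips a rank simply yields fewer elements, which is the same situation that makes the per-rank sums fall below 100%.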