Profiling output table interpretation #22

lborcard · 2022-11-07T12:52:16Z

Dear Shenwei,

Thank you very much for your very nice tool. We are trying to understand how to interpret the output table in KMCP format.

If the output table contains more than one ref per species, which parameter should we use to choose the best hit?
According to your manual, the percentage column refers to the relative abundance of the reference. However, we are not sure how this value is calculated. Could you give us more details about this metric?

Thank you very much,

Best,

Loïc

shenwei356 · 2022-11-07T14:19:51Z

Thanks for using KMCP.

> If the output table contains more than one ref per species, which parameter should we use to choose the best hit?

The real genome in a sample may match more than one reference, and we can't tell which one is the true source. However, the similarity score (column score, the 90th percentile of the k-mer coverage of all uniquely matched reads) can serve as an indicator of which reference is most similar to the real genome.

> According to your manual, the percentage column refers to the relative abundance of the reference. However, we are not sure how this value is calculated. Could you give us more details about this metric?

First, the coverage (column coverage) of each matched reference genome is computed by dividing the total bases of matched reads by the genome size (the total bases of either a complete genome or an unfinished genome such as a MAG, with plasmid sequences filtered out). Then the relative abundance of a species is computed by dividing the sum of the genome coverages of this species by the sum of the genome coverages of all genomes. Finally, the relative abundance of a taxon at each rank is the sum of the percentages of all its child taxa.
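The arithmetic described above can be sketched roughly as follows. This is only an illustration of the three steps, not KMCP's actual code: the input tuples, the reference and taxon names, and the helper `relative_abundance` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical input, one tuple per matched reference:
# (reference, species, genus, total bases of matched reads, genome size)
matches = [
    ("ref_A1", "species_A", "genus_X", 3_000_000, 1_000_000),
    ("ref_A2", "species_A", "genus_X", 1_000_000, 1_000_000),
    ("ref_B1", "species_B", "genus_X", 4_000_000, 2_000_000),
    ("ref_C1", "species_C", "genus_Y", 4_000_000, 2_000_000),
]


def relative_abundance(matches):
    """Return (coverage per reference, % per species, % per genus)."""
    # Step 1: coverage = total bases of matched reads / genome size.
    coverage = {ref: bases / size for ref, _, _, bases, size in matches}

    # Step 2: species abundance = sum of that species' genome coverages
    # divided by the sum of the coverages of all genomes.
    total = sum(coverage.values())
    species_pct = defaultdict(float)
    species_genus = {}
    for ref, species, genus, _, _ in matches:
        species_pct[species] += 100.0 * coverage[ref] / total
        species_genus[species] = genus

    # Step 3: abundance at a higher rank = sum over its child taxa.
    genus_pct = defaultdict(float)
    for species, pct in species_pct.items():
        genus_pct[species_genus[species]] += pct

    return coverage, dict(species_pct), dict(genus_pct)


coverage, species_pct, genus_pct = relative_abundance(matches)
for name, pct in species_pct.items():
    print(f"{name}: {pct:.1f}%")
```

Here the two references of the hypothetical `species_A` have coverages 3 and 1, so the species contributes 4 of the total coverage of 8 and gets 50%.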

lborcard · 2022-11-07T15:33:49Z

Thank you for the swift reply. If we have several refs with a score of 100, what would be the second metric to use to filter them? Would coverage be a good one to use?

shenwei356 · 2022-11-07T16:17:08Z

I think so.
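As a rough sketch of that tie-break, assuming the profile rows have already been parsed into dictionaries: the keys `ref`, `species`, `score` and `coverage`, the sample values, and the helper `best_hit_per_species` are hypothetical, and treating higher coverage as better is an assumption of this example.

```python
# Hypothetical rows standing in for parsed lines of the profile table.
rows = [
    {"ref": "ref_A1", "species": "species_A", "score": 100.0, "coverage": 2.5},
    {"ref": "ref_A2", "species": "species_A", "score": 100.0, "coverage": 4.0},
    {"ref": "ref_A3", "species": "species_A", "score": 98.0, "coverage": 9.0},
    {"ref": "ref_B1", "species": "species_B", "score": 95.0, "coverage": 1.2},
]


def best_hit_per_species(rows):
    """Keep one row per species: highest score, ties broken by coverage."""
    best = {}
    for row in rows:
        current = best.get(row["species"])
        # Tuples compare element by element, so coverage only matters
        # when the scores are equal.
        if current is None or (row["score"], row["coverage"]) > (
            current["score"],
            current["coverage"],
        ):
            best[row["species"]] = row
    return best


best = best_hit_per_species(rows)
for species, row in best.items():
    print(species, row["ref"], row["score"], row["coverage"])
```

Note that `ref_A3` loses despite its higher coverage, because score is compared first.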

shenwei356 closed this as completed Dec 6, 2022
