Proportion reported by classify_ctgs.py described incorrectly in README? #7

Open
rsharris opened this issue Sep 16, 2019 · 0 comments

@rsharris
Contributor

The README says "the proportion shared between each contig with a female reference is computed."

I may be wrong about what follows, but that description seems to contradict what the code does.

In both classify_fm_male_mode() and classify_fm_mode(), the value reported appears to be (C-F) / C, where C is the number of kmers in the contig (duplicates counted once per occurrence, all-N kmers excluded) and F is the number of those kmers that are also in the female reference.

So a reported proportion of 1.0 would mean that none of the contig's kmers were found in the female reference. That would be evidence that the contig comes from something not found in the female, presumably male-specific sequence.

A reported proportion of 0.0 would mean that all of the contig's kmers were found in the female reference, which is evidence that the contig is not male-specific.
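
To make the two extremes concrete, here is a minimal sketch of the (C-F) / C calculation as I read it. It is not the code from classify_ctgs.py: the helper names `kmers` and `reported_proportion`, the choice of k=3, and the toy sequences are all made up for illustration, and it ignores details such as reverse complements that the real script may or may not handle.

```python
def kmers(seq, k):
    """Yield every kmer in seq, duplicates included, skipping all-N kmers."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer != "N" * k:
            yield kmer


def reported_proportion(contig, female_kmers, k):
    """Return (C - F) / C, or None if the contig has no countable kmers.

    C = kmers in the contig (with duplicates, all-N kmers excluded)
    F = those kmers that are also in the female reference
    """
    c = 0
    f = 0
    for kmer in kmers(contig, k):
        c += 1
        if kmer in female_kmers:
            f += 1
    if c == 0:
        return None
    return (c - f) / c


# Hypothetical toy "female reference" and contigs, k chosen arbitrarily.
female = set(kmers("ACGTACGT", 3))

print(reported_proportion("ACGTACGT", female, 3))  # every kmer shared
print(reported_proportion("TTTTTTTT", female, 3))  # no kmer shared
```

Under these assumptions the first call gives 0.0 (fully shared with the female reference) and the second gives 1.0 (nothing shared), which is the opposite of "proportion shared" as the README words it.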
