calculate_haplotype_statistics.py slightly differs from block headers #124

Open
magnitov opened this issue May 4, 2022 · 0 comments
magnitov commented May 4, 2022

Hi @vibansal, I have a question about the calculate_haplotype_statistics.py script. I noticed that the phased count and num snps max blk values it reports differ from those in the BLOCK headers of my .hap file. Specifically, when I sum the number of phased SNVs over all block headers and look up the number of SNVs in the largest block, I get slightly different counts from the script output.

Summing the phased field over all blocks gives 189701. The header of my largest block is as follows:

BLOCK: offset: 12 len: 189252 phased: 188348 SPAN: 248704444 fragments 663113

However, calculate_haplotype_statistics.py, run with -i on, reports the following numbers:

phased count: 188484
num snps max blk: 188057

Is there some kind of filter implemented in the script that causes this?
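For reference, the header-side numbers can be reproduced with something along these lines. This is only a minimal sketch: the regular expression is inferred from the single BLOCK header quoted above, and the function names (`parse_block_headers`, `summarize_blocks`) are hypothetical helpers, not part of HapCUT2 or of calculate_haplotype_statistics.py.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

# Header layout assumed from the one BLOCK line quoted in this issue.
BLOCK_RE = re.compile(
    r"^BLOCK:\s+offset:\s+(\d+)\s+len:\s+(\d+)\s+phased:\s+(\d+)"
    r"\s+SPAN:\s+(\d+)\s+fragments\s+(\d+)"
)

FIELDS = ("offset", "len", "phased", "span", "fragments")


def parse_block_headers(lines: Iterable[str]) -> List[Dict[str, int]]:
    """Collect the integer fields of every BLOCK header line."""
    blocks = []
    for line in lines:
        match = BLOCK_RE.match(line)
        if match:
            blocks.append(dict(zip(FIELDS, map(int, match.groups()))))
    return blocks


def summarize_blocks(
    blocks: List[Dict[str, int]],
) -> Tuple[int, Optional[Dict[str, int]]]:
    """Return (sum of 'phased' over all blocks, block with the largest 'phased')."""
    total_phased = sum(block["phased"] for block in blocks)
    largest = max(blocks, key=lambda block: block["phased"]) if blocks else None
    return total_phased, largest
```

Calling `summarize_blocks(parse_block_headers(open(path)))` on a .hap file (where `path` is a placeholder) returns the header-based total and the largest block. With the numbers quoted above, the header totals exceed the script output by 189701 - 188484 = 1217 phased SNVs overall and by 188348 - 188057 = 291 in the largest block.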

Best,
Mikhail
