Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Phasing chromosome X is wrong when males are encoded as haploids in input VCF/BCF #99

Open
dtaliun opened this issue Apr 22, 2024 · 0 comments

Comments

@dtaliun
Copy link

dtaliun commented Apr 22, 2024

Hi,

Phasing results for chromosome X non-PAR regions are wrong when providing input VCF/BCF files with haploid male genotypes represented as 0 and 1 instead of 0/0 and 1/1.

The issue is in the lines 89-91 in the genotype_reader::readGenotypes():

for(int32_t i = 0 ; i < 2 * n_main_samples ; i += 2) {
	bool a0 = (bcf_gt_allele(main_buffer[i+0])==1);
	bool a1 = (bcf_gt_allele(main_buffer[i+1])==1);
	bool mi = (main_buffer[i+0] == bcf_gt_missing || main_buffer[i+1] == bcf_gt_missing);

When the genotype is haploid, the second value (i.e. a1) will hold bcf_int32_vector_end, denoting the end of genotype. Thus, current code will transform GT = '1' to GT = '1/0', which is heterozygous. All heterozygous calls will be replaced by missing values, which will completely invalidate phasing results.

I suggest the following change to support both haploid male gemotype coding (i.e. GT=0 and 1; and GT = 0/0 and 1/1) in input VCF/BCF:

for(int32_t i = 0 ; i < 2 * n_main_samples ; i += 2) {
	bool a0 = (bcf_gt_allele(main_buffer[i+0])==1);
	bool a1 = (main_buffer[i + 1] == bcf_int32_vector_end) ? a0 : (bcf_gt_allele(main_buffer[i+1])==1);
        bool mi = (main_buffer[i + 1] == bcf_int32_vector_end) ? (main_buffer[i+0] == bcf_gt_missing) : (main_buffer[i+0] == bcf_gt_missing || main_buffer[i+1] == bcf_gt_missing);

P.S. I caught this when I got an error similar to Issue #33 . Then, I noticed that phase_common phased many of haploid samples as hets.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant