Here is a test data set (an unphased VCF file converted to BGEN using qctool v2): Archive 2.zip
I want to import all BGEN genotypes into a numeric matrix, so I wrote the following function:
using BGEN, VCFTools

function convert_gt(t::Type{T}, b::Bgen) where T <: Real
    n = n_samples(b)
    p = n_variants(b)
    G = Matrix{t}(undef, p, n)
    # loop over each variant
    i = 1
    for v in iterator(b; from_bgen_start=true)
        dose = minor_allele_dosage!(b, v; T=t)
        copyto!(@view(G[i, :]), dose)
        i += 1
    end
    return G
end
But the imported genotype matrix does not agree with the VCF file.
This seems to be because BGEN checks which allele is the minor allele, so 2 and 0 are swapped relative to the VCF file:
idx = findall(skipmissing(Gtrue .!= Gtest)) # indices where Gtrue and Gtest do not agree
julia> [Gtrue[idx] Gtest[idx]]
66811×2 Matrix{Union{Missing, Float64}}:
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 0.0  2.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 ⋮
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 2.0  0.0
 0.0  2.0
 0.0  2.0
 0.0  2.0
 0.0  2.0
Is there a way to instead read all ALT alleles as 1 and all REF alleles as 0?
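To make the suspected swap concrete, here is a small standalone sketch. It does not call BGEN.jl at all; `toy_minor_dosage` is a hypothetical helper and the dosage vectors are made up. It only assumes that a "minor allele dosage" counts whichever allele is rarer within a variant, which would explain why 0 and 2 trade places whenever ALT happens to be the common allele.

```julia
# Toy illustration only; no BGEN.jl calls are used here.
# Hypothetical helper: dosage of whichever allele is rarer in this variant,
# given per-sample ALT allele counts (0, 1, or 2) for a biallelic variant.
function toy_minor_dosage(alt_dosage::AbstractVector{<:Real})
    alt_freq = sum(alt_dosage) / (2 * length(alt_dosage))
    # If ALT is the common allele, the minor allele is REF, so counts flip.
    return alt_freq > 0.5 ? 2 .- alt_dosage : copy(alt_dosage)
end

alt_common = [2.0, 2.0, 1.0, 2.0, 0.0]   # made-up data, ALT frequency 0.7
alt_rare   = [0.0, 0.0, 1.0, 0.0, 2.0]   # made-up data, ALT frequency 0.3

flipped   = toy_minor_dosage(alt_common)  # differs from the ALT counts
unchanged = toy_minor_dosage(alt_rare)    # identical to the ALT counts
```

Under this assumption, only variants where ALT is the more frequent allele would disagree with a VCF-based ALT count, and heterozygous entries (1) would agree in either case.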
Thanks for the tip about using ref_allele_dosage!. Now this works:
using BGEN, VCFTools

function convert_gt(t::Type{T}, b::Bgen) where T <: Real
    n = n_samples(b)
    p = n_variants(b)
    G = Matrix{t}(undef, p, n)
    # loop over each variant
    i = 1
    for v in iterator(b; from_bgen_start=true)
        dose = ref_allele_dosage!(b, v; T=t) # this reads REF allele as 1
        BGEN.alt_dosage!(dose, v.genotypes.preamble) # switch 2 and 0 (ie treat ALT as 1)
        copyto!(@view(G[i, :]), dose)
        i += 1
    end
    return G
end
Gtest = convert_gt(Float64, Bgen("target.typedOnly.masked.bgen"))
Gtrue = VCFTools.convert_gt(Float64, "target.typedOnly.masked.vcf.gz", trans=true)
julia> all(skipmissing(Gtrue .== Gtest))
true
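For anyone reusing this check, the sketch below shows how the comparison behaves when entries are `missing`. The two small matrices are made-up stand-ins for `Gtrue` and `Gtest`. Elementwise `.==` yields `missing` wherever either side is missing, and `skipmissing` drops those entries before `all` is evaluated, so missing genotypes neither pass nor fail the check. The `findall` with a predicate is just one possible way to locate real mismatches, not necessarily the approach used earlier in this thread.

```julia
# Toy matrices standing in for Gtrue and Gtest; the values are made up.
A = Union{Missing, Float64}[0.0 1.0; missing 2.0]
B = Union{Missing, Float64}[0.0 1.0; 2.0 0.0]

cmp = A .== B                    # each entry is true, false, or missing
agree = all(skipmissing(cmp))    # missing comparisons are ignored

# Positions where both values are present and actually differ
mismatches = findall(x -> x === false, cmp)
```

Here `agree` is `false` only because of the entry at row 2, column 2; the missing entry at row 2, column 1 does not count as a disagreement.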