Hi folks,

While evaluating Hail to see whether it fits my use case (a variant frequency database), I ran into an issue with importing VCF files from GIAB. It turns out that these use type `String` for the `PS` `##FORMAT` entry.

Subsequently, Hail fails to import these with the error:

`is.hail.utils.HailException: HG001.vcf.gz:column 492: invalid character 'P' in integer literal`

This is because of the default behaviour of `htsjdk` to "repair" these according to the VCF "standard".

`htsjdk` exposes `codec.disableOnTheFlyModifications` to toggle this behaviour, which can be called from somewhere around https://github.com/hail-is/hail/blob/master/hail/src/main/scala/is/hail/io/vcf/LoadVCF.scala#L1143.

Ideally I would like to expose this toggle also at the `import_vcf` method of Hail.

I'll create a PR to do so accordingly ASAP.
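To illustrate the mechanism described here, below is a minimal, self-contained Python sketch. It is not Hail's or htsjdk's actual code: the function names, the `STANDARD_FORMAT_TYPES` table, and the sample value `PATMAT` are all hypothetical, and the sketch simply assumes that a "repair" step replaces the declared type of `PS` with `Integer` before values are parsed.

```python
import re

# Assumption for illustration: the type a "repair" step would force for PS.
STANDARD_FORMAT_TYPES = {"PS": "Integer"}

# Matches the ID and Type of a ##FORMAT header line.
FORMAT_RE = re.compile(r"##FORMAT=<ID=(?P<id>[^,]+),.*?Type=(?P<type>[^,>]+)")


def declared_type(header_line):
    """Return (field id, declared type) from a ##FORMAT header line."""
    match = FORMAT_RE.match(header_line)
    if match is None:
        raise ValueError(f"not a ##FORMAT line: {header_line!r}")
    return match.group("id"), match.group("type")


def effective_type(field_id, declared, repair):
    """With repair on, the assumed standard type overrides the declared one."""
    if repair:
        return STANDARD_FORMAT_TYPES.get(field_id, declared)
    return declared


def parse_value(raw, vcf_type):
    """Parse a raw genotype field according to its effective type."""
    return int(raw) if vcf_type == "Integer" else raw


# A header line that declares PS as String, as in the files described above.
header = '##FORMAT=<ID=PS,Number=1,Type=String,Description="Phase set">'
field_id, declared = declared_type(header)
```

With `repair=False`, `parse_value("PATMAT", effective_type(field_id, declared, repair=False))` returns the string unchanged. With `repair=True`, the same call raises a `ValueError` because a non-numeric value is parsed as an integer, which mirrors the kind of failure shown in the error above.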
Comments/questions?
Thanks!
Regards,
Mark