<*> allele vs <NON_REF> #352
Comments
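Historically, samtools was the first caller to start using the unspecified allele and originally it was *. After this was codified in the specification as <*>, samtools/bcftools started producing <*> instead. I don't know why GATK uses <NON_REF> instead of <*>. |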
Interesting, the history seems complicated. I did a bit of digging to try and understand it.
Unfortunately GATK never got the message to change to It might be worth considering adding back |
Given that we have consistently said that gVCFs are NOT deliverable, I
think that we have an opportunity to change the <NON_REF> allele at EVERY
GATK version...
…On Tue, Oct 16, 2018 at 4:53 AM Petr Danecek ***@***.***> wrote:
Historically, samtools was the first caller to start using the unspecified
allele and originally it was *. After this was codified in the
specification as <*>, samtools/bcftools started producing <*> instead. I
don't know why GATK uses <NON_REF> instead of <*>.
|
@yfarjoun my best google-fu could not find a list of supported or unsupported deliverables for GATK, but searching for ‘<NON_REF>’ did yield a non-trivial number of tools that rely on or leverage the allele. So there are likely stored gVCFs that contain this allele, waiting to be combined in joint calling mid-project, or otherwise. I just don’t buy your argument. I’d argue that GATK switch over to the spec and perhaps add an option to output the old allele for a while. |
@nh13 I agree that GATK hasn't been public about what is or isn't a deliverable, but we have said that there should be consistency between the version that created the gVCF and the one that combines/genotypes them. Regardless, we agree on the conclusion that GATK can switch to the spec, rather than the spec changing. |
Yes, that would be great if GATK could switch to the spec. @lbergelson You are right, completely forgot about the |
@yfarjoun & @lbergelson I would think it's probably feasible to have GATK recognize |
It is, of course, feasible, but I'm actually concerned about |
Ah, yes, good point. |
To make it even more confusing, it seems like at some point we used |
@yfarjoun It can't be that hard, you just did it! ;-) |
I must admit, having I think there is ample opportunity for ambiguity in the v4.3 spec regarding how to interpret 1.6.1, Fixed fields, ALT:
5.5, Representing unspecified alleles and REF-only blocks (gVCF):
If the spec docs were more clear about this distinction, and GATK et al. tools can be updated to be spec-consistent, I guess I could live with it, though it still might make me a bit queasy... |
* Adding <NON_REF> to the spec and recommending its use over the <*> allele which is easily confused with *. * closes samtools#352
In the VCF 4.3 spec the <*> is specified as the unspecified alt allele. GATK uses the <NON_REF> allele instead for this purpose and treats <*> as a deprecated version of the * allele. This is obviously not matching the spec. Are there other software suites that generate GVCFs that use <*>?

@pd3 I'm wondering why <*> was selected as the canonical unspecified alt. Do other common callers produce gVCFs with <*>?
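
For anyone consuming gVCFs from both toolchains in the meantime, here is a minimal sketch of treating <*> and <NON_REF> as the same unspecified allele while keeping the * allele separate. The helper names (`classify_alt`, `normalize_alt_column`) are hypothetical and not part of GATK, bcftools, or htslib; the sketch assumes a plain tab-separated VCF record line with ALT in the fifth column.

```python
# Hypothetical helpers for illustration only; not an API of any real tool.

# Both spellings of the "unspecified alternate allele" discussed in this issue.
UNSPECIFIED_ALLELES = {"<*>", "<NON_REF>"}
# The bare star is a different allele (missing due to an overlapping deletion).
STAR_ALLELE = "*"


def classify_alt(allele: str) -> str:
    """Return a coarse category for a single ALT allele string."""
    if allele in UNSPECIFIED_ALLELES:
        return "unspecified"
    if allele == STAR_ALLELE:
        return "star"
    if allele == ".":
        return "missing"
    if allele.startswith("<") and allele.endswith(">"):
        return "symbolic"
    return "sequence"


def normalize_alt_column(record_line: str, target: str = "<*>") -> str:
    """Rewrite any unspecified-allele symbol in the ALT column to `target`.

    Assumes a tab-separated VCF data line; ALT is the fifth column.
    """
    fields = record_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")
    fields[4] = ",".join(target if a in UNSPECIFIED_ALLELES else a for a in alts)
    return "\t".join(fields)
```

Passing `target="<NON_REF>"` goes the other direction, which is roughly what an "output the old allele" option would amount to for downstream tools that still expect it.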