CNV, STRs, somatic Var Rep Group concept needed? #39
Comments
Also, bear in mind that while this "group" concept may seem similar to genotype (or haplotype), it is different in that haplotype and genotype represent a "complete" set of variants that must all co-occur. This concept is more of an "OR" than an "AND" of grouped variants.
@larrybabb I think this moves into the variant annotation area, by mixing the need for variant type representation (do we have a proper name for that?) with variant instance representation. Maybe we should just separate the ways variant types / equivalencies are represented from the instance (== case-specific) representation, into really different approaches? So we would have:
For instance:
Malachi Griffith added a summary of the distinct types of variants found in CiVIC in Var Anno repo issue 13.
@mbaudis we will set up a call for this discussion, as it may be too complex to fully separate all the concerns effectively in an issue thread. But to respond to the three instances at the bottom of your comment above: I'm not sure I agree that the 2 kinds of variants in bullet 1 are equivalent. The relationship between these two is a subset/superset relationship (I believe). The notion of a set or class of variants was recently spotlighted by Malachi on the Var Anno call as a type of Var rep that would be needed to support the "subject" attribute of many of the somatic interp types. I also see the similarity of this pattern to using copy number ranges to define a set of CNVs which all share a common interp. Finally, I agree with your third bullet that queries need the ability to query Var class types and/or copy count quantities. However, we haven't yet demonstrated how these kinds of qualifying attributes will be bundled with the variant concepts needed to build objects that can support the role of Var Anno "subjects".
@larrybabb I tried to put too much in the sentence (bullet 1); my note was on the "any deletion of one", and the "any deletion of all" as two different types of equivalence. In imprecise CNV reports (i.e. w/o phasing), the homozygous deletion would "self compose":
Without full allelic reconstruction it would not be clear how the 0 comes about; could be
... and so would be reported as 3 different variants (11, 0000000, 11). So this is a case where we get some meaningful outcome description (yes, there is a homozygous deletion) w/o knowing about the specific alleles. See example of array based data here. But such a (widespread, simplistic) model does not cover the composition of multiple variants; it just reports the outcome of this composition. The problem is that we have to accommodate both, but maybe not necessarily in all scenarios. And maybe really thinking this through could help to reduce complexity for each of those implementations.
Please use #46 for a consolidated discussion of CNV requirements.
CNVs, microsatellites and a variety of somatic variant representations have given rise to the notion of defining a variant grouping: a set of variant instances (not necessarily equivalent) which can be used for annotations, assertions, interpretations, evidence collection, etc.
In our modeling to date we have intentionally been focusing on the most atomic representations, rightfully so. However, with the advent of the copy number discussion, we have introduced the notion of providing a range for the quantity of copies for copy number gain variants.
All previous examples (afaik) have focused on defining a very specific instance of a variant (i.e. allele, haplotype, genotype). We sort of got into this realm of a "set" or "group" of instances when discussing PGx haplotypes as defined by CPIC/PharmGKB, but we never really resolved the concern.
Question...
To focus on CNVs and micro-satellites for now, what does it mean to specify a range of copy numbers (i.e. from 5 to 20 -or- more than 47)?
Possible answer..
A CNV instance is a specific number of copies of a given region of a chromosome: the region identifies the sequence, and the (non-negative) copy count identifies the instance. So, to specify a "range" of copies is essentially saying that any one of the "instances" in this range belongs to this group.
For example,
If you wanted to specify that a given interpretation is valid for any copy number between 4 and 10 of region 1000 to 2000 on chromosome 1, then you are saying that any specific copy instance between 4 copies and 10 copies would be covered by that interpretation.
Interpretation 1...
Variant Group : NC_000001.10:1000..2000 (4 to 10 copies)
Pathogenicity: Uncertain Significance
Interpretation 2...
Variant Group : NC_000001.10:1000..2000 (>10 copies)
Condition: Condition X
Pathogenicity: Pathogenic
Case 1 specific finding...
Variant found: NC_000001.10:1000..2000 (6 copies)
Result: interp 1 above matches and the assertion may potentially be used to inform the patient's results.
Case 2 specific finding...
Variant found: NC_000001.10:1000..2000 (20 copies)
Result: interp 2 above matches and the assertion may potentially be used to inform the patient's results.
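The matching in the two cases above can be sketched in a few lines of Python. This is only an illustration, not a proposed schema: the class and field names (`Region`, `CopyNumberInstance`, `CopyNumberGroup`, `contains`) are hypothetical, the region is assumed to be NC_000001.10:1000..2000 throughout, the "4 to 10" range is assumed to be inclusive, and ">10 copies" is assumed to mean an integer count of 11 or more.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """A stretch of a reference sequence (hypothetical minimal form)."""
    accession: str
    start: int
    end: int


@dataclass(frozen=True)
class CopyNumberInstance:
    """An atomic finding: one region with one specific copy count."""
    region: Region
    copies: int


@dataclass(frozen=True)
class CopyNumberGroup:
    """A set of instances: one region with a range of copy counts."""
    region: Region
    min_copies: Optional[int] = None  # inclusive; None means no lower bound
    max_copies: Optional[int] = None  # inclusive; None means no upper bound

    def contains(self, instance: CopyNumberInstance) -> bool:
        # An instance belongs to the group only if it is on the same region
        # and its copy count falls inside the group's range.
        if instance.region != self.region:
            return False
        if self.min_copies is not None and instance.copies < self.min_copies:
            return False
        if self.max_copies is not None and instance.copies > self.max_copies:
            return False
        return True


region = Region("NC_000001.10", 1000, 2000)

# Assumption: ">10 copies" is modeled as min_copies=11 for integer counts.
interp_1_group = CopyNumberGroup(region, min_copies=4, max_copies=10)
interp_2_group = CopyNumberGroup(region, min_copies=11)

case_1 = CopyNumberInstance(region, copies=6)
case_2 = CopyNumberInstance(region, copies=20)

print(interp_1_group.contains(case_1), interp_2_group.contains(case_1))
print(interp_1_group.contains(case_2), interp_2_group.contains(case_2))
```

Note that the group is an "OR" over instances: a finding matches if it is any one member of the range, which is the difference from a haplotype or genotype, where all members must co-occur.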
Hopefully, this highlights the distinction between defining "variants" that are "sets" or "groups" versus "instances", and the need to be able to do both in order to collect knowledge and associate it with actual findings.
This can also be applied to microsatellites, which are short tandem repeats that often get expressed as a range as well, as in the HTT gene for Huntington's disease. See ClinVar NM_002111.6(HTT):c.52CAG(27_35).
Individual assay findings produce a specific count of the tandem repeats, and one then determines whether that count falls into the variant group defined by NM_002111.6(HTT):c.52CAG(27_35) or into some other group that may have a different interpretation.
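As a rough sketch of that check, the range can be pulled out of the expression and compared against an observed repeat count. The regular expression below is an assumption that handles only this one `ref:c.<pos><unit>(<lo>_<hi>)` shape and is not a general HGVS parser; `RepeatGroup` and `parse_repeat_group` are hypothetical names, and the observed counts are made up for illustration.

```python
import re
from typing import NamedTuple


class RepeatGroup(NamedTuple):
    """A group of repeat instances: one repeat unit with a count range."""
    reference: str
    position: int
    unit: str
    min_repeats: int  # inclusive
    max_repeats: int  # inclusive

    def contains(self, repeat_count: int) -> bool:
        # An observed (atomic) repeat count belongs to the group
        # if it falls inside the inclusive range.
        return self.min_repeats <= repeat_count <= self.max_repeats


# Matches only expressions shaped like "NM_002111.6(HTT):c.52CAG(27_35)".
_RANGE_PATTERN = re.compile(
    r"^(?P<ref>.+):c\.(?P<pos>\d+)(?P<unit>[ACGT]+)\((?P<lo>\d+)_(?P<hi>\d+)\)$"
)


def parse_repeat_group(expression: str) -> RepeatGroup:
    match = _RANGE_PATTERN.match(expression)
    if match is None:
        raise ValueError(f"unsupported repeat expression: {expression!r}")
    return RepeatGroup(
        reference=match["ref"],
        position=int(match["pos"]),
        unit=match["unit"],
        min_repeats=int(match["lo"]),
        max_repeats=int(match["hi"]),
    )


group = parse_repeat_group("NM_002111.6(HTT):c.52CAG(27_35)")

# Hypothetical assay results: specific repeat counts for two samples.
print(group.contains(30))
print(group.contains(40))
```

A count outside the range (40 here) would need to be tested against whichever other groups have been defined, each of which may carry its own interpretation.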
As we explore variant representations, let's determine if we need to be separating the notion of atomic, specific, instance representations from group or set representations and provide a clean separation, if so.