Variant disappears with clinical filter - why? #3858

KickiLagerstedt · 2023-03-17T15:41:14Z

https://scout.scilifelab.se/cust002/F0051063/e7b97e909eedb79078bdf38fb7b18903

Gene annotation: splicing
Function annotation: splice_polypyrimidine_tract_variant

KickiLagerstedt · 2023-03-17T15:42:44Z

https://scout.scilifelab.se/cust002/F0051801/sv/variants/29235c154511427ab50bcf6eca43cf94

Gene annotation: splicing
Function annotation: splice_polypyrimidine_tract_variant

dnil · 2023-03-17T17:04:24Z

The most direct answer is that splice_polypyrimidine_tract_variant is not included in the clinical filter on its own. It was added to the SO_TERMS fairly recently (we added it on our side this autumn, VEP a bit earlier, and MIP a little later). We do not have it as a SEVERE_SO_TERM. Arguably, it should not be - a polypy variant kind of needs more careful splice model inspection, or other molecular experimental data, to determine if it is really relevant on its own - but we are always happy to discuss.

Then there are some complications for both these variants.
1, the deletion is fairly large, and while it encompasses only UTR exons on the transcript VEP selects, it does touch coding exons in alternative transcripts, and I would have needed to spend much longer to figure out if it was potentially disease relevant or not. Unless you have a strong opinion, I would suggest we don't do anything about this one right now, but we are aware that consequence annotation of SVs can be tricky. Keeping the annotation gene models set up to date is important, and some progress on that is being made currently (see Schug, https://github.com/Clinical-Genomics/schug).

2, the SNV is a -13 splice change with a very high SpliceAI Acceptor Gain delta.
It is specifically for this kind of intronic variant that we really depend on ClinVar to lift variants on the lists.
It has a Conflicting ClinVar annotation, but its revision status is not at the "TRUSTED" level. I think this is unfortunate as it really only has one old VUS annotation, and then several LP and P annotations later. A community driven (re)evaluation seems due. Seeing this, we could also consider actually just letting all the ClinVar annotations through the clinical filter, but we already get many silly Conflicting LB/B showing and would get even more. A better solution would be to dig deeper into ClinVar at annotation time. We have discussed this a few times, but not really started a project on it. It would not necessarily be a quick one, as we don't think this level of info is easily available from secondary sources, but would have to be parsed out from the ClinVar db data dumps first.

I would lean towards the latter solution here. But, even in the meantime, we could also try skipping the TRUSTED level filter for a while and see how it feels?
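
To illustrate the "dig deeper into ClinVar" idea, here is a minimal Python sketch that flags a Conflicting record whose only dissent is a VUS older than every P/LP submission. The `Submission` class, the function name, and the example years are hypothetical; this is not Scout or MIP code, and a real version would first need the per-submission data parsed out of the ClinVar dumps.

```python
from dataclasses import dataclass

PATHOGENIC = {"Pathogenic", "Likely pathogenic"}


@dataclass
class Submission:
    """One individual ClinVar submission (hypothetical, simplified)."""
    significance: str
    year: int


def outdated_vus_conflict(submissions):
    """Return True if the only non-P/LP submissions are VUS that predate all P/LP ones."""
    pathogenic_years = [s.year for s in submissions if s.significance in PATHOGENIC]
    others = [s for s in submissions if s.significance not in PATHOGENIC]
    if not pathogenic_years or not others:
        return False  # nothing pathogenic, or no conflict to explain
    first_pathogenic = min(pathogenic_years)
    return all(
        s.significance == "Uncertain significance" and s.year < first_pathogenic
        for s in others
    )


# Made-up example shaped like the SNV discussed above: one old VUS, later LP and P.
example = [
    Submission("Uncertain significance", 2015),
    Submission("Likely pathogenic", 2019),
    Submission("Pathogenic", 2021),
    Submission("Pathogenic", 2022),
]
print(outdated_vus_conflict(example))  # True
```

A flag like this could let such variants through without opening the filter to every Conflicting LB/B record, but the rule itself would need clinical review before use.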

dnil · 2023-03-20T09:01:17Z

Ok, we had a local discussion, and think that
1, this is largely on VEP/ENSEMBL and the hg19 transcript set. We do know this is not really maintained, and they wisely spend their effort on later genome builds. This adds one more example where it would have been great to have hg38: probably would just have worked, and if not, we would have a good case to update ENSEMBL.

2, deep diving clinvar seems like the good future solution, but presumably happens with nf-core/rd (ping @jemten?).
For now we agree that removing the TRUSTED requirement may bring more noise than we are ready for at the moment.

jemten · 2023-03-20T09:34:55Z

Currently we are trying to limit the development of MIP in favour of the new pipeline. Happy to discuss projects to refine the rankmodel/annotation for the new pipeline though.

KickiLagerstedt · 2023-03-22T15:49:04Z

One more: Founder mutation - duplication of exon 13 in BRCA1

https://scout.scilifelab.se/cust002/F0052362/sv/variants/3b4ecd35753a948e7e197a17abaf784c

dnil · 2023-03-22T17:53:49Z

Thank you for the persistence! This one I cannot explain at a glance - I'm suspecting #3534 which gives splice_polypyrimidine_tract_variant slightly higher score than coding_sequence_variant - which is in clinical filter - in combination with filtering on most_severe_consequence rather than all transcript consequences. We'll check tomorrow!
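
The suspected mechanism can be sketched in a few lines of Python. This is only an illustration under assumptions: `SO_RANK` is a short excerpt of the VEP severity order, `CLINICAL_FILTER_TERMS` is a hypothetical stand-in for the real clinical filter, and the function names are made up for this example rather than taken from Scout or MIP.

```python
# Excerpt of SO terms, most severe first, following the VEP ordering (not the full list).
SO_RANK = [
    "transcript_ablation",
    "splice_acceptor_variant",
    "missense_variant",
    "splice_polypyrimidine_tract_variant",
    "coding_sequence_variant",
    "intron_variant",
]

# Hypothetical clinical filter: coding_sequence_variant is in, the polypyrimidine term is not.
CLINICAL_FILTER_TERMS = {
    "transcript_ablation",
    "splice_acceptor_variant",
    "missense_variant",
    "coding_sequence_variant",
}


def most_severe(terms):
    """Pick the term that comes first in SO_RANK."""
    return min(terms, key=SO_RANK.index)


def passes_on_most_severe(transcript_terms):
    """Filter on the single most severe consequence across all transcripts."""
    all_terms = [term for terms in transcript_terms for term in terms]
    return most_severe(all_terms) in CLINICAL_FILTER_TERMS


def passes_on_any_transcript(transcript_terms):
    """Filter on every consequence of every transcript."""
    return any(term in CLINICAL_FILTER_TERMS for terms in transcript_terms for term in terms)


# One variant, two transcripts: the higher-ranked polypyrimidine term masks the coding hit.
variant = [
    ["splice_polypyrimidine_tract_variant", "intron_variant"],
    ["coding_sequence_variant"],
]
print(passes_on_most_severe(variant))     # False
print(passes_on_any_transcript(variant))  # True
```

If this is what happens, the variant drops out only because the single selected term is outside the filter, not because no transcript has a filter-relevant consequence.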

dnil · 2023-03-30T12:19:50Z

Here is another one from LK:
https://scout.scilifelab.se/cust003/23112/sv/variants/9578e6c3eb244c4a28cf1e3614800571

Again, the reason seems to be that the variant does not get terms as severe as it ought to have, presumably because VEP cannot easily tell how much is lost, even though the change is at least on par with an actual splice site loss.

Let's go through the SO-term order again. I think we follow the VEP order (https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html) fairly well, but there is also a step in MIP where the most_severe_consequence is selected that can be checked. It is annoying that polypyrimidine tract region modifiers get a higher impact than 'coding_sequence_variant'. I can see how they occasionally do, but the bulk of them don't.

If we still can't find any errors, one option is to evaluate including splice_polypyrimidine_tract_variant in the clinical filter, making sure we don't get too many hits left.

jemten · 2023-03-31T08:43:35Z

I believe that there is a need to have the "most severe consequence" synchronised across the board. Right now MIP uses the list from VEP to decide which features are the most severe ones, and I think Scout is using the same order. For MIP this means that we will rank and annotate the variant using splice_polypyrimidine_tract_variant rather than coding_sequence_variant. If we in the clinical filter don't believe this to be true, we should change it either in MIP/Scout or in the clinical filter.

dnil · 2023-03-31T09:13:42Z

A lazy and rather practical approach would be to simply add splice_polypyrimidine_tract_variant and possibly a couple of the other consequences SO-ranked above coding_sequence_variant to the clinical filter. It's hard to really fault VEP or SO for this; a polypyrimidine tract change can occasionally be very detrimental. Just not often. And a coding sequence change may not be so important - a lot of missense variation is not. What is missing here is a class in between coding_sequence_variant and transcript_ablation, such as a feature_truncation of one or more coding exons, say exon_ablation.
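
As a rough sketch of the "lazy" approach, the candidate terms can be listed by taking everything ranked above coding_sequence_variant that the filter does not yet contain. The order excerpt below is how I read the linked VEP documentation and should be verified there; the `clinical_terms` set and the helper name are hypothetical, not the actual Scout configuration.

```python
# Excerpt of the VEP severity order, most severe first (verify against the VEP docs).
VEP_ORDER_EXCERPT = [
    "splice_region_variant",
    "splice_donor_region_variant",
    "splice_polypyrimidine_tract_variant",
    "incomplete_terminal_codon_variant",
    "start_retained_variant",
    "stop_retained_variant",
    "synonymous_variant",
    "coding_sequence_variant",
]

# Hypothetical current filter content, for illustration only.
clinical_terms = {"splice_region_variant", "coding_sequence_variant"}


def missing_terms_ranked_above(term, order, current):
    """List terms ranked above `term` in `order` that are absent from `current`."""
    cutoff = order.index(term)
    return [t for t in order[:cutoff] if t not in current]


candidates = missing_terms_ranked_above(
    "coding_sequence_variant", VEP_ORDER_EXCERPT, clinical_terms
)
for term in candidates:
    print(term)
```

Not every candidate is a sensible addition (synonymous_variant would add a lot of noise), so the list is a starting point for discussion rather than a patch.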

dnil · 2023-03-31T09:21:15Z

Or, actually we can fault VEP a little for not being precise. The docs description of transcript_ablation is "A feature ablation whereby the deleted region includes a transcript feature". The way I (and VEP I believe) use this is for a deletion of a whole transcript, but it could have been just part of a transcript feature instead, which would have solved the first and last of these four examples. And same with transcript_amplification for example 3.
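
To make the distinction concrete, here is a small Python sketch that separates a whole-transcript deletion from one that removes or clips coding exons. It assumes 1-based inclusive coordinates; the function, its labels, and the example coordinates are all hypothetical, and this is not how VEP assigns terms. The "whole_coding_exon" label corresponds to the in-between class suggested above (exon_ablation).

```python
def classify_deletion(del_start, del_end, tx_start, tx_end, coding_exons):
    """Classify a deletion relative to one transcript (hypothetical labels).

    coding_exons is a list of (start, end) tuples, 1-based and inclusive.
    """
    if del_start <= tx_start and tx_end <= del_end:
        return "whole_transcript"  # what transcript_ablation is used for today
    if any(del_start <= start and end <= del_end for start, end in coding_exons):
        return "whole_coding_exon"  # the suggested in-between class
    if any(del_start <= end and start <= del_end for start, end in coding_exons):
        return "partial_coding_exon"
    return None  # no coding exon touched on this transcript


# Made-up transcript spanning 1000-9000 with two coding exons.
exons = [(2000, 2500), (4000, 4300)]
print(classify_deletion(500, 9500, 1000, 9000, exons))   # whole_transcript
print(classify_deletion(1900, 2600, 1000, 9000, exons))  # whole_coding_exon
print(classify_deletion(2400, 3000, 1000, 9000, exons))  # partial_coding_exon
print(classify_deletion(3000, 3500, 1000, 9000, exons))  # None
```

Running such a check per transcript, rather than only on the one VEP selects, would also catch deletions that touch coding exons only in alternative transcripts.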

dnil · 2023-04-06T06:05:26Z

Here is another one:
https://scout.scilifelab.se/cust002/F0050060/sv/variants/b4ce07fb3a862cb537cb63fbba7c6331

KickiLagerstedt changed the title ~~Variant dissappers with clinical filter - why?~~ Variant disappers with clinical filter - why? Mar 17, 2023

KickiLagerstedt changed the title ~~Variant disappers with clinical filter - why?~~ Variant disappears with clinical filter - why? Mar 17, 2023

dnil added the Under discussion label Mar 20, 2023

dnil added a commit that referenced this issue Apr 6, 2023

Add blogpost. Fix #3858. Fix #3752.

2e0e286

dnil closed this as completed in b78e598 Apr 11, 2023

dnil mentioned this issue Apr 24, 2023

most severe consequence polypyrimidine stretch Clinical-Genomics/MIP#2031

Open
