SV variants and large SNVs collide in variant_id #4682

Closed
dnil opened this issue Jun 19, 2024 · 0 comments

@dnil
Collaborator

dnil commented Jun 19, 2024

Describe the bug
SVs are normally described by position and type, e.g. chr1_123456_T_DUP. An SV caller (manta) may, however, also report small variants and, if the variant is small enough, even write out the full allele sequence, e.g. chr1_123456_T_TT. These variants typically also occur in the SNV/INDEL file, where they get the same variant id. The variant then appears only in the variant list of whichever type was parsed first. This is normally not a problem, but an institute may elect to analyse only variants of a certain type, e.g. SNVs but not SVs, in which case the variant could be missed.
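To illustrate the collision, here is a minimal sketch. It assumes the id is an md5 digest over the joined chromosome, position, ref, alt and case fields; the 32-character hex ids in the log below suggest a digest, but the exact fields used here are an assumption, and `variant_id` and `call_id` are hypothetical helpers, not Scout functions.

```python
import hashlib


def variant_id(chrom, pos, ref, alt, case_id):
    """Hypothetical id: md5 over the joined fields (assumed scheme)."""
    key = "_".join([chrom, str(pos), ref, alt, case_id])
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def call_id(call, case_id="engaginggrouse"):
    # Note that the category (snv/sv) is not part of the id in this sketch.
    return variant_id(call["chrom"], call["pos"], call["ref"], call["alt"], case_id)


# The same small insertion, reported by the SNV/INDEL caller and by the SV caller.
snv_call = {"chrom": "1", "pos": 123456, "ref": "T", "alt": "TT", "category": "snv"}
sv_call = {"chrom": "1", "pos": 123456, "ref": "T", "alt": "TT", "category": "sv"}
# A typical SV entry with a symbolic type instead of an allele sequence.
dup_call = {"chrom": "1", "pos": 123456, "ref": "T", "alt": "DUP", "category": "sv"}

collides = call_id(snv_call) == call_id(sv_call)   # same id for both categories
distinct = call_id(snv_call) != call_id(dup_call)  # symbolic alt gives another id
```

Because the category does not enter the id in this sketch, the sequence-style SV call and the SNV/INDEL call are indistinguishable, while the type-style SV call is not.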

To Reproduce

  1. Load a case with some collisions, e.g. engaginggrouse. Note how there are ID warnings:
2024-06-19 09:32:06 hasta.scilifelab.se scout.adapter.mongo.variant_loader[12314] WARNING Variant 900f6c92533766eabeb10a8573297f93 already exists in database - modifying
2024-06-19 09:32:06 hasta.scilifelab.se scout.adapter.mongo.variant_loader[12314] WARNING Variant 8fb3f02b57613d36aefa0b690eeb8e8f already exists in database - modifying
  2. Note how the variant is now only shown on the variant type page that was parsed first, here cancer_sv.
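To get an overview of which ids collided during a load, the warnings can be pulled out of the log. The sketch below assumes the log lines have the format shown in step 1; `colliding_ids` is a hypothetical helper.

```python
import re

# Matches the warning text shown in the log excerpt and captures the 32-hex id.
WARNING_RE = re.compile(r"WARNING Variant ([0-9a-f]{32}) already exists in database")


def colliding_ids(log_lines):
    """Return the distinct variant ids from 'already exists' warnings, in order."""
    ids = []
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


log = [
    "2024-06-19 09:32:06 hasta.scilifelab.se scout.adapter.mongo.variant_loader[12314] "
    "WARNING Variant 900f6c92533766eabeb10a8573297f93 already exists in database - modifying",
    "2024-06-19 09:32:06 hasta.scilifelab.se scout.adapter.mongo.variant_loader[12314] "
    "WARNING Variant 8fb3f02b57613d36aefa0b690eeb8e8f already exists in database - modifying",
]
found = colliding_ids(log)
```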

Expected behavior
We would still like to see the duplicated variants in both type views, although this is arguably non-obvious. A workaround might be to ensure SVs are loaded last and to convert colliding small SV calls to the less informative, large-SV style (position and type). We should carefully check that the matching causatives behaviour is still acceptable. The variant could then be tallied in two different views, which might get a bit complex.
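The workaround could look roughly like the sketch below. It is not Scout's loader: `make_key` and `load_variants` are hypothetical, a plain dict stands in for the database, and the `sv_type` field is assumed to have been read from the SV caller's record (e.g. the VCF SVTYPE INFO value).

```python
def make_key(call, alt):
    """Readable stand-in for the variant id: chrom_pos_ref_alt."""
    return f"{call['chrom']}_{call['pos']}_{call['ref']}_{alt}"


def load_variants(snv_calls, sv_calls):
    """Load SNVs first, then SVs; colliding small SV calls get a type-based key."""
    store = {}  # key -> call; stands in for the variant collection
    for call in snv_calls:
        store[make_key(call, call["alt"])] = call
    for call in sv_calls:
        key = make_key(call, call["alt"])
        if key in store:
            # Collision with an already loaded SNV/INDEL: fall back to the
            # less informative position-and-type form for the SV entry.
            key = make_key(call, call["sv_type"])
        store[key] = call
    return store


snv = {"chrom": "1", "pos": 123456, "ref": "T", "alt": "TT", "category": "snv"}
sv = {"chrom": "1", "pos": 123456, "ref": "T", "alt": "TT",
      "category": "sv", "sv_type": "INS"}
store = load_variants([snv], [sv])
```

With this ordering both entries survive, one per category, at the cost of the SV entry losing its explicit allele sequence. The effect on causative matching and on per-view tallies would still need checking.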

Additional context
Manta, which is the caller we currently primarily see with this behaviour, does not seem to have an option to force long-sv style entries.
