Annotation Service needs to be extended to support multiple alternate alleles #198

Open
nalinigans opened this issue Feb 3, 2022 · 1 comment
@nalinigans
Member

This is something to keep in mind when we start using this functionality.

Originally posted by @jPleyte in #186 (comment)

@nalinigans changed the title from "Annotation Service needs to be extended to support multiple alleles" to "Annotation Service needs to be extended to support multiple alternate alleles" on Feb 4, 2022
@jPleyte

jPleyte commented Feb 4, 2022

When there are multiple alternate alleles, there end up being multiple ways to write the same variant call, especially as the ALT and REF sequences grow longer. Rather than comparing the multiple alternate alleles directly with the reference value, we should normalise them first.

Karthic's note on this:

I'm assuming by match, you mean a simple string comparison, which (as you note) will not return any matches.
We face the same issue when combining multiple VCFs (combining samples each of which might be multi-allelic). We solve this in the same way you describe - normalizing the ALT alleles and then doing string matching. There's optimized code in GenomicsDB for normalizing and matching normalized alleles. It's tuned specifically for the combined VCF case, but if there are other uses I can try to make the interface to the module friendlier.
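
To make the "normalise, then string-match" idea concrete, here is a minimal Python sketch. It is not GenomicsDB's implementation, and the function names (`normalize_allele`, `split_and_normalize`, `matches`) are hypothetical. It assumes that trimming shared trailing and then shared leading bases from each REF/ALT pair is enough; full normalisation would also left-align indels against the reference sequence, which this sketch does not do.

```python
from typing import List, Tuple

Allele = Tuple[int, str, str]  # (position, REF, ALT)


def normalize_allele(pos: int, ref: str, alt: str) -> Allele:
    """Trim bases shared by one REF/ALT pair (no reference lookup)."""
    # Drop the common suffix, keeping at least one base in each allele.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Drop the common prefix, advancing the position for each base removed.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def split_and_normalize(pos: int, ref: str, alts: List[str]) -> List[Allele]:
    """Split a multi-allelic call into per-ALT records and normalise each."""
    return [normalize_allele(pos, ref, alt) for alt in alts]


def matches(query: Allele, record: Tuple[int, str, List[str]]) -> bool:
    """Compare a single-ALT query to a multi-allelic record after normalising both."""
    return normalize_allele(*query) in split_and_normalize(*record)


# A multi-allelic call: REF=CAG with a deletion (CG) and an insertion (CAGAG).
record = (100, "CAG", ["CG", "CAGAG"])
# The same deletion written against a shorter REF.
query = (100, "CA", "C")
print(matches(query, record))
```

A plain string comparison of the query ALT `C` against `CG` or `CAGAG` finds nothing, whereas comparing the normalised (position, REF, ALT) tuples does find the deletion.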
