ExAC frequencies are missing for some indels in the maf2maf output #91

Closed
ShwetaCh opened this issue Oct 25, 2016 · 2 comments

@ShwetaCh
Member

Hi Cyriac,

In the maf2maf VEP output, indels such as 5 56177848 TCAA T are not being populated with ExAC frequencies. The ExAC VCF, ExAC_nonTCGA.r0.3.1.sites.vep.vcf, does contain this allele, but in a non-normalized form: 5 56177848 . TCAACAACAACAA TCAACAACAA. That is probably why it is not being recognized.

-Shweta

@ckandoth ckandoth self-assigned this Oct 25, 2016
@ckandoth
Collaborator

Thanks for reporting this. This seems to be a bug in the ExAC plugin. See this line in their code. A fix seems simple, and we can send them a pull request. For each variant, tabix is used to pull the overlapping lines from the ExAC VCF, and their alleles are loaded into a list called @vcf_alleles. These are then compared one by one against the input variant allele. Before that comparison, though, they need to be normalized, either with a simple regex or with something like bcftools norm.
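To illustrate the kind of normalization meant here, below is a minimal Python sketch that trims the bases shared by REF and ALT. It is not the plugin's code (the plugin is written in Perl), and the function name `normalize_alleles` is hypothetical. It only trims shared bases and does not left-align against a reference genome the way bcftools norm does.

```python
def normalize_alleles(pos, ref, alt):
    """Trim bases shared by REF and ALT and return the minimal (pos, ref, alt).

    Hypothetical helper for illustration only. It keeps at least one base
    in each allele, as VCF requires, and does no left-alignment.
    """
    # Trim the shared suffix first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the shared prefix, shifting the position right per base dropped.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# The ExAC record from this issue reduces to the allele reported by maf2maf.
print(normalize_alleles(56177848, "TCAACAACAACAA", "TCAACAACAA"))
# (56177848, 'TCAA', 'T')
```

Applying a trim like this to each entry pulled from the ExAC VCF before the comparison would let the two representations of the same deletion match, assuming the input variant is already in its minimal form.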

@ckandoth
Collaborator

ckandoth commented Nov 3, 2016

This is now handled in 86e58e3

@ckandoth ckandoth closed this as completed Nov 3, 2016