New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes parsing variant annotations for multi-allelic rows #1346

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
4 participants
@majkiw
Contributor

majkiw commented Jan 10, 2017

When I tried to run the code in master I had java.lang.ArrayIndexOutOfBoundsException exceptions in org.bdgenomics.adam.converters.VariantContextConverter#fromArrayExtractor whenever multi-allelic row was encountered.
It turned out the index was off by 1 when accessing multivalued annotations.

For example in gvcf_dir/gvcf_multiallelic.g.vcf there is a line:
chr22 18030096 . TAAA T,TA,TAA,<NON_REF> 564.73 . BaseQRankSum=-0.133;ClippingRankSum=-1.438;DP=114;MLEAC=0,1,1,0;MLEAF=0.00,0.500,0.500,0.00;MQ=69.72;MQ0=0;MQRankSum=-0.686;ReadPosRankSum=-0.013 GT:AD:DP:GQ:PL 2/3:13,3,17,17,0:50:86:602,508,1628,86,678,553,137,342,0,281,467,744,353,309,659

ALT is T,TA,TAA,<NON_REF> with alt allele indexes (as returned by vc.getAlleleIndex(allele)): 1,2,3,4
When accessing annotation MLEAC=0,1,1,0 it would load indexes 1,2,3 for T,TA,TAA resulting in 1,1,0.
However there is never a value for reference allele here! So instead it should be loading indexes 0,1,2 resulting in values 0,1,1.

Incidentally test using gvcf_dir/gvcf_multiallelic.g.vcf wasn't breaking for you because there was always extra 0 as the last value corresponding to <NON_REF> - which was effecting in silent shift of values.
This becomes much easier to break in "real" VCF where there is no <NON_REF> anymore.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Jan 10, 2017

Can one of the admins verify this patch?

Can one of the admins verify this patch?

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Jan 10, 2017

Member

Jenkins, test this please

Member

heuermh commented Jan 10, 2017

Jenkins, test this please

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Jan 10, 2017

Member

Thanks for reporting this, @majkiw!

The test looks good; give me a bit to review the fix, to make sure there aren't other cases where we're off. That's probably the only method in VariantContextConverter that wasn't rewritten in the refactor.

If you could extract a test file from a non-gVCF with multiple alts, that would be helpful. Additionally we should probably craft one with some pathological cases.

Member

heuermh commented Jan 10, 2017

Thanks for reporting this, @majkiw!

The test looks good; give me a bit to review the fix, to make sure there aren't other cases where we're off. That's probably the only method in VariantContextConverter that wasn't rewritten in the refactor.

If you could extract a test file from a non-gVCF with multiple alts, that would be helpful. Additionally we should probably craft one with some pathological cases.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Jan 10, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1726/
Test PASSed.

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1726/
Test PASSed.

@heuermh heuermh modified the milestones: 0.21.1, 0.22.0 Jan 10, 2017

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Jan 11, 2017

Member

Thanks for the bugfix, @majkiw! @heuermh, I ran out of time today to make a review pass but this is first in line on my docket tomorrow.

Member

fnothaft commented Jan 11, 2017

Thanks for the bugfix, @majkiw! @heuermh, I ran out of time today to make a review pass but this is first in line on my docket tomorrow.

@fnothaft

LGTM! Thank you for the fix, @majkiw! I will merge this manually.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Jan 11, 2017

Member

Merged manually as 5f0ef28. Thank you for the contribution, @majkiw!

Member

fnothaft commented Jan 11, 2017

Merged manually as 5f0ef28. Thank you for the contribution, @majkiw!

@fnothaft fnothaft closed this Jan 11, 2017

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Jan 11, 2017

Member

Thank you, @majkiw!

Member

heuermh commented Jan 11, 2017

Thank you, @majkiw!

@majkiw

This comment has been minimized.

Show comment
Hide comment
@majkiw

majkiw Jan 12, 2017

Contributor

Thank you guys for processing this PR so quickly :)

Contributor

majkiw commented Jan 12, 2017

Thank you guys for processing this PR so quickly :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment