Reference model import code is borked #1568

Closed
fnothaft opened this Issue Jun 15, 2017 · 0 comments

fnothaft commented Jun 15, 2017

These three genotypes are all split from a single gVCF line with 3 variants and a symbolic model:

{"variant": {"contigName": "chr22", "start": 18030095, "end": 18030099, "names": [], "referenceAllele": "TAAA", "alternateAllele": "T", "filtersApplied": false, "filtersPassed": null, "filtersFailed": [], "\
annotation": {"ancestralAllele": null, "alleleCount": null, "readDepth": null, "forwardReadDepth": null, "reverseReadDepth": null, "referenceReadDepth": null, "referenceForwardReadDepth": null, "referenceRe\
verseReadDepth": null, "alleleFrequency": null, "cigar": null, "dbSnp": null, "hapMap2": null, "hapMap3": null, "validated": null, "thousandGenomes": null, "somatic": false, "transcriptEffects": [], "attrib\
utes": {"MQRankSum": "-0.686", "MQ": "69.72", "MLEAC": "0", "BaseQRankSum": "-0.133", "ClippingRankSum": "-1.438", "MLEAF": "0.00", "ReadPosRankSum": "-0.013", "DP": "114", "MQ0": "0"}}}, "contigName": "chr\
22", "start": 18030095, "end": 18030099, "variantCallingAnnotations": {"filtersApplied": true, "filtersPassed": true, "filtersFailed": [], "downsampled": null, "baseQRankSum": null, "fisherStrandBiasPValue"\
: null, "rmsMapQ": null, "mapq0Reads": null, "mqRankSum": null, "readPositionRankSum": null, "genotypePriors": [], "genotypePosteriors": [], "vqslod": null, "culprit": null, "attributes": {}}, "sampleId": "\
NA12878i", "sampleDescription": null, "processingDescription": null, "alleles": ["OTHER_ALT", "OTHER_ALT"], "expectedAlleleDosage": null, "referenceReadDepth": 13, "alternateReadDepth": 3, "readDepth": 50, \
"minReadDepth": null, "genotypeQuality": 86, "genotypeLikelihoods": [0.0, 0.0, 0.0], "nonReferenceLikelihoods": [0.0, 0.0, 0.0], "strandBiasComponents": [], "splitFromMultiAllelic": true, "phased": false, "\
phaseSetId": null, "phaseQuality": null}
{"variant": {"contigName": "chr22", "start": 18030095, "end": 18030099, "names": [], "referenceAllele": "TAAA", "alternateAllele": "TA", "filtersApplied": false, "filtersPassed": null, "filtersFailed": [], \
"annotation": {"ancestralAllele": null, "alleleCount": null, "readDepth": null, "forwardReadDepth": null, "reverseReadDepth": null, "referenceReadDepth": null, "referenceForwardReadDepth": null, "referenceR\
everseReadDepth": null, "alleleFrequency": null, "cigar": null, "dbSnp": null, "hapMap2": null, "hapMap3": null, "validated": null, "thousandGenomes": null, "somatic": false, "transcriptEffects": [], "attri\
butes": {"MQRankSum": "-0.686", "MQ": "69.72", "MLEAC": "1", "BaseQRankSum": "-0.133", "ClippingRankSum": "-1.438", "MLEAF": "0.500", "ReadPosRankSum": "-0.013", "DP": "114", "MQ0": "0"}}}, "contigName": "c\
hr22", "start": 18030095, "end": 18030099, "variantCallingAnnotations": {"filtersApplied": true, "filtersPassed": true, "filtersFailed": [], "downsampled": null, "baseQRankSum": null, "fisherStrandBiasPValu\
e": null, "rmsMapQ": null, "mapq0Reads": null, "mqRankSum": null, "readPositionRankSum": null, "genotypePriors": [], "genotypePosteriors": [], "vqslod": null, "culprit": null, "attributes": {}}, "sampleId":\
 "NA12878i", "sampleDescription": null, "processingDescription": null, "alleles": ["ALT", "OTHER_ALT"], "expectedAlleleDosage": null, "referenceReadDepth": 13, "alternateReadDepth": 17, "readDepth": 50, "mi\
nReadDepth": null, "genotypeQuality": 86, "genotypeLikelihoods": [0.0, -2.5118865E-9, 0.0], "nonReferenceLikelihoods": [0.0, "-Infinity", 0.0], "strandBiasComponents": [], "splitFromMultiAllelic": true, "ph\
ased": false, "phaseSetId": null, "phaseQuality": null}
{"variant": {"contigName": "chr22", "start": 18030095, "end": 18030099, "names": [], "referenceAllele": "TAAA", "alternateAllele": "TAA", "filtersApplied": false, "filtersPassed": null, "filtersFailed": [],\
 "annotation": {"ancestralAllele": null, "alleleCount": null, "readDepth": null, "forwardReadDepth": null, "reverseReadDepth": null, "referenceReadDepth": null, "referenceForwardReadDepth": null, "reference\
ReverseReadDepth": null, "alleleFrequency": null, "cigar": null, "dbSnp": null, "hapMap2": null, "hapMap3": null, "validated": null, "thousandGenomes": null, "somatic": false, "transcriptEffects": [], "attr\
ibutes": {"MQRankSum": "-0.686", "MQ": "69.72", "MLEAC": "1", "BaseQRankSum": "-0.133", "ClippingRankSum": "-1.438", "MLEAF": "0.500", "ReadPosRankSum": "-0.013", "DP": "114", "MQ0": "0"}}}, "contigName": "\
chr22", "start": 18030095, "end": 18030099, "variantCallingAnnotations": {"filtersApplied": true, "filtersPassed": true, "filtersFailed": [], "downsampled": null, "baseQRankSum": null, "fisherStrandBiasPVal\
ue": null, "rmsMapQ": null, "mapq0Reads": null, "mqRankSum": null, "readPositionRankSum": null, "genotypePriors": [], "genotypePosteriors": [], "vqslod": null, "culprit": null, "attributes": {}}, "sampleId"\
: "NA12878i", "sampleDescription": null, "processingDescription": null, "alleles": ["OTHER_ALT", "ALT"], "expectedAlleleDosage": null, "referenceReadDepth": 13, "alternateReadDepth": 17, "readDepth": 50, "m\
inReadDepth": null, "genotypeQuality": 86, "genotypeLikelihoods": [0.0, -1.9984014E-14, 0.0], "nonReferenceLikelihoods": [0.0, 0.0, 0.0], "strandBiasComponents": [], "splitFromMultiAllelic": true, "phased":\
 false, "phaseSetId": null, "phaseQuality": null}

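Alas, they have different non-reference likelihoods, which just ain't right: the second record (alternate allele "TA") has [0.0, "-Infinity", 0.0], while the other two have [0.0, 0.0, 0.0]. These are from adam-core/src/test/resources/gvcf_dir/gvcf_multiallelic.g.vcf.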
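
The mismatch can be checked mechanically. Below is a minimal Python sketch, not part of ADAM, that groups split genotypes by sample and site and flags any group whose nonReferenceLikelihoods disagree. The records are trimmed copies of the three above (only the fields needed for the check), the helper name `inconsistent_sites` is made up for illustration, and the grouping key assumes that genotypes sharing a sample, contig, start, and end were split from the same gVCF line.

```python
import json
from collections import defaultdict

# Trimmed copies of the three records in this issue: only the fields
# needed for the consistency check are kept.
RECORDS = [
    '{"contigName": "chr22", "start": 18030095, "end": 18030099, '
    '"sampleId": "NA12878i", "variant": {"alternateAllele": "T"}, '
    '"nonReferenceLikelihoods": [0.0, 0.0, 0.0]}',
    '{"contigName": "chr22", "start": 18030095, "end": 18030099, '
    '"sampleId": "NA12878i", "variant": {"alternateAllele": "TA"}, '
    '"nonReferenceLikelihoods": [0.0, "-Infinity", 0.0]}',
    '{"contigName": "chr22", "start": 18030095, "end": 18030099, '
    '"sampleId": "NA12878i", "variant": {"alternateAllele": "TAA"}, '
    '"nonReferenceLikelihoods": [0.0, 0.0, 0.0]}',
]


def inconsistent_sites(lines):
    """Return the (sample, contig, start, end) groups whose split
    genotypes do not all carry the same nonReferenceLikelihoods.

    Assumption: genotypes sharing this key came from one gVCF line.
    """
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        gt = json.loads(line)
        key = (gt["sampleId"], gt["contigName"], gt["start"], gt["end"])
        groups[key].append(
            (gt["variant"]["alternateAllele"], gt["nonReferenceLikelihoods"])
        )
    return {
        key: entries
        for key, entries in groups.items()
        if any(nrl != entries[0][1] for _, nrl in entries)
    }


bad = inconsistent_sites(RECORDS)
for key, entries in bad.items():
    print(key)
    for alt, nrl in entries:
        print("  ", alt, nrl)
```

On the trimmed records this reports the single chr22 site, with the "TA" genotype as the odd one out.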

@fnothaft fnothaft added the bug label Jun 15, 2017

@fnothaft fnothaft added this to the 0.23.0 milestone Jun 15, 2017

@fnothaft fnothaft self-assigned this Jun 15, 2017

fnothaft added a commit to fnothaft/adam that referenced this issue Jun 19, 2017

[ADAM-1568] Fix gVCF reference model import/export code.
Resolves bigdatagenomics#1568. Fixes a bug where the wrong non-reference allele index was set
for multiallelic sites with >1 known alternate allele and a non-reference model.
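
For context on why the allele index matters: the VCF specification orders diploid genotype likelihoods so that genotype j/k (with j ≤ k) sits at position k*(k+1)/2 + j. The Python sketch below is not ADAM's code. It assumes the non-reference likelihoods are the three entries for ref/ref, ref/NON_REF, and NON_REF/NON_REF, and that the symbolic allele is listed after all known alternates. The function names are hypothetical.

```python
def pl_index(j, k):
    """Position of diploid genotype j/k in a VCF likelihood array,
    using the ordering from the VCF specification."""
    if j > k:
        j, k = k, j
    return k * (k + 1) // 2 + j


def non_ref_indices(num_known_alts):
    """Positions of ref/ref, ref/NON_REF and NON_REF/NON_REF.

    Assumption: allele 0 is the reference, alleles 1..num_known_alts are
    the known alternates, and the symbolic allele comes last.
    """
    n = num_known_alts + 1  # allele index of the symbolic allele
    return [pl_index(0, 0), pl_index(0, n), pl_index(n, n)]


# Three known alternates, as in the site from this issue.
print(non_ref_indices(3))  # [0, 10, 14]
# One known alternate: the same genotypes sit at different positions.
print(non_ref_indices(1))  # [0, 3, 5]
```

Under these assumptions the positions depend only on how many alternates the original line has, not on which alternate a split genotype describes, so every genotype split from one line should read the same three entries.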

fnothaft added a commit to fnothaft/adam that referenced this issue Jun 22, 2017

[ADAM-1568] Fix gVCF reference model import/export code.
Resolves bigdatagenomics#1568. Fixes a bug where the wrong non-reference allele index was set
for multiallelic sites with >1 known alternate allele and a non-reference model.

@heuermh heuermh closed this in 34feb3d Jun 22, 2017

@heuermh heuermh added this to Completed in Release 0.23.0 Jan 4, 2018
