-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keep information for <NON_REF> alleles #2116
Conversation
Test FAILed. Build result: FAILUREGitHub pull request #2116 of commit efbbe95 automatically merged.Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'[EnvInject] - Loading node environment variables.Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prbWiping out workspace first.Cloning the remote Git repositoryCloning repository https://github.com/bigdatagenomics/adam.git > git init /home/jenkins/workspace/ADAM-prb # timeout=10Fetching upstream changes from https://github.com/bigdatagenomics/adam.git > git --version # timeout=10 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/heads/:refs/remotes/origin/ # timeout=15 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10Fetching upstream changes from https://github.com/bigdatagenomics/adam.git > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > git rev-parse origin/pr/2116/merge^{commit} # timeout=10 > git branch -a -v --no-abbrev --contains 6ebbd52 # timeout=10Checking out Revision 6ebbd52 (origin/pr/2116/merge) > git config core.sparsecheckout # timeout=10 > git checkout -f 6ebbd52b75b4382b4f5995d6a288e1fa6f6d3f36First time build. Skipping changelog.Triggering ADAM-prb ? 2.7.5,2.11,2.2.2,ubuntuADAM-prb ? 2.7.5,2.11,2.2.2,ubuntu completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'Test FAILed. |
Test FAILed. Build result: FAILUREGitHub pull request #2116 of commit 36199e6 automatically merged.Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'[EnvInject] - Loading node environment variables.Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prbWiping out workspace first.Cloning the remote Git repositoryCloning repository https://github.com/bigdatagenomics/adam.git > git init /home/jenkins/workspace/ADAM-prb # timeout=10Fetching upstream changes from https://github.com/bigdatagenomics/adam.git > git --version # timeout=10 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/heads/:refs/remotes/origin/ # timeout=15 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10Fetching upstream changes from https://github.com/bigdatagenomics/adam.git > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > git rev-parse origin/pr/2116/merge^{commit} # timeout=10 > git branch -a -v --no-abbrev --contains c37240f # timeout=10Checking out Revision c37240f (origin/pr/2116/merge) > git config core.sparsecheckout # timeout=10 > git checkout -f c37240fc448ce498a0bb7a44aa75247b5edd5045First time build. Skipping changelog.Triggering ADAM-prb ? 2.7.5,2.11,2.2.2,ubuntuADAM-prb ? 2.7.5,2.11,2.2.2,ubuntu completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'Test FAILed. |
Hello @karenfeng! As an alternative to this approach, have you considered splitting |
@karenfeng @fnothaft Would it be possible to start a discussion around this issue? I've done major refactoring of genotypes in #2117 but have yet to commit changes to non ref fields in that pull request, waiting for this to be resolved. |
Ping @karenfeng @fnothaft for discussion |
Closing as WontFix for lack of discussion. |
INFO and FORMAT fields with Number=<A/R/G> are not properly stored in ADAM for the <NON_REF> allele. This is a problem with both default and non-default fields. During conversion to ADAM, most fields lose any information related to the <NON_REF> allele. This is with the exception of PL, for which the majority of values is stored between
genotypeLikelihoods
andnonRefLikelihoods
. However, this still drops the value for the ALT/<NON_REF> genotype.As a result of this bug, the output list for most fields with Number=<A/R/G> does not contain enough values in the resulting gVCF (those with Number=A should have 2 but have 1, those with Number=R should have 3 but have 2, and those with Number=G should have 10 but have 6). For PL, the missing value for the ALT/<NON_REF> genotype contains
3223
.To fix this problem, this PR does the following:
VariantAnnotation.attributes
for INFO fields, andVariantCallingAnnotations.attributes
for FORMAT fields).