New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAM-1682] Add variant quality field. #1684

Merged
merged 3 commits into from Oct 11, 2017

Conversation

Projects
4 participants
@fnothaft
Member

fnothaft commented Aug 25, 2017

Resolves #1682. Depends on bigdatagenomics/bdg-formats#147.

@fnothaft fnothaft added this to the 0.23.0 milestone Aug 25, 2017

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Aug 25, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2330/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1684/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 5ecbb19 # timeout=10Checking out Revision 5ecbb19 (origin/pr/1684/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 5ecbb19b61ef95dfbdab2aa0e835bceae1473b3aFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.6.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,2.1.0,centosADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.1.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

AmplabJenkins commented Aug 25, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2330/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1684/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 5ecbb19 # timeout=10Checking out Revision 5ecbb19 (origin/pr/1684/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 5ecbb19b61ef95dfbdab2aa0e835bceae1473b3aFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.6.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,2.1.0,centosADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.1.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Aug 25, 2017

Member

LGTM. Could you add an additional log.info line at or around

https://github.com/fnothaft/adam/blob/bd697e38e681a1402acc4f9a91da1a3532ccaa6b/adam-core/src/main/scala/org/bdgenomics/adam/converters/VariantContextConverter.scala#L311

to complement this one

https://github.com/fnothaft/adam/blob/bd697e38e681a1402acc4f9a91da1a3532ccaa6b/adam-core/src/main/scala/org/bdgenomics/adam/converters/VariantContextConverter.scala#L275

with a message about splitting alternate alleles, and how some fields such as Variant.quality may no longer be particularly relevant or correct?

Member

heuermh commented Aug 25, 2017

LGTM. Could you add an additional log.info line at or around

https://github.com/fnothaft/adam/blob/bd697e38e681a1402acc4f9a91da1a3532ccaa6b/adam-core/src/main/scala/org/bdgenomics/adam/converters/VariantContextConverter.scala#L311

to complement this one

https://github.com/fnothaft/adam/blob/bd697e38e681a1402acc4f9a91da1a3532ccaa6b/adam-core/src/main/scala/org/bdgenomics/adam/converters/VariantContextConverter.scala#L275

with a message about splitting alternate alleles, and how some fields such as Variant.quality may no longer be particularly relevant or correct?

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Sep 5, 2017

Member

Jenkins, test this please.

Member

fnothaft commented Sep 5, 2017

Jenkins, test this please.

@fnothaft fnothaft referenced this pull request Sep 5, 2017

Closed

Release version 0.11.2 #148

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Sep 5, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2351/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1684/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 5124f5b # timeout=10Checking out Revision 5124f5b (origin/pr/1684/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 5124f5b68422d1663052015858ec05753c4a687cFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.11,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,2.1.0,centosADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.1.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

AmplabJenkins commented Sep 5, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2351/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1684/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 5124f5b # timeout=10Checking out Revision 5124f5b (origin/pr/1684/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 5124f5b68422d1663052015858ec05753c4a687cFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.11,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,2.1.0,centosADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.1.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Sep 5, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2352/
Test PASSed.

AmplabJenkins commented Sep 5, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2352/
Test PASSed.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Sep 5, 2017

Coverage Status

Coverage decreased (-0.2%) to 83.382% when pulling ea8a23d on fnothaft:issues/1682-variant-quality into 511f925 on bigdatagenomics:master.

coveralls commented Sep 5, 2017

Coverage Status

Coverage decreased (-0.2%) to 83.382% when pulling ea8a23d on fnothaft:issues/1682-variant-quality into 511f925 on bigdatagenomics:master.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Sep 5, 2017

Member

This is not mergeable until the 0.11.2 release of bdg-formats is pushed.

Member

fnothaft commented Sep 5, 2017

This is not mergeable until the 0.11.2 release of bdg-formats is pushed.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Sep 6, 2017

Member

This is good to go from my side.

@heuermh WRT your comment:

LGTM. Could you add an additional log.info line at or around...

I just pushed a second commit that actually removes the original log.info. I was working with someone recently who was seeing really slow performance when working with VCF data from ADAM. A bit of profiling revealed that specific line to be the culprit. This also leads to a lot of log spamming. Since the info that line outputs could be obtained by looking at the input and output of the conversion, I feel like it isn't worth the configuration hassle of keeping it in place.

Member

fnothaft commented Sep 6, 2017

This is good to go from my side.

@heuermh WRT your comment:

LGTM. Could you add an additional log.info line at or around...

I just pushed a second commit that actually removes the original log.info. I was working with someone recently who was seeing really slow performance when working with VCF data from ADAM. A bit of profiling revealed that specific line to be the culprit. This also leads to a lot of log spamming. Since the info that line outputs could be obtained by looking at the input and output of the conversion, I feel like it isn't worth the configuration hassle of keeping it in place.

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Sep 6, 2017

Coverage Status

Coverage increased (+0.1%) to 83.568% when pulling 7085e7b on fnothaft:issues/1682-variant-quality into 51efbaf on bigdatagenomics:master.

coveralls commented Sep 6, 2017

Coverage Status

Coverage increased (+0.1%) to 83.568% when pulling 7085e7b on fnothaft:issues/1682-variant-quality into 51efbaf on bigdatagenomics:master.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Sep 6, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2353/
Test PASSed.

AmplabJenkins commented Sep 6, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2353/
Test PASSed.

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Sep 6, 2017

Member

I just pushed a second commit that actually removes the original log.info. I was working with someone recently who was seeing really slow performance when working with VCF data from ADAM. A bit of profiling revealed that specific line to be the culprit. This also leads to a lot of log spamming. Since the info that line outputs could be obtained by looking at the input and output of the conversion, I feel like it isn't worth the configuration hassle of keeping it in place.

+1 to removing that log line.

We need something somewhere that the Variant.quality field is suspect when a multi-allelic row is split. Is that documentation? What if we emitted a single log message only on the first instance? Added a new INFO binary flag? Added splitFromMultiAllelic field to Variant or VariantAnnotation?

Additionally, and we may want to scan across the codebase for other instances of this, log.info(String) does not have the log.isInfoEnabled() guard, whereas log.info(String, Object) and other slf4j parameterized message methods do, and our wrapper logInfo(String) does.

https://github.com/bigdatagenomics/utils/blob/master/utils-misc/src/main/scala/org/bdgenomics/utils/misc/Logging.scala#L43

Member

heuermh commented Sep 6, 2017

I just pushed a second commit that actually removes the original log.info. I was working with someone recently who was seeing really slow performance when working with VCF data from ADAM. A bit of profiling revealed that specific line to be the culprit. This also leads to a lot of log spamming. Since the info that line outputs could be obtained by looking at the input and output of the conversion, I feel like it isn't worth the configuration hassle of keeping it in place.

+1 to removing that log line.

We need something somewhere that the Variant.quality field is suspect when a multi-allelic row is split. Is that documentation? What if we emitted a single log message only on the first instance? Added a new INFO binary flag? Added splitFromMultiAllelic field to Variant or VariantAnnotation?

Additionally, and we may want to scan across the codebase for other instances of this, log.info(String) does not have the log.isInfoEnabled() guard, whereas log.info(String, Object) and other slf4j parameterized message methods do, and our wrapper logInfo(String) does.

https://github.com/bigdatagenomics/utils/blob/master/utils-misc/src/main/scala/org/bdgenomics/utils/misc/Logging.scala#L43

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Sep 14, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2379/
Test PASSed.

AmplabJenkins commented Sep 14, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2379/
Test PASSed.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Oct 9, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2413/
Test PASSed.

AmplabJenkins commented Oct 9, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2413/
Test PASSed.

@heuermh

heuermh approved these changes Oct 9, 2017

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Oct 11, 2017

Member

bdg-formats 0.11.3 has been released, will update when it shows up on Maven Central

Member

heuermh commented Oct 11, 2017

bdg-formats 0.11.3 has been released, will update when it shows up on Maven Central

@fnothaft

This comment has been minimized.

Show comment
Hide comment
Member

fnothaft commented Oct 11, 2017

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Oct 11, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2425/
Test PASSed.

AmplabJenkins commented Oct 11, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2425/
Test PASSed.

@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Oct 11, 2017

Member

Rebase and merge as three commits, or would you like me to squash?

Member

heuermh commented Oct 11, 2017

Rebase and merge as three commits, or would you like me to squash?

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Oct 11, 2017

Member

Please keep as 3.

Member

fnothaft commented Oct 11, 2017

Please keep as 3.

@heuermh heuermh merged commit d9420dd into bigdatagenomics:master Oct 11, 2017

1 of 2 checks passed

codacy/pr Not so good... This pull request quality could be better.
Details
default Merged build finished.
Details
@heuermh

This comment has been minimized.

Show comment
Hide comment
@heuermh

heuermh Oct 11, 2017

Member

Thank you, @fnothaft

Member

heuermh commented Oct 11, 2017

Thank you, @fnothaft

@fnothaft fnothaft deleted the fnothaft:issues/1682-variant-quality branch Oct 11, 2017

@heuermh heuermh added this to Completed in Release 0.23.0 Jan 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment