"upgrade" adam to forked "0.20.0" #581

ryan-williams · 2016-09-15T06:16:15Z

this ports us on to most of the non-trivial changes that have happened in ADAM since 0.19.0, using my forked "0.20.0".

ADAM's VCF-writing now requires Sample.sampleId to match Genotype.sampleId, and I am using the "sampleName" pulled from readsets.args.Base / ReadSets rather than the loaded-from-data sampleName on Read.

The latter is now unused so I removed it. It might make sense to keep it around explicitly in some way… seems like what ADAM now does with RecordGroups is the right thing, maybe we'll want to use that if we care about that data again.

This change is

coveralls · 2016-09-16T04:53:04Z

Coverage increased (+0.02%) to 78.593% when pulling 99caf8f on ryan-williams:adam into a6771ee on hammerlab:master.

coveralls · 2016-09-16T05:07:42Z

Coverage increased (+8.06%) to 86.628% when pulling bbad0c4 on ryan-williams:adam into a6771ee on hammerlab:master.

arahuja

LGTM, I'd say just wait for/ask for a new ADAM release then managing one though

arahuja · 2016-09-16T17:51:28Z

src/main/scala/org/hammerlab/guacamole/variants/CalledSomaticAllele.scala

@@ -41,6 +41,9 @@ case class CalledSomaticAllele(sampleName: SampleName,
      .newBuilder
      .setAlleles(seqAsJavaList(Seq(GenotypeAllele.Ref, GenotypeAllele.Alt)))
      .setSampleId(sampleName)
+      .setContigName(contigName)


Weird, what was happening before?

i don't know, otomh!

the semantics of the downstream adam and htsjdk models here have changed in this upgrade, so a few things have to specified differently than they were before.

it's also possible that these aren't needed now either; I added them while debugging an issue where variant's INFO fields were coming out blank, which turned out to be due to the sampleId not being set consistently throughout.

I can dig in further if you want.

arahuja · 2016-09-16T17:53:25Z

pom.xml

@@ -16,16 +16,11 @@
  <name>guacamole: variant caller</name>

  <properties>
-    <adam.version>0.19.0</adam.version>


Any reason you don't like these as properties?

I'm going with the heuristic that if it's used twice, it should be factored out, otherwise it should just be on the dependency-declaration.

Factoring out versions that only appear once seems more verbose and less clear, to me.

lmk if you disagree.

I'd agree that it's not worth it to factor out for a single dependency, but once there seems useful, easy to get a read on the major dependencies and their version ... but probably fine as is.

Yea, I don't think any of these qualify as "major dependencies".

The two that do, spark and hadoop, coincidentally (or not!) are used in multiple places.

arahuja · 2016-09-16T17:55:26Z

src/test/resources/tough.golden.vcf

@@ -16,7 +16,92 @@
 ##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
 ##INFO=<ID=VQSLOD,Number=1,Type=Float,Description="Log odds ratio of being a true variant versus being false under the trained gaussian mixture model">
 ##INFO=<ID=culprit,Number=1,Type=String,Description="The annotation which was the worst performing in the Gaussian mixture model, likely the reason why the variant was filtered out">
-#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	tumor.chr20
+##contig=<ID=1,length=249250621,assembly=GRCh37>


Is this just what is generated now? Otherwise, don't really need all the contigs listed?

yea, it's what is generated now.

if there was a more narrow a way to test this i'd be open to it, but this seems like the most straightforward way to have an end-to-end test.

No that makes sense, OOC - did these positions match the test positions before or are they different/whatever was output? It's possible more positions were output than the test positions.

This predates this PR, but all 25 positions that used to be checked in SomaticStandardRealDataSuite are here, as well as 9 more.

coveralls · 2016-09-16T18:56:18Z

Coverage increased (+11.5%) to 90.123% when pulling d195e6e on ryan-williams:adam into a3018fd on hammerlab:master.

arahuja · 2016-09-16T19:04:27Z

Coverage increased (+11.5%)

?? Does that seem right?

ryan-williams · 2016-09-16T21:19:58Z

No, coveralls is clearly confused on this PR.

It just occurred to me that it's likely to do with my caching target here; coverage-info from previous runs is hanging around.

Let me either make sure that doesn't happen, or remove that change.

In general I think zinc is reliable enough at this point that we could save a lot of time not recompiling everything every build, and only rarely (if ever) having to clear caches; having repros for such zinc bugs as cause us to need to clear caches could be useful anyway.

travis apparently caps out at 3gb

coveralls · 2016-09-16T21:36:35Z

Coverage increased (+0.3%) to 78.953% when pulling 7e1a114 on ryan-williams:adam into a3018fd on hammerlab:master.

ryan-williams · 2016-09-16T21:45:04Z

OK, I think I fixed coveralls.

ryan-williams · 2016-09-16T22:56:49Z

I'd say just wait for/ask for a new ADAM release then managing one though

There's no "managing" to be done; we have the equivalent of a SNAPSHOT but that won't break under us.

ryan-williams force-pushed the adam branch from 58f5f02 to 19ebbc3 Compare September 15, 2016 21:24

ryan-williams force-pushed the adam branch from 99caf8f to bbad0c4 Compare September 16, 2016 04:54

arahuja reviewed Sep 16, 2016

View reviewed changes

ryan-williams added 3 commits September 16, 2016 18:20

pass sampleName to Pileup

7af0873

"upgrade" adam to forked "0.20.0"

5ba5d83

remove Read.sampleName

d81f14a

ryan-williams force-pushed the adam branch from bbad0c4 to d195e6e Compare September 16, 2016 18:39

ryan-williams added 2 commits September 16, 2016 21:22

experiment with travis caching build artifacts

8137dd7

reduce test memory

7e1a114

travis apparently caps out at 3gb

ryan-williams force-pushed the adam branch from d195e6e to 7e1a114 Compare September 16, 2016 21:23

ryan-williams merged commit 2f86006 into hammerlab:master Sep 16, 2016

ryan-williams deleted the adam branch September 16, 2016 23:18

ryan-williams mentioned this pull request Sep 16, 2016

Scripts overhaul #579

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"upgrade" adam to forked "0.20.0" #581

"upgrade" adam to forked "0.20.0" #581

ryan-williams commented Sep 15, 2016 •

edited

Loading

coveralls commented Sep 16, 2016

coveralls commented Sep 16, 2016

arahuja left a comment

arahuja Sep 16, 2016

ryan-williams Sep 16, 2016

arahuja Sep 16, 2016

ryan-williams Sep 16, 2016

arahuja Sep 16, 2016

ryan-williams Sep 16, 2016

arahuja Sep 16, 2016

ryan-williams Sep 16, 2016

arahuja Sep 16, 2016

ryan-williams Sep 16, 2016

coveralls commented Sep 16, 2016

arahuja commented Sep 16, 2016

ryan-williams commented Sep 16, 2016

coveralls commented Sep 16, 2016

ryan-williams commented Sep 16, 2016 •

edited

Loading

ryan-williams commented Sep 16, 2016

"upgrade" adam to forked "0.20.0" #581

"upgrade" adam to forked "0.20.0" #581

Conversation

ryan-williams commented Sep 15, 2016 • edited Loading

coveralls commented Sep 16, 2016

coveralls commented Sep 16, 2016

arahuja left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coveralls commented Sep 16, 2016

arahuja commented Sep 16, 2016

ryan-williams commented Sep 16, 2016

coveralls commented Sep 16, 2016

ryan-williams commented Sep 16, 2016 • edited Loading

ryan-williams commented Sep 16, 2016

ryan-williams commented Sep 15, 2016 •

edited

Loading

ryan-williams commented Sep 16, 2016 •

edited

Loading