
Increasing unit test coverage for VariantContextConverter #1276

Merged: 2 commits merged into bigdatagenomics:master from heuermh:vcc-coverage on Nov 18, 2016

Conversation


heuermh commented Nov 16, 2016

No description provided.


AmplabJenkins commented Nov 17, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1617/

Build result: ABORTED

[...truncated 3 lines...]
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
Wiping out workspace first.
Cloning the remote Git repository
Cloning repository https://github.com/bigdatagenomics/adam.git
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
Checking out Revision 603d8409297f5eeba7f30846362c4933efeacaf5 (origin/pr/1276/merge)
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in ABORTED, so aborting...


heuermh commented Nov 17, 2016

If I ignore the hanging unit test, then I see VCF header-related exceptions:

- don't lose any variants when piping as VCF !!! IGNORED !!!
2016-11-16 17:19:15 ERROR Utils:95 - Aborting task
java.lang.IllegalStateException: Key IndelQD found in VariantContext field FILTER at 1:14397 but this key isn't defined in the VCFHeader.  We require all VCFs to have complete VCF headers by default.
    at htsjdk.variant.vcf.VCFEncoder.fieldIsMissingFromHeaderError(VCFEncoder.java:173)
    at htsjdk.variant.vcf.VCFEncoder.getFilterString(VCFEncoder.java:154)
    at htsjdk.variant.vcf.VCFEncoder.encode(VCFEncoder.java:106)
    at htsjdk.variant.variantcontext.writer.VCFWriter.add(VCFWriter.java:222)
    at org.seqdoop.hadoop_bam.VCFRecordWriter.writeRecord(VCFRecordWriter.java:140)
    at org.seqdoop.hadoop_bam.KeyIgnoringVCFRecordWriter.write(KeyIgnoringVCFRecordWriter.java:60)
    at org.seqdoop.hadoop_bam.KeyIgnoringVCFRecordWriter.write(KeyIgnoringVCFRecordWriter.java:38)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1113)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1111)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1277)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1119)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1091)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
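The failure above comes from htsjdk's strict header validation: the VCFEncoder refuses to emit a record whose FILTER id is not declared in the VCFHeader. As a minimal sketch of that invariant (a toy model with hypothetical names, not htsjdk's actual classes; the real logic lives in VCFEncoder.getFilterString):

```scala
// Toy model of htsjdk's strict-header check. VcfHeaderModel and
// encodeFilterField are illustrative stand-ins, not htsjdk API.
final case class VcfHeaderModel(filterIds: Set[String])

def encodeFilterField(header: VcfHeaderModel, filters: Set[String]): String = {
  filters.foreach { id =>
    // PASS is implicitly defined; every other FILTER id used on a record
    // must have a corresponding ##FILTER meta-line in the header.
    if (id != "PASS" && !header.filterIds.contains(id)) {
      throw new IllegalStateException(
        s"Key $id found in VariantContext field FILTER " +
          "but this key isn't defined in the VCFHeader.")
    }
  }
  if (filters.isEmpty) "." else filters.mkString(";")
}
```

The fix therefore belongs on the writing side: ensure a FILTER meta-line for every filter id in use (here, IndelQD) is added to the header before any records are encoded.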

fnothaft commented Nov 17, 2016

We'll need #1260 + a bit more to fix the header lines issue...

fnothaft left a comment

That hang is kind of odd, but I have a guess. I might change the tee /dev/null command in VariantContextRDDSuite to tee to a file and see what you're writing out. I'm thinking that what's happening is that we're writing a VCF whose header is missing a FILTER line for the IndelQD filter. When we read that back from the pipe, we are probably getting an IllegalStateException from tribble/htsjdk regarding the header line. I'm guessing this causes the reader thread to exit while blocking the piping thread pool from shutting down. (Yeah, that's a bug. Sigh!) Can you test this hypothesis? If it looks right, open an issue and I'll fix the pipe problems.

@@ -152,6 +152,22 @@ class ADAMContextSuite extends ADAMFunSuite {
assert(vcs.size === 6)

val vc = vcs.head

/*

fnothaft commented Nov 17, 2016:
If all's the same to you, I'd nix this comment.

case (true, true) => vcb.passFilters
}

val somatic: java.lang.Boolean = Option(variant.getSomatic).getOrElse(false)

fnothaft commented Nov 17, 2016:
I'd lose the : java.lang.Boolean. Is there a reason you need it?

heuermh commented Nov 17, 2016:

Yeah, it wouldn't compile without it. Odd that the lines above were OK.
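A small standalone sketch of why the annotation matters (getSomatic below is a stand-in for the Avro-generated accessor, assumed to return a nullable java.lang.Boolean):

```scala
// Stand-in for the Avro-generated accessor, which returns a nullable
// java.lang.Boolean rather than a primitive scala.Boolean.
def getSomatic: java.lang.Boolean = null

// Option(...) yields an Option[java.lang.Boolean]. Without an expected type,
// getOrElse(false) must unify java.lang.Boolean with scala.Boolean, whose
// least upper bound is Any -- so the result is typed Any, and that then fails
// to compile wherever a java.lang.Boolean is required.
val untyped = Option(getSomatic).getOrElse(false) // inferred as Any

// With the explicit annotation, scalac boxes the default value (via the
// Predef.boolean2Boolean implicit), keeping the whole expression at
// java.lang.Boolean.
val somatic: java.lang.Boolean = Option(getSomatic).getOrElse(false)
```

That would also explain why the surrounding lines compiled without annotations: they never mix a boxed accessor result with an unboxed Scala default in the same expression.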


heuermh commented Nov 17, 2016

Yes, I believe that is what is happening with the hang. Teeing to another file results in an empty file.


AmplabJenkins commented Nov 18, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1631/

test("Convert somatic htsjdk site-only SNV to ADAM") {
val converter = new VariantContextConverter

// not sure why this doesn't work

fnothaft commented Nov 18, 2016:
This one too.

@heuermh heuermh force-pushed the heuermh:vcc-coverage branch from ad0c0a8 to 842f3df Nov 18, 2016
@heuermh heuermh force-pushed the heuermh:vcc-coverage branch from 842f3df to 447ca9a Nov 18, 2016

AmplabJenkins commented Nov 18, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1632/

@fnothaft fnothaft merged commit e0979a9 into bigdatagenomics:master Nov 18, 2016
1 check passed: default (Merged build finished.)
@heuermh heuermh deleted the heuermh:vcc-coverage branch Nov 18, 2016