[ADAM-1334] Clean up serialization issues in Broadcast region join. #1336

Merged
merged 3 commits on Jan 6, 2017

3 participants
@fnothaft
Member

fnothaft commented Jan 3, 2017

Resolves #1334. Eliminates type erasure problems by using a separate concrete implementation for each broadcast type. Depends on bigdatagenomics/utils#97.
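A minimal, hypothetical Scala sketch of the idea (the names below are illustrative, not ADAM's actual classes): with one generic container, every instantiation erases to the same runtime class, while one concrete class per payload type keeps the runtime classes distinct, so each can carry its own serializer.

```scala
// Hypothetical sketch: a single generic container erases to one runtime
// class, so a class-keyed serializer registry cannot tell instances apart.
case class GenericArray[T](values: Array[(String, T)])

// One concrete, monomorphic class per broadcast payload keeps runtime
// classes distinct, so each can get its own registered serializer.
case class VariantArray(values: Array[(String, String)])
case class GenotypeArray(values: Array[(String, Int)])

object ErasureDemo {
  def main(args: Array[String]): Unit = {
    val a = GenericArray[String](Array(("1", "A")))
    val b = GenericArray[Int](Array(("1", 1)))
    // Both instantiations share one erased runtime class:
    assert(a.getClass == b.getClass)
    // The concrete classes remain distinguishable at runtime:
    assert(classOf[VariantArray] != classOf[GenotypeArray])
  }
}
```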

@AmplabJenkins

AmplabJenkins commented Jan 3, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1715/

Build result: FAILURE

[...truncated 3 lines...]
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
Wiping out workspace first.
Cloning the remote Git repository
Cloning repository https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git init /home/jenkins/workspace/ADAM-prb # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git --version # timeout=10
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/heads/:refs/remotes/origin/ # timeout=15
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
 > /home/jenkins/git2/bin/git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1336/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a --contains 9b8e3ed # timeout=10
 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1336/merge^{commit} # timeout=10
Checking out Revision 9b8e3ed (origin/pr/1336/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f 9b8e3edf595ec4270648b72a986c9726ba1085af
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@fnothaft

Member

fnothaft commented Jan 4, 2017

Jenkins, retest this please.

@AmplabJenkins

AmplabJenkins commented Jan 4, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1718/

Build result: FAILURE

[...truncated 3 lines...]
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
Wiping out workspace first.
Cloning the remote Git repository
Cloning repository https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git init /home/jenkins/workspace/ADAM-prb # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git --version # timeout=10
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/heads/:refs/remotes/origin/ # timeout=15
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
 > /home/jenkins/git2/bin/git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1336/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a --contains 9b8e3ed # timeout=10
 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1336/merge^{commit} # timeout=10
Checking out Revision 9b8e3ed (origin/pr/1336/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f 9b8e3edf595ec4270648b72a986c9726ba1085af
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@heuermh

Member

heuermh commented Jan 4, 2017

I assume you would like this in 0.21.0? Feel free to set the milestone.

@fnothaft fnothaft added this to the 0.21.0 milestone Jan 4, 2017

@fnothaft

Member

fnothaft commented Jan 4, 2017

Ah, yes! We need it for the Variant DB challenge. I've just set the milestone to 0.21.0. I'm going to fix the build failure (it's a small issue where the move_to_xyz scripts need to be updated), and once the build passes, I'll cut a utils release so that we can remove the snapshot dependency.

@heuermh

Member

heuermh commented Jan 4, 2017

Sounds good, thanks!

@fnothaft fnothaft referenced this pull request Jan 5, 2017

Open

WIP ADAM queries. #6

@fnothaft

Member

fnothaft commented Jan 5, 2017

Just pushed a fix for the move_to_xyz scripts. This will not be ready to merge though, until the utils release is cut (which I will do after this passes).

@AmplabJenkins

AmplabJenkins commented Jan 5, 2017

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1720/

Build result: FAILURE

[...truncated 3 lines...]
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
Wiping out workspace first.
Cloning the remote Git repository
Cloning repository https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git init /home/jenkins/workspace/ADAM-prb # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git --version # timeout=10
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/heads/:refs/remotes/origin/ # timeout=15
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
 > /home/jenkins/git2/bin/git config --add remote.origin.fetch +refs/heads/:refs/remotes/origin/ # timeout=10
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1336/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a --contains fa154df # timeout=10
 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1336/merge^{commit} # timeout=10
Checking out Revision fa154df (origin/pr/1336/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f fa154dfaa75c1ab94a9583693abb1703a9803f21
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@heuermh

heuermh approved these changes Jan 5, 2017

import scala.collection.JavaConversions._
import scala.math.max
import scala.reflect.ClassTag
private[adam] case class NucleotideContigFragmentArray(


@heuermh

heuermh Jan 5, 2017

Member

I think I know the reason already, but why do we need concrete classes for each of these?


@fnothaft

fnothaft Jan 5, 2017

Member

Type erasure at serialization time. Since we previously only had generics, we were getting errors that were "last registered IntervalArray class wins". It was great. Uplifting, really.
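A minimal sketch of that failure mode (hypothetical names, modeling a Kryo-style registry keyed by runtime class): two generic instantiations share one erased class, so the second registration silently replaces the first.

```scala
// Sketch only: a serializer registry keyed by runtime Class, as Kryo's is.
// Because IntervalArray[String] and IntervalArray[Int] erase to the same
// runtime class, the second registration overwrites the first --
// "last registered IntervalArray class wins".
class IntervalArray[T]

object RegistryDemo {
  def main(args: Array[String]): Unit = {
    var registry = Map.empty[Class[_], String]
    registry += (new IntervalArray[String]).getClass -> "string serializer"
    registry += (new IntervalArray[Int]).getClass    -> "int serializer"
    // Both keys are the same erased class; only one entry survives.
    assert(registry.size == 1)
    assert(registry.values.head == "int serializer")
  }
}
```

With one concrete subclass per element type, each subclass has its own runtime class and therefore its own registry entry.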

@@ -74,6 +82,15 @@ trait TreeRegionJoin[T, U] {
*/
case class InnerTreeRegionJoin[T: ClassTag, U: ClassTag]() extends RegionJoin[T, U, T, U] with TreeRegionJoin[T, U] {
def broadcastAndJoin(tree: IntervalArray[ReferenceRegion, T],


@heuermh

heuermh Jan 5, 2017

Member

Yeah I like this API design...

@AmplabJenkins

AmplabJenkins commented Jan 5, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1721/
Test PASSed.

@fnothaft

Member

fnothaft commented Jan 5, 2017

I will cut a bdg-utils release tomorrow AM.

@heuermh

Member

heuermh commented Jan 5, 2017

Is this on HEAD related?

$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/small.vcf small.adam
$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/sorted.vcf sorted.adam
$ ./bin/adam-shell
...
scala> val variants = sc.loadVariants("/Users/heuermh/working/adam/*.adam/*")
variants: org.bdgenomics.adam.rdd.variant.VariantRDD = 
VariantRDD(MapPartitionsRDD[1] at map at ADAMContext.scala:388,SequenceDictionary{
1->249250621, 0
2->249250621, 1
13->249250621, 2},WrappedArray(FILTER=<ID=IndelFS,Description="FS > 200.0">, FILTER=<ID=IndelQD,Description="QD < 2.0">, FILTER=<ID=IndelReadPosRankSum,Description="ReadPosRankSum < -20.0">, FILTER=<ID=LowQual,Description="Low quality">, FILTER=<ID=VQSRTrancheSNP99.50to99.60,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -0.5377 <= x < -0.1787">, FILTER=<ID=VQSRTrancheSNP99.60to99.70,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -1.0634 <= x < -0.5377">, FILTER=<ID=VQSRTrancheSNP99.70to99.80,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -1.7119 <...

scala> variants.rdd.collect.head
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2017-01-05 10:35:30 ERROR Executor:95 - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2017-01-05 10:35:30 ERROR Executor:95 - Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2017-01-05 10:35:30 WARN  TaskSetManager:70 - Lost task 1.0 in stage 0.0 (TID 1, localhost): java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

2017-01-05 10:35:30 ERROR TaskSetManager:74 - Task 1 in stage 0.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, localhost): java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at scala.Option.foreach(Option.scala:236)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
	at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
	at $iwC$$iwC$$iwC$$iwC.<init>(<console>:48)
	at $iwC$$iwC$$iwC.<init>(<console>:50)
	at $iwC$$iwC.<init>(<console>:52)
	at $iwC.<init>(<console>:54)
	at <init>(<console>:56)
	at .<init>(<console>:60)
	at .<clinit>(<console>)
	at .<init>(<console>:7)
	at .<clinit>(<console>)
	at $print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
	at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
	at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
Member

heuermh commented Jan 5, 2017

Is this on HEAD related?

$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/small.vcf small.adam
$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/sorted.vcf sorted.adam
$ ./bin/adam-shell
...
scala> val variants = sc.loadVariants("/Users/heuermh/working/adam/*.adam/*")
variants: org.bdgenomics.adam.rdd.variant.VariantRDD = 
VariantRDD(MapPartitionsRDD[1] at map at ADAMContext.scala:388,SequenceDictionary{
1->249250621, 0
2->249250621, 1
13->249250621, 2},WrappedArray(FILTER=<ID=IndelFS,Description="FS > 200.0">, FILTER=<ID=IndelQD,Description="QD < 2.0">, FILTER=<ID=IndelReadPosRankSum,Description="ReadPosRankSum < -20.0">, FILTER=<ID=LowQual,Description="Low quality">, FILTER=<ID=VQSRTrancheSNP99.50to99.60,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -0.5377 <= x < -0.1787">, FILTER=<ID=VQSRTrancheSNP99.60to99.70,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -1.0634 <= x < -0.5377">, FILTER=<ID=VQSRTrancheSNP99.70to99.80,Description="Truth sensitivity tranche level for SNP model at VQS Lod: -1.7119 <...

scala> variants.rdd.collect.head
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2017-01-05 10:35:30 ERROR Executor:95 - Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2017-01-05 10:35:30 ERROR Executor:95 - Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
2017-01-05 10:35:30 WARN  TaskSetManager:70 - Lost task 1.0 in stage 0.0 (TID 1, localhost): java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

2017-01-05 10:35:30 ERROR TaskSetManager:74 - Task 1 in stage 0.0 failed 1 times; aborting job
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, localhost): java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at scala.Option.foreach(Option.scala:236)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
	at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
	at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
	at $iwC$$iwC$$iwC$$iwC.<init>(<console>:48)
	at $iwC$$iwC$$iwC.<init>(<console>:50)
	at $iwC$$iwC.<init>(<console>:52)
	at $iwC.<init>(<console>:54)
	at <init>(<console>:56)
	at .<init>(<console>:60)
	at .<clinit>(<console>)
	at .<init>(<console>:7)
	at .<clinit>(<console>)
	at $print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
	at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
	at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
	at org.apache.spark.repl.Main$.main(Main.scala:31)
	at org.apache.spark.repl.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ArrayStoreException: org.bdgenomics.formats.avro.Genotype
	at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
	at scala.Array$.slowcopy(Array.scala:81)
	at scala.Array$.copy(Array.scala:107)
	at scala.collection.mutable.ResizableArray$class.copyToArray(ResizableArray.scala:77)
	at scala.collection.mutable.ArrayBuffer.copyToArray(ArrayBuffer.scala:47)
	at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
	at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
	at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:927)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
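The ArrayStoreException in the trace above is a generic JVM behavior, not something ADAM-specific: erased generics let a value of the wrong runtime class into a collection, and the failure only surfaces later, when the elements are copied into a runtime-checked array (here, during collect). A minimal Java sketch of the same mechanism, with purely illustrative names and no Spark or ADAM code involved:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal illustration of the failure mode in the trace above: generics are
// erased, so a value of the wrong runtime class can enter a List<String>;
// the JVM only notices during the runtime-checked copy into a String[].
public class ErasureDemo {
    static boolean copyFails() {
        List<Object> raw = new ArrayList<>();
        raw.add(42); // an Integer, analogous to a Genotype where a Variant was expected
        @SuppressWarnings("unchecked")
        List<String> strings = (List<String>) (List<?>) raw; // erased cast, no check yet
        try {
            String[] unused = strings.toArray(new String[0]); // checked copy throws here
            return false;
        } catch (ArrayStoreException expected) {
            return true; // same exception class as in the Spark executor logs
        }
    }

    public static void main(String[] args) {
        System.out.println("ArrayStoreException thrown: " + copyFails());
    }
}
```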
Member

fnothaft commented Jan 5, 2017

Just cut the new bdg-utils release. Once that pushes to maven, I will update this, and we should be able to merge it and cut the ADAM release.

Member

heuermh commented Jan 5, 2017

bdg-utils 0.2.11 is now available on Maven Central. Since posting the stack trace above, I've been looking into other things. Should I investigate it further this evening?

fnothaft added some commits Jan 3, 2017

[ADAM-1334] Clean up serialization issues in Broadcast region join.
Resolves #1334. Eliminates type erasure problems by having different concrete
implementations for each broadcast. Depends on
bigdatagenomics/utils#97.
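The strategy the commit message describes — concrete implementations per broadcast rather than a single generic one — can be sketched in plain Java. The class and method names below are hypothetical, not ADAM's actual API: the point is that each concrete subclass carries a reified array constructor for its element type, so array stores are checked against the real element class instead of an erased one.

```java
import java.util.List;
import java.util.function.IntFunction;

// Hedged sketch of the fix strategy: instead of one generic implementation
// whose element type is erased, each payload type gets a concrete subclass
// that supplies a real T[] constructor. All names here are illustrative.
abstract class TypedBroadcast<T> {
    protected abstract IntFunction<T[]> arrayFactory(); // reified by each subclass

    T[] collectToArray(List<T> values) {
        T[] out = arrayFactory().apply(values.size()); // a genuine T[], not Object[]
        for (int i = 0; i < values.size(); i++) {
            out[i] = values.get(i);
        }
        return out;
    }
}

// One concrete implementation per element type removes the erasure ambiguity.
final class StringBroadcast extends TypedBroadcast<String> {
    @Override
    protected IntFunction<String[]> arrayFactory() {
        return String[]::new;
    }
}

public class FixSketch {
    public static void main(String[] args) {
        String[] arr = new StringBroadcast().collectToArray(List.of("a", "b"));
        System.out.println(arr.getClass().getSimpleName() + " of length " + arr.length);
    }
}
```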
Member

fnothaft commented Jan 6, 2017

bdg-utils 0.2.11 is now available on Maven Central. Since posting the stack trace above, I've been looking into other things. Should I investigate it further this evening?

I've just updated to point at the 0.2.11 release. Let me see if I can repro that issue you ran into on my side.

Member

fnothaft commented Jan 6, 2017

Oh, I see what's going on in your example. If you want to call sc.loadVariants, you need to provide the -only_variants flag when running vcf2adam. So, you should either change:

$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/small.vcf small.adam
$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/sorted.vcf sorted.adam

to

$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/small.vcf small.adam -only_variants
$ ./bin/adam-submit vcf2adam adam-core/src/test/resources/sorted.vcf sorted.adam -only_variants

Or, change:

val variants = sc.loadVariants("/Users/heuermh/working/adam/*.adam/*")

to either

val variants = sc.loadGenotypes("/Users/heuermh/working/adam/*.adam/*")
  .toVariantContextRDD
  .toVariantRDD

or, more simply:

val genotypes = sc.loadGenotypes("/Users/heuermh/working/adam/*.adam/*")

;)

AmplabJenkins commented Jan 6, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1722/
Test PASSed.

Member

heuermh commented Jan 6, 2017

Of course, that does it!

I look forward to wiping out the -only_variants flag in #1327. :)

@heuermh heuermh merged commit 5dcd70b into bigdatagenomics:master Jan 6, 2017

1 check passed

default Merged build finished.
Member

heuermh commented Jan 6, 2017

Thank you, @fnothaft!
