Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConcurrentModificationException in ReadsPipelineSparkIntegrationTest #5680

Open
lbergelson opened this issue Feb 14, 2019 · 5 comments
Open

Comments

@lbergelson
Copy link
Member

lbergelson commented Feb 14, 2019

We've seen at least 1 non-deterministically occurring instance of ConcurrentModificationException while running the ReadsPipelineSparkIntegrationTest.testReadsPipelineSpark[5]

It seems like there is a race condition somewhere.

testReadsPipelineSpark[5](ReadsPipeline(bam='/home/travis/build/broadinstitute/gatk/src/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.tiny.unaligned.bam', args='--align --bwa-mem-index-image /home/travis/build/broadinstitute/gatk/src/test/resources/large/human_g1k_v37.20.21.fasta.img --known-sites src/test/resources/org/broadinstitute/hellbender/tools/BQSR/dbsnp_138.b37.20.10m-10m100.vcf'))
com.esotericsoftware.kryo.KryoException: java.util.ConcurrentModificationException
Serialization trace:
classes (sun.misc.Launcher$AppClassLoader)
classLoader (org.apache.hadoop.conf.Configuration)
conf (org.apache.hadoop.hdfs.DistributedFileSystem)
fs (hdfs.jsr203.HadoopFileSystem)
hdfs (hdfs.jsr203.HadoopPath)
path (htsjdk.samtools.seekablestream.SeekablePathStream)
seekableStream (htsjdk.tribble.TribbleIndexedFeatureReader)
featureReader (org.broadinstitute.hellbender.engine.FeatureDataSource)
featureSources (org.broadinstitute.hellbender.engine.FeatureManager)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:101)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:606)
	at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:109)
	at com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:39)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
	at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628)
	at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:209)
	at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:140)
	at org.apache.spark.serializer.SerializerManager.dataSerializeStream(SerializerManager.scala:170)
	at org.apache.spark.storage.BlockManager$$anonfun$dropFromMemory$3.apply(BlockManager.scala:1382)
	at org.apache.spark.storage.BlockManager$$anonfun$dropFromMemory$3.apply(BlockManager.scala:1377)
	at org.apache.spark.storage.DiskStore.put(DiskStore.scala:69)
	at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1377)
	at org.apache.spark.storage.memory.MemoryStore.org$apache$spark$storage$memory$MemoryStore$$dropBlock$1(MemoryStore.scala:524)
	at org.apache.spark.storage.memory.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:545)
	at org.apache.spark.storage.memory.MemoryStore$$anonfun$evictBlocksToFreeSpace$2.apply(MemoryStore.scala:539)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.storage.memory.MemoryStore.evictBlocksToFreeSpace(MemoryStore.scala:539)
	at org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:92)
	at org.apache.spark.memory.StorageMemoryPool.acquireMemory(StorageMemoryPool.scala:73)
	at org.apache.spark.memory.UnifiedMemoryManager.acquireStorageMemory(UnifiedMemoryManager.scala:179)
	at org.apache.spark.memory.UnifiedMemoryManager.acquireUnrollMemory(UnifiedMemoryManager.scala:186)
	at org.apache.spark.storage.memory.MemoryStore.reserveUnrollMemoryForThisTask(MemoryStore.scala:582)
	at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:223)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1038)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1029)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:969)
	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1029)
	at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:792)
	at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1350)
	at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:122)
	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:88)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:56)
	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1488)
	at org.apache.spark.api.java.JavaSparkContext.broadcast(JavaSparkContext.scala:650)
	at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.processAssemblyRegions(HaplotypeCallerSpark.java:191)
	at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.callVariantsWithHaplotypeCallerAndWriteOutput(HaplotypeCallerSpark.java:316)
	at org.broadinstitute.hellbender.tools.spark.pipelines.ReadsPipelineSpark.runTool(ReadsPipelineSpark.java:224)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:528)
	at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:30)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
	at org.broadinstitute.hellbender.Main.instanceMain(Main.java:148)
	at org.broadinstitute.hellbender.Main.instanceMain(Main.java:189)
	at org.broadinstitute.hellbender.CommandLineProgramTest.runCommandLine(CommandLineProgramTest.java:27)
	at org.broadinstitute.hellbender.tools.spark.pipelines.ReadsPipelineSparkIntegrationTest.testReadsPipelineSpark(ReadsPipelineSparkIntegrationTest.java:125)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:108)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:661)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:869)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1193)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:126)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
	at org.testng.TestRunner.privateRun(TestRunner.java:744)
	at org.testng.TestRunner.run(TestRunner.java:602)
	at org.testng.SuiteRunner.runTest(SuiteRunner.java:380)
	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:375)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:340)
	at org.testng.SuiteRunner.run(SuiteRunner.java:289)
	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1301)
	at org.testng.TestNG.runSuitesLocally(TestNG.java:1226)
	at org.testng.TestNG.runSuites(TestNG.java:1144)
	at org.testng.TestNG.run(TestNG.java:1115)
	at org.gradle.api.internal.tasks.testing.testng.TestNGTestClassProcessor.runTests(TestNGTestClassProcessor.java:129)
	at org.gradle.api.internal.tasks.testing.testng.TestNGTestClassProcessor.stop(TestNGTestClassProcessor.java:88)
	at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.stop(SuiteTestClassProcessor.java:61)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
	at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
	at com.sun.proxy.$Proxy2.stop(Unknown Source)
	at org.gradle.api.internal.tasks.testing.worker.TestWorker.stop(TestWorker.java:120)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
	at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
	at org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:377)
	at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:54)
	at org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:40)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.ConcurrentModificationException
	at java.util.Vector$Itr.checkForComodification(Vector.java:1184)
	at java.util.Vector$Itr.next(Vector.java:1137)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:92)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
	at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552)
	at com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80)
	... 120 more

@jamesemery @tomwhite I've seen this once, so it may be a super rare one that we're just hitting now, or something newly introduced. Not sure there's anything to do until we see it more often, but thought I'd record it in case it keeps coming back.

@lbergelson
Copy link
Member Author

We've seen this at least twice now. Seems like something we should investigate.

@cmnbroad
Copy link
Collaborator

cmnbroad commented Feb 26, 2019

Saw it again here (now restarted). If I'm reading the serialization stack in the right order:

    Serialization trace:
    classes (sun.misc.Launcher$AppClassLoader)
    classLoader (org.apache.hadoop.conf.Configuration)
    conf (org.apache.hadoop.hdfs.DistributedFileSystem)
    fs (hdfs.jsr203.HadoopFileSystem)
    hdfs (hdfs.jsr203.HadoopPath)
    path (htsjdk.samtools.seekablestream.SeekablePathStream)
    seekableStream (htsjdk.tribble.TribbleIndexedFeatureReader)
    featureReader (org.broadinstitute.hellbender.engine.FeatureDataSource)
    featureSources (org.broadinstitute.hellbender.engine.FeatureManager)

it looks like we're trying to serialize a ClassLoader. The FieldSerializer does appear to use a ClassLoader to load classes during serialization.

@samuelklee
Copy link
Contributor

Seems like I'm getting this in almost 50% of my builds. Master is failing because of it---will restart.

@tomwhite
Copy link
Contributor

tomwhite commented Mar 4, 2019

I haven't been able to reproduce this locally by running ReadsPipelineSparkIntegrationTest repeatedly. In fact, it looks like another test is interacting with this one, since the stack trace references HDFS paths, but this test doesn't use HDFS at all.

Another oddity: TribbleIndexedFeatureReader implies it's reading a vcf.idx file, but HaplotypeCallerSpark, where the exception occurs, is not reading any VCF files (although BQSR does earlier in the pipeline for known sites).

Also, we shouldn't be serializing FeatureDataSource objects with remote resources any more, since we use Spark --files to copy them to the worker nodes (see https://github.com/broadinstitute/gatk/blob/master/src/main/java/org/broadinstitute/hellbender/engine/spark/GATKSparkTool.java#L699). So we shouldn't be seeing FeatureDataSource trying to serializing with an HDFS path.

Has anyone seen this running locally on their machine, or only on Travis?

@cmnbroad
Copy link
Collaborator

cmnbroad commented Aug 4, 2020

I think this is happening because were trying to serialize the class loader sun.misc.Launcher$AppClassLoader), which appears to be reached through the graph by way of via https://github.com/damiencarol/jsr203-hadoop/blob/master/src/main/java/hdfs/jsr203/HadoopFileSystem.java#L82. We probably need to short circuit that with a custom serializer for one of these:

Serialization trace:
classes (sun.misc.Launcher$AppClassLoader)
classLoader (org.apache.hadoop.conf.Configuration)
conf (org.apache.hadoop.hdfs.DistributedFileSystem)
fs (hdfs.jsr203.HadoopFileSystem)
hdfs (hdfs.jsr203.HadoopPath)
path (htsjdk.samtools.seekablestream.SeekablePathStream)
seekableStream (htsjdk.tribble.TribbleIndexedFeatureReader)
featureReader (org.broadinstitute.hellbender.engine.FeatureDataSource)
featureSources (org.broadinstitute.hellbender.engine.FeatureManager)

See, for instance, dbpedia/distributed-extraction-framework#9.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants