vcf2adam -print_metrics throws IllegalStateException on Spark 1.5.2 or later #902

Closed
heuermh opened this Issue Dec 22, 2015 · 5 comments

heuermh (Member) commented Dec 22, 2015

vcf2adam -print_metrics fails for me on Spark version 1.5.2:

$ ./bin/adam-submit \
  --driver-memory 2G --num-executors 2 --executor-cores 1 --executor-memory 4G \
  -- \
  vcf2adam -only_variants -print_metrics dbsnp_138.hg19.vcf dbsnp_138.hg19.adam
Using ADAM_MAIN=org.bdgenomics.adam.cli.ADAMMain
Using SPARK_SUBMIT=/usr/local/bin/spark-submit
2015-12-22 17:48:41 ERROR Utils:96 - uncaught error in thread SparkListenerBus, stopping SparkContext
java.lang.AbstractMethodError
    at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:62)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)
    at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Command body threw exception:
java.lang.IllegalStateException: SparkContext has been shutdown
Exception in thread "main" java.lang.IllegalStateException: SparkContext has been shutdown
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1816)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1914)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1055)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:938)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.InstrumentedPairRDDFunctions.saveAsNewAPIHadoopFile(InstrumentedPairRDDFunctions.scala:487)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions$$anonfun$adamParquetSave$1.apply$mcV$sp(ADAMRDDFunctions.scala:75)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions$$anonfun$adamParquetSave$1.apply(ADAMRDDFunctions.scala:60)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions$$anonfun$adamParquetSave$1.apply(ADAMRDDFunctions.scala:60)
    at org.apache.spark.rdd.Timer.time(Timer.scala:52)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions.adamParquetSave(ADAMRDDFunctions.scala:60)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions.adamParquetSave(ADAMRDDFunctions.scala:46)
    at org.bdgenomics.adam.cli.Vcf2ADAM.run(Vcf2ADAM.scala:79)
    at org.bdgenomics.utils.cli.BDGSparkCommand$class.run(BDGCommand.scala:54)
    at org.bdgenomics.adam.cli.Vcf2ADAM.run(Vcf2ADAM.scala:58)
    at org.bdgenomics.adam.cli.ADAMMain.apply(ADAMMain.scala:137)
    at org.bdgenomics.adam.cli.ADAMMain$.main(ADAMMain.scala:77)
    at org.bdgenomics.adam.cli.ADAMMain.main(ADAMMain.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Works fine without -print_metrics.
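
To spell out the failure chain in the trace: the AbstractMethodError is thrown on the SparkListenerBus thread, Utils.tryOrStopSparkContext treats it as fatal and stops the SparkContext, and the Parquet save on the main thread then fails with IllegalStateException: SparkContext has been shutdown. Below is a minimal sketch of that mechanism using only core Spark APIs; the object name and error message are made up for illustration, and any fatal error escaping a listener callback behaves the same way.

import org.apache.spark.{ SparkConf, SparkContext }
import org.apache.spark.scheduler.{ SparkListener, SparkListenerStageCompleted }

object ListenerBusCrashSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("listener-crash"))

    sc.addSparkListener(new SparkListener {
      // Stands in for the AbstractMethodError raised when a listener compiled
      // against an older SparkListener receives an event it has no body for.
      // AbstractMethodError is a LinkageError, so NonFatal does not match it
      // and the listener bus lets it escape to tryOrStopSparkContext.
      override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit =
        throw new AbstractMethodError("simulated binary incompatibility")
    })

    sc.parallelize(1 to 10).count() // listener fires; the bus thread stops the context
    Thread.sleep(2000)              // the listener bus is asynchronous
    sc.parallelize(1 to 10).count() // IllegalStateException: SparkContext has been shutdown
  }
}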

Environment

$ java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

$ spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/

Type --help for more information.

$ ./bin/adam-submit --version
Using ADAM_MAIN=org.bdgenomics.adam.cli.ADAMMain
Using SPARK_SUBMIT=/usr/local/bin/spark-submit

       e         888~-_          e             e    e
      d8b        888   \        d8b           d8b  d8b
     /Y88b       888    |      /Y88b         d888bdY88b
    /  Y88b      888    |     /  Y88b       / Y88Y Y888b
   /____Y88b     888   /     /____Y88b     /   YY   Y888b
  /      Y88b    888_-~     /      Y88b   /          Y888b

ADAM version: 0.18.3-SNAPSHOT
Commit: 94e92dd7b59fb3b7e3bad6b9c35ff37f7f181ade Build: 2015-12-14
Built for: Scala 2.10 and Hadoop 2.2.0
heuermh (Member) commented Jan 7, 2016

Same with newly released Spark 1.6.0 and Hadoop 2.6.0.

heuermh changed the title from vcf2adam -print_metrics throws IllegalStateException on Spark 1.5.2 to vcf2adam -print_metrics throws IllegalStateException on Spark 1.5.2 or later on Jan 7, 2016

fnothaft (Member) commented Jan 7, 2016

@heuermh this is a problem upstream in bdg-utils. I think they just changed the method signature of getCallSite or something of that sort. I'll work on a fix upstream and cut a new utils release.

fnothaft added this to the 0.19.0 milestone on Jan 7, 2016

heuermh (Member) commented Jan 21, 2016

For the record, I created a local branch where I updated the Spark and Hadoop dependency versions in utils and then used the utils SNAPSHOT dependency in ADAM. It no longer fails with the same exception; it just does not print any metrics to stdout.

heuermh (Member) commented Feb 16, 2016

The cause of the original exception above is a new method onBlockUpdated(blockUpdated: SparkListenerBlockUpdated) added to SparkListener prior to the 1.5.2 release; see

https://github.com/heuermh/spark/blame/v1.5.2/core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala#L221

Our MetricsListener extends this class.

There were also recent changes to Utils.getCallSite, but I don't yet see how they might break what we've been doing:

heuermh/spark@1fd6ed9
heuermh/spark@d7fc69a
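
If that diagnosis is right, the shape of a fix looks roughly like the sketch below (a hypothetical class body, not the actual bdg-utils source): once the listener is compiled against Spark 1.5.2+, or explicitly overrides the new callback, dispatch from SparkListenerBus resolves to a concrete method instead of throwing AbstractMethodError.

import org.apache.spark.scheduler.{ SparkListener, SparkListenerBlockUpdated, SparkListenerStageCompleted }

class MetricsListenerSketch extends SparkListener {
  // The callback added to SparkListener before the 1.5.2 release. A listener
  // compiled against an older Spark has no body for it, so the listener bus
  // call site lands on an abstract method at runtime.
  override def onBlockUpdated(blockUpdated: SparkListenerBlockUpdated): Unit = ()

  override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
    // ... record stage metrics here, as the real MetricsListener does ...
  }
}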

heuermh (Member) commented Apr 7, 2016

Closed by #961

heuermh closed this on Apr 7, 2016
