java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class #862

Closed
mikals0ft opened this issue Oct 7, 2016 · 8 comments

@mikals0ft
Hi,
I am using version 5.0 of this library to create DataFrames from my Elasticsearch cluster. I can create the DataFrame, but as soon as I run an action on it such as take or count, I get the following error:

java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
at org.elasticsearch.spark.rdd.AbstractEsRDDIterator.<init>(AbstractEsRDDIterator.scala:10)
at org.elasticsearch.spark.sql.ScalaEsRowRDDIterator.<init>(ScalaEsRowRDD.scala:31)
at org.elasticsearch.spark.sql.ScalaEsRowRDD.compute(ScalaEsRowRDD.scala:27)
at org.elasticsearch.spark.sql.ScalaEsRowRDD.compute(ScalaEsRowRDD.scala:20)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: scala.collection.GenTraversableOnce$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 21 more

I'm guessing this is because the latest version of the library doesn't support Scala 2.11. Can somebody please confirm that this is true? Thanks.

Note: I am using Spark 2.0.0
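
A minimal sketch of the pattern that triggers it (the index name and connection settings below are placeholders, not my real ones):

import org.apache.spark.sql.SparkSession

// Placeholder settings; the real cluster address and index are hypothetical.
val spark = SparkSession.builder()
  .appName("es-read-repro")
  .config("es.nodes", "localhost:9200")
  .getOrCreate()

// Creating the DataFrame succeeds: the schema is resolved on the driver.
val df = spark.read
  .format("org.elasticsearch.spark.sql")
  .load("myindex/mytype")

// Any action that runs tasks on the executors then fails with the
// NoClassDefFoundError above when the connector jar was built against a
// different Scala major version than the one Spark runs on.
df.count()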

@akoira

akoira commented Oct 13, 2016

I had a similar issue on Spark 2.0.1 and elasticsearch-hadoop 2.4.0. I replaced the file SPARK_HOME/jars/elasticsearch-hadoop-2.4.0.jar with elasticsearch-spark_2.11-2.4.0.jar, which I got from https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch-spark_2.11
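
If you manage dependencies with sbt instead of swapping jars in SPARK_HOME/jars, the same coordinates would look like this (a sketch, using the artifact from the link above):

libraryDependencies += "org.elasticsearch" % "elasticsearch-spark_2.11" % "2.4.0"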

@funkomatic

Just a comment to say that this issue occurs with 5.0.0-rc1 but not with 5.0.0-beta1, so there might be a regression between the two releases. On my side I had to roll back to beta1 to make reading work with Spark 2.0.0. I sincerely hope the next release will fix this.

@jbaiera
Member

jbaiera commented Oct 25, 2016

This exception almost always indicates that the ES-Hadoop jar expects an incompatible version of Scala. Scala 2.11 is not backwards compatible with 2.10, so we release two versions, one for each version of Scala. The Spark artifact names (e.g. elasticsearch-spark-20_2.11-5.0.0-rc1.jar) encode the version of Spark the artifact supports (20 means 2.0+), the version of Scala it supports (2.11 means Scala 2.11.x), and the version of ES-Hadoop (5.0.0-rc1).
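
In sbt form, with the naming scheme spelled out (same coordinates as the example name, shown as a sketch):

// Naming scheme: elasticsearch-spark-<spark>_<scala>-<es-hadoop>.jar
//   <spark>     = 20        (Spark 2.0+)
//   <scala>     = 2.11      (Scala 2.11.x)
//   <es-hadoop> = 5.0.0-rc1
libraryDependencies += "org.elasticsearch" % "elasticsearch-spark-20_2.11" % "5.0.0-rc1"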

@funkomatic

Hi, thanks for the info. I'll try this library instead of the hadoop rc1.

@qindeqiang

@michaelqiu94 My ES cluster is 5.5.1. I downloaded the newest jar to resolve the issue. Download URL: https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch-spark-20_2.11/5.5.1

@ebuildy
Contributor

ebuildy commented Feb 27, 2018

To see the Spark Scala version, run spark-shell (since spark-submit --version doesn't display it).
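
You can also ask the running shell directly; this is a standard-library call, nothing connector-specific:

// In spark-shell, prints e.g. "version 2.11.8" for a Scala 2.11 build of Spark.
scala.util.Properties.versionString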

@ghost

ghost commented May 31, 2018

I got this exception when I used elasticsearch-hadoop:2.4.5 against spark-core:2.2.0.

Since Spark 2.0 uses Scala 2.11 by default, and we would have to rebuild Spark to use Scala 2.10, I resorted to using Spark 1.x, which uses Scala 2.10 by default. That fixed the error.

@jbaiera
Member

jbaiera commented Jun 1, 2018

@karthikeyanpa90 You can use Scala 2.10 with Spark 2.X and up. You just need to use the compatibility jar that we release instead of the regular version: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/install.html#_minimalistic_binaries
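
For example, the Scala 2.10 compatibility artifact in sbt form (the version below is only an illustration; pick the release matching your cluster, per the install docs linked above):

// Version number is illustrative, not a recommendation.
libraryDependencies += "org.elasticsearch" % "elasticsearch-spark-20_2.10" % "6.2.4"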
