Issue Description:
I hope I am addressing this topic in the right place.
I am trying to connect to an Elasticsearch database using Spark, and my code snippet looks like this:
When calling df = reader.load("my_index") I get the following error:
py4j.protocol.Py4JJavaError: An error occurred while calling o45.load.
: java.lang.NoClassDefFoundError: scala/Product$class
at org.elasticsearch.spark.sql.ElasticsearchRelation.<init>(DefaultSource.scala:191)
at org.elasticsearch.spark.sql.DefaultSource.createRelation(DefaultSource.scala:93)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:188)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: scala.Product$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 20 more
The error occurs because the elasticsearch-hadoop connector is still built against Scala 2.11, while Spark/PySpark >= 3.0 uses Scala 2.12. Because of this version mismatch, I can no longer use elasticsearch-hadoop to connect to my Elasticsearch database.
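To illustrate the mismatch, here is a minimal sketch of how a connector jar's Scala version could be compared with the Spark runtime's. It assumes the usual Scala convention of encoding the binary version as a `_2.11` / `_2.12` suffix in the artifact name. The helper names and the jar file names are hypothetical examples, not taken from my setup, and the runtime value `2.12.15` is an assumed example for Spark 3.2.x.

```python
import re
from typing import Optional

# Assumption: Scala artifacts carry the Scala binary version as a suffix,
# e.g. "some-library_2.12-1.0.0.jar". Jars without it cannot be checked this way.
_SCALA_SUFFIX = re.compile(r"_(2\.\d+)(?=[-.:]|$)")


def scala_binary_version(artifact: str) -> Optional[str]:
    """Return the Scala binary version from an artifact name, or None if absent."""
    match = _SCALA_SUFFIX.search(artifact)
    return match.group(1) if match else None


def matches_runtime(artifact: str, runtime_scala: str) -> bool:
    """True only if the artifact's suffix equals the runtime's binary version.

    An artifact without a suffix is reported as not matching, because its
    Scala version cannot be verified from the name alone.
    """
    runtime_binary = ".".join(runtime_scala.split(".")[:2])  # "2.12.15" -> "2.12"
    return scala_binary_version(artifact) == runtime_binary


if __name__ == "__main__":
    # Assumed example value; in PySpark the runtime version is commonly read with
    # spark.sparkContext._jvm.scala.util.Properties.versionNumberString()
    runtime = "2.12.15"
    for jar in (
        "elasticsearch-spark-20_2.11-8.1.2.jar",  # hypothetical jar name
        "elasticsearch-spark-30_2.12-8.1.2.jar",  # hypothetical jar name
        "elasticsearch-hadoop-8.1.2.jar",         # no suffix, cannot be verified
    ):
        print(jar, scala_binary_version(jar), matches_runtime(jar, runtime))
```

A jar whose suffix differs from the runtime's binary version would be expected to fail with class-loading errors like the one above, since Scala 2.11 and 2.12 are not binary compatible.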
Downgrading to Spark < 3.0 would solve the problem, but I cannot do that because it would cause other issues.
When will elasticsearch-hadoop support Scala 2.12? Or are there other workarounds for this kind of dependency issue?
Version Info
OS: Fedora
JVM: openjdk 11.0.14.1
Hadoop/Spark: 3.2.1
ES-Hadoop: 8.1.2
Thank you!