
pyspark fails to write with 5.0.0-beta1 #852

Closed
jimmyjones2 opened this issue Sep 25, 2016 · 3 comments

Comments

@jimmyjones2

Using spark-1.6.2-bin-hadoop2.6 and the corresponding elasticsearch-hadoop versions.

2.4.0 works as expected:

./bin/pyspark --driver-class-path=../elasticsearch-hadoop-2.4.0/dist/elasticsearch-hadoop-2.4.0.jar
>>> df = sqlContext.createDataFrame([('abcd','123')])
>>> df.write.format("org.elasticsearch.spark.sql").save("abc/def")
curl localhost:9200/abc/_search
{"took":4,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"abc","_type":"def","_id":"AVdhZCT7WlyV98PK0-om","_score":1.0,"_source":{"_1":"abcd","_2":"123"}}]}}

However with 5.0.0-beta1:

./bin/pyspark --driver-class-path=../elasticsearch-hadoop-5.0.0-beta1/dist/elasticsearch-hadoop-5.0.0-beta1.jar
>>> df = sqlContext.createDataFrame([('abcd','123')])
>>> df.write.format("org.elasticsearch.spark.sql").save("abc/def")
16/09/25 13:51:31 INFO Version: Elasticsearch Hadoop v5.0.0-beta1 [508575a379]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jimmy/Downloads/spark-1.6.2-bin-hadoop2.6/python/pyspark/sql/readwriter.py", line 397, in save
    self._jwrite.save(path)
  File "/home/jimmy/Downloads/spark-1.6.2-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/home/jimmy/Downloads/spark-1.6.2-bin-hadoop2.6/python/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)
  File "/home/jimmy/Downloads/spark-1.6.2-bin-hadoop2.6/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o33.save.
: java.lang.AbstractMethodError: org.elasticsearch.spark.sql.DefaultSource.createRelation(Lorg/apache/spark/sql/SQLContext;Lorg/apache/spark/sql/SaveMode;Lscala/collection/immutable/Map;Lorg/apache/spark/sql/DataFrame;)Lorg/apache/spark/sql/sources/BaseRelation;
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:745)
@jbaiera
Member

jbaiera commented Sep 28, 2016

I think this is just the backwards-compatibility break between Spark 2.0 and Spark 1.6: the ES-Hadoop 5.0.0-beta1 jar is only compatible with Spark 2.0. When using Spark 1.3-1.6, give the BWC release a try.
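As a sketch of the two launch configurations being contrasted here (the Spark 1.x compatibility artifact name is an assumption based on ES-Hadoop 5.x's separate per-Spark-version artifacts; it is not spelled out in this thread):

```shell
# Spark 1.3-1.6 (Scala 2.10): use the Spark 1.x compatibility (BWC) artifact
# (assumed artifact name; check the release downloads for the exact jar)
./bin/pyspark --driver-class-path=elasticsearch-spark-13_2.10-5.0.0-beta1.jar

# Spark 2.0: the main elasticsearch-hadoop 5.x jar targets this line
./bin/pyspark --driver-class-path=elasticsearch-hadoop-5.0.0-beta1.jar
```

The `AbstractMethodError` above is the symptom of the mismatch: Spark 2.0 changed the `createRelation` signature (`DataFrame` became `Dataset[Row]`), so a connector compiled against one line cannot satisfy the interface of the other.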

@jimmyjones2
Author

jimmyjones2 commented Sep 28, 2016

@jbaiera Sure is, works fine with Spark 2.0, or on 1.6 with the Scala 2.10 version of the jar you linked to. Thanks!

@jbaiera
Member

jbaiera commented Oct 3, 2016

@jimmyjones2 Good to hear, Cheers!
