I want to use the BigQuery connector in an HDInsight cluster with Spark 2.1.0 (Hortonworks Data Platform 2.6). If I run my job locally it works fine, but if I deploy it to the cluster (via Livy, but this shouldn't matter here) I get:
ERROR ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: com.google.common.base.Splitter.splitToList(Ljava/lang/CharSequence;)Ljava/util/List;
java.lang.NoSuchMethodError: com.google.common.base.Splitter.splitToList(Ljava/lang/CharSequence;)Ljava/util/List;
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase$ParentTimestampUpdateIncludePredicate.create(GoogleHadoopFileSystemBase.java:789)
...
I have had similar issues in the past with Spark and third-party libraries (especially libraries that use Google Guava). As far as I know, the best solution is usually to explicitly shade the conflicting libraries. I use Maven as my build tool, so I use the Maven Shade plugin to shade the conflicting libraries:
My next try was to add an explicit Guava relocation via the Maven plugin:
But I got the same error message. I also tried setting spark.{driver,executor}.userClassPathFirst=true, also without success. Then I got something like:
Caused by: java.lang.RuntimeException: java.lang.ClassCastException: cannot assign instance of scala.concurrent.duration.FiniteDuration to field org.apache.spark.rpc.RpcTimeout.duration of type scala.concurrent.duration.FiniteDuration in instance of org.apache.spark.rpc.RpcTimeout
at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
...
which is not thrown in the Google BigQuery connector. Is this a different error, meaning it has nothing to do with the library conflict above? Or does it also originate from the library issue?
I'm out of ideas. Does anybody have an idea what could be going wrong here or what I am missing? I'd appreciate any tips. Thank you very much.
I solved the problem by simply using the Maven Shade plugin the right way. My XML was in the wrong format. Maven doesn't complain about that, but I read the Shade plugin documentation again and noticed how to use it correctly.
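For anyone hitting the same NoSuchMethodError: the corrected configuration is not shown in this thread, so the following is only a minimal sketch of what a Guava relocation with the Maven Shade plugin usually looks like. The plugin version and the `shaded.` prefix are assumptions chosen for illustration, not values taken from the original project.

```xml
<!-- Sketch only: goes inside <build><plugins> of the pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Assumed version; use whichever release your build already pins. -->
  <version>3.1.0</version>
  <executions>
    <execution>
      <!-- Run the shade goal when the jar is packaged. -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite Guava's packages (and all references to them in the
               bundled classes) so they cannot clash with the older Guava
               already on the cluster's classpath. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <!-- Hypothetical prefix; any unused package name works. -->
            <shadedPattern>shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Each `<relocation>` has to sit inside `<relocations>`, which in turn sits inside the plugin's `<configuration>`. Maven tends to ignore configuration elements it does not recognize rather than fail the build, which would be consistent with the observation above that a wrongly structured block produces no complaint. Listing the contents of the built jar and checking that the Guava classes appear under the relocated package is a quick way to confirm the relocation actually ran.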