Caused by: java.lang.NoSuchMethodError: org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(Lorg/apache/avro/generic/GenericRecord;Ljava/lang/String;Z)Ljava/lang/Object; #2302

@rahultoall

Hi,

I am facing an issue when I try to sync my Hudi table to Hive using the Spark DataSource API.
Spark version - 2.4.7
spark-avro - spark-avro_2.11-2.4.7
hudi-spark - hudi-spark-bundle_2.11-0.6.0

I have also set the following properties in the Spark conf:
spark.serializer = org.apache.spark.serializer.KryoSerializer
spark.sql.hive.convertMetastoreParquet = false

I have also added hudi-hadoop-mr-bundle-0.6.0.jar to Hive's aux path.

The following is the snippet I used to write a DataFrame to Hudi and sync it to Hive:

import org.apache.hudi.DataSourceWriteOptions
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.spark.sql.SaveMode

df.write.format("hudi")
  // Copy-on-write table keyed by trip_id; createdDate decides which duplicate wins
  .option(DataSourceWriteOptions.TABLE_TYPE_OPT_KEY, DataSourceWriteOptions.COW_TABLE_TYPE_OPT_VAL)
  .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "trip_id")
  .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "createdDate")
  .option(HoodieWriteConfig.TABLE_NAME, "trips_hive")
  // Hive sync settings
  .option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY, "trips_hive")
  .option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY, "true")
  .option(DataSourceWriteOptions.HIVE_URL_OPT_KEY, "jdbc:hive2://<hive_ip>:10000")
  .option(DataSourceWriteOptions.HIVE_USER_OPT_KEY, "<hive_username>")
  .option(DataSourceWriteOptions.HIVE_PASS_OPT_KEY, "<hive_password>")
  // Non-partitioned table on both the Hive side and the Hudi side
  .option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY, "org.apache.hudi.hive.NonPartitionedExtractor")
  .option(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY, "org.apache.hudi.keygen.NonpartitionedKeyGenerator")
  .option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.UPSERT_OPERATION_OPT_VAL)
  .option("hoodie.upsert.shuffle.parallelism", "4")
  .mode(SaveMode.Append)
  .save()

The DataFrame that I tried to insert is:

+-------+----------+-----------+------------------+
|trip_id|route_type|destination|       createdDate|
+-------+----------+-----------+------------------+
|   1001|         B|   New York|2020-12-7 12:30:33|
|   1002|         C| New Jersey|2020-12-7 12:30:33|
|   1003|         D|Los Angeles|2020-12-7 12:30:33|
|   1004|         E|  Las Vegas|2020-12-7 12:30:33|
|   1005|         F|     Tucson|2020-12-7 12:30:33|
|   1004|         E|  Las Vegas|2020-12-7 12:30:38|
+-------+----------+-----------+------------------+

I get the following exception when I execute the above:

753927 [Executor task launch worker for task 6] ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 3.0 (TID 6)
java.lang.NoSuchMethodError: org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(Lorg/apache/avro/generic/GenericRecord;Ljava/lang/String;Z)Ljava/lang/Object;
at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:140)
at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:139)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:394)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at scala.collection.AbstractIterator.to(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1334)
at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$31.apply(RDD.scala:1409)
at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$31.apply(RDD.scala:1409)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
753951 [task-result-getter-0] WARN org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 3.0 (TID 6, localhost, executor driver): java.lang.NoSuchMethodError: org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(Lorg/apache/avro/generic/GenericRecord;Ljava/lang/String;Z)Ljava/lang/Object;
at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:140)
at org.apache.hudi.HoodieSparkSqlWriter$$anonfun$1.apply(HoodieSparkSqlWriter.scala:139)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:394)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at scala.collection.AbstractIterator.to(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1334)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1334)
at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$31.apply(RDD.scala:1409)
at org.apache.spark.rdd.RDD$$anonfun$take$1$$anonfun$31.apply(RDD.scala:1409)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

When I explored the jars, I found that both hudi-spark-bundle_2.11-0.6.0 and hudi-hadoop-mr-bundle-0.6.0.jar contain the HoodieAvroUtils class.
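
One way to narrow this down is to ask the JVM which jar HoodieAvroUtils is actually loaded from, and which getNestedFieldVal overloads that copy exposes. The descriptor in the error, (Lorg/apache/avro/generic/GenericRecord;Ljava/lang/String;Z)Ljava/lang/Object;, corresponds to getNestedFieldVal(GenericRecord, String, boolean) returning Object. The sketch below is a diagnostic only and uses nothing beyond standard Java reflection. ClasspathCheck, loadedFrom and overloads are hypothetical names, not part of Hudi, and the sketch assumes it runs with the same classpath as the failing job.

```scala
import scala.util.Try

// Hypothetical helper for illustration; not part of Hudi or Spark.
object ClasspathCheck {

  /** Jar or directory the named class was loaded from, if it can be determined. */
  def loadedFrom(className: String): Option[String] =
    for {
      cls    <- Try(Class.forName(className)).toOption
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource) // null for JDK bootstrap classes
      url    <- Option(source.getLocation)
    } yield url.toString

  /** Signatures of all public methods with the given name on the named class. */
  def overloads(className: String, methodName: String): Seq[String] =
    Try(Class.forName(className)).toOption.toSeq
      .flatMap(_.getMethods.toSeq)
      .filter(_.getName == methodName)
      .map(_.toString)

  def main(args: Array[String]): Unit = {
    val cls = "org.apache.hudi.avro.HoodieAvroUtils"
    println(s"$cls loaded from: ${loadedFrom(cls).getOrElse("<not found or unknown>")}")
    // Look for an overload taking (GenericRecord, String, boolean)
    overloads(cls, "getNestedFieldVal").foreach(println)
  }
}
```

If the reported location is not the hudi-spark-bundle jar, or no overload with a boolean third parameter is listed, that would be consistent with an older or conflicting copy of the class taking precedence on the classpath.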
