
[SUPPORT] Encountered java.lang.ClassNotFoundException when using hudi through spark-sql #11655

Closed
wardlican opened this issue Jul 19, 2024 · 1 comment

Comments


wardlican commented Jul 19, 2024

Tips before filing an issue

  • Have you gone through our FAQs?

  • Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.

  • If you have triaged this as a bug, then file an issue directly.

Describe the problem you faced

java.lang.ClassNotFoundException when using hudi through spark-sql: org.apache.spark.sql.avro.HoodieAvroSerializer


To Reproduce

Steps to reproduce the behavior:

  1. put hudi-spark3.4-bundle_2.12-0.14.1.jar to spark/jars/
  2. bin/spark-sql --master yarn \
       --deploy-mode client \
       --num-executors 2 \
       --executor-memory 1g \
       --executor-cores 2 \
       --jars /opt/spark/jars/hudi-spark3.4-bundle_2.12-0.14.1.jar \
       --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
       --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
       --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
       --conf 'spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar'
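
One thing worth noting about the steps above: the bundle is copied into spark/jars/ in step 1 and then passed again via --jars in step 2, so two copies of the Hudi classes can end up visible to different classloaders, which is a common culprit behind NoClassDefFoundError at extension-loading time. A quick sanity check (a sketch, not from the issue; JARS_DIR is a demo directory standing in for $SPARK_HOME/jars) is to count how many Hudi bundle jars Spark will actually pick up:

```shell
# Demo directory standing in for $SPARK_HOME/jars (assumed layout).
JARS_DIR="demo_spark_jars"
mkdir -p "$JARS_DIR"

# Demo only: simulate a stray second bundle sitting next to the intended one.
touch "$JARS_DIR/hudi-spark3.4-bundle_2.12-0.14.1.jar"
touch "$JARS_DIR/hudi-spark3.3-bundle_2.12-0.14.0.jar"

# Count Hudi bundle jars on the driver classpath; anything other than
# exactly one matching the Spark version in use deserves a closer look.
HUDI_COUNT=$(ls "$JARS_DIR" | grep -c '^hudi-.*bundle')
echo "hudi bundles on classpath: $HUDI_COUNT"
```

If the count is more than one (or the bundle's Spark version does not match the installed Spark), removing the duplicates so the bundle is supplied exactly once is a reasonable first step.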

Expected behavior


Environment Description

  • Hudi version: 0.14.1

  • Spark version: 3.4.2

  • Hive version: 3.1.3

  • Hadoop version :

  • Storage (HDFS/S3/GCS..) :

  • Running on Docker? (yes/no) :

Additional context


Stacktrace

2024-07-19 15:44:50,801 [WARN] [main] Cannot use org.apache.spark.sql.hudi.HoodieSparkSessionExtension to configure session extensions. (org.apache.spark.sql.SparkSession(org.apache.spark.internal.Logging.logWarning:93))
java.lang.NoClassDefFoundError: org/apache/spark/sql/avro/HoodieAvroSerializer
        at java.lang.Class.getDeclaredConstructors0(Native Method) ~[?:?]
        at java.lang.Class.privateGetDeclaredConstructors(Class.java:3137) ~[?:?]
        at java.lang.Class.getConstructor0(Class.java:3342) ~[?:?]
        at java.lang.Class.newInstance(Class.java:556) ~[?:?]
        at org.apache.hudi.SparkAdapterSupport$.sparkAdapter$lzycompute(SparkAdapterSupport.scala:49) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.hudi.SparkAdapterSupport$.sparkAdapter(SparkAdapterSupport.scala:35) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.hudi.SparkAdapterSupport.sparkAdapter(SparkAdapterSupport.scala:29) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.hudi.SparkAdapterSupport.sparkAdapter$(SparkAdapterSupport.scala:29) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.spark.sql.hudi.HoodieSparkSessionExtension.sparkAdapter$lzycompute(HoodieSparkSessionExtension.scala:28) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.spark.sql.hudi.HoodieSparkSessionExtension.sparkAdapter(HoodieSparkSessionExtension.scala:28) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.spark.sql.hudi.HoodieSparkSessionExtension.apply(HoodieSparkSessionExtension.scala:54) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.spark.sql.hudi.HoodieSparkSessionExtension.apply(HoodieSparkSessionExtension.scala:28) ~[hudi-spark3.4-bundle_2.12-0.14.1.jar:0.14.1]
        at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1(SparkSession.scala:1297) ~[spark-sql_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1$adapted(SparkSession.scala:1292) ~[spark-sql_2.12-3.4.2.jar:3.4.2]
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) ~[scala-library-2.12.17.jar:?]
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) ~[scala-library-2.12.17.jar:?]
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) ~[scala-library-2.12.17.jar:?]
        at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$applyExtensions(SparkSession.scala:1292) ~[spark-sql_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1033) ~[spark-sql_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:64) ~[spark-hive-thriftserver_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:358) ~[spark-hive-thriftserver_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:174) ~[spark-hive-thriftserver_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) ~[spark-hive-thriftserver_2.12-3.4.2.jar:3.4.2]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) ~[spark-core_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020) ~[spark-core_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192) ~[spark-core_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215) ~[spark-core_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91) ~[spark-core_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111) ~[spark-core_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120) ~[spark-core_2.12-3.4.2.jar:3.4.2]
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) ~[spark-core_2.12-3.4.2.jar:3.4.2]
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.avro.HoodieAvroSerializer
        at jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581) ~[?:?]
        at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) ~[?:?]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:522) ~[?:?]
        ... 35 more
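
The root cause in the trace is the final ClassNotFoundException line. A small triage sketch (not from the issue) pulls the missing class name out of that line, so it can then be grepped for inside the bundle jar; the log line below is copied verbatim from the stacktrace above:

```shell
# Log line copied from the stacktrace above.
LOG='Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.avro.HoodieAvroSerializer'

# Extract the fully qualified class name after "ClassNotFoundException: ".
MISSING_CLASS=$(printf '%s\n' "$LOG" \
  | sed -n 's/.*ClassNotFoundException: \([^ ]*\).*/\1/p')
echo "$MISSING_CLASS"
```

With the class name in hand, one could then check whether the bundle actually ships it, e.g. `unzip -l hudi-spark3.4-bundle_2.12-0.14.1.jar | grep 'org/apache/spark/sql/avro/HoodieAvroSerializer'` (path form of the class name). If the class is present in the jar, the failure points at a classpath conflict rather than a missing dependency.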
ad1happy2go (Collaborator) commented

@wardlican Can you let us know what you are trying to do? This 0.14.1 Spark 3.4 bundle is officially certified, and we have not seen any such error. Also, please provide your environment details.
