[SUPPORT] Apache Hudi 0.14.0 with AWS Glue (to use Custom Jar ) #10358
Comments
The discussion can be found in this Slack post: https://apache-hudi.slack.com/archives/C4D716NPQ/p1702936427363959?thread_ts=1694714610.810039&cid=C4D716NPQ
@soumilshah1995 You are downloading the wrong Hudi bundle jar. Glue 4 supports Spark 3.3, so you should download the one here -
I will try this out.
I'm still encountering the same issue. I downloaded the JAR files and uploaded them to S3 as instructed in the ticket. I also used the Glue job mentioned earlier, but the error persists:
org.apache.hudi.internal.schema.HoodieSchemaException: Failed to convert struct type to avro schema: StructType(StructField(emp_id,LongType,true), StructField(employee_name,StringType,true), StructField(department,StringType,true), StructField(state,StringType,true), StructField(salary,LongType,true), StructField(age,LongType,true), StructField(bonus,LongType,true), StructField(ts,LongType,true))
Can you paste the entire stack trace you are getting now? Last time it clearly said that the Spark 3_1 adapter class was not found, because we were using the spark3 jars, which are built for Spark 3.1, while Glue has Spark 3.3.
Sure
I will be making a video for the community on using Hudi 0.14 on Glue :D
Thanks @soumilshah1995
Anytime @ad1happy2go, you guys are real champs. Just trying to learn more from you guys :D
Any pointers on how to provide these JARs when provisioning with CloudFormation templates?
I use Serverless, which makes it very easy.
@kamaagar I have been following up with AWS teams, and no one recommends bringing your own JAR. EMR 6+ versions ship with their own Hudi jar, so my suggestion is not to spend time bringing in any extra jar. I am facing the same problem myself and am checking how to fix it.
Hudi 0.14 on AWS Glue
Overview
This project aims to use Hudi 0.14 on AWS Glue, leveraging Glue 4. The default Glue setup supports Hudi but uses an older version. The objective is to use the specified Hudi version with Glue 4.
Prerequisites
Setup Instructions
Dependent JARs path:
s3://XXX/jar/hudi-spark3-bundle_2.12-0.14.0.jar,s3://XXX/jar/spark-avro_2.12-3.5.0.jar
Job parameters info:
--extra-jars: s3://jXXX/jar/hudi-spark3-bundle_2.12-0.14.0.jar,s3://jt-dev-datateam/jar/spark-avro_2.12-3.5.0.jar
--additional-python-modules: faker==11.3.0
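For anyone provisioning the job from a script or template rather than the console, here is a minimal sketch of how the two job parameters above could be assembled. The helper name build_glue_default_arguments is hypothetical (it is not part of Glue or Hudi), the XXX bucket is the placeholder from this post, and the JAR file names are copied as posted, so substitute whichever bundle matches your Glue Spark version.

```python
def build_glue_default_arguments(jar_uris, python_modules=None):
    """Hypothetical helper: build the Glue job parameter map.

    --extra-jars and --additional-python-modules both take a
    comma-separated string, so each list is joined with commas.
    """
    if not jar_uris:
        raise ValueError("at least one JAR URI is required")
    for uri in jar_uris:
        # Glue reads extra JARs from S3, so reject anything else early.
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got: {uri}")

    arguments = {"--extra-jars": ",".join(jar_uris)}
    if python_modules:
        arguments["--additional-python-modules"] = ",".join(python_modules)
    return arguments


# Placeholder bucket "XXX" and JAR names are taken from the post above.
default_arguments = build_glue_default_arguments(
    jar_uris=[
        "s3://XXX/jar/hudi-spark3-bundle_2.12-0.14.0.jar",
        "s3://XXX/jar/spark-avro_2.12-3.5.0.jar",
    ],
    python_modules=["faker==11.3.0"],
)
print(default_arguments)
```

The resulting dictionary has the key/value shape used for a Glue job's default arguments, for example the DefaultArguments argument of boto3's create_job or the DefaultArguments property of an AWS::Glue::Job resource in CloudFormation.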
Test code
Error