Datasource Writer throws error on resolving struct fields #1034
Comments
Hi @alphairys, following your description, I cannot reproduce the error. My steps are below.
@lamber-ken, thanks for looking into this. It turns out the Hudi package bundled with EMR is the source of this. I was following the instructions here: https://aws.amazon.com/blogs/aws/new-insert-update-delete-data-on-s3-with-amazon-emr-and-apache-hudi/ Launching spark-shell with the latest package as you did (with
@alphairys - Would you mind sending an email to emr-hudi@amazon.com or rbhartia@amazon.com with the steps to reproduce this problem on EMR?
@alphairys - We were able to reproduce the problem and can see that it is caused by the struct type in the schema. Most likely it is a consequence of EMR using Spark 2.4 with the spark-avro library. We will update this thread once we have a fix. Thank you.
@rbhartia, thanks for the update. Will be on the lookout for the fix.
The fix has landed and been verified!
Issue
I have a DataFrame with the following schema that I would like to write out as a Hudi table.
Schema
Write out to Hudi table:
Running this, I get the following error:
Full Error Log