Wrangler to support Hudi/Iceberg datasets read/write #1470
Comments
Thanks for raising this. Two options are currently available in the library to handle CDC operations:
Delta Lake and Hudi are not on our roadmap at the moment because they lack native support in AWS Glue. That said, PRs are always welcome if you have a specific implementation in mind :)
Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed.
With the release of Glue 4.0, it appears AWS Glue now has "support for Apache Hudi, Apache Iceberg, and Delta Lake formats". Does this make it possible to implement this feature now?
@jaidisido is there any way to use
Once apache/iceberg#6564 is implemented, it might be possible to write in Iceberg format natively from Python, without any help from external processing systems like Spark, Athena, or Trino.
Without Athena, could we have a more seamless integration for Wrangler across all transactional formats, e.g.
Hi @cdelamocepsa, it is now possible to write to Iceberg using Athena since release 3.1; see the Athena Iceberg tutorial.
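For readers landing here, a minimal sketch of what an Athena-backed Iceberg write might look like with `wr.athena.to_iceberg`. The bucket layout, database, and table names are hypothetical placeholders, and the helper functions are only for illustration; check the tutorial for the parameters supported by your installed version.

```python
def build_iceberg_write_args(df, database, table, bucket):
    """Collect keyword arguments for wr.athena.to_iceberg.

    The S3 layout below is an assumption made for this sketch, not a
    convention required by the library.
    """
    return {
        "df": df,
        "database": database,
        "table": table,
        # Staging location: the DataFrame is written here first, then
        # inserted into the Iceberg table through an Athena query.
        "temp_path": f"s3://{bucket}/tmp/{table}/",
        # Where the Iceberg table data lives if the table has to be created.
        "table_location": f"s3://{bucket}/iceberg/{table}/",
    }


def write_to_iceberg(df, database, table, bucket):
    """Write a pandas DataFrame to an Iceberg table via Athena."""
    # Imported lazily so the argument helper works without AWS dependencies.
    import awswrangler as wr

    wr.athena.to_iceberg(**build_iceberg_write_args(df, database, table, bucket))
```

Calling `write_to_iceberg(df, "my_db", "events", "my-bucket")` needs AWS credentials plus an existing Glue database, and each call runs at least one Athena query, so it is probably better suited to batch writes than to frequent small inserts.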
@kukushking I see that you need to specify a [...]. I'm concerned about the efficiency of this. Do you have any input on how it will behave in terms of latency and cost?
Is your idea related to a problem? Please describe.
No
Describe the solution you'd like
It would be good to have support for CDC data lake formats like Apache Hudi, Apache Iceberg, or Delta Lake.