Support DynamoDB catalog in Iceberg #12173
Do we have / want to have some compatibility tests with Spark?
I guess it's hard to test with dockerized DynamoDB at this time. Sent PR apache/iceberg#4726 to the Iceberg repository.
@ebyhr looking forward to this, thanks for the changes! I gave this branch a try, and it might be missing
With this small addition, I was able to query an Iceberg DynamoDB catalog, although I hit another issue when trying to actually query some data from a table created from Spark. This might be unrelated/off-topic so feel free to ignore. After selecting a catalog and a schema,
The coordinator logs do indicate that
This doesn't occur for tables created from Trino: edit: actually, my Spark-written tables are
@peay Thanks for letting us know. I guess the failure comes from another dependency change. I will resume work and confirm the issue after the Iceberg community releases the next version.
Now that Iceberg has officially released version 0.14 with support for overriding the DynamoDB endpoint, is there any plan to promote this PR to ready soon?
@ebyhr I gave this PR another try, and I still can't read tables created by Spark. I assumed above that it was an Iceberg format version issue, but after testing that hypothesis, that doesn't seem to be the case. One thing I've observed is that when creating or writing to a table from Trino with the DynamoDB catalog, two JSON table metadata files are written:
Both metadata files are written at the same timestamp when creating or committing to the table. However, this does not occur when writing with Spark 3.2.1 + Iceberg 0.13.1 and the DynamoDB catalog: a single table metadata file is written every time, and it does not have
Closing as I have no bandwidth to continue this PR.
Description
Support DynamoDB catalog in Iceberg
Fixes #9953
Documentation
(x) Sufficient documentation is included in this PR.
Release notes
(x) Release notes entries required with the following suggested text: