Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delta lake S3 exclusive write reconciliation #23145

Merged
merged 5 commits into from
Sep 4, 2024

Conversation

wendigo
Copy link
Contributor

@wendigo wendigo commented Aug 27, 2024

https://docs.aws.amazon.com/AmazonS3/latest/userguide/conditional-requests.html#conditional-writes

This allows for the faster, concurrent write reconciliation without the custom locking mechanism
that was implemented for S3 when it lacked exclusive write support. Since this is now supported,
we should use faster path if possible.

Enabled in tests for S3 and Minio tests. Doesn't work in s3mock and local-stack (support added 29/08/2024).

Release notes: Add support for lock-less Delta Lake write reconciliation on S3

@cla-bot cla-bot bot added the cla-signed label Aug 27, 2024
@wendigo wendigo requested review from electrum and anusudarsan and removed request for electrum August 27, 2024 13:42
@github-actions github-actions bot added the delta-lake Delta Lake connector label Aug 27, 2024
@github-actions github-actions bot added the hive Hive connector label Aug 27, 2024
@anusudarsan anusudarsan self-requested a review August 27, 2024 15:42
@wendigo wendigo changed the title Add support for exclusive writes on S3 Delta lake S3 exclusive write reconciliation Aug 27, 2024
@findinpath
Copy link
Contributor

The PR is quite important for Delta Lake users on top of AWS S3. Pls add a more lengthy description of what it does and add also release notes to it. 🙏

Maybe we should point out that the hypothesis 1. from delta-io/delta#39 has fallen now.
cc @tdas

@wendigo
Copy link
Contributor Author

wendigo commented Aug 27, 2024

@electrum should we enable S3 exclusive write support by default if it's supported by both Minio and AWS?

@wendigo
Copy link
Contributor Author

wendigo commented Aug 27, 2024

Seems to be working :)

@losipiuk
Copy link
Member

Fix comment typo LGTM ;)

Copy link
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you extract renames to separate commit.
I cannot see what changes are made to test file

@wendigo wendigo force-pushed the serafin/s3-exclusive-write branch 2 times, most recently from d9d7d3e to 3ef770b Compare September 2, 2024 11:45
@wendigo
Copy link
Contributor Author

wendigo commented Sep 2, 2024

@losipiuk PTAL

This allows for the faster, concurrent write reconciliation without the custom locking mechanism
that was implemented for S3 when it lacked exclusive write support. Since this is now supported,
we should use faster path if possible.
Copy link
Contributor

@findinpath findinpath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great.

Very much looking forward to this addition.

Please give the news on #delta-lake Slack channel :)

@wendigo
Copy link
Contributor Author

wendigo commented Sep 4, 2024

@findinpath I'll remove TODOs before merging this change

@wendigo wendigo merged commit e326419 into master Sep 4, 2024
3 of 12 checks passed
@wendigo wendigo deleted the serafin/s3-exclusive-write branch September 4, 2024 13:00
@github-actions github-actions bot added this to the 456 milestone Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed delta-lake Delta Lake connector hive Hive connector
Development

Successfully merging this pull request may close these issues.

6 participants