Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Fix for deleting files when commiting transactions with multiple branches #6634

Merged
merged 1 commit into from Jan 23, 2023

Conversation

amogh-jahagirdar
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar commented Jan 20, 2023

Fix for #6632

The way intermediate snapshots in transactions are tracked is incorrect in the presence of branching.

Moving to draft as I read more of the code to make sure this handles different failure cases properly and will add tests if determined this is the right way to fix this issue.

@amogh-jahagirdar amogh-jahagirdar force-pushed the branch-txns-fix branch 2 times, most recently from e383f32 to 870de44 Compare January 23, 2023 05:19
@github-actions github-actions bot removed the API label Jan 23, 2023
@amogh-jahagirdar amogh-jahagirdar changed the title Core, API: Fix for tracking intermediate snapshots when a transaction spans multiple branches Core: Fix for tracking intermediate snapshots when a transaction spans multiple branches Jan 23, 2023
@amogh-jahagirdar amogh-jahagirdar changed the title Core: Fix for tracking intermediate snapshots when a transaction spans multiple branches Core: Fix for deleting files when commiting transactions with multiple branches Jan 23, 2023
@rdblue rdblue marked this pull request as ready for review January 23, 2023 17:56
@rdblue
Copy link
Contributor

rdblue commented Jan 23, 2023

@amogh-jahagirdar, this looks great! I think it's about ready to merge but there are a couple of naming nits.

[Priority 2] Spec: Snapshot tagging and branching automation moved this from In progress to Reviewer approved Jan 23, 2023
@amogh-jahagirdar
Copy link
Contributor Author

Thanks @rdblue for the review! since the focus of this PR was properly determining the set of committed files when committing a transaction I think we should be good. Regardless I think we probably want some more coverage on transactions spanning multiple branches, I can raise that in a separate PR?

Copy link
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, thanks for the fix!

@rdblue rdblue merged commit 550ce61 into apache:master Jan 23, 2023
[Priority 2] Spec: Snapshot tagging and branching automation moved this from Reviewer approved to Done Jan 23, 2023
@rdblue
Copy link
Contributor

rdblue commented Jan 23, 2023

Thanks, @amogh-jahagirdar!

krvikash pushed a commit to krvikash/iceberg that referenced this pull request Mar 16, 2023
bartash added a commit to bartash/iceberg that referenced this pull request Nov 29, 2023
When a snapshot is expired as part of a transaction, the snapshot
file(s) should be deleted when the transaction commits. A recent change
(apache#6634) ensured that files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this be not skipping
deletion when the list of committed files is empty.

TESTING:

Extended a unit test to ensure that snapshot files are deleted.
Ran the test without the fix on a branch where apache#6634 was reverted
to show that this is a regression.
bartash added a commit to bartash/iceberg that referenced this pull request Nov 30, 2023
When a snapshot is expired as part of a transaction, the manifest list
file(s) should be deleted when the transaction commits. A recent change
(apache#6634) ensured that these files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this by not skipping
deletion when the list of committed files is empty.

TESTING:

Extended a unit test to ensure that manifest list files are deleted.
Ran the test without the fix on a branch where apache#6634 was reverted
to show that this is a regression.
bartash added a commit to bartash/iceberg that referenced this pull request Dec 4, 2023
When a snapshot is expired as part of a transaction, the manifest list
file(s) should be deleted when the transaction commits. A recent change
(apache#6634) ensured that these files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this by not skipping
deletion when the list of committed files is empty.

TESTING:

Extended a unit test to ensure that manifest list files are deleted.
Ran the test without the fix on a branch where apache#6634 was reverted
to show that this is a regression.
amogh-jahagirdar pushed a commit that referenced this pull request Dec 4, 2023
When a snapshot is expired as part of a transaction, the manifest list
file(s) should be deleted when the transaction commits. A recent change
(#6634) ensured that these files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this by not skipping
deletion when the list of committed files is empty.

TESTING:

Extended a unit test to ensure that manifest list files are deleted.
Ran the test without the fix on a branch where #6634 was reverted
to show that this is a regression.
bartash added a commit to bartash/iceberg that referenced this pull request Dec 5, 2023
…che#9183)

When a snapshot is expired as part of a transaction, the manifest list
file(s) should be deleted when the transaction commits. A recent change
(apache#6634) ensured that these files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this by not skipping
deletion when the list of committed files is empty.

TESTING:

Extended a unit test to ensure that manifest list files are deleted.
Ran the test without the fix on a branch where apache#6634 was reverted
to show that this is a regression.
amogh-jahagirdar added a commit that referenced this pull request Dec 19, 2023
…ed (#9223)

* Core: Expired Snapshot files in a transaction should be deleted. (#9183)

When a snapshot is expired as part of a transaction, the manifest list
file(s) should be deleted when the transaction commits. A recent change
(#6634) ensured that these files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this by not skipping
deletion when the list of committed files is empty.

* Core: Fix logic for determining set of committed files in BaseTransaction when there are no new snapshots (#9221)

---------

Co-authored-by: Amogh Jahagirdar <amogh@tabular.io>
lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024
…che#9183)

When a snapshot is expired as part of a transaction, the manifest list
file(s) should be deleted when the transaction commits. A recent change
(apache#6634) ensured that these files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this by not skipping
deletion when the list of committed files is empty.

TESTING:

Extended a unit test to ensure that manifest list files are deleted.
Ran the test without the fix on a branch where apache#6634 was reverted
to show that this is a regression.
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024
…che#9183)

When a snapshot is expired as part of a transaction, the manifest list
file(s) should be deleted when the transaction commits. A recent change
(apache#6634) ensured that these files are not deleted when they have also
been committed as part of a transaction, but this breaks the simple
case where no new files are committed. Fix this by not skipping
deletion when the list of committed files is empty.

TESTING:

Extended a unit test to ensure that manifest list files are deleted.
Ran the test without the fix on a branch where apache#6634 was reverted
to show that this is a regression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet

3 participants