Flink: Refresh table in ListMetadataFiles to prevent incorrect orphan file deletion#16324
Conversation
39392db to
aaab19b
Compare
Guosmilesmile
left a comment
There was a problem hiding this comment.
Thanks for the PR.
Please remove the 2.0, 1.20 changes for now - we usually cherry pick the changes to the other versions after the original PR for the main versions has been merged.
| // Verify that manifest lists from ALL 3 snapshots are present, not just the first one. | ||
| // Without table.refresh() in processElement, only snapshot 1's files would be emitted. | ||
| table.refresh(); | ||
| for (org.apache.iceberg.Snapshot snapshot : table.snapshots()) { |
| // Verify that manifest lists from ALL 3 snapshots are present, not just the first one. | ||
| // Without table.refresh() in processElement, only snapshot 1's files would be emitted. |
There was a problem hiding this comment.
Can we verify this based on numbers rather than comments?
… file deletion ListMetadataFiles loads the table once at job start and never refreshes, so it only emits manifest list and manifest paths for snapshots that existed when the job started. Any snapshot added after job start has its metadata files unprotected. DeleteOrphanFiles then incorrectly classifies those files as orphans and deletes them, causing NotFoundException in subsequent ExpireSnapshots runs when IncrementalFileCleanup tries to read them. The fix adds table.refresh() in processElement(), matching what MetadataTablePlanner already does. Closes apache#15487
aaab19b to
266f0f6
Compare
|
Thanks for the feedback! |
|
Looks good to me. @pvary If you have a minute, could you help take a look? |
|
Merged to main. @yadavay-amzn: Please create the backport PR and tell us if you need to change anything manuallly, or the following command was doing everything without issue: |
Agent-Logs-Url: https://github.com/kevinjqliu/iceberg/sessions/682ce8b4-890f-41a9-a89a-b1f2873be44c Co-authored-by: kevinjqliu <9057843+kevinjqliu@users.noreply.github.com>
Agent-Logs-Url: https://github.com/kevinjqliu/iceberg/sessions/682ce8b4-890f-41a9-a89a-b1f2873be44c Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: kevinjqliu <9057843+kevinjqliu@users.noreply.github.com>
|
Thanks @Guosmilesmile for the review and @pvary for merging! I was just about to create the backport PRs but I see that @kevinjqliu already created and merged it! I'm assuming no more action needed from me. Thanks again folks! |
Problem
Fixes #15487.
When Flink TableMaintenance runs both
ExpireSnapshotsandDeleteOrphanFiles, manifest list files of live snapshots are incorrectly deleted as orphans, causingNotFoundExceptionin subsequentExpireSnapshotsruns.Root cause
ListMetadataFilesloads the table once at operator startup (open()) and never callstable.refresh()inprocessElement(). It only emits manifest list and manifest file paths for snapshots that existed when the Flink job started.Any snapshot added after job start has its metadata files missing from the "referenced" set that
DeleteOrphanFilesuses. When those manifest lists are older thanminAge,OrphanFilesDetectorclassifies them as orphans andDeleteFilesProcessordeletes them.On the next maintenance cycle,
ExpireSnapshotstries to read those manifest lists inIncrementalFileCleanup.cleanFiles()and fails withNotFoundException.This explains why the bug:
DeleteOrphanFilesenabled (it is the one incorrectly deleting the files)ExpireSnapshotsalone (it only deletes manifest lists of snapshots it has already expired and read)Fix
Add
table.refresh()at the top ofListMetadataFiles.processElement(), matching whatMetadataTablePlanneralready does. This ensures the "referenced" set always reflects the current table state.