Open
Labels: bug (Something isn't working)
Description
Search before asking
- I searched in the issues and found nothing similar.
Paimon version
master branch (c20198b)
Compute Engine
Flink (ExpireSnapshotsImpl), but the Java API is also affected
Minimal reproduce step
- The table has snap1, snap2, and snap3, all of which are due to be expired
- Trigger ExpireSnapshotsImpl to expire snap1, snap2, and snap3
- During expiration (after the data files are deleted, before the manifest files are deleted), start a new read job from snap1
- The read job sees that snap1 exists but fails with FileNotFoundException when accessing its data files
What doesn't meet your expectations?
Current deletion order in ExpireSnapshotsImpl:
- Delete data files (all snapshots)
- Delete changelog files
- Delete manifest files
- Delete snapshot files (last)
This creates a window in which the snapshot file still exists but its data files are already gone.
The existing protection (consumer-id) only covers consumers that are already running, not new readers started during expiration.
Expected: a reader should either read the snapshot successfully or not see the snapshot at all.
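The window can be illustrated with a toy in-memory model. This is only a sketch: ExpireOrderSketch, ReadResult, tryRead, and the "phase/snapshot" file naming are hypothetical and do not correspond to Paimon's real classes or file layout. It applies the four deletion phases in the order listed above and records what a newly started reader of snap1 would observe after each phase.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: names and file layout are hypothetical, not Paimon's API.
class ExpireOrderSketch {

    // What a newly started reader observes for one snapshot.
    enum ReadResult { OK, SNAPSHOT_NOT_FOUND, FILE_NOT_FOUND }

    // Deletion order described in this issue.
    static final List<String> PHASES = List.of("data", "changelog", "manifest", "snapshot");

    // A reader first resolves the snapshot file, then its manifest and data files.
    static ReadResult tryRead(Set<String> files, String snap) {
        if (!files.contains("snapshot/" + snap)) {
            return ReadResult.SNAPSHOT_NOT_FOUND;
        }
        if (!files.contains("manifest/" + snap) || !files.contains("data/" + snap)) {
            return ReadResult.FILE_NOT_FOUND;
        }
        return ReadResult.OK;
    }

    // Runs each phase over all snapshots, probing with a fresh reader after every phase.
    static List<ReadResult> expire(List<String> snaps, String probe) {
        Set<String> files = new LinkedHashSet<>();
        for (String s : snaps) {
            for (String p : PHASES) {
                files.add(p + "/" + s);
            }
        }
        List<ReadResult> observed = new ArrayList<>();
        for (String phase : PHASES) {
            for (String s : snaps) {
                files.remove(phase + "/" + s);
            }
            observed.add(tryRead(files, probe));
        }
        return observed;
    }

    public static void main(String[] args) {
        // prints [FILE_NOT_FOUND, FILE_NOT_FOUND, FILE_NOT_FOUND, SNAPSHOT_NOT_FOUND]
        System.out.println(expire(List.of("snap1", "snap2", "snap3"), "snap1"));
    }
}
```

In this model, every probe taken before the final phase finds the snapshot file but fails on a missing file, which corresponds to the FileNotFoundException in the reproduce steps.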
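One ordering that would satisfy this expectation for new readers is to remove the snapshot files first and the data files last. The sketch below is an assumption for illustration, not a description of how a fix in Paimon would or should be implemented; SnapshotFirstSketch, ReadResult, and the "phase/snapshot" file naming are hypothetical. It probes with a fresh reader before expiration and after each phase.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: names and file layout are hypothetical, not Paimon's API.
class SnapshotFirstSketch {

    enum ReadResult { OK, SNAPSHOT_NOT_FOUND, FILE_NOT_FOUND }

    // Assumed alternative order: snapshot files go first, data files last.
    static final List<String> PHASES = List.of("snapshot", "manifest", "changelog", "data");

    // A reader first resolves the snapshot file, then its manifest and data files.
    static ReadResult tryRead(Set<String> files, String snap) {
        if (!files.contains("snapshot/" + snap)) {
            return ReadResult.SNAPSHOT_NOT_FOUND;
        }
        if (!files.contains("manifest/" + snap) || !files.contains("data/" + snap)) {
            return ReadResult.FILE_NOT_FOUND;
        }
        return ReadResult.OK;
    }

    // Probes once before expiration, then once after each deletion phase.
    static List<ReadResult> expire(List<String> snaps, String probe) {
        Set<String> files = new LinkedHashSet<>();
        for (String s : snaps) {
            for (String p : PHASES) {
                files.add(p + "/" + s);
            }
        }
        List<ReadResult> observed = new ArrayList<>();
        observed.add(tryRead(files, probe));
        for (String phase : PHASES) {
            for (String s : snaps) {
                files.remove(phase + "/" + s);
            }
            observed.add(tryRead(files, probe));
        }
        return observed;
    }

    public static void main(String[] args) {
        // A new reader sees OK before expiration and SNAPSHOT_NOT_FOUND afterwards,
        // never FILE_NOT_FOUND.
        System.out.println(expire(List.of("snap1", "snap2", "snap3"), "snap1"));
    }
}
```

This only covers readers that start after a phase completes. A reader that resolved the snapshot file just before it was removed could still hit missing files later, so this ordering alone does not replace protection for in-flight reads.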
Anything else?
However, the probability of this issue occurring is LOW because:
- Most new jobs start reading from the latest snapshot, not earliest
- In most cases, the race window (data files deleted but snapshot file exists) is short, unless the table has a large number of data files to delete
- Starting a new job reading from earliest exactly during expiration is a rare scenario
Suggested priority: Low. This is more of a theoretical edge case than a practical problem.
Are you willing to submit a PR?
- I'm willing to submit a PR!