[WIP][FLINK-26966][heap/state] Implement incremental checkpoints #19313
Currently, handling a chained entry on copy-on-write uses object identity to find the wanted entry in the chain. However, if the same method is running concurrently, the object in the chain can be replaced by its copy; the identity condition is then never met, the end of the chain is reached, and an NPE is thrown. With some tiny changes in timing (i.e. overriding methods of CopyOnWriteStateMap), StateBackendTestBase.testValueStateRace fails when run repeatedly (~4 out of 100 runs). This change replaces object identity with key+namespace equality in the condition. The overhead should not be significant because the same check is already performed to find the element before copying.
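To illustrate the race described above, here is a minimal sketch that contrasts the two lookup conditions. The `Entry` class and both `find...` methods are hypothetical simplifications written for this example; they are not Flink's actual `CopyOnWriteStateMap` internals, and the concurrent replacement is simulated in a single thread.

```java
import java.util.Objects;

class ChainLookupSketch {
    /** Hypothetical chained entry, a stand-in for the real map entry type. */
    static final class Entry {
        final String key;
        final String namespace;
        int value;
        Entry next;

        Entry(String key, String namespace, int value, Entry next) {
            this.key = key;
            this.namespace = namespace;
            this.value = value;
            this.next = next;
        }

        /** Shallow copy that keeps the link to the rest of the chain. */
        Entry copy() {
            return new Entry(key, namespace, value, next);
        }
    }

    /** Identity-based walk: returns null once `wanted` has been replaced by a copy. */
    static Entry findByIdentity(Entry head, Entry wanted) {
        for (Entry e = head; e != null; e = e.next) {
            if (e == wanted) {
                return e;
            }
        }
        return null;
    }

    /** Equality-based walk: still finds the entry after it was replaced by a copy. */
    static Entry findByKeyAndNamespace(Entry head, String key, String namespace) {
        for (Entry e = head; e != null; e = e.next) {
            if (Objects.equals(e.key, key) && Objects.equals(e.namespace, namespace)) {
                return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Entry tail = new Entry("b", "ns", 2, null);
        Entry original = new Entry("a", "ns", 1, tail);

        // Simulate a concurrent copy-on-write that swaps the entry for its copy.
        Entry head = original.copy();

        System.out.println("identity lookup found entry: "
                + (findByIdentity(head, original) != null));
        System.out.println("key+namespace lookup found entry: "
                + (findByKeyAndNamespace(head, "a", "ns") != null));
    }
}
```

In the sketch the identity walk falls off the end of the chain, which is the situation that leads to the NPE in code that dereferences the result without a null check.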
…re write CopyOnWriteStateMap copies the entry before returning it to the client for update. This also updates its state and entry versions. However, if the entry is NOT used by any snapshot, the versions stay the same even though the state is about to be updated. With incremental checkpoints, this causes the updated state to be ignored in the next snapshot. This change bumps the state version in this case (the entry version stays the same).
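The version handling described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual Flink code: the `Entry` class, `prepareForUpdate`, `includedInIncrement`, and the rule that an incremental snapshot picks up entries whose state version is newer than the last snapshot are all hypothetical and chosen only for illustration.

```java
class VersionBumpSketch {
    /** Hypothetical entry carrying the two versions mentioned in the commit message. */
    static final class Entry {
        int stateVersion;
        int entryVersion;

        Entry(int version) {
            this.stateVersion = version;
            this.entryVersion = version;
        }
    }

    /** Returns the entry that is handed to the client for update. */
    static Entry prepareForUpdate(Entry e, int mapVersion, int highestRequiredSnapshotVersion) {
        if (e.stateVersion < highestRequiredSnapshotVersion) {
            // A snapshot may still reference this entry: hand out a copy
            // stamped with the current map version.
            return new Entry(mapVersion);
        }
        // No snapshot references the entry, so it is updated in place.
        // Bump only the state version so the pending update is not missed;
        // the entry version is left unchanged.
        e.stateVersion = mapVersion;
        return e;
    }

    /** Assumed inclusion rule for an incremental snapshot (illustrative only). */
    static boolean includedInIncrement(Entry e, int lastSnapshotVersion) {
        return e.stateVersion > lastSnapshotVersion;
    }
}
```

Without the assignment in the in-place branch, an entry written at version 1 and updated after a completed snapshot at version 1 would keep state version 1 and fail the assumed inclusion rule.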
…completion/abortion
Compaction is implemented the same way as in Changelog:
- perform a full snapshot periodically and in the background
- base all future snapshots on it
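The two steps above can be sketched as a small bookkeeping class. Everything here is an assumption made for illustration: the class name, the `fullSnapshotInterval` parameter, and the use of checkpoint ids are hypothetical, and the sketch takes the full snapshot synchronously rather than in the background.

```java
import java.util.ArrayList;
import java.util.List;

class CompactionSketch {
    // Hypothetical setting: number of increments allowed before a new full snapshot.
    private final int fullSnapshotInterval;
    private long baseCheckpointId = -1;
    private final List<Long> incrementsSinceBase = new ArrayList<>();

    CompactionSketch(int fullSnapshotInterval) {
        this.fullSnapshotInterval = fullSnapshotInterval;
    }

    /** Records a checkpoint and returns the ids of all snapshots needed to restore it. */
    List<Long> onCheckpoint(long checkpointId) {
        if (baseCheckpointId < 0 || incrementsSinceBase.size() >= fullSnapshotInterval) {
            // Full snapshot: it becomes the base for all future snapshots.
            baseCheckpointId = checkpointId;
            incrementsSinceBase.clear();
        } else {
            // Incremental snapshot on top of the current base.
            incrementsSinceBase.add(checkpointId);
        }
        List<Long> required = new ArrayList<>();
        required.add(baseCheckpointId);
        required.addAll(incrementsSinceBase);
        return required;
    }
}
```

Rebasing on a periodic full snapshot bounds the number of increments a restore has to read, at the cost of writing the full state from time to time.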
This PR is being marked as stale since it has not had any activity in the last 180 days. If you are having difficulty finding a reviewer, please reach out to the [community](https://flink.apache.org/what-is-flink/community/). If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 90 days, it will be automatically closed.
This PR has been closed since it has not had any activity in 120 days.
This PR includes all the commits from other tickets.