HDDS-14699. Fix orphan snapshot versions handling when snapshot chain tableKey mapping is stale #9810
Merged
jojochuang merged 4 commits into apache:master on Feb 24, 2026
Pull request overview
This PR fixes a critical bug where active snapshots were incorrectly marked as purged when the SnapshotChainManager's snapshotIdToTableKey map became stale, leading to NullPointerException in the snapshot cache loader.
Changes:
- Fixed isSnapshotPurged() to fall back to transactionInfo when tableKey lookup returns null, treating snapshots without purge transaction info as active
- Added addSnapshotToTableKey() method to update the snapshotIdToTableKey map during Raft log replay in OMSnapshotCreateResponse#addToDBBatch
- Added debug logging throughout SnapshotChainManager to aid in diagnosing snapshot chain synchronization issues
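The fallback in the first change can be sketched as below. This is a simplified model and not the actual OmSnapshotManager code: the class PurgeCheckSketch, the plain map and set standing in for snapshotIdToTableKey and the snapshot info table, and the nullable string standing in for purge transaction info are all assumptions made for illustration. How the real method treats a non-null tableKey is also assumed here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

public class PurgeCheckSketch {

  // Behavior before the fix, as the PR describes it:
  // a missing tableKey alone is taken to mean "purged".
  static boolean isPurgedBefore(Map<UUID, String> idToTableKey,
      Set<String> snapshotInfoTable, UUID snapshotId) {
    String tableKey = idToTableKey.get(snapshotId);
    if (tableKey == null) {
      return true;
    }
    // Assumption: with a known tableKey, consult the snapshot info table.
    return !snapshotInfoTable.contains(tableKey);
  }

  // Behavior after the fix: a missing tableKey falls back to the
  // purge transaction info (hypothetical nullable String here).
  static boolean isPurgedAfter(Map<UUID, String> idToTableKey,
      Set<String> snapshotInfoTable, UUID snapshotId,
      String purgeTransactionInfo) {
    String tableKey = idToTableKey.get(snapshotId);
    if (tableKey == null) {
      // The map may simply be stale. With no purge transaction
      // recorded, the snapshot is treated as still active.
      return purgeTransactionInfo != null;
    }
    return !snapshotInfoTable.contains(tableKey);
  }

  public static void main(String[] args) {
    UUID active = UUID.randomUUID();
    Map<UUID, String> staleMap = new HashMap<>(); // entry for 'active' is missing
    Set<String> table = new HashSet<>();
    table.add("/vol1/bucket1/snap1");             // the snapshot itself still exists

    System.out.println("before fix: purged="
        + isPurgedBefore(staleMap, table, active));
    System.out.println("after fix:  purged="
        + isPurgedAfter(staleMap, table, active, null));
  }
}
```

With a stale map and no purge transaction info, the old check reports the snapshot as purged while the new one reports it as active.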
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| OmSnapshotManager.java | Fixed isSnapshotPurged() logic to handle null tableKey by falling back to transactionInfo check; added debug logging |
| SnapshotChainManager.java | Added addSnapshotToTableKey() method for Raft replay; enhanced debug logging in getTableKey(), addSnapshot(), and removeFromSnapshotIdToTable() |
| OMSnapshotCreateResponse.java | Added call to addSnapshotToTableKey() in addToDBBatch() to maintain snapshotIdToTableKey map during Raft replay |
| TestOmSnapshotLocalDataManager.java | Added regression test testCheckOrphanSnapshotVersionsWithStaleSnapshotChain() to verify fix |
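To illustrate the SnapshotChainManager and OMSnapshotCreateResponse rows, here is a minimal sketch of keeping the id-to-tableKey map in sync when a create response is applied during replay. ChainSketch, applyCreateResponse, the map-based "batch", and the choice of putIfAbsent (so applying the same entry twice is harmless) are assumptions for illustration; the real method signatures are not shown in this page.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class ReplaySketch {

  // Hypothetical stand-in for the part of SnapshotChainManager
  // that maps snapshot IDs to snapshot info table keys.
  static final class ChainSketch {
    private final Map<UUID, String> snapshotIdToTableKey =
        new ConcurrentHashMap<>();

    // Only records the mapping; does not touch the chain itself.
    void addSnapshotToTableKey(UUID snapshotId, String tableKey) {
      snapshotIdToTableKey.putIfAbsent(snapshotId, tableKey);
    }

    String getTableKey(UUID snapshotId) {
      return snapshotIdToTableKey.get(snapshotId);
    }
  }

  // Hypothetical stand-in for the addToDBBatch step of a snapshot
  // create response: write the DB entry, then update the in-memory map.
  static void applyCreateResponse(ChainSketch chain,
      Map<String, String> dbBatch, UUID snapshotId, String tableKey) {
    dbBatch.put(tableKey, snapshotId.toString());
    chain.addSnapshotToTableKey(snapshotId, tableKey);
  }

  public static void main(String[] args) {
    ChainSketch chain = new ChainSketch();
    Map<String, String> dbBatch = new HashMap<>();
    UUID id = UUID.randomUUID();

    // Applying the same replayed entry twice leaves one consistent mapping.
    applyCreateResponse(chain, dbBatch, id, "/vol1/bucket1/snap1");
    applyCreateResponse(chain, dbBatch, id, "/vol1/bucket1/snap1");

    System.out.println("tableKey after replay: " + chain.getTableKey(id));
  }
}
```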
jojochuang
reviewed
Feb 23, 2026
jojochuang
reviewed
Feb 24, 2026
jojochuang
left a comment
Apart from the debug log message, the rest is good to go.
jojochuang
approved these changes
Feb 24, 2026
Thanks @smengcl, merged.
Thanks @jojochuang for reviewing and merging this.
What changes were proposed in this pull request?
In the isSnapshotPurged() check, a null tableKey from the snapshot chain should not be the sole indicator of whether the snapshot is still active.
When isSnapshotPurged() incorrectly returns true, checkOrphanSnapshotVersions() incorrectly removes an active snapshot's YAML metadata (during OmSnapshotLocalDataManagerService runs). This in turn causes an NPE in CacheLoader when attempting to load the snapshot.
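The failure chain described above can be modeled in a few lines. This is only a toy reproduction under stated assumptions: OrphanCleanupSketch, the map standing in for the per-snapshot YAML metadata, and the loadSnapshot method standing in for the cache loader are hypothetical and do not mirror the real Ozone classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Predicate;

public class OrphanCleanupSketch {

  // Simplified stand-in for the orphan check: drop local metadata
  // for every snapshot the predicate reports as purged.
  static void checkOrphanSnapshotVersions(Map<UUID, String> localMetadata,
      Predicate<UUID> isSnapshotPurged) {
    localMetadata.keySet().removeIf(isSnapshotPurged);
  }

  // Simplified stand-in for the loader: it assumes the metadata of an
  // active snapshot is always present and dereferences it directly.
  static int loadSnapshot(Map<UUID, String> localMetadata, UUID snapshotId) {
    String yaml = localMetadata.get(snapshotId);
    return yaml.length(); // throws NullPointerException if metadata was removed
  }

  public static void main(String[] args) {
    UUID active = UUID.randomUUID();
    Map<UUID, String> localMetadata = new HashMap<>();
    localMetadata.put(active, "version: 1");

    // A purge check that wrongly returns true for an active snapshot,
    // as happens when the chain's tableKey mapping is stale.
    Predicate<UUID> wronglyPurged = id -> true;
    checkOrphanSnapshotVersions(localMetadata, wronglyPurged);

    try {
      loadSnapshot(localMetadata, active);
    } catch (NullPointerException e) {
      System.out.println("load failed: metadata of an active snapshot is gone");
    }
  }
}
```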
- Fix the isSnapshotPurged() check
- Add entries to snapshotIdToTableKey in case of Raft replay in OMSnapshotCreateResponse#addToDBBatch (their absence was a bug / an oversight previously)
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14699
How was this patch tested?
testCheckOrphanSnapshotVersionsWithStaleSnapshotChain