-
Notifications
You must be signed in to change notification settings - Fork 594
HDDS-13905. Bootstrap lock acquired in background services can lead to deadlock #9273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…o deadlock Change-Id: I0e8a540848cbaddc1e476ccb537dca18134251d1
|
@adoroszlai this is the fix for HDDS-13889 |
Great, please run |
Please wait for clean CI run in fork before opening PR. |
Change-Id: I09e75187190855c430d4d5848d313f6f3ba87a66
Ran it 10x10 times all of them have passed https://github.com/swamirishi/ozone/actions/runs/19223007061 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a deadlock issue where bootstrap lock acquisition was happening after snapshots were opened, causing background services to deadlock with the bootstrap flow. The fix ensures that all background services acquire the bootstrap lock before opening any snapshots, preventing the deadlock condition.
Key Changes
- Replaced Semaphore-based locking with ReadWriteLock in
BootstrapStateHandler.Lockto allow concurrent read access for background services while bootstrap takes an exclusive write lock - Introduced
DeletingServiceTaskQueueinAbstractKeyDeletingServicethat wraps all background tasks to automatically acquire read locks before execution - Updated
SstFilteringServiceto acquire bootstrap lock before opening snapshots instead of after
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
BootstrapStateHandler.java |
Core locking mechanism changed from Semaphore to ReadWriteLock with separate read/write lock acquisition methods |
AbstractKeyDeletingService.java |
Added DeletingServiceTaskQueue wrapper class that automatically acquires read lock before task execution |
KeyDeletingService.java |
Updated to use DeletingServiceTaskQueue and removed bootstrap lock acquisition from individual operations |
DirectoryDeletingService.java |
Updated to use DeletingServiceTaskQueue and removed manual bootstrap lock handling |
SnapshotDeletingService.java |
Updated to use DeletingServiceTaskQueue and removed manual bootstrap lock handling |
SstFilteringService.java |
Moved bootstrap lock acquisition to before snapshot opening instead of during SST file operations |
RocksDBCheckpointDiffer.java |
Updated lock acquisition calls to use new acquireReadLock() API |
DBCheckpointServlet.java |
Updated Lock implementation to return UncheckedAutoCloseable instead of self-reference |
OMDBCheckpointServlet.java |
Updated Lock to aggregate multiple locks and return composite closeable |
OMDBCheckpointServletInodeBasedXfer.java |
Updated to use new acquireWriteLock() API |
| Test files | Updated test code to use new lock API and removed flaky test annotation from previously deadlocking test |
Comments suppressed due to low confidence (3)
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:1266
- This catch block is unreachable because
acquireReadLock()doesn't actually throwInterruptedException(the underlyinglock.lock()method is not interruptible). Either switch tolockInterruptibly()in the base Lock implementation, or remove this catch block.
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java:1105
- This catch block is unreachable because
acquireReadLock()doesn't actually throwInterruptedException(the underlyinglock.lock()method is not interruptible). Either switch tolockInterruptibly()in the base Lock implementation, or remove this catch block.
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMDBCheckpointServlet.java:664
- Lock has the same name as its supertype org.apache.hadoop.ozone.lock.BootstrapStateHandler$Lock.
static class Lock extends BootstrapStateHandler.Lock {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class Lock implements AutoCloseable { | ||
| private final Semaphore semaphore = new Semaphore(1); | ||
| /** Bootstrap state handler lock implementation. Should be always acquired before opening any snapshot to avoid | ||
| * deadlocks*/ |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing space after period in the comment. Should be "Should be always acquired before opening any snapshot to avoid deadlocks."
| * deadlocks*/ | |
| * deadlocks */ |
| * is wrapped such that its execution acquires a read lock via | ||
| * {@code getBootstrapStateLock().acquireReadLock()} before performing any | ||
| * operations. The lock is automatically released upon task completion or | ||
| * exception, ensuring safe concurrent execution of tasks within the service when running along with bootstrap flow. |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The Javadoc comment should end with "ensures safe concurrent execution of tasks within the service when running alongside bootstrap flow." (changed "along with" to "alongside" for better grammar).
| * exception, ensuring safe concurrent execution of tasks within the service when running along with bootstrap flow. | |
| * exception, ensuring safe concurrent execution of tasks within the service when running alongside bootstrap flow. |
| public UncheckedAutoCloseable acquireWriteLock() throws InterruptedException { | ||
| return lock(false); | ||
| } | ||
|
|
||
| @Override | ||
| public void close() { | ||
| unlock(); | ||
| public UncheckedAutoCloseable acquireReadLock() throws InterruptedException { | ||
| return lock(true); |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The methods acquireWriteLock() and acquireReadLock() declare throws InterruptedException, but they cannot actually throw this exception. The underlying lock.lock() call (line 35) is a blocking operation that doesn't throw InterruptedException.
If you need interruptible locking, use lock.lockInterruptibly() instead. Otherwise, remove the throws InterruptedException declaration from both methods.
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/DBCheckpointServlet.java
Outdated
Show resolved
Hide resolved
jojochuang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of the change, is that previously, the bootstrap lock is held when submitting internal requests (directory delete, key delete). That is moved to much later and bootstrap lock is held when the requests are being executed.
|
|
||
| // Submit Purge paths request to OM. Acquire bootstrap lock when processing deletes for snapshots. | ||
| try (BootstrapStateHandler.Lock lock = snapTableKey != null ? getBootstrapStateLock().lock() : null) { | ||
| try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock removed here.
| .setPurgeKeysRequest(requestBuilder.build()).setClientId(getClientId().toString()).build(); | ||
|
|
||
| try (Lock lock = snapTableKey != null ? getBootstrapStateLock().lock() : null) { | ||
| try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock removed here.
| .setClientId(clientId.toString()) | ||
| .build(); | ||
|
|
||
| try (BootstrapStateHandler.Lock lock = getBootstrapStateLock().lock()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock removed here.
| .setSnapshotMoveTableKeysRequest(moveDeletedKeys) | ||
| .setClientId(clientId.toString()) | ||
| .build(); | ||
| try (BootstrapStateHandler.Lock lock = getBootstrapStateLock().lock()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock removed here.
|
|
||
| try ( | ||
| UncheckedAutoCloseableSupplier<OmSnapshot> snapshotMetadataReader = | ||
| // Acquire bootstrap lock before opening any snapshot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved the bootstrap lock earlier
| lock.lock(); | ||
| return lock::unlock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line returns a method reference to unlock the acquired lock. Specifically, it creates an UncheckedAutoCloseable instance that, when closed, calls unlock() on the same lock, releasing it. This enables usage of the lock in a try-with-resources style, ensuring the lock will be released when no longer needed.
Change-Id: Ib5a872948e090cbf1f3cc51dd3a92f159429e835 # Conflicts: # hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/SstFilteringService.java
Change-Id: Icc3420a005fe68db59aeb9ede68a7f6cc4e22831
jojochuang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Merged. Thanks @swamirishi |
What changes were proposed in this pull request?
Currently bootstrap lock is acquired after the snapshot is already opened thus this can lead to a deadlock condition during bootstrap where the Bootstrap flow has already acquired a bootstrap lock and is waiting on snapshot cache lock to be acquired which cannot be acquired since the snapshots are still open. All background services acquire bootstrap lock while still having a snapshot open which is going to cause a deadlock here.
To fix this all background services should always acquire bootstrap lock before opening a snapshot. The only con to this is that the entire task of background service would be blocked when the bootstrap copy batch is running on the leader om which should be ok since bootstrap would be an infrequent operation.
However one possible improvement would be to just to create a hardlink(which we already do to ensure this file doesn't get deleted by rocksdb operations but we also write the file into the Tarball stream synchronously
ozone/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/Archiver.java
Lines 135 to 155 in 96390ac
The above can be done as part of a separate patch @sadanand48
https://issues.apache.org/jira/browse/HDDS-13906
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13905
How was this patch tested?
Existing unit tests which was stuck because of the deadlock and marked flaky because of 272544aa95c66d0743f9ac94b2f4d586b325b8a7