Skip to content

Conversation

@swamirishi
Copy link
Contributor

What changes were proposed in this pull request?

A Snapshot Cache lock needs to be implemented which should prevent the rocksdb to be opened while the lock is held.

The idea is, the snapshot can't be opened for other purposes during snapshot defragmentation (note: snapshot compaction is not DB compaction). Essentially, an exclusive lock should be taken on the snapshot during the process.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13004

How was this patch tested?

Added unit tests

Change-Id: Ib988b6e862f4964bc5915f6e845c6905a8a650ee
@swamirishi swamirishi marked this pull request as ready for review October 28, 2025 15:28
@swamirishi swamirishi added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Oct 28, 2025
@swamirishi
Copy link
Contributor Author

@SaketaChalamchala can you review this patch?

@smengcl smengcl requested a review from Copilot October 29, 2025 04:57
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the SnapshotCache class to add support for snapshot-specific locking and improves the cleanup mechanism by extracting single-key cleanup logic into a separate method.

  • Adds a new lock(UUID snapshotId) method that acquires a write lock for a specific snapshot
  • Refactors the cleanup logic by extracting single-key cleanup into a new cleanup(UUID evictionKey) method
  • Renames test method from testGetHoldsWriteLock to testLockHoldsWriteLock for consistency
  • Adds test coverage for the new snapshot-specific lock method

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
SnapshotCache.java Adds snapshot-specific lock method and refactors cleanup logic into two overloaded methods
TestSnapshotCache.java Renames test method and adds test for new snapshot-specific lock functionality

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 295 to 297
* Acquires a write lock on a specific snapshot database and returns an auto-closeble supplier for lock details.
* The lock ensures that the operations accessing the snapshot database are perfromed in a thread safe manner. The
* returned supplier automatically releases the lock acquired when closed, preveneting potential resource
Copy link

Copilot AI Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple spelling errors in the JavaDoc: 'auto-closeble' should be 'auto-closeable', 'perfromed' should be 'performed', 'preveneting' should be 'preventing'.

Suggested change
* Acquires a write lock on a specific snapshot database and returns an auto-closeble supplier for lock details.
* The lock ensures that the operations accessing the snapshot database are perfromed in a thread safe manner. The
* returned supplier automatically releases the lock acquired when closed, preveneting potential resource
* Acquires a write lock on a specific snapshot database and returns an auto-closeable supplier for lock details.
* The lock ensures that the operations accessing the snapshot database are performed in a thread safe manner. The
* returned supplier automatically releases the lock acquired when closed, preventing potential resource

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Change-Id: I4496de9d403268a179b9ab65afb8575b07ad83a6
Change-Id: I9a6914f0b383486da7c95063a57ea64ba52dcec7
@smengcl smengcl requested a review from Copilot October 29, 2025 05:12
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -324,43 +336,49 @@ public OMLockDetails get() {
* If cache size exceeds soft limit, attempt to clean up and close the
instances that has zero reference count.
Copy link

Copilot AI Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammatical error: 'instances that has' should be 'instances that have' for subject-verb agreement.

Suggested change
instances that has zero reference count.
instances that have zero reference count.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @swamirishi . lgtm

AtomicReference<OMLockDetails> lockDetails = new AtomicReference<>(lockFunction.get());
if (lockDetails.get().isLockAcquired()) {
cleanup(true);
cleanupFunction.get();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we closing the snapshot instance upon acquiring a lock?

Copy link
Contributor Author

@swamirishi swamirishi Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want a rocksdb open in the cache once a write lock has been acquired. We would be using this for atomically switching the rocksdb directory after defragging a snapshot

@swamirishi swamirishi merged commit 2806bae into apache:master Oct 29, 2025
49 checks passed
@swamirishi
Copy link
Contributor Author

thank you @smengcl and @sadanand48 for reviewing the patch

@adoroszlai
Copy link
Contributor

@swamirishi please set fix version when resolving Jira issue after PR merge

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Dec 5, 2025
jojochuang pushed a commit that referenced this pull request Dec 8, 2025
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Jan 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants