Skip to content

Conversation

@swamirishi
Copy link
Contributor

@swamirishi swamirishi commented Nov 18, 2025

What changes were proposed in this pull request?

After HDDS-13005 all keys returned by SstFileSetReader is going to be monotonically sorted increasing order thus using rocksdbIterator for getting the keys would be more optimzed than doing multiple gets.
This is built on #9313

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13930

How was this patch tested?

Added unit tests

Change-Id: Iba47aeb21663dfa407ab71339cef02c0d74b49f2
Change-Id: Ifd2feca1fddb144e4955db025f0b15a2ab1f3bfe
Change-Id: I34536ff06efb7d5a4942853f0fd83942ab398b5f
…otLocalDataManager

Change-Id: I34536ff06efb7d5a4942853f0fd83942ab398b5f
Change-Id: I32bcaf2a1fb290f1790c02872a0230cd65586636
Change-Id: I105a2e8178c0444d52de41b99801f4ceb6d57ffd

# Conflicts:
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/util/Checksum.java
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/util/ObjectSerializer.java
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/util/YamlSerializer.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotLocalDataYaml.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OmSnapshotLocalDataManager.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/TestOmSnapshotLocalDataYaml.java
Change-Id: I985170e38fb8beeb784048e85a08a4c79e1aec97
Change-Id: I33e6e6e825bf23c323ad7ed593d800a11720fa4f
Change-Id: If30b2c766db82adde72145c8ecd3e590ef54cc2d
Change-Id: Id3f2c49050bc3476b9e0f5f51dacb6d9acc4c2f7
Change-Id: I432960725b4c6c55aa906b5780cc3027e41e10db
Change-Id: I3c5514e5bbd251a2b5297d8f074cfde5c71fa543
Change-Id: Ib5a9e6c91bdccba17820263c47eaf2c8400e930d
Change-Id: Ica36e0615c7bc6aa9b6a7f6fafafd0f830d4bafb
Change-Id: I26b66f266bb7677e4b1078f5fcd9f2ce3a651a70

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OmSnapshotLocalDataManager.java
Change-Id: I1d93dbc048a42cc55ff1f8ffa420e52f967527b8
Change-Id: I34202928a7a367dd0a1e57219317ff34de352b78
Change-Id: Iad6f26cb71ec921c51ee2d138745df1a2663533f
Change-Id: Ic5f7e249cfb9cb3973cbcd4abd36b22a6ff8f5aa
…calDataProvider

Change-Id: I3a004b4b435075a4348960aeed642e8da71e7e72
Change-Id: I06990bc9ab8fc7e1eb7bec255646a650bd8c35fe
Change-Id: I4c6c61c83aa9fadab8ecef854b99dcc0a89a2208
Change-Id: I0e476322372a302572f1fe79cbf2e874bfeac2ed
Change-Id: I31004e0c95dad64411c6fe848501a82f2f773cba
Change-Id: Id317c8b56e8b25c122b68eaf96599b9690d08f79
Change-Id: I3849387d064e093634e69cdaf870d27c1934cda5

# Conflicts:
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/util/ObjectSerializer.java
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/util/YamlSerializer.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotLocalData.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotLocalDataYaml.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OmSnapshotLocalDataManager.java
Change-Id: Ie5e5f3dab4324103e8855dd15619d7755f0422e6
Change-Id: I55bd5c3ef7fc32910a9111328638de2edffcd541
Change-Id: I880997d3eebdf378f14c203c61c2d63b2d17552e
Change-Id: I13ba8e2fd012a3c964d657e83496c93a4f55a3be
Change-Id: I41308fd260de2e7ecdd8e8318e2155096212b687
Change-Id: Ie607809e088aee642940f72fdd12a258a06cc96d

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OmSnapshotLocalDataManager.java
Change-Id: Ifb7f696c2ec859786286bf0545e6039c2b70b2ba
Change-Id: Ie464f0ed154e8911024dbe7802338e11a83f502a
Change-Id: I8abb7a9db2e595f0cf2a0d257cb38caaacec6e1f
Change-Id: I0533a4a3869ad0beb04d3f22f835985ee7d2982e
Change-Id: I09301ff17cf3c8abab5785d25336c7633f8ac293
Change-Id: Iddd5315532d35ed853f518daa77e0fa96dd5f2c0
Change-Id: I3e885ef24c95280ea0acccaf44787bd295aedbe4

# Conflicts:
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/utils/IOUtils.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/diff/delta/DeltaFileComputer.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/diff/delta/FileLinkDeltaFileComputer.java
Change-Id: I6ae5c5897f9ca8580ba68b17ad77dffa35dfafb6
Change-Id: Ib6089c9c7d8c787edd75721234d94d1ea0eb88c9
Change-Id: Idbf1d64b72ff4a8c80b2da8515639e430cef895c
Change-Id: I83a10d188f4764ef0646d4c19d0b2bf4cdcc82ee
@adoroszlai
Copy link
Contributor

Please rebase on top of master.

Change-Id: Ie968331ba23db3ae33ca421599a75ead35076d48

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/diff/delta/RDBDifferComputer.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/snapshot/diff/delta/TestCompositeDeltaDiffComputer.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/snapshot/diff/delta/TestRDBDifferComputer.java
Change-Id: I227cf7471e5235204926fb5f48c754516222fc30
Change-Id: I1d292144717589327756b737f8749085fa0e0213
Change-Id: I80d2de61eaf6f507509534ee834a5dbe44ae85b6

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotLocalData.java
Change-Id: I9dda0e6410d430cbf060cf97b6dbefb3710f29cf
@swamirishi
Copy link
Contributor Author

@jojochuang & @smengcl this can reviewed as I merged and rebased the dependent PR

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes snapshot diff operations by replacing multiple RocksDB get() calls with efficient iterator-based traversal. Since HDDS-13005 guarantees that keys from SstFileSetReader are returned in monotonically sorted order, using RocksDB iterators for lookups is more efficient than performing individual get operations.

Key Changes:

  • Introduced TableMergeIterator to merge and filter data from multiple tables using iterators instead of individual lookups
  • Changed SstFileSetReader API from Stream<String> to ClosableIterator<String> for better resource management and integration with iterator-based processing
  • Removed isKeyInBucket filtering logic from SnapshotDiffManager.addToObjectIdMap() by leveraging prefix-based table iteration

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
TableMergeIterator.java New utility class that merges multiple table iterators and filters by keys, enabling efficient multi-table lookups using seek operations
TestTableMergeIterator.java Comprehensive unit tests for TableMergeIterator covering edge cases including null keys, empty tables, and large datasets
package-info.java Package documentation for the new snapshot.util package
SnapshotDiffManager.java Refactored addToObjectIdMap() to use TableMergeIterator instead of individual get() calls; removed isKeyInBucket check as filtering is now done at the iterator level
TestSnapshotDiffManager.java Updated test to use StringInMemoryTestTable and modified key generation to include bucket prefix for iterator-based filtering
SstFileSetReader.java Changed return type from Stream to ClosableIterator for getKeyStream() and getKeyStreamWithTombstone(); removed getStreamFromIterator() helper method
TestSstFileSetReader.java Updated tests to use ClosableIterator with while loops instead of Stream forEach patterns
InMemoryTestTable.java Added name field and new constructors to support table naming in tests
StringInMemoryTestTable.java Added constructors for Map initialization and table naming to support improved testing infrastructure
Comments suppressed due to low confidence (1)

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/util/TableMergeIterator.java:80

  • Variable key may be null at this access as suggested by this null guard.
    if (itr.hasNext() && (kvs.get(index) == null || key.compareTo(kvs.get(index).getKey()) > 0)) {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


private V updateAndGetValueAtIndex(K key, int index) {
Table.KeyValueIterator<K, V> itr = itrs[index];
if (itr.hasNext() && (kvs.get(index) == null || key.compareTo(kvs.get(index).getKey()) > 0)) {
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential NullPointerException when key is null. The condition key.compareTo(kvs.get(index).getKey()) > 0 will throw NPE if key is null, even though line 90 checks for null. The null check should be performed before any method calls on key.

Consider changing line 80 to:

if (itr.hasNext() && (kvs.get(index) == null || (key != null && key.compareTo(kvs.get(index).getKey()) > 0))) {
Suggested change
if (itr.hasNext() && (kvs.get(index) == null || key.compareTo(kvs.get(index).getKey()) > 0)) {
if (itr.hasNext() && (kvs.get(index) == null || (key != null && key.compareTo(kvs.get(index).getKey()) > 0))) {

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

key can never be null. If it is null then it would fetch wrong results


public InMemoryTestTable(Map<KEY, VALUE> map, String name) {
this.map = new ConcurrentSkipListMap<>(map);
this.map.putAll(map);
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant putAll operation. The constructor new ConcurrentSkipListMap<>(map) already copies all entries from the input map, so calling this.map.putAll(map) on line 49 duplicates the same entries unnecessarily.

Consider removing line 49:

public InMemoryTestTable(Map<KEY, VALUE> map, String name) {
  this.map = new ConcurrentSkipListMap<>(map);
  this.name = name;
}
Suggested change
this.map.putAll(map);

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code change unrelated to PR

}

/**
* Gets the next key and corresponding values from all tables. Note: the values array returned is not Immutable and
Copy link

Copilot AI Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation inconsistency: The comment says "values array" but the return type is List<V>, not an array. Also, "Immutable" should be lowercase "immutable" since it refers to a property, not a type name.

Suggested fix:

/**
 * Gets the next key and corresponding values from all tables. Note: the values list returned is not immutable and
 * will be modified on the next call to next().
 * @return KeyValue containing the next key and corresponding values from all tables. The value would be null if
 * the key is not present in that table otherwise the value from that table corresponding to the next key.
 */
Suggested change
* Gets the next key and corresponding values from all tables. Note: the values array returned is not Immutable and
* Gets the next key and corresponding values from all tables. Note: the values list returned is not immutable and

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small code style suggestions but overall looks fine to me.
Though not sure how much difference does it make performance wise.

Comment on lines 1061 to 1081
if (fromObjectId != null) {
byte[] rawObjId = codecRegistry.asRawData(
fromObjectId.getObjectID());
byte[] rawObjId = codecRegistry.asRawData(fromObjectId.getObjectID());
// Removing volume bucket info by removing the table bucket Prefix
// from the key.
// For FSO buckets will be left with the parent id/keyname.
// For OBS buckets will be left with the complete path
byte[] rawValue = codecRegistry.asRawData(
key.substring(tablePrefix.length()));
oldObjIdToKeyMap.put(rawObjId, rawValue);
objectIdToIsDirMap.put(rawObjId, isDirectoryTable);
oldParentIds.ifPresent(set -> set.add(
fromObjectId.getParentObjectID()));
}
if (toObjectId != null) {
byte[] rawObjId = codecRegistry.asRawData(toObjectId.getObjectID());
byte[] rawValue = codecRegistry.asRawData(
key.substring(tablePrefix.length()));
byte[] rawValue = codecRegistry.asRawData(key.substring(tablePrefix.length()));
newObjIdToKeyMap.put(rawObjId, rawValue);
objectIdToIsDirMap.put(rawObjId, isDirectoryTable);
newParentIds.ifPresent(set -> set.add(toObjectId
.getParentObjectID()));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate code fragments
Line 1061 - 1073 vs Line 1074 - 1081.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want make logical code changes here. Just want to keep it the same

@swamirishi
Copy link
Contributor Author

swamirishi commented Nov 21, 2025

Small code style suggestions but overall looks fine to me. Though not sure how much difference does it make performance wise.

This would make a difference avoiding a lot of seeks. In this we are always doing forward seeks in iterator since keys would be always in a sorted order

@swamirishi swamirishi marked this pull request as ready for review November 21, 2025 22:54
Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @swamirishi feel free to merge as is.

@swamirishi swamirishi merged commit fd3cb67 into apache:master Nov 22, 2025
61 checks passed
@swamirishi
Copy link
Contributor Author

Thanks you for reviewing the patch @jojochuang

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants