Skip to content

HDDS-14159. Have an option to read only Key in ManagedRawSSTFileIterator#9485

Merged
swamirishi merged 8 commits intoapache:masterfrom
swamirishi:HDDS-14159
Dec 13, 2025
Merged

HDDS-14159. Have an option to read only Key in ManagedRawSSTFileIterator#9485
swamirishi merged 8 commits intoapache:masterfrom
swamirishi:HDDS-14159

Conversation

@swamirishi
Copy link
Contributor

@swamirishi swamirishi commented Dec 12, 2025

What changes were proposed in this pull request?

Currently for SnapshotDiff and defrag we only rely on Key and never on the value of the sst file. We should make the value read optional to avoid unnecessary memory allocation.
This is to enable SSTFileSetReader to just read the key where the value is not required. This is to avoid unnecessary IO while reading the delta SSTFiles which only requires key to compute diff b/w snapshots both for defrag and snapshot diff.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14159

How was this patch tested?

Added unit tests

Change-Id: Iaaf2919c4e8a044ad0f3fc8714758e794e093480
@swamirishi swamirishi added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Dec 12, 2025
Change-Id: I173a8753d6e2dc7a039bbc6c8872f8026bd5ac6a
Change-Id: I26e22a57c52159431e401174f5bdffb195af3a00
Change-Id: Ifb66a66064e5a5efb62c1dde87208d2c0c0ddbad
Comment on lines 27 to 47
NEITHER,
/**
* Read key only.
*/
KEY_ONLY,
/**
* Read value only.
*/
VALUE_ONLY,
/**
* Read both key and value.
*/
KEY_AND_VALUE;

public boolean readKey() {
return (this.ordinal() & KEY_ONLY.ordinal()) != 0;
}

public boolean readValue() {
return (this.ordinal() & VALUE_ONLY.ordinal()) != 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @swamirishi
nit: set the bit mask explicitly instead of relying on the ordinal just to avoid any issues if the enum constants are reordered/added in the future.

Suggested change
NEITHER,
/**
* Read key only.
*/
KEY_ONLY,
/**
* Read value only.
*/
VALUE_ONLY,
/**
* Read both key and value.
*/
KEY_AND_VALUE;
public boolean readKey() {
return (this.ordinal() & KEY_ONLY.ordinal()) != 0;
}
public boolean readValue() {
return (this.ordinal() & VALUE_ONLY.ordinal()) != 0;
}
NEITHER(0),
/**
* Read key only.
*/
KEY_ONLY(1),
/**
* Read value only.
*/
VALUE_ONLY(2),
/**
* Read both key and value.
*/
KEY_AND_VALUE(3);
public final int mask;
IteratorType(int mask) {
this.mask = mask;
}
public boolean readKey() {
return (this.mask & KEY_ONLY.mask) != 0;
}
public boolean readValue() {
return (this.mask & VALUE_ONLY.mask) != 0;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice thanks for pointing out. I just moved this code from internal TableIterator. Let me make this change

Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

I'd suggest to update the PR subject to make it clear that SsetFileReader reads just the key and skips value. And add the expected impact in the jira.

protected ClosableIterator<String> getKeyIteratorForFile(String file) {
return new ManagedRawSstFileIterator(file, options, lowerBoundSlice, upperBoundSlice,
keyValue -> StringUtils.bytes2String(keyValue.getKey()));
keyValue -> StringCodec.get().fromPersistedFormat(keyValue.getKey()), KEY_ONLY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SstFileSetReader reads just the key.

…instead of ordinal

Change-Id: I4bdfdc6da9990a1518dc900f8297d92700de0022
@swamirishi swamirishi marked this pull request as ready for review December 12, 2025 22:10
Change-Id: Ic7033f081410d196409cf054b1777a1e78901cd2
Change-Id: I2eb955c8fc0a06219465b424bffd5e426e0b70ab

# Conflicts:
#	hadoop-hdds/rocksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdb/util/SstFileSetReader.java
Change-Id: Ia60bc037c0fb29f9fd92b44a486977e48b7d19d9
@swamirishi swamirishi merged commit d6026b2 into apache:master Dec 13, 2025
43 checks passed
@swamirishi
Copy link
Contributor Author

thank you @jojochuang and @SaketaChalamchala for reviewing the the patch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants