Cache feature for snapshot database #8702

matkt · 2025-05-27T13:22:09Z

PR description

This PR refactors the DB snapshot (thanks to @garyschulte's commit) to remove the use of transactions, as a snapshot is immutable by nature.
Additionally, it introduces an optional cache layer at the snapshot level to optimize repeated reads.
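For readers who want the idea in miniature, here is a rough sketch of a read-through cache over an immutable snapshot. It is not the code in this PR: the class name `SnapshotReadCacheSketch`, the String keys and values, and the small LRU built on `LinkedHashMap` are all assumptions made for illustration. Because the snapshot never changes, cached entries never need invalidation.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the Besu implementation.
final class SnapshotReadCacheSketch {
  // Immutable copy of the data taken at snapshot time (stands in for a DB snapshot).
  private final Map<String, String> snapshot;
  // Small LRU cache; entries never go stale because the snapshot is immutable.
  private final Map<String, String> cache;
  private int snapshotReads = 0;

  SnapshotReadCacheSketch(final Map<String, String> liveData, final int maxEntries) {
    this.snapshot = Map.copyOf(liveData);
    this.cache =
        new LinkedHashMap<String, String>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(final Map.Entry<String, String> eldest) {
            return size() > maxEntries;
          }
        };
  }

  // Read-through lookup: serve from the cache, fall back to the snapshot on a miss.
  synchronized String get(final String key) {
    String value = cache.get(key);
    if (value == null) {
      snapshotReads++;
      value = snapshot.get(key);
      if (value != null) {
        cache.put(key, value);
      }
    }
    return value;
  }

  synchronized int snapshotReads() {
    return snapshotReads;
  }

  public static void main(final String[] args) {
    final Map<String, String> live = new HashMap<>();
    live.put("account-1", "balance=10");
    final SnapshotReadCacheSketch reader = new SnapshotReadCacheSketch(live, 1_000);
    live.put("account-1", "balance=99"); // later write, invisible to the snapshot
    System.out.println(reader.get("account-1")); // balance=10
    System.out.println(reader.get("account-1")); // balance=10, served from the cache
    System.out.println(reader.snapshotReads()); // 1
  }
}
```

Note that this simplified version does not cache missing keys, so repeated lookups of an absent key still hit the snapshot each time.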

Tests:

Checked block building on a Sepolia validator
Verified oldest_snapshot_seqno to make sure old snapshots are correctly closed
Called eth_call to check performance with the snapshot cache enabled

Fixed Issue(s)

Thanks for sending a pull request! Have you done the following?

Checked out our contribution guidelines?
Considered documentation and added the doc-change-required label to this PR if updates are required.
Considered the changelog and included an update if required.
For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

spotless: ./gradlew spotlessApply
unit tests: ./gradlew build
acceptance tests: ./gradlew acceptanceTest
integration tests: ./gradlew integrationTest
reference tests: ./gradlew ethereum:referenceTests:referenceTests

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

Signed-off-by: garyschulte <garyschulte@gmail.com>

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

ahamlat · 2025-05-28T12:18:05Z

...va/org/hyperledger/besu/plugin/services/storage/rocksdb/configuration/RocksDBCLIOptions.java

+      hidden = true,
+      paramLabel = "<BOOLEAN>",
+      description =
+          "Enable caching of reads during snapshot access to improve performance (default: ${DEFAULT-VALUE})")


Can you add here that we shouldn't use it with block processing?

Updated the description; feel free to tell me if it's better.

ahamlat · 2025-05-28T12:19:14Z

...erledger/besu/plugin/services/storage/rocksdb/segmented/RocksDBColumnarKeyValueSnapshot.java

+      throws RocksDBException {
+    final Bytes cacheKey = makeCacheKey(segmentId, key);
+    Optional<byte[]> cached = cache.getIfPresent(cacheKey);
+    //noinspection OptionalAssignedToNull


I'm not sure I understand the comment.

It's because I'm checking whether the Optional == null, which is intentional here. I will move it to the beginning of the class with SuppressWarnings.

ahamlat · 2025-05-28T12:21:37Z

...erledger/besu/plugin/services/storage/rocksdb/segmented/RocksDBColumnarKeyValueSnapshot.java

+      final byte[] segmentId,
+      final byte[] key,
+      final ColumnFamilyHandle handle,
+      final Cache<Bytes, Optional<byte[]>> cache)


Just an idea, and maybe it is clear to everyone, but I wonder about adding a comment that explains why the value in the cache is an Optional: we want to cache zero reads (lookups that return no value).

Added a comment.
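A minimal sketch of the point discussed above, under stated assumptions: a plain `ConcurrentHashMap` stands in for the `Cache` type shown in the diff, and the class name `NegativeCachingSketch`, its String keys, and the `backend` function are hypothetical. A `null` from the cache means "never looked up", while `Optional.empty()` means "looked up and known to be absent", so repeated reads of a missing key do not hit the backend again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch, not the Besu implementation.
final class NegativeCachingSketch {
  // Cached values are Optionals so that "key is absent" can itself be cached.
  private final Map<String, Optional<byte[]>> cache = new ConcurrentHashMap<>();
  // Stand-in for the underlying store; returns null when the key does not exist.
  private final Function<String, byte[]> backend;
  private final AtomicInteger backendReads = new AtomicInteger();

  NegativeCachingSketch(final Function<String, byte[]> backend) {
    this.backend = backend;
  }

  @SuppressWarnings("OptionalAssignedToNull") // null here deliberately means "not cached yet"
  Optional<byte[]> get(final String key) {
    Optional<byte[]> cached = cache.get(key);
    if (cached == null) {
      backendReads.incrementAndGet();
      cached = Optional.ofNullable(backend.apply(key));
      cache.put(key, cached); // also stores Optional.empty() for missing keys
    }
    return cached;
  }

  int backendReads() {
    return backendReads.get();
  }

  public static void main(final String[] args) {
    final Map<String, byte[]> store = new HashMap<>();
    store.put("present", new byte[] {1});
    final NegativeCachingSketch reader = new NegativeCachingSketch(store::get);
    System.out.println(reader.get("missing").isPresent()); // false
    System.out.println(reader.get("missing").isPresent()); // false, from the cache
    System.out.println(reader.backendReads()); // 1
  }
}
```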

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

garyschulte · 2025-05-28T23:00:37Z

ethereum/eth/src/main/java/org/hyperledger/besu/ethereum/eth/transactions/TransactionPool.java

            .getWorldState(withBlockHeaderAndNoUpdateNodeHead(chainHeadBlockHeader))
            .orElseThrow()) {
+      if (worldState instanceof BonsaiWorldState bonsaiWorldState) {
+        bonsaiWorldState.disableCacheMerkleTrieLoader();


garyschulte · 2025-05-28T23:01:14Z

ethereum/core/src/main/java/org/hyperledger/besu/ethereum/transaction/TransactionSimulator.java

      WorldUpdater updater = getEffectiveWorldStateUpdater(ws);
-
+      if (ws instanceof BonsaiWorldState bonsaiWorldState) {
+        bonsaiWorldState.disableCacheMerkleTrieLoader();


garyschulte

LGTM. I think we would do better to have segment-specific caches, though, so we can leverage different update patterns.

garyschulte · 2025-05-28T23:20:35Z

...erledger/besu/plugin/services/storage/rocksdb/segmented/RocksDBColumnarKeyValueSnapshot.java

-    return snapTx.get(segment, key);
+    try (final OperationTimer.TimingContext ignored = metrics.getReadLatency().startTimer()) {
+      final ColumnFamilyHandle handle = columnFamilyMapper.apply(segment);
+      if (isReadCacheEnabledForSnapshots && segment.isEligibleToHighSpecFlag()) {


requiring high spec would include blockchain queries (which afaik should only be the corresponding body and header for this snapshot), but omits code. It might make sense to remove this condition entirely, since snapshots should only be used to access rpc-relevant things. The largest values are going to be block bodies, and they are not omitted by this check.

I can see how we might end up with multiple copies of the same code cached for each snapshot. ambivalent about this. it might be better to have a non-snapshot specific code cache, since the code hash key is deterministic.
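To illustrate the suggestion of a code cache that is not tied to any one snapshot, here is a rough sketch. Everything in it is an assumption for illustration: the class `SharedCodeCacheSketch`, the `loader` function, and the use of SHA-256 from the JDK as a stand-in for the real code hash. The point is only that when the key is derived from the content, the value for a key can never change, so one cache can safely serve every snapshot.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch, not the Besu implementation.
final class SharedCodeCacheSketch {
  // One cache shared by all snapshots: key = hash of the code, value = the code.
  private final Map<String, byte[]> codeByHash = new ConcurrentHashMap<>();
  private final AtomicInteger loads = new AtomicInteger();

  // SHA-256 is used here only as a stand-in for the real code hash function.
  static String hashOf(final byte[] code) {
    try {
      final byte[] digest = MessageDigest.getInstance("SHA-256").digest(code);
      final StringBuilder hex = new StringBuilder();
      for (final byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (final NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // The loader is whatever snapshot the caller happens to hold; it runs only on a miss.
  byte[] getCode(final String codeHash, final Function<String, byte[]> loader) {
    return codeByHash.computeIfAbsent(
        codeHash,
        h -> {
          loads.incrementAndGet();
          return loader.apply(h);
        });
  }

  int loads() {
    return loads.get();
  }

  public static void main(final String[] args) {
    final byte[] code = "contract bytecode".getBytes(StandardCharsets.UTF_8);
    final String hash = hashOf(code);
    final Map<String, byte[]> snapshotA = new HashMap<>();
    final Map<String, byte[]> snapshotB = new HashMap<>();
    snapshotA.put(hash, code);
    snapshotB.put(hash, code);
    final SharedCodeCacheSketch shared = new SharedCodeCacheSketch();
    shared.getCode(hash, snapshotA::get);
    shared.getCode(hash, snapshotB::get); // second snapshot reuses the cached entry
    System.out.println(shared.loads()); // 1
  }
}
```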

having column family-specific read caches is probably a better approach in general since there are different update patterns for each. We can make better use of shared caches for several segments like blockchain and code
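One way the column-family-specific idea could look, as a rough sketch only: each segment registers its own bounded cache with its own size, and segments that are not registered are simply never cached. The class `PerSegmentCachesSketch`, the segment names, and the sizes are hypothetical and chosen for illustration; they are not taken from this PR.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the Besu implementation.
final class PerSegmentCachesSketch {
  // One independent LRU cache per registered segment.
  private final Map<String, Map<String, String>> cachesBySegment = new HashMap<>();

  private static Map<String, String> lru(final int maxEntries) {
    return new LinkedHashMap<String, String>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(final Map.Entry<String, String> eldest) {
        return size() > maxEntries;
      }
    };
  }

  // Each segment gets its own capacity, so sizing can follow its access pattern.
  synchronized void register(final String segment, final int maxEntries) {
    cachesBySegment.put(segment, lru(maxEntries));
  }

  // Writes to unregistered segments are ignored, i.e. those segments are not cached.
  synchronized void put(final String segment, final String key, final String value) {
    final Map<String, String> cache = cachesBySegment.get(segment);
    if (cache != null) {
      cache.put(key, value);
    }
  }

  synchronized String get(final String segment, final String key) {
    final Map<String, String> cache = cachesBySegment.get(segment);
    return cache == null ? null : cache.get(key);
  }

  synchronized int size(final String segment) {
    final Map<String, String> cache = cachesBySegment.get(segment);
    return cache == null ? 0 : cache.size();
  }

  public static void main(final String[] args) {
    final PerSegmentCachesSketch caches = new PerSegmentCachesSketch();
    caches.register("ACCOUNT", 2);
    caches.register("STORAGE", 1);
    caches.put("ACCOUNT", "a", "1");
    caches.put("ACCOUNT", "b", "2");
    caches.put("ACCOUNT", "c", "3"); // evicts "a", the least recently used entry
    caches.put("CODE", "h", "bytecode"); // ignored: CODE has no cache registered
    System.out.println(caches.size("ACCOUNT")); // 2
    System.out.println(caches.get("CODE", "h")); // null
  }
}
```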

it might be better to have a non-snapshot specific code cache, since the code hash key is deterministic.

I'm not sure I understand this comment. I agree with your comment about blocks, but in this case it is not an issue: block data calls don't go through this code as far as I know, though I could be wrong. The idea was to exclude caching the result of get calls related to code. We don't want to cache it at all here.

macfarla · 2025-05-29T06:31:23Z

Any reason not to merge this, @matkt?

garyschulte · 2025-05-29T20:56:09Z

re-enabling auto-merge after the failed metrics CLI passed locally. I still think we could do smarter caching here though.

matkt · 2025-06-02T14:03:34Z

re-enabling auto-merge after the failed metrics CLI passed locally. I still think we could do smarter caching here though.

Yes, we can try to do a more optimized version if we really want to use this feature on mainnet or by default.

matkt and others added 7 commits May 22, 2025 18:20

read snapshot and not transaction

6e3f99e

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

remove preload and add cache for snapshot layer

c13610a

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

remove snapshot transactions, make snapshots immutable

0bc059b

Signed-off-by: garyschulte <garyschulte@gmail.com>

merge Gary's PR and fix issues

4872f5e

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

add flag for snapshot cache

e074026

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

clean code

0fccd13

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

fix build issues

ea564b2

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

matkt force-pushed the feature/test-read-snapshot branch from d6c92d9 to ea564b2 Compare May 28, 2025 08:34

matkt added 2 commits May 28, 2025 10:39

merge main

d5de0c5

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

clean code

231399d

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

matkt force-pushed the feature/test-read-snapshot branch from 8447c7a to 231399d Compare May 28, 2025 09:04

matkt marked this pull request as ready for review May 28, 2025 12:14

Merge branch 'main' into feature/test-read-snapshot

9c2f541

ahamlat reviewed May 28, 2025

fix comments

dcdd604

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

matkt requested a review from ahamlat May 28, 2025 12:53

add cache only for isEligibleToHighSpecFlag segment

a63a561

Signed-off-by: Karim Taam <karim.t2am@gmail.com>

ahamlat approved these changes May 28, 2025

garyschulte reviewed May 28, 2025

Merge branch 'main' into feature/test-read-snapshot

61f8a9d

macfarla enabled auto-merge (squash) May 29, 2025 07:46

garyschulte disabled auto-merge May 29, 2025 20:47

garyschulte enabled auto-merge (squash) May 29, 2025 20:56

garyschulte merged commit 363324b into hyperledger:main May 29, 2025
48 checks passed
