Add dedicated block cache stats for range deletion blocks (#14102)#14317

Open
dannyhchen wants to merge 7 commits into facebook:main from dannyhchen:14102

Conversation

@dannyhchen

Range deletion blocks were incorrectly counted under BLOCK_CACHE_DATA_* tickers because BlockType::kRangeDeletion fell through to the default case in UpdateCacheHitMetrics, UpdateCacheMissMetrics, and UpdateCacheInsertionMetrics. Add dedicated BLOCK_CACHE_RANGE_DELETION_* tickers following the same pattern as compression dictionary blocks.

…4102)

Range deletion blocks were incorrectly counted under BLOCK_CACHE_DATA_*
tickers because BlockType::kRangeDeletion fell through to the default
case in UpdateCacheHitMetrics, UpdateCacheMissMetrics, and
UpdateCacheInsertionMetrics. Add dedicated BLOCK_CACHE_RANGE_DELETION_*
tickers following the same pattern as compression dictionary blocks.
@meta-cla meta-cla bot added the CLA Signed label Feb 9, 2026
Resolve duplicate enum entries in statistics.h and statistics.cc
caused by merging main into the 14102 branch. Remove dead
CheckCompressedCacheCounters referencing removed BLOCK_CACHE_COMPRESSED
tickers and stale #endif ROCKSDB_LITE guard.

Fix ReadRangeDelBlock to use RetrieveBlock<Block> (with
BlockType::kRangeDeletion) instead of NewDataBlockIterator<DataBlockIter>
so that range deletion block cache stats are tracked under
BLOCK_CACHE_RANGE_DELETION_* tickers rather than BLOCK_CACHE_DATA_*.
Update hardcoded BYTES_WRITTEN ticker value in c_test.c from 61 to 67
to match the new enum position after the Tickers reorganization in
statistics.h. Apply clang-format to files touched by the merge.
Contributor

@pdillinger pdillinger left a comment


It may not be easy to fix all these things 100%, but many can be fixed

BLOCK_CACHE_COMPRESSION_DICT_HIT,
BLOCK_CACHE_COMPRESSION_DICT_ADD,
BLOCK_CACHE_COMPRESSION_DICT_BYTES_INSERT,
BLOCK_CACHE_COMPRESSION_DICT_BYTES_EVICT,
Contributor

Hallucination

Author

Thanks for pointing this out, removing it.

{BLOCK_CACHE_COMPRESSION_DICT_BYTES_INSERT,
"rocksdb.block.cache.compression.dict.bytes.insert"},
{BLOCK_CACHE_COMPRESSION_DICT_BYTES_EVICT,
"rocksdb.block.cache.compression.dict.bytes.evict"},
Contributor

Hallucination

Author

Thanks for pointing this out, removing it.

ASSERT_OK(Put("key4", "val4"));
ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(), "key2",
"key4"));
ASSERT_OK(Flush());
Contributor

If we are smarter in the future about handling range deletions, we would have to update this test. (It would be perfectly valid here to drop the range deletion and the Puts it covers. I suggest at least two flushes.)

Author

Hey, the test was updated with two flushes. Let me know if this is what you were looking for.

TestGetTickerCount(options, BLOCK_CACHE_RANGE_DELETION_BYTES_INSERT);

// Range deletion block should have been loaded (miss + insert).
ASSERT_EQ(1, range_del_misses);
Contributor

This assumes we load the range deletion block on file open, and that we didn't warm the cache, which are fragile assumptions that could cause unnecessary test churn on reasonable functional changes.

TestGetTickerCount(options, BLOCK_CACHE_RANGE_DELETION_MISS);
range_del_inserts =
TestGetTickerCount(options, BLOCK_CACHE_RANGE_DELETION_ADD);
ASSERT_GE(range_del_misses, 2);
Contributor

There have been numerous tests that become inscrutable from the implied state in updating ticker counts, where all that implied state is easily avoided by PopTicker(). Two steps isn't going to break the bank, but it's perpetuating a maddening pattern.

Author

Updated the test; let me know how it looks (once I push the changes):

Key improvements:

  1. No assumption about when range deletion blocks are loaded: The test now records baseline stats after setup and uses delta-based assertions (ASSERT_GE) rather than assuming exact counts.
  2. Explicitly triggers range deletion block access: Uses Get("key2") and Get("key3") on keys covered by the DeleteRange, which forces the range deletion block to be accessed regardless of when RocksDB chooses to load it.
  3. Flexible assertions:
    - Uses ASSERT_GE(after + hits, baseline + hits) instead of exact counts
    - Verifies "at least one access occurred" rather than "exactly 1 miss"
    - Checks deltas between before/after states
  4. Focuses on the key invariant: The main verification is that range deletion accesses are NOT counted as data block accesses - this is the critical functionality being tested.

The test is now resilient to:

  • Range deletion blocks loaded eagerly on file open
  • Range deletion blocks loaded lazily on first access
  • Cache warming implementations
  • Multiple SST files each with their own range deletion blocks

@meta-codesync

meta-codesync bot commented Feb 11, 2026

@dannyhchen has imported this pull request. If you are a Meta employee, you can view this in D92984306.

@anand1976
Contributor

  1. Missing Java bindings

The PR adds 6 new tickers (BLOCK_CACHE_RANGE_DELETION_MISS, _HIT, _ADD, _BYTES_INSERT, _ADD_REDUNDANT, and BLOCK_CACHE_COMPRESSION_DICT_BYTES_EVICT) to include/rocksdb/statistics.h but does not update:

  • java/src/main/java/org/rocksdb/TickerType.java
  • java/rocksjni/portal.h

The existing compression dict tickers are already in the Java bindings, so the new ones need to be added too. This will break the Java build.

  2. BLOCK_CACHE_COMPRESSION_DICT_BYTES_EVICT is unrelated and unused

This ticker is added to the enum and the name map but there's no code in the diff that actually records it. There are no corresponding BYTES_EVICT tickers for index, filter, or data blocks either, so this is an inconsistent addition. If there's a plan to use it, it
should be in a separate PR with the actual recording code. If it's a placeholder, it shouldn't be added yet — unused public API surface creates confusion.

  3. No PERF_COUNTER for range deletion blocks

The hit path for other block types has PERF_COUNTER_ADD calls (e.g., block_cache_index_hit_count, block_cache_filter_hit_count, block_cache_data_hit_count). The new kRangeDeletion case doesn't add any perf counters. This is consistent with compression dictionary (which
also has no perf counter, just a TODO), but it's worth noting.

  4. The helper methods RecordCacheCountersForRangeDeletion and CheckCacheCountersForRangeDeletion are added to the test fixture but not used in the new test (CacheRangeDeletionBlock uses direct ticker checks instead). Either use them or don't add them.

  5. No release note

Adding new public statistics tickers is a user-visible change that should have an entry in unreleased_history/.

Minor

  • The test CacheRangeDeletionBlock should also verify the aggregate BLOCK_CACHE_MISS includes the range deletion miss (i.e., BLOCK_CACHE_MISS >= BLOCK_CACHE_RANGE_DELETION_MISS), to validate the invariant stated in the comments.

@mszeszko-meta
Contributor

mszeszko-meta commented Feb 11, 2026

Added comments in Phabricator 👍

- Make CacheRangeDeletionBlock test resilient to implementation changes
  by using delta-based assertions instead of exact counts
- Explicitly trigger range deletion block access via Get() rather than
  relying on file-open behavior
- Use two flushes to ensure range deletion ends up in separate SST file
- Remove obsolete BLOCK_CACHE_COMPRESSION_DICT_BYTES_EVICT ticker that
  was incorrectly present (listed as removed in HISTORY.md)

The test now verifies the key invariant (range deletion accesses are
NOT counted as data block accesses) without making fragile assumptions
about when range deletion blocks are loaded.
Add Java enum values and portal.h mappings for the new tickers:
- BLOCK_CACHE_RANGE_DELETION_MISS
- BLOCK_CACHE_RANGE_DELETION_HIT
- BLOCK_CACHE_RANGE_DELETION_ADD
- BLOCK_CACHE_RANGE_DELETION_BYTES_INSERT
- BLOCK_CACHE_RANGE_DELETION_ADD_REDUNDANT
- COMPACTION_ABORTED

These tickers were added in the C++ enum but missing from the Java
bindings, causing build failures due to unhandled switch cases.
The previous fix incorrectly updated BYTES_WRITTEN_TICKER from 61 to 67,
but the actual enum position is 66. This caused c_test to fail in the
statistics test phase.
@meta-codesync

meta-codesync bot commented Feb 13, 2026

@dannyhchen has imported this pull request. If you are a Meta employee, you can view this in D92984306.

uint64_t num_cache_compression_dict_add = 0;
uint64_t num_cache_compression_dict_add_redundant = 0;
uint64_t num_cache_compression_dict_bytes_insert = 0;
uint64_t num_cache_range_deletion_hit = 0;
Contributor

I don't like this because it's paying a tiny penalty on EVERY lookup in the hopes that this will be incremented by more than 1 and we'll save a tiny amount in that case. That's really only relevant to 'hit' statistics, MAYBE data block miss/add/bytes_insert. You don't necessarily need to change anything. You're just piling onto an existing misappropriation of a reasonable optimization.

compression_dict_insert_count_ = new_compression_dict_insert_count;
}

void RecordCacheCountersForRangeDeletion(const Options& options) {
Contributor

Unused

TestGetTickerCount(options, BLOCK_CACHE_RANGE_DELETION_ADD);
}

void CheckCacheCountersForRangeDeletion(
Contributor

Unused

size_t compression_dict_miss_count_ = 0;
size_t compression_dict_hit_count_ = 0;
size_t compression_dict_insert_count_ = 0;
size_t range_deletion_miss_count_ = 0;
Contributor

Not used (eventually)


// At least one range deletion block access should have occurred
// (either from file open or from the Get).
ASSERT_GE(range_del_misses + range_del_hits,
Contributor

This seems complicated. Shouldn't there be a miss (with bytes added) somewhere before now? And a hit (or more than one) somewhere after re-open and re-get?

StartPhase("statistics");
{
const uint32_t BYTES_WRITTEN_TICKER = 61;
const uint32_t BYTES_WRITTEN_TICKER = 66;
Contributor

In the presence of C bindings like this, we should either
(a) avoid re-ordering the C++ tickers (and be transparent if we do)
(b) provide C exports in c.h to expose them, so that C users don't need to hard-code values that become obsolete.
