Extract IndexFactory as unified pluggable index abstraction by zaidoon1 · Pull Request #14602 · facebook/rocksdb

zaidoon1 · 2026-04-11T05:01:59Z

Refactors the block-based table's index subsystem to make custom indexes first-class citizens alongside the built-in binary search index. Both are IndexFactory subclasses at the same abstraction level, following the FilterPolicy model where built-in implementations are proper subclasses of the public interface.

New public API:

include/rocksdb/index_factory.h:
IndexFactory, IndexFactoryBuilder, IndexFactoryReader,
IndexFactoryIterator — unified interface for all index types.

BlockBasedTableOptions::IndexMode enum:
kBuiltinOnly — standard binary search only (default)
kSecondary — both indexes; standard primary, custom per-read
kPrimary — both indexes; custom primary for all reads
kPrimaryOnly — custom only; no standard index built

ReadOptions::ReadIndex enum:
kDefault — use whatever IndexMode says
kBuiltin — force built-in for this read
kCustom — force custom index for this read

Built-in IndexFactory implementations:

BinarySearchIndexFactory, HashIndexFactory, PartitionedIndexFactory
wrap the existing internal IndexBuilder/IndexReader behind the public
interface. The table builder creates the built-in index through the
factory, same as custom indexes.

Architecture:

BlockBasedTableBuilder manages all indexes through IndexFactoryBuilder.
The built-in index uses a fast path (AddIndexEntryDirect) that passes
internal keys directly to the underlying IndexBuilder, avoiding the
user-key translation layer. Custom indexes receive user keys through
the standard AddIndexEntry path. Zero per-block overhead for the common
case (kBuiltinOnly with no custom index).
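
To make the "user-key translation layer" concrete: a RocksDB internal key is the user key followed by an 8-byte trailer that packs the sequence number and value type. The sketch below shows the kind of stripping a custom index path would need before it sees a user key. `UserKeyFromInternalKey` is a hypothetical name for illustration, not a function from this PR, and the real wrapper may do more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Size of the trailer (sequence number + value type) appended to every
// internal key.
constexpr std::size_t kInternalKeyTrailerSize = 8;

// Hypothetical helper (not from the PR): returns the user-key portion of an
// internal key by dropping the 8-byte trailer. The returned view borrows from
// the input and is only valid while the input is alive.
inline std::string_view UserKeyFromInternalKey(std::string_view internal_key) {
  assert(internal_key.size() >= kInternalKeyTrailerSize);
  internal_key.remove_suffix(kInternalKeyTrailerSize);
  return internal_key;
}
```

The fast path described above skips this step because the built-in index builder consumes internal keys directly.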

PartitionCoordinator interface decouples PartitionedFilterBlockBuilder
from the concrete PartitionedIndexBuilder type, allowing pluggable
index implementations without leaking internal types.

IndexFactoryReaderWrapper dispatches reads between built-in and custom
indexes based on IndexMode and per-read ReadIndex selection.

Single-index mode (kPrimaryOnly):

The standard index is not built — only the custom IndexFactory produces
an index. A minimal empty block satisfies the SST footer format. The
index_key_is_user_key property is set to 0 to match the custom index
wrapper's internal key format.

Backward compatibility:

user_defined_index.h provides using aliases (UserDefinedIndexFactory =
IndexFactory, etc.). Existing code compiles without changes. The old
use_udi_as_primary_index, skip_standard_index, and fail_if_no_udi_on_open
booleans are replaced by the single IndexMode enum.
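
As a rough illustration of how the old booleans could collapse into one enum value, here is a minimal sketch. The mapping is an assumption made for illustration only: the description does not spell out the conversion, `ToIndexMode` is a hypothetical helper, and `fail_if_no_udi_on_open` is left out because its relationship to `IndexMode` is not stated.

```cpp
#include <cassert>

// Mirrors the enum values listed in the description.
enum class IndexMode { kBuiltinOnly, kSecondary, kPrimary, kPrimaryOnly };

// Hypothetical conversion (assumed mapping, not taken from the PR).
// has_custom_factory: whether a custom IndexFactory is configured at all.
inline IndexMode ToIndexMode(bool has_custom_factory,
                             bool use_udi_as_primary_index,
                             bool skip_standard_index) {
  if (!has_custom_factory) {
    return IndexMode::kBuiltinOnly;
  }
  if (!use_udi_as_primary_index) {
    // Assumption: skip_standard_index is ignored unless the custom index
    // is also the primary one.
    return IndexMode::kSecondary;
  }
  return skip_standard_index ? IndexMode::kPrimaryOnly : IndexMode::kPrimary;
}
```

A conversion along these lines is the sort of thing an options migration would need so that existing OPTIONS files keep their meaning.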

Performance:

kBuiltinOnly path: zero per-block overhead (fast path passes internal
keys directly), 2 well-predicted branches per key (~0ns), 24 bytes
empty vector per SST. No read-path overhead (wrapper not installed).
Factory constructed on stack (no heap allocation). BlockBasedTableOptions
referenced by pointer (no copy).

This attempts to address #14547 (comment)

github-actions · 2026-04-11T05:18:52Z

⚠️ clang-tidy: 1 warning(s) on changed lines

Completed in 3282.3s.

Summary by check

Check	Count
`cert-err58-cpp`	1
Total	1

Details

table/block_based/block_based_table_factory.cc (1 warning(s))

table/block_based/block_based_table_factory.cc:230:5: warning: initialization of 'block_base_table_index_mode_string_map' with static storage duration may throw an exception that cannot be caught [cert-err58-cpp]
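
For context, the underlying CERT rule (ERR58-CPP) is about exceptions thrown before main() starts, where no handler can catch them. One common way to sidestep that is a function-local static, which is initialized on first use. The sketch below is an assumption about what such a fix might look like: the element type of the real map is not shown in the warning, and `IndexModeStringMap` is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for BlockBasedTableOptions::IndexMode.
enum class IndexMode { kBuiltinOnly, kSecondary, kPrimary, kPrimaryOnly };

// The map is built the first time this function is called, i.e. after
// main() has started, so an allocation failure surfaces as an exception
// the caller can catch.
inline const std::unordered_map<std::string, IndexMode>& IndexModeStringMap() {
  static const std::unordered_map<std::string, IndexMode> kMap = {
      {"kBuiltinOnly", IndexMode::kBuiltinOnly},
      {"kSecondary", IndexMode::kSecondary},
      {"kPrimary", IndexMode::kPrimary},
      {"kPrimaryOnly", IndexMode::kPrimaryOnly},
  };
  return kMap;
}
```

Whether this is preferable to a targeted suppression depends on how the surrounding file already handles its other option string maps.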

xingbowang · 2026-04-29T20:13:33Z

/claude-review

github-actions · 2026-04-29T20:30:07Z

✅ Claude Code Review

Requested by @xingbowang

Code Review: Extract IndexFactory as unified pluggable index abstraction

Recommendation: Request Changes

The architectural direction is sound — unifying the index abstraction and introducing IndexMode are improvements over the boolean flags. However, the PR has 4 critical issues that must be resolved:

CRITICAL

C1. Broken backward-compatibility shim — The using UserDefinedIndexFactory = IndexFactory alias cannot preserve the old virtual method signatures (NewBuilder() returning raw pointer vs Status NewBuilder(opts, unique_ptr&)). Any existing subclass breaks. Additionally, IndexFactoryOptions::comparator defaults to nullptr (was BytewiseComparator()).

C2. Meta block key prefix change — kIndexFactoryMetaPrefix = "rocksdb.index_factory." vs old "rocksdb.user_defined_index.". SSTs written by new code are unreadable by old binaries and vice versa. Fix: keep old prefix or add dual-prefix reader lookup.

C3. ReadOptions::table_index_factory removal incomplete — 76+ usages in utilities/trie_index/, 6 in table_test.cc, 1 in db_bench_tool.cc are not shown in the diff. The PR appears to only update db_stress_tool/ and db/wide/ tests.

C4. Options migration missing — use_udi_as_primary_index/fail_if_no_udi_on_open marked kDeprecated (silently ignored) with no conversion to IndexMode enum. Existing OPTIONS files lose their configuration on upgrade.

HIGH

H1. kPrimaryOnly null safety — index_builder is null; diff is truncated so not all dereference sites are verifiable. Also operational risks: sst_dump/ldb/backup-restore without custom factory fail.

H2. IndexFactory::Type() returns "IndexFactory" (was "UserDefinedIndexFactory") — breaks ObjectLibrary registration and CreateFromString.

H3. index_key_is_user_key property inconsistency — PR description says 0, implementation sets 1 for kPrimaryOnly.

SUGGESTIONS

Split into phased migration (enum + interface + deprecation + removal)
Add dual meta block key reader support for SST compatibility
Extract PartitionCoordinator decoupling as separate smaller PR

Full report in review-findings.md.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

Claude may miss context from files not in the diff
Large PRs may be truncated
Always apply human judgment to AI suggestions

Commands:

/claude-review [context] — Request a code review
/claude-query <question> — Ask about the PR or codebase

zaidoon1 · 2026-04-30T03:11:15Z

/claude-review

Side question: can I run this command to start the review, or does it only work if a member of the RocksDB team does it?

github-actions · 2026-04-30T03:22:14Z

✅ Claude Code Review

Requested by @zaidoon1

Code Review: Extract IndexFactory as Unified Pluggable Index Abstraction

35 files changed, 3205 insertions, 1409 deletions | Diff truncated — review covers visible portions.

Critical Findings

C1. Meta Block Prefix Change Breaks Cross-Version SST Compatibility

Severity: Critical

The prefix changes from "rocksdb.user_defined_index." to "rocksdb.index_factory.". The writer uses the new prefix, the reader searches for the new prefix. SSTs written with the old prefix won't be found by new readers, and vice versa. This breaks rolling upgrades.

Fix: Support both prefixes during lookup in block_based_table_reader.cc:

s = FindMetaBlock(meta_iter, kIndexFactoryMetaPrefix + udi_name, &handle);
if (!s.ok()) {
  s = FindMetaBlock(meta_iter, kUserDefinedIndexPrefix + udi_name, &handle);
}
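
The fallback lookup suggested above can be tried out in isolation. The sketch below is a self-contained model, not RocksDB code: the metaindex block is replaced by a `std::map`, a block handle by a plain offset, and `FindCustomIndexBlock` is a hypothetical name. The two prefix strings are the ones quoted in the review.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Prefixes quoted in the review.
constexpr const char* kIndexFactoryMetaPrefix = "rocksdb.index_factory.";
constexpr const char* kUserDefinedIndexPrefix = "rocksdb.user_defined_index.";

// Hypothetical stand-in for the metaindex block: block name -> block offset.
using MetaIndex = std::map<std::string, std::uint64_t>;

// Looks for the custom index block under the new prefix first, then under
// the old one, and returns the first match.
inline std::optional<std::uint64_t> FindCustomIndexBlock(
    const MetaIndex& meta, const std::string& index_name) {
  for (const char* prefix :
       {kIndexFactoryMetaPrefix, kUserDefinedIndexPrefix}) {
    auto it = meta.find(std::string(prefix) + index_name);
    if (it != meta.end()) {
      return it->second;
    }
  }
  return std::nullopt;
}
```

Note that a reader-side fallback only helps new binaries read old files. Old binaries would still not find blocks written under the new prefix, which is why keeping the old prefix was listed as the alternative fix.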

High Severity Findings

H1. Verify All UserDefinedIndexFactory Implementations Are Updated

Files using old API (table_index_factory, old NewBuilder()/NewReader() signatures) include utilities/trie_index/trie_index_factory.h (TrieIndexFactory), trie_index_db_test.cc (75+ refs), trie_index_test.cc (7+ refs), table/table_test.cc (6+ refs), tools/db_bench_tool.cc. The diff is truncated — verify these are all updated.

H2. IndexFactoryOptions Default Comparator Changed to nullptr

Old UserDefinedIndexOption::comparator defaulted to BytewiseComparator(). New IndexFactoryOptions::comparator defaults to nullptr. External code relying on the default will crash.
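
One defensive option, sketched here under assumptions, is to resolve a null comparator back to the bytewise default at the point of use. The `Comparator` type and `BytewiseComparator()` below are simplified local stand-ins for the RocksDB ones, and `EffectiveComparator` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for rocksdb::Comparator.
struct Comparator {
  virtual ~Comparator() = default;
  virtual int Compare(const std::string& a, const std::string& b) const = 0;
};

// Simplified stand-in for the bytewise comparator.
struct BytewiseComparatorImpl : Comparator {
  int Compare(const std::string& a, const std::string& b) const override {
    return a.compare(b);
  }
};

inline const Comparator* BytewiseComparator() {
  static const BytewiseComparatorImpl kInstance{};
  return &kInstance;
}

// Mirrors the new default described in the finding: comparator is nullptr.
struct IndexFactoryOptions {
  const Comparator* comparator = nullptr;
};

// Hypothetical helper: never hands a null comparator to downstream code.
inline const Comparator* EffectiveComparator(const IndexFactoryOptions& opts) {
  return opts.comparator != nullptr ? opts.comparator : BytewiseComparator();
}
```

Restoring the old default in the struct itself would achieve the same effect with less code at each call site.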

H3. Loss of Factory Name Validation in ReadOptions

Old table_index_factory pointer allowed name validation (name_ == read_options.table_index_factory->Name()). New read_index = kCustom enum has no factory reference — no validation that the requested index type matches the SST.

H4. kPrimaryOnly Has No Recovery Path for Corrupted Custom Index

No built-in index fallback. Corrupted custom meta block → zero keys readable. Backup/restore without factory configured → unreadable DB.

Medium Severity

M1: Parallel compression null guards for index_builder are defensive but unreachable (custom indexes disable parallel compression). Prefer assert over silent null checks.
M4: Stress test doesn't exercise ReadOptions::read_index = kBuiltin override.

Suggestions

S1: Consider splitting into smaller PRs (API, built-in wrappers, IndexMode, cleanup).
S2: Add LIKELY/UNLIKELY hints to ForwardAddIndexEntryToAll() fast path.
S3: Verify db_bench is updated for new options.
S4: Add a release note for the API changes.

Full report written to review-findings.md. To the reviewer's side question: yes, anyone can trigger this review — it runs against the diff provided in the prompt.

zaidoon1 · 2026-04-30T03:43:34Z

@xingbowang so the main points from the AI review are about backward compatibility. Given that this API is experimental, do we need to worry about it? Not sure if you have internal use cases for it right now.

xingbowang · 2026-04-30T12:02:10Z

> Side question: can I run this command to start the review, or does it only work if a member of the RocksDB team does it?

I added you to the allow list, so it works for you as well.
Normally the review is kicked off automatically, but a bug when you submitted the diff prevented that, so I kicked it off manually by leaving a comment.

xingbowang · 2026-04-30T12:02:57Z

We do have internal usage, but since it is mostly deployed as a single binary rather than loaded dynamically, we should be able to fix that.

xingbowang · 2026-04-30T12:43:20Z

+inline constexpr const char* kIndexFactoryMetaPrefix = "rocksdb.index_factory.";
+
+// ============================================================================
+// IndexFactory: pluggable index for BlockBasedTable SST files.


Could we add an explicit comment that this API is intentionally asymmetric between build and read?

Right now IndexFactory reads as if built-in and custom indexes both go through the same abstraction on both sides. But the actual design is narrower: write/build is unified through IndexFactory, while built-in reads still use the richer internal BlockBasedTable::IndexReader path, and custom reads are adapted into that path via IndexFactoryReaderWrapper.

That seems like the right design for now, but it is not obvious from the header, and readers could easily assume builtin NewReader() returning NotSupported is an incomplete refactor rather than an intentional boundary. A short note near IndexFactoryReader / IndexFactory::NewReader() would make the design intent much clearer.

If you want suggested code-comment text, I'd use:

// NOTE: The IndexFactory API is intentionally asymmetric.
// Built-in and custom indexes share the factory abstraction for SST
// construction, but built-in index reads continue to use the internal
// BlockBasedTable::IndexReader path. That internal reader contract carries
// table-local behaviors such as cache/prefetch/pinning and iterator reuse
// that are not part of this public SPI. Custom IndexFactoryReader
// implementations are adapted to the internal reader contract via
// IndexFactoryReaderWrapper.

xingbowang · 2026-04-30T12:49:52Z

+  // for any mode other than kBuiltinOnly.
+  //
+  //   kBuiltinOnly (default):
+  //     Only the built-in binary search index is used.


"binary search index" is not accurate, it could be other built in types.

xingbowang · 2026-04-30T12:50:14Z

+  //   - Partitioned index (kTwoLevelIndexSearch) in kPrimary/kPrimaryOnly
+  //   - Partitioned filters in kPrimary/kPrimaryOnly
+  //   - Parallel compression in any mode that uses a custom index
+  enum class IndexMode {


BlockBasedTableOptions::IndexMode still feels hard to read in its current form. The main issue is the value names: kSecondary / kPrimary make the reader stop and decode what is actually primary here. In database terminology, "primary index" and "secondary index" usually refer to different logical indexes over the data, so those names already carry a strong meaning for users. Here, though, the enum is not modeling multiple user-visible indexes in that sense; it is controlling which SST index implementations are built and which one is the default on reads. kPrimaryOnly also changes the physical SST layout, not just index priority, which makes the naming even less obvious. kBuiltinOnly is also a bit misleading, since the "built-in" side is not always binary search; index_type can still make it hash or partitioned.

I think this would be easier to understand if we keep the single enum, but rename the values to describe the semantics directly, e.g.:

enum class IndexMode {
  kStandardOnly,
  kStandardDefault,
  kCustomDefault,
  kCustomOnly,
};

That maps cleanly to the current behavior:

kStandardOnly: only the standard index is built/used

kStandardDefault: both indexes are built, standard is the default

kCustomDefault: both indexes are built, custom is the default

kCustomOnly: only the custom index is built

This keeps the API simple, preserves the rollout model, and makes the mode names self-describing without forcing readers to translate "primary/secondary" into actual behavior.
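
The mapping above can also be written as three small predicates, which is one way to check that the proposed names really are self-describing. This is a sketch under the semantics listed in this comment. The helper names are hypothetical, and treating `kCustomOnly` as "custom is the default" is an inference from it being the only index built.

```cpp
#include <cassert>

// Proposed enum values from this comment.
enum class IndexMode {
  kStandardOnly,
  kStandardDefault,
  kCustomDefault,
  kCustomOnly,
};

// Hypothetical helpers: which indexes get written into the SST.
constexpr bool BuildsStandardIndex(IndexMode m) {
  return m != IndexMode::kCustomOnly;
}
constexpr bool BuildsCustomIndex(IndexMode m) {
  return m != IndexMode::kStandardOnly;
}

// Hypothetical helper: which index serves reads when nothing overrides it.
constexpr bool CustomIsDefault(IndexMode m) {
  return m == IndexMode::kCustomDefault || m == IndexMode::kCustomOnly;
}
```

Each predicate can be read straight off the value name, which was not the case with kSecondary and kPrimary.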

xingbowang · 2026-04-30T12:54:49Z

Since the refactor touches the existing internal indexes, could you run some benchmarks to confirm that it does not introduce a performance regression? Essentially, run read and flush benchmarks without UDI and make sure no regression is observed. Please share some numbers.

xingbowang · 2026-04-30T12:56:52Z

+  //   kCustom: force the custom IndexFactory index for this read.
+  //     In kSecondary mode, this is how you select the custom index
+  //     for individual reads without changing index_mode.
+  enum class ReadIndex : uint8_t {


It may be worth adding one explicit sentence here that ReadIndex is intentionally a two-way selector because BlockBasedTable currently supports exactly two index read targets per table: one standard index selected by BlockBasedTableOptions::index_type, and at most one custom index from user_defined_index_factory. Right now that constraint is mostly implied by the API shape, but not stated clearly, so a reader could reasonably wonder why this is a fixed enum (kBuiltin / kCustom) instead of something more general like an index ID/name.

Also, the wording "built-in binary search index" is a bit misleading here, since the standard index path can still be hash or partitioned depending on index_type. If we keep the current naming, I think "standard index" would be more accurate than "built-in binary search index."

xingbowang · 2026-04-30T13:05:27Z

+  enum class ReadIndex : uint8_t {
+    kDefault = 0,
+    kBuiltin = 1,
+    kCustom = 2,


Related to the semantics here: kCustom reads more like a strict selection ("must use the custom index for this read"), but the implementation behaves more like a best-effort preference. If no custom index is available for a table/file, reads fall back to the standard index instead of treating this as an invalid request. That seems like a reasonable migration/compatibility behavior, but the current name does not make it obvious.

If that fallback is intentional, would it make sense to rename kCustom to something like kPreferCustom? That would better match the actual contract and reduce the chance that readers interpret this as a strict selector.
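
The best-effort behavior described above can be modeled in a few lines. This is a sketch of the contract as this comment describes it, not the PR's implementation. `Target` and `ResolveReadTarget` are hypothetical names, and the case of a table that has no standard index at all is deliberately not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Values as shown in the diff hunk above.
enum class ReadIndex : std::uint8_t { kDefault = 0, kBuiltin = 1, kCustom = 2 };

// Hypothetical: the two index read targets a table can offer.
enum class Target { kStandard, kCustom };

// mode_default: the target the table's IndexMode selects when the read does
// not override it. has_custom_index: whether this SST carries a custom index.
inline Target ResolveReadTarget(ReadIndex requested, Target mode_default,
                                bool has_custom_index) {
  Target wanted = mode_default;
  if (requested == ReadIndex::kBuiltin) {
    wanted = Target::kStandard;
  } else if (requested == ReadIndex::kCustom) {
    wanted = Target::kCustom;
  }
  // Best-effort preference: fall back instead of failing the read.
  if (wanted == Target::kCustom && !has_custom_index) {
    return Target::kStandard;
  }
  return wanted;
}
```

Under a strict reading of kCustom, the fallback branch would return an error instead, which is the distinction a name like kPreferCustom would make visible.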

xingbowang · 2026-04-30T13:15:38Z

+// Fault injection note: the custom index meta block is vulnerable to
+// metadata write fault injection (metadata_write_fault_one_in). If the
+// meta block is corrupted, kPrimaryOnly has no fallback index and the
+// compaction iterator reads zero keys from the affected SST. This is
+// expected behavior — the standard binary search index (in kPrimary and
+// below) is part of the SST's main block layout and is not affected by
+// metadata write faults, providing a natural fallback. The stress tool
+// disables compaction_verify_record_count for kPrimary/kPrimaryOnly
+// when write fault injection is active. Without fault injection, all
+// modes pass the compaction record count check correctly.


I don't think we need this detail here. The one in stress test is good enough.

xingbowang · 2026-04-30T13:19:20Z

+//
+// Thread safety: all methods except EstimatedSize() are called from a
+// single thread (the emit thread in BlockBasedTableBuilder). Parallel
+// compression is not supported for custom IndexFactory implementations.


Given the new API on this interface, the lack of parallel compression support is just a limitation of specific UDI implementations, right? If a new UDI implements the relevant parts of the interface, it could be supported by parallel compression.

…ntation

Address feedback from xingbowang on PR facebook#14602:

1. Rename IndexMode enum values to be self-describing:
   kBuiltinOnly -> kStandardOnly
   kSecondary -> kStandardDefault
   kPrimary -> kCustomDefault
   kPrimaryOnly -> kCustomOnly
   The new names describe behavior directly (which index is built, which is the default) instead of using primary/secondary terminology that conflicts with database index semantics.
2. Replace 'binary search index' with 'standard index' in all comments. The built-in index is not always binary search; it can be hash or partitioned depending on BlockBasedTableOptions::index_type.
3. Add asymmetric design note to IndexFactory header explaining that the build path is unified through IndexFactory while the read path uses the internal BlockBasedTable::IndexReader for built-ins and IndexFactoryReaderWrapper as an adapter for custom implementations.
4. Add ReadIndex rationale comment explaining why it is a fixed two-way enum (exactly two read targets per SST: standard + custom).
5. Document kCustom fallback behavior: when no custom index is available for a given SST, reads fall back to the standard index.
6. Remove fault injection implementation detail from the public index_factory.h header (kept in db_stress_test_base.cc where it belongs).
7. Clarify that parallel compression support is per-implementation: custom IndexFactory implementations can support it by overriding SupportsParallelAddEntry/PrepareAddEntry/FinishAddEntry.

Wrap long lines exceeding 80-column limit introduced by the longer IndexMode enum names (kStandardOnly/kStandardDefault/kCustomDefault/ kCustomOnly). No functional change.

zaidoon1 · 2026-05-01T08:08:54Z

> Since the refactor touches the existing internal indexes, could you run some benchmarks to confirm that it does not introduce a performance regression? Essentially, run read and flush benchmarks without UDI and make sure no regression is observed. Please share some numbers.

I've addressed all the feedback, working on the benchmarks now.

zaidoon1 · 2026-05-01T23:31:01Z

Benchmark Results: No Regression Observed

Ran db_bench comparing PR branch (a4b25e7a4) vs merge-base (bad2d5b0a) without UDI (default options = kStandardOnly mode).

Setup

10M keys, 16B key, 100B value, Snappy compression, 1 thread
Apple M-series, 14 cores
Fresh DB per run, release build (DEBUG_LEVEL=0)
Both binaries built with identical flags

Write workloads (flush path — most affected by this refactor)

Benchmark	N	Baseline (ops/s)	PR (ops/s)	Δ median
fillseq	3	641,557	642,000	+0.07%
fillrandom	3	372,002	368,696	-0.89%
overwrite	3	369,388	373,713	+1.17%

Read workloads

Two DB states tested for reads:

Post-overwrite (LSM has L0 churn):

Benchmark	N	Baseline (ops/s)	PR (ops/s)	Δ median
readrandom	3	121,017	123,444	+2.01%
readseq	3	8,497,981	8,357,360	-1.65%

Post-compact (clean LSM):

Benchmark	N	Baseline (ops/s)	PR (ops/s)	Δ median
readrandom	5	372,883	368,647	-1.14%
readseq	5	8,999,199	8,943,983	-0.61%

Conclusion

All deltas (using median, robust to outliers) are within roughly ±2% (the largest is +2.01% for post-overwrite readrandom), which is inside benchmark noise: per-benchmark stdev was 1–5% on this hardware. No regression observed in either flush or read paths when UDI is not configured.
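
For anyone re-deriving the Δ column from their own runs, the arithmetic is a median over the per-run throughputs followed by a relative difference. The sketch below uses hypothetical helper names and assumes the table columns are already per-benchmark medians in ops/s.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a non-empty sample. Takes the vector by value so the caller's
// data is not reordered.
inline double Median(std::vector<double> v) {
  assert(!v.empty());
  std::sort(v.begin(), v.end());
  const std::size_t n = v.size();
  return n % 2 == 1 ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

// Relative change of candidate versus baseline, in percent.
inline double PercentDelta(double baseline, double candidate) {
  return (candidate - baseline) / baseline * 100.0;
}
```

Applied to the fillseq row, `PercentDelta(641557, 642000)` comes out at about 0.07, matching the reported +0.07%.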

This matches expectations: the refactor's hot path for kStandardOnly mode goes through BuiltinIndexFactoryBuilder::AddIndexEntryDirect() which forwards directly to the same internal IndexBuilder as before, with no extra abstraction overhead. Read paths for the standard built-in index continue to use the existing BlockBasedTable::IndexReader path unchanged.

zaidoon1 · 2026-05-01T23:33:52Z

Reproducing the Benchmarks

Build (both versions)

# Baseline (merge-base of this PR with upstream/main)
git checkout bad2d5b0a
make clean
DEBUG_LEVEL=0 make -j$(nproc) db_bench
mv db_bench /tmp/db_bench_baseline

# PR branch
git checkout a4b25e7a4
make clean
DEBUG_LEVEL=0 make -j$(nproc) db_bench
mv db_bench /tmp/db_bench_pr

fillseq (write-only flush benchmark)

$binary \
  --benchmarks=fillseq \
  --db=/tmp/bench_db \
  --num=10000000 \
  --value_size=100 \
  --key_size=16 \
  --seed=42 \
  --threads=1 \
  --use_existing_db=0

Combined: write + compact + read benchmarks

$binary \
  --benchmarks="fillrandom,compact,overwrite,readrandom,readseq" \
  --db=/tmp/bench_db \
  --num=10000000 \
  --reads=1000000 \
  --value_size=100 \
  --key_size=16 \
  --seed=42 \
  --threads=1 \
  --use_existing_db=0

This benchmark sequence:

fillrandom — write 10M random keys (exercises memtable + flush path)
compact — full manual compaction (settles LSM into clean state)
overwrite — overwrite all 10M keys (exercises flush + L0 churn)
readrandom — 1M random point lookups (exercises index seek path)
readseq — sequential scan of 1M keys (exercises iterator)

The readrandom/readseq numbers in the "post-overwrite (LSM has L0 churn)" table come from this combined run.

Focused read benchmark (post-compact, no overwrite)

$binary \
  --benchmarks="fillrandom,compact,readrandom,readseq" \
  --db=/tmp/bench_db \
  --num=10000000 \
  --reads=1000000 \
  --value_size=100 \
  --key_size=16 \
  --seed=42 \
  --threads=1 \
  --use_existing_db=0

The readrandom/readseq numbers in the "post-compact (clean LSM)" table come from this run, with iteration order alternated between baseline/PR to reduce systematic bias from cold-cache / system warm-up effects.

Iterations

3 iterations for write workloads (fillseq, fillrandom, overwrite)
3 iterations for post-overwrite reads (from combined run)
5 iterations for post-compact reads (focused run, alternating order)

DB path was wiped (rm -rf) between every run to ensure a fresh starting state.

meta-cla Bot added the CLA Signed label Apr 11, 2026

zaidoon1 force-pushed the zaidoon/UDI-refactor branch 15 times, most recently from a826b42 to 72a081c (April 12, 2026 00:43)

xingbowang reviewed Apr 30, 2026

zaidoon1 added 3 commits May 1, 2026 04:07

Apply clang-format to IndexMode rename changes

9132665

zaidoon1 force-pushed the zaidoon/UDI-refactor branch from a4b25e7 to 9132665 (May 1, 2026 08:08)
