Make RocksDB memory config configurable (block cache + WriteBufferManager) #780

Merged
kriszyp merged 4 commits into main from feat/rocksdb-block-cache-configurable on May 26, 2026

@kriszyp kriszyp commented May 25, 2026

Exposes RocksDB memory tuning knobs as Harper config params so deployments can override the hardcoded defaults. Adds four new params, all under storage.rocks.*:

| Param | What it controls |
| --- | --- |
| storage.rocks.blockCacheSize | Bytes for the shared block cache. Defaults to 25% of constrained memory. |
| storage.rocks.writeBufferManagerSize | Bytes the process-wide WriteBufferManager caps total memtable memory at (0 = disabled, the default). Structural cap on memtables + OptimisticTransactionDB maintain-window history. |
| storage.rocks.writeBufferManagerCostToCache | Charge memtable bytes against the block cache so both subsystems share accounting. Visible via a single rocksdb.block-cache-usage metric. |
| storage.rocks.writeBufferManagerAllowStall | Stall writes once the WBM cap is hit (hard cap) vs. allow brief overshoot with aggressive flushing (soft cap). |
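
For reference, the four params can be summarized as a TypeScript shape. This is only a sketch: the interface name `RocksMemoryConfig`, the `exampleConfig` object, and the example values are hypothetical, and the nesting simply mirrors the `storage.rocks.*` key path.

```typescript
// Hypothetical shape mirroring the storage.rocks.* params described above.
interface RocksMemoryConfig {
  /** Bytes for the shared block cache; unset means 25% of constrained memory. */
  blockCacheSize?: number;
  /** Process-wide cap on memtable bytes; 0 (the default) disables the WBM. */
  writeBufferManagerSize?: number;
  /** Charge memtable bytes against the block cache. */
  writeBufferManagerCostToCache?: boolean;
  /** true = stall writes at the cap (hard cap); false = allow brief overshoot (soft cap). */
  writeBufferManagerAllowStall?: boolean;
}

// Example values are illustrative only, not recommendations.
const exampleConfig: { storage: { rocks: RocksMemoryConfig } } = {
  storage: {
    rocks: {
      blockCacheSize: 512 * 1024 * 1024, // 512 MiB
      writeBufferManagerSize: 256 * 1024 * 1024, // 256 MiB
      writeBufferManagerCostToCache: true,
      writeBufferManagerAllowStall: false,
    },
  },
};
```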

Why: On shared Fabric hosts, each container's RocksDB memory footprint accumulates. For a canopy-style customer with 14 tables, the per-CF maintain window alone holds ~1.8 GB of anonymous memory by default. Today there's no way to bound this without rebuilding harper-pro. These params let host-manager (and other operators) configure both subsystems without code changes.

Implementation: Both reads happen inside openRocksDatabase() so env/CLI overrides applied after module load are respected. Values are passed to RocksDatabase.config() via conditional spread — unset env vars don't surface unknown options to older bindings of @harperfast/rocksdb-js (forward-compatibility).
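
The conditional-spread idea can be sketched roughly as follows. The `RocksOptions` type and `buildRocksOptions` helper are hypothetical names for illustration; the actual `RocksDatabase.config()` signature is not shown in this PR.

```typescript
// Hypothetical option bag; only keys that are actually set should appear on it.
type RocksOptions = {
  blockCacheSize?: number;
  writeBufferManagerSize?: number;
};

function buildRocksOptions(
  blockCacheSize: number | undefined,
  writeBufferManagerSize: number | undefined,
): RocksOptions {
  return {
    // Spreading an empty object adds no key, so an unset value is absent
    // from the result rather than present with the value `undefined`.
    ...(blockCacheSize !== undefined ? { blockCacheSize } : {}),
    ...(writeBufferManagerSize !== undefined ? { writeBufferManagerSize } : {}),
  };
}

// Only blockCacheSize is set here, so the WBM key never reaches the binding.
const opts = buildRocksOptions(1024, undefined);
```

The point of the pattern is that a binding which does not know about an option never sees that option's key at all.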

Dependency: The WBM options are no-ops until HarperFast/rocksdb-js#584 lands and harper-pro updates its dependency. The blockCacheSize param works today.

Cross-model review findings addressed:

  • Gemini (first round): module-load timing — moved reads from module-level constants into openRocksDatabase() so config changes after module load are picked up.
  • Codex: same timing concern — same fix.
  • Input validation: > 0 guard on blockCacheSize handles 0/negative/NaN.
  • Gemini (WBM additions): string coercion handled by Harper's existing envGet (parses 'true'/'false' → boolean, '123' → number).
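
The module-load timing concern in the first two findings can be illustrated with a small sketch. Everything here is a stand-in: `configStore`, `fakeEnvGet`, `openDatabaseSketch`, and the key name are hypothetical and are not Harper's actual API.

```typescript
// Hypothetical in-memory stand-in for a config lookup.
const configStore = new Map<string, unknown>();
function fakeEnvGet(key: string): unknown {
  return configStore.get(key);
}

const KEY = 'storage_rocks_blockCacheSize'; // hypothetical key name

// Eager read: captured once, when this module is evaluated.
const eagerBlockCacheSize = fakeEnvGet(KEY);

// Lazy read: looked up each time the database is opened.
function openDatabaseSketch(): { blockCacheSize: unknown } {
  return { blockCacheSize: fakeEnvGet(KEY) };
}

// An override applied after module load is missed by the eager constant
// but picked up by the lazy read.
configStore.set(KEY, 1024);
const opened = openDatabaseSketch();
```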

Generated by Claude. — Claude Sonnet 4.7

…heSize

Adds STORAGE_ROCKS_BLOCKCACHESIZE config param (env var or YAML). When set,
it overrides the default 25%-of-constrained-memory heuristic. Reading is done
lazily inside openRocksDatabase() so env/CLI overrides applied after module
load are always respected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp kriszyp requested a review from cb1kenobi May 25, 2026 23:22

claude Bot commented May 25, 2026

Reviewed; no blockers found.

Comment thread resources/databases.ts
Adds three new config params for the process-wide RocksDB
WriteBufferManager exposed by @harperfast/rocksdb-js#584:

- storage.rocks.writeBufferManagerSize — bytes the WBM caps across
  all DBs in the process (0 disables). The structural cap on total
  memtable + maintain-window memory.
- storage.rocks.writeBufferManagerCostToCache — when true, memtable
  charges show up in block-cache-usage as pinned entries (single
  observability metric for read cache + memtable footprint).
- storage.rocks.writeBufferManagerAllowStall — when true, writes
  stall once the cap is reached instead of allowing memtables to
  briefly exceed it.

Values are read inside openRocksDatabase() and passed via spread,
so unset values don't surface unknown options to older bindings of
rocksdb-js where the WBM isn't yet wired up. This makes the change
forward-compatible: it's a no-op until the env var is set AND the
underlying binding supports it.

Co-Authored-By: Claude Sonnet 4.7 <noreply@anthropic.com>
@kriszyp kriszyp changed the title from "Make RocksDB block cache size configurable" to "Make RocksDB memory config configurable (block cache + WriteBufferManager)" on May 26, 2026
kriszyp and others added 2 commits May 26, 2026 05:48
envGet may return raw strings from process.env (configUtils doesn't
cast every code path), so RocksDatabase.config previously could
receive "12345" instead of 12345 for the size params and "false"
(truthy in JS) instead of false for the booleans.

- Numeric params (blockCacheSize, writeBufferManagerSize): wrap with
  Number(). Falsy results (NaN, 0) fall through the > 0 guard.
- Boolean params (writeBufferManagerCostToCache, allowStall): inline
  toBool() that recognizes booleans, 'true'/'false' strings, and
  generic truthy/falsy fallback. Returns undefined when unset so the
  spread skips the property.
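
A minimal sketch of the coercion this commit message describes, written from the description rather than from the actual source. The helper name `toPositiveNumber` is hypothetical, and treating `null` as unset is an assumption.

```typescript
// Sketch of the boolean coercion described above (not the actual source).
function toBool(value: unknown): boolean | undefined {
  // Assumption: null is treated the same as unset.
  if (value === undefined || value === null) return undefined;
  if (typeof value === 'boolean') return value;
  if (value === 'true') return true;
  if (value === 'false') return false;
  return Boolean(value); // generic truthy/falsy fallback
}

// Hypothetical helper: Number() coercion followed by the > 0 guard.
function toPositiveNumber(value: unknown): number | undefined {
  const n = Number(value);
  return n > 0 ? n : undefined; // NaN, 0, and negatives all fail the guard
}
```

Under this scheme the string "false" becomes `false` and the string "12345" becomes the number 12345, which is exactly the silent rescue that the strict-check commit later in this PR removes.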

Per PR review from @cb1kenobi (PR #780, line 169).

Co-Authored-By: Claude Sonnet 4.7 <noreply@anthropic.com>
Reverts 064501b's Number()/toBool() coercion in favor of strict
type checks. Config values flow through configUtils.castConfigValue
which produces proper numbers/booleans from YAML, env vars, and CLI
args — anything else arriving here is misconfiguration that should
fall through to the default rather than be silently rescued.

- blockCacheSize / writeBufferManagerSize: only honored when
  `typeof === 'number' && value > 0`. Non-numbers (including
  unparseable strings) fall back to the 25% default for the cache
  and disable the WBM.
- writeBufferManagerCostToCache / allowStall: only honored when
  `typeof === 'boolean'`. The 'false' string is no longer accepted
  — it's a YAML-quoting mistake the operator should fix.
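
The strict checks described above might look roughly like this. The helper names and the `constrainedMemoryBytes` parameter are hypothetical; only the `typeof` conditions and the 25% default come from the commit message.

```typescript
// Honor a size only when it is already a positive number.
function strictPositiveNumber(value: unknown): number | undefined {
  return typeof value === 'number' && value > 0 ? value : undefined;
}

// Honor a flag only when it is already a boolean.
function strictBoolean(value: unknown): boolean | undefined {
  return typeof value === 'boolean' ? value : undefined;
}

// Hypothetical helper: anything rejected falls back to 25% of constrained memory.
function resolveBlockCacheSize(configured: unknown, constrainedMemoryBytes: number): number {
  return strictPositiveNumber(configured) ?? Math.floor(constrainedMemoryBytes * 0.25);
}

const fourGiB = 4 * 1024 * 1024 * 1024;
// A quoted string is treated as misconfiguration, so the default applies.
const fromString = resolveBlockCacheSize('12345', fourGiB);
// A real number is honored as-is.
const fromNumber = resolveBlockCacheSize(12345, fourGiB);
```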

Per feedback on PR #780 review thread.

Co-Authored-By: Claude Sonnet 4.7 <noreply@anthropic.com>
@kriszyp kriszyp merged commit 46109e6 into main May 26, 2026
4 of 5 checks passed
kriszyp added a commit that referenced this pull request May 26, 2026
@kriszyp kriszyp deleted the feat/rocksdb-block-cache-configurable branch May 26, 2026 20:32