test(kitchen-sink): add sqlite memory soak harness by NathanFlurry · Pull Request #4919 · rivet-dev/rivet

NathanFlurry · 2026-05-04T00:46:11Z

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

NathanFlurry · 2026-05-04T00:47:00Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

fix(rivetkit-core): fail closed sqlite during shutdown #4921
docs(rivetkit): raise serverless start payload limit #4920
test(kitchen-sink): add sqlite memory soak harness #4919 👈 (View in Graphite)
feat(rivetkit): expose runtime diagnostics #4918
chore(depot-client): migrate to worker thread #4907 : 2 other dependent PRs (#4909 , #4912 )
test(driver): select matrix cells with vitest filters #4906
fix(pegboard): restore hibernating requests on serverful start #4904 : 1 other dependent PR (#4905 )
fix(rivetkit): drain shutdown work before sleep #4902 : 1 other dependent PR (#4903 )
feat(sqlite): wire remote sqlite execute handlers #4897
test(serverless): avoid pinning header validation order #4896
chore(driver-test-runner): add native/wasm runtime matrix to skill #4884 : 3 other dependent PRs (#4892 , #4894 , #4901 )
chroe: rename rivetkit-sqlite to depot-client #4883
fix(rivetkit-wasm): fix mem leaks #4880
chore(rivetkit): wasm support #4860
feat(sqlite): add cold read benchmarks and simplify optimizations #4857
feat(sqlite): benchmark cold reads #4848 : 1 other dependent PR (#4864 )
docs(actors): document manual lifecycle controls #4847
test(sqlite): expand db stress coverage #4846 : 1 other dependent PR (#4849 )
test(raw-websocket): add serverless smoke coverage #4845
feat(serverless): configure drain grace period #4844
fix(publish): enable frontend build for engine target and use turbo build instead of turbo build:engine #4843
feat(rivetkit): expose conn handle in onWebSocket context #4842
fix(rivetkit): normalize waitUntil/keepAwake promises to null to avoid serde undefined error #4841
fix(rivetkit): remove requirement for token #4839 : 1 other dependent PR (#4840 )
fix(pegboard-envoy): allow sqlite requests across lifecycle states #4838
fix(pegboard): route scoped actor key reads by runner dc #4837
fix(sqlite-storage): let sqlite open take over stale owners #4834
fix(engine-tests): parallelize multi-DC startup + enable 6 multi-DC tests #4833
fix(api): honor cursor in actor_ids list path #4832
fix(api): silently filter invalid IDs from actor_ids query instead of 400 #4831
fix(api-peer): make actors_delete idempotent for already-destroyed actors #4830
fix(pegboard): use Iterator streaming mode in list_names so limit param is honored #4829
chore(engine-tests): convert envoy tests to Sleep crash policy + triage ignored #4828
feat(engine-tests): wire envoy actor names through prepopulate_actor_names #4827
chore(depot): fault injection tests #4868
feat(sqlite): pitr & forking #4859 : 1 other dependent PR (#4863 )
chore(sqlite): stateless storage refactor #4856
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

claude · 2026-05-04T00:53:08Z

Code Review: SQLite Memory Soak Harness

Overview

This PR adds a comprehensive memory soak testing harness for SQLite-backed Rivet Actors, including:

A new sqliteMemoryPressure test actor (sqlite-memory-pressure.ts)
A standalone harness script (sqlite-memory-soak.ts) that spawns a local engine + kitchen-sink server and drives actors through configurable workloads
A metrics report renderer (proc-metrics-report.ts) that produces HTML charts from collected data
An agent note documenting current known issues from runs so far

The tooling is sophisticated and clearly the result of real investigation work. The notes doc is especially useful as it captures concrete run results and open questions.

Issues and Suggestions

`sqlite-memory-pressure.ts`

Large commented-out code blocks

Several logical paths are commented out and left in the file:

// const DEFAULT_DELETE_ROWS = 64;
// const DEFAULT_RETAIN_ROWS = 1024;
// const deleteRows = finiteInt(input.deleteRows, DEFAULT_DELETE_ROWS);
// const retainRows = Math.max(...);

And the entire delete/VACUUM path in runCycle and releaseStorage. If these are intentionally disabled for the current soak focus (as noted in the agent doc), either remove them or add a brief // intentionally disabled: see .agent/notes/sqlite-memory-soak-issues.md so readers understand the state.

onSleep crash risk on failed startup (Issue 4 from agent notes)

onSleep: (c) => {
    c.state.sleepCount += 1;  // crashes if state was never initialized

The notes already flag this as Issue 4. Given the investigation found Cannot read properties of undefined (reading 'sleepCount') in real runs, add a defensive guard:

onSleep: (c) => {
    if (!c.state) return;
    c.state.sleepCount += 1;

copyNativeMetrics dual-name lookup is fragile

const numberField = (camel: string, snake: string) =>
    Number(raw[camel] ?? raw[snake] ?? 0);

This works around a camelCase/snake_case uncertainty in the SqliteNativeMetrics type at runtime. If the type is known at compile time, use the correct field name directly. If the type can come from either runtime, add a comment explaining why both are needed.

queryOne uses rest-spread args on a typed interface

async function queryOne<T>(
    database: { execute: (sql: string, ...args: unknown[]) => Promise<unknown[]> },
    sql: string,
    ...args: unknown[]
): Promise<T>

args is typed unknown[] but the actual c.db.execute signature may not match. Consider using the concrete c.db type directly to get compile-time safety rather than the structural duck-type here.

`sqlite-memory-soak.ts`

Indentation inconsistency in parseArgs

Around the validation block at the end of parseArgs, the indentation switches between 2-space indent inside the for...of and 4-space (\t-equivalent) for the surrounding array elements:

		["--concurrency", args.concurrency],
		[
			"--spike-max-concurrency",
			...
		],
	["--sample-interval-ms", args.sampleIntervalMs],    // ← wrong indent
	["--request-lifespan-seconds", ...],

Similarly at the bottom of main():

		const sleepSummary = summarizeActorSleeps(jsonlPath);
		if (sleepSummary) console.log(sleepSummary);
		console.log(`events=${jsonlPath}`);

These appear to be stray mixed-indent lines that escaped the formatter. Worth cleaning up.

runWithSpikeConcurrency — actor exhaustion is a thrown error, not a run failure

if (nextActor >= args.actors) {
    throw new Error(`ran out of actors before spike duration completed`);
}

This throws after Promise.allSettled(workers) already drained, meaning the run succeeded but the spike window outlasted the actor pool. Throwing here prevents the normal summary from printing. Consider logging a warning instead and returning normally (or emit a run_actor_pool_exhausted event).

spawnActor uses var-style hoisting pattern with let worker; worker = ...

let worker: Promise<void>;
worker = runChurnActorDriver(...)
    .catch(...)
    .finally(() => { workers.delete(worker); });
workers.add(worker);

This is valid TypeScript but worker is referenced inside the finally closure before the outer let is assigned in some engine analyses. Consider using const worker = ...; workers.add(worker); with the .finally moved after .add to make the self-reference explicit.

failed flag is checked but errors array is not always populated

.catch((err) => {
    failed = true;
    errors.push(err);
})

The loop exits on failed, but errors[0] is re-thrown only after Promise.allSettled. If Promise.allSettled is called with an empty workers set (edge case where all actors exhaust immediately), errors.length > 0 check is fine — but if failed is set but errors is empty for any code path, the error is silently dropped. The current code looks correct, but the errors array should be the sole source of truth; failed adds redundancy without benefit — consider removing failed and checking errors.length > 0 in the loop condition.

waitForHttpOk polls with 100ms sleep — fragile for slow cold starts

The timeoutMs default is 20 seconds, which may be too short for a slow release build engine start. Consider making the cold-start timeout configurable or bumping it, especially given the soak notes mention very long actor_ready_timeout errors under load.

configureServerlessRunner leaks engine config details into the test harness

The harness calls PUT /runner-configs/{pool} to wire up the serverless URL. This creates a hard dependency on that particular engine API shape and means the harness won't work if the engine config API changes. Consider adding a comment noting this is an internal/dev-only API path.

CLOCK_TICKS_PER_SECOND = 100 is a Linux assumption

This constant is used to convert /proc/stat CPU ticks, which is correct on most Linux distros but not universal. Add a comment that this is the standard Linux CONFIG_HZ=100 default, or read it from sysconf(_SC_CLK_TCK) via the os module if portability matters.

summarizeCycleVfs reads the entire JSONL file into memory

for (const line of readFileSync(jsonlPath, "utf8").split("\n")) {

For long runs this could be hundreds of MB. A streaming line reader would be more efficient, though for a dev-tooling script this is acceptable in practice.

`proc-metrics-report.ts`

Not reviewed in detail since it's a standalone HTML report renderer, but note that it embeds inline JavaScript for chart rendering — that's fine for a dev tool.

`page-data.ts` / `index.ts`

The wiring is clean. The new sqliteMemoryPressure entry in PAGE_GROUPS, ACTION_TEMPLATES, and SNIPPETS follows the existing pattern exactly.

Summary

This is solid investigation tooling. The main actionable items before merging:

Clean up the mixed-indentation lines in sqlite-memory-soak.ts.
Add the defensive if (!c.state) return guard in onSleep (or document the decision not to, if core guarantees onSleep is never called after a failed onMigrate).
Decide whether the large commented-out delete/VACUUM blocks should be removed or annotated.

Everything else is minor style or low-risk future polish. The agent notes doc is a good artifact to keep — the documented run IDs, error counts, and open questions are directly useful for whoever picks up the correctness investigation next.

test(kitchen-sink): add sqlite memory soak harness

307c362

This was referenced May 4, 2026

feat(rivetkit): expose runtime diagnostics #4918

Closed

docs(rivetkit): raise serverless start payload limit #4920

Closed

fix(rivetkit-core): fail closed sqlite during shutdown #4921

Closed

NathanFlurry closed this May 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(kitchen-sink): add sqlite memory soak harness#4919

test(kitchen-sink): add sqlite memory soak harness#4919
NathanFlurry wants to merge 1 commit intodriver-test-complaints/expose-runtime-diagnostsfrom
driver-test-complaints/sqlite-memory-soak-harness

NathanFlurry commented May 4, 2026

Uh oh!

NathanFlurry commented May 4, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

NathanFlurry commented May 4, 2026

Description

Type of change

How Has This Been Tested?

Checklist:

Uh oh!

NathanFlurry commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claude Bot commented May 4, 2026

Code Review: SQLite Memory Soak Harness

Overview

Issues and Suggestions

sqlite-memory-pressure.ts

sqlite-memory-soak.ts

proc-metrics-report.ts

page-data.ts / index.ts

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

NathanFlurry commented May 4, 2026 •

edited

Loading

`sqlite-memory-pressure.ts`

`sqlite-memory-soak.ts`

`proc-metrics-report.ts`

`page-data.ts` / `index.ts`