chore: sqlite v1 to v2 data migration by NathanFlurry · Pull Request #4692 · rivet-dev/rivet

NathanFlurry · 2026-04-22T09:02:16Z

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

railway-app · 2026-04-22T09:02:24Z

🚅 Deployed to the rivet-pr-4692 environment in rivet-frontend

Service	Status	Web	Updated (UTC)
website	😴 Sleeping (View Logs)	Web	Apr 22, 2026 at 9:14 am
kitchen-sink	❌ Build Failed (View Logs)	Web	Apr 22, 2026 at 9:12 am
frontend-cloud	❌ Build Failed (View Logs)	Web	Apr 22, 2026 at 9:04 am
frontend-inspector	❌ Build Failed (View Logs)	Web	Apr 22, 2026 at 9:04 am
mcp-hub	✅ Success (View Logs)	Web	Apr 22, 2026 at 9:03 am
ladle	❌ Build Failed (View Logs)	Web	Apr 22, 2026 at 9:03 am

NathanFlurry · 2026-04-22T09:02:44Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

claude · 2026-04-22T09:09:39Z

Code Review: SQLite v1 to v2 Data Migration

This is a solid, well-tested migration path. The core logic is correct and the test coverage is comprehensive. A few items to address before merging:

Bug / Correctness

Indentation in prepare_v1_migration breaks the if let block
engine/packages/sqlite-storage/src/takeover.rs

The body of if let Some(existing_meta) = ... is not indented relative to the if-let arm, which is both a formatting violation and risks future misreads:

if let Some(existing_meta) =
    udb::tx_get_value_serializable(&tx, &subspace, &meta_storage_key).await?
{
let existing_head = decode_db_head(&existing_meta)?;   // ← should be one tab deeper
ensure!(
    matches!(existing_head.origin, SqliteOrigin::MigratingFromV1),
    SqliteStorageError::ConcurrentTakeover
);
}

Should be:

if let Some(existing_meta) =
    udb::tx_get_value_serializable(&tx, &subspace, &meta_storage_key).await?
{
    let existing_head = decode_db_head(&existing_meta)?;
    ensure!(
        matches!(existing_head.origin, SqliteOrigin::MigratingFromV1),
        SqliteStorageError::ConcurrentTakeover
    );
}

Performance Concerns

Every actor start scans v1 KV keys even after a successful migration
engine/packages/pegboard-envoy/src/sqlite_runtime.rs — maybe_migrate_v1_to_v2

After migration succeeds, the v1 KV keys are intentionally not deleted. As a result, every subsequent actor start still runs:

sqlite_v1_data_exists → true (KV prefix scan)
Acquire migration lock
sqlite_v1_data_exists again → true (second KV prefix scan)
try_load_head → MigratedFromV1 → early return

This is 2 extra KV round-trips on every warm start for any actor that was ever on v1. If v1 actors are numerous, this adds up at scale. Consider adding a test for idempotent call behavior, and a TODO comment explaining why v1 data is not cleaned up post-migration (or clean it up in a follow-up).

Wrong histogram bucket set for SQLITE_MIGRATION_PAGES
engine/packages/pegboard-envoy/src/metrics.rs

SQLITE_MIGRATION_PAGES reuses the latency-oriented BUCKETS (max 500.0). The migration limit is 128 MB / 4 KB = 32 768 pages, so nearly all migrations will fall into the overflow bucket. A custom exponential bucket set covering 1–40 000 would make this histogram useful:

pub static ref SQLITE_MIGRATION_PAGES: Histogram = register_histogram_with_registry!(
    "pegboard_envoy_sqlite_migration_pages",
    "Number of pages imported during sqlite v1 to v2 migration.",
    vec![1.0, 4.0, 16.0, 64.0, 256.0, 1024.0, 4096.0, 16384.0, 32768.0],
    *REGISTRY
).unwrap();

CLAUDE.md Convention Violations

eprintln! in test functions
rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs — profile_large_tx_insert_5mb, profile_hot_row_updates, profile_large_tx_insert_1mb, profile_large_tx_insert_1mb_preloaded

CLAUDE.md: "Never use eprintln! or println! for logging in Rust code. Always use tracing macros."

These profile tests print timing summaries with eprintln!. Either convert to tracing::info! calls, or mark the eprintln! blocks with a comment explaining they are intentional test-only diagnostic output and use println! (which the cargo test runner captures by default).

anyhow! macro usage (prefer .context() / .ok_or_else(|| ...) with .context())
rivetkit-rust/packages/rivetkit-sqlite/src/database.rs and engine/packages/sqlite-storage/src/udb.rs

CLAUDE.md: "Prefer anyhow's .context() over anyhow! macro."

Affected lines:

// database.rs
.ok_or_else(|| anyhow!("missing sqlite startup data for actor {actor_id}"))?;
.map_err(|e| anyhow!("failed to register sqlite VFS: {e}"))?;
.map_err(|e| anyhow!("failed to open sqlite database: {e}"))

// udb.rs
return Err(anyhow::anyhow!(
    "unknown sqlite-storage value marker {other} for key {:?}", key
));

Minor / Design Notes (no action required)

decode_db_head fallback is silently lenient on parse errors
engine/packages/sqlite-storage/src/types.rs

When the new DBHead (with origin) fails to deserialize, the code silently retries with LegacyDBHead. If the failure is due to data corruption (not just a missing trailing field), LegacyDBHead might still succeed and return silently wrong data. Adding an explicit check that the legacy decode only fires when the error looks like a trailing-field mismatch (e.g. checking byte length) would make this more robust, but the current approach is probably acceptable given BARE's strict length encoding.

split_bytes edge case with empty max_chunk_bytes = 0
engine/packages/pegboard-envoy/src/sqlite_runtime.rs

When bytes.is_empty() || max_chunk_bytes == 0, the function returns vec![bytes.to_vec()] — a single empty chunk. Callers protect against this (max_delta_bytes coming from SqliteMeta is always positive), but an assert!(max_chunk_bytes > 0) guard would make the invariant explicit.

v1 data never explicitly cleaned up post-migration
The design decision to leave v1 KV keys in place after migration appears intentional (as evidenced by skips_native_v2_state_even_if_v1_tombstone_exists). A short comment in maybe_migrate_v1_to_v2 explaining this choice would help future readers.

Test Coverage Assessment

Coverage is excellent:

migrates_v1_sqlite_into_v2_storage — happy path
retries_cleanly_after_stale_partial_v1_import — lease/retry logic
rejects_fresh_in_progress_v1_migrations — in-progress guard
skips_native_v2_state_even_if_v1_tombstone_exists — origin check
bails_when_v2_meta_is_unreadable — corrupt meta handling
recovers_a_pending_v1_journal_before_import — journal rollback
migrates_zero_size_v1_state_without_pages — empty DB
rejects_v1_databases_with_unsupported_page_size — page size guard
rejects_v1_wal_sidecars / rejects_v1_shm_sidecars — WAL/SHM guards
rejects_v1_files_with_missing_chunks — chunk integrity
rejects_v1_files_that_exceed_migration_limit — size guard
reads_v1_files_beyond_the_default_kv_list_limit — pagination

Missing: a test that calls maybe_migrate_v1_to_v2 twice on an already-migrated actor and asserts it returns Ok(false) on the second call (idempotency path through MigratedFromV1).

Summary

The architecture is sound. The key fixes before merging are:

Fix the if let body indentation in prepare_v1_migration (formatting/readability bug)
Replace eprintln! calls in profile tests with tracing::info! or println!
Fix the SQLITE_MIGRATION_PAGES histogram buckets (observability bug)
Replace anyhow! macro usages with .context()/.ok_or_else(...).context(...) per project conventions

railway-app Bot had a problem deploying to rivet-frontend / rivet-pr-4692 April 22, 2026 09:02 Failure

NathanFlurry mentioned this pull request Apr 22, 2026

chore: move rivetkit to task model #4680

Draft

11 tasks

NathanFlurry changed the base branch from 04-19-chore_move_rivetkit_to_task_model to graphite-base/4692 April 22, 2026 09:04

chore: sqlite v1 to v2 data migration

daf7aa8

NathanFlurry force-pushed the 04-21-chore_sqlite_v1_to_v2_data_migration branch from c2fbb00 to daf7aa8 Compare April 22, 2026 09:11

NathanFlurry force-pushed the graphite-base/4692 branch from 710f2df to 5b8e173 Compare April 22, 2026 09:11

railway-app Bot had a problem deploying to rivet-frontend / rivet-pr-4692 April 22, 2026 09:12 Failure

NathanFlurry changed the base branch from graphite-base/4692 to 04-19-chore_move_rivetkit_to_task_model April 22, 2026 09:12

This was referenced Apr 22, 2026

feat: import/export actors #4640

Draft

feat(rivetkit): sqlite vfs v2 #4673

Draft

chore: rivetkit to rust #4679

Draft

chore: fix remaining issues with rivetkit-core #4701

Draft

chore: rivetkit core/napi/typescript follow up review #4702

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: sqlite v1 to v2 data migration#4692

chore: sqlite v1 to v2 data migration#4692
NathanFlurry wants to merge 1 commit into04-19-chore_move_rivetkit_to_task_modelfrom
04-21-chore_sqlite_v1_to_v2_data_migration

NathanFlurry commented Apr 22, 2026

Uh oh!

railway-app Bot commented Apr 22, 2026 •

edited

Loading

Uh oh!

NathanFlurry commented Apr 22, 2026 •

edited

Loading

Uh oh!

claude Bot commented Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

NathanFlurry commented Apr 22, 2026

Description

Type of change

How Has This Been Tested?

Checklist:

Uh oh!

railway-app Bot commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NathanFlurry commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claude Bot commented Apr 22, 2026

Code Review: SQLite v1 to v2 Data Migration

Bug / Correctness

Performance Concerns

CLAUDE.md Convention Violations

Minor / Design Notes (no action required)

Test Coverage Assessment

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

railway-app Bot commented Apr 22, 2026 •

edited

Loading

NathanFlurry commented Apr 22, 2026 •

edited

Loading