82 changes: 82 additions & 0 deletions .agent/specs/sqlite-vfs-staging-cache-ttl.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
# SQLite VFS Staging Cache TTL Plan

Date: 2026-05-03

This plan changes the SQLite VFS page cache from a broad second-level pager cache into a short-lived staging cache for speculative pages. Demand pages fetched for `xRead` should be handed to SQLite and then forgotten by the VFS.

## Goals

- Avoid retaining pages in VFS memory after SQLite has already received them through `xRead`.
- Keep startup preload and read-ahead useful by retaining speculative pages briefly.
- Evict speculative pages on the first successful target read, so the TTL is only a fallback for preloads that SQLite never reads.
- Keep lazy loading correct when all cache and preload features are disabled.
- Treat page 1 as staging data after `xRead` while keeping parsed page-size and database-size metadata.

## Non-Goals

- Do not change the remote `get_pages` protocol.
- Do not change SQLite pager settings.
- Do not add read pools back.
- Do not implement persisted preload hints in this branch.

## Current Behavior

- `resolve_pages` classifies fetched pages as `Target` when SQLite requested them and `Prefetch` when they were predicted.
- `fetch_initial_pages_for_registration` seeds startup pages as `Startup`.
- `should_cache_page` allows target, prefetch, and startup caching based on `SqliteVfsPageCacheMode`.
- Page 1 is always cacheable.
- Early protected pages live in `protected_page_cache`, which is an `scc::HashMap` with no TTL.

## Proposed Behavior

- Target pages should not be inserted into the VFS page cache by default.
- Target reads should evict the speculative pages they consume once the bytes have been copied to the caller.
- Prefetch pages should be inserted into a TTL cache.
- Startup preload pages should be inserted into the same TTL cache.
- Commit completion should stage dirty pages in a separate TTL cache so SQLite can reread its own writes without retaining them permanently.
- Page 1 should follow the same staging rule as other pages after `xRead`. The VFS keeps parsed page-size and database-size metadata, and it can synthesize the empty page-1 header again before the first commit when depot has no database yet.
- The protected cache should no longer pin speculative pages forever. It should be removed or left unused in favor of the TTL cache.
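
The staging rules above can be sketched with a small in-memory model. This is a minimal sketch under stated assumptions, not the VFS implementation: `StagingCache`, `stage`, `take`, and `purge_expired` are hypothetical names, and a plain `HashMap` with an explicit `now` argument stands in for the TTL-aware `page_cache` the plan describes.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical staging cache for speculative pages (prefetch and startup
/// preload). Illustration only; the real cache is built elsewhere.
pub struct StagingCache {
    ttl: Duration,
    // page number -> (page bytes, time the page was staged)
    entries: HashMap<u32, (Vec<u8>, Instant)>,
}

impl StagingCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Stage a speculative page. A zero TTL disables retention entirely.
    pub fn stage(&mut self, pgno: u32, bytes: Vec<u8>, now: Instant) {
        if self.ttl.is_zero() {
            return;
        }
        self.entries.insert(pgno, (bytes, now));
    }

    /// Serve a target read: remove the page and return its bytes if it is
    /// still fresh. An expired entry is dropped and reported as a miss.
    pub fn take(&mut self, pgno: u32, now: Instant) -> Option<Vec<u8>> {
        let (bytes, staged_at) = self.entries.remove(&pgno)?;
        if now.duration_since(staged_at) >= self.ttl {
            return None;
        }
        Some(bytes)
    }

    /// TTL fallback: drop staged pages that SQLite never read.
    pub fn purge_expired(&mut self, now: Instant) {
        let ttl = self.ttl;
        self.entries
            .retain(|_, entry| now.duration_since(entry.1) < ttl);
    }

    /// Number of pages currently retained.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Passing `now` explicitly keeps expiry deterministic in this sketch, which is the same property the plan asks for when it prefers pausing and advancing time over sleeping.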

## Configuration

- Add `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS`.
- Default to a short TTL such as `30000` ms.
- A value of `0` disables speculative retention while preserving lazy target fetches.
- Keep `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off` as the stronger kill switch for all non-page-1 VFS caching.
- Do not use `RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES` to pin VFS page bytes beyond `xRead`.
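
One way the TTL setting might be parsed is sketched below. The environment variable name and the `30000` ms default come from this plan. Everything else is an assumption made for illustration: the function names, the 300000 ms upper bound (the plan only says the field is bounded), and the choice to fall back to the default on an unparsable value.

```rust
use std::time::Duration;

/// Environment variable named in the plan.
pub const STAGING_TTL_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS";
/// Default suggested by the plan.
pub const DEFAULT_STAGING_TTL_MS: u64 = 30_000;
/// Hypothetical upper bound; the plan does not state a concrete limit.
pub const MAX_STAGING_TTL_MS: u64 = 300_000;

/// Parse a raw TTL value in milliseconds.
/// Returns `None` when speculative retention is disabled (a TTL of 0).
/// Missing or unparsable input falls back to the default (an assumption).
pub fn parse_staging_ttl(raw: Option<&str>) -> Option<Duration> {
    let ms = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_STAGING_TTL_MS)
        .min(MAX_STAGING_TTL_MS);
    if ms == 0 {
        None
    } else {
        Some(Duration::from_millis(ms))
    }
}

/// Read the TTL from the process environment.
pub fn staging_ttl_from_env() -> Option<Duration> {
    parse_staging_ttl(std::env::var(STAGING_TTL_ENV).ok().as_deref())
}
```

Returning `Option<Duration>` makes the disabled case explicit, so the cache builder can skip TTL setup when the value is `0` while lazy target fetches stay unaffected.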

## Implementation Plan

1. Extend `SqliteOptimizationFlags` and `VfsConfig` with a bounded staging TTL field.
2. Build `page_cache` with `time_to_live(Duration::from_millis(ttl_ms))` when TTL is nonzero.
3. Split cache insertion semantics so `PageCacheInsertKind::Target` is not retained by default.
4. Add an explicit `evict_pages_after_target_read` helper that removes every consumed page from both normal and protected speculative caches.
5. Call that helper after `io_read` copies returned bytes into SQLite's buffer.
6. Evict dirty page numbers from the speculative staging cache after commit completion so that stale pre-commit copies cannot be served.
7. Rework `protected_page_cache` so it cannot pin speculative pages forever.
8. Keep `seed_main_page` behavior intact for parsed page 1 metadata.
9. Update metrics naming only if needed. `page_cache_entries` can continue to report retained VFS entries.
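
Steps 4 and 5 can be sketched as follows. The names `evict_pages_after_target_read`, `io_read`, `page_cache`, and `protected_page_cache` come from this plan, but the signatures, the `VfsCaches` struct, and the use of plain `HashMap`s are assumptions made to keep the example self-contained. The point is the ordering: copy the bytes into the caller's buffer first, then evict.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the VFS cache state.
pub struct VfsCaches {
    pub page_cache: HashMap<u32, Vec<u8>>,
    pub protected_page_cache: HashMap<u32, Vec<u8>>,
}

impl VfsCaches {
    /// Remove every consumed page from both speculative caches.
    /// Returns how many entries were dropped.
    pub fn evict_pages_after_target_read(&mut self, pgnos: &[u32]) -> usize {
        let mut evicted = 0;
        for pgno in pgnos {
            if self.page_cache.remove(pgno).is_some() {
                evicted += 1;
            }
            if self.protected_page_cache.remove(pgno).is_some() {
                evicted += 1;
            }
        }
        evicted
    }

    /// Sketch of the read path. Returns `false` on a cache miss, where the
    /// real VFS would fall through to a remote fetch instead.
    pub fn io_read(&mut self, pgno: u32, out: &mut [u8]) -> bool {
        let bytes = match self
            .page_cache
            .get(&pgno)
            .or_else(|| self.protected_page_cache.get(&pgno))
        {
            Some(bytes) => bytes,
            None => return false,
        };
        // Copy into the caller's buffer before touching the caches.
        let n = out.len().min(bytes.len());
        out[..n].copy_from_slice(&bytes[..n]);
        // Only now forget the page in both caches.
        self.evict_pages_after_target_read(&[pgno]);
        true
    }
}
```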

## Expected Cache Matrix

| Page source | Retained after fetch | Evicted on target read | TTL fallback |
| --- | --- | --- | --- |
| Target `xRead` miss | No | Not needed | No |
| Read-ahead prefetch | Yes | Yes | Yes |
| Startup preload | Yes | Yes | Yes |
| Page 1 | Yes during bootstrap or preload | Yes | Yes when retained |
| Dirty write buffer | Existing behavior | Existing behavior | No |
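
The matrix can also be written down as a lookup, which may be convenient for table-driven tests. This is only a sketch: `PageSource`, `Retention`, and `staging_policy` are hypothetical names that do not appear in the codebase described here, and the "Existing behavior" row is modeled as `None` to mean "not governed by this plan".

```rust
/// Rows of the expected cache matrix (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PageSource {
    TargetMiss,
    Prefetch,
    StartupPreload,
    PageOne,
    DirtyWriteBuffer,
}

/// Columns of the expected cache matrix (hypothetical struct).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Retention {
    pub retained_after_fetch: bool,
    pub evicted_on_target_read: bool,
    pub ttl_fallback: bool,
}

/// Returns `None` for rows whose behavior this plan leaves unchanged.
pub fn staging_policy(source: PageSource) -> Option<Retention> {
    match source {
        // Demand pages go straight to SQLite and are never staged.
        PageSource::TargetMiss => Some(Retention {
            retained_after_fetch: false,
            evicted_on_target_read: false,
            ttl_fallback: false,
        }),
        // Speculative pages are staged, evicted on first read, and expire
        // by TTL otherwise. Page 1 is only staged when it arrives through
        // bootstrap or preload.
        PageSource::Prefetch | PageSource::StartupPreload | PageSource::PageOne => {
            Some(Retention {
                retained_after_fetch: true,
                evicted_on_target_read: true,
                ttl_fallback: true,
            })
        }
        // The dirty write buffer keeps its existing behavior.
        PageSource::DirtyWriteBuffer => None,
    }
}
```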

## Tests

- Add a VFS test proving a target read miss does not increase retained VFS cache entries.
- Add a VFS test proving prefetch pages are retained before use and removed after target read.
- Add a VFS test proving startup preload pages are retained briefly and removed after target read.
- Add a VFS test proving `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS=0` still lazily fetches pages.
- Add a VFS test proving `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off` still lazily fetches pages and does not retain non-page-1 pages.
- If practical, use Tokio time pause/advance to verify TTL expiry deterministically instead of sleeping.

## Open Questions

- Should target retention remain available as an explicit benchmark mode, or should we remove target caching from the shipped matrix?
- Should `RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES` be deprecated now that VFS pages are staging-only?
4 changes: 2 additions & 2 deletions .github/workflows/rust.yml
@@ -90,10 +90,10 @@ jobs:
run: rivetkit-rust/packages/rivetkit-core/scripts/check-event-driven-drains.sh

- name: Check
run: cargo check --all-targets --all-features
run: cargo check --workspace --exclude rivetkit-wasm
env:
# Deny warnings
RUSTFLAGS: --cfg tokio_unstable -D warnings
RUSTFLAGS: --cfg tokio_unstable -D warnings -A unsafe-op-in-unsafe-fn

# test:
# name: Test
2 changes: 2 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 5 additions & 0 deletions docker/build/darwin-arm64.Dockerfile
@@ -10,6 +10,7 @@ ARG BUILD_MODE=release
ARG BUILD_FRONTEND=false
ARG VITE_APP_API_URL=__SAME__
ARG VITE_FEATURE_FLAGS=
ARG RUST_TOOLCHAIN=1.91.1

ENV BINDGEN_EXTRA_CLANG_ARGS_aarch64_apple_darwin="--sysroot=/root/osxcross/target/SDK/MacOSX11.3.sdk -isystem /root/osxcross/target/SDK/MacOSX11.3.sdk/usr/include" \
CFLAGS_aarch64_apple_darwin="-B/root/osxcross/target/bin" \
@@ -32,6 +33,10 @@ ENV RUSTC_WRAPPER=sccache \
WORKDIR /build
COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}" && \
rustup target add aarch64-apple-darwin

RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \
export NODE_OPTIONS="--max-old-space-size=8192" && \
export SKIP_NAPI_BUILD=1 && \
5 changes: 5 additions & 0 deletions docker/build/darwin-x64.Dockerfile
@@ -10,6 +10,7 @@ ARG BUILD_MODE=release
ARG BUILD_FRONTEND=false
ARG VITE_APP_API_URL=__SAME__
ARG VITE_FEATURE_FLAGS=
ARG RUST_TOOLCHAIN=1.91.1

ENV BINDGEN_EXTRA_CLANG_ARGS_x86_64_apple_darwin="--sysroot=/root/osxcross/target/SDK/MacOSX11.3.sdk -isystem /root/osxcross/target/SDK/MacOSX11.3.sdk/usr/include" \
CFLAGS_x86_64_apple_darwin="-B/root/osxcross/target/bin" \
@@ -32,6 +33,10 @@ ENV RUSTC_WRAPPER=sccache \
WORKDIR /build
COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}" && \
rustup target add x86_64-apple-darwin

RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \
export NODE_OPTIONS="--max-old-space-size=8192" && \
export SKIP_NAPI_BUILD=1 && \
5 changes: 5 additions & 0 deletions docker/build/linux-arm64-gnu.Dockerfile
@@ -10,6 +10,7 @@ ARG BUILD_MODE=release
ARG BUILD_FRONTEND=false
ARG VITE_APP_API_URL=__SAME__
ARG VITE_FEATURE_FLAGS=
ARG RUST_TOOLCHAIN=1.91.1

ENV RUSTFLAGS="--cfg tokio_unstable"
ENV RUSTC_WRAPPER=sccache \
@@ -19,6 +20,10 @@ ENV RUSTC_WRAPPER=sccache \
WORKDIR /build
COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}" && \
rustup target add aarch64-unknown-linux-gnu

RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \
export NODE_OPTIONS="--max-old-space-size=8192" && \
export SKIP_NAPI_BUILD=1 && \
5 changes: 5 additions & 0 deletions docker/build/linux-arm64-musl.Dockerfile
@@ -10,6 +10,7 @@ ARG BUILD_MODE=release
ARG BUILD_FRONTEND=false
ARG VITE_APP_API_URL=__SAME__
ARG VITE_FEATURE_FLAGS=
ARG RUST_TOOLCHAIN=1.91.1

ENV OPENSSL_DIR=/musl-aarch64 \
OPENSSL_INCLUDE_DIR=/musl-aarch64/include \
@@ -25,6 +26,10 @@ ENV RUSTC_WRAPPER=sccache \
WORKDIR /build
COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}" && \
rustup target add aarch64-unknown-linux-musl

RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \
export NODE_OPTIONS="--max-old-space-size=8192" && \
export SKIP_NAPI_BUILD=1 && \
5 changes: 5 additions & 0 deletions docker/build/linux-x64-gnu.Dockerfile
@@ -17,6 +17,7 @@ ARG BUILD_MODE=release
ARG BUILD_FRONTEND=false
ARG VITE_APP_API_URL=__SAME__
ARG VITE_FEATURE_FLAGS=
ARG RUST_TOOLCHAIN=1.91.1

ENV RUSTFLAGS="--cfg tokio_unstable"

@@ -27,6 +28,10 @@ ENV RUSTC_WRAPPER=sccache \
WORKDIR /build
COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}" && \
rustup target add x86_64-unknown-linux-gnu

# Build frontend if building engine with frontend enabled.
RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \
export NODE_OPTIONS="--max-old-space-size=8192" && \
5 changes: 5 additions & 0 deletions docker/build/linux-x64-musl.Dockerfile
@@ -10,6 +10,7 @@ ARG BUILD_MODE=release
ARG BUILD_FRONTEND=false
ARG VITE_APP_API_URL=__SAME__
ARG VITE_FEATURE_FLAGS=
ARG RUST_TOOLCHAIN=1.91.1

ENV OPENSSL_DIR=/musl-x86_64 \
OPENSSL_INCLUDE_DIR=/musl-x86_64/include \
@@ -24,6 +25,10 @@ ENV RUSTC_WRAPPER=sccache \
WORKDIR /build
COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}" && \
rustup target add x86_64-unknown-linux-musl

RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \
export NODE_OPTIONS="--max-old-space-size=8192" && \
export SKIP_NAPI_BUILD=1 && \
5 changes: 5 additions & 0 deletions docker/build/windows-x64.Dockerfile
@@ -16,6 +16,7 @@ ARG BUILD_MODE=release
ARG BUILD_FRONTEND=false
ARG VITE_APP_API_URL=__SAME__
ARG VITE_FEATURE_FLAGS=
ARG RUST_TOOLCHAIN=1.91.1

# Windows-specific build flags:
# - lld linker is ~5x faster than MinGW's default ld for big Rust binaries.
@@ -32,6 +33,10 @@ ENV RUSTC_WRAPPER=sccache \
WORKDIR /build
COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}" && \
rustup target add x86_64-pc-windows-gnu

RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \
export NODE_OPTIONS="--max-old-space-size=8192" && \
export SKIP_NAPI_BUILD=1 && \
4 changes: 2 additions & 2 deletions docker/builder-base/engine-builder.Dockerfile
@@ -24,8 +24,8 @@ RUN apt-get update -y && \
openssl \
pkg-config \
wget && \
rustup toolchain install 1.91.0 && \
rustup default 1.91.0 && \
rustup toolchain install 1.91.1 && \
rustup default 1.91.1 && \
curl -fsSL https://deb.nodesource.com/setup_22.x | bash - && \
apt-get install -y --no-install-recommends nodejs && \
corepack enable && \
2 changes: 1 addition & 1 deletion docker/builder-base/linux-gnu.Dockerfile
@@ -7,7 +7,7 @@
# and the aarch64 cross-compiler.
#
# Build & push: scripts/docker-builder-base/build-push.sh linux-gnu
FROM rust:1.89.0-bullseye
FROM rust:1.91.1-bullseye

# Install base packages. Bullseye ships clang 11; we pull clang 14 from the
# official LLVM apt repo (https://apt.llvm.org) for modern bindgen support
2 changes: 1 addition & 1 deletion docker/builder-base/linux-musl.Dockerfile
@@ -8,7 +8,7 @@
# Pre-bakes Rust, Node.js 22, napi-rs CLI.
#
# Build & push: scripts/docker-builder-base/build-push.sh linux-musl
FROM rust:1.89.0-bookworm
FROM rust:1.91.1-bookworm

RUN apt-get update && apt-get install -y --no-install-recommends \
musl-tools \
2 changes: 1 addition & 1 deletion docker/builder-base/osxcross.Dockerfile
@@ -3,7 +3,7 @@
#
# Build & push: scripts/docker-builder-base/build-push.sh osxcross
# syntax=docker/dockerfile:1.10.0
FROM rust:1.89.0-bookworm
FROM rust:1.91.1-bookworm

RUN apt-get update && apt-get install -y \
git-lfs \
2 changes: 1 addition & 1 deletion docker/builder-base/windows-mingw.Dockerfile
@@ -4,7 +4,7 @@
# Pre-bakes MinGW-w64, Rust target, Node.js 22, napi-rs CLI.
#
# Build & push: scripts/docker-builder-base/build-push.sh windows-mingw
FROM rust:1.89.0-bookworm
FROM rust:1.91.1-bookworm

RUN apt-get update && apt-get install -y --no-install-recommends \
llvm-14-dev \
4 changes: 4 additions & 0 deletions docker/engine/Dockerfile
@@ -13,11 +13,15 @@ ARG CARGO_BUILD_MODE=debug
ARG VITE_APP_API_URL=__SAME__
ARG VITE_APP_TURNSTILE_SITE_KEY=
ARG OVERRIDE_GIT_SHA
ARG RUST_TOOLCHAIN=1.91.1

WORKDIR /app

COPY . .

RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \
rustup default "${RUST_TOOLCHAIN}"

# Build frontend. Use --ignore-scripts because the root postinstall runs
# `lefthook install`, which needs a .git directory (excluded by
# .dockerignore). lefthook is a dev-only git hook manager and has no
5 changes: 4 additions & 1 deletion docs-internal/engine/SQLITE_OPTIMIZATIONS.md
@@ -11,7 +11,10 @@ Range page-read protocol details live in `.agent/specs/sqlite-range-page-read-pr
## Existing Optimizations

- Actor startup can preload SQLite VFS pages through `OpenConfig.preload_pgnos`, `OpenConfig.preload_ranges`, and persisted `/PRELOAD_HINTS`; first pages, hint mechanisms, and the preload byte budget are configured through central SQLite optimization flags.
- The VFS keeps an in-memory page cache seeded from `sqlite_startup_data.preloaded_pages`; cache behavior is selected with `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off|target|startup|prefetch|all`, with capacity and protected-cache budget configured separately.
- The VFS keeps a short-lived staging cache for startup preload and read-ahead pages. Direct target pages fetched for `xRead` are not retained in VFS memory.
- Any speculative page consumed by `xRead`, including page 1, is evicted from the VFS staging cache after SQLite receives it. Before the first commit, a lazy page-1 read for a missing database synthesizes the empty SQLite header again instead of retaining page bytes. Staged pages that SQLite never reads expire through `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS`.
- Commit completion stages dirty pages in a separate TTL cache so SQLite can reread its own writes without turning the VFS into a permanent second pager.
- VFS staging cache behavior is selected with `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off|target|startup|prefetch|all`, with capacity configured separately. The protected-cache budget no longer pins VFS page bytes beyond `xRead`.
- The VFS has speculative read-ahead selected with `RIVETKIT_SQLITE_OPT_READ_AHEAD_MODE=off|bounded|adaptive`; the default bounded budget is 64 pages, which reduced the cold-read benchmark from 1,249 to 368 VFS `get_pages` calls.
- The VFS tracks bounded recent page hints as hot pages plus coalesced scan ranges; `NativeDatabase::snapshot_preload_hints()` exposes the in-memory plan for future flush wiring.
- Actor Prometheus metrics expose VFS read counters, fetched bytes, cache hits/misses, and `get_pages` duration at `/gateway/<actor_id>/metrics`.
2 changes: 1 addition & 1 deletion engine/packages/depot-client/Cargo.toml
@@ -23,6 +23,7 @@ depot-client-types.workspace = true
depot.workspace = true
moka = { version = "0.12", default-features = false, features = ["sync"] }
parking_lot.workspace = true
scc.workspace = true

[dev-dependencies]
depot = { workspace = true, features = ["test-faults"] }
@@ -31,7 +32,6 @@ gas.workspace = true
rivet-config.workspace = true
rivet-pools.workspace = true
rivet-test-deps.workspace = true
scc.workspace = true
sha2.workspace = true
tempfile.workspace = true
universaldb.workspace = true
13 changes: 7 additions & 6 deletions engine/packages/depot-client/src/database.rs
@@ -9,7 +9,7 @@ use crate::{
vfs::{
NativeVfsHandle, SqliteTransportHandle, SqliteVfs, SqliteVfsMetrics,
SqliteVfsMetricsSnapshot, VfsConfig, VfsPreloadHintSnapshot,
fetch_initial_main_page_for_registration,
fetch_initial_pages_for_registration,
},
worker::SqliteWorkerHandle,
};
@@ -32,17 +32,18 @@ pub async fn open_database_from_transport(
metrics: Option<Arc<dyn SqliteVfsMetrics>>,
) -> Result<NativeDatabaseHandle> {
let vfs_name = vfs_name_for_actor_database(&actor_id, generation);
let initial_main_page = fetch_initial_main_page_for_registration(transport.clone(), &actor_id)
let config = VfsConfig::default();
let initial_pages = fetch_initial_pages_for_registration(transport.clone(), &actor_id, &config)
.await
.map_err(|e| anyhow!("failed to preload sqlite main page: {e}"))?;
.map_err(|e| anyhow!("failed to preload sqlite pages: {e}"))?;
let vfs = Arc::new(
SqliteVfs::register_with_transport_and_initial_page(
SqliteVfs::register_with_transport_and_initial_pages(
&vfs_name,
transport,
actor_id.clone(),
rt_handle,
VfsConfig::default(),
initial_main_page,
config,
initial_pages,
metrics.clone(),
)
.map_err(|e| anyhow!("failed to register sqlite VFS: {e}"))?,