82 changes: 82 additions & 0 deletions .agent/specs/sqlite-vfs-staging-cache-ttl.md
@@ -0,0 +1,82 @@
# SQLite VFS Staging Cache TTL Plan

Date: 2026-05-03

This plan changes the SQLite VFS page cache from a broad second-level pager cache into a short-lived staging cache for speculative pages. Demand pages fetched for `xRead` should be handed to SQLite and then forgotten by the VFS.

## Goals

- Avoid retaining pages in VFS memory after SQLite has already received them through `xRead`.
- Keep startup preload and read-ahead useful by retaining speculative pages briefly.
- Evict speculative pages on the first successful target read, so that the TTL is only a fallback for speculative pages SQLite never reads.
- Keep lazy loading correct when all cache and preload features are disabled.
- Treat page 1 as staging data after `xRead` while keeping parsed page-size and database-size metadata.

## Non-Goals

- Do not change the remote `get_pages` protocol.
- Do not change SQLite pager settings.
- Do not add read pools back.
- Do not implement persisted preload hints in this branch.

## Current Behavior

- `resolve_pages` classifies fetched pages as `Target` when SQLite requested them and `Prefetch` when they were predicted.
- `fetch_initial_pages_for_registration` seeds startup pages as `Startup`.
- `should_cache_page` allows target, prefetch, and startup caching based on `SqliteVfsPageCacheMode`.
- Page 1 is always cacheable.
- Early protected pages live in `protected_page_cache`, which is an `scc::HashMap` with no TTL.

## Proposed Behavior

- Target pages should not be inserted into the VFS page cache by default.
- Target reads should remove the speculative pages they consume from the cache once the bytes have been copied to the caller.
- Prefetch pages should be inserted into a TTL cache.
- Startup preload pages should be inserted into the same TTL cache.
- Commit completion should stage dirty pages in a separate TTL cache so SQLite can reread its own writes without retaining them permanently.
- Page 1 should follow the same staging rule as other pages after `xRead`. The VFS keeps parsed page-size and database-size metadata, and it can synthesize the empty page-1 header again before the first commit when depot has no database yet.
- The protected cache should no longer pin speculative pages forever. It should be removed or left unused in favor of the TTL cache.
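
The lifecycle described above can be sketched with the standard library alone. This is a minimal illustration, not the branch's implementation: `StagingCache`, `InsertKind`, `take_for_target_read`, and the caller-supplied millisecond clock are hypothetical names chosen for the example, and a plain `HashMap` stands in for the real TTL cache.

```rust
use std::collections::HashMap;

/// Why a page entered the VFS. The variant names follow the plan; the type is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InsertKind {
    Target,
    Prefetch,
    Startup,
}

struct Staged {
    bytes: Vec<u8>,
    expires_at_ms: u64,
}

/// Hypothetical staging cache keyed by page number, driven by a caller-supplied clock.
pub struct StagingCache {
    ttl_ms: u64,
    pages: HashMap<u32, Staged>,
}

impl StagingCache {
    pub fn new(ttl_ms: u64) -> Self {
        Self { ttl_ms, pages: HashMap::new() }
    }

    /// Stage speculative pages only. Target pages are never retained,
    /// and a TTL of 0 disables retention entirely.
    pub fn insert(&mut self, pgno: u32, bytes: Vec<u8>, kind: InsertKind, now_ms: u64) {
        if kind == InsertKind::Target || self.ttl_ms == 0 {
            return;
        }
        let expires_at_ms = now_ms.saturating_add(self.ttl_ms);
        self.pages.insert(pgno, Staged { bytes, expires_at_ms });
    }

    /// Serve a target read: the entry is removed so it is consumed at most once.
    /// An entry whose TTL has elapsed counts as a miss.
    pub fn take_for_target_read(&mut self, pgno: u32, now_ms: u64) -> Option<Vec<u8>> {
        let staged = self.pages.remove(&pgno)?;
        (now_ms < staged.expires_at_ms).then_some(staged.bytes)
    }

    /// TTL fallback: drop staged pages that SQLite never read.
    pub fn purge_expired(&mut self, now_ms: u64) {
        self.pages.retain(|_, staged| now_ms < staged.expires_at_ms);
    }

    pub fn len(&self) -> usize {
        self.pages.len()
    }
}
```

Passing the clock in as an argument keeps expiry deterministic in tests, which is the same property the plan hopes to get from pausing and advancing Tokio time.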

## Configuration

- Add `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS`.
- Default to a short TTL of `30000` ms (30 seconds); values above the `300000` ms maximum are clamped to it.
- A value of `0` disables speculative retention while preserving lazy target fetches.
- Keep `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off` as the stronger kill switch for all non-page-1 VFS caching.
- Do not use `RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES` to pin VFS page bytes beyond `xRead`.
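
As a rough sketch of how the TTL setting could be resolved, the helper below parses the raw environment value, falls back to the default on a missing or invalid value, and clamps to the maximum. The constants match the values added in `optimization_flags.rs`; `parse_bounded_ms` and `staging_ttl` are hypothetical stand-ins written for this example, and the crate's own `u64_bounded_by_default` may differ in detail.

```rust
use std::time::Duration;

pub const DEFAULT_VFS_STAGING_CACHE_TTL_MS: u64 = 30_000;
pub const MAX_VFS_STAGING_CACHE_TTL_MS: u64 = 300_000;

/// Hypothetical bounded parser: a missing or unparsable value yields `default`,
/// and anything above `max` is clamped to `max`. Zero passes through unchanged.
pub fn parse_bounded_ms(raw: Option<&str>, default: u64, max: u64) -> u64 {
    match raw.and_then(|s| s.parse::<u64>().ok()) {
        Some(value) => value.min(max),
        None => default,
    }
}

/// Hypothetical mapping from the resolved setting to a cache TTL.
/// `None` means speculative retention is disabled (TTL of 0).
pub fn staging_ttl(ttl_ms: u64) -> Option<Duration> {
    (ttl_ms != 0).then(|| Duration::from_millis(ttl_ms))
}
```

With this shape, `"0"` resolves to no TTL at all rather than a zero-length one, so the cache builder can skip `time_to_live` and the insertion path can skip staging.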

## Implementation Plan

1. Extend `SqliteOptimizationFlags` and `VfsConfig` with a bounded staging TTL field.
2. Build `page_cache` with `time_to_live(Duration::from_millis(ttl_ms))` when TTL is nonzero.
3. Split cache insertion semantics so `PageCacheInsertKind::Target` is not retained by default.
4. Add an explicit `evict_pages_after_target_read` helper that removes every consumed page from both normal and protected speculative caches.
5. Call that helper after `io_read` copies returned bytes into SQLite's buffer.
6. Evict dirty page numbers from the staging cache after commit completion.
7. Rework `protected_page_cache` so it cannot pin speculative pages forever.
8. Keep `seed_main_page` behavior intact for parsed page 1 metadata.
9. Update metrics naming only if needed. `page_cache_entries` can continue to report retained VFS entries.
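
Steps 4 and 5 could look roughly like the helper below. It is a sketch under stated assumptions: `VfsCaches` is a hypothetical holder, plain `HashMap`s stand in for the real `page_cache` and `protected_page_cache`, and the return value (number of entries removed) is added only to make the behavior observable.

```rust
use std::collections::HashMap;

type PageNo = u32;
type PageBytes = Vec<u8>;

/// Hypothetical holder for the two caches named in step 4.
#[derive(Default)]
pub struct VfsCaches {
    pub page_cache: HashMap<PageNo, PageBytes>,
    pub protected_page_cache: HashMap<PageNo, PageBytes>,
}

impl VfsCaches {
    /// Call after the read path has copied bytes into SQLite's buffer.
    /// Removes every consumed page from both caches and returns how many
    /// entries were dropped. Pages that were never staged are ignored.
    pub fn evict_pages_after_target_read(&mut self, consumed: &[PageNo]) -> usize {
        let mut removed = 0;
        for pgno in consumed {
            removed += usize::from(self.page_cache.remove(pgno).is_some());
            removed += usize::from(self.protected_page_cache.remove(pgno).is_some());
        }
        removed
    }
}
```

Evicting only after the copy succeeds means a failed read leaves the staged bytes in place for a retry, which matches the plan's "first successful target read" wording.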

## Expected Cache Matrix

| Page source | Retained after fetch | Evicted on target read | TTL fallback |
| --- | --- | --- | --- |
| Target `xRead` miss | No | Not needed | No |
| Read-ahead prefetch | Yes | Yes | Yes |
| Startup preload | Yes | Yes | Yes |
| Page 1 | Yes during bootstrap or preload | Yes | Yes when retained |
| Dirty write buffer | Existing behavior | Existing behavior | No |

## Tests

- Add a VFS test proving a target read miss does not increase retained VFS cache entries.
- Add a VFS test proving prefetch pages are retained before use and removed after target read.
- Add a VFS test proving startup preload pages are retained briefly and removed after target read.
- Add a VFS test proving `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS=0` still lazily fetches pages.
- Add a VFS test proving `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off` still lazily fetches pages and does not retain non-page-1 pages.
- If practical, use Tokio time pause/advance to verify TTL expiry deterministically instead of sleeping.

## Open Questions

- Should target retention remain available as an explicit benchmark mode, or should we remove target caching from the shipped matrix?
- Should `RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES` be deprecated now that VFS pages are staging-only?
5 changes: 4 additions & 1 deletion docs-internal/engine/SQLITE_OPTIMIZATIONS.md
@@ -11,7 +11,10 @@ Range page-read protocol details live in `.agent/specs/sqlite-range-page-read-pr
## Existing Optimizations

- Actor startup can preload SQLite VFS pages through `OpenConfig.preload_pgnos`, `OpenConfig.preload_ranges`, and persisted `/PRELOAD_HINTS`; first pages, hint mechanisms, and the preload byte budget are configured through central SQLite optimization flags.
-- The VFS keeps an in-memory page cache seeded from `sqlite_startup_data.preloaded_pages`; cache behavior is selected with `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off|target|startup|prefetch|all`, with capacity and protected-cache budget configured separately.
+- The VFS keeps a short-lived staging cache for startup preload and read-ahead pages. Direct target pages fetched for `xRead` are not retained in VFS memory.
+- Any speculative page consumed by `xRead`, including page 1, is evicted from the VFS staging cache after SQLite receives it. Before the first commit, a lazy page-1 read for a missing database synthesizes the empty SQLite header again instead of retaining page bytes. Staged pages that SQLite never reads expire through `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS`.
+- Commit completion stages dirty pages in a separate TTL cache so SQLite can reread its own writes without turning the VFS into a permanent second pager.
+- VFS staging cache behavior is selected with `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off|target|startup|prefetch|all`, with capacity configured separately. The protected-cache budget no longer pins VFS page bytes beyond `xRead`.
- The VFS has speculative read-ahead selected with `RIVETKIT_SQLITE_OPT_READ_AHEAD_MODE=off|bounded|adaptive`; the default bounded budget is 64 pages, which reduced the cold-read benchmark from 1,249 to 368 VFS `get_pages` calls.
- The VFS tracks bounded recent page hints as hot pages plus coalesced scan ranges; `NativeDatabase::snapshot_preload_hints()` exposes the in-memory plan for future flush wiring.
- Actor Prometheus metrics expose VFS read counters, fetched bytes, cache hits/misses, and `get_pages` duration at `/gateway/<actor_id>/metrics`.
22 changes: 22 additions & 0 deletions engine/packages/depot-client/src/optimization_flags.rs
@@ -22,6 +22,7 @@ pub const VFS_PAGE_CACHE_MODE_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MO
pub const VFS_PAGE_CACHE_CAPACITY_PAGES_ENV: &str =
"RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_CAPACITY_PAGES";
pub const VFS_PROTECTED_CACHE_PAGES_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES";
pub const VFS_STAGING_CACHE_TTL_MS_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS";

pub const DEFAULT_STARTUP_PRELOAD_MAX_BYTES: usize = 1024 * 1024;
pub const MAX_STARTUP_PRELOAD_MAX_BYTES: usize = 8 * 1024 * 1024;
@@ -31,6 +32,8 @@ pub const DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES: u64 = 50_000;
pub const MAX_VFS_PAGE_CACHE_CAPACITY_PAGES: u64 = 500_000;
pub const DEFAULT_VFS_PROTECTED_CACHE_PAGES: usize = 512;
pub const MAX_VFS_PROTECTED_CACHE_PAGES: usize = 8_192;
pub const DEFAULT_VFS_STAGING_CACHE_TTL_MS: u64 = 30_000;
pub const MAX_VFS_STAGING_CACHE_TTL_MS: u64 = 300_000;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SqliteReadAheadMode {
@@ -102,6 +105,7 @@ pub struct SqliteOptimizationFlags {
pub vfs_page_cache_mode: SqliteVfsPageCacheMode,
pub vfs_page_cache_capacity_pages: u64,
pub vfs_protected_cache_pages: usize,
pub vfs_staging_cache_ttl_ms: u64,
}

impl Default for SqliteOptimizationFlags {
@@ -128,6 +132,7 @@ impl Default for SqliteOptimizationFlags {
vfs_page_cache_mode: SqliteVfsPageCacheMode::All,
vfs_page_cache_capacity_pages: DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES,
vfs_protected_cache_pages: DEFAULT_VFS_PROTECTED_CACHE_PAGES,
vfs_staging_cache_ttl_ms: DEFAULT_VFS_STAGING_CACHE_TTL_MS,
}
}
}
@@ -196,6 +201,11 @@ impl SqliteOptimizationFlags {
DEFAULT_VFS_PROTECTED_CACHE_PAGES,
MAX_VFS_PROTECTED_CACHE_PAGES,
),
vfs_staging_cache_ttl_ms: u64_bounded_by_default(
read_env(VFS_STAGING_CACHE_TTL_MS_ENV).as_deref(),
DEFAULT_VFS_STAGING_CACHE_TTL_MS,
MAX_VFS_STAGING_CACHE_TTL_MS,
),
}
}
}
@@ -307,6 +317,7 @@ mod tests {
VFS_PAGE_CACHE_MODE_ENV => Some("off".to_string()),
VFS_PAGE_CACHE_CAPACITY_PAGES_ENV => Some("0".to_string()),
VFS_PROTECTED_CACHE_PAGES_ENV => Some("0".to_string()),
VFS_STAGING_CACHE_TTL_MS_ENV => Some("0".to_string()),
_ => None,
});

@@ -327,6 +338,7 @@ mod tests {
assert_eq!(flags.vfs_page_cache_mode, SqliteVfsPageCacheMode::Off);
assert_eq!(flags.vfs_page_cache_capacity_pages, 0);
assert_eq!(flags.vfs_protected_cache_pages, 0);
assert_eq!(flags.vfs_staging_cache_ttl_ms, 0);
}

#[test]
@@ -336,6 +348,7 @@ mod tests {
STARTUP_PRELOAD_FIRST_PAGE_COUNT_ENV => Some("nope".to_string()),
VFS_PAGE_CACHE_CAPACITY_PAGES_ENV => Some("invalid".to_string()),
VFS_PROTECTED_CACHE_PAGES_ENV => Some("invalid".to_string()),
VFS_STAGING_CACHE_TTL_MS_ENV => Some("invalid".to_string()),
_ => None,
});
assert_eq!(
@@ -354,6 +367,10 @@ mod tests {
invalid.vfs_protected_cache_pages,
DEFAULT_VFS_PROTECTED_CACHE_PAGES
);
assert_eq!(
invalid.vfs_staging_cache_ttl_ms,
DEFAULT_VFS_STAGING_CACHE_TTL_MS
);

let clamped = SqliteOptimizationFlags::from_env_reader(|key| match key {
STARTUP_PRELOAD_MAX_BYTES_ENV => Some((MAX_STARTUP_PRELOAD_MAX_BYTES + 1).to_string()),
@@ -364,6 +381,7 @@ mod tests {
Some((MAX_VFS_PAGE_CACHE_CAPACITY_PAGES + 1).to_string())
}
VFS_PROTECTED_CACHE_PAGES_ENV => Some((MAX_VFS_PROTECTED_CACHE_PAGES + 1).to_string()),
VFS_STAGING_CACHE_TTL_MS_ENV => Some((MAX_VFS_STAGING_CACHE_TTL_MS + 1).to_string()),
_ => None,
});
assert_eq!(
@@ -382,5 +400,9 @@ mod tests {
clamped.vfs_protected_cache_pages,
MAX_VFS_PROTECTED_CACHE_PAGES
);
assert_eq!(
clamped.vfs_staging_cache_ttl_ms,
MAX_VFS_STAGING_CACHE_TTL_MS
);
}
}