Skip to content

feat(sqlite): wire remote sqlite execute handlers#4897

Closed
NathanFlurry wants to merge 1 commit intographite-base/4897from
driver-test-complaints/wire-remote-sqlite-exec
Closed

feat(sqlite): wire remote sqlite execute handlers#4897
NathanFlurry wants to merge 1 commit intographite-base/4897from
driver-test-complaints/wire-remote-sqlite-exec

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Copy link
Copy Markdown
Member Author

NathanFlurry commented May 3, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 3, 2026

Code Review: feat(sqlite): wire remote sqlite execute handlers

Overview

This PR replaces the three stub error responses for ToRivetSqliteExecRequest, ToRivetSqliteExecuteRequest, and ToRivetSqliteExecuteWriteRequest with real implementations. It introduces a Conveyer transport variant in depot-client, wires lazy NativeDatabaseHandle caches into Conn, and adds actor/generation validation before dispatching remote SQL.


Security Concerns

Missing sqlite.remote_indeterminate_result on WebSocket disconnect (CLAUDE.md invariant)

CLAUDE.md states: "Sent remote SQL requests must fail with sqlite.remote_indeterminate_result on WebSocket disconnect; only unsent remote SQL may be sent after reconnect." The new handlers execute SQL synchronously inside handle_message, so if the WebSocket drops mid-execution the caller receives no response and the actor SQL client is left in an indeterminate state. This invariant needs to be satisfied before the PR can be considered correct.

has_pending_start_command bypass in validate_remote_sqlite_generation

The condition allows a request through whenever there is a pending CommandStartActor for the requested generation, even if the actor is already running at a different generation. This is a TOCTOU window: between validation and SQL execution the pending start may be superseded. Consider tightening to only allow when active_generation == Some(generation) and treating the pending-start path as a hard error.

No SQL string size validation

validate_remote_sqlite_params caps bind-parameter bytes at 128 KiB, but there is no analogous limit on the SQL string itself (request.sql). A crafted large SQL statement bypasses the guard entirely. A MAX_REMOTE_SQL_STATEMENT_BYTES check on request.sql.len() would close this gap.


Correctness

generation narrowing: validate_remote_sqlite_generation converts the protocol u64 generation to u32 for the FDB key lookup. This is fine and the error path is handled. The original u64 value is used everywhere else (cache key, open_database_from_conveyer), so types are consistent.

Cache cleanup on stop/shutdown is correct. actor_lifecycle::stop_actor now calls retain_sync to evict entries for the stopped actor, and shutdown_conn_actors calls clear_sync. Both match the actor_dbs eviction pattern.

Arc<OnceCell<NativeDatabaseHandle>> per (actor_id, generation). Follows the CLAUDE.md constraint exactly. Single-flight initialization via get_or_try_init is correct.


Code Quality

Three nearly identical handle_remote_sqlite_* functions. handle_remote_sqlite_exec, handle_remote_sqlite_execute, and handle_remote_sqlite_execute_write share the same validation + actor_db + remote_sqlite_executor_from_parts preamble with only the final call differing. Extracting a shared validate_and_get_executor helper would make future changes (e.g. adding metrics) easier to apply consistently.

Metrics are None for remote executors. open_database_from_conveyer is always called with metrics: None. VFS read/write/commit metrics will not be recorded for the remote-SQL path. If intentional, add a short comment so it does not look like an omission.

Unused #[cfg(test)] helpers. remove_remote_sqlite_executor_generation, remove_remote_sqlite_executors_for_actor, and clear_remote_sqlite_executors are test-only functions with no callers yet. Either add the tests that use them or remove them until needed.

sqlite_error_reason leaks internal chain. In depot-client/src/vfs.rs, sqlite_error_reason joins the full anyhow error chain and sends it back over the transport. Confirm that ws_to_tunnel_task.rs's sqlite_error_response applies appropriate sanitization before the message reaches an untrusted envoy client.


Minor / Formatting

Most of the remaining diff is pure reformatting (indentation corrections in VfsConfig::from_optimization_flags, open_database, open_connection, and a handful of long-line wraps). All correct.


Summary

Area Status
Arc<OnceCell> per (actor_id, generation) Follows CLAUDE.md
Cache eviction on stop/shutdown Correct
Bind-param size validation Present
SQL string size validation Missing
WebSocket disconnect -> remote_indeterminate_result Not handled (CLAUDE.md requirement)
has_pending_start_command validation logic Potential TOCTOU
Metrics for remote executor Silent None -- add a comment
Unused #[cfg(test)] helpers No callers yet

The sqlite.remote_indeterminate_result gap and the SQL-string size limit are the two issues worth resolving before merging. The generation-validation bypass deserves a closer look as well.

@NathanFlurry NathanFlurry marked this pull request as draft May 3, 2026 07:19
@NathanFlurry NathanFlurry force-pushed the driver-test-complaints/wire-remote-sqlite-exec branch from 3dad87c to 96d2371 Compare May 3, 2026 07:34
@NathanFlurry NathanFlurry force-pushed the driver-test-complaints/loosen-serverless-header-test branch from 6ddb158 to 44367cd Compare May 3, 2026 07:34
@NathanFlurry NathanFlurry force-pushed the driver-test-complaints/wire-remote-sqlite-exec branch from 96d2371 to 86c11aa Compare May 3, 2026 21:03
@NathanFlurry NathanFlurry force-pushed the driver-test-complaints/loosen-serverless-header-test branch from 44367cd to 20f0389 Compare May 3, 2026 21:03
This was referenced May 4, 2026
@NathanFlurry
Copy link
Copy Markdown
Member Author

Landed in main via stack-merge fast-forward push. Commits are in main; closing to match.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant