chore(mcp): atomic multi-statement execute via array sql parameter#49
Merged
StefanSteiner merged 6 commits intoMay 27, 2026
Merged
Conversation
The execute tool's `sql` parameter changes from `String` to `Vec<String>`.
A multi-element array runs inside a Hyper transaction so all statements
commit together or all roll back. The canonical use case is the Hyper
upsert pattern (no ON CONFLICT support):
execute({ "sql": [
"UPDATE settings SET value = 'dark' WHERE key = 'theme'",
"INSERT INTO settings (key, value) SELECT 'theme', 'dark'
WHERE NOT EXISTS (SELECT 1 FROM settings WHERE key = 'theme')"
]})
Validation rules enforced before any SQL hits the server:
- Non-empty array; no empty / whitespace-only / comment-only elements.
- Each element must contain exactly one statement (quote- and
comment-aware lexer rejects embedded `;`-separated multi-statements,
with E-string and line/block-comment handling).
- No element may be read-only (use the query tool for SELECT/WITH).
- DDL+DML cannot be mixed in one batch (Hyper aborts mixed transactions
with SQLSTATE 0A000).
- Multi-element all-DDL batches are rejected because Hyper auto-commits
CREATE/DROP/ALTER even inside a transaction.
Single-element batches skip BEGIN/COMMIT and keep the existing auto-
commit behavior, so a one-off DDL or DML stays a one-round-trip call.
The breaking shape change (string -> array) is intentional — the only
consumers are LLM clients and an array is the natural shape for batched
SQL.
Adds 18 inline unit tests covering the helper lexer (E-strings, doubled
quotes, nested block comments, trailing semicolons) and the batch
validator (every rejection path plus happy paths). Integration tests
against real hyperd land in a follow-up commit.
Update both READMEs (LLM-facing get_readme output + the published crate README) for the new `execute` shape: - Tool-index entry now describes `sql` as an array with multi-element batches running inside a transaction. - Replace the missing-ON-CONFLICT note with a tighter version that shows the array-form upsert pattern directly. Spell out the DDL+DML rejection and SQLSTATE 0A000 background. - Add an upsert example to the Examples block alongside a single- element auto-commit example. - README.md: add a "Transactions" subsection that points at docs/TRANSACTIONS.md for the underlying API and lists the Hyper transaction quirks (DDL auto-commit, post-error aborted state) that motivated the validation rules.
Three new integration tests in transaction_tests.rs that exercise the canonical use cases driving the `execute` array shape: - `batched_upsert_inserts_when_row_missing` — UPDATE + INSERT-WHERE-NOT- EXISTS lands a single row when the target key was absent. - `batched_upsert_updates_when_row_exists` — same pair with the row pre-populated; UPDATE applies, INSERT's WHERE NOT EXISTS suppresses the duplicate. Final cardinality is exactly 1. - `batched_multi_table_mutation_rolls_back_on_second_failure` — three- statement batch (INSERT into orders, second INSERT violates NOT NULL, UPDATE customers) where the failing second statement must roll back the first INSERT entirely. Without atomicity the orders table would carry the orphan row. Tests target `Engine::execute_in_transaction` directly, which is the correctness primitive that backs the new tool surface. The handler itself is mostly validation + classification, both of which are covered by the inline unit tests in commit 1.
Reviewer caught that the prior validation pass classified BEGIN / COMMIT / ROLLBACK / SAVEPOINT / END / START / ABORT / RELEASE as the catch-all `Other` kind, which let them through unchanged. A batch like `["BEGIN", "INSERT...", "COMMIT"]` would then run inside the wrapper's own transaction; the user-issued COMMIT mid-batch would commit early, silently breaking the atomicity guarantee LLMs rely on. Add a `StatementKind::TransactionControl` variant and reject it unconditionally in `validate_execute_batch` (both in singletons and multi-element arrays — even a one-shot BEGIN would leave the connection in an open-transaction state for the next caller). Update both READMEs to spell out the rule, and extend the unit tests to cover every TXN-control keyword we recognize.
Deep-reviewer pass surfaced a handful of cross-cutting concerns. Fixed:
- **Validation order.** `classify_statement` now runs BEFORE
`first_statement_only` inside `validate_execute_batch`. An LLM that
passes `["SELECT 1; SELECT 2"]` now learns to switch to the `query`
tool on the first round-trip instead of being told to split the
array first and only learning about the read-only rule on the retry.
- **`affected_rows` total.** Replaced the `filter_map(...).sum()` over
the JSON envelope with an in-loop `saturating_add` accumulator. No
more silent skip if a per-statement entry's `affected_rows` ever
fails to coerce to `u64`, and overflow clamps instead of wrapping.
- **End-to-end coverage gap.** Added four MCP-handler-level tests in
`end_to_end_mcp_tests.rs` that exercise the full path through rmcp
dispatch (validate → search-path guard → transaction wrapper →
response envelope): atomic upsert commit, mid-batch rollback with
per-statement-index naming, DDL/DML rejection, singleton auto-
commit. The pre-existing two `execute` calls in that file also
needed updating to the new array shape — caught by the new compile.
- **Response shape doc.** The LLM-facing readme now spells out the
response envelope (`{statements, affected_rows, per_statement,
stats}`) so an LLM with a stale prior-shape memory can self-correct.
- **Doc hygiene.** Dropped the dangling `is_structural_sql` reference
from `StatementKind`'s doc comment and added a brief on
`classify_statement` itself (rustdoc convention for `pub` items).
…rror The hand-rolled SQL lexer that detected `;`-separated multi-statement input was YAGNI: its only product was a slightly nicer error message, not a correctness boundary. Hyper itself rejects multi-statement strings on the wire with SQLSTATE `0A000` / "Multi-part queries", and the existing mapping at error.rs:130-134 already rewrites that into "Hyper only accepts one SQL statement per call. Split your query into separate execute/query calls — one per statement." The atomic-batch handler then wraps that into "statement N of M failed: ..." with the same actionable hint. So an LLM that smuggles `["INSERT...", "UPDATE A; UPDATE B"]` through gets effectively the same error after one BEGIN/ROLLBACK round-trip that the lexer would have produced client-side. Not worth ~150 LOC of hand-rolled byte scanning, the lexical-grammar drift risk against Hyper, and the open edge cases (dollar-quoted strings, parameter placeholders, anything else PG-compat that gets added later). Removed: - `first_statement_only` (~120 LOC) and its 7 inline tests - The `EscapeString` / quote/comment-aware state machine - Server-side validation call site + its `rejects_inner_multi_statement` unit test - README bullet about "exactly one statement per element" Kept (still load-bearing for correctness): - `classify_statement` / `StatementKind` for read-only / DDL / DML / TransactionControl rejection. These rules govern semantics, not message quality. Validation rules now: 1) non-empty array; 2) no empty/whitespace elements; 3) no read-only; 4) no transaction control; 5) no DDL+DML mix; 6) no multi-element all-DDL. Multi-statement-in-one-element input is no longer caught client-side — defers to Hyper's wire protocol error and the existing error mapping.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The MCP
executetool'ssqlparameter changes fromStringtoVec<String>. Multi-element arrays run inside a Hyper transaction so all statements commit together or all roll back. The canonical use case is the Hyper-style upsert (noON CONFLICTin Hyper):execute({ "sql": [ "UPDATE settings SET value = 'dark' WHERE key = 'theme'", "INSERT INTO settings (key, value) SELECT 'theme', 'dark' WHERE NOT EXISTS (SELECT 1 FROM settings WHERE key = 'theme')" ]})Single-element arrays (
["UPDATE …"]) skipBEGIN/COMMITentirely and keep the existing auto-commit behavior. The breaking shape change is intentional — the only consumers are LLM clients and an array is the natural shape for batched SQL.Validation rules (enforced before any SQL hits hyperd)
querytool for SELECT/WITH/EXPLAIN.BEGIN/COMMIT/ROLLBACK/SAVEPOINTin array elements are rejected — the wrapper manages the transaction, and a user-issued COMMIT mid-batch would silently break atomicity.0A000).Multi-statement input within a single array element (e.g.
"UPDATE a; UPDATE b") is not caught client-side. The handler defers to Hyper's own wire-protocol error (SQLSTATE0A000/ "Multi-part queries"), and the existing error-mapping rewrites that into "Hyper only accepts one SQL statement per call. Split your query into separate execute/query calls" before it surfaces to the LLM. This keeps validation focused on correctness rules rather than reimplementing a SQL lexer.Response shape
Uniform across single- and multi-statement batches:
{ "statements": 2, "affected_rows": 1, "per_statement": [ { "sql": "UPDATE settings …", "affected_rows": 0, "elapsed_ms": 1.2 }, { "sql": "INSERT INTO settings …", "affected_rows": 1, "elapsed_ms": 0.8 } ], "stats": { "operation": "transaction", "elapsed_ms": 4.7 } }Singletons report
operation: "command"andstatements: 1.Documentation
Both READMEs updated:
hyperdb-mcp/src/readme.rs(theget_readmetool's payload) — tool index, no-ON CONFLICTnote with the array-form pattern, response-shape spec, atomic-upsert example.hyperdb-mcp/README.md— rewrote theexecutetool block, addedUpsertsandTransactionssubsections that referencedocs/TRANSACTIONS.mdfor the underlying API semantics.Tests
6 commits (feature → docs → engine integration tests → fix transaction-control rejection → reviewer follow-ups → drop client-side multi-statement lexer). Test coverage:
engine.rscoveringclassify_statement(DDL/DML/read-only/transaction-control/other classification, comment-stripping).server.rscovering every rejection path plus happy paths (empty array, whitespace-only, read-only element, transaction-control in batch, transaction-control singleton, DDL+DML mix, multi-DDL, single-DDL, multi-DML, allows trailing semicolon).transaction_tests.rsagainst real hyperd: upsert insert path, upsert update path, multi-table rollback on second-statement failure.end_to_end_mcp_tests.rsgoing through full rmcp dispatch (validation, search-path guard, transaction wrapper, response envelope): atomic-upsert commit, mid-batch rollback with per-statement-index naming, DDL/DML rejection, singleton auto-commit.executecalls inend_to_end_mcp_tests.rsupdated to the new array shape.Test plan
hyperdb-mcp:execute({"sql":["CREATE TABLE t (k TEXT, v TEXT)"]}).BEGIN-only singleton is rejected with the "tool manages the transaction" suggestion.["UPDATE a; UPDATE b"]) surfaces Hyper's "Multi-part queries" error with the existing error-mapping suggestion to split into separate elements.