Skip to content

chore(mcp): atomic multi-statement execute via array sql parameter#49

Merged
StefanSteiner merged 6 commits into
tableau:mainfrom
StefanSteiner:ssteiner/atomic-multi-statement-execute
May 27, 2026
Merged

chore(mcp): atomic multi-statement execute via array sql parameter#49
StefanSteiner merged 6 commits into
tableau:mainfrom
StefanSteiner:ssteiner/atomic-multi-statement-execute

Conversation

@StefanSteiner
Copy link
Copy Markdown
Contributor

Summary

The MCP execute tool's sql parameter changes from String to Vec<String>. Multi-element arrays run inside a Hyper transaction so all statements commit together or all roll back. The canonical use case is the Hyper-style upsert (no ON CONFLICT in Hyper):

execute({ "sql": [
  "UPDATE settings SET value = 'dark' WHERE key = 'theme'",
  "INSERT INTO settings (key, value) SELECT 'theme', 'dark'
     WHERE NOT EXISTS (SELECT 1 FROM settings WHERE key = 'theme')"
]})

Single-element arrays (["UPDATE …"]) skip BEGIN/COMMIT entirely and keep the existing auto-commit behavior. The breaking shape change is intentional — the only consumers are LLM clients and an array is the natural shape for batched SQL.

Validation rules (enforced before any SQL hits hyperd)

  1. Non-empty array; no empty / whitespace-only / comment-only elements.
  2. No element may be read-only — use the query tool for SELECT/WITH/EXPLAIN.
  3. Explicit BEGIN / COMMIT / ROLLBACK / SAVEPOINT in array elements are rejected — the wrapper manages the transaction, and a user-issued COMMIT mid-batch would silently break atomicity.
  4. DDL+DML cannot be mixed in one batch (Hyper aborts mixed transactions with 0A000).
  5. Multi-element all-DDL batches are rejected — Hyper auto-commits CREATE/DROP/ALTER even inside a transaction, so the "atomic" promise can't be honored.

Multi-statement input within a single array element (e.g. "UPDATE a; UPDATE b") is not caught client-side. The handler defers to Hyper's own wire-protocol error (SQLSTATE 0A000 / "Multi-part queries"), and the existing error-mapping rewrites that into "Hyper only accepts one SQL statement per call. Split your query into separate execute/query calls" before it surfaces to the LLM. This keeps validation focused on correctness rules rather than reimplementing a SQL lexer.

Response shape

Uniform across single- and multi-statement batches:

{
  "statements": 2,
  "affected_rows": 1,
  "per_statement": [
    { "sql": "UPDATE settings …", "affected_rows": 0, "elapsed_ms": 1.2 },
    { "sql": "INSERT INTO settings …", "affected_rows": 1, "elapsed_ms": 0.8 }
  ],
  "stats": { "operation": "transaction", "elapsed_ms": 4.7 }
}

Singletons report operation: "command" and statements: 1.

Documentation

Both READMEs updated:

  • hyperdb-mcp/src/readme.rs (the get_readme tool's payload) — tool index, no-ON CONFLICT note with the array-form pattern, response-shape spec, atomic-upsert example.
  • hyperdb-mcp/README.md — rewrote the execute tool block, added Upserts and Transactions subsections that reference docs/TRANSACTIONS.md for the underlying API semantics.

Tests

6 commits (feature → docs → engine integration tests → fix transaction-control rejection → reviewer follow-ups → drop client-side multi-statement lexer). Test coverage:

  • 3 helper unit tests in engine.rs covering classify_statement (DDL/DML/read-only/transaction-control/other classification, comment-stripping).
  • 10 validator unit tests in server.rs covering every rejection path plus happy paths (empty array, whitespace-only, read-only element, transaction-control in batch, transaction-control singleton, DDL+DML mix, multi-DDL, single-DDL, multi-DML, allows trailing semicolon).
  • 3 engine-level integration tests in transaction_tests.rs against real hyperd: upsert insert path, upsert update path, multi-table rollback on second-statement failure.
  • 4 MCP-handler-level integration tests in end_to_end_mcp_tests.rs going through full rmcp dispatch (validation, search-path guard, transaction wrapper, response envelope): atomic-upsert commit, mid-batch rollback with per-statement-index naming, DDL/DML rejection, singleton auto-commit.
  • Pre-existing execute calls in end_to_end_mcp_tests.rs updated to the new array shape.

Test plan

  • CI green on the PR (workspace lint/build/test).
  • Manual smoke through Claude Code / Cursor against a freshly-built hyperdb-mcp:
    • Single-element happy path: execute({"sql":["CREATE TABLE t (k TEXT, v TEXT)"]}).
    • Atomic upsert: the canonical UPDATE+INSERT-WHERE-NOT-EXISTS pair, twice (insert path on first call, update path on second).
    • Forced rollback: a two-element batch where the second statement violates a constraint; verify the first row is not present afterward.
    • DDL/DML mix rejection surfaces with the SQLSTATE-0A000 reference in the suggestion.
    • BEGIN-only singleton is rejected with the "tool manages the transaction" suggestion.
    • Inner-multi-statement element (e.g. ["UPDATE a; UPDATE b"]) surfaces Hyper's "Multi-part queries" error with the existing error-mapping suggestion to split into separate elements.

The execute tool's `sql` parameter changes from `String` to `Vec<String>`.
A multi-element array runs inside a Hyper transaction so all statements
commit together or all roll back. The canonical use case is the Hyper
upsert pattern (no ON CONFLICT support):

  execute({ "sql": [
    "UPDATE settings SET value = 'dark' WHERE key = 'theme'",
    "INSERT INTO settings (key, value) SELECT 'theme', 'dark'
       WHERE NOT EXISTS (SELECT 1 FROM settings WHERE key = 'theme')"
  ]})

Validation rules enforced before any SQL hits the server:
- Non-empty array; no empty / whitespace-only / comment-only elements.
- Each element must contain exactly one statement (quote- and
  comment-aware lexer rejects embedded `;`-separated multi-statements,
  with E-string and line/block-comment handling).
- No element may be read-only (use the query tool for SELECT/WITH).
- DDL+DML cannot be mixed in one batch (Hyper aborts mixed transactions
  with SQLSTATE 0A000).
- Multi-element all-DDL batches are rejected because Hyper auto-commits
  CREATE/DROP/ALTER even inside a transaction.

Single-element batches skip BEGIN/COMMIT and keep the existing auto-
commit behavior, so a one-off DDL or DML stays a one-round-trip call.

The breaking shape change (string -> array) is intentional — the only
consumers are LLM clients and an array is the natural shape for batched
SQL.

Adds 18 inline unit tests covering the helper lexer (E-strings, doubled
quotes, nested block comments, trailing semicolons) and the batch
validator (every rejection path plus happy paths). Integration tests
against real hyperd land in a follow-up commit.
Update both READMEs (LLM-facing get_readme output + the published crate
README) for the new `execute` shape:

- Tool-index entry now describes `sql` as an array with multi-element
  batches running inside a transaction.
- Replace the missing-ON-CONFLICT note with a tighter version that
  shows the array-form upsert pattern directly. Spell out the DDL+DML
  rejection and SQLSTATE 0A000 background.
- Add an upsert example to the Examples block alongside a single-
  element auto-commit example.
- README.md: add a "Transactions" subsection that points at
  docs/TRANSACTIONS.md for the underlying API and lists the Hyper
  transaction quirks (DDL auto-commit, post-error aborted state)
  that motivated the validation rules.
Three new integration tests in transaction_tests.rs that exercise the
canonical use cases driving the `execute` array shape:

- `batched_upsert_inserts_when_row_missing` — UPDATE + INSERT-WHERE-NOT-
  EXISTS lands a single row when the target key was absent.
- `batched_upsert_updates_when_row_exists` — same pair with the row
  pre-populated; UPDATE applies, INSERT's WHERE NOT EXISTS suppresses
  the duplicate. Final cardinality is exactly 1.
- `batched_multi_table_mutation_rolls_back_on_second_failure` — three-
  statement batch (INSERT into orders, second INSERT violates NOT NULL,
  UPDATE customers) where the failing second statement must roll back
  the first INSERT entirely. Without atomicity the orders table would
  carry the orphan row.

Tests target `Engine::execute_in_transaction` directly, which is the
correctness primitive that backs the new tool surface. The handler
itself is mostly validation + classification, both of which are covered
by the inline unit tests in commit 1.
Reviewer caught that the prior validation pass classified BEGIN /
COMMIT / ROLLBACK / SAVEPOINT / END / START / ABORT / RELEASE as the
catch-all `Other` kind, which let them through unchanged. A batch like
`["BEGIN", "INSERT...", "COMMIT"]` would then run inside the wrapper's
own transaction; the user-issued COMMIT mid-batch would commit early,
silently breaking the atomicity guarantee LLMs rely on.

Add a `StatementKind::TransactionControl` variant and reject it
unconditionally in `validate_execute_batch` (both in singletons and
multi-element arrays — even a one-shot BEGIN would leave the
connection in an open-transaction state for the next caller). Update
both READMEs to spell out the rule, and extend the unit tests to
cover every TXN-control keyword we recognize.
Deep-reviewer pass surfaced a handful of cross-cutting concerns. Fixed:

- **Validation order.** `classify_statement` now runs BEFORE
  `first_statement_only` inside `validate_execute_batch`. An LLM that
  passes `["SELECT 1; SELECT 2"]` now learns to switch to the `query`
  tool on the first round-trip instead of being told to split the
  array first and only learning about the read-only rule on the retry.

- **`affected_rows` total.** Replaced the `filter_map(...).sum()` over
  the JSON envelope with an in-loop `saturating_add` accumulator. No
  more silent skip if a per-statement entry's `affected_rows` ever
  fails to coerce to `u64`, and overflow clamps instead of wrapping.

- **End-to-end coverage gap.** Added four MCP-handler-level tests in
  `end_to_end_mcp_tests.rs` that exercise the full path through rmcp
  dispatch (validate → search-path guard → transaction wrapper →
  response envelope): atomic upsert commit, mid-batch rollback with
  per-statement-index naming, DDL/DML rejection, singleton auto-
  commit. The pre-existing two `execute` calls in that file also
  needed updating to the new array shape — caught by the new compile.

- **Response shape doc.** The LLM-facing readme now spells out the
  response envelope (`{statements, affected_rows, per_statement,
  stats}`) so an LLM with a stale prior-shape memory can self-correct.

- **Doc hygiene.** Dropped the dangling `is_structural_sql` reference
  from `StatementKind`'s doc comment and added a brief on
  `classify_statement` itself (rustdoc convention for `pub` items).
…rror

The hand-rolled SQL lexer that detected `;`-separated multi-statement
input was YAGNI: its only product was a slightly nicer error message,
not a correctness boundary. Hyper itself rejects multi-statement
strings on the wire with SQLSTATE `0A000` / "Multi-part queries", and
the existing mapping at error.rs:130-134 already rewrites that into
"Hyper only accepts one SQL statement per call. Split your query into
separate execute/query calls — one per statement." The atomic-batch
handler then wraps that into "statement N of M failed: ..." with the
same actionable hint.

So an LLM that smuggles `["INSERT...", "UPDATE A; UPDATE B"]` through
gets effectively the same error after one BEGIN/ROLLBACK round-trip
that the lexer would have produced client-side. Not worth ~150 LOC
of hand-rolled byte scanning, the lexical-grammar drift risk against
Hyper, and the open edge cases (dollar-quoted strings, parameter
placeholders, anything else PG-compat that gets added later).

Removed:
- `first_statement_only` (~120 LOC) and its 7 inline tests
- The `EscapeString` / quote/comment-aware state machine
- Server-side validation call site + its `rejects_inner_multi_statement`
  unit test
- README bullet about "exactly one statement per element"

Kept (still load-bearing for correctness):
- `classify_statement` / `StatementKind` for read-only / DDL / DML /
  TransactionControl rejection. These rules govern semantics, not
  message quality.

Validation rules now: 1) non-empty array; 2) no empty/whitespace
elements; 3) no read-only; 4) no transaction control; 5) no DDL+DML
mix; 6) no multi-element all-DDL. Multi-statement-in-one-element
input is no longer caught client-side — defers to Hyper's wire
protocol error and the existing error mapping.
@StefanSteiner StefanSteiner merged commit 15ddc8e into tableau:main May 27, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant