Skip to content

fix(mcp): finish-persistent follow-ups — alias canonicalization, execute reconcile, e2e harness#33

Merged
StefanSteiner merged 5 commits into
tableau:mainfrom
StefanSteiner:ssteiner/mcp-finish-persistent-2
May 26, 2026
Merged

fix(mcp): finish-persistent follow-ups — alias canonicalization, execute reconcile, e2e harness#33
StefanSteiner merged 5 commits into
tableau:mainfrom
StefanSteiner:ssteiner/mcp-finish-persistent-2

Conversation

@StefanSteiner
Copy link
Copy Markdown
Contributor

Summary

Closes the three follow-up items deferred from PR #32 (feat(mcp): finish persistent):

  • M5 — alias canonicalization at attach time. Eliminates the latent footgun where attaching as "User_DB" and detaching as "user_db" silently no-op'd while the catalog-presence cache stayed populated.
  • M4execute(database="foo", sql="DROP TABLE bar") now reconciles the user-attached DB's catalog. Pre-fix, dropped tables left stub rows stranded indefinitely.
  • Iter 6 — end-to-end MCP test harness via tokio::io::duplex. 10 tests exercise tool calls through the full rmcp dispatch path (params deserialization, request-context plumbing, error mapping) — coverage that engine-level tests can't reach.

Plus two issues caught by the final-sweep deeper reviewer that no individual iteration owned (the cross-cutting class the workflow exists to find).

Iteration story

Commit Iter What
bab5193 1 M5 alias canonicalization. AttachRegistry::attach lowercases req.alias after validate_alias. get / detach lowercase their input arg. Engine::resolve_target_db lowercases non-persistent aliases (Hyper is case-sensitive on quoted identifiers). qualified_catalog_in and resolve_catalog_alias lowercase the user-attached branch. detach_database's watcher-active check simplifies from eq_ignore_ascii_case to ==. CHANGELOG entry under "Changed (breaking — pre-1.0)".
53707ee 2 M4 reconcile-on-execute. after_execute_catalog_update gains target_db: Option<&str> and reconciles persistent first, then the user-attached target if non-persistent. The eq_ignore_ascii_case(PERSISTENT_ALIAS) guard is defensive — Iter 1 made resolve_target_db return canonical lowercase, but the helper handles a non-canonical string passed directly. Stale // No-op in bare mode. comment rewritten (the --bare flag was removed in PR #32).
64faee3 3 End-to-end MCP test harness. Phase-3a investigation chose Option A (rmcp's own pattern from tests/test_tool_macros.rs): tokio::io::duplex(64KiB) connecting an in-memory server to a DummyClientHandler, then driving tools via client.call_tool(CallToolRequestParams). No faking of Peer<RoleServer> required — the rmcp client feature is added as a dev-dep to unlock ClientHandler. 9 tests covering 4 happy paths (PR #31 lifted rejections), 3 routing paths, 2 iter-4-5 paths.
d4c3a06 Final-sweep reviewer fixes (see "Reviewer findings" below).
6dad92c Test flake fix: bump TestDaemon::start timeout from 30s to 60s. The outer wait was identical to HyperProcess::new's own 30s callback timeout, so on macOS under load the generic "did not start" assertion masked the real error. 60s gives the inner timeout room to surface via the existing daemon_handle.is_finished() branch.

Workflow

Followed the standing 5-phase plan-driven pattern:

  • Phase 1: surface mapping via single Explore agent — confirmed M4/M5 call sites and discovered ToolCallContext::new is pub (Option B viable; ultimately Option A landed).
  • Phase 2: drafted plan, edited inline with verified facts.
  • Phase 4: per-iteration feature-dev:code-reviewer against each iter. All 3 came back clean (the only flag was a hallucinated "syntax error" the reviewer self-corrected).
  • Phase 5: BOTH reviewers in parallel against the integrated branch. Deeper reviewer caught two real cross-cutting defects that no individual iteration owned — exactly the kind of issue the final-sweep step exists to find. Fixed in d4c3a06.

Reviewer findings (post-iter-3 sweep)

Fixed in this PR (d4c3a06):

  • CRITICAL — copy_query bypassed alias canonicalization. copy_query is the one tool that doesn't go through resolve_db / Engine::resolve_target_db, so the Iter 1 sweep missed it. Pre-fix: attach_database(alias="My_DB") (which Iter 1 stores lowercased as "my_db") followed by copy_query(target_database="My_DB", ...) rendered SQL referring to "My_DB"."public"."t" and failed with "database does not exist" because Hyper is case-sensitive on quoted identifiers. Fix: lowercase target_database after the LOCAL_ALIAS filter so both the registry lookup AND qualified_name agree on the canonical form. New regression test tool_copy_query_target_database_mixed_case_canonicalizes exercises the full attach-then-copy round-trip.

  • MAJOR — after_execute_catalog_update ran on every execute, doubling under M4. reconcile_in walks pg_catalog.pg_tables and runs 2N + 2 SQL round-trips for N existing tables. With Iter 2's M-target fan-out, execute(database="user_db", ...) did 4N + 4. For a user with 50 tables in each of persistent + user_db, a single row-level UPDATE triggered ~204 catalog round-trips. Fix: gate the call on is_structural_sql — same predicate that already guards notify_resource_list_changed two lines later. Reconcile now only fires for CREATE / DROP / ALTER / TRUNCATE / RENAME, which is the semantic justification for it in the first place.

  • MINOR — misleading #[expect(clippy::unused_self)] reasons. Two methods (after_execute_catalog_update, after_ingest_catalog_update) had reason = "kept for symmetry" which suggested &self was decorative. Actually &self is required by Rust's method-call syntax even when the body uses only the engine argument. Reasons updated to "&self required for method-call dispatch; body uses only engine + params".

E2E test harness — what's in it

Four "now works" happy paths (PR #31 rejections lifted by PR #32):

  • tool_load_files_persist_via_router_now_works
  • tool_load_file_merge_database_now_works
  • tool_export_hyper_database_now_works
  • tool_watch_directory_persist_via_router_now_works

Three PR #31 routing / rejection paths:

  • ephemeral_only_plus_persist_returns_invalid_argument
  • database_persistent_case_insensitive_routes_correctly (capital P)
  • persist_true_plus_database_local_lets_database_win

Two iter-4-5 paths:

  • tool_set_table_metadata_database_persistent
  • tool_detach_database_rejects_when_watcher_active

Plus the regression test added in the final sweep:

  • tool_copy_query_target_database_mixed_case_canonicalizes

Each test owns its own harness (server task + embedded hyperd via with_no_daemon=true + tempdir-backed persistent file), and watcher tests unwatch_directory before shutdown so the background tokio task joins cleanly. Each happy-path test asserts both the tool's !is_error AND a downstream effect (rows in the right database, file produced) — is_error: false alone could be true while the rows landed in the wrong place.

Breaking changes (pre-1.0)

  • Attach aliases are canonicalized to lowercase at attach time. attach_database(alias="MyDB", ...) now stores "mydb" in the registry. Engine::resolve_target_db returns the lowercase form for any alias. Affects users who relied on case-sensitive registry distinctness ("foo" vs "FOO" as separate aliases). Documented in CHANGELOG.

Test plan

  • cargo fmt --check && cargo clippy --workspace --tests -- -D warnings — clean
  • cargo test -p hyperdb-mcp — all green (added 4 tests in per_tool_database_tests, 10 in new end_to_end_mcp_tests)
  • cargo test -p hyperdb-mcp --test cross_db_dml_smoke -- --ignored — 8 pass (no change from PR feat(mcp): finish persistent — remove all v1 limitations + per-database catalog #32)
  • Per-iteration adversarial review on iters 1, 2, 3 — all clean
  • Final pre-merge sweep with both reviewers in parallel — 3 fixes landed in d4c3a06
  • cargo doc --no-deps -p hyperdb-mcp — no new warnings
  • daemon_tests re-run 3x locally back-to-back — clean each time after the timeout bump
  • Manual MCP-client smoke for the case-insensitive alias round-trip:
    • attach_database(alias="MyDB", path=...)list_attached_databases() shows "mydb" (canonical)
    • detach_database(alias="MyDB") → succeeds

Pre-fix the registry stored aliases verbatim (case-sensitive `==`)
while `Engine::catalog_present_cache` and the cache helpers all
lowercased their keys. Reachable user scenario:

1. `attach_database(alias="User_DB", writable=true)` — registry
   stores `"User_DB"`.
2. `load_data(database="User_DB", ...)` — populates
   `"User_DB"."public"."_table_catalog"`. Cache key: `"user_db"`.
3. `detach_database(alias="user_db")` — registry says "no such
   alias" (case mismatch), returns `Ok(false)`. Cache stays
   populated under `"user_db"`.
4. `attach_database(alias="user_db", path=...)` to a different
   file — cache hit on `"user_db"` skips the existence probe.
   If the new file lacks `_table_catalog`, subsequent reads
   silently fail.

Fix: lowercase at every alias boundary.

- `AttachRegistry::attach` lowercases `req.alias` after `validate_alias`.
- `AttachRegistry::get` and `detach` lowercase their input arg.
- `Engine::resolve_target_db` lowercases non-persistent aliases so
  qualified SQL identifiers match the form `ATTACH DATABASE` used
  (Hyper is case-sensitive on quoted identifiers).
- `qualified_catalog_in` and `resolve_catalog_alias` lowercase the
  user-attached branch.
- `detach_database` watcher-active check simplifies from
  `eq_ignore_ascii_case` to `==` since both sides are now canonical.

`validate_alias` is unchanged; error messages preserve user-typed
casing for invalid input.

Three new round-trip tests in `per_tool_database_tests.rs` cover:
attach "User_DB" → list shows "user_db" + get("USER_DB") finds it;
detach via differing-case alias succeeds; qualified catalog SQL
written through user-typed mixed case lands in the canonical
lowercase catalog row.

Breaking change documented in CHANGELOG under "Changed (breaking —
pre-1.0)".
`after_execute_catalog_update` only reconciled persistent's
`_table_catalog`. Raw DDL like `execute(database="foo", sql="DROP
TABLE bar")` against a user-attached writable DB removed the table
but left its stub row stranded in `"foo"."public"."_table_catalog"`
indefinitely. Bootstrap reconcile and the post-execute reconcile
both walked persistent only, so subsequent `describe(database="foo")`
calls would keep listing the dropped table.

Fix: thread the resolved `target_db` from the `execute` handler into
`after_execute_catalog_update`, then reconcile persistent first and
the user target if non-persistent. The `eq_ignore_ascii_case` guard
on `PERSISTENT_ALIAS` is defensive — Iter 1 made `resolve_target_db`
return canonical lowercase, but the helper handles a non-canonical
string passed directly by anyone calling it in the future.

The stale `// No-op in bare mode.` comment at the call site is
rewritten to explain the new threading behavior; the `--bare` flag
was removed in PR tableau#32.

New engine-level test
`execute_drop_table_in_user_attached_reconciles_that_dbs_catalog`
mirrors the post-fix reconcile pattern: stub a row in user_db's
catalog, DROP the table via raw DDL, run both reconciles, and
assert the stub is gone from user_db while persistent stays clean.

CHANGELOG entry under "Fixed".
Adds a 9-test end-to-end harness that exercises tool calls through
the full rmcp dispatch path — params deserialization, request-context
plumbing, error mapping — that engine-level tests can't reach.

Phase 3a investigation chose Option A (rmcp's own pattern from
`tests/test_tool_macros.rs`): connect server + minimal `ClientHandler`
across a `tokio::io::duplex(64KiB)` pair, then drive tools via the
public `client.call_tool(CallToolRequestParams)` entry point. No
faking of `Peer<RoleServer>` required.

Adds `rmcp = { ..., features = ["client"] }` to dev-dependencies.

Tests cover:

Happy paths (PR tableau#31 rejections lifted by PR tableau#32):
- tool_load_files_persist_via_router_now_works
- tool_load_file_merge_database_now_works
- tool_export_hyper_database_now_works
- tool_watch_directory_persist_via_router_now_works

PR tableau#31 routing / rejection:
- ephemeral_only_plus_persist_returns_invalid_argument
- database_persistent_case_insensitive_routes_correctly (capital P)
- persist_true_plus_database_local_lets_database_win

Iter 4-5 paths:
- tool_set_table_metadata_database_persistent
- tool_detach_database_rejects_when_watcher_active

Each test owns its own harness (server task + embedded hyperd via
`with_no_daemon=true` + tempdir-backed persistent file), and the
watcher tests `unwatch_directory` before shutdown so the background
tokio task joins cleanly.

Each happy-path test asserts BOTH the tool's success AND a downstream
effect (rows in the right database, file produced) — the success flag
alone could be true while the rows landed in the wrong place.

Dropped from the originally-planned 13 tests:
- case-insensitive `"Local"` (the persist+local-wins test already
  exercises case-insensitive Local resolution inside resolve_db).
- readonly-attached set_table_metadata + user-attached
  set_table_metadata via router. Both are exercised at engine level
  in `per_tool_database_tests.rs`; the router layer just plumbs
  params through, no new behavior to verify.
…reconcile gating

Final pre-merge adversarial sweep against the integrated branch
caught two cross-cutting issues that no individual iteration owned:

1. **CRITICAL — `copy_query` bypassed alias canonicalization.**
   Pre-fix, attaching as `"My_DB"` (which Iter 1's `AttachRegistry`
   now stores lowercased as `"my_db"`) followed by
   `copy_query(target_database="My_DB", ...)` rendered SQL referring
   to `"My_DB"."public"."t"` and failed with "database does not
   exist" because Hyper is case-sensitive on quoted identifiers.
   `copy_query` is the one tool that doesn't go through `resolve_db`
   / `Engine::resolve_target_db`, so the Iter 1 sweep missed it.

   Fix: lowercase `target_database` after the `LOCAL_ALIAS` filter
   so both the registry lookup AND `perform_copy` →
   `qualified_name` agree on the canonical form. New regression test
   `tool_copy_query_target_database_mixed_case_canonicalizes` in the
   E2E harness exercises the full attach-then-copy round-trip.

2. **MAJOR — `after_execute_catalog_update` ran on every `execute`,
   doubling under M4.**
   `reconcile_in` walks `pg_catalog.pg_tables` and runs `2N + 2`
   SQL round-trips for N existing tables. With Iter 2's M-target
   fan-out, `execute(database="user_db", ...)` did `4N + 4`. For a
   user with 50 tables in each of persistent + user_db, a single
   row-level UPDATE triggered ~204 catalog round-trips. INSERT /
   UPDATE / DELETE can't change the table set, so the reconcile is
   wasted work for those.

   Fix: gate the call on `is_structural_sql` — same predicate that
   already guards `notify_resource_list_changed` two lines later.
   Now the reconcile only fires for CREATE / DROP / ALTER / TRUNCATE /
   RENAME, which is the semantic justification for it in the first
   place.

Also fixed two `#[expect(clippy::unused_self)]` reasons that said
"kept for symmetry" but actually mean "method-call dispatch
requires &self even though the body uses only the engine
argument" — line-level reviewer flagged the reasons as misleading.

CHANGELOG updated with both fixes; the structural-gate note rolls
into the existing M4 entry.
The user reported intermittent failures of
`daemon_mode_two_engines_share_same_hyperd` on macOS:

  panicked at hyperdb-mcp/tests/daemon_tests.rs:818:
  TestDaemon did not start within 30 seconds

Root cause: the outer wait was the same length as `HyperProcess::new`'s
own internal 30s timeout for the hyperd-callback connection. Under
macOS load (parallel tests, hyperd subprocess churn, system resource
pressure), hyperd startup can approach or exceed that limit. When it
does, the outer assertion fires before the inner timeout can return
its actual error message — masking the real cause behind a generic
"did not start" panic.

Bumping the outer timeout to 60s ensures the inner timeout has room
to surface a real error via the existing `daemon_handle.is_finished()`
branch. If hyperd genuinely can't start, we'll see "daemon thread
errored: <real reason>" instead of the unhelpful generic message.

Also tightens the panic message to clarify the failure mode (daemon
thread still running, no discovery file written) so the next reader
knows whether to look at hyperd-startup vs. discovery-write paths.
@StefanSteiner StefanSteiner merged commit 242be20 into tableau:main May 26, 2026
18 of 19 checks passed
@StefanSteiner StefanSteiner mentioned this pull request May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant