feat(mcp): finish persistent — remove all v1 limitations + per-database catalog#32
Merged
StefanSteiner merged 9 commits intoMay 25, 2026
Conversation
The daemon_mode_persistent_engine_data_is_queryable test (and others using TestDaemon) periodically failed in CI with a 30s startup timeout. Root cause: find_free_port() bound port 0, read the assigned port, *dropped the listener*, then set HYPERDB_DAEMON_PORT for the daemon thread to bind that same port a moment later. Between drop and rebind another process could grab the port — daemon's bind would fail, run_daemon would Err out silently, and the test would hit the 30s timeout instead of failing fast with a useful message. Fix: pass port 0 to DaemonConfig and let HealthListener::bind(0) pick a free port itself. The actual port is reported back via the discovery file (info.health_port), which is what tests already read. The HYPERDB_DAEMON_PORT env var only matters for the spawn-from-scratch client path; once the daemon writes the discovery file, clients use its health_port directly. We still set HYPERDB_DAEMON_PORT="0" so the two `resolve_port_*` tests that explicitly inspect the env-var resolution path keep their isolation. Also: capture the daemon thread JoinHandle and check is_finished() on each iteration of the readiness loop, so future bind/spawn errors fail fast (~ms) with the real error rather than after the 30s timeout.
Iter 0 of the "remove v1 limitations" plan. Verifies Hyper supports
the five cross-database SQL shapes the merge-in-target-DB design
depends on, before any plumbing work begins:
1. CREATE TABLE "alias"."public"."tmp" AS SELECT ...
2. DELETE FROM "alias"."public"."t" t USING "alias"."public"."tmp" s
WHERE t.k = s.k
3. INSERT INTO "alias"."public"."t" SELECT * FROM "alias"."public"."tmp"
4. ALTER TABLE "alias"."public"."t" ADD COLUMN c TEXT
5. Qualified pg_catalog.{pg_tables,pg_attribute,pg_type,pg_class,
pg_namespace} probes for table_exists_in / column_metadata_in.
All five pass against a writable user-attached database, so the
merge path's plan to keep the temp table in the target DB (rather
than primary, which would require cross-DB DML) is viable.
Marked #[ignore] so they only run on demand:
cargo test -p hyperdb-mcp --test cross_db_dml_smoke -- --ignored
Lifts the v1 limitation that rejected `load_file` with mode="merge" + a non-primary `database` parameter. The merge path now keeps the temp table inside the same database as the target — eliminating cross-DB DML entirely (the temp and final table share a DB) and unlocking merges into persistent or any user-attached writable DB. Engine helpers (engine.rs): - table_exists_in(target_db, table) — qualified pg_tables probe - column_metadata_in(target_db, table) — qualified pg_attribute join via the existing describe_columns_via_pg_catalog helper - alter_table_add_columns_in(target_db, table, cols) — qualified ALTER Each falls back to the un-suffixed Catalog-API path when target_db is None, preserving the fast path for unqualified merges. Merge path (ingest.rs): - tmp_opts inherits opts.target_db so the temp lands in the target DB - TempTableGuard records target_db and qualifies its DROP - All five SQL sites (table_exists, column_metadata×3, alter_table_add_columns) route through the new *_in variants - DELETE-USING and INSERT-SELECT use qualified_table against the target; RENAME TO emits "alias"."public"."tmp" RENAME TO "newname" (qualified source, unqualified target — PostgreSQL/Hyper semantics) Server (server.rs): - Drops the InvalidArgument rejection in load_file's handler - Removes the matching note from SetTableMetadataParams' doc-comment Tests (per_tool_database_tests.rs): - merge_into_persistent_creates_table_when_missing — rename path - merge_into_persistent_replaces_matching_and_appends_new — DELETE-USING + INSERT-SELECT path - merge_into_persistent_alters_when_incoming_has_new_column — ALTER ADD COLUMN path with NULL backfill The pre-flight smoke battery (cross_db_dml_smoke.rs, Iter 0) verified all five SQL shapes work against an attached writable DB before any plumbing.
Lifts the v1 limitation that rejected `export(format="hyper", database=X)`
for non-primary X. The hyper-format export now snapshots whichever
database the caller named, instead of always snapshotting the primary
ephemeral workspace.
ExportOptions gains `source_db: Option<String>`. `export_hyper` and
`populate_export_target` thread it through to the cross-DB
`CREATE TABLE AS SELECT` (one per user table per schema), with `None`
falling back to `engine.primary_db_name()` so existing single-DB
callers keep working unchanged.
Server handler drops the up-front InvalidArgument rejection. The
underlying cross-DB CTAS shape was verified by Iter 0's smoke
battery before any plumbing.
New test: `export_hyper_with_source_db_snapshots_persistent` writes
distinct tables into primary and persistent, exports with
`source_db=Some("persistent")`, opens the snapshot fresh, and
asserts only the persistent table appears — primary's table is
excluded as expected.
Existing `ExportOptions { ... }` literals across export_tests.rs
and integration_tests.rs each gain `source_db: None` for the
preserved-behavior callers.
Lifts the v1 limitations that rejected `database`/`persist` on
`load_files` and `watch_directory`. Both tools now build their
connection pool against whichever .hyper file the resolved target
points at — primary, persistent, or any user-attached writable
database. Adds a guard so `detach_database` refuses to detach an
alias an active watcher is using.
Design choice: instead of qualifying SQL by alias on the pool's
connections, the pool opens the target's .hyper file directly as
its workspace. From those connections' perspective the target IS
the primary database, so SQL stays unqualified. The user-supplied
alias is used for path resolution and the catalog-skip decision
only.
watcher.rs:
- New resolve_pool_workspace(eng, attachments, target_db) returns
the file path the pool should open. Rejects unattached aliases
and read-only attachments up front.
- build_watcher_pool / rebuild_watcher_pool gain attachments +
target_db parameters.
- WatcherHandle gains target_db: Option<String> so detach can
inspect it; also surfaced in the watcher status JSON.
- start_watching gains attachments + target_db.
server.rs:
- load_files and watch_directory drop their up-front rejections.
load_files resolves the target once and inlines the workspace-
path resolution (mirrors resolve_pool_workspace) so it can build
the pool against the right file. watch_directory threads the
resolved target into start_watching.
- detach_database enumerates active watchers and refuses to detach
an alias any of them target. Error message names the directory
to unwatch_directory first.
- after_ingest_catalog_update gains target_db: Option<&str> and
short-circuits non-persistent user-attached targets at debug
level (per-DB catalog ships in a follow-up iter).
- Doc-comments on LoadFilesParams / WatchDirectoryParams updated
to reflect the new behavior.
tests/watcher_tests.rs:
- Bulk-update existing start_watching call sites (5) to pass
Arc::new(AttachRegistry::new()) and target_db: None.
- New watcher_ingests_into_persistent_target verifies a watcher
with target_db=Some("persistent") writes rows visible at
"persistent"."public"."events" but NOT in primary.
This iteration removes 3 of the 4 documented v1 limitations; the
remaining one (per-database _table_catalog) is the subject of the
next iter.
Closes the catalog gap surfaced in PR tableau#31 review. Every writable database (persistent + user-attached) now gets its own _table_catalog, lazily seeded on first ingest. The iter-3 placeholder that short-circuited non-persistent targets in after_ingest_catalog_update is removed. Pre-flight smoke tests (cross_db_dml_smoke.rs) verified that Hyper rejects PRIMARY KEY ("Index support is disabled"), ALTER TABLE ADD CONSTRAINT ("named constraints not implemented"), and INSERT … ON CONFLICT ("syntax error: got ON, expected FETCH, FOR, LIMIT, OFFSET"). The original plan for atomic upsert backed by a PK is therefore not viable. The DELETE+INSERT-in-transaction pattern stays. Within one MCP server engine, every catalog write runs inside HyperMcpServer::with_engine which holds the engine mutex for the duration, so concurrent ingests serialize at the catalog write boundary even though data writes run in parallel through the watcher pool's separate connections. Cross-process concurrency on the same .hyper file is out of scope. Three rejection tests (`expect_err`-asserting) document Hyper's limitations and serve as a regression alarm if a future Hyper release adds support. table_catalog.rs: - New *_in family: ensure_exists_in, upsert_stub_in, set_metadata_in, get_in, list_in, delete_for_in, reconcile_in, each taking target_db: Option<&str>. - Old names kept as 1-line wrappers passing None. - qualified_catalog_in resolves None/empty/"persistent" → persistent, Some(alias) → that alias verbatim. Case-insensitive on the persistent special-case. - resolve_catalog_alias is the alias-only sibling for callers that build their own SQL. - Internal helpers (user_tables, table_present, row_count_of, refresh_row_count) all gain _in variants. - set_metadata_in error message names the target DB when the target is a non-persistent alias. engine.rs: - catalog_present_cache: Mutex<Option<bool>> → Mutex<HashMap<String, bool>> keyed by lowercased alias. - catalog_present_in_persistent → catalog_present_in(alias, prober). - mark_catalog_present → mark_catalog_present_for(alias). - NEW clear_catalog_cache_for(alias). server.rs: - after_ingest_catalog_update routes via upsert_stub_in(target_db). - detach_database calls clear_catalog_cache_for inside with_engine on successful detach so a re-attach with different file/writability doesn't reuse a stale cache entry. attach.rs: - on_missing=create path uses ensure_exists_in(engine, Some(alias)) (the deprecated wrapper still works but the new name is preferred). tests: - cross_db_dml_smoke.rs: 3 new #[ignore]'d tests that document Hyper's rejection of PK / ADD CONSTRAINT / ON CONFLICT. - per_tool_database_tests.rs: 4 new tests covering ensure_exists_in + upsert_stub_in + get_in + reconcile_in routing into a user-attached writable DB. Each verifies the row lands in the user-attached catalog, NOT in persistent's. - two_db_model_tests.rs: catalog_presence_probe_is_cached rewritten for the HashMap variant; new clear_catalog_cache_for_invalidates_ alias_entry test. This is breaking at the engine API surface (catalog_present_in_* methods renamed; cache shape changed). Pre-1.0, deliberate.
Closes the user-attached friendliness gap surfaced in PR tableau#31 review. set_table_metadata now accepts a `database: Option<String>` field that routes the catalog write to that database's _table_catalog instead of always targeting persistent's. The handler resolves via `resolve_db(require_writable=true)`, so a read-only attachment is rejected up front with a clear "re-attach with writable:true" message — before any catalog write is attempted. Threads through to `table_catalog::set_metadata_in(target_db)` (landed in the prior iter), which lazily seeds the per-DB _table_catalog if absent. Works seamlessly with the recently- landed cross-DB ingest paths: load_data into a user-attached DB auto-stubs the catalog row, then set_table_metadata updates prose fields in that same DB's catalog. Two new tests: - set_metadata_in_routes_to_user_attached_db verifies the prose update lands in the user-attached DB's _table_catalog (visible via qualified SELECT) and not in persistent's. - set_metadata_in_missing_row_names_target_database_in_error asserts the TableNotFound message names the target DB so the LLM gets actionable guidance ("load it via load_data with database='bar' first") instead of a confusing missing-row error.
README: - The `database` parameter is now listed for `load_files`, `watch_directory`, and `set_table_metadata` (added in iters 3-5). - The `persist: true` shorthand list expanded to include `load_files` and `watch_directory`. - "v1 limitations" paragraph removed. - Per-database `_table_catalog` semantics documented: every writable database (persistent + user-attached) gets its own catalog, lazily seeded on first ingest. Pristine `.hyper` recipe updated to use `<alias>` since persistent is no longer the only option. - New "Detach safety" note documenting the watcher-active rejection on `detach_database`. CHANGELOG [Unreleased]: - "Limitations (deferred to a follow-up)" subsection removed. - New entries for cross-database merge, export-hyper source_db, load_files/watch_directory persist, per-database _table_catalog, detach guard, and set_table_metadata.database. - New "Changed (breaking — pre-1.0)" subsection lists the engine cache rename and the table_catalog deprecated wrapper. - "Performance" entry updated for the per-DB cache shape.
…s + docs) Final-sweep adversarial review surfaced six issues this commit fixes: C1 (CRITICAL): watcher ingest never updated _table_catalog. process_ready_with_recovery's success branch removed the .ready / data files and bumped stats but never called the catalog upsert. Effect: a watcher-driven ingest into persistent (or any user- attached writable target after iters 3-4) left the table un- catalogued. set_table_metadata against that table would error TableNotFound even though the data exists. Fix: take the engine mutex on the success branch and call upsert_stub_in with load_tool="watch_directory" + the resolved target_db. Errors are logged but never block the success bookkeeping. Test: watcher_ingests_into_persistent_target now also asserts the _table_catalog row exists with load_tool='watch_directory' after ingest. M1: set_metadata_in created _table_catalog as a side effect of returning EmptyData / TableNotFound errors. ensure_exists_in ran BEFORE the empty-fields validation and BEFORE the missing-row check, so a no-op or misguided call permanently mutated the target DB's schema. Fix: validate fields.is_empty() first; let get_in itself return None when the catalog table doesn't yet exist (no seed). Removes ensure_exists_in from set_metadata_in entirely — by the time we know we're writing, get_in has already returned an existing row, which means the catalog already exists. M2: the LLM-facing readme.rs (returned by get_readme) did not mention the persistent attachment, the database/persist tool parameters, or per-DB _table_catalog. From the model's perspective post-merge, the orientation document was stale. Added a "Workspace model" section listing the three database kinds (ephemeral / persistent / user-attached), the tools that accept database/persist, and the watcher / detach interaction. M3: stale "// short-circuits non-persistent targets (per-DB catalog ships in a follow-up iter)" comments at three sites in server.rs (load_data, load_file, load_files). Iter 4 is the follow-up; the helper now routes per-DB. Comments rewritten. m2: README.md had a byte-for-byte duplicated block of three sections (Daemon management / Recovery from hyperd crashes / Other behavioral flags) at lines 225-279. Pre-existing, surfaced during this review. Deleted the second copy. m5: top-level tool descriptions (the strings the LLM sees in tools/list) didn't mention the new database / persist capability on load_files, watch_directory, export, or set_table_metadata. The struct field doc-comments were correct but rmcp's JsonSchema-derived descriptions don't include them. Added a brief sentence to each of the four tool descriptions. Deferred to a follow-up PR (also surfaced in the same review): - M4: bootstrap reconcile only targets persistent; user-attached catalogs go stale over a long-lived session. Not introduced by this branch. - M5: alias canonicalization is case-sensitive in the registry but case-insensitive in the catalog cache. Latent footgun if a user attaches as "User_DB" and detaches as "user_db". Not introduced by this branch either.
Merged
8 tasks
StefanSteiner
added a commit
that referenced
this pull request
May 26, 2026
…ute reconcile, e2e harness (#33) * fix(mcp): canonicalize attach aliases to lowercase at attach time Pre-fix the registry stored aliases verbatim (case-sensitive `==`) while `Engine::catalog_present_cache` and the cache helpers all lowercased their keys. Reachable user scenario: 1. `attach_database(alias="User_DB", writable=true)` — registry stores `"User_DB"`. 2. `load_data(database="User_DB", ...)` — populates `"User_DB"."public"."_table_catalog"`. Cache key: `"user_db"`. 3. `detach_database(alias="user_db")` — registry says "no such alias" (case mismatch), returns `Ok(false)`. Cache stays populated under `"user_db"`. 4. `attach_database(alias="user_db", path=...)` to a different file — cache hit on `"user_db"` skips the existence probe. If the new file lacks `_table_catalog`, subsequent reads silently fail. Fix: lowercase at every alias boundary. - `AttachRegistry::attach` lowercases `req.alias` after `validate_alias`. - `AttachRegistry::get` and `detach` lowercase their input arg. - `Engine::resolve_target_db` lowercases non-persistent aliases so qualified SQL identifiers match the form `ATTACH DATABASE` used (Hyper is case-sensitive on quoted identifiers). - `qualified_catalog_in` and `resolve_catalog_alias` lowercase the user-attached branch. - `detach_database` watcher-active check simplifies from `eq_ignore_ascii_case` to `==` since both sides are now canonical. `validate_alias` is unchanged; error messages preserve user-typed casing for invalid input. Three new round-trip tests in `per_tool_database_tests.rs` cover: attach "User_DB" → list shows "user_db" + get("USER_DB") finds it; detach via differing-case alias succeeds; qualified catalog SQL written through user-typed mixed case lands in the canonical lowercase catalog row. Breaking change documented in CHANGELOG under "Changed (breaking — pre-1.0)". * fix(mcp): reconcile user-attached target's catalog after execute `after_execute_catalog_update` only reconciled persistent's `_table_catalog`. Raw DDL like `execute(database="foo", sql="DROP TABLE bar")` against a user-attached writable DB removed the table but left its stub row stranded in `"foo"."public"."_table_catalog"` indefinitely. Bootstrap reconcile and the post-execute reconcile both walked persistent only, so subsequent `describe(database="foo")` calls would keep listing the dropped table. Fix: thread the resolved `target_db` from the `execute` handler into `after_execute_catalog_update`, then reconcile persistent first and the user target if non-persistent. The `eq_ignore_ascii_case` guard on `PERSISTENT_ALIAS` is defensive — Iter 1 made `resolve_target_db` return canonical lowercase, but the helper handles a non-canonical string passed directly by anyone calling it in the future. The stale `// No-op in bare mode.` comment at the call site is rewritten to explain the new threading behavior; the `--bare` flag was removed in PR #32. New engine-level test `execute_drop_table_in_user_attached_reconciles_that_dbs_catalog` mirrors the post-fix reconcile pattern: stub a row in user_db's catalog, DROP the table via raw DDL, run both reconciles, and assert the stub is gone from user_db while persistent stays clean. CHANGELOG entry under "Fixed". * test(mcp): add end-to-end MCP test harness via tokio::io::duplex Adds a 9-test end-to-end harness that exercises tool calls through the full rmcp dispatch path — params deserialization, request-context plumbing, error mapping — that engine-level tests can't reach. Phase 3a investigation chose Option A (rmcp's own pattern from `tests/test_tool_macros.rs`): connect server + minimal `ClientHandler` across a `tokio::io::duplex(64KiB)` pair, then drive tools via the public `client.call_tool(CallToolRequestParams)` entry point. No faking of `Peer<RoleServer>` required. Adds `rmcp = { ..., features = ["client"] }` to dev-dependencies. Tests cover: Happy paths (PR #31 rejections lifted by PR #32): - tool_load_files_persist_via_router_now_works - tool_load_file_merge_database_now_works - tool_export_hyper_database_now_works - tool_watch_directory_persist_via_router_now_works PR #31 routing / rejection: - ephemeral_only_plus_persist_returns_invalid_argument - database_persistent_case_insensitive_routes_correctly (capital P) - persist_true_plus_database_local_lets_database_win Iter 4-5 paths: - tool_set_table_metadata_database_persistent - tool_detach_database_rejects_when_watcher_active Each test owns its own harness (server task + embedded hyperd via `with_no_daemon=true` + tempdir-backed persistent file), and the watcher tests `unwatch_directory` before shutdown so the background tokio task joins cleanly. Each happy-path test asserts BOTH the tool's success AND a downstream effect (rows in the right database, file produced) — the success flag alone could be true while the rows landed in the wrong place. Dropped from the originally-planned 13 tests: - case-insensitive `"Local"` (the persist+local-wins test already exercises case-insensitive Local resolution inside resolve_db). - readonly-attached set_table_metadata + user-attached set_table_metadata via router. Both are exercised at engine level in `per_tool_database_tests.rs`; the router layer just plumbs params through, no new behavior to verify. * fix(mcp): final-sweep follow-ups — copy_query canonicalize + execute reconcile gating Final pre-merge adversarial sweep against the integrated branch caught two cross-cutting issues that no individual iteration owned: 1. **CRITICAL — `copy_query` bypassed alias canonicalization.** Pre-fix, attaching as `"My_DB"` (which Iter 1's `AttachRegistry` now stores lowercased as `"my_db"`) followed by `copy_query(target_database="My_DB", ...)` rendered SQL referring to `"My_DB"."public"."t"` and failed with "database does not exist" because Hyper is case-sensitive on quoted identifiers. `copy_query` is the one tool that doesn't go through `resolve_db` / `Engine::resolve_target_db`, so the Iter 1 sweep missed it. Fix: lowercase `target_database` after the `LOCAL_ALIAS` filter so both the registry lookup AND `perform_copy` → `qualified_name` agree on the canonical form. New regression test `tool_copy_query_target_database_mixed_case_canonicalizes` in the E2E harness exercises the full attach-then-copy round-trip. 2. **MAJOR — `after_execute_catalog_update` ran on every `execute`, doubling under M4.** `reconcile_in` walks `pg_catalog.pg_tables` and runs `2N + 2` SQL round-trips for N existing tables. With Iter 2's M-target fan-out, `execute(database="user_db", ...)` did `4N + 4`. For a user with 50 tables in each of persistent + user_db, a single row-level UPDATE triggered ~204 catalog round-trips. INSERT / UPDATE / DELETE can't change the table set, so the reconcile is wasted work for those. Fix: gate the call on `is_structural_sql` — same predicate that already guards `notify_resource_list_changed` two lines later. Now the reconcile only fires for CREATE / DROP / ALTER / TRUNCATE / RENAME, which is the semantic justification for it in the first place. Also fixed two `#[expect(clippy::unused_self)]` reasons that said "kept for symmetry" but actually mean "method-call dispatch requires &self even though the body uses only the engine argument" — line-level reviewer flagged the reasons as misleading. CHANGELOG updated with both fixes; the structural-gate note rolls into the existing M4 entry. * fix(tests): bump TestDaemon startup timeout from 30s to 60s The user reported intermittent failures of `daemon_mode_two_engines_share_same_hyperd` on macOS: panicked at hyperdb-mcp/tests/daemon_tests.rs:818: TestDaemon did not start within 30 seconds Root cause: the outer wait was the same length as `HyperProcess::new`'s own internal 30s timeout for the hyperd-callback connection. Under macOS load (parallel tests, hyperd subprocess churn, system resource pressure), hyperd startup can approach or exceed that limit. When it does, the outer assertion fires before the inner timeout can return its actual error message — masking the real cause behind a generic "did not start" panic. Bumping the outer timeout to 60s ensures the inner timeout has room to surface a real error via the existing `daemon_handle.is_finished()` branch. If hyperd genuinely can't start, we'll see "daemon thread errored: <real reason>" instead of the unhelpful generic message. Also tightens the panic message to clarify the failure mode (daemon thread still running, no discovery file written) so the next reader knows whether to look at hyperd-startup vs. discovery-write paths.
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the four documented v1 rejection paths from PR #31 plus the two catalog gaps surfaced in that PR's adversarial review. After this branch:
load_filewithmode: "merge"accepts any writabledatabase.export(format="hyper", database=X)snapshots the requested database.load_filesandwatch_directoryacceptdatabase/persist; their pool opens the target's.hyperfile directly._table_catalog, lazily seeded on first ingest.set_table_metadataacceptsdatabaseto route the catalog write.detach_databaserejects when an active watcher targets the alias.Iteration story
a5c2f89#[ignore]'d tests for cross-DB CTAS, DELETE-USING, INSERT-SELECT, ALTER ADD COLUMN, qualified pg_catalog probes — all pass).b0a7228Engine::table_exists_in,column_metadata_in,alter_table_add_columns_in.merge_via_temp_tablekeeps the temp in the target DB so all DML stays single-DB.ac124bfexport format=hypercross-DB.ExportOptions.source_dbplumbed intopopulate_export_target.7894e1dbuild_watcher_poolopens the target's.hyperfile directly as its workspace; from pool connections' POV the target IS primary, so SQL stays unqualified.WatcherHandle.target_dbrecorded so detach can reject and reconnect can re-resolve.b5f38c0_table_catalog. Pre-flight (3 more ignored tests) confirmed Hyper rejects PRIMARY KEY ("Index support is disabled"), ALTER ADD CONSTRAINT ("named constraints not implemented"), and ON CONFLICT (parser-level"syntax error: got ON"). So atomic-upsert at SQL level is dead. The replacement defense: every catalog write runs insideHyperMcpServer::with_engine, which holds the engine mutex for the duration — single-engine serialization. New*_in(target_db)family for every CRUD fn.Engine::catalog_present_cacheis nowMutex<HashMap<String, bool>>keyed by lowercased alias;clear_catalog_cache_for(alias)is called bydetach_database.fe05a97set_table_metadata.database. Routes catalog writes to the named DB.resolve_db(require_writable=true)rejects read-only targets BEFORE any catalog write.dcae0c2[Unreleased]reorganized with new entries.61e9a7bConcurrency contract for the catalog
Hyper rejects PRIMARY KEY and
INSERT … ON CONFLICT(verified by the smoke battery intests/cross_db_dml_smoke.rs— those 3 rejection-asserting tests serve as a regression alarm if a future Hyper release adds support). So the catalog table has no PK and the upsert is DELETE+INSERT in a transaction.Within one MCP server engine the lack of a PK is not a race risk: every catalog write runs inside
HyperMcpServer::with_engine(...), which holdsMutex<Option<Engine>>for the duration. Concurrent ingests serialize at the catalog write boundary even though the data writes run in parallel through the watcher pool's separate connections. Cross-process concurrency on the same.hyperfile is out of scope.Workflow
Followed the 5-phase plan-driven pattern:
61e9a7b.Reviewer findings (post-iter-7 sweep)
Fixed in this PR:
process_ready_with_recovery(the watcher ingest path) never calledafter_ingest_catalog_update. Watcher-driven ingests left tables un-catalogued;set_table_metadataagainst them returnedTableNotFound. Now stamps_table_catalogwithload_tool="watch_directory"on the engine's main connection. Testwatcher_ingests_into_persistent_targetnow also asserts the catalog row.set_metadata_inranensure_exists_inBEFORE field validation, so a no-op call (no fields set) would create_table_catalogin the target DB as a side effect of returningEmptyData. Re-ordered: validate first, letget_inreturn None when the catalog is absent — no schema mutation on error paths.readme.rs(returned byget_readme) had zero mention of the persistent attachment, the database/persist parameters, or per-DB_table_catalog. The orientation document the model is told to read first was stale. Added a "Workspace model" section.// short-circuits non-persistent targets (per-DB catalog ships in a follow-up iter)comments in server.rs. Iter 4 IS the follow-up iter; comments rewritten.Daemon management/Recovery from hyperd crashes/Other behavioral flags) in README.md surfaced during this review. Deleted the second copy.description = "..."strings the LLM sees intools/list) didn't advertise the new database/persist capability onload_files,watch_directory,export, orset_table_metadata. Field-level doc-comments were correct but rmcp doesn't include them. Added a brief sentence to each.Deferred to follow-up PRs
rmcp 1.7'sPeer::newispub(crate), so faking a request context for tests is fragile. Engine-level tests already cover the new code paths' correctness. Worth doing in a focused PR.statusJSON or by reconciling the target onexecute(database=…)."User_DB"and detaches as"user_db". Easy fix: canonicalize aliases at attach time.Breaking changes (pre-1.0)
Engine API surface:
Engine::catalog_present_in_persistent→catalog_present_in(alias, prober).Engine::mark_catalog_present→mark_catalog_present_for(alias).Engine::catalog_present_cache: Mutex<Option<bool>>→Mutex<HashMap<String, bool>>.Engine::clear_catalog_cache_for(alias)paired withdetach_database.Catalog API:
table_catalog::ensure_exists_in_database(engine, alias)is now a#[deprecated]wrapper overensure_exists_in(engine, Some(alias)).CHANGELOG
[Unreleased]documents both.Test plan
cargo fmt --check && cargo clippy --workspace --tests -- -D warnings— cleancargo test -p hyperdb-mcp— 30 test groups all greencargo test -p hyperdb-mcp --test cross_db_dml_smoke -- --ignored— 8 pass (5 from Iter 0, 3 from Iter 4 documenting Hyper rejections)61e9a7b