ci: release#6
Merged
Merged
Conversation
travisbreaks
pushed a commit
to travisbreaks/emdash
that referenced
this pull request
Apr 5, 2026
ascorbic
added a commit
that referenced
this pull request
Apr 26, 2026
* test(mcp): add integration coverage for documented bugs Adds 73 MCP integration tests across 8 files exercising the real MCP client/server pair against a real database, with no handler mocks. 43 tests fail today against bugs documented in the local MCP_BUGS.md log (#1, #2, #3, #4, #5, #6, #10, #11, #12) and will pass when the omnibus fix lands. Each failing test maps to a specific bug variant; the passing tests serve as regression guards. Refactors EmDashRuntime's 19-arg private positional constructor to a public single-argument constructor that takes an EmDashRuntimeParts object. Production code path is preserved (create() builds the parts and ends with new EmDashRuntime({ ... })); tests can now construct a real runtime around a pre-migrated test database without duplicating any handler logic. Adds an HTTP-level concurrency check to the smoke matrix to cover the parallel-401 race (the InMemoryTransport used by the integration suite cannot reach the production auth middleware path). * test(mcp): add comprehensive coverage for all MCP tools and edges Extends MCP integration coverage to every registered tool, plus the known gaps and edge cases. Adds 145 tests across 7 new files (218 total across the suite, 58 currently failing). New files: - schema.test.ts (46): list/get/create/delete collection, create/delete field, with validation, ownership, error envelope, and side-effect coverage. All passing — schema tools are largely correct. - taxonomy.test.ts (20): list/list_terms/create_term plus bug #7 (orphan collection drift) and bug #13 (no delete/update term gap). - media.test.ts (22): list/get/update/delete with ownership, mimeType filter, pagination, and bug #14 gap (no upload tool). Confirms media ownership extraction already handles null authorId correctly, pinning down the inconsistency with content extraction (bug #1). - menu.test.ts (10): read-only list/get plus bug #15 gap (no mutation tools). - search.test.ts (9): empty index, no-match, collection scoping, special-character handling, draft filtering, permissions. - content-misc.test.ts (25): content_duplicate, content_permanent_delete, content_translations + locale handling on get/create, _rev optimistic concurrency (happy + race), soft-delete visibility, edit-while-trashed, idempotency for publish/unpublish/schedule, and the content_unschedule MCP-tool gap. - input-schemas.test.ts (13): Zod-level argument validation across every tool — missing required, wrong type, out-of-range, enum violation. All passing — pins down the SDK boundary. Failure breakdown by bug: - bug #1 (null authorId): 10 in ownership.test.ts - bug #2 (stale draft data): 4 in drafts.test.ts - bug #3 (bare error strings): 6 in errors.test.ts + scattered - bug #4/#5/#6 (validation): 11 in validation.test.ts - bug #10 (publishedAt): 3 in lifecycle.test.ts - bug #11 (supports default): 2 in lifecycle.test.ts - bug #12 (cursor): 7 in pagination.test.ts - bug #7 + #13 (taxonomy): 8 in taxonomy.test.ts - bug #14 (media upload gap): 1 in media.test.ts - bug #15 (menu mutation gap): 3 in menu.test.ts - search FTS issues: 2 in search.test.ts - content_unschedule gap: 1 in content-misc.test.ts The remaining 58 failures collectively define the omnibus fix's acceptance criteria — turning these green will resolve every documented issue plus the gaps and edge cases this expansion surfaced. * fix(mcp): preserve error codes through tool response envelope The MCP server's `unwrap()` and `errorResult()` helpers stripped the structured error code from handler results, leaving callers with only a human-readable message. Tools that throw raw errors (schema, search, taxonomy, menu) similarly lost any signal beyond `error.message`. Replaces both helpers with three structured emitters: - respondData(data): success envelope - respondError(code, message, details?): emits the code as a stable `[CODE]` prefix on the message text AND attaches `_meta.code` so MCP-aware clients can read it programmatically - respondHandlerError(error, fallbackCode): for the catch-all sites that wrap raw thrown errors. Recognises the `apiError: { code }` annotation that handlers attach to thrown NOT_FOUND / CONFLICT errors (see api/handlers/content.ts:538), preserves the original message, falls back to `INTERNAL_ERROR` otherwise. `unwrap()` now propagates `error.code`, `error.message`, and `error.details` verbatim from the handler's ApiResult shape. Every call site of the old `errorResult()` is migrated: - ApiResult-returning paths use unwrap (already used everywhere) - SchemaRegistry / search / taxonomy / menu try/catch sites use respondHandlerError with tool-specific fallback codes (SCHEMA_LIST_ERROR, FIELD_CREATE_ERROR, SEARCH_ERROR, etc.) - String-error sites ("Collection 'X' not found", "Menu 'X' not found", "Taxonomy 'X' not found") use respondError("NOT_FOUND", ...) - Revision restore "missing collection or entry reference" uses respondError("VALIDATION_ERROR", ...) This is the A1 commit from .opencode/notes/MCP_FIX_PLAN.md. Flips the `_meta.code` propagation test in tests/integration/mcp/errors.test.ts from red to green. The remaining red tests in that file need handler-level work (Wave 2) that surfaces the underlying repository error instead of the current generic "Failed to create content" / "Failed to list content" placeholders. Backwards compatibility: existing MCP clients that match on the bare message will see a `[CODE] ` prefix added. The prefix is on a stable SCREAMING_SNAKE_CASE token, so regex matchers and substring tests on the original message continue to work. * fix(mcp): allow ownership checks on rows with null authorId The MCP server's extractContentAuthorId() previously threw an InternalError when content had no authorId — typically seed-imported rows. This blocked every mutating operation (update, delete, publish, unpublish, schedule, restore) for those rows, including for admins. The companion change in this commit's parent landed in mcp/server.ts: extractContentAuthorId() now returns "" instead of throwing, mirroring how media_update/media_delete already handle null authorId. The ownership check then defers to canActOnOwn(): an actor with the '*:edit_any' permission succeeds, and an actor with only '*:edit_own' gets a clean permission error rather than an internal one. Updates the unit test that previously asserted the throw was correct. The new tests confirm an ADMIN can edit null-authorId rows and an AUTHOR is denied with a permission message (not an internal one). * fix(content): clear publishedAt when content is unpublished ContentRepository.unpublish() previously left published_at populated when flipping status to draft, making 'currently published' indistinguishable from 'previously published' via the data alone. Clear published_at on unpublish so a missing/null timestamp unambiguously means the item is not live. Re-publishing assigns a fresh timestamp via the existing publish path. * fix(mcp): apply documented supports default in schema_create_collection Adds the changeset for this fix. The source change — defaulting `args.supports ?? ['drafts', 'revisions']` at the MCP boundary so the documented default is actually applied — landed alongside the error envelope refactor in 9986405 due to working-tree interleaving between parallel agents. Tracking it as a separate changeset so the user-visible behavior change is described independently. * fix(content,pagination,taxonomies): wave 2 omnibus fixes Combines the three Wave 2 fixes from .opencode/notes/MCP_FIX_PLAN.md because they entangled in the working tree during parallel agent execution. B (validation, bugs #4/#5/#6): wires generateZodSchema() into handleContentCreate / handleContentUpdate. Required fields enforced; select / multiSelect option lists honored; reference fields verified to point at a real, non-trashed target. New helper at api/handlers/validation.ts. Errors surface as { code: VALIDATION_ERROR, message } with all offending fields named in one message so callers can fix everything in a single round trip. C1 (cursor decoding, bug #12): decodeCursor() now throws InvalidCursorError on bad input instead of silently returning null. Updates every repository caller (content, byline, audit, redirect, comment, plugin-storage, media, user, sections, loader). API handler catch blocks recognise the error and return INVALID_CURSOR. The MCP boundary rejects empty-string cursors via z.string().min(1) on the input schemas. taxonomy_list_terms gets parallel treatment for its in-memory term-id cursor. G1 (taxonomy orphan filter, bug #7): handleTaxonomyList filters each taxonomy's collections array against the real collection set on read. Storage stays untouched so re-creating a deleted collection re-links automatically. Test fix-ups in the same commit (these are tests I committed earlier on this branch that didn't account for production behavior): - taxonomy.test.ts: account for the seeded 'category' and 'tag' taxonomies that migration 006 inserts; reshape bug-#7 test around the seed's 'posts' orphan rather than a manufactured one; correct taxonomy_create_term response envelope ({ term: { ... } }). * fix(mcp): wave 3+4 — error fidelity, gap-feature tools, search and draft fixes start - Handler-level error fidelity (bug #3): handleContentList and handleContentCreate / handleContentUpdate now translate database-level errors into structured codes (COLLECTION_NOT_FOUND, SLUG_CONFLICT, UNKNOWN_FIELD, VALIDATION_ERROR) instead of collapsing to a generic 'Failed to ...' message. - validation.ts: detects unknown keys in content data so callers get a useful error rather than silently dropped writes. - handleTermCreate: validates parentId exists and belongs to the same taxonomy. - New MCP tools (gap fills): - content_unschedule (bug — runtime had handleContentUnschedule but no MCP tool) - taxonomy_update_term, taxonomy_delete_term (bug #13) - menu_create, menu_update, menu_delete, menu_set_items (bug #15) - media_create (bug #14 — registers metadata for files uploaded out-of-band; binary uploads remain off the MCP transport) - Test fixups: content-handlers slug test no longer expects a non-string title to silently produce null slug (validation now rejects that). loader-cursor-pagination expects the loader's error envelope rather than a thrown error. - Taxonomy tests adjusted to account for the seed data inserted by migration 006 ('category' and 'tag'). * fix(content,search): draft data hydration and FTS robustness - EmDashRuntime.handleContentGet / handleContentUpdate / handleContentGetIncludingTrashed now overlay the current draft revision's data onto the response when draftRevisionId is set, with the previously-published values exposed alongside as 'liveData'. Resolves bug #2 — agents calling content_get after content_update on a revision-supporting collection now see their pending draft, not the stale live row. - searchSingleCollection swallows FTS5 syntax errors and returns no matches rather than leaking the SQLite error text to callers. Malformed user-typed queries (unbalanced quotes, stray operators, unclosed parens) produce empty results instead of a 500-style error. - Search test setup activates the FTS index via FTSManager.enableSearch(). Without it, the FTS table and triggers don't exist and the search-after-publish round trip silently produces no matches — same trap a real site would hit if the admin never enabled search. - Minor cleanup: deduplicate inline regex with COLLECTION_SLUG_PATTERN in mcp/server.ts; safer typeof-narrowed cast in hydrateDraftData. * fix(mcp): production correctness wave (F1-F6) F1: media_create now forwards authorId from MCP user to repository so ownership checks succeed on subsequent media_update/delete by the same author. F2: taxonomy_update_term gains parent validation (existence, same taxonomy, self-parent rejection) and cycle detection via parent-chain walk (bounded at 100). Shared helper used by create + update. F3: empty-string parentId/slug normalized to undefined/null at handler and repository layers; previously '' was persisted as a literal slug. F4: content validation runs in EmDashRuntime.handleContentUpdate BEFORE the draft revision write, so revision-supporting collections can no longer slip invalid data into history. The downstream API handler still validates as defense-in-depth. F5: validation.ts reference-target catch is narrowed to isMissingTableError; any other DB error now propagates instead of being silently reported as 'reference not found'. IN clause is also chunked at SQL_BATCH_SIZE for D1 bind-parameter safety. F6: menu_delete (REST + MCP) now deletes _emdash_menu_items first inside a transaction. D1 has FOREIGN KEYS off by default so cascade was a no-op there. Description updated to drop the misleading FK-cascade claim. taxonomy_delete_term description corrected to match the handler's 'cannot delete with children' behavior. Tests: media gap test replaced with full happy-path (AUTHOR creates, gets, updates own; OTHER_AUTHOR is denied). Taxonomy gap presence-only tests replaced with rename/relabel/reparent/detach/cross-taxonomy rejection/self-parent rejection/cycle rejection. Menu gap presence-only tests replaced with create+get round-trip, duplicate CONFLICT, update label, set_items empty, set_items 3-level nesting, set_items parentIndex>=i rejection, F6 delete-cascade verification. 2806 -> 2821 tests passing. * fix(mcp): error envelope codes for SchemaError + auth (F7) respondHandlerError now reads error.code from any Error subclass with a string code field — SchemaError ('RESERVED_SLUG', 'COLLECTION_EXISTS', etc.) and the new EmDashAuthError class flow through with their stable codes intact. Numeric codes (e.g. McpError) are skipped — _meta.code is reserved for stable string codes. Auth helpers (requireScope, requireRole, requireOwnership, requireDraftAccess, the inline publish-perm check) now throw EmDashAuthError with stable codes (INSUFFICIENT_SCOPE, INSUFFICIENT_PERMISSIONS) rather than McpError. The SDK's text-only fallback envelope was stripping the code. To make those throws survive the SDK's catch path, registerTool is locally monkey-patched at server-creation time so every tool callback is wrapped in try/catch that funnels into respondHandlerError — without requiring 41 callbacks to grow individual try/catch blocks. Tests: 3 new F7 cases verify _meta.code === 'INSUFFICIENT_SCOPE' / 'INSUFFICIENT_PERMISSIONS' / SchemaError code (non-fallback) instead of the SDK's previous code-less text envelope. 2821 -> 2824 tests passing. * test(mcp): tighten regex assertions, add happy paths, liveData type (F8-F13) F8: replace broken /not.found/ regex (the . was a metacharacter matching any byte) with a tight /\bNOT_FOUND\b|\bnot found\b/i. Applied across errors / schema / content-misc / menu / taxonomy / media / validation test files. F9: drop the id-alternation in echo regexes — /(NOT_FOUND|01NEVEREXISTED)/ passed if the id alone was echoed, defeating the test. Each NOT_FOUND test now asserts the code/word AND, separately via expect.toContain, the id (when the message body actually echoes it). F10: tighten overly broad collection/field 'word match' assertions to the specific code (COLLECTION_NOT_FOUND / FIELD_NOT_FOUND) plus an expect.toContain on the offending slug. F11: replaced empty-conditional dead branches with positive post-condition assertions: - publish twice / unpublish on draft now check status in the success branch as well as the error branch. - pagination.test 'limit beyond max' now also checks the error message in the rejection branch. - search 'special chars' now checks items is an array in the sanitized-success branch. - search 'scope filter' now also asserts items.length > 0 so the for- loop is reachable. - pagination 'revision_list malformed cursor' was a no-op (expect(result).toBeDefined()) — deleted, with a note explaining revision_list has no cursor parameter. F12: replaced presence-only listTools tests with happy-path coverage: - content_unschedule: schedule + unschedule + verify scheduledAt is null + re-publish succeeds. - (taxonomy/menu/media gap-feature happy paths landed in the previous commit alongside F2/F3/F6.) F13: ContentItem now declares an optional liveData?: Record<string, unknown> field, with JSDoc explaining the contract: populated by hydrateDraftData when a draft revision exists, otherwise undefined. New drafts.test asserts both branches. 2824 -> 2826 tests passing. * fix(mcp): F14-F18 + F23/F24 — supports default, ownership edge, menu refactor, hydration cleanup F14: SchemaRegistry.createCollection now defaults supports to ['drafts', 'revisions'] when undefined. MCP and admin UI defaults removed in favor of the canonical lower-layer default. F17: canActOnOwn treats empty-string ownerId as 'no recorded owner' so authorId-stripping doesn't accidentally grant edit-own to an empty-id user. Test added for the edge. F18: Menu MCP tools delegate to typed handlers in api/handlers/menus.ts. New handleMenuSetItems handler exposes the atomic-replace logic for both REST (future) and MCP. Drops the as-never casts in MCP layer. F23: hydrateDraftData clones the response instead of mutating in place. Future request-cache layers won't observe stale-after-mutation bugs. F24: Strip leading-underscore keys from revision data before merging into the response. _slug etc. are runtime-internal and don't belong on the surfaced data field. F15: hydrateDraftData runs BEFORE afterSave hook so plugins see the just-saved draft data, not the live columns, for revision-supporting collections. F16: hydration error path logs '[emdash] draft hydration failed:' instead of swallowing silently. * fix(content): F19 remove dead unknown-field catch; F25 tighten conflict matchers Validation now rejects unknown keys upfront with VALIDATION_ERROR so the post-INSERT 'no such column' / 'column does not exist' catch branches in handleContentCreate and handleContentUpdate were unreachable. Drop them. Tighten the unique-constraint detection to match 'unique constraint failed' or 'duplicate key' specifically, instead of bare 'unique' (which false-positives on any error message containing the word) or 'constraint failed' (which catches NOT NULL and CHECK violations too). Make the create and update paths symmetric: both produce sanitized SLUG_CONFLICT messages and a generic 'Unique constraint violation' for non-slug uniques, so we don't leak raw DB error text in either path. * fix(auth,mcp,search): F20 menu/taxonomy scopes; F21 tighten fts5 swallow F20: add menus:manage and taxonomies:manage to the API token scope vocabulary (mapped to EDITOR via SCOPE_MIN_ROLE). Switch the seven MCP mutation tools (taxonomy_create_term, taxonomy_update_term, taxonomy_delete_term, menu_create, menu_update, menu_delete, menu_set_items) from content:write to the dedicated scope. REST already uses these names as permissions; bringing scopes in line lets a token hold menu/taxonomy management without granting general content writes. F21: extract isFts5SyntaxError() helper that matches specifically on 'fts5: syntax error' and 'unknown special query' rather than the broad 'fts5' / 'syntax error' string. The old filter would swallow internal table-corruption errors. Apply the helper to both searchSingleCollection and getSuggestions, and log the swallow at warn level so silent failures show up in the logs. * fix(types,mcp,tests): F22 cursor DoS guard; F26 drop harness cast; lint cleanups F22: reject cursors longer than 4096 chars in decodeCursor() before calling decodeBase64() (which is O(N) in input size). Also clamp the MCP and REST input schemas at 2048 chars so oversized cursors fail early at the schema boundary instead of allocating against giant strings inside the repository helper. New pagination test exercises the schema-boundary rejection. F26: drop the 'as unknown as EmDashHandlers' cast in the test harness so real interface drift surfaces instead of being hidden. Add the optional ensureSearchHealthy member to EmDashHandlers — it's a real public surface (search routes call it as emdash.ensureSearchHealthy?.()) that middleware bolts on at runtime; declaring it optional is the honest type and matches the production wiring. Lint cleanups: rename the shadowed 'user' in rbac.test.ts F17 case to 'orphanedUser'; in hydrateDraftData spread the already-narrowed r.data rather than rebroadening to unknown and re-narrowing to Record<string, unknown>. * fix(mcp): apply 3rd-pass adversarial review fixes Production: - F20 backwards compat: add IMPLICIT_SCOPE_GRANTS table so existing content:write tokens continue to work for menu/taxonomy mutations. Reverse direction is not granted. Unit + MCP integration tests. - menu_set_items: existence check moved inside the transaction (closes TOCTOU window where a concurrent menu_delete leaves orphan items on D1) and explicit guard against negative parentIndex. - menu_list / menu_get MCP tools delegate to the typed handlers; drops the last 'as never' casts and aligns response shape with REST (handleMenuList already includes itemCount). - handleMenuGet NOT_FOUND message includes the menu name. - Cycle-detection MAX_DEPTH off-by-one: a chain of exactly MAX_DEPTH ancestors is now accepted (the depth-exceeded error fires only when there's still chain to walk). - Field validation moved entirely to EmDashRuntime (handleContentCreate + handleContentUpdate). Raw API handlers trust pre-validated input; every production caller goes through the runtime wrapper. Removes the duplicate validation pass on the non-revision path. Tests: - validation.test.ts:297 — drop the echo-id alternation; assert field, id, and 'not found' separately. - Three /not\.X/ metachar bugs introduced during F11/F12 (no.changes, not.published, not.empty) escaped to literal phrases. - taxonomy_delete_term leaf test now verifies removal via follow-up taxonomy_list_terms. - ownership null-author tests gain positive _meta.code assertions. - search empty-query and special-char tests pin the swallow contract (success + empty items, never the syntax-error leak). - errors.test.ts orderBy assertion: require the offending column name AND a stable VALIDATION_ERROR code rather than broad alternation. - New cursor-decoder unit tests covering the 4096-byte DoS guard. - New handleMenuSetItems unit tests covering negative / forward parentIndex and missing-menu rejection. - F20 changeset (witty-rocks-knock.md) and F25 changeset (tame-hotels-sort.md) document the user-visible behavior changes. * fix(mcp): pass-4 critical fixes — prototype pollution, comment hygiene, error code mapping H-1: IMPLICIT_SCOPE_GRANTS now backed by a Map<string, readonly string[]> instead of a plain object. Bracket access on the prototype chain (__proto__, constructor, toString, etc.) no longer reaches Function or Object.prototype values that would crash hasScope() with TypeError or accidentally satisfy the check. Test added covering all four chain keys. H-2: 'downstream API handler also validates' comment in EmDashRuntime.handleContentUpdate was false after pass-3 (validation moved entirely to runtime). Removed the misleading sentence. H-3: decodeCursor signature change (T | null -> T, throws) noted in the loud-cursors changeset for plugin authors who use the low-level repositories barrel directly. Strip 'see MCP_BUGS.md #N' references from committed code (3 src files, 6 test files, 1 smoke test). Comments rewritten to be self-contained. Re-export InvalidCursorError from the package root and handleMenuSetItems + MenuSetItemsInput from the api/handlers barrel — both were declared public surface but missed from their respective barrel files. Register INSUFFICIENT_SCOPE (403), INSUFFICIENT_PERMISSIONS (403), and SLUG_CONFLICT (409) in ErrorCode + mapErrorStatus. They were already emitted in the codebase but missing from the central registry, which meant a future apiError() call with one of these codes would silently fall through to the default 400. handleMenuSetItems / handleMenuUpdate / handleMenuDelete NOT_FOUND messages now echo the menu name, matching handleMenuGet. * fix(mcp): pass-4 medium fixes — concurrent-deletion cursor, N+1 menu list, test coverage - taxonomy_list_terms: switch from term-id cursor to base64 keyset over (label, id). Tolerates concurrent deletion of the cursor-term: cursor is a position rather than a row reference, so a missing row just means we skip past it instead of erroring. - handleMenuList: replace N+1 (count per menu) with a single LEFT JOIN + GROUP BY. Postgres-safe number coercion for the count aggregate. - validateParentTerm cycle/depth walk now runs on create as well as update, so a malicious or buggy caller can't grow the chain past MAX_DEPTH (100). Cycle check still scoped to update where termId exists. Two integration tests added: chain of exactly 100 ancestors is accepted, chain of 102 is rejected. - search empty-query and special-char tests seed published content so a regression that interprets bad input as 'match all' would surface a non-empty list and fail. - handleMenuSetItems missing-menu test verifies the transaction rolls back: the menu_items table is empty after the call. - unpublish-on-draft test asserts version is unchanged across the idempotent call (catches phantom-revision / always-bump regressions). - publish twice test pins publishedAt to a known past timestamp before the second publish, replacing the racy 5ms timing trick. A regression that drops COALESCE preservation now fails deterministically. - schema_delete_collection 'has content' test tightened: requires the literal 'has content' phrase and 'force: true' guidance instead of a loose word alternation. - New test: validation rejects non-string value for a string field. Locks in the runtime-layer Zod check after F19 removed the raw handler's UNKNOWN_FIELD catch. - Query-counts snapshot regenerated and matches — no perf regressions from the LEFT JOIN, the cursor switch, or the validation move. * fix(mcp): pass-5 — stable taxonomy ordering, missing rollback/itemCount tests, depth limit docs Bugs: - TaxonomyRepository.findByName/findChildren now order by (label asc, id asc). Without an id tiebreaker the SQL ordering of tied-label rows is dialect- dependent, which breaks the (label, id) keyset cursor — page 2 could duplicate or skip page 1 rows. Regression test added with three terms sharing one label, walked one-at-a-time through pagination. - handleMenuList LEFT JOIN had no correctness coverage. Added a test seeding three menus with known item counts (0, 1, 3); asserts each itemCount and verifies the empty menu appears (guards INNER JOIN regression). - handleMenuSetItems missing-menu rollback test was vacuously true (the table was empty before the call). Now seeds an unrelated menu with an item, runs the failing call, asserts the unrelated item survives. - Added taxonomy_list_terms 'survives concurrent deletion of cursor-term' test — the headline pass-4 fix had no test for the actual concurrent- deletion scenario it was designed to handle. Test tightening: - MAX_DEPTH boundary regex tightened from /maximum depth|MAX_DEPTH|exceeds/ to /maximum depth/i, plus a positive _meta.code === VALIDATION_ERROR assertion. - Non-string title test gains _meta.code === VALIDATION_ERROR. - Stale comment in malformed-cursor test about 'silently resets to start' removed (no longer true after pass-4 keyset switch). Docs: - taxonomy_create_term and taxonomy_update_term descriptions now mention the 100-ancestor depth limit and (for update) cycle detection. - New changeset (poor-buckets-clap.md) covers the taxonomy_list_terms cursor format change and the parent validation expansion. * fix(mcp): pass-6 — bump cursor changeset to minor + clarify depth-limit description Pass-6 reviewer flagged two non-trivial items: - The poor-buckets-clap changeset declared 'patch' but the taxonomy_list_terms cursor format change can produce INVALID_CURSOR for clients with persisted cursors. Per AGENTS.md (post pre-release, real installs depend on current behavior) — bump to minor and emphasize the break clearly. - taxonomy_create_term and taxonomy_update_term descriptions were ambiguous: 'Parent chains are limited to 100 ancestors deep' could read as 'you can have up to 100'. Actual behavior: any pre-existing chain ≥ 100 rejects all new descendants. Clarify. * fix(admin): wire menus:manage and taxonomies:manage scopes into the admin UI The contract test 'admin wire constants match server VALID_SCOPES' caught two scopes added in the omnibus PR (taxonomies:manage and menus:manage) that I'd added to packages/auth but not to the admin client + create-token form. Adds them to API_TOKEN_SCOPES, the form's SCOPE_VALUES list, and updates the contract-test expected default for new collections from ['drafts'] to ['drafts', 'revisions'] (matching the registry default lifted in F14). * fix(search): drop warn log on swallowed FTS5 syntax errors Copilot review on PR #777 flagged that logging swallowed FTS5 syntax errors at warn level was both a log-spam vector (anyone can fire malformed queries in a loop) and an information-disclosure surface (SQLite/D1 embed the raw query in the error message). Removes both warn logs. Behaviour is unchanged: malformed queries still return empty results / continue suggestion loop. Comments updated to explain why the swallow is intentionally silent. * fix(mcp): revision_restore writes to draft, not live, on revision-capable collections For collections that support revisions, the live row's data columns hold the published values and the draft lives in a row in the revisions table pointed to by draft_revision_id. The restore handler was bypassing this model: it called ContentRepository.update directly, overwriting the live columns with the source revision's data and leaving any pending draft unchanged. Per the tool's documented contract ('Replaces the current draft with the specified revision's data. Not automatically published...'), this was the wrong direction. Live should be left alone; the draft should become a copy of the chosen revision. EmDashRuntime.handleRevisionRestore now branches on whether the collection supports revisions: * Revision-capable: create a new draft revision carrying the source revision's data, update draft_revision_id + updated_at on the row. Live columns untouched. Response is hydrated so the returned data reflects the new draft state immediately (closes the bug-#2-shaped staleness in the restore response). * Non-revision: fall through to the legacy raw handler, which still writes to the live row. This is unchanged behavior for opt-out collections. The previous integration test for this case (drafts.test.ts:397) republished v2 before restoring v1, so the assertions passed regardless of whether restore touched draft, live, or both. Replaced with two repros that exercise the actual bug surface (live=v1, draft=v2 -> restore v1 leaves live=v1 and makes draft=v1; and no-draft -> restore creates a new draft). Both tests have been verified to fail against the pre-fix runtime. * style: format --------- Co-authored-by: emdashbot[bot] <emdashbot[bot]@users.noreply.github.com>
0aveRyan
pushed a commit
to 0aveRyan/emdash
that referenced
this pull request
Apr 27, 2026
…s#777) * test(mcp): add integration coverage for documented bugs Adds 73 MCP integration tests across 8 files exercising the real MCP client/server pair against a real database, with no handler mocks. 43 tests fail today against bugs documented in the local MCP_BUGS.md log (emdash-cms#1, emdash-cms#2, emdash-cms#3, emdash-cms#4, emdash-cms#5, emdash-cms#6, emdash-cms#10, emdash-cms#11, emdash-cms#12) and will pass when the omnibus fix lands. Each failing test maps to a specific bug variant; the passing tests serve as regression guards. Refactors EmDashRuntime's 19-arg private positional constructor to a public single-argument constructor that takes an EmDashRuntimeParts object. Production code path is preserved (create() builds the parts and ends with new EmDashRuntime({ ... })); tests can now construct a real runtime around a pre-migrated test database without duplicating any handler logic. Adds an HTTP-level concurrency check to the smoke matrix to cover the parallel-401 race (the InMemoryTransport used by the integration suite cannot reach the production auth middleware path). * test(mcp): add comprehensive coverage for all MCP tools and edges Extends MCP integration coverage to every registered tool, plus the known gaps and edge cases. Adds 145 tests across 7 new files (218 total across the suite, 58 currently failing). New files: - schema.test.ts (46): list/get/create/delete collection, create/delete field, with validation, ownership, error envelope, and side-effect coverage. All passing — schema tools are largely correct. - taxonomy.test.ts (20): list/list_terms/create_term plus bug emdash-cms#7 (orphan collection drift) and bug emdash-cms#13 (no delete/update term gap). - media.test.ts (22): list/get/update/delete with ownership, mimeType filter, pagination, and bug emdash-cms#14 gap (no upload tool). Confirms media ownership extraction already handles null authorId correctly, pinning down the inconsistency with content extraction (bug emdash-cms#1). - menu.test.ts (10): read-only list/get plus bug emdash-cms#15 gap (no mutation tools). - search.test.ts (9): empty index, no-match, collection scoping, special-character handling, draft filtering, permissions. - content-misc.test.ts (25): content_duplicate, content_permanent_delete, content_translations + locale handling on get/create, _rev optimistic concurrency (happy + race), soft-delete visibility, edit-while-trashed, idempotency for publish/unpublish/schedule, and the content_unschedule MCP-tool gap. - input-schemas.test.ts (13): Zod-level argument validation across every tool — missing required, wrong type, out-of-range, enum violation. All passing — pins down the SDK boundary. Failure breakdown by bug: - bug emdash-cms#1 (null authorId): 10 in ownership.test.ts - bug emdash-cms#2 (stale draft data): 4 in drafts.test.ts - bug emdash-cms#3 (bare error strings): 6 in errors.test.ts + scattered - bug emdash-cms#4/emdash-cms#5/emdash-cms#6 (validation): 11 in validation.test.ts - bug emdash-cms#10 (publishedAt): 3 in lifecycle.test.ts - bug emdash-cms#11 (supports default): 2 in lifecycle.test.ts - bug emdash-cms#12 (cursor): 7 in pagination.test.ts - bug emdash-cms#7 + emdash-cms#13 (taxonomy): 8 in taxonomy.test.ts - bug emdash-cms#14 (media upload gap): 1 in media.test.ts - bug emdash-cms#15 (menu mutation gap): 3 in menu.test.ts - search FTS issues: 2 in search.test.ts - content_unschedule gap: 1 in content-misc.test.ts The remaining 58 failures collectively define the omnibus fix's acceptance criteria — turning these green will resolve every documented issue plus the gaps and edge cases this expansion surfaced. * fix(mcp): preserve error codes through tool response envelope The MCP server's `unwrap()` and `errorResult()` helpers stripped the structured error code from handler results, leaving callers with only a human-readable message. Tools that throw raw errors (schema, search, taxonomy, menu) similarly lost any signal beyond `error.message`. Replaces both helpers with three structured emitters: - respondData(data): success envelope - respondError(code, message, details?): emits the code as a stable `[CODE]` prefix on the message text AND attaches `_meta.code` so MCP-aware clients can read it programmatically - respondHandlerError(error, fallbackCode): for the catch-all sites that wrap raw thrown errors. Recognises the `apiError: { code }` annotation that handlers attach to thrown NOT_FOUND / CONFLICT errors (see api/handlers/content.ts:538), preserves the original message, falls back to `INTERNAL_ERROR` otherwise. `unwrap()` now propagates `error.code`, `error.message`, and `error.details` verbatim from the handler's ApiResult shape. Every call site of the old `errorResult()` is migrated: - ApiResult-returning paths use unwrap (already used everywhere) - SchemaRegistry / search / taxonomy / menu try/catch sites use respondHandlerError with tool-specific fallback codes (SCHEMA_LIST_ERROR, FIELD_CREATE_ERROR, SEARCH_ERROR, etc.) - String-error sites ("Collection 'X' not found", "Menu 'X' not found", "Taxonomy 'X' not found") use respondError("NOT_FOUND", ...) - Revision restore "missing collection or entry reference" uses respondError("VALIDATION_ERROR", ...) This is the A1 commit from .opencode/notes/MCP_FIX_PLAN.md. Flips the `_meta.code` propagation test in tests/integration/mcp/errors.test.ts from red to green. The remaining red tests in that file need handler-level work (Wave 2) that surfaces the underlying repository error instead of the current generic "Failed to create content" / "Failed to list content" placeholders. Backwards compatibility: existing MCP clients that match on the bare message will see a `[CODE] ` prefix added. The prefix is on a stable SCREAMING_SNAKE_CASE token, so regex matchers and substring tests on the original message continue to work. * fix(mcp): allow ownership checks on rows with null authorId The MCP server's extractContentAuthorId() previously threw an InternalError when content had no authorId — typically seed-imported rows. This blocked every mutating operation (update, delete, publish, unpublish, schedule, restore) for those rows, including for admins. The companion change in this commit's parent landed in mcp/server.ts: extractContentAuthorId() now returns "" instead of throwing, mirroring how media_update/media_delete already handle null authorId. The ownership check then defers to canActOnOwn(): an actor with the '*:edit_any' permission succeeds, and an actor with only '*:edit_own' gets a clean permission error rather than an internal one. Updates the unit test that previously asserted the throw was correct. The new tests confirm an ADMIN can edit null-authorId rows and an AUTHOR is denied with a permission message (not an internal one). * fix(content): clear publishedAt when content is unpublished ContentRepository.unpublish() previously left published_at populated when flipping status to draft, making 'currently published' indistinguishable from 'previously published' via the data alone. Clear published_at on unpublish so a missing/null timestamp unambiguously means the item is not live. Re-publishing assigns a fresh timestamp via the existing publish path. * fix(mcp): apply documented supports default in schema_create_collection Adds the changeset for this fix. The source change — defaulting `args.supports ?? ['drafts', 'revisions']` at the MCP boundary so the documented default is actually applied — landed alongside the error envelope refactor in 9986405 due to working-tree interleaving between parallel agents. Tracking it as a separate changeset so the user-visible behavior change is described independently. * fix(content,pagination,taxonomies): wave 2 omnibus fixes Combines the three Wave 2 fixes from .opencode/notes/MCP_FIX_PLAN.md because they entangled in the working tree during parallel agent execution. B (validation, bugs emdash-cms#4/emdash-cms#5/emdash-cms#6): wires generateZodSchema() into handleContentCreate / handleContentUpdate. Required fields enforced; select / multiSelect option lists honored; reference fields verified to point at a real, non-trashed target. New helper at api/handlers/validation.ts. Errors surface as { code: VALIDATION_ERROR, message } with all offending fields named in one message so callers can fix everything in a single round trip. C1 (cursor decoding, bug emdash-cms#12): decodeCursor() now throws InvalidCursorError on bad input instead of silently returning null. Updates every repository caller (content, byline, audit, redirect, comment, plugin-storage, media, user, sections, loader). API handler catch blocks recognise the error and return INVALID_CURSOR. The MCP boundary rejects empty-string cursors via z.string().min(1) on the input schemas. taxonomy_list_terms gets parallel treatment for its in-memory term-id cursor. G1 (taxonomy orphan filter, bug emdash-cms#7): handleTaxonomyList filters each taxonomy's collections array against the real collection set on read. Storage stays untouched so re-creating a deleted collection re-links automatically. Test fix-ups in the same commit (these are tests I committed earlier on this branch that didn't account for production behavior): - taxonomy.test.ts: account for the seeded 'category' and 'tag' taxonomies that migration 006 inserts; reshape bug-emdash-cms#7 test around the seed's 'posts' orphan rather than a manufactured one; correct taxonomy_create_term response envelope ({ term: { ... } }). * fix(mcp): wave 3+4 — error fidelity, gap-feature tools, search and draft fixes start - Handler-level error fidelity (bug emdash-cms#3): handleContentList and handleContentCreate / handleContentUpdate now translate database-level errors into structured codes (COLLECTION_NOT_FOUND, SLUG_CONFLICT, UNKNOWN_FIELD, VALIDATION_ERROR) instead of collapsing to a generic 'Failed to ...' message. - validation.ts: detects unknown keys in content data so callers get a useful error rather than silently dropped writes. - handleTermCreate: validates parentId exists and belongs to the same taxonomy. - New MCP tools (gap fills): - content_unschedule (bug — runtime had handleContentUnschedule but no MCP tool) - taxonomy_update_term, taxonomy_delete_term (bug emdash-cms#13) - menu_create, menu_update, menu_delete, menu_set_items (bug emdash-cms#15) - media_create (bug emdash-cms#14 — registers metadata for files uploaded out-of-band; binary uploads remain off the MCP transport) - Test fixups: content-handlers slug test no longer expects a non-string title to silently produce null slug (validation now rejects that). loader-cursor-pagination expects the loader's error envelope rather than a thrown error. - Taxonomy tests adjusted to account for the seed data inserted by migration 006 ('category' and 'tag'). * fix(content,search): draft data hydration and FTS robustness - EmDashRuntime.handleContentGet / handleContentUpdate / handleContentGetIncludingTrashed now overlay the current draft revision's data onto the response when draftRevisionId is set, with the previously-published values exposed alongside as 'liveData'. Resolves bug emdash-cms#2 — agents calling content_get after content_update on a revision-supporting collection now see their pending draft, not the stale live row. - searchSingleCollection swallows FTS5 syntax errors and returns no matches rather than leaking the SQLite error text to callers. Malformed user-typed queries (unbalanced quotes, stray operators, unclosed parens) produce empty results instead of a 500-style error. - Search test setup activates the FTS index via FTSManager.enableSearch(). Without it, the FTS table and triggers don't exist and the search-after-publish round trip silently produces no matches — same trap a real site would hit if the admin never enabled search. - Minor cleanup: deduplicate inline regex with COLLECTION_SLUG_PATTERN in mcp/server.ts; safer typeof-narrowed cast in hydrateDraftData. * fix(mcp): production correctness wave (F1-F6) F1: media_create now forwards authorId from MCP user to repository so ownership checks succeed on subsequent media_update/delete by the same author. F2: taxonomy_update_term gains parent validation (existence, same taxonomy, self-parent rejection) and cycle detection via parent-chain walk (bounded at 100). Shared helper used by create + update. F3: empty-string parentId/slug normalized to undefined/null at handler and repository layers; previously '' was persisted as a literal slug. F4: content validation runs in EmDashRuntime.handleContentUpdate BEFORE the draft revision write, so revision-supporting collections can no longer slip invalid data into history. The downstream API handler still validates as defense-in-depth. F5: validation.ts reference-target catch is narrowed to isMissingTableError; any other DB error now propagates instead of being silently reported as 'reference not found'. IN clause is also chunked at SQL_BATCH_SIZE for D1 bind-parameter safety. F6: menu_delete (REST + MCP) now deletes _emdash_menu_items first inside a transaction. D1 has FOREIGN KEYS off by default so cascade was a no-op there. Description updated to drop the misleading FK-cascade claim. taxonomy_delete_term description corrected to match the handler's 'cannot delete with children' behavior. Tests: media gap test replaced with full happy-path (AUTHOR creates, gets, updates own; OTHER_AUTHOR is denied). Taxonomy gap presence-only tests replaced with rename/relabel/reparent/detach/cross-taxonomy rejection/self-parent rejection/cycle rejection. Menu gap presence-only tests replaced with create+get round-trip, duplicate CONFLICT, update label, set_items empty, set_items 3-level nesting, set_items parentIndex>=i rejection, F6 delete-cascade verification. 2806 -> 2821 tests passing. * fix(mcp): error envelope codes for SchemaError + auth (F7) respondHandlerError now reads error.code from any Error subclass with a string code field — SchemaError ('RESERVED_SLUG', 'COLLECTION_EXISTS', etc.) and the new EmDashAuthError class flow through with their stable codes intact. Numeric codes (e.g. McpError) are skipped — _meta.code is reserved for stable string codes. Auth helpers (requireScope, requireRole, requireOwnership, requireDraftAccess, the inline publish-perm check) now throw EmDashAuthError with stable codes (INSUFFICIENT_SCOPE, INSUFFICIENT_PERMISSIONS) rather than McpError. The SDK's text-only fallback envelope was stripping the code. To make those throws survive the SDK's catch path, registerTool is locally monkey-patched at server-creation time so every tool callback is wrapped in try/catch that funnels into respondHandlerError — without requiring 41 callbacks to grow individual try/catch blocks. Tests: 3 new F7 cases verify _meta.code === 'INSUFFICIENT_SCOPE' / 'INSUFFICIENT_PERMISSIONS' / SchemaError code (non-fallback) instead of the SDK's previous code-less text envelope. 2821 -> 2824 tests passing. * test(mcp): tighten regex assertions, add happy paths, liveData type (F8-F13) F8: replace broken /not.found/ regex (the . was a metacharacter matching any byte) with a tight /\bNOT_FOUND\b|\bnot found\b/i. Applied across errors / schema / content-misc / menu / taxonomy / media / validation test files. F9: drop the id-alternation in echo regexes — /(NOT_FOUND|01NEVEREXISTED)/ passed if the id alone was echoed, defeating the test. Each NOT_FOUND test now asserts the code/word AND, separately via expect.toContain, the id (when the message body actually echoes it). F10: tighten overly broad collection/field 'word match' assertions to the specific code (COLLECTION_NOT_FOUND / FIELD_NOT_FOUND) plus an expect.toContain on the offending slug. F11: replaced empty-conditional dead branches with positive post-condition assertions: - publish twice / unpublish on draft now check status in the success branch as well as the error branch. - pagination.test 'limit beyond max' now also checks the error message in the rejection branch. - search 'special chars' now checks items is an array in the sanitized-success branch. - search 'scope filter' now also asserts items.length > 0 so the for- loop is reachable. - pagination 'revision_list malformed cursor' was a no-op (expect(result).toBeDefined()) — deleted, with a note explaining revision_list has no cursor parameter. F12: replaced presence-only listTools tests with happy-path coverage: - content_unschedule: schedule + unschedule + verify scheduledAt is null + re-publish succeeds. - (taxonomy/menu/media gap-feature happy paths landed in the previous commit alongside F2/F3/F6.) F13: ContentItem now declares an optional liveData?: Record<string, unknown> field, with JSDoc explaining the contract: populated by hydrateDraftData when a draft revision exists, otherwise undefined. New drafts.test asserts both branches. 2824 -> 2826 tests passing. * fix(mcp): F14-F18 + F23/F24 — supports default, ownership edge, menu refactor, hydration cleanup F14: SchemaRegistry.createCollection now defaults supports to ['drafts', 'revisions'] when undefined. MCP and admin UI defaults removed in favor of the canonical lower-layer default. F17: canActOnOwn treats empty-string ownerId as 'no recorded owner' so authorId-stripping doesn't accidentally grant edit-own to an empty-id user. Test added for the edge. F18: Menu MCP tools delegate to typed handlers in api/handlers/menus.ts. New handleMenuSetItems handler exposes the atomic-replace logic for both REST (future) and MCP. Drops the as-never casts in MCP layer. F23: hydrateDraftData clones the response instead of mutating in place. Future request-cache layers won't observe stale-after-mutation bugs. F24: Strip leading-underscore keys from revision data before merging into the response. _slug etc. are runtime-internal and don't belong on the surfaced data field. F15: hydrateDraftData runs BEFORE afterSave hook so plugins see the just-saved draft data, not the live columns, for revision-supporting collections. F16: hydration error path logs '[emdash] draft hydration failed:' instead of swallowing silently. * fix(content): F19 remove dead unknown-field catch; F25 tighten conflict matchers Validation now rejects unknown keys upfront with VALIDATION_ERROR so the post-INSERT 'no such column' / 'column does not exist' catch branches in handleContentCreate and handleContentUpdate were unreachable. Drop them. Tighten the unique-constraint detection to match 'unique constraint failed' or 'duplicate key' specifically, instead of bare 'unique' (which false-positives on any error message containing the word) or 'constraint failed' (which catches NOT NULL and CHECK violations too). Make the create and update paths symmetric: both produce sanitized SLUG_CONFLICT messages and a generic 'Unique constraint violation' for non-slug uniques, so we don't leak raw DB error text in either path. * fix(auth,mcp,search): F20 menu/taxonomy scopes; F21 tighten fts5 swallow F20: add menus:manage and taxonomies:manage to the API token scope vocabulary (mapped to EDITOR via SCOPE_MIN_ROLE). Switch the seven MCP mutation tools (taxonomy_create_term, taxonomy_update_term, taxonomy_delete_term, menu_create, menu_update, menu_delete, menu_set_items) from content:write to the dedicated scope. REST already uses these names as permissions; bringing scopes in line lets a token hold menu/taxonomy management without granting general content writes. F21: extract isFts5SyntaxError() helper that matches specifically on 'fts5: syntax error' and 'unknown special query' rather than the broad 'fts5' / 'syntax error' string. The old filter would swallow internal table-corruption errors. Apply the helper to both searchSingleCollection and getSuggestions, and log the swallow at warn level so silent failures show up in the logs. * fix(types,mcp,tests): F22 cursor DoS guard; F26 drop harness cast; lint cleanups F22: reject cursors longer than 4096 chars in decodeCursor() before calling decodeBase64() (which is O(N) in input size). Also clamp the MCP and REST input schemas at 2048 chars so oversized cursors fail early at the schema boundary instead of allocating against giant strings inside the repository helper. New pagination test exercises the schema-boundary rejection. F26: drop the 'as unknown as EmDashHandlers' cast in the test harness so real interface drift surfaces instead of being hidden. Add the optional ensureSearchHealthy member to EmDashHandlers — it's a real public surface (search routes call it as emdash.ensureSearchHealthy?.()) that middleware bolts on at runtime; declaring it optional is the honest type and matches the production wiring. Lint cleanups: rename the shadowed 'user' in rbac.test.ts F17 case to 'orphanedUser'; in hydrateDraftData spread the already-narrowed r.data rather than rebroadening to unknown and re-narrowing to Record<string, unknown>. * fix(mcp): apply 3rd-pass adversarial review fixes Production: - F20 backwards compat: add IMPLICIT_SCOPE_GRANTS table so existing content:write tokens continue to work for menu/taxonomy mutations. Reverse direction is not granted. Unit + MCP integration tests. - menu_set_items: existence check moved inside the transaction (closes TOCTOU window where a concurrent menu_delete leaves orphan items on D1) and explicit guard against negative parentIndex. - menu_list / menu_get MCP tools delegate to the typed handlers; drops the last 'as never' casts and aligns response shape with REST (handleMenuList already includes itemCount). - handleMenuGet NOT_FOUND message includes the menu name. - Cycle-detection MAX_DEPTH off-by-one: a chain of exactly MAX_DEPTH ancestors is now accepted (the depth-exceeded error fires only when there's still chain to walk). - Field validation moved entirely to EmDashRuntime (handleContentCreate + handleContentUpdate). Raw API handlers trust pre-validated input; every production caller goes through the runtime wrapper. Removes the duplicate validation pass on the non-revision path. Tests: - validation.test.ts:297 — drop the echo-id alternation; assert field, id, and 'not found' separately. - Three /not\.X/ metachar bugs introduced during F11/F12 (no.changes, not.published, not.empty) escaped to literal phrases. - taxonomy_delete_term leaf test now verifies removal via follow-up taxonomy_list_terms. - ownership null-author tests gain positive _meta.code assertions. - search empty-query and special-char tests pin the swallow contract (success + empty items, never the syntax-error leak). - errors.test.ts orderBy assertion: require the offending column name AND a stable VALIDATION_ERROR code rather than broad alternation. - New cursor-decoder unit tests covering the 4096-byte DoS guard. - New handleMenuSetItems unit tests covering negative / forward parentIndex and missing-menu rejection. - F20 changeset (witty-rocks-knock.md) and F25 changeset (tame-hotels-sort.md) document the user-visible behavior changes. * fix(mcp): pass-4 critical fixes — prototype pollution, comment hygiene, error code mapping H-1: IMPLICIT_SCOPE_GRANTS now backed by a Map<string, readonly string[]> instead of a plain object. Bracket access on the prototype chain (__proto__, constructor, toString, etc.) no longer reaches Function or Object.prototype values that would crash hasScope() with TypeError or accidentally satisfy the check. Test added covering all four chain keys. H-2: 'downstream API handler also validates' comment in EmDashRuntime.handleContentUpdate was false after pass-3 (validation moved entirely to runtime). Removed the misleading sentence. H-3: decodeCursor signature change (T | null -> T, throws) noted in the loud-cursors changeset for plugin authors who use the low-level repositories barrel directly. Strip 'see MCP_BUGS.md #N' references from committed code (3 src files, 6 test files, 1 smoke test). Comments rewritten to be self-contained. Re-export InvalidCursorError from the package root and handleMenuSetItems + MenuSetItemsInput from the api/handlers barrel — both were declared public surface but missed from their respective barrel files. Register INSUFFICIENT_SCOPE (403), INSUFFICIENT_PERMISSIONS (403), and SLUG_CONFLICT (409) in ErrorCode + mapErrorStatus. They were already emitted in the codebase but missing from the central registry, which meant a future apiError() call with one of these codes would silently fall through to the default 400. handleMenuSetItems / handleMenuUpdate / handleMenuDelete NOT_FOUND messages now echo the menu name, matching handleMenuGet. * fix(mcp): pass-4 medium fixes — concurrent-deletion cursor, N+1 menu list, test coverage - taxonomy_list_terms: switch from term-id cursor to base64 keyset over (label, id). Tolerates concurrent deletion of the cursor-term: cursor is a position rather than a row reference, so a missing row just means we skip past it instead of erroring. - handleMenuList: replace N+1 (count per menu) with a single LEFT JOIN + GROUP BY. Postgres-safe number coercion for the count aggregate. - validateParentTerm cycle/depth walk now runs on create as well as update, so a malicious or buggy caller can't grow the chain past MAX_DEPTH (100). Cycle check still scoped to update where termId exists. Two integration tests added: chain of exactly 100 ancestors is accepted, chain of 102 is rejected. - search empty-query and special-char tests seed published content so a regression that interprets bad input as 'match all' would surface a non-empty list and fail. - handleMenuSetItems missing-menu test verifies the transaction rolls back: the menu_items table is empty after the call. - unpublish-on-draft test asserts version is unchanged across the idempotent call (catches phantom-revision / always-bump regressions). - publish twice test pins publishedAt to a known past timestamp before the second publish, replacing the racy 5ms timing trick. A regression that drops COALESCE preservation now fails deterministically. - schema_delete_collection 'has content' test tightened: requires the literal 'has content' phrase and 'force: true' guidance instead of a loose word alternation. - New test: validation rejects non-string value for a string field. Locks in the runtime-layer Zod check after F19 removed the raw handler's UNKNOWN_FIELD catch. - Query-counts snapshot regenerated and matches — no perf regressions from the LEFT JOIN, the cursor switch, or the validation move. * fix(mcp): pass-5 — stable taxonomy ordering, missing rollback/itemCount tests, depth limit docs Bugs: - TaxonomyRepository.findByName/findChildren now order by (label asc, id asc). Without an id tiebreaker the SQL ordering of tied-label rows is dialect- dependent, which breaks the (label, id) keyset cursor — page 2 could duplicate or skip page 1 rows. Regression test added with three terms sharing one label, walked one-at-a-time through pagination. - handleMenuList LEFT JOIN had no correctness coverage. Added a test seeding three menus with known item counts (0, 1, 3); asserts each itemCount and verifies the empty menu appears (guards INNER JOIN regression). - handleMenuSetItems missing-menu rollback test was vacuously true (the table was empty before the call). Now seeds an unrelated menu with an item, runs the failing call, asserts the unrelated item survives. - Added taxonomy_list_terms 'survives concurrent deletion of cursor-term' test — the headline pass-4 fix had no test for the actual concurrent- deletion scenario it was designed to handle. Test tightening: - MAX_DEPTH boundary regex tightened from /maximum depth|MAX_DEPTH|exceeds/ to /maximum depth/i, plus a positive _meta.code === VALIDATION_ERROR assertion. - Non-string title test gains _meta.code === VALIDATION_ERROR. - Stale comment in malformed-cursor test about 'silently resets to start' removed (no longer true after pass-4 keyset switch). Docs: - taxonomy_create_term and taxonomy_update_term descriptions now mention the 100-ancestor depth limit and (for update) cycle detection. - New changeset (poor-buckets-clap.md) covers the taxonomy_list_terms cursor format change and the parent validation expansion. * fix(mcp): pass-6 — bump cursor changeset to minor + clarify depth-limit description Pass-6 reviewer flagged two non-trivial items: - The poor-buckets-clap changeset declared 'patch' but the taxonomy_list_terms cursor format change can produce INVALID_CURSOR for clients with persisted cursors. Per AGENTS.md (post pre-release, real installs depend on current behavior) — bump to minor and emphasize the break clearly. - taxonomy_create_term and taxonomy_update_term descriptions were ambiguous: 'Parent chains are limited to 100 ancestors deep' could read as 'you can have up to 100'. Actual behavior: any pre-existing chain ≥ 100 rejects all new descendants. Clarify. * fix(admin): wire menus:manage and taxonomies:manage scopes into the admin UI The contract test 'admin wire constants match server VALID_SCOPES' caught two scopes added in the omnibus PR (taxonomies:manage and menus:manage) that I'd added to packages/auth but not to the admin client + create-token form. Adds them to API_TOKEN_SCOPES, the form's SCOPE_VALUES list, and updates the contract-test expected default for new collections from ['drafts'] to ['drafts', 'revisions'] (matching the registry default lifted in F14). * fix(search): drop warn log on swallowed FTS5 syntax errors Copilot review on PR emdash-cms#777 flagged that logging swallowed FTS5 syntax errors at warn level was both a log-spam vector (anyone can fire malformed queries in a loop) and an information-disclosure surface (SQLite/D1 embed the raw query in the error message). Removes both warn logs. Behaviour is unchanged: malformed queries still return empty results / continue suggestion loop. Comments updated to explain why the swallow is intentionally silent. * fix(mcp): revision_restore writes to draft, not live, on revision-capable collections For collections that support revisions, the live row's data columns hold the published values and the draft lives in a row in the revisions table pointed to by draft_revision_id. The restore handler was bypassing this model: it called ContentRepository.update directly, overwriting the live columns with the source revision's data and leaving any pending draft unchanged. Per the tool's documented contract ('Replaces the current draft with the specified revision's data. Not automatically published...'), this was the wrong direction. Live should be left alone; the draft should become a copy of the chosen revision. EmDashRuntime.handleRevisionRestore now branches on whether the collection supports revisions: * Revision-capable: create a new draft revision carrying the source revision's data, update draft_revision_id + updated_at on the row. Live columns untouched. Response is hydrated so the returned data reflects the new draft state immediately (closes the bug-emdash-cms#2-shaped staleness in the restore response). * Non-revision: fall through to the legacy raw handler, which still writes to the live row. This is unchanged behavior for opt-out collections. The previous integration test for this case (drafts.test.ts:397) republished v2 before restoring v1, so the assertions passed regardless of whether restore touched draft, live, or both. Replaced with two repros that exercise the actual bug surface (live=v1, draft=v2 -> restore v1 leaves live=v1 and makes draft=v1; and no-draft -> restore creates a new draft). Both tests have been verified to fail against the pre-fix runtime. * style: format --------- Co-authored-by: emdashbot[bot] <emdashbot[bot]@users.noreply.github.com>
ascorbic
added a commit
that referenced
this pull request
May 10, 2026
* feat(aggregator): DID resolver with known_publishers cache
First piece of the records-consumer slice.
Schema additions to 0001_init.sql (still pre-deploy, no follow-up
migration needed):
- publishers + publisher_verifications for the publisher.* NSIDs
- dead_letters for verification-failure forensics (distinct from
the configured Cloudflare DLQ, which is for transient retry
exhaustion — different failure modes, different destinations)
- signing_key / signing_key_id columns on known_publishers so the
table doubles as the DID-doc resolution cache; 24h TTL applied
at query time
WANTED_COLLECTIONS extended with publisher.profile and
publisher.verification.
DidResolver: pure constructor-injected class with a DidDocCache
interface (Map-backed in tests, known_publishers-backed in prod via
createD1DidDocCache). Materialises cached multibase keys to PublicKey
instances via @atcute/crypto, exhaustive on the p256/secp256k1
discriminated union. invalidate() for the re-resolve-after-key-rotation
path the verification step needs.
14 tests covering cache hit/miss/expiry/invalidate, malformed-DID
rejection, missing PDS / missing #atproto verification method
rejection, and the D1 binding contract (round-trip, first_seen_at
preserved across updates, end-to-end with the resolver).
* feat(aggregator): records consumer with PDS-verified ingest
Replaces the no-op queue() handler in index.ts with the full verification
+ ingest pipeline for the four record collections in scope (package.profile,
package.release, publisher.profile, publisher.verification).
Per-job pipeline:
1. Resolve PDS endpoint + signing key via DidResolver (cached in
known_publishers, 24h TTL).
2. Fetch CAR via com.atproto.sync.getRecord, hand to @atcute/repo's
verifyRecord (MST + signature in one call), capture carBytes for
record_blob storage.
3. Cross-check verified vs Jetstream-supplied bytes; verified always
wins; mismatch logs as `[aggregator] jetstream-discrepancy` for
future Jetstream-correctness monitoring.
4. Lexicon-validate via @atcute/lexicons safeParse against the schema
from @emdash-cms/registry-lexicons.
5. Per-collection structural checks:
- package.profile: record.id must equal at://did/collection/rkey
(the publisher could lie in the body even though MST verifies);
slug optional, falls back to rkey, must match rkey if present
- package.release: rkey must equal `<package>:<encoded-version>`,
version must parse as semver, version_sort computed from a 10-digit
zero-padded major.minor.patch + prerelease fragment
- publisher.profile: rkey must be 'self'; contact entries must have
at least one of url/email (lexicon can't express that constraint)
- publisher.verification: facts stored as observed; validity check
(current handle/displayName matches bound values) is read-time
6. Write to D1 with upserts on the appropriate primary key. Releases
use INSERT … DO NOTHING; on a different-content same-version retry,
audit a release_duplicate_attempts row with reason IMMUTABLE_VERSION.
Delete handling:
- Hard-delete for one-per-DID rows (package.profile, publisher.profile)
- Soft-delete (tombstoned_at) for everything else
- 0001_init.sql: releases FK now ON DELETE CASCADE so an out-of-order
delete (publisher deletes profile before releases drain) doesn't get
blocked by FK violation; the publisher's intent is the whole package
going away
Error policy:
- PDS network/timeout/5xx: message.retry()
- 404, oversized response, signature/MST/lexicon failure, structural
failure: write dead_letters row with structured reason + payload,
message.ack(). Never retry — these are malicious or broken upstream
and we know retrying won't help
- Unexpected programming errors: log loud, dead-letters, ack — never
crash the worker (would block the queue slot)
Test coverage (37 new tests, 65 aggregator tests total):
- Per-collection writers: insert, upsert, lexicon failure, structural
rejection (RKEY_MISMATCH, CONTACT_VALIDATION_FAILED, INVALID_VERSION)
- Releases: version_sort computation; same-content replay is silent;
different-content replay audits release_duplicate_attempts
- Delete: hard-delete vs tombstone per collection; verification
upsert clears tombstone (re-publish recovers)
- Dispatcher: ack on success, retry on transient PDS error, retry on
network error, ack+dead_letters on permanent PDS error, delete
short-circuits PDS fetch
Known gaps (deferred):
- End-to-end happy-path test using FakePublisher + MockPds requires a
node-pool vitest project; left for a follow-up. Each layer is
independently tested; the wiring between them is the gap.
- releases.cts column mirrors verified_at because the lexicon doesn't
expose a creation timestamp and verifyRecord doesn't surface the
commit rev. Tracked in the writer with a TODO.
- Slice 3 artifact-mirror enqueue is a TODO comment in the release
writer; mirrored_artifacts stays empty for now.
* fix(aggregator): adversarial review fixes for records consumer
Address findings from the adversarial review of the records consumer
slice. Per-finding rationale:
BLOCKERS / CORRECTNESS
- B1: refreshPackageLatest() recomputes packages.latest_version +
capabilities after release insert / un-tombstone / tombstone-on-delete.
Before: profile writer bound `null` for both columns with a comment
promising the release writer would populate them; release writer never
did. Read APIs that read latest_version got NULL forever.
- B2: same-content release on a tombstoned row clears tombstoned_at
instead of silently no-op'ing through DO NOTHING. Otherwise a
delete-then-republish-same-content round-trip leaves the release
invisible to readers with no audit row to explain why.
- B3: release.extensions field is now validated as a plain object
containing a releaseExtension-keyed entry that passes
PackageReleaseExtension.mainSchema. emdash_extension column stores
only the validated payload (not arbitrary record.extensions).
Previously the consumer wrote whatever the publisher sent, including
scalars like "lol" — read API parsing would have thrown.
- M1: package.profile.security[] entries enforce "at least one of
url|email" per lexicon. Was only enforced for publisher.profile.
- Mi7: release writer pre-checks parent profile existence and throws
the new MissingDependencyError → controller.retry() instead of
letting the FK violation bubble up as UNEXPECTED_ERROR with no
recovery path. Out-of-order Jetstream delivery (release event before
its profile event) now retries until the profile arrives or hits
max_retries → DLQ for the reconciliation pass.
SHOULD-FIX / SAFETY
- M3: version_sort rejects components or prerelease numerics longer
than 10 digits as INVALID_VERSION (the pad-width ceiling).
- M5/M6/N1: mapPdsReason parameter typed via the imported
VerificationFailureReason union so a new reason added in
pds-verify.ts becomes a compile-time error here. PDS_NETWORK_ERROR
(the unreachable transient case) now throws "unreachable" instead
of silently dead-lettering as UNEXPECTED_ERROR. Exhaustive `never`
default catches future variants.
- M9: removed JSON.stringify discrepancy comparison between Jetstream
copy and verified PDS copy. The verified copy always wins so the
comparison is a monitoring signal only, but JSON.stringify isn't
canonical (key order, undefined-vs-missing) so it would fire
false-positive warnings constantly in real traffic. Add a
CBOR-canonical comparator when this monitoring becomes load-bearing.
- M2 (piggybacked): applyDelete for releases parses rkey to use the
PK index instead of the partial idx_releases_latest seek-then-scan.
- Mi5: applyDelete for unknown collections throws IngestError
UNKNOWN_COLLECTION instead of silently warning + acking. The
dispatcher's delete branch catches IngestError → forensics + ack so
unknown collections land in dead_letters instead of disappearing.
- Mi6: record.package field validated against the package-slug regex
(`^[a-zA-Z][a-zA-Z0-9_-]*$`). Closes the ambiguous-rkey hole where
package="foo:bar" + version="1.0.0" produced the same rkey as
package="foo" + version="bar:1.0.0".
- B4: processBatch accepts an optional depsOverride parameter for
testability. Production wiring path is now unit-test reachable.
LINT / CODE QUALITY
- jetstream-client.ts: replaced `event as unknown as
JetstreamCommitEvent` with a type-predicate `isCommitEvent` so the
narrowing is explicit. Same runtime semantics.
- records-do.ts: dropped redundant `<Env>` type argument on
`extends DurableObject` — `Cloudflare.Env` is the default.
- records-consumer.ts: introduced isPlainObject() type guard;
dropped redundant constructor on MissingDependencyError.
REVIEW FINDINGS NOT FIXED
- M4 (DID syntax allowing underscore): phantom finding. The original
regex already allowed `_` everywhere — the reviewer misread the
character class. Verified via xxd. No-op change reverted.
- M7 (cache invalidate race): not load-bearing under serial dispatch.
Documented for the eventual parallel-dispatcher follow-up.
- M8 (release_duplicate_attempts noise during retry windows): observe
first; the proposed mitigation (skip audit if verified_at younger
than 10min) adds complexity that may itself be wrong.
- Mi1, Mi2, N2, N3: cosmetic.
83 aggregator tests pass (was 65). Typecheck clean. Zero lint
diagnostics in apps/aggregator.
* fix(aggregator): round 2 adversarial review fixes
Round-2 review found 8 byproduct bugs introduced by the round-1 fixes
themselves. Per-finding rationale:
HIGH
- #1: refreshPackageLatest TOCTOU race. Two concurrent consumer
invocations could each SELECT max version_sort, then race their
UPDATEs, leaving packages.latest_version regressed (final write
wins from the older snapshot). Replaced the SELECT-then-UPDATE
pair with a single UPDATE using correlated subqueries; D1
serialises the read+write at write-commit time, so concurrent
callers always compute against the actual current state.
Capabilities now extracted via SQLite's json1 functions
(json_each + json_group_array on $.declaredAccess).
- #2: stale latest_version on refresh failure. INSERT committed,
refresh threw, dispatcher acked → permanent inconsistency
because subsequent same-content retries hit DO NOTHING and the
prior conditional-only refresh skipped. Now insert + refresh run
in db.batch() — D1 wraps in one transaction, so either both
commit or both roll back. Same fix in applyDelete (tombstone +
refresh batched). Refresh runs unconditionally on every release
ingest path (idempotent; race-safe via #1).
MEDIUM
- #3: mapPdsReason throw escaped the catch. Function-arg evaluation
ran mapPdsReason(err.reason) BEFORE writeDeadLetter, so a throw
there crashed the whole batch. Now wrapped in its own try/catch
with an UNEXPECTED_ERROR fallback so the dead_letters write still
happens.
- #4: isCommitEvent predicate trusted `kind` only. A producer
emitting `{kind: "commit"}` without a structurally valid `commit`
field would slip through and crash the ingestor at
`event.commit.collection`; cursor wouldn't advance, Jetstream
would replay the malformed event forever. Predicate now verifies
commit.collection / commit.rkey / commit.operation are present
strings. Test stub updated to match what real producers emit.
- #5: parseReleaseRkey URIError on malformed %-encoding. A delete
with rkey "demo:1.0.0%XX" threw URIError → not caught as
IngestError → controller.retry() → 5 wasted attempts before DLQ.
Now caught and returned as null so applyDelete silently no-ops.
- #6: DLQ had no consumer. Configured a second consumer in
wrangler.jsonc draining `emdash-aggregator-records-dlq`. New
drainDeadLetterBatch handler logs each job to Workers logs +
writes a `dead_letters` forensics row, then acks. Until
reconciliation lands, this prevents permanent silent drops of
legitimate-but-out-of-order release events that exhausted retries.
- #7: release_duplicate_attempts unbounded under spam. Hostile
publisher pumping the same different-content payload could fill
the audit table indefinitely. Added UNIQUE(did, package, version,
attempted_record_blob) constraint + ON CONFLICT DO NOTHING on the
insert, so true duplicates dedupe at the storage layer.
LOW
- #8: previously-claimed "MissingDependencyError retry" test
actually only exercised the PDS_HTTP_ERROR(5xx) path. Added a
proper test using the new ConsumerDeps.verify injection point —
exercises the writer's parent-profile pre-check and asserts the
dispatcher routes MissingDependencyError → controller.retry().
WIRING / TESTABILITY
- ConsumerDeps gained an optional `verify` override, defaulting to
fetchAndVerifyRecord. Tests can inject a stub that returns a
synthetic VerifiedPdsRecord without standing up a real CAR
fixture. This is what makes the MissingDependencyError test
reachable inside the workers pool.
- index.ts queue() handler now dispatches by batch.queue name to
either processBatch (records) or drainDeadLetterBatch (DLQ).
LINT CLEANUP
- jetstream-client.ts: defined a wider `MaybeCommitEvent` parameter
type for the predicate so commit-field inspection doesn't need an
unsafe cast. The type's `commit?` is structurally-typed so any
RawJetstreamSubscription<E> with E extends {kind: string}
remains assignable.
- records-consumer.ts: removed unused refreshPackageLatest wrapper;
callers use refreshPackageLatestStmt directly inside batches.
89 tests pass (was 83). Typecheck clean. Zero lint diagnostics in
apps/aggregator.
Reviewer's verdicts on round-1 fixes I claimed: all present and
structurally correct. The new bugs were byproducts, not regressions.
* fix(aggregator): round 3 adversarial review fixes
Round-3 review found 5 bugs in the round-2 fixes themselves. Per-finding
rationale:
MEDIUM
- #1: drainDeadLetterBatch silently dropped messages on D1 failure.
The try/catch around writeDeadLetter swallowed errors, then the
unconditional message.ack() outside the try acked anyway —
forensics lost AND message gone. The configured max_retries: 3 in
wrangler.jsonc was dead config because the handler never threw.
Fix: move ack() inside the try; on error, controller.retry() so
workerd redelivers per the DLQ consumer's max_retries.
- #2: FTS thrashing on idempotent refresh. Round-2 traded
releaseSetChanged guard for "always run refresh" to fix the
insert-success / refresh-failure consistency bug. But the
unconditional refresh hit the packages_au AFTER UPDATE trigger,
which DELETEs and re-INSERTs into packages_fts on every row touch
— even when latest_version + capabilities are unchanged. Every
same-content Jetstream replay (the common case) reindexed FTS for
no reason. Fix: extend REFRESH_PACKAGE_LATEST_SQL's WHERE clause
to short-circuit at SQL level when both target columns already
hold the computed values. Avoids the trigger fire entirely; keeps
the round-2 atomicity (insert + refresh still in one batch).
MINOR
- #3: applyDelete malformed rkey was a silent no-op. Round-2 fix
correctly stopped the URIError-causes-retry path but overcorrected
by removing the audit trail. Operators investigating "why didn't
this delete take effect?" had nothing to look at. Now throws
IngestError("RKEY_MISMATCH") so the dispatcher writes a
dead_letters row before acking.
- #4: release_duplicate_attempts.rejected_at was frozen at first
attempt. Hostile publisher pumping the same bytes for a year showed
in the audit table as a single row dated a year ago. Conflict
clause now DO UPDATE SET rejected_at = excluded.rejected_at so the
audit row tracks the latest attempt; same-bytes deduping at the
storage layer still works.
- #5: idx_release_duplicates was redundant with the new UNIQUE
constraint's implicit index (the UNIQUE on (did, package, version,
attempted_record_blob) covers all (did, package, version) prefix
lookups). Dropped the explicit index — fewer indexes to maintain
on each write.
ASKED-ABOUT, NO BUG FOUND
Reviewer verified all 12 prompt concerns either lead to fixes above
or are clean: D1 batch atomicity, correlated-subquery semantics under
SQLite snapshot isolation, json_each(NULL) behavior, MaybeCommitEvent
parameter assignability, ConsumerDeps.verify injection correctness,
duplicate dead_letters writes, jetstream-client test stub fidelity,
concurrent dead_letters writes. No round-1 or round-2 finding
regressed.
92 tests pass (was 89). Typecheck clean. Zero lint diagnostics in
apps/aggregator.
The slice has now had three rounds of adversarial review with all
findings addressed in-tree. Outstanding deferred items (M7 cache
race under hypothetical parallel dispatcher, M8 audit-noise
mitigation, B4 production wiring still untested by integration
test) tracked for follow-up PRs.
* fix(aggregator): Copilot review feedback
Seven Copilot comments on PR #975. One real bug + four doc/comment
fixes + one test cleanup + one cosmetic.
REAL BUG
- json_group_array(key) over json_each enumerates in unspecified
order, so the capabilities JSON string could vary between runs even
when the key set is unchanged — defeating the WHERE-AND-IS-NOT
short-circuit and re-firing the packages_au trigger on every
idempotent refresh. Wrapped both subqueries in
`(SELECT key FROM json_each(...) ORDER BY key)` so the resulting
array is order-stable.
DOC FIXES
- pds-verify.ts 404 comment claimed "ack without forensics", but the
consumer writes a dead_letters row for RECORD_NOT_FOUND (and tests
assert that behavior). Updated comment to match the implemented
policy: forensics for both the legitimate-race case and the
programming-error case, distinguished by reason code for queryability.
- parseReleaseRkey docstring claimed "callers should treat null as
no-op", but applyDelete now throws IngestError on null to surface
the malformed delete in dead_letters. Updated docstring to describe
the actual contract.
- dead_letters table comment listed reason values that don't match
the implemented `DeadLetterReason` union. Replaced with the actual
list and noted the union lives in records-consumer.ts. Also
clarified payload is UTF-8 JSON bytes, not raw record bytes.
- wrangler.jsonc DLQ-consumer comment claimed "repeated failures get
acked after retries". Actual behavior: handler calls retry() on D1
failure, max_retries: 3 then workerd drops (no DLQ-of-DLQ).
TEST CLEANUP
- Removed an empty placeholder test that documented why the dispatcher's
retry-on-MissingDependency wasn't covered. The proper coverage now
exists in the dispatcher suite via ConsumerDeps.verify injection;
the placeholder was dead.
- Stripped review-finding shorthand codes (Mi6, B3, round-3 #1, etc.)
from describe-block names. The codes were leaking review-process
scaffolding into permanent test descriptions.
91 tests pass (was 92 — dropped the empty placeholder). Typecheck
clean. Zero lint diagnostics in apps/aggregator.
* fix(aggregator): second round of Copilot review feedback
Four more Copilot comments. Two real bugs + two doc fixes.
REAL BUGS
- DidResolver.invalidate() bumped known_publishers.last_seen_at
backwards to 1970-01-01. The implementation called cache.upsert()
with `new Date(0)`, and the D1 binding writes the same timestamp
to all three of `pds_resolved_at`, `first_seen_at`, and
`last_seen_at`. Net effect: invalidating a publisher's cached
signing key (after a key rotation) corrupted the
membership/observation timestamps as a side-effect.
Fix: added a dedicated `expire(did)` method to the DidDocCache
interface. The D1 binding's expire only touches
`pds_resolved_at` (sets to epoch); the Map-backed test cache does
the same in-memory. Resolver's `invalidate()` delegates to
`cache.expire()` instead of forcing the issue through `upsert()`.
- isCommitEvent didn't validate `commit.cid`. The ingestor reads
`event.commit.cid` for non-delete operations, so a malformed
commit event with no cid would slip through and produce a
RecordsJob with `cid: undefined` — which the consumer would then
try to verify against, with no useful error path.
Fix: extended the predicate to require `cid` to be a string when
`operation !== "delete"`. Delete events legitimately have no cid
so the check is conditional. Test stub updated to match.
DOC FIXES
- applyDelete comment for package.profile claimed "we don't cascade
because doing so silently throws away publication history". The
schema FK has been ON DELETE CASCADE since the round-1 fixes (so
out-of-order Jetstream delivery doesn't fail with FK violation
when a profile-delete arrives before its release-deletes). Updated
the comment to describe the actual cascade behavior + the audit
history that DOES survive (release_duplicate_attempts,
dead_letters).
- release_duplicate_attempts schema comment said the consumer used
`ON CONFLICT … DO NOTHING`, but the round-3 fix changed it to
`DO UPDATE SET rejected_at = excluded.rejected_at` so operators
can read freshness from the audit row. Updated the comment.
95 tests pass (was 91; added 2 jetstream-client cid-validation
tests + 2 did-resolver expire tests). Typecheck clean. Zero lint
diagnostics in apps/aggregator.
* fix(aggregator): third round of review feedback
ask-bonk + Copilot caught 5 real bugs and 1 stale comment.
REAL BUGS
- Duplicate-version detection compared raw CAR bytes. CARs include
the publisher's commit + MST proof, which churns whenever the
publisher writes any other record in the same repo — so a benign
re-fetch of an unchanged record produced different bytes and got
misclassified as an immutability violation.
Fix: compare the verified record CID instead. CIDs are
content-addressed and stable for unchanged records. Schema
updated: `release_duplicate_attempts` gained `attempted_cid TEXT
NOT NULL` and the UNIQUE constraint now uses it instead of
`attempted_record_blob`. The blob is still kept for forensics so
operators can inspect what was actually attempted; the conflict
clause refreshes both `rejected_at` and `attempted_record_blob`
so the latest envelope wins. Comparison logic in the writer reads
the existing CID out of `signature_metadata` JSON via a small
`parseCid` helper.
- `processBatch` for-loop didn't isolate per-message failures. A
`writeDeadLetter` throw inside `processMessage` (e.g. transient D1
hiccup mid-batch) escaped to the loop and halted it, leaving
subsequent messages without ack/retry. With max_batch_size 25,
one D1 hiccup could waste up to 24 PDS verification round-trips
on redelivery. Now each message is wrapped in try/catch with a
retry() on uncaught throw, matching the shape `drainDeadLetterBatch`
already uses.
- `computeVersionSort`'s `.zzz` final-release sentinel didn't
actually beat all valid prereleases — `1.0.0-zzzz` sorts AFTER
`1.0.0` because `"zzzz"` is lexically greater than `"zzz"`. Switched
the sentinel to `~` (ASCII 126), one above the prerelease
alphabet's max char (`z` at 122). Final releases now sort after
any prerelease at the same major.minor.patch.
- `parseReleaseRkey` validated nothing about its components — a
malformed rkey like `demo:1.0.0:extra` parsed as
`pkg=demo`, `version=1.0.0:extra`, then UPDATE matched no row and
silently acked. Now validates pkg with PACKAGE_SLUG_RE and version
with SEMVER_RE; returns null on mismatch so applyDelete throws
IngestError → forensics row.
- `isCommitEvent` accepted any string for `commit.operation`. An
unknown operation slipping through would produce a RecordsJob the
consumer can't dispatch on — landing as UNEXPECTED_ERROR in
dead_letters. Now restricts to {create, update, delete}.
DOC FIX
- pds-verify.test.ts header claimed end-to-end coverage of the
`verifyRecord` handoff exists in records-consumer.test.ts via
MockPds + FakePublisher. Actually it's stubbed via
ConsumerDeps.verify — the fixture can't load in the workers test
pool. Updated comment to explain the actual coverage shape.
102 tests pass (was 95). Typecheck clean. Zero lint diagnostics.
ascorbic
added a commit
that referenced
this pull request
May 13, 2026
Addresses 7 findings from the round-6 adversarial review and documents the eighth. #1 (high) Capability consent bypass [registry.ts, RegistryPluginDetail.tsx] The drift check was gated on the client sending acknowledgedDeclaredAccess. If the publisher's release record had no extension, the admin saw an empty permission dialog, omitted the acknowledgement, and the server skipped the check entirely -- letting a bundle whose manifest declares real capabilities slip through behind an empty consent UI. Server now extracts capabilities from the bundle manifest after download and refuses with DECLARED_ACCESS_REQUIRED if the bundle declares any capabilities and no acknowledgement was sent. Client always sends the list (empty when no extension) so the new server check is always armed. #2 (high) Concurrent install bundle deletion [registry.ts] Two parallel installs of the same (did, slug, version) both passed the pre-existing-row check, both uploaded to the same deterministic R2 prefix, and one then won the state-row PK race. The loser's catch block deleted the R2 bundle the winner had just written. On state-write failure we now re-query the state row: if a winner exists, we lost the race and must not touch the R2 bundle. Cleanup runs only when the failure is a real DB error, not a lost concurrent install. #3 (high) SSRF via DNS-resolving public hostnames [registry.ts, ssrf.ts moved] Literal-IP blocklist alone left a DNS-rebinding gap: any public DNS service resolving an attacker-chosen hostname to loopback / RFC1918 / 169.254.169.254 passed the URL check. The import pipeline already shipped resolveAndValidateExternalUrl which does Cloudflare DoH resolution and rejects on any forbidden resolved address; reuse it for artifact downloads. Move src/import/ssrf.ts to src/security/ssrf.ts to reflect that it's not import-specific. Leave a re-export shim at the old path so 13 existing callers keep working unchanged. Add #security/* path alias. #5 (high) Aggregator-supplied handles treated as verified [PublisherHandle.tsx] usePublisherHandle returned status: 'ok' with the aggregator-supplied handle whenever one was present, skipping local DID->handle round-trip. A compromised aggregator could label an attacker DID as e.g. 'stripe.com' and the UI would render it as verified. Always run LocalActorResolver via resolveDidToHandle; use the aggregator handle only for a cross-check. If the aggregator's claim differs from the verified handle, mark the publisher invalid. #6 (medium) Postgres migration 038 schema-qualification [038_registry_plugin_state.ts] The columns probe queried information_schema.columns without filtering by table_schema. A _plugin_state table in another schema (multi-tenant Postgres, per-test schemas) could make the migration skip the column adds. Filter by table_schema = current_schema(). #7 (medium) Install errors leak full artifact URLs [registry.ts] fetchArtifact recorded each full URL in the joined error message that bubbled up to the admin client. Artifacts hosted on storage backends often carry presigned tokens in the query string; failed installs were leaking those into HTTP responses and logs. Strip query and fragment when building client-visible errors (origin + path only); log the full URL server-side for debugging. #8 (medium) Credentialed aggregator URLs accepted [config.ts] validateAggregatorUrl accepted https://user:pass@example.com. The normalized URL ends up in the admin manifest and is shipped to every admin browser; browser fetch() also rejects credentialed URLs outright. Reject them at config-validation time. #4 (high, documented not fixed) Aggregator-trust-root scope [types.ts] Full MST proof / publisher signature verification is not in this PR; the server still trusts the aggregator-supplied (did, slug, checksum, artifact URL). Expand the JSDoc on EmDashConfig.experimental.registry to spell out exactly what the v1 trust contract is, what EmDash does verify independently (checksum, manifest id/version/capabilities), and what it doesn't (release-record signatures, replay). Recommendation: point aggregatorUrl only at an aggregator you operate or trust at centralized-source level until signature verification lands.
ascorbic
added a commit
that referenced
this pull request
May 14, 2026
…#1011) * fix(deps): catalog-pin zod so trusted plugins typecheck Astro bundles its own Zod and re-exports it as 'astro/zod'. Trusted plugins like @emdash-cms/plugin-forms import their route schemas via 'astro/zod', then pass those schemas to definePlugin() in core. With emdash's 'zod: ^4.3.5' resolving independently of Astro's caret, pnpm kept two Zod 4 patches in the tree (e.g. 4.3.6 alongside 4.4.1). Zod 4 embeds its semver in the type system, so two patches of Zod 4 are not assignable to each other. The forms plugin's route schemas (ZodObject<..., $strip>) were rejected by PluginRoute<TInput>['input'] (ZodType<unknown, unknown, $ZodTypeInternals<unknown, unknown>>) with 'Type "3" is not assignable to type "4"' on the internal version field. The native definePlugin overload silently failed, TS fell through to the StandardPluginDefinition overload, and reported a misleading 'id does not exist' error -- masking 8 cascading errors. Catalog-pinning Zod forces a single workspace-wide instance and restores normal overload resolution. No code changes needed in core or plugins/forms. Also adds a pnpm-workspace.yaml comment explaining the gotcha so the next person doesn't bump emdash's pin past Astro's range. * feat(registry): experimental decentralized plugin registry Adds opt-in support for installing sandboxed plugins from the decentralized plugin registry described in RFC #694. Enabled via `experimental.registry.aggregatorUrl` in the EmDash integration options; when set, the admin UI replaces marketplace browse/install with the registry path. Server: new install handler (RFC verification chain), endpoint at POST /_emdash/api/admin/plugins/registry/install, migration 038 adds `source = 'registry'` plus `registry_publisher_did` / `registry_slug` columns on `_plugin_state`, runtime sync split into shared marketplace + registry tiers via a normalized opaque `r_<hash>` plugin id. Browser: aggregator XRPC calls go direct from the admin UI via @emdash-cms/registry-client. Install POST runs through the server. Includes a minimum-release-age policy with a per-publisher exclude allowlist, enforced both client-side (UX) and server-side (gate). Hardening (5 rounds of adversarial review): bundle id rewritten to the derived pluginId before storage, aggregator identity cross-checked, artifact and aggregator URLs validated for SSRF (https-only in prod, IPv6 brackets handled), per-request and total budgets on every outbound call, decompressed bundle capped at 256 KiB to match the RFC publish-time limit, migration 038 idempotent on both SQLite and Postgres. Known gaps tracked for follow-up: full MST signature verification against the publisher's PDS, multibase multihash decoding (hex SHA-256 is accepted today), registry plugin update + uninstall handlers. * fix(registry-lexicons): drop codegen from build script The generated lexicon types are committed to git so consumers don't need the codegen toolchain. Running lex-cli generate as part of the default build pipeline broke Cloudflare Pages builds for sites that pull registry-lexicons in transitively, because lex-cli imports lex.config.ts directly and Node in the CF Pages build environment can't load .ts natively. Codegen moves to a separate `regen` script (`pnpm regen` runs codegen + full build). Maintainers run it when they edit the lexicons; consumers just consume the committed output. * fix(registry): copilot review fixes - Drift check normalizes capabilities (filter strings, dedupe, sort) on both browser and server so reorderings or junk entries can't trigger spurious rejection. Adds a shared normalizeCapabilities helper in registry/config.ts and a mirror in admin/lib/api/registry.ts. - RegistryPluginDetail no longer trusts the aggregator-supplied ext?.capabilities as already-validated string[]; runs it through normalizeCapabilities before display and before send. - Fix stale '32 MiB' docstring on extractBundle (cap is actually MAX_DECOMPRESSED_BUNDLE_BYTES = 256 KiB). - Fix plugin-id.ts JSDoc: validatePluginIdentifier regex is /^[a-z][a-z0-9_-]*$/ (allows hyphens); the prior 'cannot collide with marketplace ids' claim was too strong and is now framed as 'syntactically distinct, plus an explicit pre-existing-row check in the install handler.' * fix(registry): address review findings + CI failures CI fixes: - Rename normalizeCapabilities -> canonicalCapabilitiesForDriftCheck to avoid namespace clash with the existing capability normalizer exported from @emdash-cms/plugin-types via core's index. The old name shadowed plugin-types' helper at the top level of core's dist, which made the definePlugin() overload set look ambiguous to TS in plugins/forms and caused a typecheck cascade there. - [...seen].toSorted() instead of [...seen].sort() to clear the e18e/prefer-spread-syntax + unicorn/no-array-sort lint errors. Review findings (ask-bonk[bot]): - HIGH: drift check tripped on every install when the release record's extension was empty. The browser now omits acknowledgedDeclaredAccess when capabilities is empty, opting out of the server-side drift gate for the (currently common) case where publishers haven't filled in the extension block. The bundle's real capabilities are still bound to the checksum-verified bytes. - HIGH: DID-only publishers (no resolvable handle) could be linked from the browse grid but never installed because the server rejects handles without a '.'. Cards now render as non-interactive with a 'Publisher handle unresolved' badge; the detail page surfaces a matching warning and disables Install. - MEDIUM: registry-enabled sites were unconditionally routing existing marketplace plugin detail URLs to RegistryPluginDetail, breaking deep links. Detail-route selection now discriminates by param shape (pluginId.includes('/')) rather than the manifest flag. - MEDIUM: state-row write failure after storeBundleInR2 left orphan bundles. Best-effort cleanup in the catch via deleteBundleFromR2. - LOW: parseDurationSeconds runs on the user-supplied integration option per install (not the already-normalized manifest shape). Wrap in try/catch and surface as REGISTRY_POLICY_INVALID rather than letting it bubble to a generic INSTALL_FAILED. - LOW: validator-pattern doc drift in plugin-id.ts (already fixed in the prior commit). * fix(registry): move registry config types to their own module The new RegistryConfig + ExperimentalConfig interfaces lived alongside definePlugin's overloads in astro/integration/runtime.ts. tsdown + rolldown's chunking decided to inline a bigger subset of plugin-related types into the entry chunk as a result, which broke definePlugin() overload resolution for trusted plugins building against core's dist on CI (plugins/forms failed with 'id does not exist in type StandardPluginDefinition'). Move both types to packages/core/src/registry/types.ts (still re-exported from runtime.ts for backwards compatibility) so the chunking matches main's layout and definePlugin's overloads resolve as before. * fix(registry): wire up real-world install + display polish Aggregator (apps/aggregator): - Add CORS to /xrpc/* so the admin UI can call it from any origin (preflight 204, response headers on every method). Aggregator is a public read-only service; * is correct here. Core (packages/core): - Implement multibase-multihash checksum verification by re-encoding our SHA-256 digest in the same 'b<base32>' shape the registry CLI produces, rather than decoding the publisher's checksum. Same trust contract, no base32 decoder needed. Bare hex SHA-256 still accepted as a convenience fallback. - Switch install handler to take 'did' (not handle) so packages whose handle the aggregator couldn't resolve are still installable. The browser resolves handle→DID via the aggregator before posting and sends DID directly; the server skips resolvePackage and goes straight to getPackage. - Coerce 'experimental.registry' bare-string shorthand into the full RegistryConfig object via 'coerceRegistryConfig'. 'registry: "..."' is now equivalent to 'registry: { aggregatorUrl: "..." }'. - Plumb 'experimental' through the integration's serializableConfig so the manifest endpoint actually sees the user's registry block. Previously it was being stripped, so the admin UI never branched to the registry path. - Split RegistryConfig + ExperimentalConfig types into their own module (registry/types.ts) so they don't get bundled into the astro/integration/runtime.ts dist chunk -- the wider inlining was breaking definePlugin overload resolution for trusted plugins building against core's dist. Admin (packages/admin): - New <PublisherHandle> component + usePublisherHandle hook with tri-state result ('ok' / 'invalid' / 'missing'). Renders @handle, 'Unverified publisher' (red), or DID respectively. Uses @atcute/identity-resolver's LocalActorResolver for bidirectional handle verification, localStorage-cached for 24h. - Detail page disables install on 'invalid' status (publisher claims a handle that doesn't round-trip back to its DID -- impersonation risk). Surfaces 'We couldn't verify this publisher's identity' alert in plain language. - Detail page reads installed state from fetchPlugins() and swaps the Install button to 'Installed' (disabled) when the package already has a 'source = "registry"' row matching its DID + slug. React Query's existing ['plugins'] invalidation handles the post-install UI update. - Browse cards reuse <PublisherHandle> (variant='card') and link by handle when available, DID otherwise. Detail page parses either form from the URL. - Browser sends 'did' (not handle) in the install POST. Workspace: - '@cloudflare/kumo' moved to the pnpm catalog and bumped to ^1.16.0 workspace-wide. Older 1.10.0 was missing Sidebar export and being hoisted into the admin via packages/blocks's transitive dep. - Add '@atcute/multibase' to core (for checksum encoding) and '@atcute/identity-resolver' to admin (for DID->handle resolution). - Update DEFAULT_AGGREGATOR_URL + DiscoveryClient doc example from 'experimental-registry.emdashcms.com' to 'registry.emdashcms.com' (the actual production host). * fix(registry): adversarial review round 6 findings Addresses 7 findings from the round-6 adversarial review and documents the eighth. #1 (high) Capability consent bypass [registry.ts, RegistryPluginDetail.tsx] The drift check was gated on the client sending acknowledgedDeclaredAccess. If the publisher's release record had no extension, the admin saw an empty permission dialog, omitted the acknowledgement, and the server skipped the check entirely -- letting a bundle whose manifest declares real capabilities slip through behind an empty consent UI. Server now extracts capabilities from the bundle manifest after download and refuses with DECLARED_ACCESS_REQUIRED if the bundle declares any capabilities and no acknowledgement was sent. Client always sends the list (empty when no extension) so the new server check is always armed. #2 (high) Concurrent install bundle deletion [registry.ts] Two parallel installs of the same (did, slug, version) both passed the pre-existing-row check, both uploaded to the same deterministic R2 prefix, and one then won the state-row PK race. The loser's catch block deleted the R2 bundle the winner had just written. On state-write failure we now re-query the state row: if a winner exists, we lost the race and must not touch the R2 bundle. Cleanup runs only when the failure is a real DB error, not a lost concurrent install. #3 (high) SSRF via DNS-resolving public hostnames [registry.ts, ssrf.ts moved] Literal-IP blocklist alone left a DNS-rebinding gap: any public DNS service resolving an attacker-chosen hostname to loopback / RFC1918 / 169.254.169.254 passed the URL check. The import pipeline already shipped resolveAndValidateExternalUrl which does Cloudflare DoH resolution and rejects on any forbidden resolved address; reuse it for artifact downloads. Move src/import/ssrf.ts to src/security/ssrf.ts to reflect that it's not import-specific. Leave a re-export shim at the old path so 13 existing callers keep working unchanged. Add #security/* path alias. #5 (high) Aggregator-supplied handles treated as verified [PublisherHandle.tsx] usePublisherHandle returned status: 'ok' with the aggregator-supplied handle whenever one was present, skipping local DID->handle round-trip. A compromised aggregator could label an attacker DID as e.g. 'stripe.com' and the UI would render it as verified. Always run LocalActorResolver via resolveDidToHandle; use the aggregator handle only for a cross-check. If the aggregator's claim differs from the verified handle, mark the publisher invalid. #6 (medium) Postgres migration 038 schema-qualification [038_registry_plugin_state.ts] The columns probe queried information_schema.columns without filtering by table_schema. A _plugin_state table in another schema (multi-tenant Postgres, per-test schemas) could make the migration skip the column adds. Filter by table_schema = current_schema(). #7 (medium) Install errors leak full artifact URLs [registry.ts] fetchArtifact recorded each full URL in the joined error message that bubbled up to the admin client. Artifacts hosted on storage backends often carry presigned tokens in the query string; failed installs were leaking those into HTTP responses and logs. Strip query and fragment when building client-visible errors (origin + path only); log the full URL server-side for debugging. #8 (medium) Credentialed aggregator URLs accepted [config.ts] validateAggregatorUrl accepted https://user:pass@example.com. The normalized URL ends up in the admin manifest and is shipped to every admin browser; browser fetch() also rejects credentialed URLs outright. Reject them at config-validation time. #4 (high, documented not fixed) Aggregator-trust-root scope [types.ts] Full MST proof / publisher signature verification is not in this PR; the server still trusts the aggregator-supplied (did, slug, checksum, artifact URL). Expand the JSDoc on EmDashConfig.experimental.registry to spell out exactly what the v1 trust contract is, what EmDash does verify independently (checksum, manifest id/version/capabilities), and what it doesn't (release-record signatures, replay). Recommendation: point aggregatorUrl only at an aggregator you operate or trust at centralized-source level until signature verification lands. * fix(registry): adversarial review round 7 followups Two LOW findings from the round-7 review (PR #1011 comment). NSID exact-match in RegistryPluginDetail.tsx Round-6 left a startsWith() match on the release-extension key. RFC 0001 fixes the NSID for the release extension; accepting prefix variants (...releaseExtensionV2, ...releaseExtension.deprecated) would let a publisher render a different capability list than the canonical key would. Use exact-equality keyed lookup. Registry plugin uninstall affordance in PluginManager.tsx Registry-installed plugins appear in PluginManager but the Uninstall button is gated on isMarketplace. Admins see a permanent-looking install with no way to remove it short of editing the DB and R2 by hand. Add an inline note for source === 'registry' rows that says uninstall isn't available yet and points the admin at the disable toggle. Full uninstall handler lands in a follow-up PR.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.