feat(phase6c-pr2): authoring lifecycle + actor_audit + multi-point catalog enforcement#27
Conversation
…talog enforcement Phase 6c PR-2 (per docs/phase6c-plan.md v5.1 §3.2). Closes the catch-22 where role_dispatch was UI-disabled because no surface existed to set candidate.execution_role. The role choice now flows through the apply payload (and an inline candidate editor), is validated against the catalog at four entry points, and every mutation writes a structured audit row. ## Backend - Migration 030: new actor_audit table (generic discriminator on subject_kind so other fields can adopt the same model later). Append-only, no FK on subject_id (history-preserving by design). - New backend/internal/audit/ package: Record(tx, ...) inside caller's transaction; QueryLatest(ctx, db, ...) for reads. ActorKind enum: user / api_key / router / system / connector. Confidence is rejected for non-router actors so downstream router-only queries cannot be polluted (critic round 1 #1). - Role.Category field on backend/internal/roles/catalog.go (all 6 roles tagged "role"; the dispatcher meta-role lands in PR-3). TestCatalogMatchesPromptDir now cross-checks the frontmatter category column. - BacklogCandidateStore.Update(id, req, actor) and ApplyToTaskWithMode (id, mode, executionRole, actor) signatures changed to carry ActorInfo. Both validate execution_role against the catalog, write the column, and emit the audit row inside the same transaction. New typed errors: ErrBacklogCandidateMissingExecutionRole, ErrBacklogCandidateUnknownExecutionRole. - BacklogCandidateStore.EnrichWithAuthoring populates BacklogCandidate.ExecutionRoleAuthoring from the latest audit row so the GET / PATCH / apply responses surface the who/when/rationale trail without a second API round-trip (risk-reviewer H1 fix). Pre-Phase-6c rows have no audit history and surface null — backfill is intentionally not done. - TaskStore.ClaimNextDispatchTask drains stale-role tasks (role no longer in catalog) by transitioning queued -> failed with error_kind=role_not_found and an actor_kind=system audit row, then continues to find the next valid task. Drain is bounded by staleDrainCap (16 per call) so a poisoned queue cannot wedge the connector loop (critic round 1 #6). - New GET /api/roles handler (public, no auth — catalog is public source data). Filters category="role"; meta-roles introduced in PR-3 will be excluded from this surface. - New buildAuthoringActor(r, rationale) helper: routes session callers to ActorUser, API-key callers to ActorAPIKey (id="api-key:<key id>"), else logs an "unauthenticated" defense-in-depth row. Resolves critic round 1 #3 + risk-reviewer M1 (the apply/PATCH paths previously collapsed both auth channels into ActorUser with empty actor_id when api-key was active). ## Frontend - New types/roles.ts: KNOWN_ROLE_IDS hand-mirror of the Go catalog + isKnownRoleId narrow function. Drift-detected by roles.test.ts which fetches /api/roles via mock and asserts set equality. - New ExecutionRoleAuthoring type on BacklogCandidate; the candidate response now carries who/when/why for execution_role. - usePlanningWorkspaceData fetches /api/roles on mount; availableRoles is RoleInfo[] | null with the null sentinel meaning "still loading" so the panel can suppress the stale-role warning during the fetch (critic round 1 #5 + risk-reviewer L1). - CandidateReviewPanel: role_dispatch radio is always enabled (the Phase 5 disabled-on-no-role gating is removed). When selected, an inline role <select> shows each catalog role with version + estimated minutes. Apply button is disabled until a role is chosen. Stale-role warning fires only after /api/roles resolves with the candidate's role missing from it. - New CandidateRoleEditor: chip + inline edit popover so operators can pre-tag candidates with a role outside the apply flow. PATCH carries the actor through the same audit path as the apply endpoint. ## Tests - audit/audit_test.go: 9 tests covering record + query + rollback + concurrent insert + ordering + router confidence required + non-router confidence rejected. - store: replaced two Phase 5 contract tests ("unknown role accepted", "rune-truncation defense") with Phase 6c PR-2 catalog-enforcement counterparts. Existing Update callers updated to pass audit.ActorInfo{}. - handlers/connector_dispatch_test.go: three new tests covering TestClaimNextTask_StaleRoleTransitionsToFailed, TestClaimNextTask_DrainsStaleRoleThenClaimsNext, TestRolesEndpoint_ReturnsCatalog. - frontend: 7 CandidateRoleEditor tests, 5 new CandidateReviewPanel tests covering role select rendering + Apply gating + stale warning, 6 drift tests against /api/roles. ## Reviews completed - make pre-pr: green twice (SQLite + PostgreSQL + frontend build + lint + typecheck). - critic subagent (round 1): 10 findings, 4 Mandatory all addressed (#1 confidence rule, #3 actor disambiguation, #5 stale-warning race, #6 drain loop cap). 3 Should-fix addressed (#2 documented, #7 documented, #9 verified). - /security-review: 0 findings at confidence >= 8. - risk-reviewer: 3 HIGH, 4 MED, 6 LOW. HIGH/MED all addressed inline (H1 audit read path via EnrichWithAuthoring; H2 DECISIONS entry added; H3 documented as PR-4 surfacing path; M1 buildAuthoringActor; M2/M3 docs/api-surface.md + docs/data-model.md updated). ## Docs - docs/api-surface.md: PATCH + apply contracts updated, new GET /api/roles documented, breaking-change call-out for the apply payload. - docs/data-model.md: actor_audit table documented; execution_role description updated. - DECISIONS.md: new 2026-04-26 entry recording the implementation refinements that diverged from the 2026-04-25 entry's pre-implementation API names + cascade- delete description. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Implements Phase 6c PR-2 authoring lifecycle for execution_role via a generic actor_audit ledger, and enforces role-catalog validity across PATCH/apply/claim-next-task/frontend.
Changes:
- Add
actor_audit(migration 030) plusinternal/auditwriter/reader, and surface latest authoring metadata asexecution_role_authoringon candidate responses. - Enforce role catalog at PATCH/apply boundaries and drain stale-role tasks in
ClaimNextDispatchTask(queued→failed + audit row, capped drain loop). - Add public
GET /api/rolesand wire frontend dropdown + drift tests + inline role editor UI.
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| frontend/src/types/roles.ts | Adds frontend-maintained mirror of backend role ids + type guard. |
| frontend/src/types/roles.test.ts | Drift test ensuring frontend role id mirror matches /api/roles. |
| frontend/src/types/index.ts | Extends BacklogCandidate with execution_role_authoring and defines ExecutionRoleAuthoring type. |
| frontend/src/pages/ProjectDetail/planning/hooks/usePlanningWorkspaceData.ts | Loads /api/roles, manages apply-time role selection, adds candidate role PATCH helper. |
| frontend/src/pages/ProjectDetail/planning/CandidateRoleEditor.tsx | New inline chip/editor to PATCH candidate execution_role. |
| frontend/src/pages/ProjectDetail/planning/CandidateRoleEditor.test.tsx | Tests for CandidateRoleEditor behavior (display/edit/save/cancel/disabled). |
| frontend/src/pages/ProjectDetail/planning/CandidateReviewPanel.tsx | Integrates CandidateRoleEditor, always-enabled auto-dispatch mode, role <select>, stale warning, apply gating. |
| frontend/src/pages/ProjectDetail/planning/CandidateReviewPanel.test.tsx | Updates/adds tests for new radio/select/apply gating/stale warning behavior. |
| frontend/src/pages/ProjectDetail/PlanningTab.tsx | Threads new workspace props (roles + chosen role + update callback) into panel. |
| frontend/src/pages/ProjectDetail/PlanningTab.test.tsx | Mocks listRoles for workspace mount. |
| frontend/src/pages/ProjectDetail.test.tsx | Mocks listRoles to keep page tests passing. |
| frontend/src/api/client.ts | Extends apply payload to include execution_role; adds listRoles() + RoleInfo. |
| docs/data-model.md | Documents actor_audit table and strengthened execution_role semantics. |
| docs/api-surface.md | Updates PATCH/apply contracts, adds /api/roles, documents execution_role_authoring. |
| backend/internal/store/task_store.go | Adds stale-role drain loop and audit write on queued→failed for stale role sources. |
| backend/internal/store/backlog_candidate_store_test.go | Updates store.Update signature usages to include audit.ActorInfo. |
| backend/internal/store/backlog_candidate_store.go | Adds catalog enforcement + transactional audit recording; adds EnrichWithAuthoring; updates apply signature. |
| backend/internal/store/backlog_candidate_execution_role_test.go | Updates tests for new apply contract and PATCH rejection of unknown roles. |
| backend/internal/router/router.go | Wires new public /api/roles route. |
| backend/internal/roles/catalog_test.go | Extends drift test to include category validation. |
| backend/internal/roles/catalog.go | Adds role Category field and constants; sets categories for existing roles. |
| backend/internal/models/requirement.go | Adds execution_role_authoring to BacklogCandidate and adds apply payload execution_role. |
| backend/internal/handlers/roles.go | Implements GET /api/roles filtered to category=role. |
| backend/internal/handlers/planning_runs.go | Plumbs actor-aware auditing into PATCH/apply; enriches responses with authoring projection. |
| backend/internal/handlers/handlers_test.go | Updates backlog candidate store Update calls for new signature. |
| backend/internal/handlers/connector_dispatch_test.go | Adds tests for stale-role draining and /api/roles endpoint. |
| backend/internal/handlers/authoring_actor.go | Adds actor builder distinguishing API-key vs session user for audit attribution. |
| backend/internal/handlers/apply_execution_mode_test.go | Updates apply-mode tests for new required execution_role in role_dispatch mode. |
| backend/internal/audit/audit_test.go | Adds coverage for record/query, router confidence rules, rollback semantics, concurrency. |
| backend/internal/audit/audit.go | Introduces audit.Record + audit.QueryLatest with actor-kind + confidence validation. |
| backend/db/migrations/030_authoring_audit.sql | Adds actor_audit table and indexes. |
| backend/db/migrations/030_authoring_audit.down.sql | Down migration dropping indexes/table. |
| backend/cmd/server/main.go | Registers RolesHandler in server deps. |
| DECISIONS.md | Records PR-2 refinement decisions and constraints. |
| // Phase 6c PR-2: load /api/roles once on mount; the panel uses this | ||
| // to render the dropdown options. listRoles is cheap (catalog is | ||
| // server-side static data) and the response feeds the drift test | ||
| // catch in roles.test.ts. | ||
| // | ||
| // Loading sentinel: `null` distinguishes "fetch in flight" from | ||
| // "fetch succeeded with an empty list" (per critic round 1 #5 + | ||
| // risk-reviewer L1). The CandidateReviewPanel suppresses the | ||
| // stale-role warning while loading to avoid false positives on | ||
| // mount / on transient network failure. | ||
| const [availableRoles, setAvailableRoles] = useState<RoleInfo[] | null>(null) | ||
| useEffect(() => { | ||
| listRoles() | ||
| .then(resp => setAvailableRoles(resp.data ?? [])) | ||
| .catch(() => setAvailableRoles([])) | ||
| }, []) |
There was a problem hiding this comment.
The comment says the null loading sentinel suppresses stale-role warnings “on transient network failure”, but the catch path sets availableRoles to [], which makes downstream code treat the catalog as loaded-but-empty (and can trigger stale warnings / an empty dropdown with no loading indicator). Either keep availableRoles as null (or add an explicit error state) on fetch failure, or update the comment and ensure the UI communicates the failure clearly.
| // Phase 6c PR-2: catalog enforcement. Parse role_id from source | ||
| // ("role_dispatch:<role>") and check IsKnown. Stale-role tasks are | ||
| // transitioned to failed in this same transaction and the outer | ||
| // loop retries to find the next valid task. | ||
| roleID := parseRoleIDFromSource(task.Source) | ||
| if !roles.IsKnown(roleID) { | ||
| now := time.Now().UTC() | ||
| execResultJSON, _ := json.Marshal(map[string]string{ | ||
| "error_kind": models.ErrorKindRoleNotFound, | ||
| "error_message": fmt.Sprintf("role %q is not in the current catalog", roleID), | ||
| }) |
There was a problem hiding this comment.
When task.Source is the legacy value "role_dispatch" (no : suffix), parseRoleIDFromSource returns an empty string, and the resulting failure message becomes role "" is not in the current catalog. If these legacy tasks can still exist, consider explicitly detecting the missing role suffix and producing a clearer error_message/rationale (or a distinct error_kind) so operators can understand what went wrong.
Six line-level findings from copilot-pull-request-reviewer on PR #27, all addressed without simple/temporary workarounds: 1. CandidateReviewPanel pass-through: stop coercing the null loading sentinel to []; pass `availableRoles ?? null` so CandidateRoleEditor keeps its loading-state suppression. 2. CandidateRoleEditor staleness SoT: once /api/roles loads, base isStale on the runtime catalog (`availableRoles.find`), and only fall back to the static KNOWN_ROLE_IDS mirror while still loading. Survives staggered deploys where backend catalog leads frontend. 3. usePlanningWorkspaceData fetch failure: keep availableRoles=null on catch (never []), and add a sibling availableRolesError string state plumbed through PlanningTab → both panels. Editor + dropdown render an explicit "Failed to load roles: …" message so operators can distinguish transient failure from "catalog is empty". 4. task_store.parseRoleIDFromSource: split malformed-source from role-not-found. New error_kind=role_dispatch_malformed (with its own remediation hint) is emitted when the task source has no role suffix; role_not_found stays for catalog absence with a well-formed id. Helper now returns (roleID, hasSuffix). 5. Migration 030 comment: enumerate all five actor_kind values (added api_key) and remove an embedded `;` that re-tripped the SQLite migration-parser comment-split bug. 6. audit package: split ErrConfidenceNotAllowed (non-router supplied confidence) from ErrInvalidConfidence (router missing/out-of-range). Logs/debug now describe the actual failure mode. pre-pr (lint + SQLite + PostgreSQL + frontend typecheck + vitest + production build): all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… docs Critic round 4 raised two Mandatory and three Suggested findings on the prior commit (72e066d). All addressed: M1 — Connector parser parity. backend/internal/connector/service.go RunOnceTask was emitting ErrorKind: "unknown" for both the missing- suffix and unknown-role branches. Now mirrors the server: emits models.ErrorKindRoleDispatchMalformed for missing/empty role suffix and models.ErrorKindRoleNotFound for well-formed-but-absent ids, matching the diagnostic the server uses on the claim path. Updated godoc on parseRoleIDFromSource to reflect this lockstep. M2 — Backfill obligation documented. New 2026-04-27 DECISIONS entry constraint (d) explicitly waives backfill: Phase 6c is greenfield, no live deployment carries tasks rows with source="role_dispatch" (no colon), so the new error_kind has no historical backfill obligation. Future operators who observe this kind should investigate the candidate-applier code path. S1 — Malformed-source store-layer test. New TestClaimNextTask_MalformedSourceTransitionsToFailed in connector_dispatch_test.go covers all three sub-cases (no_colon, empty_suffix, whitespace_suffix) and asserts error_kind + error_message + audit row. Added seedQueuedTaskWithSource helper to seed arbitrary source strings. S2 — Frontend error-UI test coverage. Two new CandidateRoleEditor.test.tsx tests: - shows "Failed to load roles: …" alert when availableRolesError is set and the editor is opened - does NOT flag stale while availableRoles is loading (null) without an error (regression guard for the suppression logic). S4 — Server-push catalog assumption documented inline. New code comment in CandidateRoleEditor.tsx flags the assumption that availableRoles is fetched once on mount and never mutated mid- session, so a future Phase 6c PR-4/PR-5 SSE change to push catalog updates needs to revisit the staleness check. pre-pr (lint + SQLite + PostgreSQL + frontend typecheck + vitest + production build): all green. risk-reviewer: zero HIGH/MEDIUM, six LOW (all dispositioned). /security-review: zero findings at confidence >= 8. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Phase 6c PR-2 — 完整 authoring lifecycle、
actor_audit通用稽核 ledger,以及四個強制點 (PATCH / Apply / claim-next-task / Frontend) 的角色 catalog 驗證。actor_audit表 (migration 030) — 通用 append-only ledger,以subject_kind區分;支援actor_kind∈ {user, api_key, router, system, connector},router 必填 confidence、其他 kind 禁用 confidence。BacklogCandidateStore.Update/ApplyToTaskWithMode改為 transactional,execution_role變動時於同一 tx 寫入 audit row;EnrichWithAuthoring於 handler layer join 最新 audit row 餵到ExecutionRoleAuthoring欄位 (audit-as-SoT, 不冗餘儲存 set_by/at)。ClaimNextDispatchTask在 outer loop 偵測 stale role (catalog 移除後遺留任務),自動 queued→failed 並寫 audit row;以staleDrainCap = 16防止 connector 在 poisoned queue 下 wedge。Applypayload 新增execution_role;前端role_dispatchradio 一律 enabled,內嵌<select>即時挑角色,支援loading sentinel(null) 避免 mount 時誤觸 stale warning。GET /api/roles(公開,category=role過濾);前端KNOWN_ROLE_IDSdrift test 鎖住前後端 catalog 同步。buildAuthoringActorhelper 區分 api-key vs session-user actor,避免 PATCH/Apply 把 connector 寫成 user。詳細設計請見
docs/phase6c-plan.mdv5.1 §3.2 與DECISIONS.md2026-04-26 條目。Review trail
Test plan
make pre-pr全綠 (本地)GET /api/roles回傳僅含category=role的條目🤖 Generated with Claude Code