
fix: skill_runner reads /tmp/skills (matches where install_skills writes) #4

Merged: ericodom merged 1 commit into main from worktree-builtin-tools-deploy on Apr 12, 2026

@ericodom
Contributor

Summary

  • `skill_runner.py` was pointing at `/app/skills` — a directory the Dockerfile creates empty and never fills. `install_skills.install_skill_from_s3()` downloads every per-request skill to `/tmp/skills`. The read path and the write path never aligned, so every skill silently failed to register on the parent agent.
  • Found this while validating built-in tools end-to-end. The `chat-agent-invoke` logs correctly said "Injected built-in tool 'web-search' (provider=exa)", the AgentCore container received the payload, and `install_skill_from_s3` pulled the files down. Then `register_skill_tools_grouped` logged `0 tool-mode tools, 0 agent-mode skills, 0 total` because it was looking in the wrong place. Claude's reply was: "I don't have a live web search tool available."

Test plan

  • Repro: observed the `0 tool-mode tools, 0 agent-mode skills` line in /thinkwork/dev/agentcore after sending a chat
  • After merge + deploy (needs container rebuild): send the same chat, expect Claude to call `web_search` and return actual Austin results
  • Re-test SerpAPI provider path for the same tool

Depends on the merge of #3.

install_skills.install_skill_from_s3() writes downloaded skills to
/tmp/skills/{skill_id}/, but skill_runner.SKILLS_DIR was set to
/app/skills — a directory the Dockerfile creates but never populates.
Every per-request skill registration therefore silently skipped every
skill (parse yields None because skill.yaml doesn't exist under /app).

Observed: chat-agent-invoke logs showed 'Injected built-in tool
web-search (provider=exa)' and the AgentCore container received the
payload, but 'Grouped skill registration: 0 tool-mode tools, 0
agent-mode skills, 0 total' — so Claude never saw web_search, never
called it, and replied "I don't have a web search tool available."

This fix aligns the read path with the write path. No tenant data
lives under /app/ anyway — Lambda /tmp is the only writable location
at runtime, and that's already what install_skills uses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ericodom ericodom merged commit 17c7b7b into main Apr 12, 2026
2 checks passed
ericodom added a commit that referenced this pull request Apr 20, 2026
…ck (#296)

Follow-up to #294 / #295. Marco bootstrap chain worked for the first
hop but silently stopped at step 2: job #2 enqueued #3 successfully,
but #3 (which hit max_new_pages again) didn't enqueue #4.

Root cause: the continuation bucket was computed as
`Date.now() + 300s`, which anchors it on the wall clock instead of on
the job's own bucket. When a chained job finishes without the wall
clock crossing a bucket boundary (the job ran 112s against a 300s
bucket), the computed "next" bucket equals the bucket the job itself
is running in. The dedupe key collided with the child's own row,
`ON CONFLICT DO NOTHING` swallowed the insert, and the chain died
without a visible error.

Anchor the offset on `job.created_at` instead. Each step in a chain
now produces a strictly-monotonic bucket: parent in bucket N ⇒ child
enqueued for bucket N+1, child in bucket N+1 ⇒ grandchild for N+2 —
regardless of how long any step took to run. Dedupe collisions can
only fire against external jobs (e.g., a memory-retain trigger that
hit the same bucket), which is the correct behavior.
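
The difference between the two anchoring strategies can be sketched as follows. This is a minimal illustration, not the shipped code: it assumes buckets are computed by integer division over a 300-second window, and the names `bucketOf`, `nextBucketWallClock`, `nextBucketAnchored`, and `createdAtMs` are hypothetical. It also assumes a job's `created_at` falls inside the bucket its own dedupe key names, which the commit message does not spell out.

```typescript
const BUCKET_MS = 300_000; // 300s bucket length, per the commit message

// Bucket index for a timestamp in epoch milliseconds.
const bucketOf = (ms: number): number => Math.floor(ms / BUCKET_MS);

// Pre-fix: the continuation bucket depends on when the step happens to finish.
function nextBucketWallClock(nowMs: number): number {
  return bucketOf(nowMs + BUCKET_MS);
}

// Post-fix: the continuation bucket depends only on the job's own anchor.
// Assumption: createdAtMs lies inside the job's own bucket.
function nextBucketAnchored(job: { createdAtMs: number }): number {
  return bucketOf(job.createdAtMs + BUCKET_MS);
}

// The parent finishes 10s into bucket 1000 and enqueues a child for bucket 1001.
const parentFinishMs = 1000 * BUCKET_MS + 10_000;
const childBucket = nextBucketWallClock(parentFinishMs);

// The child runs for 112s and recomputes from the wall clock. No bucket
// boundary was crossed, so it lands on its own bucket and the key collides.
const childFinishMs = parentFinishMs + 112_000;
const collidingBucket = nextBucketWallClock(childFinishMs);

// Anchored on the child's own bucket, the result is one bucket ahead
// no matter how long the child ran.
const child = { createdAtMs: childBucket * BUCKET_MS };
const grandchildBucket = nextBucketAnchored(child);
```

In this sketch the wall-clock variant returns the child's own bucket, while the anchored variant returns the next one.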

Observed on dev: Marco job chain went 1→2, stopped. Post-fix,
re-triggering will exercise the continuation through the full 261
memories until the cursor drains.

No test regression (411/419, pre-existing 8 skips). Continuation
behavior itself is covered by the Marco integration that spawned
this fix.
ericodom added a commit that referenced this pull request Apr 20, 2026
…tion fix (#318)

Closes handoff plan items #3 and #4.1 from
plans/2026-04-20-006-handoff-cluster-enrichment-and-followups.md.

## #4.1 extractCityFromAddress dotted-abbreviation fix

Spanish/Canadian/Australian addresses like "..., San Miguel de Allende,
Gto., Mexico" previously produced candidate "Gto." (Guanajuato) because
the region-code walk only recognized `^[A-Z]{2,4}(\s|$)` patterns. The
audit showed 32 Marco records producing "Gto." and 22 producing "Q.R."
(Quintana Roo) as candidate cities — both now correctly resolve to the
preceding city slot.

Fix: `isDottedRegionAbbr` recognizes `[A-Z][a-z]{0,2}\.` groups
(up to 4 repetitions, ≤ 10 chars) and terminates the walk like the
existing US-style region codes do. Record-expander probe on Marco
confirms: candidate "San Miguel De Allende" (support=32) replaces "Gto."
and the equivalent cities replace "Q.R."
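
A rough sketch of the dotted-abbreviation check, built only from the patterns quoted above. The regex composition and the `isRegionToken` wrapper are assumptions for illustration; the real `isDottedRegionAbbr` may differ in detail.

```typescript
// Region-code pattern the walk already recognized, e.g. "TX".
const US_STYLE_REGION = /^[A-Z]{2,4}(\s|$)/;

// One to four dotted groups such as "Gto." or "Q.R."
const DOTTED_REGION = /^(?:[A-Z][a-z]{0,2}\.){1,4}$/;

function isDottedRegionAbbr(token: string): boolean {
  const t = token.trim();
  // Length cap of 10 characters, per the commit message.
  return t.length <= 10 && DOTTED_REGION.test(t);
}

// Hypothetical wrapper: either pattern terminates the region walk.
function isRegionToken(token: string): boolean {
  return US_STYLE_REGION.test(token.trim()) || isDottedRegionAbbr(token);
}
```

With a check like this, "Gto.", "Q.R.", and "B.C." stop the walk the same way "TX" does, so the preceding slot is taken as the city.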

## #3 summary-expander → deterministic linker wiring

The audit showed `deriveParentCandidatesFromPageSummaries` produces 91
candidates on Marco (Toronto, Seattle, Honolulu hubs + many more) but
the deterministic linker only consumed `deriveParentCandidates(records)`
— so those candidates never became links.

### Type extension

`DerivedParentCandidate` gets `sourceKind?: "record" | "summary"`:
- record: `sourceRecordIds` are memory-record ids (existing path)
- summary: `sourceRecordIds` are page ids (new path)
Field is optional so existing test fixtures stay green; emitter defaults
to "record" when omitted.

### Emitter update

`emitDeterministicParentLinks` now builds two leaf indexes:
- `leavesByRecord` — keyed on memory-record id (existing)
- `leavesById` — keyed on page id, built from `[...scopePages,
  ...affectedPages]` (new)

Summary-kind candidates route through `leavesById`. A new optional
`scopePages` arg feeds the index so summary-kind candidates can resolve
leaves that weren't touched THIS batch — necessary because summary-based
candidates come from a scope-wide scan, not batch-local records.
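
The type extension and the two leaf indexes might fit together roughly as below. Only `DerivedParentCandidate`, `sourceKind`, `sourceRecordIds`, `leavesByRecord`, `leavesById`, `scopePages`, and `affectedPages` come from the commit message; the `LeafPage` shape, the `parentTitle` field, and the `resolveLeaves` helper are hypothetical stand-ins for whatever `emitDeterministicParentLinks` does internally.

```typescript
type SourceKind = "record" | "summary";

interface DerivedParentCandidate {
  parentTitle: string;       // hypothetical field
  sourceRecordIds: string[]; // memory-record ids or page ids, depending on sourceKind
  sourceKind?: SourceKind;   // optional; treated as "record" when omitted
}

interface LeafPage {         // hypothetical shape
  id: string;
  recordIds: string[];
}

function resolveLeaves(
  candidate: DerivedParentCandidate,
  affectedPages: LeafPage[],
  scopePages: LeafPage[] = [],
): LeafPage[] {
  // Existing index: memory-record id -> leaf pages touched this batch.
  const leavesByRecord = new Map<string, LeafPage[]>();
  for (const page of affectedPages) {
    for (const recordId of page.recordIds) {
      const pages = leavesByRecord.get(recordId) ?? [];
      pages.push(page);
      leavesByRecord.set(recordId, pages);
    }
  }

  // New index: page id -> leaf page, including pages outside this batch.
  const leavesById = new Map<string, LeafPage>();
  for (const page of [...scopePages, ...affectedPages]) {
    leavesById.set(page.id, page);
  }

  const kind: SourceKind = candidate.sourceKind ?? "record";
  const found = new Map<string, LeafPage>();
  for (const sourceId of candidate.sourceRecordIds) {
    const pages =
      kind === "summary"
        ? [leavesById.get(sourceId)]
        : leavesByRecord.get(sourceId) ?? [];
    for (const page of pages) {
      if (page) found.set(page.id, page); // dedupe by page id
    }
  }
  return Array.from(found.values());
}
```

Defaulting to "record" keeps older fixtures working, and passing `scopePages` is what lets a summary-kind candidate reach a leaf that the current batch never touched.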

### Compiler wiring

`applyPlan` now calls BOTH expanders and passes the combined candidates
to the linker, along with `candidatePages` (already fetched for the
planner call) as `scopePages`. The two lists are concatenated rather
than merged, which sidesteps the merge-across-kinds ambiguity; the
emitter handles each candidate according to its kind.

### Precision filter tightening

Summary-expander output adds `isLikelyCityToken` filter before pushing
into the byCity map:
- drops > 4-word fragments ("Prospect Interested In The Full PVL Product Line")
- drops < 3-char tokens ("St")
- drops street-suffix endings ("Congress Ave", "Queen St")

These are the noise categories the 04-20 audit surfaced; real cities
like "Buenos Aires" (2 words), "New York" (2 words), and "Montréal"
(1 word with accent) all pass through.
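
A filter with the three rules above might look like the following. This is a sketch under assumptions: the suffix list is invented for illustration, and the real `isLikelyCityToken` may apply more rules than the three listed.

```typescript
// Hypothetical suffix list; the actual list is not shown in the commit message.
const STREET_SUFFIXES = new Set(["ave", "st", "rd", "blvd", "dr", "ln", "hwy"]);

function isLikelyCityToken(candidate: string): boolean {
  const text = candidate.trim();
  if (text.length < 3) return false; // e.g. "St"

  const words = text.split(/\s+/);
  if (words.length > 4) return false; // long sentence fragments

  // Compare the last word, minus any trailing period, against the suffix list.
  const last = words[words.length - 1].replace(/\.$/, "").toLowerCase();
  if (STREET_SUFFIXES.has(last)) return false; // e.g. "Congress Ave"

  return true;
}
```

Under these rules "Buenos Aires", "New York", and "Montréal" pass, while "St", "Queen St", and the nine-word prospect fragment are dropped.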

## Expected impact

On the next Marco recompile: `links_written_deterministic` should
increase from the current 14/batch (record-only path) by 30-60 links
(Toronto entity leaves → Toronto hub-if-exists-else-fuzzy-match,
Seattle / Honolulu / etc.) — validated by unit tests; live-compile
numbers land after deploy.

Backfill dry-run on Marco: 22 parent links (up from 21), all spot-
check precision-correct; wet run: 386 → 387 reference links (+1 net
new, most attempts are idempotent re-writes of existing edges).

## Test coverage

- 3 new dotted-abbreviation tests (Gto., Q.R., B.C.)
- 5 new summary-expander filter tests (word cap, short, street suffix,
  sourceKind tagging for both expanders)
- 5 new deterministic-linker tests for the summary-kind branch
  (scopePages leaf resolution, empty-pool skip, non-entity filter,
  back-compat when sourceKind omitted, both-kinds-fire for same parent)
- Total: 471 passed / 8 skipped (up from 458).
- Typecheck clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request Apr 20, 2026
…ks dedup (#319)

Captures the 2026-04-20 third-session end state. Supersedes
plans/2026-04-20-006-handoff-cluster-enrichment-and-followups.md now that
PR #318 closed out items #3 (summary-expander wiring) and #4.1
(dotted-abbreviation fix).

Four remaining items in priority order:

1. Validate #318 on Marco (TOP PRIORITY). PR promised
   links_written_deterministic jump of +30-60 on the next live compile,
   but the backfill can't exercise that path. Trigger a manual compile
   and verify the prediction before building more on top.

2. Unit 6 mention cluster enrichment. Biggest remaining plan item. Brief
   on the key design decision (mentionClusterEnrichments[] JSON shape —
   Option A separate rows recommended over Option B inline-promotions).

3. wikiBacklinks dedup. Warm-up PR (<1 hour). listBacklinks misses the
   dedup pattern that listConnectedPages already has.

4. Trivial grab-bag: wipeWikiScope FK dependency, applier-split debt
   (good to bundle with #2 since cluster promotion adds more lines to
   the already-1300-line applyAggregationPlan).

Session tally: 6 PRs merged (#309, #311, #312, #316, #317, #318).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request Apr 22, 2026
Three drift incidents in five days (Apr 17 mig 0008, Apr 18 collision,
Apr 21 0018+0019) had the same root cause and the same named-but-
unshipped fix. Captures the pattern so the next author sees it before
shipping migration #4.

The drift-reporter fix shipped in PR #367; this is the learning that
explains why the reporter exists and what marker convention every
future unindexed migration must follow.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request Apr 22, 2026
Resolves the P1 tension the handoff flagged: the plan honestly scoped R13
to 'no token via Python-stdio-mediated writes or known-shape CloudWatch
patterns' and added T1b (intra-tenant template-author exfil), but the
brainstorm still read absolutely. Three edits align the goalposts:

1. R13 rescoped to match the plan, with named residual coverage gaps
   (os.write at fd level, subprocess env dumps, C-extension writes,
   multiprocessing workers, adversarial split-writes) tracked as the
   Stdout-bypass class alongside T1.
2. Success Criterion #4 softened to match — 'within R13's scope.'
3. T1b added as a first-class residual threat between T1 and T2, with
   v1 mitigations (1-hour TTL, shared-template-author review as
   compensating control, tenant-as-trust-boundary) and v2 hardening
   track (per-user ABAC session tags or in-process credential proxy,
   latter preferred because it addresses T1 and T1b simultaneously).
ericodom added a commit that referenced this pull request Apr 22, 2026
…ation (Unit 6) (#430)

## createTenant wiring

After INSERTing the tenant row, createTenant now invokes the
agentcore-admin Lambda's /provision-tenant-sandbox route (plan Unit 5)
via a new invokeProvisionTenantSandbox helper. The invoke uses
InvocationType: RequestResponse per feedback_avoid_fire_and_forget_lambda_invokes
so errors surface inside createTenant — but createTenant catches and
logs them so a sandbox outage doesn't turn into a tenant-onboarding
outage. The reconciler (Unit 6 follow-up) sweeps rows with null
sandbox_interpreter_*_id at its own cadence.

invokeProvisionTenantSandbox:
- Reads AGENTCORE_ADMIN_LAMBDA_ARN + AGENTCORE_ADMIN_TOKEN env vars;
  throws SandboxProvisioningConfigError if missing (distinguishable so
  missing config is a warn, not an error).
- Builds the API Gateway v2 envelope the handler is written for.
- 45s abort signal matches the handler's own budget.
- Translates statusCode 4xx → Error with server-side message; 2xx →
  structured ProvisionResult.
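
The config read and the status-code translation could be sketched like this, leaving the actual Lambda invoke out. The env var names and `SandboxProvisioningConfigError` come from the description above; the `ProvisionResult` shape, the helper names, the JSON body with a `message` field, and the choice to treat every non-2xx status as an error are assumptions.

```typescript
class SandboxProvisioningConfigError extends Error {
  constructor(missing: string[]) {
    super(`Missing sandbox provisioning config: ${missing.join(", ")}`);
    this.name = "SandboxProvisioningConfigError";
  }
}

interface ProvisioningConfig {
  lambdaArn: string;
  token: string;
}

function readProvisioningConfig(env: Record<string, string | undefined>): ProvisioningConfig {
  const lambdaArn = env.AGENTCORE_ADMIN_LAMBDA_ARN;
  const token = env.AGENTCORE_ADMIN_TOKEN;
  const missing: string[] = [];
  if (!lambdaArn) missing.push("AGENTCORE_ADMIN_LAMBDA_ARN");
  if (!token) missing.push("AGENTCORE_ADMIN_TOKEN");
  // A distinct error class lets the caller log missing config as a warning.
  if (!lambdaArn || !token) throw new SandboxProvisioningConfigError(missing);
  return { lambdaArn, token };
}

// Hypothetical result shape.
interface ProvisionResult {
  statusCode: number;
  body: unknown;
}

// Assumption: the handler returns a JSON body, with a `message` field on errors.
function translateResponse(statusCode: number, rawBody: string): ProvisionResult {
  const body: unknown = rawBody ? JSON.parse(rawBody) : null;
  if (statusCode >= 200 && statusCode < 300) return { statusCode, body };
  const message =
    typeof body === "object" && body !== null && "message" in body
      ? String((body as { message: unknown }).message)
      : rawBody;
  throw new Error(`provision-tenant-sandbox failed (${statusCode}): ${message}`);
}
```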

## updateTenantPolicy (new)

Platform-operator-only mutation for sandbox_enabled + compliance_tier
policy changes. Separate from updateTenant because the changes are
security-boundary shifts and are audited in tenant_policy_events.

Gate: caller's email must appear in the THINKWORK_PLATFORM_OPERATOR_EMAILS
allowlist (comma-separated env var). This is the swap-out point for
formal RBAC when it lands.
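
A minimal sketch of such an allowlist gate, assuming the env var value is passed in as a string. The helper names are hypothetical, and the trimming and case-insensitive comparison are assumptions rather than documented behavior.

```typescript
// Parse a comma-separated allowlist such as THINKWORK_PLATFORM_OPERATOR_EMAILS.
function parseOperatorAllowlist(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((entry) => entry.trim().toLowerCase())
      .filter((entry) => entry.length > 0),
  );
}

// An empty or missing allowlist admits nobody.
function isPlatformOperator(email: string | null | undefined, raw: string | undefined): boolean {
  if (!email) return false;
  return parseOperatorAllowlist(raw).has(email.trim().toLowerCase());
}
```

Keeping the check in one function gives a single place to swap in an RBAC lookup later.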

Transition semantics encoded in a pure computeTransition helper (11
tests):
- No-op when nothing changes.
- sandbox_enabled true rejected when compliance_tier != standard.
- compliance_tier → non-standard coerces sandbox_enabled off, producing
  a paired audit event so the transition is reproducible from the
  audit log alone.
- tier-first ordering means 'enable sandbox AND set tier to hipaa in
  one call' deterministically rejects.

Writes are wrapped in a db.transaction so the tenants UPDATE and the
tenant_policy_events INSERT land atomically.
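
The transition rules listed above can be sketched as a pure function. This is an illustration under assumptions, not the shipped `computeTransition`: the tier names are taken from the test descriptions, and the `Transition` and `PolicyEvent` shapes are hypothetical.

```typescript
type ComplianceTier = "standard" | "regulated" | "hipaa";

interface TenantPolicy {
  sandboxEnabled: boolean;
  complianceTier: ComplianceTier;
}

// Hypothetical audit-event shape.
interface PolicyEvent {
  field: "sandbox_enabled" | "compliance_tier";
  from: string;
  to: string;
}

type Transition =
  | { kind: "noop" }
  | { kind: "reject"; reason: string }
  | { kind: "apply"; next: TenantPolicy; events: PolicyEvent[] };

function computeTransition(current: TenantPolicy, requested: Partial<TenantPolicy>): Transition {
  // Tier-first ordering: resolve the target tier before looking at the sandbox flag.
  const nextTier = requested.complianceTier ?? current.complianceTier;
  if (requested.sandboxEnabled === true && nextTier !== "standard") {
    return { kind: "reject", reason: "sandbox requires the standard compliance tier" };
  }

  let nextSandbox = requested.sandboxEnabled ?? current.sandboxEnabled;
  // Moving to a non-standard tier coerces the sandbox off.
  if (nextTier !== current.complianceTier && nextTier !== "standard") {
    nextSandbox = false;
  }

  const events: PolicyEvent[] = [];
  if (nextTier !== current.complianceTier) {
    events.push({ field: "compliance_tier", from: current.complianceTier, to: nextTier });
  }
  if (nextSandbox !== current.sandboxEnabled) {
    // Paired event, so the coercion is reproducible from the audit log.
    events.push({ field: "sandbox_enabled", from: String(current.sandboxEnabled), to: String(nextSandbox) });
  }

  if (events.length === 0) return { kind: "noop" };
  return { kind: "apply", next: { sandboxEnabled: nextSandbox, complianceTier: nextTier }, events };
}
```

Because the function is pure, the caller can run it before opening the transaction and write `next` plus `events` together only when the result is `apply`.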

## GraphQL

- Tenant type gains sandboxEnabled / complianceTier / sandboxInterpreter*Id
- New UpdateTenantPolicyInput + updateTenantPolicy mutation
- pnpm schema:build re-ran against the AppSync subscription schema

## Tests — 16 passing

- sandbox-provisioning.test.ts (5) — envelope shape, 200/202/4xx parsing,
  Lambda FunctionError surfacing
- updateTenantPolicy.test.ts (11) — every permutation of the transition
  decision tree: no-op, toggle true on standard, reject toggle true on
  regulated/hipaa, toggle false always ok, tier change coerces sandbox,
  composite tier+sandbox requests

## Deferred to follow-up

- Reconciler Lambda (EventBridge scheduled fill + drift passes) — lands
  when the agentcore-admin Lambda terraform resource lands. Currently
  handled reactively: sandbox failures on createTenant are logged and
  the next successful createTenant-invoke retry picks up partial state.
- SNS platform-security topic — per handoff P1 #4, dropped from v1
  because no named subscriber / SLA exists. Audit row in
  tenant_policy_events provides detection; notification is v2.
- Pre-existing tenants: a one-time operator action flips sandbox_enabled
  per-tenant during staged rollout (see plan Operational Notes).
ericodom added a commit that referenced this pull request Apr 24, 2026
…hon skill (#486)

Parity pass with packages/skill-catalog/thinkwork-admin/scripts/operations/*.py.
The @thinkwork/admin-ops package now exposes every op the Python skill
ships, and the admin-ops MCP server registers all of them as MCP tools.
Sets up deprecation of the skill: agents using mcp.thinkwork.ai can
do everything the skill's Python wrappers did.

Client
- AdminOpsClient gains a `graphql(query, variables?)` helper that POSTs
  to /graphql with the same Bearer, throws AdminOpsError on error
  responses. Mutations + most reads use GraphQL (matches the Python
  skill's wire); tenants module continues to use REST since those
  handlers already exist.
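
The wire shape of such a helper can be sketched with two pure functions, one that builds the request and one that unwraps the response. Only `AdminOpsError`, the `/graphql` path, and the Bearer header come from the description; the function names, the `status` field, and the envelope type are assumptions, and the real `AdminOpsClient.graphql` presumably performs the HTTP call itself.

```typescript
class AdminOpsError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = "AdminOpsError";
    this.status = status;
  }
}

interface GraphqlRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildGraphqlRequest(
  baseUrl: string,
  token: string,
  query: string,
  variables: Record<string, unknown> = {},
): GraphqlRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/graphql`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${token}` },
      body: JSON.stringify({ query, variables }),
    },
  };
}

interface GraphqlEnvelope<T> {
  data?: T;
  errors?: { message: string }[];
}

// Throws on HTTP errors and on GraphQL-level errors; otherwise returns `data`.
function unwrapGraphqlResponse<T>(status: number, envelope: GraphqlEnvelope<T>): T {
  if (status < 200 || status >= 300) throw new AdminOpsError(`HTTP ${status}`, status);
  if (envelope.errors && envelope.errors.length > 0) {
    throw new AdminOpsError(envelope.errors.map((e) => e.message).join("; "), status);
  }
  if (envelope.data === undefined) throw new AdminOpsError("GraphQL response had no data", status);
  return envelope.data;
}
```

A caller would pass `request.url` and `request.init` to `fetch` and hand the parsed JSON to `unwrapGraphqlResponse`.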

Ported modules (28 ops):
- teams.ts         5 mutations + 2 reads (createTeam, add/remove team agents + users, listTeams, getTeam)
- agents.ts        3 mutations + 3 reads (createAgent, setAgentSkills, setAgentCapabilities, listAgents, getAgent, listAllTenantAgents)
- templates.ts     5 mutations + 3 reads (createAgentTemplate, createAgentFromTemplate, syncTemplateToAgent, syncTemplateToAllAgents, acceptTemplateUpdate, listTemplates, getTemplate, listLinkedAgentsForTemplate)
- users.ts         0 mutations + 3 reads (me, getUser, listTenantMembers)
- artifacts.ts     0 mutations + 2 reads (listArtifacts, getArtifact)
- _fields.ts       shared GraphQL field-selection constants mirroring reads.py

MCP tool registration (packages/lambda/admin-ops-mcp.ts)
- 25 new tools covering every ported op. Each carries a JSON Schema
  inputSchema + a non-empty description. Tenant pinning from the
  authenticated key overrides any caller-supplied tenantId on
  downstream calls.
- The existing tools/list test asserts a curated must-have set
  rather than an exact-count equality, so future ports don't
  require test churn.
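
Tenant pinning of this kind is often a one-line override. A minimal sketch, assuming the authenticated key resolves to an object carrying `tenantId`; `AuthContext` and `pinTenant` are hypothetical names:

```typescript
// Hypothetical auth context derived from the authenticated key.
interface AuthContext {
  tenantId: string;
}

// Spread the caller's arguments first so the pinned tenantId always wins.
function pinTenant<T extends Record<string, unknown>>(args: T, auth: AuthContext): T & { tenantId: string } {
  return { ...args, tenantId: auth.tenantId };
}
```

Applying the override after the spread means a caller-supplied `tenantId` can never reach a downstream call.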

Tests
- 6 new tests in packages/admin-ops/src/teams.test.ts covering
  wire-format correctness (queries contain the right operation
  names, variables carry through, errors surface as AdminOpsError).
- Full monorepo test run: 1270+ tests passing.

Not in scope
- CLI migration of `thinkwork team/agent/template/user/artifact`
  subcommands — deferred to a follow-up PR.
- Removal of packages/skill-catalog/thinkwork-admin/ — deferred to
  PR #5 after seed (PR #4) promotes mcp.thinkwork.ai to tenants.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request Apr 24, 2026
…ction preflight (#485)

* refactor(sandbox): drop required_connections / OAuth preamble / connection preflight

Ends the sandbox's OAuth-into-os.environ path end-to-end. Admin UI
already stopped surfacing required_connections in #477; this closes
the loop across validator, preflight, dispatcher, container,
preamble, the pilot skill, concept doc, and runbook.

Why now: the OAuth preamble was a live realization of the T1/T1b/T2
residual-threat classes the concept doc itself warns about. The v2
in-process credential proxy is the planned structural fix; landing
that work cleanly requires this path gone. Agents that need OAuth-ed
work (Slack, GitHub) call composable-skill connector scripts instead.

What survives in the sandbox:
- execute_code is a pure-compute primitive
- The preamble still runs as executeCode call #1 — now a one-line
  sitecustomize readiness check that aborts the session if the stdio
  redactor didn't install (refuses to run user code on an unmitigated
  image). PREAMBLE_VERSION bumps to 2
- Preflight decision tree shrinks from 5 outcomes to 4; the
  missing-connection + ConnectionRevoked error classes are gone
- Dispatcher payload drops sandbox_secret_paths /
  sandbox_tenant_id / sandbox_user_id / sandbox_stage; only
  sandbox_interpreter_id + sandbox_environment survive
- packages/api/src/lib/sandbox-secrets.ts deleted outright
- Validator rejects required_connections on write (rather than
  silently accepting it) so operators can't reintroduce it via raw GraphQL
- Hydration silently strips the key from legacy DB rows — no migration

Docs:
- Concept page loses the Preamble section, SandboxMissingConnection
  and ConnectionRevoked rows, required_connections YAML, and the T1
  residual row. T1b dropped since the exfil class no longer exists.
  The 11-step per-turn lifecycle shrinks to 9 steps
- Runbook loses failure modes #3 and #4; architecture-in-one-page
  simplified accordingly
- sandbox-pilot SKILL.md rewritten to pure compute (S3 upload via
  per-tenant IAM role, no Slack post, no GitHub token)

Verification: typecheck clean, 1050 api tests pass, docs site builds
79 pages clean, preamble emission verified via ast.parse + inline
regression assertions (no boto3, no SecretString, no os.environ[...],
no OAuth env-var names, no token prefixes).

Plan: docs/plans/2026-04-23-006-refactor-sandbox-drop-required-connections-plan.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(sandbox): ce-review autofix sweep — tool docstring, e2e fixture, warm-container cleanup

Applies ten safe_auto fixes surfaced by the ce:review pass across 9
reviewer personas. Highest-impact fixes:

- sandbox_tool.py: rewrote the execute_code Strands tool docstring to
  describe the pure-compute primitive. Removed the retired OAuth env
  var claims (GITHUB_ACCESS_TOKEN, SLACK_ACCESS_TOKEN, GCAL_ACCESS_TOKEN)
  and ConnectionRevoked error — 7 reviewers flagged this; the LLM
  reads this docstring as live tool guidance
- fixtures.ts: dropped required_connections from the e2e fixture's
  createAgentTemplate call (the validator now rejects it, so every
  integration test would otherwise abort in setup). Removed vacuous
  syntheticTokens / seedConnections / putSecret plumbing that the
  token-leak assertion's forbiddenValues depended on
- sandbox-pilot.e2e.test.ts: kept the structural CloudWatch token-leak
  check but dropped the synthetic forbiddenValues — no synthetic
  tokens are injected, so the structural prefix check (ghp_/xoxb-/
  ya29./JWT) is the only meaningful guard
- invocation_env.py: unconditional pop of all six SANDBOX_* keys at
  invocation entry. Closes the warm-container carryover window where
  a pre-deploy invocation with interrupted cleanup could leak stale
  interpreter id + retired OAuth paths into the next invocation
- sandbox-invocation-log.ts: dropped connection_revoked from
  ALLOWED_EXIT_STATUSES + added a regression test asserting the
  value is now rejected
- sandbox_preamble.py: `if not installed()` → `if installed() is not
  True` (fail-closed against mocks returning truthy non-True).
  Dropped `from __future__ import annotations` (no-op after dataclass
  removal) and f-string with only module-constant interpolation
- skill.yaml: rewrote the retired required_connections / OAuth
  preamble comment block to describe pure-compute + per-tenant IAM
  S3 access
- SKILL.md: replaced non-existent SandboxDisabled error name with
  accurate description (dispatcher does not register the tool)
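
The structural token-leak check kept in sandbox-pilot.e2e.test.ts could look roughly like the following. The four prefixes come from the commit message; the regexes, minimum lengths, and the `findTokenLeaks` helper are illustrative assumptions, not the patterns the test actually uses.

```typescript
// Illustrative patterns keyed on well-known token prefixes.
const TOKEN_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "github-pat", pattern: /ghp_[A-Za-z0-9]{20,}/ },
  { name: "slack-bot", pattern: /xoxb-[A-Za-z0-9-]{10,}/ },
  { name: "google-oauth", pattern: /ya29\.[A-Za-z0-9_-]{10,}/ },
  // A JWT is three base64url segments; header and payload start with "eyJ".
  { name: "jwt", pattern: /eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/ },
];

// Returns one hit per (line, pattern) match across the captured log lines.
function findTokenLeaks(logLines: string[]): { line: number; name: string }[] {
  const hits: { line: number; name: string }[] = [];
  logLines.forEach((text, line) => {
    for (const { name, pattern } of TOKEN_PATTERNS) {
      if (pattern.test(text)) hits.push({ line, name });
    }
  });
  return hits;
}
```

A check like this needs no knowledge of which tokens were injected, which is why it remains useful after the synthetic forbiddenValues were dropped.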

Verification:
- pnpm --filter @thinkwork/api typecheck clean
- pnpm --filter @thinkwork/api test — 1051/1051 pass (+1 new guard)
- pnpm --filter @thinkwork/docs build — 79 pages clean
- Python inline regressions: warm-container stale-key clear;
  preamble identity check + no OAuth markers; tool docstring clean

Residual findings (manual follow-up):
- ADV-001: legacy-row hydration — handlers cast agent.sandbox without
  running through validateTemplateSandbox. Functionally safe (field
  never read) but plan comment overstates the strip behavior
- KT-001: SandboxEnvironmentId hand-rolled duplicate of
  SandboxEnvironment from database-pg schema
- W2: CAPABILITIES.md doesn't mention the sandbox / connector-skill
  pattern to the agent system prompt

Review artifact: .context/compound-engineering/ce-review/20260423-194430-67ea771d/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request Apr 24, 2026
…te IA (#530)

* docs(plan): www Cloud/Services IA refactor plan

* feat(www): add /cloud route as copy of pricing page

U1: stand up /cloud serving current pricing content. Both routes
render identically at this point; subsequent units reframe /cloud,
add Services cross-link, and retire /pricing via redirect.

* feat(www): rename nav Pricing → Cloud

U2: flip the visible nav entry to "Cloud" pointing at /cloud. Single
edit in copy.ts propagates to desktop and mobile Header.astro renderings.

* feat(www): reframe /cloud as ThinkWork Cloud + add Services cross-link

U3: update hero/meta copy from generic "Pricing / Infrastructure you
own" to "ThinkWork Cloud / Hosted agent infrastructure, deployed
inside your AWS." Add a bulleted clarifier making scope explicit
(hosted plans vs separate services vs separate AWS usage vs self-hosted
via docs). Add a soft cross-link block between PricingGrid and
FinalCTA pointing to /services — prose + inline text link, not a
button, to keep the "soft pointer" posture asymmetric with the
first-class Cloud Hosting card arriving on /services in U5.

PricingGrid and the inline Stripe checkout script are unchanged —
the new sections live outside the PricingGrid DOM so [data-plan-cta]
/ [data-plan-error] selectors continue to match. Also refresh the
stale "Do NOT cross-link to /pricing" services-export comment now
that the IA split makes the handoff expected in both directions.

* refactor: retire /pricing route in favor of /cloud

U4: add Astro redirects: { '/pricing': '/cloud' } in astro.config.mjs
and delete pricing.astro. Astro's static build emits /pricing/index.html
with <meta http-equiv='refresh'>, <link rel='canonical' href=.../cloud>,
and <meta name='robots' content='noindex'> — inbound links (search
results, Stripe cancels from older mobile builds, external shares)
continue to resolve, and the redirect stub is hidden from search
engines so duplicate-content risk is zero.

Update every known /pricing reference across the monorepo:
- apps/www/src/pages/m/checkout-complete.astro — mobile checkout
  fallback link "Return to pricing" → "Return to plans" → /cloud.
- apps/www/src/env.d.ts — consumer comment refresh.
- packages/pricing-config/src/plans.ts — consumer comment refresh.
- apps/admin/src/routes/onboarding/welcome.tsx — hardcoded https URL
  and visible anchor text "Return to pricing" → "Return to plans".
- apps/mobile/lib/stripe-checkout.ts — hardcoded cancelUrl to
  https://thinkwork.ai/cloud. Older installed mobile builds still
  point at /pricing and will hit the redirect; that's acceptable.
- terraform/modules/app/lambda-api/handlers.tf — STRIPE_CHECKOUT_
  CANCEL_URL now ends in /cloud so canceled checkouts land directly
  rather than bouncing through the redirect.

Leaves packages/api/src/handlers/stripe-checkout.ts telemetry string
'www-pricing' unchanged per plan Key Technical Decision #4 (analytics
continuity).
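
For reference, the redirect described in U4 corresponds to a config fragment along these lines. This is a sketch showing only the `redirects` key; the project's other Astro options are omitted and not known from the commit message.

```typescript
// astro.config.mjs (fragment; all other options omitted)
export default {
  redirects: {
    "/pricing": "/cloud",
  },
};
```

On a static build, Astro turns each entry into a small HTML stub at the old path rather than a server-side redirect, which matches the stub behavior described above.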

* feat(www): consolidate /services into single card grid with Cloud Hosting handoff

U5: remove featured/secondary variant split in ServiceCard and
services.astro; every card now uses the same visual treatment. Drop
the two separate SectionShells (featured grid + 'Additional packages')
for a single consolidated section with 5 cards in one 3-column grid
(md:grid-cols-2 lg:grid-cols-3): Strategy Sprint, Pilot Launch,
Managed Operations, Workflow Expansion, and Cloud Hosting.

Governance & Eval + AI Program Advisory are dropped as named cards —
Governance substance folds into Managed Operations' includes list;
Advisory is cut. Cloud Hosting is a new 5th peer card introducing
ctaHref + ctaLabel fields on ServicePackage so it can deep-link
visitors to /cloud; other cards leave those fields unset and render
without per-card buttons (intake continues via the shared hero and
closing CTAs).

Preserves id="packages" on the consolidated SectionShell so the
hero's #packages anchor link still resolves. Variant discriminator
is removed entirely from the ServicePackage type.

* refactor(www): correct Cloud positioning + layout polish across /cloud and header

Fix the messaging on /cloud after live review:

- ThinkWork Cloud is the FULLY-HOSTED product (we operate the platform
  end-to-end) for teams that don't want to run the Enterprise Agent
  Harness themselves. The prior framing ("deployed inside your AWS")
  is the self-managed Enterprise product, not Cloud.
- Hero reframed: 'Fully managed AI agents, no infrastructure to run.'
  with a lede that names the Enterprise Agent Harness as the separate
  self-managed path.
- smallPrint + finePrint block removed from PricingGrid (cards now
  flow straight into the Services cross-link); the remaining data
  stays in copy.ts for a potential checkout-page surface.
- Cloud Hosting service card on /services reframed: 'Fully managed
  ThinkWork — no Agent Harness to run' with 'No AWS setup on your
  side' as its first include bullet.
- Shared plan summaries in packages/pricing-config updated so Starter
  and Enterprise no longer mention deploying inside the customer's
  AWS (mobile onboarding surfaces these too).

Add a /cloud-specific FinalCTA variant (finalCtaCloud in copy.ts)
with 'Fully managed' eyebrow, 'Adopt AI. Skip the infrastructure.'
headline, and matching points — swapped in via a new 'copy' prop on
FinalCTA. Homepage continues to use the default 'Your AWS · Your
rules' variant unchanged.

Layout polish:
- New 'tight' prop on FinalCTA drops its top border + gradient
  divider and reduces top padding, so /cloud flows cleanly out of
  the plans section.
- RECOMMENDED badge on PricingCard uses opaque #070a0f (matches the
  page body) instead of bg-brand/10 so the card border no longer
  shows through the pill.
- Services cross-link folded into the same SectionShell as the plans
  grid with tighter spacing, eliminating an entire empty section
  between the two.

Header.astro now indicates the current route with semibold white +
aria-current="page" on desktop and mobile nav; inactive items stay
text-slate-400. Trailing slashes and subpaths are handled.
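
One way to handle trailing slashes and subpaths when marking the current nav item is sketched below. The helper names are hypothetical and the actual logic in Header.astro may differ.

```typescript
// Strip trailing slashes, keeping "/" for the site root.
function normalizePath(path: string): string {
  const trimmed = path.replace(/\/+$/, "");
  return trimmed === "" ? "/" : trimmed;
}

// True when currentPath is the nav target or a subpath of it.
function isActiveRoute(currentPath: string, href: string): boolean {
  const current = normalizePath(currentPath);
  const target = normalizePath(href);
  // The root only matches itself; otherwise every page would mark it active.
  if (target === "/") return current === "/";
  return current === target || current.startsWith(`${target}/`);
}
```

Matching on `target + "/"` rather than a bare prefix keeps a path like /cloudy from activating the /cloud entry.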

* refactor(www): trim /services copy — drop lifecycle section and polish packaging

U6: cleanup pass after the packaging consolidation in U5.

- Remove the entire 'How it works / Engagement lifecycle' section
  (services.how in copy.ts + the 4-phase card grid in services.astro).
  With Strategy Sprint / Pilot Launch / Managed Operations / Workflow
  Expansion now as packages, the lifecycle narrates the same arc the
  cards already describe. One telling beats two.
- Hero headlineOutcome matches the Services posture from the brief:
  'We help teams scope, launch, operate, and expand governed AI
  workflows.'
- Positioning headline de-arced: 'One partner from first workflow to
  ongoing operations.' Positioning body and meta description also
  strip the references to Governance & Program Advisory that were
  dropped in U5.
- FAQ loses the 'What happens after launch?' entry — it named the
  removed packages and restated card content.
- Managed Operations card outcome + bestFor trimmed to stop echoing
  the body.
- Packages lede drops the 'Cloud Hosting sits alongside' explanation;
  the card speaks for itself.

With this, /services reads as: hero → proof band → positioning →
5-card packages grid → 5-question FAQ → closing CTA.
ericodom added a commit that referenced this pull request May 5, 2026
fix: skill_runner reads /tmp/skills (matches where install_skills writes)
ericodom added a commit that referenced this pull request May 5, 2026
…tion fix (#318)

Closes handoff plan items #3 and #4.1 from
plans/2026-04-20-006-handoff-cluster-enrichment-and-followups.md.

## #4.1 extractCityFromAddress dotted-abbreviation fix

Spanish/Canadian/Australian addresses like "..., San Miguel de Allende,
Gto., Mexico" previously produced candidate "Gto." (Guanajuato) because
the region-code walk only recognized `^[A-Z]{2,4}(\s|$)` patterns. The
audit showed 32 Marco records producing "Gto." and 22 producing "Q.R."
(Quintana Roo) as candidate cities — both now correctly resolve to the
preceding city slot.

Fix: `isDottedRegionAbbr` recognizes `[A-Z][a-z]{0,2}\.` groups
(up to 4 repetitions, ≤ 10 chars) and terminates the walk like the
existing US-style region codes do. Record-expander probe on Marco
confirms: candidate "San Miguel De Allende" (support=32) replaces "Gto."
and the equivalent cities replace "Q.R."

## #3 summary-expander → deterministic linker wiring

The audit showed `deriveParentCandidatesFromPageSummaries` produces 91
candidates on Marco (Toronto, Seattle, Honolulu hubs + many more) but
the deterministic linker only consumed `deriveParentCandidates(records)`
— so those candidates never became links.

### Type extension

`DerivedParentCandidate` gets `sourceKind?: "record" | "summary"`:
- record: `sourceRecordIds` are memory-record ids (existing path)
- summary: `sourceRecordIds` are page ids (new path)
Field is optional so existing test fixtures stay green; emitter defaults
to "record" when omitted.

### Emitter update

`emitDeterministicParentLinks` now builds two leaf indexes:
- `leavesByRecord` — keyed on memory-record id (existing)
- `leavesById` — keyed on page id, built from `[...scopePages,
  ...affectedPages]` (new)

Summary-kind candidates route through `leavesById`. A new optional
`scopePages` arg feeds the index so summary-kind candidates can resolve
leaves that weren't touched THIS batch — necessary because summary-based
candidates come from a scope-wide scan, not batch-local records.
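
The routing above can be sketched roughly as follows. This is an
illustration under assumptions, not the emitter itself: `LeafPage`,
`parentTitle`, and `resolveLeaves` are hypothetical names, and only the
`sourceKind` default and the two indexes come from this message.

```typescript
type SourceKind = "record" | "summary";

interface DerivedParentCandidate {
  parentTitle: string; // hypothetical field, for illustration only
  sourceRecordIds: string[]; // record ids or page ids, depending on sourceKind
  sourceKind?: SourceKind; // optional; treated as "record" when omitted
}

interface LeafPage {
  id: string;
  recordIds: string[]; // assumed: memory-record ids that produced this page
}

function resolveLeaves(
  candidate: DerivedParentCandidate,
  affectedPages: LeafPage[],
  scopePages: LeafPage[] = [],
): LeafPage[] {
  // Existing path: index batch-local pages by memory-record id.
  const leavesByRecord: Record<string, LeafPage[]> = Object.create(null);
  for (const page of affectedPages) {
    for (const recordId of page.recordIds) {
      (leavesByRecord[recordId] = leavesByRecord[recordId] || []).push(page);
    }
  }
  // New path: index scope-wide plus batch-local pages by page id.
  const leavesById: Record<string, LeafPage> = Object.create(null);
  for (const page of scopePages.concat(affectedPages)) {
    leavesById[page.id] = page;
  }

  const kind: SourceKind = candidate.sourceKind || "record";
  const seen: Record<string, boolean> = Object.create(null);
  const leaves: LeafPage[] = [];
  for (const id of candidate.sourceRecordIds) {
    const byId = leavesById[id];
    const hits = kind === "summary" ? (byId ? [byId] : []) : leavesByRecord[id] || [];
    for (const page of hits) {
      if (!seen[page.id]) {
        seen[page.id] = true;
        leaves.push(page);
      }
    }
  }
  return leaves;
}
```

Without `scopePages`, a summary-kind candidate whose leaf was not touched in
the current batch resolves to an empty pool, which is the case the new
argument is meant to cover.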

### Compiler wiring

`applyPlan` now calls BOTH expanders and passes merged candidates to
the linker, along with `candidatePages` (already fetched for the
planner call) as `scopePages`. Bypasses the merge-across-kinds ambiguity
by concatenating both lists rather than merging — the emitter handles
each candidate according to its kind.

### Precision filter tightening

Summary-expander output adds `isLikelyCityToken` filter before pushing
into the byCity map:
- drops > 4-word fragments ("Prospect Interested In The Full PVL Product Line")
- drops < 3-char tokens ("St")
- drops street-suffix endings ("Congress Ave", "Queen St")

These are the noise categories the 04-20 audit surfaced; real cities
like "Buenos Aires" (2 words), "New York" (2 words), and "Montréal"
(1 word with accent) all pass through.
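
A rough sketch of the three drop rules, assuming whitespace-delimited words
and a small street-suffix list. The suffix list here is hypothetical; the
real filter may recognize more (or different) suffixes.

```typescript
// Hypothetical suffix list, for illustration only.
const STREET_SUFFIXES = ["st", "ave", "blvd", "rd", "dr", "ln"];

function isLikelyCityToken(token: string): boolean {
  const trimmed = token.trim();
  // Drop very short tokens such as "St".
  if (trimmed.length < 3) return false;
  const words = trimmed.split(/\s+/);
  // Drop fragments longer than four words.
  if (words.length > 4) return false;
  // Drop tokens whose last word is a street suffix ("Congress Ave").
  const last = (words[words.length - 1] || "").toLowerCase().replace(/\.$/, "");
  return STREET_SUFFIXES.indexOf(last) === -1;
}
```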

## Expected impact

On the next Marco recompile: `links_written_deterministic` should
increase from the current 14/batch (record-only path) by 30-60 links
(Toronto entity leaves → Toronto hub-if-exists-else-fuzzy-match,
Seattle / Honolulu / etc.) — validated by unit tests; live-compile
numbers land after deploy.

Backfill dry-run on Marco: 22 parent links (up from 21), all spot-
check precision-correct; wet run: 386 → 387 reference links (+1 net
new, most attempts are idempotent re-writes of existing edges).

## Test coverage

- 3 new dotted-abbreviation tests (Gto., Q.R., B.C.)
- 5 new summary-expander filter tests (word cap, short, street suffix,
  sourceKind tagging for both expanders)
- 5 new deterministic-linker tests for the summary-kind branch
  (scopePages leaf resolution, empty-pool skip, non-entity filter,
  back-compat when sourceKind omitted, both-kinds-fire for same parent)
- Total: 471 passed / 8 skipped (up from 458).
- Typecheck clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request May 5, 2026
…ks dedup (#319)

Captures the 2026-04-20 third-session end state. Supersedes
plans/2026-04-20-006-handoff-cluster-enrichment-and-followups.md now that
PR #318 closed out items #3 (summary-expander wiring) and #4.1
(dotted-abbreviation fix).

Four remaining items in priority order:

1. Validate #318 on Marco (TOP PRIORITY). PR promised
   links_written_deterministic jump of +30-60 on the next live compile,
   but the backfill can't exercise that path. Trigger a manual compile
   and verify the prediction before building more on top.

2. Unit 6 mention cluster enrichment. Biggest remaining plan item. Brief
   on the key design decision (mentionClusterEnrichments[] JSON shape —
   Option A separate rows recommended over Option B inline-promotions).

3. wikiBacklinks dedup. Warm-up PR (<1 hour). listBacklinks misses the
   dedup pattern that listConnectedPages already has.

4. Trivial grab-bag: wipeWikiScope FK dependency, applier-split debt
   (good to bundle with #2 since cluster promotion adds more lines to
   the already-1300-line applyAggregationPlan).

Session tally: 6 PRs merged (#309, #311, #312, #316, #317, #318).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request May 5, 2026
Three drift incidents in five days (Apr 17 mig 0008, Apr 18 collision,
Apr 21 0018+0019) had the same root cause and the same named-but-
unshipped fix. Captures the pattern so the next author sees it before
shipping migration #4.

The drift-reporter fix shipped in PR #367; this is the learning that
explains why the reporter exists and what marker convention every
future unindexed migration must follow.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request May 5, 2026
Resolves the P1 tension the handoff flagged: the plan honestly scoped R13
to 'no token via Python-stdio-mediated writes or known-shape CloudWatch
patterns' and added T1b (intra-tenant template-author exfil), but the
brainstorm still read absolutely. Three edits align the goalposts:

1. R13 rescoped to match the plan, with named residual coverage gaps
   (os.write at fd level, subprocess env dumps, C-extension writes,
   multiprocessing workers, adversarial split-writes) tracked as the
   Stdout-bypass class alongside T1.
2. Success Criterion #4 softened to match — 'within R13's scope.'
3. T1b added as a first-class residual threat between T1 and T2, with
   v1 mitigations (1-hour TTL, shared-template-author review as
   compensating control, tenant-as-trust-boundary) and v2 hardening
   track (per-user ABAC session tags or in-process credential proxy,
   latter preferred because it addresses T1 and T1b simultaneously).
ericodom added a commit that referenced this pull request May 5, 2026
…ation (Unit 6) (#430)

## createTenant wiring

After INSERTing the tenant row, createTenant now invokes the
agentcore-admin Lambda's /provision-tenant-sandbox route (plan Unit 5)
via a new invokeProvisionTenantSandbox helper. The invoke uses
InvocationType: RequestResponse per feedback_avoid_fire_and_forget_lambda_invokes
so errors surface inside createTenant — but createTenant catches and
logs them so a sandbox outage doesn't turn into a tenant-onboarding
outage. The reconciler (Unit 6 follow-up) sweeps rows with null
sandbox_interpreter_*_id at its own cadence.

invokeProvisionTenantSandbox:
- Reads AGENTCORE_ADMIN_LAMBDA_ARN + AGENTCORE_ADMIN_TOKEN env vars;
  throws SandboxProvisioningConfigError if missing (distinguishable so
  missing config is a warn, not an error).
- Builds the API Gateway v2 envelope the handler is written for.
- 45s abort signal matches the handler's own budget.
- Translates statusCode 4xx → Error with server-side message; 2xx →
  structured ProvisionResult.
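
A sketch of the config check and status translation only; the Lambda invoke
itself is omitted. Assumptions: the env is passed in as a plain object, the
handler response carries `statusCode` and a JSON `body`, the server-side
message lives under an `error` key, and every non-2xx status throws. The
`ProvisionResult` shape shown here is hypothetical.

```typescript
class SandboxProvisioningConfigError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "SandboxProvisioningConfigError";
  }
}

interface AdminLambdaConfig {
  lambdaArn: string;
  token: string;
}

// Throws the distinguishable config error so callers can log a warn.
function readAdminLambdaConfig(env: { [key: string]: string | undefined }): AdminLambdaConfig {
  const lambdaArn = env["AGENTCORE_ADMIN_LAMBDA_ARN"];
  const token = env["AGENTCORE_ADMIN_TOKEN"];
  if (!lambdaArn || !token) {
    throw new SandboxProvisioningConfigError("agentcore-admin Lambda config is missing");
  }
  return { lambdaArn, token };
}

interface HandlerResponse {
  statusCode: number;
  body?: string;
}

// Hypothetical result shape.
interface ProvisionResult {
  statusCode: number;
  payload: unknown;
}

function translateHandlerResponse(response: HandlerResponse): ProvisionResult {
  let payload: unknown = null;
  if (response.body) {
    try {
      payload = JSON.parse(response.body);
    } catch (err) {
      payload = response.body; // keep the raw text when it is not JSON
    }
  }
  if (response.statusCode >= 200 && response.statusCode < 300) {
    return { statusCode: response.statusCode, payload };
  }
  const message =
    typeof payload === "object" && payload !== null && "error" in payload
      ? String((payload as { error: unknown }).error)
      : "provision-tenant-sandbox failed with status " + response.statusCode;
  throw new Error(message);
}
```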

## updateTenantPolicy (new)

Platform-operator-only mutation for sandbox_enabled + compliance_tier
policy changes. Separate from updateTenant because the changes are
security-boundary shifts and are audited in tenant_policy_events.

Gate: caller's email must appear in the THINKWORK_PLATFORM_OPERATOR_EMAILS
allowlist (comma-separated env var). This is the swap-out point for
formal RBAC when it lands.

Transition semantics encoded in a pure computeTransition helper (11
tests):
- No-op when nothing changes.
- sandbox_enabled true rejected when compliance_tier != standard.
- compliance_tier → non-standard coerces sandbox_enabled off, producing
  a paired audit event so the transition is reproducible from the
  audit log alone.
- tier-first ordering means 'enable sandbox AND set tier to hipaa in
  one call' deterministically rejects.
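
The decision tree above could look roughly like this. It is a sketch under
assumptions: the type names, event shape, and reject message are
hypothetical, and the tier values are taken from the test list in this
message (standard, regulated, hipaa).

```typescript
type ComplianceTier = "standard" | "regulated" | "hipaa";

interface TenantPolicy {
  sandboxEnabled: boolean;
  complianceTier: ComplianceTier;
}

interface PolicyChange {
  sandboxEnabled?: boolean;
  complianceTier?: ComplianceTier;
}

// Hypothetical audit-event shape.
interface PolicyEvent {
  field: "sandbox_enabled" | "compliance_tier";
  from: string;
  to: string;
  coerced: boolean;
}

type Transition =
  | { kind: "noop" }
  | { kind: "reject"; reason: string }
  | { kind: "apply"; next: TenantPolicy; events: PolicyEvent[] };

function computeTransition(current: TenantPolicy, change: PolicyChange): Transition {
  // Tier-first ordering: resolve the target tier before looking at the sandbox flag.
  const nextTier = change.complianceTier !== undefined ? change.complianceTier : current.complianceTier;
  if (change.sandboxEnabled === true && nextTier !== "standard") {
    return { kind: "reject", reason: "sandbox_enabled requires compliance_tier = standard" };
  }

  let nextSandbox = change.sandboxEnabled !== undefined ? change.sandboxEnabled : current.sandboxEnabled;
  let coerced = false;
  if (nextTier !== "standard" && nextSandbox) {
    // Moving to a non-standard tier forces the sandbox off.
    nextSandbox = false;
    coerced = true;
  }

  const events: PolicyEvent[] = [];
  if (nextTier !== current.complianceTier) {
    events.push({ field: "compliance_tier", from: current.complianceTier, to: nextTier, coerced: false });
  }
  if (nextSandbox !== current.sandboxEnabled) {
    events.push({
      field: "sandbox_enabled",
      from: String(current.sandboxEnabled),
      to: String(nextSandbox),
      coerced,
    });
  }
  if (events.length === 0) return { kind: "noop" };
  return { kind: "apply", next: { sandboxEnabled: nextSandbox, complianceTier: nextTier }, events };
}
```

Keeping the helper pure means the paired audit events can be asserted in
unit tests without a database.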

Writes are wrapped in a db.transaction so the tenants UPDATE and the
tenant_policy_events INSERT land atomically.

## GraphQL

- Tenant type gains sandboxEnabled / complianceTier / sandboxInterpreter*Id
- New UpdateTenantPolicyInput + updateTenantPolicy mutation
- pnpm schema:build re-ran against the AppSync subscription schema

## Tests — 16 passing

- sandbox-provisioning.test.ts (5) — envelope shape, 200/202/4xx parsing,
  Lambda FunctionError surfacing
- updateTenantPolicy.test.ts (11) — every permutation of the transition
  decision tree: no-op, toggle true on standard, reject toggle true on
  regulated/hipaa, toggle false always ok, tier change coerces sandbox,
  composite tier+sandbox requests

## Deferred to follow-up

- Reconciler Lambda (EventBridge scheduled fill + drift passes) — lands
  when the agentcore-admin Lambda terraform resource lands. Currently
  handled reactively: sandbox failures on createTenant are logged and
  the next successful createTenant-invoke retry picks up partial state.
- SNS platform-security topic — per handoff P1 #4, dropped from v1
  because no named subscriber / SLA exists. Audit row in
  tenant_policy_events provides detection; notification is v2.
- Pre-existing tenants: a one-time operator action flips sandbox_enabled
  per-tenant during staged rollout (see plan Operational Notes).
ericodom added a commit that referenced this pull request May 5, 2026
…hon skill (#486)

Parity pass with packages/skill-catalog/thinkwork-admin/scripts/operations/*.py.
The @thinkwork/admin-ops package now exposes every op the Python skill
ships, and the admin-ops MCP server registers all of them as MCP tools.
Sets up deprecation of the skill: agents using mcp.thinkwork.ai can
do everything the skill's Python wrappers did.

Client
- AdminOpsClient gains a `graphql(query, variables?)` helper that POSTs
  to /graphql with the same Bearer, throws AdminOpsError on error
  responses. Mutations + most reads use GraphQL (matches the Python
  skill's wire); tenants module continues to use REST since those
  handlers already exist.

Ported modules (28 ops):
- teams.ts         5 mutations + 2 reads (createTeam, add/remove team agents + users, listTeams, getTeam)
- agents.ts        3 mutations + 3 reads (createAgent, setAgentSkills, setAgentCapabilities, listAgents, getAgent, listAllTenantAgents)
- templates.ts     5 mutations + 3 reads (createAgentTemplate, createAgentFromTemplate, syncTemplateToAgent, syncTemplateToAllAgents, acceptTemplateUpdate, listTemplates, getTemplate, listLinkedAgentsForTemplate)
- users.ts         0 mutations + 3 reads (me, getUser, listTenantMembers)
- artifacts.ts     0 mutations + 2 reads (listArtifacts, getArtifact)
- _fields.ts       shared GraphQL field-selection constants mirroring reads.py

MCP tool registration (packages/lambda/admin-ops-mcp.ts)
- 25 new tools covering every ported op. Each carries a JSON Schema
  inputSchema + a non-empty description. Tenant pinning from the
  authenticated key overrides any caller-supplied tenantId on
  downstream calls.
- The existing tools/list test asserts a curated must-have set
  rather than an exact-count equality, so future ports don't
  require test churn.
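
The tenant-pinning rule can be illustrated with a small helper. This is a
sketch under assumptions: `AuthContext` and `pinTenant` are hypothetical
names, and the real server may apply the override elsewhere.

```typescript
// Hypothetical: the tenant resolved from the authenticated key.
interface AuthContext {
  tenantId: string;
}

// The authenticated tenant always wins over a caller-supplied tenantId.
function pinTenant<T extends { tenantId?: string }>(args: T, auth: AuthContext): T & { tenantId: string } {
  return { ...args, tenantId: auth.tenantId };
}
```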

Tests
- 6 new tests in packages/admin-ops/src/teams.test.ts covering
  wire-format correctness (queries contain the right operation
  names, variables carry through, errors surface as AdminOpsError).
- Full monorepo test run: 1270+ tests passing.

Not in scope
- CLI migration of `thinkwork team/agent/template/user/artifact`
  subcommands — deferred to a follow-up PR.
- Removal of packages/skill-catalog/thinkwork-admin/ — deferred to
  PR #5 after seed (PR #4) promotes mcp.thinkwork.ai to tenants.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request May 5, 2026
…ction preflight (#485)

* refactor(sandbox): drop required_connections / OAuth preamble / connection preflight

Ends the sandbox's OAuth-into-os.environ path end-to-end. Admin UI
already stopped surfacing required_connections in #477; this closes
the loop across validator, preflight, dispatcher, container,
preamble, the pilot skill, concept doc, and runbook.

Why now: the OAuth preamble was a live realization of the T1/T1b/T2
residual-threat classes the concept doc itself warns about. The v2
in-process credential proxy is the planned structural fix; landing
that work cleanly requires this path gone. Agents that need OAuth-ed
work (Slack, GitHub) call composable-skill connector scripts instead.

What survives in the sandbox:
- execute_code is a pure-compute primitive
- The preamble still runs as executeCode call #1 — now a one-line
  sitecustomize readiness check that aborts the session if the stdio
  redactor didn't install (refuses to run user code on an unmitigated
  image). PREAMBLE_VERSION bumps to 2
- Preflight decision tree shrinks from 5 outcomes to 4; the
  missing-connection + ConnectionRevoked error classes are gone
- Dispatcher payload drops sandbox_secret_paths /
  sandbox_tenant_id / sandbox_user_id / sandbox_stage; only
  sandbox_interpreter_id + sandbox_environment survive
- packages/api/src/lib/sandbox-secrets.ts deleted outright
- Validator rejects required_connections on write (not silently
  accepts) so operators can't reintroduce it via raw GraphQL
- Hydration silently strips the key from legacy DB rows — no migration

Docs:
- Concept page loses the Preamble section, SandboxMissingConnection
  and ConnectionRevoked rows, required_connections YAML, and the T1
  residual row. T1b dropped since the exfil class no longer exists.
  The 11-step per-turn lifecycle shrinks to 9 steps
- Runbook loses failure modes #3 and #4; architecture-in-one-page
  simplified accordingly
- sandbox-pilot SKILL.md rewritten to pure compute (S3 upload via
  per-tenant IAM role, no Slack post, no GitHub token)

Verification: typecheck clean, 1050 api tests pass, docs site builds
79 pages clean, preamble emission verified via ast.parse + inline
regression assertions (no boto3, no SecretString, no os.environ[...],
no OAuth env-var names, no token prefixes).

Plan: docs/plans/2026-04-23-006-refactor-sandbox-drop-required-connections-plan.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(sandbox): ce-review autofix sweep — tool docstring, e2e fixture, warm-container cleanup

Applies ten safe_auto fixes surfaced by the ce:review pass across 9
reviewer personas. Highest-impact fixes:

- sandbox_tool.py: rewrote the execute_code Strands tool docstring to
  describe the pure-compute primitive. Removed the retired OAuth env
  var claims (GITHUB_ACCESS_TOKEN, SLACK_ACCESS_TOKEN, GCAL_ACCESS_TOKEN)
  and ConnectionRevoked error — 7 reviewers flagged this; the LLM
  reads this docstring as live tool guidance
- fixtures.ts: dropped required_connections from the e2e fixture's
  createAgentTemplate call (the validator now rejects it, every
  integration test would abort in setup). Removed vacuous
  syntheticTokens / seedConnections / putSecret plumbing that the
  token-leak assertion's forbiddenValues depended on
- sandbox-pilot.e2e.test.ts: kept the structural CloudWatch token-leak
  check but dropped the synthetic forbiddenValues — no synthetic
  tokens are injected, so the structural prefix check (ghp_/xoxb-/
  ya29./JWT) is the only meaningful guard
- invocation_env.py: unconditional pop of all six SANDBOX_* keys at
  invocation entry. Closes the warm-container carryover window where
  a pre-deploy invocation with interrupted cleanup could leak stale
  interpreter id + retired OAuth paths into the next invocation
- sandbox-invocation-log.ts: dropped connection_revoked from
  ALLOWED_EXIT_STATUSES + added a regression test asserting the
  value is now rejected
- sandbox_preamble.py: `if not installed()` → `if installed() is not
  True` (fail-closed against mocks returning truthy non-True).
  Dropped `from __future__ import annotations` (no-op after dataclass
  removal) and f-string with only module-constant interpolation
- skill.yaml: rewrote the retired required_connections / OAuth
  preamble comment block to describe pure-compute + per-tenant IAM
  S3 access
- SKILL.md: replaced non-existent SandboxDisabled error name with
  accurate description (dispatcher does not register the tool)

Verification:
- pnpm --filter @thinkwork/api typecheck clean
- pnpm --filter @thinkwork/api test — 1051/1051 pass (+1 new guard)
- pnpm --filter @thinkwork/docs build — 79 pages clean
- Python inline regressions: warm-container stale-key clear;
  preamble identity check + no OAuth markers; tool docstring clean

Residual findings (manual follow-up):
- ADV-001: legacy-row hydration — handlers cast agent.sandbox without
  running through validateTemplateSandbox. Functionally safe (field
  never read) but plan comment overstates the strip behavior
- KT-001: SandboxEnvironmentId hand-rolled duplicate of
  SandboxEnvironment from database-pg schema
- W2: CAPABILITIES.md doesn't mention the sandbox / connector-skill
  pattern to the agent system prompt

Review artifact: .context/compound-engineering/ce-review/20260423-194430-67ea771d/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request May 5, 2026
…te IA (#530)

* docs(plan): www Cloud/Services IA refactor plan

* feat(www): add /cloud route as copy of pricing page

U1: stand up /cloud serving current pricing content. Both routes
render identically at this point; subsequent units reframe /cloud,
add Services cross-link, and retire /pricing via redirect.

* feat(www): rename nav Pricing → Cloud

U2: flip the visible nav entry to "Cloud" pointing at /cloud. Single
edit in copy.ts propagates to desktop and mobile Header.astro renderings.

* feat(www): reframe /cloud as ThinkWork Cloud + add Services cross-link

U3: update hero/meta copy from generic "Pricing / Infrastructure you
own" to "ThinkWork Cloud / Hosted agent infrastructure, deployed
inside your AWS." Add a bulleted clarifier making scope explicit
(hosted plans vs separate services vs separate AWS usage vs self-hosted
via docs). Add a soft cross-link block between PricingGrid and
FinalCTA pointing to /services — prose + inline text link, not a
button, to keep the "soft pointer" posture asymmetric with the
first-class Cloud Hosting card arriving on /services in U5.

PricingGrid and the inline Stripe checkout script are unchanged —
the new sections live outside the PricingGrid DOM so [data-plan-cta]
/ [data-plan-error] selectors continue to match. Also refresh the
stale "Do NOT cross-link to /pricing" services-export comment now
that the IA split makes the handoff expected in both directions.

* refactor: retire /pricing route in favor of /cloud

U4: add Astro redirects: { '/pricing': '/cloud' } in astro.config.mjs
and delete pricing.astro. Astro's static build emits /pricing/index.html
with <meta http-equiv='refresh'>, <link rel='canonical' href=.../cloud>,
and <meta name='robots' content='noindex'> — inbound links (search
results, Stripe cancels from older mobile builds, external shares)
continue to resolve, and the redirect stub is hidden from search
engines so duplicate-content risk is zero.

Update every known /pricing reference across the monorepo:
- apps/www/src/pages/m/checkout-complete.astro — mobile checkout
  fallback link "Return to pricing" → "Return to plans" → /cloud.
- apps/www/src/env.d.ts — consumer comment refresh.
- packages/pricing-config/src/plans.ts — consumer comment refresh.
- apps/admin/src/routes/onboarding/welcome.tsx — hardcoded https URL
  and visible anchor text "Return to pricing" → "Return to plans".
- apps/mobile/lib/stripe-checkout.ts — hardcoded cancelUrl to
  https://thinkwork.ai/cloud. Older installed mobile builds still
  point at /pricing and will hit the redirect; that's acceptable.
- terraform/modules/app/lambda-api/handlers.tf — STRIPE_CHECKOUT_
  CANCEL_URL now ends in /cloud so canceled checkouts land directly
  rather than bouncing through the redirect.

Leaves packages/api/src/handlers/stripe-checkout.ts telemetry string
'www-pricing' unchanged per plan Key Technical Decision #4 (analytics
continuity).

* feat(www): consolidate /services into single card grid with Cloud Hosting handoff

U5: remove featured/secondary variant split in ServiceCard and
services.astro; every card now uses the same visual treatment. Drop
the two separate SectionShells (featured grid + 'Additional packages')
for a single consolidated section with 5 cards in one 3-column grid
(md:grid-cols-2 lg:grid-cols-3): Strategy Sprint, Pilot Launch,
Managed Operations, Workflow Expansion, and Cloud Hosting.

Governance & Eval + AI Program Advisory are dropped as named cards —
Governance substance folds into Managed Operations' includes list;
Advisory is cut. Cloud Hosting is a new 5th peer card introducing
ctaHref + ctaLabel fields on ServicePackage so it can deep-link
visitors to /cloud; other cards leave those fields unset and render
without per-card buttons (intake continues via the shared hero and
closing CTAs).

Preserves id="packages" on the consolidated SectionShell so the
hero's #packages anchor link still resolves. Variant discriminator
is removed entirely from the ServicePackage type.

* refactor(www): correct Cloud positioning + layout polish across /cloud and header

Fix the messaging on /cloud after live review:

- ThinkWork Cloud is the FULLY-HOSTED product (we operate the platform
  end-to-end) for teams that don't want to run the Enterprise Agent
  Harness themselves. The prior framing ("deployed inside your AWS")
  is the self-managed Enterprise product, not Cloud.
- Hero reframed: 'Fully managed AI agents, no infrastructure to run.'
  with a lede that names the Enterprise Agent Harness as the separate
  self-managed path.
- smallPrint + finePrint block removed from PricingGrid (cards now
  flow straight into the Services cross-link); the remaining data
  stays in copy.ts for a potential checkout-page surface.
- Cloud Hosting service card on /services reframed: 'Fully managed
  ThinkWork — no Agent Harness to run' with 'No AWS setup on your
  side' as its first include bullet.
- Shared plan summaries in packages/pricing-config updated so Starter
  and Enterprise no longer mention deploying inside the customer's
  AWS (mobile onboarding surfaces these too).

Add a /cloud-specific FinalCTA variant (finalCtaCloud in copy.ts)
with 'Fully managed' eyebrow, 'Adopt AI. Skip the infrastructure.'
headline, and matching points — swapped in via a new 'copy' prop on
FinalCTA. Homepage continues to use the default 'Your AWS · Your
rules' variant unchanged.

Layout polish:
- New 'tight' prop on FinalCTA drops its top border + gradient
  divider and reduces top padding, so /cloud flows cleanly out of
  the plans section.
- RECOMMENDED badge on PricingCard uses opaque #070a0f (matches the
  page body) instead of bg-brand/10 so the card border no longer
  shows through the pill.
- Services cross-link folded into the same SectionShell as the plans
  grid with tighter spacing, eliminating an entire empty section
  between the two.

Header.astro now indicates the current route with semibold white +
aria-current="page" on desktop and mobile nav; inactive items stay
text-slate-400. Trailing slashes and subpaths are handled.
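
One way the active-route check could handle trailing slashes and subpaths,
sketched as a standalone helper. The function names are hypothetical and the
Header.astro logic may differ.

```typescript
// Strip trailing slashes; keep "/" for the root.
function normalizePath(path: string): string {
  const trimmed = path.replace(/\/+$/, "");
  return trimmed === "" ? "/" : trimmed;
}

// Active when the current path equals the nav href or sits beneath it.
function isActiveRoute(currentPath: string, href: string): boolean {
  const current = normalizePath(currentPath);
  const target = normalizePath(href);
  if (target === "/") return current === "/";
  return current === target || current.indexOf(target + "/") === 0;
}
```

Matching on `target + "/"` rather than a bare prefix keeps a path like
/cloudy from lighting up the Cloud nav item.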

* refactor(www): trim /services copy — drop lifecycle section and polish packaging

U6: cleanup pass after the packaging consolidation in U5.

- Remove the entire 'How it works / Engagement lifecycle' section
  (services.how in copy.ts + the 4-phase card grid in services.astro).
  With Strategy Sprint / Pilot Launch / Managed Operations / Workflow
  Expansion now as packages, the lifecycle narrates the same arc the
  cards already describe. One telling beats two.
- Hero headlineOutcome matches the Services posture from the brief:
  'We help teams scope, launch, operate, and expand governed AI
  workflows.'
- Positioning headline de-arced: 'One partner from first workflow to
  ongoing operations.' Positioning body and meta description also
  strip the references to Governance & Program Advisory that were
  dropped in U5.
- FAQ loses the 'What happens after launch?' entry — it named the
  removed packages and restated card content.
- Managed Operations card outcome + bestFor trimmed to stop echoing
  the body.
- Packages lede drops the 'Cloud Hosting sits alongside' explanation;
  the card speaks for itself.

With this, /services reads as: hero → proof band → positioning →
5-card packages grid → 5-question FAQ → closing CTA.
ericodom added a commit that referenced this pull request May 5, 2026
…ck (#296)

Follow-up to #294 / #295. Marco bootstrap chain worked for the first
hop but silently stopped at step 2: job #2 enqueued #3 successfully,
but #3 (which hit max_new_pages again) didn't enqueue #4.

Root cause: the continuation bucket was computed as
`Date.now() + 300s`, so it depended on the chained job's own runtime.
Once runtime plus the offset crossed a bucket boundary (the job ran
112s against a 300s bucket), the computed "next" bucket equaled the
bucket the job itself was running in. The dedupe key collided with the
child's own row, `ON CONFLICT DO NOTHING` swallowed the insert, and
the chain died without a visible error.

Anchor the offset on `job.created_at` instead. Each step in a chain
now produces a strictly-monotonic bucket: parent in bucket N ⇒ child
enqueued for bucket N+1, child in bucket N+1 ⇒ grandchild for N+2 —
regardless of how long any step took to run. Dedupe collisions can
only fire against external jobs (e.g., a memory-retain trigger that
hit the same bucket), which is the correct behavior.
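
A small sketch of why the anchor matters, assuming buckets are fixed 300s
windows indexed by `floor(ms / 300000)`. The real dedupe-key derivation is
not shown in this message, so the function names and the indexing scheme are
assumptions.

```typescript
const BUCKET_MS = 300 * 1000;

function bucketIndex(timestampMs: number): number {
  return Math.floor(timestampMs / BUCKET_MS);
}

// Old scheme: the result shifts with however long the job has been running.
function nextBucketFromWallClock(nowMs: number): number {
  return bucketIndex(nowMs + BUCKET_MS);
}

// New scheme: always exactly one bucket after the bucket the job was created in.
function nextBucketFromCreatedAt(createdAtMs: number): number {
  return bucketIndex(createdAtMs + BUCKET_MS);
}
```

For a job created 250s into a bucket, the wall-clock form yields different
buckets after a 10s run and a 112s run, while the `created_at` form yields
the same bucket for both.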

Observed on dev: Marco job chain went 1→2, stopped. Post-fix,
re-triggering will exercise the continuation through the full 261
memories until the cursor drains.

No test regression (411/419, pre-existing 8 skips). Continuation
behavior itself is covered by the Marco integration that spawned
this fix.
ericodom added a commit that referenced this pull request May 7, 2026
…ANT migration (#887)

* docs(plans): add Phase 3 U2 focused execution overlay

Focused execution overlay for U2 of the master Phase 3 plan: Aurora roles
+ Secrets Manager + GRANT migration. Splits the master plan's "RDS Proxy
+ separate endpoints" commitment into a follow-up unit (U12) since RDS
Proxy is greenfield infrastructure with no existing precedent in the
repo. Per-Lambda IAM scoping is documented as deferred (existing wildcard
inherits the new compliance/* secrets — acceptable interim).

Resolves all P0/P1 questions from the focused planning research:
- Postgres role provisioning mechanism: hand-rolled SQL with operator-
  supplied passwords (matches U1 invariant; defers postgresql provider
  dependency).
- Drift gate: extend scripts/db-migrate-manual.sh with probe_role for
  the new -- creates-role: marker type.
- Secret naming: slash-delimited thinkwork/${stage}/compliance/* per
  CLAUDE.md standard.
- Secret JSON shape: {username, password, host, port, dbname} (enriched
  vs the master's {username, password}).
- Auto-supplied passwords: operator-supplied via env vars, not committed
  tfvars; Terraform owns the secret container, not the value.

4 sub-units: drift-gate extension, SQL migration, Terraform secrets,
bootstrap script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(compliance): Aurora roles + Secrets Manager containers (U2)

Phase 3 U2 of the System Workflows revert. Provisions the three Aurora
roles that scope per-tier access to compliance.* (introduced in U1 via
PR #880) plus the Secrets Manager containers that hold their credentials.

What's in this PR

* `scripts/db-migrate-manual.sh` — adds `probe_role` and `creates-role:`
  marker support so hand-rolled migrations can declare cluster-global
  Postgres roles for the post-deploy drift gate. Mirrors the existing
  probe_extension shape (unqualified bare name, no schema prefix).

* `packages/database-pg/drizzle/0070_compliance_aurora_roles.sql` — the
  hand-rolled migration. Idempotent DO $$ blocks check pg_roles before
  CREATE ROLE; ALTER ROLE on existing roles rotates the password.
  format(%L, ...) handles SQL-quoting for arbitrary password content.
  GRANT matrix matches Decision #4:
    - compliance_writer:  USAGE on schema + INSERT only on audit_outbox
                          and export_jobs.
    - compliance_drainer: USAGE + SELECT/UPDATE on audit_outbox + SELECT
                          on actor_pseudonym + INSERT on audit_events.
    - compliance_reader:  USAGE + SELECT only on all four compliance.*
                          tables.

* `packages/database-pg/__tests__/migration-0070.test.ts` — 27 vitest
  assertions: structural shape, creates-role marker presence, idempotent
  DO block pattern, psql variable substitution, GRANT matrix per role,
  format(%L) usage for SQL-injection safety.

* `terraform/modules/data/aurora-postgres/main.tf` + `outputs.tf` — three
  `aws_secretsmanager_secret` containers at
  `thinkwork/${stage}/compliance/{writer,drainer,reader}-credentials`
  (slash-delimited per CLAUDE.md standard, enriched JSON shape) plus
  three new outputs. No `secret_version` resources — operator owns the
  value.

* `scripts/bootstrap-compliance-roles.sh` — wraps the per-stage one-time
  bootstrap: resolves master DB credentials, generates or accepts role
  passwords from env, populates the three Secrets Manager secrets via
  put-secret-value, runs the migration with psql -v substitution,
  verifies via \du + drift gate. Idempotent re-run safe.

Why this shape

The master plan's Decision #4 commits to "two distinct DB users +
separate RDS Proxy endpoints". Repo research surfaced two facts that
shape U2 scope:

  1. No RDS Proxy exists in the repo today. Lambdas connect direct.
     Introducing Proxy is greenfield infra (sub-module + IAM auth + SG
     + per-role endpoints) and deserves its own PR.
  2. No precedent exists for in-Terraform Postgres role management
     (cyrilgdn/postgresql provider not used; no null_resource shell-out
     pattern). Hand-rolled SQL applied via psql -f to dev BEFORE merge
     is the U1 invariant.

This PR ships the SOC2-required role separation. RDS Proxy moves to a
follow-up unit (master plan's "Deferred to Follow-Up Work" gets a new
U12 entry).

Apply-before-merge

Pre-merge operator workflow:
  STAGE=dev bash scripts/bootstrap-compliance-roles.sh

The script populates the three secrets (Terraform-applied containers
must already exist on dev — apply terraform-plan -target before the
bootstrap script if greenfield) and runs the migration.

Drift gate exits 0 against dev once roles + grants are in place.

Tests: 112 vitest assertions across 9 files (was 85 in U1 / PR #880).

Plan: docs/plans/2026-05-07-001-feat-compliance-u2-aurora-roles-plan.md
Master: docs/plans/2026-05-06-011-feat-compliance-audit-event-log-plan.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(review): apply autofix feedback

8 safe_auto fixes from ce-code-review (run 20260507-043805-679c4050):

scripts/bootstrap-compliance-roles.sh
* P1 — Add STAGE allowlist gate. Require CONFIRM_NONDEV=1 for staging/prod
  to prevent accidental production credential rotation. (adversarial adv-003)
* P1 — Stop leaking master DB password through python3 argv. Switch to
  stdin so the credential never appears in `ps aux` /
  /proc/<pid>/cmdline. (security SEC-002, correctness COR-002)
* P1 — Stop emitting plaintext password values to stderr on auto-
  generation. Generated passwords are retrievable via aws secretsmanager
  get-secret-value; printing them to stderr persists them in CloudWatch /
  GHA logs / operator shell history. (security SEC-001)
* P2 — Stop leaking compliance role passwords through psql -v argv.
  Replace -v writer_pass=... with a mktemp mode-0600 preamble file
  containing \set directives, consumed via psql -f. (security SEC-006,
  correctness COR-003, adversarial adv-005)
* P2 — Stop leaking secret payloads through `--secret-string "$json"`.
  Use --secret-string file://$payload_file with a mktemp 0600 file +
  trap-clean. (security RR-002, adversarial adv-005)
* P3 — Fix \du verification: psql ignores extra positional args, so the
  three-name form silently matches nothing. Use the glob \du compliance_*
  instead. (correctness COR-001, data-migrations DM-002)
* CI-safety — Resolve role passwords with three-way precedence: env
  override → existing Secrets Manager value → auto-generate. Prevents
  password rotation on every deploy when the script runs in CI; only
  greenfield bootstrap auto-generates.
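
The three-way precedence in the CI-safety item can be sketched roughly as
below. This is a Python illustration, not the bash script itself:
`resolve_role_password`, the `fetch_existing` callback, and the env-var name
passed in are hypothetical stand-ins, and the Secrets Manager lookup is
injected so the ordering can be read in isolation.

```python
import secrets
from typing import Callable, Mapping, Optional, Tuple


def resolve_role_password(
    env: Mapping[str, str],
    env_var: str,
    fetch_existing: Callable[[], Optional[str]],
) -> Tuple[str, str]:
    """Return (password, source): env override -> existing secret -> generate."""
    override = env.get(env_var)
    if override:
        # 1. An explicit operator override always wins.
        return override, "env"
    existing = fetch_existing()
    if existing:
        # 2. Re-use the stored value so repeat deploys do not rotate it.
        return existing, "secrets-manager"
    # 3. Only a greenfield bootstrap reaches auto-generation.
    return secrets.token_urlsafe(32), "generated"
```

Because step 2 short-circuits before generation, running the resolver on
every deploy leaves an already-populated secret untouched.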

packages/database-pg/drizzle/0070_compliance_aurora_roles.sql
* P2 — Add pre-flight schema-existence check. CREATE ROLE is non-
  transactional in Postgres; without this guard, applying 0070 against
  a database without 0069 would persist three roles with passwords but
  zero grants, and the drift gate would report APPLIED. The pre-flight
  DO block raises EXCEPTION before any role creation, converting silent
  partial failure into a hard stop. (data-migrations DM-001)

packages/database-pg/__tests__/migration-0070.test.ts
* P3 — Add per-role CREATE+ALTER format(%L) assertions. The original
  `formatMatches.length >= 6` assertion would silently pass if a future
  edit dropped the ALTER branch on one role. The new it.each assertion
  verifies both branches exist for each role explicitly. Also added one
  pre-flight schema-check assertion for the migration above.
  (security TG-001, +1 test from data-migrations)

Tests: 116/116 passing (was 112).

Residual actionable work (6 items routed to PR-body residuals):
  #9  per-Lambda IAM scoping for the thinkwork/* secrets wildcard
  #10 hardcoded RDS DNS suffix in bootstrap (cross-stage promotion deferred)
  #11 update CLAUDE.md to enumerate creates-role: marker
  #12 rollback documentation for REVOKE ALL → DROP ROLE sequence
  #13 ALTER DEFAULT PRIVILEGES for future schema mutations
  #14 ordering of put-secret-value vs psql apply (accepted tradeoff)

Run artifact: /tmp/compound-engineering/ce-code-review/20260507-043805-679c4050/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: add compliance-bootstrap job to deploy.yml

Phase 3 U2 chicken-and-egg problem: terraform-apply creates the three Secrets Manager
containers (writer/drainer/reader), but only bootstrap-compliance-roles.sh
can populate values + create matching Aurora roles + apply
0070_compliance_aurora_roles.sql. Without a CI step, migration-drift-check
would report the three creates-role: markers as MISSING after every
deploy until an operator manually ran the bootstrap.

The script is idempotent across re-runs: reads existing Secrets Manager
values and re-uses them on subsequent deploys (no password rotation),
only auto-generates on greenfield. Safe to run on every deploy.

New job `compliance-bootstrap`:
* needs [terraform-apply], runs after containers exist.
* Installs psql + jq, configures AWS creds, runs the bootstrap.
* migration-drift-check now needs [terraform-apply, compliance-bootstrap]
  so the drift gate verifies the roles created here.

Also added `?sslmode=require` to the bootstrap's DATABASE_URL for parity
with migration-drift-check; psql in CI errors on missing TLS without it.

Resolves the apply-before-merge gap surfaced in PR #887's body. After
this commit lands, the deploy flow self-heals on first apply: terraform
creates the secret containers (empty), bootstrap populates them +
creates the roles, drift gate verifies, deploy continues.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request May 7, 2026
…e only) (#911)

* feat(compliance): U6 — Strands runtime audit emit path (infrastructure only)

Phase 3 U6 of the compliance audit-event log. U1-U5 shipped the
schema, roles, write helper, outbox drainer, and TypeScript-runtime
call-site emits. U6 stands up the cross-runtime path: a narrow REST
endpoint the Python Strands runtime can post audit events to, with
a Python client that mirrors the existing _log_invocation pattern.

What ships:
  - POST /api/compliance/events — new compliance-events Lambda
    (packages/api/src/handlers/compliance.ts) authenticating via
    API_AUTH_SECRET. Cross-tenant guard via SELECT users.tenant_id;
    idempotency via SELECT-then-INSERT against client-supplied UUIDv7
    event_id, with 23505-fallback for the SELECT/INSERT race. Emits
    through the U3 helper inside db.transaction.
  - emitAuditEvent helper extension — added optional eventId field on
    EmitAuditEventInput so the cross-runtime client can supply a
    UUIDv7 that survives retries. Existing U5 callers unchanged
    (don't pass eventId; helper still generates server-side).
  - Python ComplianceClient
    (packages/agentcore-strands/agent-container/container-sources/compliance_client.py)
    with stdlib UUIDv7 helper (RFC 9562), env snapshot at __init__,
    3-attempt exponential backoff on 5xx/429, no retry on 4xx,
    snake_case → camelCase boundary conversion.
  - Boot-time client instantiation in server.py main() so a Phase 4
    caller picks up the singleton via `from server import
    _compliance_client`.
  - Terraform + build-script wiring (mirrors narrow-handler shape,
    NOT compliance-outbox-drainer — drainer uses compliance_drainer
    role; this handler uses master DATABASE_SECRET_ARN like every
    other narrow handler).
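
For readers unfamiliar with the UUIDv7 layout the Python client relies on, a
minimal stdlib-only sketch follows. It is not the code in
compliance_client.py; the function name `uuid7` is an assumption, and the bit
layout is the one RFC 9562 describes (48-bit Unix millisecond timestamp,
version nibble 7, variant bits 10, remaining bits random).

```python
import os
import time
import uuid


def uuid7() -> uuid.UUID:
    """Build a UUIDv7: 48-bit ms timestamp, version 7, variant 10, random rest."""
    ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                 # top 12 bits
    rand_b = rand & ((1 << 62) - 1)     # low 62 bits

    value = (ts_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                       # bits 79..76: version
    value |= rand_a << 64                    # bits 75..64: rand_a
    value |= 0b10 << 62                      # bits 63..62: variant
    value |= rand_b                          # bits 61..0: rand_b
    return uuid.UUID(int=value)
```

Generating the ID once on the client and re-sending the same value on every
retry is what lets the handler's SELECT-then-INSERT treat a replay as a no-op.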

What does NOT ship:
  - No live `client.emit(...)` call sites in server.py. The only
    obvious candidate (Strands AGENTS.md edits) goes through
    /api/workspaces/files which already emits via U5's TypeScript
    path; emitting from Python on top would create duplicate audit
    rows. A guard test in test_compliance_client.py catches
    accidental scope creep. First non-duplicate caller is a Phase 4
    brainstorm.

Plan: docs/plans/2026-05-07-007-feat-compliance-u6-strands-emit-path-plan.md

Tests:
  - 19 handler integration tests (mocked db) covering happy path,
    idempotency (replay + 23505 race), auth, cross-tenant guard,
    body validation, method/path matching.
  - 16 Python client tests covering UUIDv7 shape, env snapshot,
    enabled/disabled state, retry behavior on 5xx/429/timeouts,
    no-retry on 4xx, idempotency-key/body matching, no-live-emit
    guard.
  - 4 new emit.ts unit tests covering optional eventId override.
  - Full api suite passing: 2245/2245.
  - compliance-events Lambda zip builds (72K).

Plan went through ce-doc-review headless; 3 P0s caught and resolved
in the plan body before ce-work:
  1. Original plan relied on onConflictDoNothing against a
     client-supplied event_id, but the U3 helper unconditionally
     generated server-side. Helper extended with optional eventId.
  2. Original plan referenced compliance_writer Aurora role; U5
     actually emits via the master db singleton. U6 follows U5's
     convention.
  3. Original "first call site" choice produced duplicate rows with
     U5; reframed as infrastructure-only with the duplicate-row
     reasoning documented in the plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(review): apply autofix feedback

Four safe_auto fixes from ce-code-review:

1. 23505 race-recovery now checks both `err.code` AND `err.cause.code`
   so the drizzle-orm-wrapped pg unique-violation actually gets caught.
   The handler's race-recovery comment claimed parity with the
   existing tasks.ts pattern but the implementation diverged — fixed.
   (correctness #1, P1)

2. Cross-tenant `users` SELECT now scoped to actorType === "user".
   System (e.g. "platform-credential") and agent (`agents.id`) actorIds
   are not user PKs; SELECTing users would have always 403'd them.
   API_AUTH_SECRET bearer is the trust boundary for non-user actors.
   (correctness #2, P2)

3. redactPayload errors mapped to 400 (alongside existing
   emitAuditEvent: prefix mapping). Posting a Phase 6 reservation
   eventType like `policy.evaluated` would have fallen through to
   500, which the Python client treats as retryable, generating a
   3× retry storm on a permanent caller bug. (security #2, P2)

4. Python retry shape clean-up. RETRY_DELAYS_SEC was (0.5, 1.0, 2.0)
   with a conditional that skipped the final sleep, making the 2.0s
   entry dead. Restructured to total_attempts = len(delays) + 1; N
   delays separate N+1 attempts. Trimmed to (0.5, 1.0) for 3 attempts
   with 1.5s sleep budget. Test monkeypatches updated.
   (correctness #4, P3)
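
The restructured retry shape in item 4 can be sketched as follows. This is an
illustration under assumptions, not the client's actual code: `post_with_retry`
and the injected `send` / `sleep` callables are hypothetical, `send` is assumed
to return an HTTP status code, and timeout handling is left out.

```python
import time
from typing import Callable, Sequence

RETRY_DELAYS_SEC = (0.5, 1.0)  # N delays separate N + 1 attempts


def is_retryable(status: int) -> bool:
    """5xx and 429 are transient; any other 4xx is a permanent caller bug."""
    return status == 429 or 500 <= status < 600


def post_with_retry(
    send: Callable[[], int],
    delays: Sequence[float] = RETRY_DELAYS_SEC,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Attempt send() up to len(delays) + 1 times; return the last status."""
    status = send()
    for delay in delays:
        if not is_retryable(status):
            break
        sleep(delay)       # every delay entry is used; none is dead
        status = send()
    return status
```

With two delays the loop makes at most three attempts and sleeps at most
1.5s in total, matching the budget described above.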

3 new TypeScript regression tests cover the 23505 cause-chain path
and the system / agent actorType cross-tenant-bypass branches.

All 2248 TS + 16 Python tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ericodom added a commit that referenced this pull request May 29, 2026
mapTurnsToUserMessages paired the i-th turn to the i-th user message by
document position. In multi-player threads another human's message is a USER
message that triggers no turn, so the agent's 'Working…' disclosure got pinned
to that intervening message (rendering above the message that triggered it).
Pair each turn to the nearest-preceding user message by timestamp instead;
fall back to positional pairing when message createdAt is unavailable.

Needs live multiplayer validation to confirm Image #4 is resolved.
ericodom added a commit that referenced this pull request May 29, 2026
* feat(spaces): commit mention with Tab, dismiss with Escape [U4]

The mention autocomplete in both composers now commits the highlighted target
on Tab as well as Enter, and closes the menu on Escape without committing. The
MentionMenu listbox gains aria-activedescendant so screen readers announce
the active option during arrow navigation.

* feat(spaces): add agent-mode derivation helper [U5]

deriveAgentDefault() decides the composer agent toggle's initial state:
ON in single-player threads, OFF when another human is in the loop (has
posted, or is @mentioned in the current draft). Shared by both composers
via a minimal local mention shape so the rule can't drift.
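
The rule can be sketched as below. The real helper lives in the TypeScript
composer code; this Python version only illustrates the decision, and its
parameter names (and the convention that a `None` author means an agent or
system message) are assumptions.

```python
from typing import Iterable, Optional


def derive_agent_default(
    current_user_id: str,
    message_author_ids: Iterable[Optional[str]],
    draft_mentioned_user_ids: Iterable[str],
) -> bool:
    """True (agent ON) in single-player threads, False once another human is in the loop."""
    # Assumption: None marks an agent/system message, which never counts as a human.
    other_human_posted = any(
        author is not None and author != current_user_id
        for author in message_author_ids
    )
    other_human_mentioned = any(
        uid != current_user_id for uid in draft_mentioned_user_ids
    )
    return not (other_human_posted or other_human_mentioned)
```

Keeping the rule in one pure function is what lets both composers share it
without drifting.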

* feat(spaces): auto-derive follow-up agent toggle default [U6]

The follow-up composer's agent toggle now initializes from deriveAgentDefault
instead of always-on: ON in single-player threads, OFF once another human has
posted or is @mentioned in the draft. The manual choice persists within the
thread (per-thread override) and re-derives on thread switch.

* feat(spaces): auto-derive new-thread agent toggle default [U7]

The new-thread composer's agent toggle now derives from draft mentions:
ON with no user mention, OFF once another user is @mentioned. Manual choice
persists until the draft is sent/cleared. Completes the single- vs
multi-player default for Issue 4.

* feat(spaces): surface tagged threads in the sidebar live + on focus [U1][U2]

The urql document cache doesn't auto-invalidate on live events, so a thread
the caller was @mentioned into didn't appear without a manual refresh. The
shell now refetches the thread lists (coalesced, network-only) on two signals:
returning to the window (focus/visibility) and the tenant-scoped
onThreadUpdated subscription, which fires for the caller on createThread and
sendMessage. No new subscription field — reuses the existing event.

Needs live desktop validation: confirms the root cause of the 'even after
refresh' divergence (R1.2).

* fix(spaces): anchor turn activity to its triggering message [U3]

mapTurnsToUserMessages paired the i-th turn to the i-th user message by
document position. In multi-player threads another human's message is a USER
message that triggers no turn, so the agent's 'Working…' disclosure got pinned
to that intervening message (rendering above the message that triggered it).
Pair each turn to the nearest-preceding user message by timestamp instead;
fall back to positional pairing when message createdAt is unavailable.

Needs live multiplayer validation to confirm Image #4 is resolved.
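
A rough Python sketch of the timestamp-based pairing follows. The actual
mapTurnsToUserMessages is TypeScript; the dataclasses, the `started_at` field,
and the returned mapping shape here are assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UserMessage:
    id: str
    created_at: Optional[float]  # epoch seconds; None when unavailable


@dataclass
class Turn:
    id: str
    started_at: float  # hypothetical field name


def map_turns_to_user_messages(
    turns: List[Turn], messages: List[UserMessage]
) -> Dict[str, str]:
    """Return {user_message_id: turn_id}; the latest turn wins per message."""
    pairing: Dict[str, str] = {}
    if not messages:
        return pairing
    if any(m.created_at is None for m in messages):
        # Fallback: positional pairing (i-th turn -> i-th user message).
        for turn, message in zip(turns, messages):
            pairing[message.id] = turn.id
        return pairing
    ordered = sorted(messages, key=lambda m: m.created_at)
    stamps = [m.created_at for m in ordered]
    for turn in sorted(turns, key=lambda t: t.started_at):
        idx = bisect_right(stamps, turn.started_at) - 1
        anchor = ordered[max(idx, 0)]  # no preceding message -> earliest one
        pairing[anchor.id] = turn.id
    return pairing
```

In a thread where another human posts between two triggering messages, the
second turn now lands on the message that triggered it rather than on the
intervening one.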

* style(spaces): prettier formatting for agent-mode + composer [U5][U7]

* fix(api): mention participants can see threads in private spaces [U1]

callerVisibleThreadPredicate gated thread visibility on author-or-participant
AND space-membership. A user mentioned into a thread inside a private Space
they don't belong to is a participant but not a member, so the thread was
filtered out of their list — the desktop refetch couldn't surface what the
query excluded. A mention is a thread-level invite: an explicit participant
now bypasses the space-membership gate for THAT thread only (they still can't
see the rest of the private Space). Realigns the code with its own docstring.
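
The change in gating can be read as a small boolean rewrite. The sketch below
is one interpretation of the description above, not the SQL predicate itself;
the function names and the three boolean inputs are hypothetical.

```python
def visible_before(is_author: bool, is_participant: bool, passes_space_gate: bool) -> bool:
    """Old rule: author-or-participant AND space membership."""
    return (is_author or is_participant) and passes_space_gate


def visible_after(is_author: bool, is_participant: bool, passes_space_gate: bool) -> bool:
    """New rule (as read here): an explicit participant bypasses the space gate."""
    return is_participant or (is_author and passes_space_gate)
```

A user mentioned into a thread in a private Space they do not belong to is
the case that flips: `visible_before(False, True, False)` is false, while
`visible_after(False, True, False)` is true.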

* test(spaces): cover turn-pairing edge cases [U3]

Add coverage for multiple turns mapping to one user message (latest wins,
single disclosure) and a turn with no preceding user message (anchors to the
earliest message without throwing).
ericodom added a commit that referenced this pull request May 31, 2026
…hread counts; add scoped Thread List table (#1894)

Thread-detail / sidebar fixes:

1. Follow-up flash (#4): withTurnResponseFallback only tail-appended the
   latest completed turn's synthetic response, so the prior turn's (not yet
   durable) response was dropped the instant an optimistic follow-up user
   message arrived — the transcript flashed the answer out and showed only
   "Working…". Reconstruct a synthetic response after EACH completed turn's
   user message that lacks a durable reply, anchored in place across new turns.

2. Unread filter hides the open thread (#1): clicking a thread in a filtered
   section marked it locally-read and filtered it out the same frame. Add
   displayedUnreadThreads() to retain the selected thread in the displayed list
   (badge / mark-all still use the pure unread set); it drops out on deselect.

3. Space nav section undercount (#3): each space section bucketed the
   tenant-wide RECENT_LIMIT window client-side, so on busy tenants a space's
   section listed fewer threads than its detail page. Fetch each space's own
   scoped threads (SpaceThreadsQuery, mirroring the detail page) and merge the
   bucketed seed.

4. Thread List table (#2): each section's "…" menu gains a "Thread list" item
   that opens /threads scoped to that section's space (Chats → default space,
   matching the nav). Server-paginated DataTable (table-fixed, no horizontal
   scroll, height-constrained so the pager stays visible) with Title, ID,
   Space, Status, Last activity columns, row-click-to-open, and delete.
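
The unread-filter fix in item 2 amounts to a small list filter. A minimal
Python sketch, assuming threads are identified by string IDs (the real
displayedUnreadThreads is TypeScript and its signature is not shown here):

```python
from typing import List, Optional, Set


def displayed_unread_threads(
    thread_ids: List[str],
    unread_ids: Set[str],
    selected_id: Optional[str],
) -> List[str]:
    """Unread threads plus the currently selected one, in original order."""
    return [
        tid for tid in thread_ids
        if tid in unread_ids or tid == selected_id
    ]
```

The badge count and mark-all would keep reading `unread_ids` directly, so
only the displayed list changes while a thread is selected.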

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>