workos · nicknisi · May 19, 2026 · May 18, 2026 · May 18, 2026 · May 18, 2026
diff --git a/.gitignore b/.gitignore
@@ -1,19 +1,17 @@
-# Case harness marker files (created during pipeline runs)
-.case-active
-.case-tested
-.case-manual-tested
-.case-reviewed
-
 # Repo-local Case runtime state. Target repos should ignore this directory too;
 # `ca bootstrap` adds the rule automatically.
 .case/
 
+# User-local project manifest (lives in ~/.config/case/projects.json)
+projects.json
+
 # Legacy in-repo runtime state retained only as a read fallback during migration.
 tasks/active/
 tasks/done/
 docs/learnings/
 docs/proposed-amendments/*.md
 !docs/proposed-amendments/.gitkeep
+!docs/proposed-amendments/README.md
 docs/run-log.jsonl
 docs/agent-versions/
 

diff --git a/AGENTS.md b/AGENTS.md
@@ -1,4 +1,4 @@
-# Case — WorkOS OSS Harness
+# Case — Agent Harness
 
 Spine repo for orchestrating agent work across WorkOS open source projects.
 Humans steer. Agents execute. When agents struggle, fix the harness.
@@ -21,9 +21,9 @@ echo "$SESSION"
 | authkit-session        | `../authkit-session`        | Framework-agnostic session management                | TS/pnpm |
 | authkit-tanstack-start | `../authkit-tanstack-start` | AuthKit TanStack Start SDK                           | TS/pnpm |
 | authkit-nextjs         | `../authkit-nextjs`         | AuthKit Next.js SDK                                  | TS/pnpm |
-| workos-node            | `../workos-node/main`       | WorkOS Node.js SDK                                   | TS/pnpm |
+| workos-node            | `../workos-node/main`       | WorkOS Node.js SDK                                   | TS/npm  |
 
-Full metadata (commands, remotes, language): `projects.json`
+Full metadata (commands, remotes, evidence strategy): `~/.config/case/projects.json`
 
 ## Navigation
 

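Note on the manifest move: AGENTS.md now points at `~/.config/case/projects.json`, and the README hunk later in this diff says repo paths in a portable manifest should be absolute or relative to that file. A minimal sketch of that resolution rule follows. `resolveRepoPath` is a hypothetical helper, not a function from the Case source, and it assumes the manifest path has already been expanded (Node does not expand `~`). It uses `path.posix` so the result is the same on every platform.

```typescript
import * as path from "node:path";

// Hypothetical helper (not from the Case codebase): resolve a repo path
// taken from a manifest entry. Absolute paths pass through unchanged;
// relative paths resolve against the directory containing the manifest.
function resolveRepoPath(manifestFile: string, repoPath: string): string {
  if (path.posix.isAbsolute(repoPath)) {
    return path.posix.normalize(repoPath);
  }
  return path.posix.resolve(path.posix.dirname(manifestFile), repoPath);
}

// Illustrative values only: a relative entry resolves next to the manifest
// directory, not next to a Case checkout.
const manifest = "/home/dev/.config/case/projects.json";
console.log(resolveRepoPath(manifest, "../workos-node/main"));
console.log(resolveRepoPath(manifest, "/srv/repos/cli"));
```

Under this reading, a relative entry such as `../workos-node/main` would point somewhere different once the manifest lives under `~/.config/case/`, which may be worth checking when migrating.
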
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -13,7 +13,7 @@ It provides the cross-cutting knowledge, conventions, and task dispatch that no
 
 ## Philosophy
 
-- **Case exists to make agent-authored WorkOS OSS PRs reliable, reviewable, and self-improving.** Keep the core loop small unless reliability requires more.
+- **Case exists to make agent-authored PRs reliable, reviewable, and self-improving.** Keep the core loop small unless reliability requires more.
 - **Humans steer, agents execute.** Engineers define goals and acceptance criteria. Agents implement.
 - **Never write code directly.** All code changes in target repos flow through agents. Engineers only improve this harness.
 - **When agents struggle, fix the harness.** The fix is never "try harder" — it's a missing doc, playbook, convention, or enforcement rule.
@@ -48,34 +48,33 @@ Case depends on the skills plugin for product knowledge. They are complementary,
 ```
 AGENTS.md                 # Entry point for agents (routing map)
 CLAUDE.md                 # This file (meta-instructions for case itself)
-projects.json             # Manifest of target repos
-projects.schema.json      # JSON Schema for the manifest
+projects.schema.json      # JSON Schema for the project manifest
 docs/
   architecture/           # Canonical patterns per repo type
   conventions/            # Shared rules (commits, testing, PRs)
   golden-principles.md    # Invariants enforced across all repos
   playbooks/              # Step-by-step guides for recurring operations
 tasks/
   active/                 # Current task files for agent execution
-  done/                   # Completed tasks (moved after PR merge)
   templates/              # Reusable task templates
 src/commands/
   check.ts                # Cross-repo convention enforcement
   bootstrap.ts            # Per-repo readiness verification
+  onboard.ts              # Human-facing onboarding for a new repo
 ```
 
 ## Commands
 
 ```bash
-# Validate manifest
-node -e "JSON.parse(require('fs').readFileSync('projects.json','utf8'))"
-
-# Check conventions across repos
+# Check conventions across repos (also validates the manifest)
 ca check
 
 # Check a single repo
 ca check --repo cli
 
 # Bootstrap a repo for agent work
 ca bootstrap cli
+
+# Onboard a new repo
+ca onboard <path>
 ```
diff --git a/CONTEXT.md b/CONTEXT.md
@@ -7,7 +7,7 @@ Canonical vocabulary for the case pipeline. Every term used in code, specs, and
 | Term                      | Definition                                                                                                                                                 | Rejected Alternatives                                                   |
 | ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------- |
 | **task**                  | A unit of agent work dispatched by the pipeline. Has a `taskId`, status, and associated event log.                                                         | `job`, `run` (too generic)                                              |
-| **phase**                 | A named pipeline stage that produces one `AgentResult`. One of: implement, verify, review, approve, close, retrospective.                                  | `step` (too generic), `stage` (ambiguous with CI)                       |
+| **phase**                 | A named pipeline stage that produces one `AgentResult`. One of: implement, verify, review, close, retrospective.                                           | `step` (too generic), `stage` (ambiguous with CI)                       |
 | **node**                  | A DAG vertex representing one phase execution at a specific revision cycle. E.g., `implement_0`, `verify_1`. Introduced in Phase 3.                        | `vertex` (too academic)                                                 |
 | **status**                | The lifecycle position of a task, derived from pipeline state. One of: active, implementing, verifying, reviewing, evaluating, closing, pr-opened, merged. | `state` (reserved for `PipelineState`, the full reconstructible object) |
 | **state**                 | The full reconstructible pipeline state object (`PipelineState`), produced by `reduceEvents()`.                                                            | `snapshot` (used in mill for a different concept)                       |
@@ -18,6 +18,7 @@ Canonical vocabulary for the case pipeline. Every term used in code, specs, and
 | **evaluator**             | Collective term for verifier and reviewer — the two phases that assess implementation quality.                                                             | `assessor`, `checker`                                                   |
 | **marker**                | A file written to `.case/<task-slug>/` as evidence of a completed phase. E.g., `tested`, `reviewed`.                                                       | `flag`, `sentinel`                                                      |
 | **evidence**              | Proof that a phase completed successfully. Includes marker files, SHA-256 hashed test output, screenshots.                                                 | `artifact` (too broad)                                                  |
+| **evidence strategy**     | One of: ui-screenshot, scenario-script, test-output. Declared per project in `projects.json`. Drives what kind of verification evidence the pipeline requires. |                                                                         |
 | **ast-grep rule**         | A YAML file defining a structural code pattern to match or ban. Processed by ast-grep against TypeScript ASTs. Lives in `ast-rules/`.                      | `lint rule` (too generic — we also have oxlint)                         |
 | **target rule**           | An ast-grep rule enforcing golden principles in target repos. Run by the implementer before committing. Lives in `ast-rules/target/`.                      | `repo rule`, `external rule`                                            |
 | **self-enforcement rule** | An ast-grep rule enforcing case's own codebase invariants. Run in CI and pre-commit. Lives in `ast-rules/self/`.                                           | `internal rule`, `meta rule`                                            |

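Note on the new glossary term: the three evidence strategies can be modeled as a closed union so that an unknown value in the manifest fails loudly. This is a sketch under assumptions, not the Case implementation. The names `EvidenceStrategy`, `isEvidenceStrategy`, and `expectationsFor` are hypothetical, and the description strings paraphrase the README bullets added later in this diff.

```typescript
// Hypothetical sketch: type and function names are illustrative only.
type EvidenceStrategy = "ui-screenshot" | "scenario-script" | "test-output";

// Expectation text per strategy, paraphrased from the README hunk in this diff.
const EXPECTATIONS: Record<EvidenceStrategy, string> = {
  "ui-screenshot": "Playwright before/after screenshots for user-facing UI changes",
  "scenario-script": "a consumer script that exercises the specific user-facing scenario",
  "test-output": "automated test output only",
};

// Narrow an untrusted manifest value to a known strategy.
// hasOwnProperty avoids false positives from inherited keys such as "toString".
function isEvidenceStrategy(value: string): value is EvidenceStrategy {
  return Object.prototype.hasOwnProperty.call(EXPECTATIONS, value);
}

// Return the expectation text for a strategy, or throw on an unknown value.
function expectationsFor(strategy: string): string {
  if (!isEvidenceStrategy(strategy)) {
    throw new Error(`Unknown evidence strategy: ${strategy}`);
  }
  return EXPECTATIONS[strategy];
}

console.log(expectationsFor("test-output"));
```

Keeping the mapping in a `Record` keyed by the union means the compiler flags a missing entry if a fourth strategy is ever added to the type.
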
diff --git a/README.md b/README.md
@@ -2,13 +2,13 @@
 
 <img width="500" height="500" alt="Case" src="docs/case-logo.svg" />
 
-Case is the reliability layer for agent-authored WorkOS OSS pull requests.
+Case is the reliability layer for agent-authored pull requests.
 
-Its job is narrow: turn a clearly scoped WorkOS OSS task into a reviewed PR with evidence, and make the next run better when this one fails. Case is not a generic agent platform, a dashboard product, or a place to accumulate every possible workflow idea. Humans steer. Agents execute. The harness keeps the work reviewable.
+Its job is narrow: turn a clearly scoped task into a reviewed PR with evidence, and make the next run better when this one fails. Case is not a generic agent platform, a dashboard product, or a place to accumulate every possible workflow idea. Humans steer. Agents execute. The harness keeps the work reviewable.
 
 ## Why It Exists
 
-Agents are useful when the surrounding system makes good work easier than bad work. Case provides that surrounding system for the WorkOS open source repos:
+Agents are useful when the surrounding system makes good work easier than bad work. Case provides that surrounding system:
 
 - A shared map of target repos, commands, architecture notes, and conventions.
 - A task format that separates human intent from machine-updated state.
@@ -18,7 +18,7 @@ Agents are useful when the surrounding system makes good work easier than bad wo
 
 The north star:
 
-> Case exists to make agent-authored WorkOS OSS PRs reliable, reviewable, and self-improving.
+> Case exists to make agent-authored PRs reliable, reviewable, and self-improving.
 
 ## Core Loop
 
@@ -106,6 +106,7 @@ ca 1234                 # create or resume a GitHub issue run
 ca DX-1234              # create or resume a Linear issue run
 ca --agent              # interactive steering session
 ca --agent 1234         # steering session with issue context
+ca onboard <path>       # add a repo to projects.json
 ca run --task <file>    # run an existing task JSON
 ca watch <task-slug>    # live-tail the event log
 ```
@@ -120,7 +121,7 @@ ca mark-manual-tested
 ca mark-reviewed --critical 0
 ca upload <file>
 ca snapshot <agent-name>
-ca create --repo <name> --title <title> --description <text>
+ca create --repo <name> --title <title> --description <text> --evidence <expectations>
 ca analyze-failure <task.json> <agent> <error>
 ca bootstrap <repo>
 ca check [--repo <repo>]
@@ -167,6 +168,8 @@ CASE_DATA_DIR=/tmp/case-test ca init
 
 Static package assets are versioned with Case and embedded into the standalone binary: `agents/`, markdown under `docs/`, and text rules under `ast-rules/`. When running from a checkout, disk files win so local prompt/doc edits are picked up immediately; set `CASE_PACKAGE_ROOT=/path/to/case` to force a specific checkout as the disk override.
 
+Each entry in `projects.json` may include `credentials` (per-repo secrets needed for verification) and `verificationNotes` (free-form context the verifier should know about the repo).
+
 For portable binary installs, keep `projects.json` in `~/.config/case/` via `ca init --projects <path>` or `ca init --migrate-from <case-checkout>`. Repo paths in a portable `projects.json` should be absolute or relative to that `projects.json` file.
 
 ## Pipeline
@@ -182,6 +185,8 @@ Revision loops are evaluator-driven. A verifier or reviewer rubric failure can s
 
 Every run writes an append-only event log under `<target-repo>/.case/<task-slug>/events/`. `ca watch <task-slug>` renders those events while a run is active.
 
+Every task carries `evidenceExpectations` — the concrete artifacts the verifier must produce. The orchestrator writes these based on the target repo's `evidenceStrategy` so the verifier knows what counts as proof up front.
+
 ## Agent Roles
 
 | Agent         | Responsibility                                                       | Does Not Do                         |
@@ -193,7 +198,7 @@ Every run writes an append-only event log under `<target-repo>/.case/<task-slug>
 | Closer        | Creates the PR after evidence gates pass                             | Implement or test                   |
 | Retrospective | Records learnings and proposes harness improvements                  | Edit target repo code               |
 
-¹ The orchestrator is TypeScript runtime code (`src/agent/orchestrator-session.ts`), not an LLM agent prompt like the others.
+¹ The orchestrator runs as an LLM agent session via `ca --agent`, or as TypeScript runtime code for direct `ca <issue>` dispatch.
 
 The key boundary is context isolation. Implementer context includes task details, playbooks, repo learnings, and revision feedback. Verifier context is intentionally fresher. Reviewer context is focused on the diff and principles.
 
@@ -207,6 +212,12 @@ Evidence markers live under the target repo's `.case/<task-slug>/` directory:
 
 The closer checks these markers before opening a PR. The point is not ceremony; it is making the PR auditable without trusting a chat transcript.
 
+Each repo declares an `evidenceStrategy` in `projects.json` that drives what the verifier produces:
+
+- `ui-screenshot`: Playwright before/after screenshots for user-facing UI changes.
+- `scenario-script`: a consumer script that exercises the specific user-facing scenario.
+- `test-output`: automated test output only (for libraries and non-UI code).
+
 ## Self-Improvement
 
 After a run, the retrospective agent should leave the harness smarter:
@@ -240,18 +251,15 @@ Priority:
 
 ## Repository Map
 
-Target repos are listed in `projects.json`.
+Target repos are listed in `~/.config/case/projects.json` (created by `ca init` and `ca onboard`). The schema is `projects.schema.json` in this repo.
 
-| Repo                   | Path                        | Purpose                               |
-| ---------------------- | --------------------------- | ------------------------------------- |
-| cli                    | `../cli/main`               | WorkOS CLI                            |
-| skills                 | `../skills`                 | WorkOS integration skills             |
-| authkit-session        | `../authkit-session`        | Framework-agnostic session management |
-| authkit-tanstack-start | `../authkit-tanstack-start` | AuthKit TanStack Start SDK            |
-| authkit-nextjs         | `../authkit-nextjs`         | AuthKit Next.js SDK                   |
-| workos-node            | `../workos-node/main`       | WorkOS Node.js SDK                    |
+Add a repo with:
+
+```bash
+ca onboard <path>
+```
 
-Add a repo by updating `projects.json`, adding any needed architecture notes under `docs/architecture/`, and verifying with:
+Then add any needed architecture notes under `docs/architecture/` and verify with:
 
 ```bash
 ca check --repo <name>