
feat(evidence): repo-aware evidence strategy, ca onboard, and project-level config #14

Merged: nicknisi merged 8 commits into main from feat/evidence-strategy on May 19, 2026


@nicknisi
Member

Summary

  • Evidence strategy per repo: replaces the broken type: "app" | "library" field with an explicit evidenceStrategy enum (ui-screenshot, scenario-script, test-output). Each repo in projects.json declares what kind of verification evidence it can produce.
  • Required evidence expectations: evidenceExpectations is now required on every task. The orchestrator prompt includes a strategy-to-expectations lookup table with concrete examples. The verifier checks its evidence against these task-specific expectations, not just the generic rubric.
  • ca onboard <path>: probes a repo for package manager, language, git remote, and scripts. Infers evidence strategy from framework deps and project structure. Writes the entry to projects.json and runs ca bootstrap to validate.
  • Project-level credentials and verification notes: credentials (path to env file, defaults to ~/.config/case/credentials) and verificationNotes (free-text verifier hints) are new fields on ProjectEntry. The verifier reads these from context instead of hardcoding WorkOS-specific paths and SDK examples.
  • Removed type field: no deprecation, just gone. All 6 repos now have explicit evidenceStrategy.
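For readers who have not seen the schema, here is a minimal sketch of what an entry might look like after this change. Only evidenceStrategy, credentials, verificationNotes, and the ProjectEntry name come from this PR; the name and path fields, the example repo, and the isEvidenceStrategy helper are assumptions for illustration.

```typescript
// The three strategies named in this PR.
type EvidenceStrategy = "ui-screenshot" | "scenario-script" | "test-output";

// Sketch of a projects.json entry. `name` and `path` are assumed fields.
interface ProjectEntry {
  name: string;
  path: string;
  evidenceStrategy: EvidenceStrategy; // required; replaces the old `type` field
  credentials?: string; // path to an env file
  verificationNotes?: string; // free-text hints for the verifier
}

const STRATEGIES: readonly string[] = ["ui-screenshot", "scenario-script", "test-output"];

// Hypothetical guard for validating a value read from JSON.
function isEvidenceStrategy(value: unknown): value is EvidenceStrategy {
  return typeof value === "string" && STRATEGIES.includes(value);
}

// Hypothetical entry; "my-lib" is not one of the 6 repos in this PR.
const exampleEntry: ProjectEntry = {
  name: "my-lib",
  path: "../my-lib",
  evidenceStrategy: "scenario-script",
  verificationNotes: "Exercise the public client with a short script.",
};
```

Under this sketch, the old values "app" and "library" would fail the guard, which matches the "no deprecation, just gone" stance above.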

What was tested

  • Full test suite: 471 pass, 0 fail
  • Typecheck: clean
  • New tests for ca onboard evidence strategy inference (Next.js → ui-screenshot, library with build → scenario-script, test-only → test-output)
  • New test assertions for evidence expectations in task markdown
  • Existing assembler tests updated for evidenceStrategy replacing type
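The three inference cases listed above could be expressed roughly as follows. This is a sketch under assumptions, not the code in this PR: the function name, the PackageJsonLike shape, and the exact checks (a "next" dependency, a "build" script) are guesses at what "framework deps and project structure" means.

```typescript
type EvidenceStrategy = "ui-screenshot" | "scenario-script" | "test-output";

// Minimal slice of package.json that the sketch looks at.
interface PackageJsonLike {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
  scripts?: Record<string, string>;
}

// Hypothetical inference: UI framework first, then build script, else tests only.
function inferEvidenceStrategy(pkg: PackageJsonLike): EvidenceStrategy {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  if ("next" in deps) return "ui-screenshot"; // Next.js app: there is a UI to screenshot
  if (pkg.scripts?.build !== undefined) return "scenario-script"; // library with a build
  return "test-output"; // test-only repo
}
```

The order matters in this sketch: a Next.js app usually also has a build script, so the framework check has to run first.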

Follow-ups

  • Add verificationNotes to remaining WorkOS repos (authkit-session, authkit-nextjs, etc.)
  • Consider per-repo credential files (~/.config/case/<repo>.env) for isolation
  • ca onboard --strategy <override> flag for manual strategy selection

nicknisi added 3 commits May 18, 2026 15:52
Evidence quality is the trust bottleneck — the orchestrator was writing
vague "verify it works" expectations (or none at all) regardless of
whether the target repo has a UI to screenshot or is a pure library.

Add `evidenceStrategy` field to projects.json (ui-screenshot,
scenario-script, test-output) so each repo declares what kind of proof
is meaningful. The orchestrator prompt now includes a lookup table
mapping strategies to concrete, falsifiable evidence expectations.
`evidenceExpectations` is required on all tasks — the verifier checks
its work against these specific expectations, not just the generic
rubric.
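A rough idea of what a strategy-to-expectations lookup table could look like when rendered into a prompt. The wording of each expectation and the renderLookupTable helper are hypothetical; the actual table lives in the orchestrator prompt and is not shown in this PR page.

```typescript
type EvidenceStrategy = "ui-screenshot" | "scenario-script" | "test-output";

// Hypothetical example expectations, one per strategy.
const EXPECTATION_HINTS: Record<EvidenceStrategy, string> = {
  "ui-screenshot": "Screenshot of the affected page showing the changed state",
  "scenario-script": "Script that calls the changed API, with its printed output",
  "test-output": "Named tests that fail before the change and pass after",
};

// Render the table as markdown rows for inclusion in a prompt.
function renderLookupTable(): string {
  const rows = Object.entries(EXPECTATION_HINTS).map(
    ([strategy, hint]) => `| ${strategy} | ${hint} |`,
  );
  return ["| Strategy | Example expectation |", "| --- | --- |", ...rows].join("\n");
}
```

Typing the table as Record<EvidenceStrategy, string> means a new strategy cannot be added to the union without also supplying an expectation for it.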

The binary type field (app|library) was broken: most repos didn't set
it, and the inference was wrong. Replace it entirely with the explicit
evidenceStrategy field (now required). All 6 repos get explicit
strategies; the two app repos that were missing it get scenario-script.

Add `ca onboard <path>` — probes a repo for package manager, language,
git remote, and scripts, infers the evidence strategy, writes the entry
to projects.json, then runs bootstrap to validate. This prevents future
repos from being added without an evidence contract.

Also updates the verifier to branch on evidenceStrategy instead of Repo
type, and adds a test-output fast path for repos that only need
automated evidence.
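Branching on evidenceStrategy with a test-output fast path might look something like this. The step names and the verificationSteps function are invented for illustration; the real verifier is driven by a prompt, so treat this only as a picture of the control flow.

```typescript
type EvidenceStrategy = "ui-screenshot" | "scenario-script" | "test-output";

// Hypothetical step names; the fast path skips everything except the test run.
function verificationSteps(strategy: EvidenceStrategy): string[] {
  switch (strategy) {
    case "test-output":
      return ["run-tests"]; // fast path: automated evidence only
    case "scenario-script":
      return ["run-tests", "run-scenario-script"];
    case "ui-screenshot":
      return ["run-tests", "start-app", "capture-screenshot"];
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a strategy is added.
      const unreachable: never = strategy;
      throw new Error(`Unknown strategy: ${String(unreachable)}`);
    }
  }
}
```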

The verifier had hardcoded ~/.config/case/credentials and WorkOS SDK
examples baked into the prompt. This couples the pipeline to WorkOS
repos — any non-WorkOS project would hit wrong credential paths and
irrelevant code examples.

Add credentials and verificationNotes fields to ProjectEntry. The
assembler passes both to the verifier context. The verifier prompt now
reads the credentials path and verification hints from context instead
of hardcoding them. WorkOS-specific SDK patterns move to
verificationNotes on the workos-node entry in projects.json.

This makes the pipeline usable for non-WorkOS repos without agent
prompt changes.
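A minimal sketch of how the assembler might hand these two fields to the verifier. The VerifierContext shape and buildVerifierContext are assumptions; the default path is the one stated in the PR summary.

```typescript
// Default from the PR summary; used when an entry does not set `credentials`.
const DEFAULT_CREDENTIALS_PATH = "~/.config/case/credentials";

// Only the two fields added in this commit are modeled here.
interface ProjectEntryFields {
  credentials?: string;
  verificationNotes?: string;
}

// Hypothetical shape of what the verifier prompt reads from context.
interface VerifierContext {
  credentialsPath: string;
  verificationNotes: string;
}

function buildVerifierContext(entry: ProjectEntryFields): VerifierContext {
  return {
    credentialsPath: entry.credentials ?? DEFAULT_CREDENTIALS_PATH,
    verificationNotes: entry.verificationNotes ?? "",
  };
}
```

With this arrangement a non-WorkOS repo only needs its own projects.json entry; nothing in the prompt has to change.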

devin-ai-integration[bot]: This comment was marked as resolved.

nicknisi added 2 commits May 18, 2026 16:14
…m repo

- README: remove WorkOS-specific language, add evidence strategy docs,
  update CLI reference with ca onboard, fix repository map to reference
  ~/.config/case/projects.json
- AGENTS.md: fix workos-node stack (npm not pnpm), reference evidence
  strategy, remove WorkOS OSS from title
- CLAUDE.md: remove stale tasks/done/ and node one-liner, add ca onboard,
  remove projects.json from structure (now user-local)
- CONTEXT.md: fix phase list (remove non-existent approve), add evidence
  strategy term
- tasks/README.md: evidence expectations now required
- .gitignore: remove stale root-level markers, gitignore projects.json
  (user-local, lives in ~/.config/case/)
- projects.json removed from git tracking (stays on disk)
- Add proposed-amendments/README.md explaining historical status

Bun lockfile detection was setting packageManager to 'pnpm' instead of
'bun'. Also added 'bun' to the schema's packageManager enum.

Removed dead ternary that returned 'typescript' in both branches —
replaced with a direct assignment since all onboarded repos are
currently treated as TypeScript.
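Lockfile-based detection of the kind fixed here can be sketched as a small lookup. The function name, the lockfile list, and the npm fallback are assumptions; the lockfile names themselves are the standard ones for each tool.

```typescript
type PackageManager = "npm" | "pnpm" | "bun";

// Checked in order; the first lockfile found wins.
const LOCKFILES: ReadonlyArray<readonly [string, PackageManager]> = [
  ["bun.lock", "bun"],
  ["bun.lockb", "bun"],
  ["pnpm-lock.yaml", "pnpm"],
  ["package-lock.json", "npm"],
];

// `files` is the list of file names at the repo root.
function detectPackageManager(files: readonly string[]): PackageManager {
  for (const [lockfile, manager] of LOCKFILES) {
    if (files.includes(lockfile)) return manager;
  }
  return "npm"; // assumed fallback when no lockfile is present
}
```

Keeping the mapping in one table makes the class of bug described above (a Bun lockfile mapped to 'pnpm') easy to spot in review.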

devin-ai-integration[bot]: This comment was marked as resolved.

- bun needs `bun run <script>` like npm — `bun build` invokes Bun's
  built-in bundler, not the package.json script
- test command is always set (required by schema) even when the repo
  has no test script in package.json
- orchestrator prompt referenced ${caseRoot}/projects.json which no
  longer exists — updated to ~/.config/case/projects.json
- removed "WorkOS OSS" from orchestrator prompt
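The first two points above could be captured in a helper like this. It is a sketch under assumptions: scriptCommand, commandsFor, and the RepoCommands shape are hypothetical, and the choice to emit a run-form test command when no test script exists is a guess at how "always set" is satisfied.

```typescript
type PackageManager = "npm" | "pnpm" | "bun";

// npm and bun need the explicit `run` form; `bun build` would invoke Bun's bundler.
function scriptCommand(manager: PackageManager, script: string): string {
  return manager === "npm" || manager === "bun"
    ? `${manager} run ${script}`
    : `${manager} ${script}`;
}

interface RepoCommands {
  test: string; // required by the schema, so always populated
  build?: string;
}

// `scripts` is the scripts map from package.json.
function commandsFor(manager: PackageManager, scripts: Record<string, string>): RepoCommands {
  const commands: RepoCommands = { test: scriptCommand(manager, "test") };
  if (scripts.build !== undefined) commands.build = scriptCommand(manager, "build");
  return commands;
}
```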

devin-ai-integration[bot]: This comment was marked as resolved.

First-time users who run ca init then ca onboard would hit a throw from
loadProjectsManifest since no projects.json exists yet. Now the onboard
handler catches the error and creates an empty manifest at
~/.config/case/projects.json before proceeding.

@devin-ai-integration (bot, Contributor) left a comment

Devin Review found 2 new potential issues.

⚠️ 1 issue in files not directly in the diff

⚠️ Evidence Expectations section silently omitted for empty string despite being a required field (src/entry/task-factory.ts:142-144)

The evidenceExpectations field was changed from optional to required in TaskCreateRequest (src/types.ts:344), and the tasks README now marks ## Evidence Expectations as "Required". The verifier prompt explicitly says to "Read the ## Evidence Expectations section." However, buildTaskMarkdown at line 142 still uses a truthiness check (if (request.evidenceExpectations)), which would silently omit the section for an empty string ''. All current callers provide non-empty values, but the guard is inconsistent with the field being required and could cause confusing agent behavior if a future caller passes an empty string.
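A small sketch of why the truthiness guard drops an empty string, next to an explicit undefined comparison. The evidenceSection function and its strict flag are hypothetical and exist only to show the two guards side by side; they are not part of buildTaskMarkdown.

```typescript
// Returns the markdown lines for the section, or nothing if the guard rejects the value.
function evidenceSection(evidenceExpectations: string | undefined, strict: boolean): string[] {
  const include = strict
    ? evidenceExpectations !== undefined // explicit check: '' still counts as provided
    : Boolean(evidenceExpectations); // truthiness check: '' is falsy and gets dropped
  return include ? ["## Evidence Expectations", "", evidenceExpectations ?? "", ""] : [];
}
```

With the truthiness guard, evidenceSection('', false) yields no lines at all, so the verifier would find no section to read. With the explicit check, the heading is still emitted and the empty body is at least visible.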

View 10 additional findings in Devin Review.

Comment thread src/commands/onboard.ts Outdated
await mkdir(dirname(path), { recursive: true });
await Bun.write(path, JSON.stringify({ $schema: './projects.schema.json', repos: [] }, null, 2) + '\n');
process.stdout.write(`Created ${path}\n`);
return { repos: [], path, repoBasePath: dataDir };

🔴 loadOrCreateManifest uses dataDir as repoBasePath but loadProjectsManifest uses caseRoot

When loadOrCreateManifest falls through to the catch path (first-time onboard, no existing projects.json), it sets repoBasePath: dataDir (~/.config/case/). The probeRepo function then computes relative repo paths from this base via relative(dataDir, absPath). However, when this same file is later loaded by loadProjectsManifest (src/config.ts:77), the repoBasePath is set to caseRoot (the case checkout directory) for non-embedded mode. Since dataDir ≠ caseRoot, stored relative paths resolve to wrong locations in all subsequent commands (ca check, ca bootstrap, pipeline runs, etc.).

Concrete example of path mismatch
  • dataDir = ~/.config/case/
  • caseRoot = /home/user/repos/case/
  • Onboarded repo at /home/user/repos/my-app
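  • During onboard: relative('~/.config/case/', '/home/user/repos/my-app') → ../../repos/my-app
  • During load: resolve('/home/user/repos/case/', '../../repos/my-app') → /home/user/repos/my-app, which is correct here only because both bases happen to sit two levels below /home/user; when caseRoot lives at any other depth, the stored path resolves to the wrong location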
Prompt for agents
In src/commands/onboard.ts, the loadOrCreateManifest function's catch path (line 115) sets repoBasePath to dataDir. This is inconsistent with how loadProjectsManifest in src/config.ts resolves repoBasePath when loading the same file later (line 77: isEmbeddedPackageRoot(caseRoot) ? dirname(path) : caseRoot).

The fix: import isEmbeddedPackageRoot from paths.ts and change line 115 to use the same logic:
  return { repos: [], path, repoBasePath: isEmbeddedPackageRoot(caseRoot) ? dataDir : caseRoot };

This ensures the repoBasePath used during onboard matches what loadProjectsManifest will use when reading the file back. The caseRoot parameter is already available in scope (passed to loadOrCreateManifest).
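The base-path dependence described in this thread can be reproduced with Node's path module. This is an illustrative sketch: the home directory is written out in full (path functions do not expand ~), and the /home/user/code/oss/case checkout is a hypothetical location chosen to sit at a different depth than dataDir.

```typescript
import { posix } from "node:path";

const repo = "/home/user/repos/my-app";
const dataDir = "/home/user/.config/case";

// Relative path stored at onboard time, computed against dataDir.
const stored = posix.relative(dataDir, repo);

// Resolving against the same base round-trips to the original repo path.
const sameBase = posix.resolve(dataDir, stored);

// A caseRoot that happens to sit at the same depth still lands on the repo.
const sameDepth = posix.resolve("/home/user/repos/case", stored);

// Hypothetical caseRoot at a different depth: the stored path now points elsewhere.
const otherDepth = posix.resolve("/home/user/code/oss/case", stored);

console.log({ stored, sameBase, sameDepth, otherDepth });
```

A stored relative path is only meaningful against the base it was computed from, which is why the write side and the read side need to agree on repoBasePath.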

- loadOrCreateManifest now uses isEmbeddedPackageRoot to match
  loadProjectsManifest's repoBasePath logic, preventing path
  mismatches between onboard and subsequent commands
- buildTaskMarkdown uses !== undefined instead of truthiness check
  for required evidenceExpectations field, preventing silent
  omission on empty string

Co-Authored-By: nick.nisi@workos.com <nick.nisi@workos.com>
Comment thread src/commands/onboard.ts
await mkdir(dirname(path), { recursive: true });
await Bun.write(path, JSON.stringify({ $schema: './projects.schema.json', repos: [] }, null, 2) + '\n');
process.stdout.write(`Created ${path}\n`);
return { repos: [], path, repoBasePath: isEmbeddedPackageRoot(caseRoot) ? dataDir : caseRoot };

Fixed in 3f8bd51: loadOrCreateManifest now uses isEmbeddedPackageRoot(caseRoot) ? dataDir : caseRoot to match loadProjectsManifest's repoBasePath logic. This ensures relative repo paths computed during onboard resolve correctly in all subsequent commands.

Comment thread src/entry/task-factory.ts
    lines.push('## Edge Cases', '', request.edgeCases, '');
  }
- if (request.evidenceExpectations) {
+ if (request.evidenceExpectations !== undefined) {

Fixed in 3f8bd51: Changed from truthiness check (if (request.evidenceExpectations)) to if (request.evidenceExpectations !== undefined) so that an empty string no longer silently omits the required section.

@nicknisi merged commit 33fef0f into main on May 19, 2026
2 checks passed