Skip to content

feat: [ENG-2238] AutoHarness V2 HarnessContext + module contract#494

Merged
danhdoan merged 2 commits intoproj/autoharness-v2from
feat/ENG-2238
Apr 21, 2026
Merged

feat: [ENG-2238] AutoHarness V2 HarnessContext + module contract#494
danhdoan merged 2 commits intoproj/autoharness-v2from
feat/ENG-2238

Conversation

@danhdoan
Copy link
Copy Markdown
Collaborator

Summary

  • Problem: Phase 4's templates (Task 4.3) import HarnessContext to declare their function signatures. Without the type surface shipped, can't write a single line of Phase 4 implementation that compiles. This PR is the seam.
  • Why it matters: Same role as Phase 1.1's in-memory test double — the first PR of a phase whose entire purpose is unblocking the other engineer.
  • What changed: Added 5 new type declarations (HarnessContextEnv, HarnessContextTools, HarnessContext, HarnessModule, HarnessLoadResult) to src/agent/core/domain/harness/types.ts. New barrel at src/agent/core/domain/harness/index.ts. 8-scenario type-level test suite.
  • What did NOT change (scope boundary): No runtime code. No HarnessModuleBuilder (Task 3.2). No SandboxService modifications (Task 3.3). No template content (Phase 4). No consumer imports these types yet.

Type of change

  • New feature
  • Bug fix
  • Refactor (no behavior change)
  • Documentation
  • Test
  • Chore (build, dependencies, CI)

Scope (select all touched areas)

  • TUI / REPL
  • Agent / Tools
  • LLM Providers
  • Server / Daemon
  • Shared (constants, types, transport events)
  • CLI Commands (oclif)
  • Hub / Connectors
  • Cloud Sync
  • CI/CD / Infra

Linked issues

  • Closes ENG-2234
  • Related ENG-2228 (Phase 1 Task 1.4 — service-initializer wiring, merged — closed Phase 1)
  • Related ENG-2222 (ENG that moved harness types out of infra/ — the layering discipline this PR preserves)
  • Unblocks Phase 4 Task 4.3

Root cause (bug fixes only, otherwise write N/A)

  • Root cause: N/A
  • Why this was not caught earlier: N/A

Test plan

  • Coverage added:
    • Unit test
    • Integration test
    • Manual verification only
  • Test file(s): test/unit/agent/harness/types-context.test.ts
  • Key scenario(s) covered:
    • HarnessModule.meta required; curate/query optional (2 tests, one positive + one @ts-expect-error negative)
    • HarnessContext / HarnessContextEnv / HarnessContextTools readonly enforcement (3 tests, all @ts-expect-error-based — compile-time invariants)
    • HarnessLoadResult discriminated-union narrowing (3 tests: loaded: true exposes module+version; loaded: false exposes reason; failure variant's keyof excludes module)

User-visible changes

None. Types-only addition; no runtime behavior. harness.enabled = false remains the public default.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording

Before this PR, the test file didn't exist and the types weren't declared. After: all 8 pass. Full suite: 6665 passing / 0 failing.

$ npm test -- --grep "HarnessContext \+ module contract"
  HarnessContext + module contract
    HarnessModule
      ✔ requires `meta` and makes `curate` / `query` optional
      ✔ rejects a module missing `meta` at compile time
    HarnessContext readonly enforcement (compile-time only)
      ✔ rejects reassigning `ctx.env`
      ✔ rejects reassigning `env.workingDirectory`
      ✔ rejects reassigning `tools.curate`
    HarnessLoadResult discriminated union
      ✔ narrows to `module` + `version` when `loaded === true`
      ✔ narrows to `reason` when `loaded === false`
      ✔ does NOT expose `module` on the `loaded: false` variant
  8 passing (15ms)

Checklist

  • Tests added or updated and passing (npm test) — 8 new tests; full suite 6665 passing / 0 failing
  • Lint passes (npm run lint) — 0 errors, 224 pre-existing warnings
  • Type check passes (npm run typecheck) — exit=0
  • Build succeeds (npm run build) — exit=0
  • Commits follow Conventional Commits format — feat: [ENG-2234] ...
  • Documentation updated (if applicable) — task doc at features/autoharness-v2/tasks/phase_3/task_01-harness-module-contract.md (research repo) describes the scope; the 2-methods-vs-4 deviation is flagged in "Notes for reviewers" for post-merge task-doc tightening (same pattern as Tasks 0.5/0.6/1.2/1.3/1.4)
  • No breaking changes (or clearly documented above) — purely additive to an already-shipped types module
  • Branch is up to date with main — targets proj/autoharness-v2, not main

Risks and mitigations

  • Risk: HarnessContextTools ships with 2 methods (curate, readFile), not the 4 the task doc sketched. If Phase 4 Task 4.3's query template needs searchKnowledge on the context (likely), it'll fail to compile until the type is extended.

    • Mitigation: Extending is a one-line addition to HarnessContextTools. The prerequisite is moving SearchKnowledgeOptions / SearchKnowledgeResult (and for grep, GrepOptions) from src/agent/infra/sandbox/tools-sdk.ts into core/domain/ or core/interfaces/ — a small mechanical PR that preserves the layering discipline. When Phat reaches Task 4.3 and declares what his query template actually needs, we decide together whether to extend HarnessContextTools (my preference) or inline the call differently in the template.
  • Risk: Task doc (research repo) drafted the surface as 4 methods (curate, query, search, readFile) with invented type names (CurateArgs, SearchOpts, FileReadResult). Shipped uses real tools-SDK signatures. Anyone reading the task doc in isolation will expect 4 methods.

    • Mitigation: Per the established pattern (Tasks 0.5/0.6/1.2/1.3/1.4), tighten the task doc post-merge to reflect what shipped. The invented type names were always placeholders — the real signatures are what belong in the contract.
  • Risk: readonly is TypeScript-only. At runtime the context is mutable unless frozen.

    • Mitigation: Task 3.2's module builder is explicitly responsible for Object.freeze(ctx) before passing into VM evaluation. Called out in this PR's test file comments and in Task 3.2's task doc. A bug that skipped the freeze would leak only to harness code, not to outer sandbox — and the isolation integration test (Task 3.5) would catch real escapes.

Notes for reviewers

The HarnessContextTools surface decision is the only interesting design call in this PR. Task doc sketched 4 methods; shipped 2. The reasoning is in the JSDoc on HarnessContextTools itself — imports from infra/ would violate the layering we established in ENG-2222. If the team prefers the fuller surface, the fix is a separate PR moving SearchKnowledgeOptions / SearchKnowledgeResult / GrepOptions into core/ first. I recommend deferring that move until a real consumer (Phase 4 Task 4.3 or Phase 6 refinement) materializes, so we're not moving types speculatively.

readonly vs. Readonly<T> modifier choice. I used inline readonly on every field rather than wrapping the interface in Readonly<...>. Reason: property-level readonly composes better with discriminated unions and partial mocks in tests. The tradeoff is 5 extra characters per field declaration.

HarnessModule.query is typed Promise<unknown> — not Promise<QueryResult> as the task doc draft suggested. Reason: QueryResult doesn't exist in the codebase yet; the closest analog is SearchKnowledgeResult, which lives in infra and can't be imported into core per the layering rule. unknown is the safe placeholder — Phase 4 Task 4.3's query template will produce whatever shape matches its use case, and Phase 5's mode-selector / LLM surface layer-over the result when it matters. If tightening to a concrete type becomes useful, that's additive in a later PR.

The "does NOT expose module on loaded: false" test uses Extract<...> + keyof rather than an in-branch @ts-expect-error access. Reason: @ts-expect-error inside a narrowed-type branch is TS-version-sensitive — some versions allow "harmless" property access on narrowed unions, and a passing-by-accident test would mask real drift. The Extract<HarnessLoadResult, {loaded: false}>['keyof'] assertion is explicit and version-stable.

Related

  • Types this PR extends: src/agent/core/domain/harness/types.ts
  • Domain barrel: src/agent/core/domain/harness/index.ts (new)
  • Test file: test/unit/agent/harness/types-context.test.ts (new)
  • Task doc: features/autoharness-v2/tasks/phase_3/task_01-harness-module-contract.md (research repo)
  • Execution plan: features/autoharness-v2/execution-plan.md § Phase 3
  • Next (me): Phase 3 Task 3.2 — HarnessModuleBuilder

Phase 3 Task 3.1 — the seam that unblocks to start Phase 4.3
(templates) on day one. Ships the `HarnessContext` / `HarnessModule` /
`HarnessLoadResult` type surface so templates can be authored against
a stable contract before the module-builder (Task 3.2) and loader
(Task 3.3) exist.

Types added to `src/agent/core/domain/harness/types.ts`:
  - HarnessContextEnv (commandType, projectType, workingDirectory)
  - HarnessContextTools (curate, readFile — 2 methods for v1.0)
  - HarnessContext (abort, env, tools)
  - HarnessModule (meta required; curate/query optional)
  - HarnessLoadResult (discriminated union on `loaded`)

All fields `readonly`. Runtime freezing comes later in Task 3.2's
module builder via `Object.freeze`.

Shipped with 2 methods on `HarnessContextTools` instead of the 4
enumerated in the task doc draft. Reason: layering. `grep` and
`searchKnowledge` reference `GrepOptions` / `SearchKnowledgeOptions` /
`SearchKnowledgeResult` which currently live in
`src/agent/infra/sandbox/tools-sdk.ts`. Importing `infra/` types into
`core/domain/` would reintroduce the layering violation ENG-2222 fixed.
The 2-method surface covers Phase 4 pass-through curate templates;
extending is additive when a real consumer needs more (moving the
infra types to `core/` first is a prerequisite one-line PR).

New barrel `src/agent/core/domain/harness/index.ts` consolidates the
domain import surface.

8 type-level tests (using `expect-type`) cover the discriminated-union
narrowing + readonly enforcement + HarnessModule required/optional
shape. Full suite: 6665 passing, 0 failing.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 21, 2026

PR Review: feat: [ENG-2238] AutoHarness V2 HarnessContext + module contract

Tasks

  • Gather context (read changed files)
  • Review src/agent/core/domain/harness/types.ts
  • Review src/agent/core/domain/harness/index.ts
  • Review test/unit/agent/harness/types-context.test.ts
  • Check layering / import boundaries
  • Post final review summary

View job run

Comment thread src/agent/core/domain/harness/types.ts
Comment thread src/agent/core/domain/harness/types.ts
Comment thread test/unit/agent/harness/types-context.test.ts
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 21, 2026

Claude finished @danhdoan's task in 2m 28s —— View job


PR Review: feat: [ENG-2238] AutoHarness V2 HarnessContext + module contract

Tasks

  • Gather context (read changed files)
  • Review src/agent/core/domain/harness/types.ts
  • Review src/agent/core/domain/harness/index.ts
  • Review test/unit/agent/harness/types-context.test.ts
  • Check layering / import boundaries
  • Post final review summary

Summary

Score: 4.5 / 5

Clean, well-scoped PR. The type surface is minimal and defensible, the import discipline is solid, and the test approach (compile-time invariants via @ts-expect-error + expectTypeOf) is the right tool for this job. The main actionable item is using ValidatedHarnessVersion in HarnessLoadResult; the rest are minor ergonomics.


Layering / Import Boundaries ✅

New imports in types.ts:

  • from '../../interfaces/i-curate-service.js'core/interfaces/ → correct
  • from '../file-system/types.js'core/domain/ → correct

No infra/ imports. The layering discipline established in ENG-2222 is fully preserved.


src/agent/core/domain/harness/types.ts

HarnessLoadResult uses HarnessVersion instead of ValidatedHarnessVersion (line 264) — this is the one change I'd call out as worth fixing before merge. HarnessVersion is z.input<...> — the raw, uncoerced type. A successfully-loaded harness version came from the store and passed validation; consumers of the loaded: true branch should be able to rely on createdAt: number, heuristic ∈ [0,1], etc. ValidatedHarnessVersion already exists in the same file. Fix this →

HarnessContextEnv.commandType inline union (line 195) — minor ergonomics: 'chat' | 'curate' | 'query' is meaningful enough to deserve a named HarnessCommandType type. HarnessMetaSchema stores commandType as z.string(), creating a silent asymmetry — if a new command type like 'search' is later added, HarnessContextEnv rejects it at compile time while the meta schema accepts it. Extracting the type to one place makes the discrepancy visible when it's time to extend.

HarnessModule.meta returns HarnessMeta (input type) (line 246–250) — this is intentionally correct: harness code produces a raw declaration that the module builder (Task 3.2) validates. A brief inline note on meta clarifying that callers must validate via HarnessMetaSchema.parse would reduce confusion for future readers. Optional given the class-level JSDoc already explains this.

Design decisions confirmed correct:

  • readonly on every HarnessContext / HarnessContextEnv / HarnessContextTools field — right choice; composes cleanly with discriminated unions and test mocks
  • HarnessModule.query returning Promise<unknown> — intentional placeholder until Phase 4/5 materialize a concrete shape; documented in JSDoc
  • HarnessContextTools scoped to curate + readFile — deferring searchKnowledge/grep until a real consumer materializes is the right outside-in call

src/agent/core/domain/harness/index.ts

Simple barrel export * from './types.js'. Correct pattern; no issues.


test/unit/agent/harness/types-context.test.ts

Well-structured. All 8 tests are testing the right invariants.

Double as cast in tools construction (line 75) — (async () => ({}) as CurateResult) as HarnessContextTools['curate'] is safe for the readonly test being run, but the pattern silently survives future signature changes to curate. Direct object construction (matching the HarnessModule test pattern above) would catch signature drift. Low priority but worth tidying.

{} as HarnessVersion at line 97 — same observation: an empty-object cast for version in the loaded: true test won't catch new required fields added to HarnessVersion. Given HarnessVersion is already defined in this file and relatively stable, acceptable as-is.

Extract<HarnessLoadResult, {loaded: false}> + keyof assertion (line 121–122) — this is the right approach. Noted in the PR description and worth calling out as a good pattern: avoids the @ts-expect-error inside a runtime-narrowing block that can silently pass due to TS lenience across versions.

Comment thread src/agent/core/domain/harness/types.ts
Comment thread src/agent/core/domain/harness/types.ts
Comment thread src/agent/core/domain/harness/types.ts
Comment thread test/unit/agent/harness/types-context.test.ts
@danhdoan danhdoan merged commit 7618b9a into proj/autoharness-v2 Apr 21, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants