feat: add Mode E (Harness) — build-and-verify with multi-role evaluation by bohemianpan · Pull Request #1 · iamtouchskyer/opc

bohemianpan · 2026-03-31T20:43:58Z

Summary

- Adds Mode E (Harness) to OPC: build code with the harness, then evaluate it with OPC specialist roles
- Only 6 lines added to skill.md: a triage row and a pointer to harness/mode-e.md
- Harness lives in harness/ as independent files; update it by replacing them
- harness/mode-e.md is the bridge: it defines how OPC roles replace the harness's single evaluator

How it works

1. Implementer builds code (harness pipeline)
2. OPC roles evaluate from specialist angles (security, tester, PM, etc.)
3. Coordinator synthesizes into PASS/ITERATE/FAIL
4. Fix-and-retest loop until quality passes (10-round cap)
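
The loop above could be sketched roughly as follows. This is an illustration only, not the code in this PR: `RoleReport`, `synthesize`, `runHarness`, and the "worst verdict wins" rule are assumptions. Only the three verdicts and the 10-round cap come from the description.

```typescript
// Hypothetical sketch of the Mode E loop; every name here is illustrative.
type Verdict = "PASS" | "ITERATE" | "FAIL";

interface RoleReport {
  role: string; // e.g. "security", "tester", "pm"
  verdict: Verdict;
  findings: string[];
}

const MAX_ROUNDS = 10; // the 10-round cap from the description

// Assumed synthesis rule: the worst verdict across all roles wins.
// An empty report list yields PASS here; a real coordinator may want to reject it.
function synthesize(reports: RoleReport[]): Verdict {
  if (reports.some((r) => r.verdict === "FAIL")) return "FAIL";
  if (reports.some((r) => r.verdict === "ITERATE")) return "ITERATE";
  return "PASS";
}

function runHarness(
  build: (round: number, feedback: string[]) => string,
  evaluate: (artifact: string, round: number) => RoleReport[],
): { verdict: Verdict; rounds: number } {
  let feedback: string[] = [];
  let verdict: Verdict = "ITERATE";
  let round = 0;
  while (round < MAX_ROUNDS) {
    round += 1;
    const artifact = build(round, feedback); // build on round 1, fix afterwards
    const reports = evaluate(artifact, round); // one report per OPC role
    verdict = synthesize(reports);
    if (verdict !== "ITERATE") break; // PASS or FAIL ends the loop
    // Feed every role's findings into the next fix round.
    feedback = [];
    for (const report of reports) feedback = feedback.concat(report.findings);
  }
  // If the cap is hit, the last verdict (ITERATE) is returned as-is.
  return { verdict, rounds: round };
}
```

In this sketch a run that never converges stops after round 10 with an ITERATE verdict; how the real coordinator reports a capped run is not stated in the description.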

Files

harness/
├── SKILL.md                 # Standard harness (update independently)
├── implementer-prompt.md    # Build/Fix/Polish modes
├── evaluator-prompt.md      # Standalone evaluator (used as reference)
├── handoff-template.md      # Structured handoff between build and eval
└── mode-e.md                # OPC bridge: multi-role eval + verdict synthesis

Usage

/opc harness <task>

Test plan

- /opc harness triggers Mode E triage
- Implementer builds code and writes .harness/ state
- OPC roles receive Mode E evaluator template with their persona
- Coordinator synthesizes PASS/ITERATE/FAIL correctly
- Existing modes A-D unaffected (only 6 lines changed in skill.md)

Integrates the harness build-and-verify pipeline as OPC's 5th mode. The implementer builds code, then OPC specialist roles evaluate it from multiple angles. Coordinator synthesizes into PASS/ITERATE/FAIL and drives fix-and-retest iteration until quality passes.

- Mode E triage with signal keywords
- Two-phase dispatch: Build then Multi-Role Evaluation
- Mode E evaluator template: OPC role persona + harness evaluation framework
- Verdict synthesis with iteration loop (10-round cap)
- Per-role report format with iteration history
- JSON schema extensions for harness state
- Implementer and handoff prompt templates in prompts/

Adds harness build-and-verify pipeline as OPC Mode E. The harness skill lives in harness/ as independent files that can be updated by replacing them. Only 6 lines added to skill.md (triage row + pointer to harness/mode-e.md). harness/mode-e.md bridges the two systems: harness implementer builds code, OPC roles evaluate from multiple specialist angles, coordinator synthesizes into PASS/ITERATE/FAIL.

…dd role tag check

- Escalate identical heading from warning to error (same heading = same agent)
- Add role/agent/reviewer tag extraction and comparison
- Eval files with identical Role: tags now produce hard error

Gap #1 of 10: review independence enforcement

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
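
A rough sketch of what such a role tag comparison might look like. The commit's actual implementation is not shown here, so the tag line format (`Role: <name>` at the start of a line), the `EvalFile` shape, and both function names are assumptions.

```typescript
// Hypothetical sketch: flag eval files that declare the same role/agent/reviewer tag.
interface EvalFile {
  file: string; // illustrative file name
  text: string; // full eval file contents
}

// Assumed tag format: a line such as "Role: security" (case-insensitive key).
function extractRoleTag(text: string): string | null {
  const match = /^\s*(?:role|agent|reviewer)\s*:\s*(.+?)\s*$/im.exec(text);
  return match ? match[1].toLowerCase() : null;
}

// Returns one error message per file whose tag was already claimed by an earlier file.
function findDuplicateRoles(files: EvalFile[]): string[] {
  const seen: Record<string, string> = {};
  const errors: string[] = [];
  for (const { file, text } of files) {
    const tag = extractRoleTag(text);
    if (tag === null) continue; // untagged files are ignored in this sketch
    if (Object.prototype.hasOwnProperty.call(seen, tag)) {
      errors.push("ERROR: " + file + " and " + seen[tag] + " share role tag '" + tag + "'");
    } else {
      seen[tag] = file;
    }
  }
  return errors;
}
```

Tags are lowercased before comparison, so `Role: Security` and `role: security` count as the same reviewer; whether the real check is case-sensitive is not stated in the commit message.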

Addresses 3 ITERATE findings from U1.6r contract + semantics reviewers:

1. fireArtifactEmit: recordSuccess was unconditionally resetting _failStreak after per-item write failures, so circuit-breaker would never trip on persistent write failures. Track anyItemFailed and only call recordSuccess when every item in the call succeeded. (semantics F1, contract #2)
2. fireArtifactEmit: accept ArrayBufferView (Uint8Array, DataView) in addition to string / Buffer. Modern APIs (crypto.subtle, TextEncoder, Playwright) commonly return Uint8Array; the tight Buffer.isBuffer check was silently dropping them with a misleading WARN. (semantics F2)
3. cmdExtensionArtifact: add nodeCapabilities to stdout JSON for consistency with cmdExtensionVerdict. (contract #1)
4. CONTRIBUTING.md: document executeRun + artifactEmit hooks with sample skeleton + hook surface summary table. (contract #3)

Regression tests: 4 new tests (Uint8Array accepted, _failStreak persists across calls, success reset is all-or-nothing, CLI JSON includes nodeCapabilities). Total 118/118 extension tests, 22/22 suite files green.
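
Fixes 1 and 2 can be illustrated with a small sketch. It is not the project's `fireArtifactEmit`: the `CircuitBreaker` class, `emitArtifacts`, `isWritablePayload`, and the threshold are hypothetical names chosen for the example. The one library fact relied on is that `ArrayBuffer.isView` returns true for typed arrays and `DataView` (and therefore for Node's `Buffer`, which is a `Uint8Array` subclass).

```typescript
// Hypothetical sketch of the all-or-nothing success reset and the wider payload check.
class CircuitBreaker {
  failStreak = 0;
  threshold: number;
  constructor(threshold: number) {
    this.threshold = threshold;
  }
  recordSuccess(): void {
    this.failStreak = 0;
  }
  recordFailure(): void {
    this.failStreak += 1;
  }
  get tripped(): boolean {
    return this.failStreak >= this.threshold;
  }
}

type Payload = string | ArrayBufferView;

// Accepts strings plus any ArrayBufferView (Uint8Array, DataView, ...).
function isWritablePayload(value: unknown): value is Payload {
  return typeof value === "string" || ArrayBuffer.isView(value);
}

function emitArtifacts(
  items: Payload[],
  write: (item: Payload) => void,
  breaker: CircuitBreaker,
): void {
  let anyItemFailed = false;
  for (const item of items) {
    try {
      write(item);
    } catch {
      anyItemFailed = true;
      breaker.recordFailure();
    }
  }
  // Reset the streak only when every item in this call succeeded.
  if (!anyItemFailed) breaker.recordSuccess();
}
```

With an unconditional `recordSuccess()` after the loop, the streak would drop back to zero after every call and the breaker could never trip, which is the behaviour fix 1 describes.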

iamtouchskyer · 2026-04-26T14:30:14Z

Closing — superseded by #2 (already merged). This PR also has merge conflicts.

Dazhen Pan added 4 commits March 31, 2026 13:39

fix: add Mode E cross-references in JSON schema and timeline rules

508fdd8

Remove leftover prompts/ directory from first attempt

76de7db

iamtouchskyer closed this Apr 26, 2026


bohemianpan wants to merge 4 commits into iamtouchskyer:main from bohemianpan:feature/mode-e-harness


2 participants
