Context
Verifier is referenced in LoopStrategy::SelfVerifying in the Harness (#3):
SelfVerifying {
    verifier: Box<dyn Verifier>,
    evaluator_harness: Arc<dyn Harness>,
}
The Verifier is distinct from CompletionCheck (#43). CompletionCheck answers "is the task done?" Verifier answers "is what was produced correct?" — it is the oracle that the SelfVerifying loop uses to decide whether the evaluator's verdict should halt the build loop or continue it.
Verifier is currently a stub in the harness module, tagged // SPEC: full trait lives in this issue. The SelfVerifying strategy returns HaltReason::StrategyNotYetImplemented until both this trait and #43 are implemented.
The SelfVerifying Loop Pattern
SelfVerifying loop:
    // Build phase: standard ReAct loop until agent claims done
    run_standard_loop(context) → build_result
    // Evaluate phase: separate evaluator harness
    // Read-only sandbox, fresh session, explicit evaluator role chunk
    // Default-FAIL contract: evaluator cannot be biased by watching the build
    evaluator_harness.run(eval_task) → eval_result
    // Verifier decides what to do with the evaluator's verdict
    match verifier.verify(VerifierInput { build_result, eval_result, workspace, iteration }):
        Passed → HaltSuccess
        Failed { reason } → inject reason into build context, continue build loop
The Verifier sits between the evaluator harness output and the build loop decision. It translates the evaluator's RunResult into an actionable verdict.
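The loop pattern above can be sketched as a small driver. This is a minimal, synchronous sketch under stated assumptions, not the harness implementation: the build, evaluate, and verify phases are plain closures over strings, and LoopOutcome and self_verifying_loop are hypothetical names introduced here. Only VerifierVerdict and the max_iterations cap come from this issue.

```rust
// Only VerifierVerdict (Passed / Failed { reason }) comes from this issue.
#[derive(Debug, Clone, PartialEq)]
pub enum VerifierVerdict {
    Passed,
    Failed { reason: String },
}

// Hypothetical names for the two ways the loop can end.
#[derive(Debug, PartialEq)]
pub enum LoopOutcome {
    Success { cycles: u32 },
    IterationLimit { last_reason: String },
}

/// Runs build -> evaluate -> verify cycles until the verifier passes
/// or `max_iterations` cycles have been spent.
///
/// `build` receives the reason from the previous failed cycle (None on the
/// first cycle) and returns the build output. `evaluate` receives the build
/// output and returns the evaluator's report. `verify` receives both, plus
/// the 1-based cycle number.
pub fn self_verifying_loop<B, E, V>(
    max_iterations: u32,
    mut build: B,
    mut evaluate: E,
    mut verify: V,
) -> LoopOutcome
where
    B: FnMut(Option<&str>) -> String,
    E: FnMut(&str) -> String,
    V: FnMut(&str, &str, u32) -> VerifierVerdict,
{
    let mut feedback: Option<String> = None;
    for iteration in 1..=max_iterations {
        let build_output = build(feedback.as_deref());
        let eval_output = evaluate(&build_output);
        match verify(&build_output, &eval_output, iteration) {
            VerifierVerdict::Passed => return LoopOutcome::Success { cycles: iteration },
            // The failure reason becomes the feedback for the next build phase.
            VerifierVerdict::Failed { reason } => feedback = Some(reason),
        }
    }
    // The cap was reached without a Passed verdict.
    LoopOutcome::IterationLimit {
        last_reason: feedback.unwrap_or_default(),
    }
}
```

Passing the previous failure reason into the build closure mirrors the "inject into build context" step. How the real harness reports hitting the cap is not specified in this issue.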
Trait Definition
enum VerifierVerdict {
    Passed,
    Failed { reason: String }, // injected into build context next turn
}
// Input to the verifier: what the build produced and what the evaluator said
struct VerifierInput {
    build_result: RunResult,
    eval_result: RunResult,
    workspace: PathBuf,
    iteration: u32, // which build-evaluate cycle this is
}
trait Verifier {
    async fn verify(&self, input: &VerifierInput) -> VerifierVerdict;
    // Maximum number of build-evaluate cycles before giving up.
    // Prevents infinite build loops when the evaluator always finds problems.
    fn max_iterations(&self) -> u32 { 3 } // default: 3
}
Standard Implementations
EvaluatorResponseVerifier
Parses the evaluator harness's RunResult::Success { output } for pass/fail signals. The simplest verifier — trusts the evaluator's final text response.
struct EvaluatorResponseVerifier {
    pass_pattern: String, // regex: if output matches this, Passed
    fail_pattern: String, // regex: if output matches this, extract reason
    max_iterations: u32,
}
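A rough sketch of the parsing step follows. To stay within the standard library it swaps the regex patterns for literal line-prefix markers, so it is a simplified stand-in, not the EvaluatorResponseVerifier itself. MarkerVerifier, its fields, and classify are hypothetical names. Treating output with no marker as a failure is an assumption borrowed from the Default-FAIL contract mentioned earlier.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum VerifierVerdict {
    Passed,
    Failed { reason: String },
}

/// Simplified stand-in: the two fields are literal line prefixes here,
/// not regexes (an assumption made so the sketch needs no external crate).
pub struct MarkerVerifier {
    pub pass_marker: String,
    pub fail_marker: String,
}

impl MarkerVerifier {
    /// Classifies the evaluator's final text output.
    pub fn classify(&self, output: &str) -> VerifierVerdict {
        // Look for a failure line first, so output containing both markers fails.
        for line in output.lines() {
            if let Some(rest) = line.trim().strip_prefix(self.fail_marker.as_str()) {
                // Whatever follows the marker (minus a leading colon) is the reason.
                let reason = rest.trim_start_matches(':').trim();
                let reason = if reason.is_empty() {
                    "evaluator reported failure without a reason".to_string()
                } else {
                    reason.to_string()
                };
                return VerifierVerdict::Failed { reason };
            }
        }
        let passed = output
            .lines()
            .any(|line| line.trim().starts_with(self.pass_marker.as_str()));
        if passed {
            return VerifierVerdict::Passed;
        }
        // Assumed default-FAIL: anything that is not an explicit pass fails.
        VerifierVerdict::Failed {
            reason: "evaluator output contained no verdict marker".to_string(),
        }
    }
}
```

With markers such as "VERDICT: PASS" and "VERDICT: FAIL" (illustrative values only), the text after the fail marker is what would be injected into the next build turn.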
TestSuiteVerifier
Runs the test suite after the evaluator completes and uses the result as the verdict. Ignores the evaluator's text output — ground truth is the tests.
struct TestSuiteVerifier {
    command: String,
    working_dir: PathBuf,
    timeout: Duration,
    sandbox: Arc<dyn SandboxProvider>,
    max_iterations: u32,
}
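The core of this verifier, running a command under a timeout and mapping the exit status to a verdict, might look like the sketch below. It assumes a Unix-like host with sh available and runs the command directly. The SandboxProvider from the struct above is deliberately left out, and run_test_command is a hypothetical helper name.

```rust
use std::path::Path;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
pub enum VerifierVerdict {
    Passed,
    Failed { reason: String },
}

/// Runs `command` through `sh -c` in `working_dir` and maps the exit status
/// to a verdict. Test output is discarded here; a fuller version would
/// capture it so the failure reason is more useful to the build agent.
pub fn run_test_command(command: &str, working_dir: &Path, timeout: Duration) -> VerifierVerdict {
    let spawned = Command::new("sh")
        .arg("-c")
        .arg(command)
        .current_dir(working_dir)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn();
    let mut child = match spawned {
        Ok(child) => child,
        Err(e) => {
            return VerifierVerdict::Failed {
                reason: format!("could not start test command: {}", e),
            }
        }
    };
    let deadline = Instant::now() + timeout;
    loop {
        match child.try_wait() {
            Ok(Some(status)) if status.success() => return VerifierVerdict::Passed,
            Ok(Some(status)) => {
                return VerifierVerdict::Failed {
                    reason: format!("test command exited with {}", status),
                }
            }
            Ok(None) if Instant::now() >= deadline => {
                // Best-effort cleanup of the shell process; the verdict is
                // a failure whether or not the kill succeeds.
                let _ = child.kill();
                let _ = child.wait();
                return VerifierVerdict::Failed {
                    reason: format!("test command timed out after {:?}", timeout),
                };
            }
            // Still running and within the deadline: poll again shortly.
            Ok(None) => thread::sleep(Duration::from_millis(10)),
            Err(e) => {
                return VerifierVerdict::Failed {
                    reason: format!("could not wait for test command: {}", e),
                }
            }
        }
    }
}
```

Polling try_wait keeps the sketch free of external crates. Note that kill only targets the shell process, so a real implementation would likely delegate process cleanup to the sandbox.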
CompositeVerifier
Passes only when all child verifiers pass.
struct CompositeVerifier {
    verifiers: Vec<Box<dyn Verifier>>,
    max_iterations: u32,
}
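The all-must-pass rule can be illustrated with a simplified, synchronous stand-in for the trait. In this sketch the verifier input is reduced to the evaluator's text output, max_iterations is omitted, and the Contains child is a hypothetical helper used only to exercise the composite. Joining every failure reason instead of stopping at the first failure is a design assumption, not something this issue specifies.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum VerifierVerdict {
    Passed,
    Failed { reason: String },
}

/// Synchronous, simplified stand-in for the Verifier trait: the input is
/// reduced to the evaluator's text output for brevity.
pub trait Verifier {
    fn verify(&self, eval_output: &str) -> VerifierVerdict;
}

pub struct CompositeVerifier {
    pub verifiers: Vec<Box<dyn Verifier>>,
}

impl Verifier for CompositeVerifier {
    fn verify(&self, eval_output: &str) -> VerifierVerdict {
        // Run every child (no short-circuit) and collect all failure reasons,
        // so the build agent can see them in a single turn.
        let reasons: Vec<String> = self
            .verifiers
            .iter()
            .filter_map(|v| match v.verify(eval_output) {
                VerifierVerdict::Passed => None,
                VerifierVerdict::Failed { reason } => Some(reason),
            })
            .collect();
        if reasons.is_empty() {
            // Note: a composite with no children passes vacuously.
            VerifierVerdict::Passed
        } else {
            VerifierVerdict::Failed { reason: reasons.join("; ") }
        }
    }
}

/// Hypothetical child verifier: passes when the output contains a phrase.
pub struct Contains(pub &'static str);

impl Verifier for Contains {
    fn verify(&self, eval_output: &str) -> VerifierVerdict {
        if eval_output.contains(self.0) {
            VerifierVerdict::Passed
        } else {
            VerifierVerdict::Failed { reason: format!("missing `{}`", self.0) }
        }
    }
}
```

Short-circuiting on the first failure would be cheaper when a child is expensive (for example, a full test run); which behavior the real CompositeVerifier should have is left open here.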
Evaluator Harness Constraints
The evaluator_harness in SelfVerifying must be constructed with:
- Read-only sandbox: SandboxProvider::read_only(workspace). No write or edit tools.
- Fresh session: always a new SessionId, never shared with the build harness.
- Evaluator role chunk: "role-evaluator" from PromptChunkRegistry. This chunk must be registered in the standard chunk library before SelfVerifying is usable.
- Mode::AlwaysAsk: evaluator never acts, only reports.
SubagentTool::new() already enforces no nested subagents. The evaluator harness is a peer harness, not a subagent — it is constructed directly by the caller and injected, not spawned by the build harness.
Checklist
- Verifier trait + VerifierVerdict + VerifierInput + all standard implementations
- max_iterations enforcement tested: loop halts after N cycles even without Passed verdict
- Verifier
- "role-evaluator" chunk registered in standard chunk library (Implement PromptChunkRegistry and Mode system #24)
- fixtures/verifier/evaluator_pass.jsonl, fixtures/verifier/evaluator_fail.jsonl
Related Issues