feat(guardrails): Wave 8 — review fixes + 9 hooks + multi-model delegation #128

Merged
terisuke merged 7 commits into dev from feat/guardrails-hooks-wave7
Apr 6, 2026

@terisuke terisuke commented Apr 6, 2026

Summary

  • Address all 6 review findings from Wave 7
  • Implement 5 remaining plugin hooks + 4 CI workflow hooks
  • Add multi-model delegation enhancement (OpenCode competitive advantage)

Changes

Review Fixes

  • active_task_count race → Map-based callID tracking (CRITICAL)
  • verify-agent-output parses <task_result> payload (HIGH)
  • enforce-soak-time / enforce-domain-naming user-visible (MEDIUM)
  • issue_verification_done conditional (MEDIUM)

New Plugin Hooks (5)

  • verify-state-file-integrity, audit-docker-build-args
  • enforce-review-reading, pr-guard, stop-test-gate

New CI Workflow Hooks (4)

  • seed-verify.yml, workflow-sync.yml
  • inject-review-on-failure, post-pr-create-review-trigger (pr-management.yml)

Multi-Model Enhancement

  • Provider-aware routing per agent tier
  • Per-provider cost tracking (llm_calls_by_provider)
  • Tier mismatch + cost waste detection

guardrail.ts: 987 → 1184 lines (+197)

Test plan

Closes #124, #125, #126, #130

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 6, 2026 11:13

github-actions bot commented Apr 6, 2026

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

Copilot AI left a comment

Pull request overview

This PR expands the Guardrails profile to add multi-model delegation gates plus additional quality/operational advisories, and documents the design via a new ADR.

Changes:

  • Add delegation gates for parallel task limiting, agent↔model tier advisories, and session-level tracking fields in guardrail state.
  • Add quality/operational advisories for domain naming, endpoint dataflow reminders, tool-failure streaks, soak-time/follow-up reminders, and issue-close verification prompts.
  • Add ADR-007 documenting the multi-model delegation gate approach.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| packages/guardrails/profile/plugins/guardrail.ts | Introduces new delegation/quality/operational hooks, new state fields, and additional guardrail advisories/blocks. |
| docs/ai-guardrails/adr/007-multi-model-delegation-gates.md | Documents rationale and intended behavior for delegation gates. |

Comment on lines +645 to +649
const data = await stash(state)
const activeTasks = num(data.active_task_count)
if (activeTasks >= maxParallelTasks) {
  const err = `parallel task limit reached (${activeTasks}/${maxParallelTasks}); wait for a running task to complete before delegating more`
  await mark({ last_block: "task", last_reason: err })

Copilot AI Apr 6, 2026

active_task_count is updated via a read-modify-write (stash -> mark) sequence without any locking/atomicity. If multiple task tool calls run in parallel, updates can race and allow the parallel limit to be exceeded (or counts to drift). Consider tracking active tasks by callID/session in memory, using an atomic compare-and-swap/lock around state.json, or recomputing the count from a set rather than increment/decrement counters.
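
A minimal sketch of the callID-based alternative suggested above, assuming all hook invocations run in a single JavaScript process. The `ActiveTaskTracker` class and its method names are hypothetical and do not come from guardrail.ts; the point is only that a keyed collection replaces the increment/decrement counter.

```typescript
// Hypothetical sketch: the class and method names are illustrative, not from guardrail.ts.
class ActiveTaskTracker {
  // callID -> start timestamp in ms
  private readonly active = new Map<string, number>()
  private readonly maxParallel: number

  constructor(maxParallel: number) {
    this.maxParallel = maxParallel
  }

  // Registers a task. Returns false when the parallel limit is already reached.
  start(callID: string, now: number = Date.now()): boolean {
    if (this.active.has(callID)) return true // repeated hook call for the same task
    if (this.active.size >= this.maxParallel) return false
    this.active.set(callID, now)
    return true
  }

  // Deleting an unknown callID is a no-op, so a duplicate "after" hook
  // cannot push the count below zero.
  finish(callID: string): void {
    this.active.delete(callID)
  }

  get count(): number {
    return this.active.size
  }
}
```

Because the limit check and the `set` call happen synchronously, with no `await` between them, two overlapping hook invocations in the same process cannot both observe the same count. This does not cover state shared across processes through state.json.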

Copilot uses AI. Check for mistakes.
Comment on lines +951 to +957
// Cost tracking: accumulate per-session cost
const data = await stash(state)
const sessionCost = num(data.session_cost)
const modelCost = (item.model.cost?.input ?? 0) + (item.model.cost?.output ?? 0)
if (modelCost > 0) {
  await mark({ session_cost: sessionCost + modelCost })
}

Copilot AI Apr 6, 2026

session_cost accumulation is adding the model’s per-1M-token pricing (model.cost.input/output) directly to the session total, which does not represent the actual cost of a request (it depends on token usage). This will make session_cost meaningless (and likely trigger warnings too early/late). To track real cost, accumulate from the actual message/tool execution usage/cost metadata (e.g., assistant message info.cost or token usage) rather than the static price rates.

Suggested change
- // Cost tracking: accumulate per-session cost
- const data = await stash(state)
- const sessionCost = num(data.session_cost)
- const modelCost = (item.model.cost?.input ?? 0) + (item.model.cost?.output ?? 0)
- if (modelCost > 0) {
-   await mark({ session_cost: sessionCost + modelCost })
- }
+ // Cost tracking: do not derive session cost from model pricing metadata here.
+ // `item.model.cost` contains price rates, not the actual cost of this request.
+ // Session cost should only be updated from real usage/cost metadata produced by
+ // message/tool execution (for example, assistant `info.cost` or token usage).
+ const data = await stash(state)
+ const sessionCost = num(data.session_cost)
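
To illustrate the difference between price rates and request cost, here is a small sketch. It assumes the rates are USD per 1M tokens, as the comment above describes, and that token counts are available from some usage metadata. The `Usage` shape and both function names are hypothetical.

```typescript
// Rates in USD per 1M tokens, as the review describes model.cost.
interface Rates {
  input: number
  output: number
}

// Hypothetical usage shape; the real metadata fields may differ.
interface Usage {
  inputTokens: number
  outputTokens: number
}

// Cost of one request: tokens multiplied by the per-token rate.
function requestCost(rates: Rates, usage: Usage): number {
  return (usage.inputTokens * rates.input + usage.outputTokens * rates.output) / 1e6
}

// Adds the request cost first, then compares the new total against the threshold.
function addCost(
  sessionCost: number,
  cost: number,
  maxSessionCost: number,
): { total: number; warn: boolean } {
  const total = sessionCost + cost
  return { total, warn: total > maxSessionCost }
}
```

With rates of 3 and 15 USD per 1M tokens, a request using 2,000 input and 500 output tokens costs 0.0135 USD, whereas summing the two rates would add 18 to the session total.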

Comment on lines +955 to +959
if (modelCost > 0) {
  await mark({ session_cost: sessionCost + modelCost })
}
if (sessionCost > maxSessionCost) {
  await seen("delegation.cost_warning", { session_cost: sessionCost, max: maxSessionCost })

Copilot AI Apr 6, 2026

The cost warning compares sessionCost before adding the current modelCost, so the warning can lag by one request (and may never fire if the session cost crosses the threshold on the last request). Compute the new total first (including the just-added cost) and warn based on that value.

Suggested change
- if (modelCost > 0) {
-   await mark({ session_cost: sessionCost + modelCost })
- }
- if (sessionCost > maxSessionCost) {
-   await seen("delegation.cost_warning", { session_cost: sessionCost, max: maxSessionCost })
+ const newSessionCost = sessionCost + modelCost
+ if (modelCost > 0) {
+   await mark({ session_cost: newSessionCost })
+ }
+ if (newSessionCost > maxSessionCost) {
+   await seen("delegation.cost_warning", { session_cost: newSessionCost, max: maxSessionCost })

Comment on lines +792 to +794
if (agent && output.length < 20) {
  out.output = (out.output || "") + "\n⚠️ Agent output appears empty or trivially short. Verify the agent completed its task."
  await seen("verify_agent.short_output", { agent, output_length: output.length })

Copilot AI Apr 6, 2026

verify-agent-output checks out.output.length < 20, but the task tool’s output is wrapped with task_id and <task_result> markers, so out.output will almost always exceed 20 chars even when the agent’s actual response is empty. Parse and validate the <task_result> payload (trimmed length) or use structured metadata from the task/session instead.

Suggested change
- if (agent && output.length < 20) {
-   out.output = (out.output || "") + "\n⚠️ Agent output appears empty or trivially short. Verify the agent completed its task."
-   await seen("verify_agent.short_output", { agent, output_length: output.length })
+ const taskResultMatch = output.match(/<task_result>([\s\S]*?)<\/task_result>/i)
+ const agentOutput = (taskResultMatch?.[1] ?? output).trim()
+ if (agent && agentOutput.length < 20) {
+   out.output = (out.output || "") + "\n⚠️ Agent output appears empty or trivially short. Verify the agent completed its task."
+   await seen("verify_agent.short_output", { agent, output_length: agentOutput.length })
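
The suggested parsing can be pulled into a standalone helper, which makes the failure mode easy to see. This is a sketch only: the function names are hypothetical, and the wrapper text in the usage note is an assumed shape for the task tool's output, not a confirmed format.

```typescript
// Returns the payload between <task_result> markers, or the whole string when no markers exist.
function extractAgentOutput(raw: string): string {
  const match = raw.match(/<task_result>([\s\S]*?)<\/task_result>/i)
  return (match?.[1] ?? raw).trim()
}

// The 20-character threshold matches the check quoted in the review comment.
function isTriviallyShort(raw: string, minLength: number = 20): boolean {
  return extractAgentOutput(raw).length < minLength
}
```

For an assumed wrapper such as `"task_id: abc123\n<task_result>\n\n</task_result>"`, the raw string is longer than 20 characters, so a raw length check passes, while `isTriviallyShort` reports the empty payload.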

const halfDay = 12 * 60 * 60 * 1000
if (elapsed < halfDay) {
  out.args.command = cmd // preserve command
  await seen("soak_time.advisory", { elapsed_ms: elapsed, required_ms: halfDay })

Copilot AI Apr 6, 2026

This “soak time” gate only writes an event via seen(...) and doesn’t surface any advisory to the user (no out.output change, no warning). If the intent is an on-screen reminder, consider emitting the advisory from tool.execute.after for bash (where you can append to out.output) or explicitly console.warn here.

Suggested change
- await seen("soak_time.advisory", { elapsed_ms: elapsed, required_ms: halfDay })
+ await seen("soak_time.advisory", { elapsed_ms: elapsed, required_ms: halfDay })
+ console.warn(text("soak time advisory — develop→main merge is happening before the 12-hour minimum soak period has elapsed"))
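
As an alternative to `console.warn`, the advisory could be built as a string and appended to the tool output, as the comment above also suggests. The sketch below assumes a timestamp for the start of the soak period is available; `soakStartedAt` and both function names are hypothetical.

```typescript
const HOUR_MS = 60 * 60 * 1000
const HALF_DAY_MS = 12 * HOUR_MS

// `soakStartedAt` is a hypothetical timestamp (ms) marking when the soak period began.
function soakAdvisory(soakStartedAt: number, now: number): string | undefined {
  const elapsed = now - soakStartedAt
  if (elapsed >= HALF_DAY_MS) return undefined
  const remainingHours = Math.ceil((HALF_DAY_MS - elapsed) / HOUR_MS)
  return `⚠️ Soak time advisory: about ${remainingHours}h left of the 12-hour minimum soak period.`
}

// Appends the advisory to tool output, mirroring the `out.output` pattern quoted in this review.
function appendAdvisory(output: string | undefined, advisory: string | undefined): string {
  return advisory ? (output || "") + "\n" + advisory : output || ""
}
```

Returning `undefined` once the period has elapsed keeps the caller free of timing logic: it appends whatever it receives.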

Comment on lines +656 to +660
const relFile = rel(input.worktree, file)
const fileName = path.basename(relFile)
for (const [dir, pattern] of Object.entries(domainDirs)) {
  if (relFile.startsWith(dir) && !pattern.test(fileName)) {
    await seen("domain_naming.mismatch", { file: relFile, expected_pattern: pattern.source, dir })

Copilot AI Apr 6, 2026

enforce-domain-naming only logs an event via seen(...) and doesn’t provide user-visible feedback. If this is meant to be an advisory (like the other architecture/test/CI reminders), consider emitting a message in tool.execute.after for write/edit so the user sees what naming pattern is expected for the path.
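
A user-visible version of this check might return a message that the `after` hook can append to the tool output. The sketch below is an assumption-heavy illustration: the `src/ui/` PascalCase rule is taken from the scenario test description in this PR, but the exact pattern, the `namingAdvisory` helper, and the message wording are hypothetical.

```typescript
// Assumed mapping of directory prefix to expected file-name pattern.
const domainDirs: Record<string, RegExp> = {
  "src/ui/": /^[A-Z][A-Za-z0-9]*\.tsx$/,
}

// Returns an advisory naming the expected pattern, or undefined when the file conforms.
function namingAdvisory(relFile: string): string | undefined {
  const fileName = relFile.split("/").pop() ?? relFile
  for (const [dir, pattern] of Object.entries(domainDirs)) {
    if (relFile.startsWith(dir) && !pattern.test(fileName)) {
      return `⚠️ ${fileName} does not match the naming pattern ${pattern.source} expected under ${dir}`
    }
  }
  return undefined
}
```

Including `pattern.source` in the message tells the user what is expected, which the event log alone does not.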

Comment on lines +874 to +875
}
await mark({ issue_verification_done: true })

Copilot AI Apr 6, 2026

issue_verification_done is set to true unconditionally after any gh issue close, even when the hook detects reviewed/factchecked are false. This suppresses future “unverified” reminders and makes it easy to repeatedly close issues without ever meeting the verification criteria. Consider only setting issue_verification_done when verification conditions are satisfied (or track it per-issue close attempt).

Suggested change
- }
- await mark({ issue_verification_done: true })
+ } else {
+   await mark({ issue_verification_done: true })
+ }

Comment on lines +867 to +875
// Task completion gate: ensure task claims are backed by evidence
if (item.tool === "bash" && /\b(gh\s+issue\s+close)\b/i.test(str(item.args?.command))) {
  const reviewed = flag(data.reviewed)
  const factchecked = flag(data.factchecked)
  if (!reviewed || !factchecked) {
    out.output = (out.output || "") + "\n⚠️ Issue close without full verification: reviewed=" + reviewed + ", factchecked=" + factchecked + ". Ensure acceptance criteria have code-level evidence."
    await seen("task_completion.incomplete", { reviewed, factchecked })
  }
  await mark({ issue_verification_done: true })

Copilot AI Apr 6, 2026

The PR description/Issue #125 describe task-completion-gate as a hard block before issue close, but the implementation here is advisory only (it never blocks gh issue close). If the intended behavior is to enforce evidence before closing, this should throw (or move the hard gate to tool.execute.before). Otherwise, update the PR description/ADR to reflect that this is advisory.
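
If a hard block is the intended behavior, the check could throw before the command runs. The following is a sketch of that variant, not the merged implementation: `gateIssueClose` and `VerificationState` are hypothetical names, and the regex is the one quoted in the excerpt above without its capture group.

```typescript
interface VerificationState {
  reviewed: boolean
  factchecked: boolean
}

const ISSUE_CLOSE = /\bgh\s+issue\s+close\b/i

// Throws when an issue-close command is attempted without full verification.
// Intended to be called from a "before" hook so the command never executes.
function gateIssueClose(command: string, state: VerificationState): void {
  if (!ISSUE_CLOSE.test(command)) return
  if (!state.reviewed || !state.factchecked) {
    throw new Error(
      `issue close blocked: reviewed=${state.reviewed}, factchecked=${state.factchecked}`,
    )
  }
}
```

An advisory-only design would keep the same condition but append a warning instead of throwing. Which of the two applies is the question this comment raises.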

Comment on lines +643 to +653
// Delegation: parallel execution gate for task tool
if (item.tool === "task") {
  const data = await stash(state)
  const activeTasks = num(data.active_task_count)
  if (activeTasks >= maxParallelTasks) {
    const err = `parallel task limit reached (${activeTasks}/${maxParallelTasks}); wait for a running task to complete before delegating more`
    await mark({ last_block: "task", last_reason: err })
    throw new Error(text(err))
  }
  await mark({ active_task_count: activeTasks + 1 })
}

Copilot AI Apr 6, 2026

New delegation/operational gates (parallel task limit, cost tracking, failure tracking, soak time, follow-up limit, issue-close verification) aren’t covered by scenario tests. There is existing guardrails test coverage; please add tests that assert the new gates fire/accumulate/reset as expected (and that the hard block triggers at the configured limit).

Comment on lines +23 to +25
### 3. cost-tracking (chat.params + tool.execute.after)
Accumulates `session_cost` from model cost metadata. Logs a warning event when session cost exceeds `maxSessionCost` threshold.

Copilot AI Apr 6, 2026

ADR states cost-tracking is implemented via “chat.params + tool.execute.after”, but the implementation only accumulates in chat.params. Either update the ADR to match the code, or add the missing tool.execute.after component (and ensure it uses actual usage/cost rather than static model price rates).

terisuke added a commit that referenced this pull request Apr 6, 2026
…ation

Review fixes (PR #128 feedback):
- Fix active_task_count race: counter → Map-based callID tracking
- Fix verify-agent-output: parse <task_result> payload instead of raw length
- Fix enforce-soak-time: add user-visible advisory via out.output
- Fix enforce-domain-naming: add user-visible advisory via out.output
- Fix issue_verification_done: conditional on reviewed && factchecked
- Add per-provider cost tracking to session state

New plugin hooks (5):
- verify-state-file-integrity: JSON parse check + auto-repair
- audit-docker-build-args: detect secrets in --build-arg
- enforce-review-reading: stale review detection (review_at < push_at)
- pr-guard: preflight check (tests + typecheck) before gh pr create
- stop-test-gate: block push/merge without test execution
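
As an illustration of what a check like audit-docker-build-args might look for, the sketch below scans a command string for `--build-arg` names that look like secrets. The name heuristics and the `suspiciousBuildArgs` helper are assumptions for this example; the commit does not show the actual detection rules.

```typescript
// Assumed heuristic for secret-like argument names.
const SECRET_NAME = /(secret|token|password|passwd|api[_-]?key|private[_-]?key)/i

// Returns the names of --build-arg entries whose name looks like a secret.
function suspiciousBuildArgs(command: string): string[] {
  const names: string[] = []
  const re = /--build-arg(?:\s+|=)([A-Za-z_][A-Za-z0-9_]*)/g
  let m: RegExpExecArray | null
  while ((m = re.exec(command)) !== null) {
    const name = m[1] ?? ""
    if (SECRET_NAME.test(name)) names.push(name)
  }
  return names
}
```

For `docker build --build-arg NPM_TOKEN=abc --build-arg NODE_ENV=production .`, only `NPM_TOKEN` is reported.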

New CI workflow hooks (4):
- enforce-seed-data-verification (seed-verify.yml)
- workflow-sync-guard (workflow-sync.yml)
- inject-claude-review-on-checks (pr-management.yml)
- post-pr-create-review-trigger (pr-management.yml)

Multi-model delegation enhancement (OpenCode competitive advantage):
- Provider-aware routing: recommend optimal providers per agent tier
- Per-provider LLM call tracking: llm_calls_by_provider map
- Cost waste detection: low-tier agent on high-tier model
- Tier mismatch advisory: surface in compacting context
- Session provider tracking: list of all providers used
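
The per-provider counter and the tier checks listed above could take roughly the following shape. This is a hedged sketch: only the `llm_calls_by_provider` field name comes from the commit message, while the tier names, the ranking, and both helper functions are hypothetical.

```typescript
// Hypothetical tier names; the real tiers are not shown in this PR page.
type Tier = "low" | "standard" | "high"

interface DelegationState {
  llm_calls_by_provider: Record<string, number>
}

// Increments the call counter for one provider.
function recordCall(state: DelegationState, providerID: string): void {
  state.llm_calls_by_provider[providerID] = (state.llm_calls_by_provider[providerID] ?? 0) + 1
}

const tierRank: Record<Tier, number> = { low: 0, standard: 1, high: 2 }

// Returns an advisory when the agent tier and model tier differ.
function tierAdvisory(agentTier: Tier, modelTier: Tier): string | undefined {
  if (tierRank[modelTier] > tierRank[agentTier]) {
    return `cost waste: ${agentTier}-tier agent is running on a ${modelTier}-tier model`
  }
  if (tierRank[modelTier] < tierRank[agentTier]) {
    return `tier mismatch: ${agentTier}-tier agent is running on a ${modelTier}-tier model`
  }
  return undefined
}
```

The list of providers used in a session can then be derived from the keys of `llm_calls_by_provider` rather than stored separately.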

guardrail.ts: 987 → 1184 lines (+197)

Closes #124, #125, #126

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
terisuke changed the title from "feat(guardrails): multi-model delegation gates + quality/operational hooks" to "feat(guardrails): Wave 8 — review fixes + 9 hooks + multi-model delegation" on Apr 6, 2026
terisuke and others added 7 commits April 6, 2026 22:49
…ional hooks

Delegation gates (OpenCode competitive advantage over Claude Code):
- agent-model-mapping: tier-based model recommendation per agent type
- delegation-budget-gate: hard block at 5 concurrent parallel tasks
- cost-tracking: session-wide cost accumulation with threshold warning
- parallel-execution-gate: prevents unbounded task delegation
- verify-agent-output: detects empty/trivially short agent responses

Quality hooks:
- enforce-domain-naming: advisory for file naming convention mismatches
- enforce-endpoint-dataflow: 4-point verification reminder on API changes
- task-completion-gate: evidence verification before issue close
- tool-failure-recovery: consecutive failure detection with recovery advice

Operational hooks:
- enforce-soak-time: merge timing advisory
- enforce-follow-up-limit: feature freeze warning on 2+ consecutive fix PRs
- enforce-issue-close-verification: acceptance criteria verification prompt
- post-merge-close-issues: issue reference detection after merge
- enforce-memory-update-on-commit: memory save reminder after significant edits
- enforce-doc-update-scope: documentation freshness reminder

guardrail.ts: 758 → 984 lines (+226)

Refs #124, #125, #126

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the design rationale for OpenCode's provider-aware task routing,
mapping Claude Code's 7 Codex delegation gates to OpenCode's multi-provider
equivalents.

Refs #124

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ality hooks

Scenario test covers 9 hook firing points with 18 new assertions:
- delegation state initialization (7 fields)
- parallel execution gate: increment, decrement, hard block at 5
- verify-agent-output: empty response detection
- domain-naming: src/ui/ PascalCase mismatch → events.jsonl
- endpoint-dataflow: router.get() modification advisory
- tool-failure-recovery: 3 consecutive failures detection
- compaction context: active_tasks, session_cost, consecutive_failures

19 tests / 190 assertions — all pass.

Refs #124, #125, #126

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CRITICAL fix:
- active_task_count staleness recovery: reset counter if last task
  started >5min ago, preventing permanent session lockout on task crash
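
The staleness recovery described above can be sketched as a pure function over the state. The `last_task_started_at` field name and the `recoverStale` helper are hypothetical; only the 5-minute window and the counter reset come from the commit message.

```typescript
const STALE_MS = 5 * 60 * 1000

interface TaskState {
  active_task_count: number
  last_task_started_at: number // hypothetical field: ms timestamp of the most recent task start
}

// Resets the counter when the most recent task started more than 5 minutes ago,
// so a crashed task cannot hold a slot forever.
function recoverStale(state: TaskState, now: number): TaskState {
  if (state.active_task_count > 0 && now - state.last_task_started_at > STALE_MS) {
    return { ...state, active_task_count: 0 }
  }
  return state
}
```

A trade-off worth noting: a legitimately long-running task also exceeds the window, so the reset can briefly allow more tasks than the limit.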

HIGH fixes:
- Cost tracking: replaced misleading session_cost (model rate accumulation)
  with llm_call_count (simple invocation counter)
- Failure detection: replaced broad regex (/error|failed|exception/)
  with structured signals (metadata.exitCode, title="Error")
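
A sketch of failure detection based on the structured signals named above, rather than on matching words in the output text. The `ToolResult` shape is an assumption built from the two fields the commit mentions (`metadata.exitCode` and `title`); the helper names are hypothetical.

```typescript
// Assumed result shape, limited to the fields the commit message mentions.
interface ToolResult {
  title?: string
  output?: string
  metadata?: { exitCode?: number }
}

// A result counts as a failure only on a non-zero exit code or an "Error" title.
function isFailure(result: ToolResult): boolean {
  const code = result.metadata?.exitCode
  return (typeof code === "number" && code !== 0) || result.title === "Error"
}

// Consecutive-failure streak: increments on failure, resets on success.
function nextStreak(streak: number, result: ToolResult): number {
  return isFailure(result) ? streak + 1 : 0
}
```

Output such as `"0 errors, 0 failed"` with exit code 0 no longer counts as a failure, which the broad regex would have flagged.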

WARNING fixes:
- Removed dead variable `cmd` in post-merge block
- Consolidated duplicate gh pr merge detection

Refs #124, #125, #126

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gation gate

- Cherry-pick PR #127 CI fixes (test timeouts, prompt-during-run skip, duplicate-pr null guard)
- Fix callID resolution: read from item.callID (top-level) as well as item.args.callID
- All 19 guardrails scenario tests pass (190 assertions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ADR stated session_cost via chat.params + tool.execute.after, but
implementation uses llm_call_count + llm_calls_by_provider in chat.params
only (actual cost data unavailable at hook time).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@terisuke terisuke force-pushed the feat/guardrails-hooks-wave7 branch from 1601f82 to 99b575d Compare April 6, 2026 13:49
@terisuke terisuke merged commit d79ba99 into dev Apr 6, 2026
5 of 6 checks passed