feat(mcp): canonical delegate hardening — reviewer gate, winner-selection, no-op+secret floor, createKbGate (0.36.0) by tangletools · Pull Request #83 · tangle-network/agent-runtime

tangletools · 2026-05-31T09:27:21Z

Hardens agent-runtime's one canonical delegation MCP so every product's delegated loops are reliable — folding the proven techniques from the ai-trading-blueprint fork + physim's KB subsystem back into the substrate, instead of re-forking per product. Published as 0.36.0.

delegate_code (build-in-a-loop reliability):

- No-op rejection + always-on secret-path floor on the coder validator (4234c94).
- Optional adversarial reviewer gate: a candidate must pass mechanical checks AND be approved to win (catches 'compiles+passes but wrong/unsafe').
- winnerSelection: highest-score (default) | smallest-diff | highest-readiness | first-approved, over all valid candidates. Fails loud when nothing survives.
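The selection rule above can be sketched as follows. This is a minimal illustration, not the agent-runtime implementation: the `Candidate` shape, its field names, and `selectWinner` are hypothetical, and the reviewer verdict is modelled as a plain `approved` flag (assumed true when no reviewer is configured).

```typescript
// Hypothetical shapes for illustration only; not the agent-runtime types.
type Strategy = "highest-score" | "smallest-diff" | "highest-readiness" | "first-approved";

interface Candidate {
  id: string;
  valid: boolean;     // passed the mechanical checks
  approved: boolean;  // reviewer verdict (assumed true when no reviewer is configured)
  score: number;
  readiness: number;
  diffLines: number;
}

function selectWinner(candidates: Candidate[], strategy: Strategy = "highest-score"): Candidate {
  // Only candidates that pass validation AND review are eligible.
  const survivors = candidates.filter((c) => c.valid && c.approved);
  const first = survivors[0];
  if (first === undefined) {
    // Fail loud: never fall back to an invalid or unapproved candidate.
    throw new Error("no candidate survived validation and review");
  }
  switch (strategy) {
    case "smallest-diff":
      return survivors.reduce((a, b) => (b.diffLines < a.diffLines ? b : a));
    case "highest-readiness":
      return survivors.reduce((a, b) => (b.readiness > a.readiness ? b : a));
    case "first-approved":
      return first;
    default:
      return survivors.reduce((a, b) => (b.score > a.score ? b : a));
  }
}
```

Ties resolve to the earliest candidate in input order because each comparison is strict; whether the real delegate breaks ties the same way is not stated in this PR.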

delegate_research (valid-only KB growth):

- createKbGate: fail-closed fact gate distilled from physim: passage-present anti-hallucination floor, value-in-passage, no-circular-citation (laundering), pluggable consumer judges. Verdict-only; remediation is the caller's.
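A fail-closed, first-veto-wins gate of this kind might look roughly like the sketch below. All names here (`FactCandidate`, `FactJudge`, `sketchKbGate`, `kbAuthoredSourceIds`) are hypothetical, and the circular-citation check is one plausible reading of "laundering": a fact whose source was itself produced by the KB is vetoed.

```typescript
// Hypothetical shapes for illustration only; not the agent-runtime types.
interface FactCandidate {
  value: string;      // literal value the fact asserts
  passage: string;    // verbatim quote offered as evidence
  sourceId: string;   // where the passage claims to come from
  sourceText: string; // full text of that source
}

type GateVerdict = { ok: true } | { ok: false; judge: string; reason: string };

// A judge returns null to pass, or a veto reason.
interface FactJudge {
  name: string;
  check: (fact: FactCandidate) => string | null;
}

function sketchKbGate(
  kbAuthoredSourceIds: ReadonlySet<string>,
  consumerJudges: FactJudge[] = [],
): (fact: FactCandidate) => GateVerdict {
  // Always-on floor; consumer judges are appended after it.
  const floor: FactJudge[] = [
    { name: "passage-non-empty", check: (f) => (f.passage.trim() === "" ? "empty passage" : null) },
    { name: "passage-present", check: (f) => (f.sourceText.includes(f.passage) ? null : "passage not found verbatim in source") },
    { name: "value-in-passage", check: (f) => (f.passage.includes(f.value) ? null : "value does not appear in passage") },
    { name: "no-circular-citation", check: (f) => (kbAuthoredSourceIds.has(f.sourceId) ? "source was produced by the KB itself" : null) },
  ];
  const judges = [...floor, ...consumerJudges];
  return (fact) => {
    for (const judge of judges) {
      let reason: string | null;
      try {
        reason = judge.check(fact);
      } catch (err) {
        // Fail closed: a judge that throws counts as a veto, not a pass.
        reason = `judge threw: ${String(err)}`;
      }
      if (reason !== null) return { ok: false, judge: judge.name, reason }; // first veto wins
    }
    return { ok: true };
  };
}
```

The gate only returns a verdict; what to do with a vetoed fact (retry, escalate, discard) stays with the caller, matching the "verdict-only" note above.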

This is the engine for the configured loop-runner (#828). Full suite 420 green, tsc + biome clean. Additive — default behavior unchanged.

Follow-ups: the thin loop-runner config (#828) that drives code/research/review/audit/self-improve/dynamic on a cadence; fleet bump #827 now targets ^0.36.0.

…er validator

First increment of the canonical MCP delegate hardening (the techniques the ai-trading-blueprint delegation fork proved, folded back into agent-runtime so delegate_code is reliable for the whole fleet, not re-forked per product):

- No-op rejection: an empty patch can trivially pass tests/typecheck (nothing changed) yet does no work; it is now valid=false (scores.nonEmpty=0).
- Secret-path floor: always-on, independent of task.forbiddenPaths. Rejects a patch touching credential-shaped paths (.env, *.pem/*.key/*.p12/*.pfx, keystore, wallet, id_rsa/id_ed25519, secrets/credentials.json). valid=false.

Both are hard gates (flip valid), additive to the existing forbidden-path / diff-size / tests / typecheck checks; the weighted composite is unchanged so clean patches don't regress.

Tests: empty patch → invalid; secret path → invalid even when not in forbiddenPaths; normal patch still valid. Full suite 407 green, tsc + biome clean.

Remaining hardening increments (this branch): reviewer/audit gate + winner-selection strategy on delegate_code; physim's valid-only KB-growth (passage-present storage guard, fail-closed judge registry, correct-on-veto/escalate, circular-citation detection) on delegate_research. Umbrella: #828 (loop-runner).
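As a rough illustration of the two hard gates described in this commit, the sketch below rejects empty patches and patches touching credential-shaped paths. The `PatchFile` shape, the `applyFloor` name, and the exact regular expressions are assumptions; the commit message lists the path families but not the patterns actually used.

```typescript
// Hypothetical patterns approximating the path families named in the commit message.
const SECRET_PATH_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\.|$)/,                  // .env, .env.local, config/.env
  /\.(pem|key|p12|pfx)$/i,              // key and certificate material
  /keystore/i,
  /wallet/i,
  /(^|\/)id_(rsa|ed25519)$/,            // SSH private keys
  /(^|\/)(secrets|credentials)\.json$/i,
];

// Hypothetical patch shape for illustration only.
interface PatchFile {
  path: string;
  added: number;
  removed: number;
}

interface FloorResult {
  valid: boolean;
  reasons: string[];
  scores: { nonEmpty: 0 | 1 };
}

function applyFloor(files: PatchFile[]): FloorResult {
  const reasons: string[] = [];
  // No-op rejection: a patch that changes nothing can pass tests while doing no work.
  const changed = files.some((f) => f.added + f.removed > 0);
  if (!changed) reasons.push("no-op patch: nothing changed");
  // Secret-path floor: always on, independent of any per-task forbidden paths.
  for (const f of files) {
    if (SECRET_PATH_PATTERNS.some((re) => re.test(f.path))) {
      reasons.push(`secret-shaped path touched: ${f.path}`);
    }
  }
  return { valid: reasons.length === 0, reasons, scores: { nonEmpty: changed ? 1 : 0 } };
}
```

Both checks flip `valid` rather than lowering a score, which mirrors the "hard gates" wording above; how the real validator combines them with the weighted composite is not shown here.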

…bGate for valid-only research

Increments 2 + 3 of the canonical-MCP delegate hardening (folding the proven techniques from the ai-trading-blueprint fork + physim's KB subsystem back into agent-runtime, so every product's delegated loops are reliable without re-forking).

delegate_code (createDefaultCoderDelegate):

- Optional `reviewer` (CoderReviewer): a candidate that passes mechanical validation must ALSO be approved by an adversarial reviewer to win. This catches the "compiles + tests pass but wrong/unsafe" class. No reviewer → unchanged behavior.
- `winnerSelection`: highest-score (default, = kernel) | smallest-diff | highest-readiness | first-approved, over ALL valid candidates, not just the kernel's single winner. Fails loud when nothing survives validation (+ review).

delegate_research (createKbGate):

- Reusable, dependency-free valid-only KB-growth gate distilled from physim: fail-closed judge registry, first-veto-wins. Always-on floor: passage-non-empty, passage-present anti-hallucination guard (verbatim passage MUST appear in source), value-in-passage (literal / comma-grouped / billion-million shorthand), no-circular-citation (laundering catch). Consumer judges append after the floor. Operates on fact candidates, not a store, so it composes with agent-knowledge without importing it. Verdict only; remediation is the caller's (never drops silently).

Tests: delegate selection + reviewer fail-loud + backward-compat; kb-gate floor + shorthand + circular + consumer-judge. Full suite 420 green, tsc + biome clean.

Engine for the loop-runner (#828). Increment 1 (no-op + secret floor) = 4234c94.
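The value-in-passage check mentions three accepted spellings of a number: literal, comma-grouped, and billion/million shorthand. One way such matching could work is sketched below. The function names are hypothetical, integer values are assumed, and the match is a plain case-insensitive substring test, so it is looser than a production check would likely be.

```typescript
// Insert thousands separators into an integer: 1500000 -> "1,500,000".
function groupThousands(n: number): string {
  return String(Math.trunc(n)).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Spellings under which a numeric value may legitimately appear in a passage.
function valueVariants(value: number): string[] {
  const variants = [String(value), groupThousands(value)];
  if (Math.abs(value) >= 1e9) variants.push(`${value / 1e9} billion`);
  if (Math.abs(value) >= 1e6) variants.push(`${value / 1e6} million`);
  return variants;
}

// True when any accepted spelling of the value occurs in the passage.
function valueInPassage(value: number, passage: string): boolean {
  const haystack = passage.toLowerCase();
  return valueVariants(value).some((v) => haystack.includes(v));
}
```

Because this uses substring matching, a short value such as 42 would also match inside a longer number; a stricter version would need word-boundary handling, which is omitted here for brevity.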

… the hardened engines

The thin façade that makes the hardened delegation engines (this branch) usable as ONE configured, schedulable entrypoint: the "configured delegated loop runner" (#828).

- runDelegatedLoop(mode, registry): dispatches code | review | research | audit | self-improve | dynamic to a pre-configured runner. Owns mode routing, timing, fail-loud on an unregistered mode (ConfigError), and a uniform DelegatedLoopResult (a thrown engine becomes { ok:false, error } so unattended/scheduled runs record and move on rather than crash).
- coderLoopRunner / reviewLoopRunner: default code/review runners over the hardened coder delegate (no-op + secret floor, reviewer gate, winner-selection). review mode TYPE-requires a reviewer, since a review loop with no reviewer is just a code loop.
- Registry is partial + injectable: products/routines register only the modes they use; tests inject stubs; the engines stay the canonical agent-runtime ones (no fork).

This is the layer a scheduled routine targets (research/audit/self-improve on a cadence; code/review/dynamic on demand).

Tests: dispatch routing, fail-loud unregistered mode, thrown-engine → ok:false, coderLoopRunner real wiring via stub. Full suite green, tsc + biome clean. Engine = 4234c94 + 688d701.
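The dispatch behavior described here (partial registry, fail-loud on an unregistered mode, thrown engine converted to a recorded failure) could be sketched as below. The type names, the result fields, and `runLoopSketch` are hypothetical stand-ins; only the mode list and the `{ ok:false, error }` idea come from the commit message.

```typescript
// Hypothetical types for illustration only; not the agent-runtime exports.
type LoopMode = "code" | "review" | "research" | "audit" | "self-improve" | "dynamic";

type LoopResult =
  | { ok: true; mode: LoopMode; durationMs: number; output: unknown }
  | { ok: false; mode: LoopMode; durationMs: number; error: string };

type LoopRunner = () => Promise<unknown>;

// Partial: callers register only the modes they actually use.
type LoopRegistry = Partial<Record<LoopMode, LoopRunner>>;

class LoopConfigError extends Error {}

async function runLoopSketch(mode: LoopMode, registry: LoopRegistry): Promise<LoopResult> {
  const runner = registry[mode];
  if (!runner) {
    // Misconfiguration is a caller bug, so it throws instead of being recorded.
    throw new LoopConfigError(`no runner registered for mode "${mode}"`);
  }
  const started = Date.now();
  try {
    const output = await runner();
    return { ok: true, mode, durationMs: Date.now() - started, output };
  } catch (err) {
    // A thrown engine becomes a recorded failure so a scheduled run can move on.
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, mode, durationMs: Date.now() - started, error };
  }
}
```

The split between "throw on config error" and "return on engine error" is the design point: the first signals a setup problem that retrying will not fix, while the second is an expected outcome of unattended runs.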

tangletools · 2026-05-31T09:35:39Z

Extended on this branch: runDelegatedLoop — the configured loop-runner (#828) over these hardened engines. runDelegatedLoop(mode, registry) dispatches code | review | research | audit | self-improve | dynamic to pre-configured runners (fail-loud on unregistered mode; thrown engine → {ok:false} so scheduled runs record + move on); coderLoopRunner/reviewLoopRunner are default code/review runners over the hardened delegate. Shipped 0.37.0. So this PR now lands: delegate hardening (0.36.0) + the loop-runner façade (0.37.0) — the engine and the configured entrypoint a scheduled routine targets.

Rounds out the configured loop-runner (#828): every mode now has a default factory wiring a shipped engine, so a routine can run any of them with config only (still registry-injectable for stubs/custom engines):

- dynamicLoopRunner: runLoop + createDynamicDriver (agent-authored topology)
- researchLoopRunner: research-in-a-loop with valid-only KB growth. Each round: research → createKbGate (fail-closed) → accept clean facts, re-research vetoed ones up to maxRounds (correct-on-veto), and RETURN final vetoes (escalate, never silently drop). VetoedFact carries the gate reason.
- selfImproveLoopRunner: optimizePrompt (identity-gated)
- auditLoopRunner: runAnalystLoop over captured trace/run data

(code/review shipped previously.)

Tests: research single-round accept/veto + escalation, research correct-on-veto across rounds, dynamic real runLoop via stub. Full suite 427 green, tsc + biome clean.

Completes the engine (#827 target) + runner; the thin scheduled-routine wrapper is the only remaining layer.
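The correct-on-veto loop in the research runner can be pictured with the sketch below. It is an assumption-laden outline: `ResearchFn`, `FactGateFn`, `researchLoopSketch`, and the fact shape are hypothetical, and the gate is reduced to a function returning a veto reason or null.

```typescript
// Hypothetical shapes for illustration only.
interface ResearchFact {
  id: string;
  passage: string;
}

interface VetoedFactSketch {
  fact: ResearchFact;
  reason: string; // the gate's reason, carried along for the retry and for escalation
}

// Returns null to accept, or a veto reason.
type FactGateFn = (fact: ResearchFact) => string | null;

// Receives the previous round's vetoes so it can try to correct them.
type ResearchFn = (retry: VetoedFactSketch[], round: number) => Promise<ResearchFact[]>;

async function researchLoopSketch(
  research: ResearchFn,
  gate: FactGateFn,
  maxRounds: number,
): Promise<{ accepted: ResearchFact[]; vetoed: VetoedFactSketch[] }> {
  const accepted: ResearchFact[] = [];
  let vetoed: VetoedFactSketch[] = [];
  for (let round = 1; round <= maxRounds; round++) {
    const candidates = await research(vetoed, round);
    vetoed = [];
    for (const fact of candidates) {
      const reason = gate(fact);
      if (reason === null) accepted.push(fact);
      else vetoed.push({ fact, reason });
    }
    if (vetoed.length === 0) break; // nothing left to correct
  }
  // Remaining vetoes are returned to the caller, never dropped silently.
  return { accepted, vetoed };
}
```

Returning the final vetoes alongside the accepted facts is what lets the caller escalate them; only gate-approved facts ever reach `accepted`.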

Closes the loop-runner (#828): a cron/routine/Makefile invokes `agent-runtime-loop --mode <mode> --config <module>`. The config module wires the DelegatedLoopRegistry (with full env/creds access; deps live there, not in the generic bin), the bin runs the mode, prints the DelegatedLoopResult as JSON, and exits 0 ok / 1 recorded-failure / 2 usage-or-config-error.

- runLoopRunnerCli: pure, IO-free CLI core (mode validation → load registry → dispatch → exit code), exported + unit-tested.
- parseLoopRunnerArgv, DELEGATED_LOOP_MODES, isDelegatedLoopMode exported.
- New bin `agent-runtime-loop` → dist/loop-runner-bin.js (tsup entry + package bin).

Tests: argv parsing (space + = forms), exit 0/1/2 paths (success, recorded failure, unknown mode, no-runner-for-mode, config load failure). Full suite green, tsc + biome clean.
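A pure CLI core with the exit-code contract described above might be shaped like this sketch. The names (`parseArgvSketch`, `runCliSketch`, `CLI_MODES`) and the JSON payload fields are hypothetical; only the flags, the six modes, and the 0/1/2 exit codes are taken from the commit message. The registry loader is injected so the core performs no IO itself.

```typescript
// Hypothetical names for illustration only; not the agent-runtime exports.
const CLI_MODES = ["code", "review", "research", "audit", "self-improve", "dynamic"] as const;
type CliMode = (typeof CLI_MODES)[number];
type CliRegistry = Partial<Record<CliMode, () => Promise<unknown>>>;

function isCliMode(value: string): value is CliMode {
  return (CLI_MODES as readonly string[]).includes(value);
}

// Accepts both "--mode code" and "--mode=code" forms.
function parseArgvSketch(argv: string[]): { mode?: string; config?: string } {
  const out: { mode?: string; config?: string } = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    const eq = arg.indexOf("=");
    const flag = eq === -1 ? arg : arg.slice(0, eq);
    if (flag !== "--mode" && flag !== "--config") continue;
    const value = eq === -1 ? argv[++i] : arg.slice(eq + 1);
    if (value === undefined) continue;
    if (flag === "--mode") out.mode = value;
    else out.config = value;
  }
  return out;
}

async function runCliSketch(
  argv: string[],
  loadRegistry: (configPath: string) => Promise<CliRegistry>,
): Promise<{ exitCode: 0 | 1 | 2; stdout: string }> {
  const { mode, config } = parseArgvSketch(argv);
  if (mode === undefined || config === undefined || !isCliMode(mode)) {
    return { exitCode: 2, stdout: "usage: --mode <mode> --config <module>" };
  }
  let registry: CliRegistry;
  try {
    registry = await loadRegistry(config);
  } catch (err) {
    return { exitCode: 2, stdout: `config load failed: ${String(err)}` };
  }
  const runner = registry[mode];
  if (!runner) return { exitCode: 2, stdout: `no runner registered for mode "${mode}"` };
  try {
    const output = await runner();
    return { exitCode: 0, stdout: JSON.stringify({ ok: true, mode, output }) };
  } catch (err) {
    // Recorded failure: the run happened and failed, which is distinct from a usage error.
    const error = err instanceof Error ? err.message : String(err);
    return { exitCode: 1, stdout: JSON.stringify({ ok: false, mode, error }) };
  }
}
```

A thin bin wrapper would then only need to pass the process arguments in, print `stdout`, and set the process exit code, which keeps the core unit-testable without spawning a process.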

drewstone added 5 commits May 31, 2026 03:18

chore(release): 0.36.0 — MCP delegate hardening (reviewer gate, winner-selection, no-op+secret floor, createKbGate)

97382c2

chore(release): 0.37.0 — runDelegatedLoop configured loop-runner over the hardened engines (#828)

eea49c9

drewstone added 4 commits May 31, 2026 03:41

chore(release): 0.38.0 — loop-runner default factories for all six modes (#828)

7f1f96e

chore(release): 0.39.0 — agent-runtime-loop schedulable bin (#828 complete)

210c43c

tangletools merged commit 6747248 into main May 31, 2026
0 of 2 checks passed

tangletools deleted the feat/mcp-delegate-hardening branch May 31, 2026 09:47

tangletools merged 9 commits into main from feat/mcp-delegate-hardening

2 participants