v0.3.4 — dogfood-driven hardening + portable core
Dogfood-driven hardening: ran `loop-inspector` + `loop-runtime-monitor` against 9 real
on-disk loops (foreign and in-house). The tools had been built and tested only against
this suite's own well-formed loops, so first contact with foreign/edge-case inputs exposed
six defects — all fixed here under TDD, each pinned by a regression test.
Fixed
- (P1) `inspect_loop` no longer crashes on a malformed `manifest.yaml`. `read_manifest`
  (`loop/contract.py`) ran `yaml.safe_load` without a guard (the one read path missing the
  `json.JSONDecodeError` guard every JSON read already had), so a malformed manifest in an
  untrusted/foreign loop dir killed the inspector with a traceback instead of returning a
  report. It now fails safe to `{}`, fixing the crash for `inspect_loop`, `validate_contract`,
  and `doctor_report` at once.
- `inspect_loop` now scores `SPEC.md` / `WORKFLOW.md` / `TASKS.json` dual-location (`.loop/`
  ∪ workspace root), like `manifest`/`state` already resolved. Previously SPEC/WORKFLOW were
  hard-coded to the workspace root, so a loop whose contract lives under `.loop/` (including
  loop-engineer's own repo) was falsely scored as having "no success criteria" / "no
  independent verification." Scores on substance, not on where the file sits.
- `inspect_loop` recognizes a single-file `loop-contract.md` as a contract-owned source
  for success criteria, approval gates, plan-then-execute, and terminal-state coverage, so a
  committed minimal-contract loop that names all 7 terminal states is no longer scored 0/7.
- `runtime_monitor` is terminal-state-aware. It now reads `terminal_state` /
  `state == "terminal"` and reports `recommendation: "done"` (surfacing the terminal state)
  instead of advising `continue` on a loop that has already finished.
- `runtime_monitor` no longer reports an unparseable RUNLOG as healthy. A non-empty
  RUNLOG that yields zero parseable iteration records now returns `status: "degraded"` /
  `recommendation: "replan"` (with evidence) instead of the benign `ok` / `continue` / `[]`
  that was byte-identical to a healthy loop. This makes the silent inertness of
  stall/repair-churn detection on prose RUNLOGs visible.
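
The two `runtime_monitor` fixes above amount to a three-way decision, which the following minimal sketch illustrates. It is not the shipped implementation: the function names `parse_runlog` and `monitor`, the one-JSON-object-per-line RUNLOG format, the `evidence` key, and the `status` value returned for a finished loop are all assumptions made for illustration. Only the `terminal_state` / `state == "terminal"` check and the `done`, `degraded` / `replan`, and `ok` / `continue` outcomes come from the notes.

```python
import json


def parse_runlog(text: str) -> list[dict]:
    """Return the iteration records that parse; skip everything else.

    Assumption: one JSON object per line. The real RUNLOG format may differ.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # prose or otherwise unparseable line
        if isinstance(record, dict):
            records.append(record)
    return records


def monitor(state: dict, runlog_text: str) -> dict:
    """Classify a loop from its state and RUNLOG (hypothetical helper)."""
    # 1. A loop that already finished is reported as done, never "continue".
    terminal = state.get("terminal_state")
    if terminal or state.get("state") == "terminal":
        return {
            "status": "ok",  # assumed; the notes only specify the recommendation
            "recommendation": "done",
            "terminal_state": terminal,
            "evidence": [],
        }

    # 2. A non-empty RUNLOG with zero parseable records is flagged, not passed.
    records = parse_runlog(runlog_text)
    if runlog_text.strip() and not records:
        return {
            "status": "degraded",
            "recommendation": "replan",
            "evidence": ["RUNLOG is non-empty but yielded 0 parseable iteration records"],
        }

    # 3. Stall / repair-churn checks over `records` would run here.
    return {"status": "ok", "recommendation": "continue", "evidence": []}
```

Checking for the unparseable case before any stall detection is the point of the fix: without it, a prose RUNLOG and a healthy loop produce the same result.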
Changed
- Removed the unreferenced broad-substring corpus scoring path from
  `scripts/inspect_loop.py`
  (`_gather_corpus`, `_walk_bounded`, `_evaluate_checks`, `_terminal_states_covered`), dead
  code since the keyword-stuffing fix replaced it with the typed-contract path. Corrected
  `loop-inspector/SKILL.md` and `reference/patterns.md` §4 to describe the actual named,
  typed, dual-located contract file set the inspector reads, rather than a "reads any foreign
  harness shape semantically" claim the implementation never honored.
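
As a rough picture of what a "named, dual-located contract file set" means in practice, the sketch below looks up each named file in `.loop/` and then in the workspace root. The helper names and the lookup order are assumptions: the notes say both locations are searched but not which one wins, and the real inspector may read more files than the four listed here.

```python
from pathlib import Path
from typing import Optional

# File names taken from these notes; the real inspector's set may be larger.
CONTRACT_FILES = ("SPEC.md", "WORKFLOW.md", "TASKS.json", "loop-contract.md")


def resolve_contract_file(workspace: Path, name: str) -> Optional[Path]:
    """Return the first existing copy of `name`, or None if it is absent.

    Assumption: `.loop/` is checked before the workspace root.
    """
    for candidate in (workspace / ".loop" / name, workspace / name):
        if candidate.is_file():
            return candidate
    return None


def present_contract_files(workspace: Path) -> dict[str, Path]:
    """Map each named contract file that exists to where it was found."""
    found = {}
    for name in CONTRACT_FILES:
        path = resolve_contract_file(workspace, name)
        if path is not None:
            found[name] = path
    return found
```

Because the lookup is by exact file name, an arbitrary file that merely mentions the right keywords is never picked up, which is the behavior the corrected documentation describes.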
Added
- `pyproject.toml`: the portable core is now installable with `pip install -e .`
  (optional `pip install -e ".[yaml]"` for faster manifest parsing), so
  `python3 -m loop doctor|inspect <workspace>` runs from any directory rather than only the
  repo root. The core stays pure-stdlib; PyYAML remains an optional extra. A new
  `test_docs_version` check pins the `pyproject.toml` version to `.claude-plugin/plugin.json`.
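
A version pin of this kind can be sketched as a comparison between the two files' version strings. This is an illustration, not the actual `test_docs_version` code: the helper names are hypothetical, and it assumes the version lives under `[project]` in `pyproject.toml` and under a top-level `"version"` key in `plugin.json`.

```python
import json
import re

try:  # tomllib ships with Python 3.11+
    import tomllib
except ModuleNotFoundError:
    tomllib = None


def pyproject_version(text: str) -> str:
    """Extract [project].version from pyproject.toml content."""
    if tomllib is not None:
        return tomllib.loads(text)["project"]["version"]
    # Fallback for older interpreters: first line of the form `version = "..."`.
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found in pyproject.toml")
    return match.group(1)


def plugin_version(text: str) -> str:
    """Extract the version from plugin.json content (assumed top-level key)."""
    return json.loads(text)["version"]


def versions_match(pyproject_text: str, plugin_text: str) -> bool:
    """True when both files declare the same version string."""
    return pyproject_version(pyproject_text) == plugin_version(plugin_text)
```

A test would read both files from the repository and assert `versions_match(...)`, so a release that bumps one file but not the other fails in CI.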
Documentation
- README: the Portable validator / inspector section documents the editable install for
  running outside the repo root; the 30-second `inspect` demo now shows the full
  `target`/`present`/`gaps` report; the `doctor` block notes the omitted `paths` object;
  `validate`/`verify` are documented as `doctor` aliases; `terminal_state.json` is noted as
  resolving in either `.loop/` or the workspace root.
- `examples/coverage-repair` records receipts at the canonical `.loop/receipts/*.jsonl` (was
  the stale pre-decoupling `.gsd/audit/receipts/` path, inconsistent with the example's own
  `.loop/` layout).
- `loop-runtime-monitor/SKILL.md` frames its position generically ("vs a loop-driving operator")
instead of naming a private plugin agent.
Erratum (2026-06-30): the Documentation note above overstated examples/coverage-repair — the frozen example ships contract artifacts, not a receipts trail. Corrected in the CHANGELOG Errata section; as of the M2 launch slice the example is fully runnable with a committed real holdout-gate verdict.