Release v0.3.1 — inspector + runtime monitor · SollanSystems/loop-engineer

Adversarial-fix milestone. The v0.3.0 release closed two false-positive classes
in the anti-cheat scanner; a GPT-5.5/xhigh codex challenge over the v0.3.0 diff
then found the opposite blind spot: false negatives, i.e. evasion paths the
scanner failed to flag, plus boundary-validation gaps in three harness scripts.
This patch closes them.

Fixed

Anti-cheat scanner false-negatives (P1.1–P1.5) — scripts/anticheat_scan.py

Scoped self-exclusion (P1.1). A scanner self-edit that empties or shrinks
DEFAULT_GATE_PATHS / _ADDED_LINE_SIGNATURES is now graded critical
(FailedSafety); additive and comment-only self-edits stay clean. Removed
entries are compared semantically, so a reorder or reformat is not flagged.

Delete + rename evidence (P1.2). parse_changed_files now also captures
gate files that are deleted (+++ /dev/null) or renamed
(rename from / rename to); both of Codex's exact exploit diffs now return
clean:false.

verify-* gate coverage (P1.3). Gate-path matching now covers
verify-fast / verify-full / verify-safety; tampering with any of them to
bypass it is flagged.

Broader tautology detection (P1.4). Identical-operand assertions (a literal
or an identifier compared against itself) and always-true unittest calls are
now downgraded to FailedUnverifiable; honest asserts with distinct operands
stay clean.

Path-shaped hidden-answer names (P1.5). Trajectory reads of held-out /
hold_out / answer-key / golden / expected-output paths are flagged, while a
plain assert result == expected stays clean.
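
The identical-operand check described in P1.4 can be illustrated with Python's ast module. This is a minimal sketch, not the code in scripts/anticheat_scan.py: the function names is_tautological_assert and _same_operand are hypothetical, the set of unittest calls covered (assertTrue(True), assertEqual(x, x)) is an assumption, and the mapping to FailedUnverifiable is left out.

```python
import ast


def _same_operand(left: ast.AST, right: ast.AST) -> bool:
    """True when both nodes are the same literal or the same identifier."""
    if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
        # Require matching types so that 1 == 1.0 is not treated as identical.
        return type(left.value) is type(right.value) and left.value == right.value
    if isinstance(left, ast.Name) and isinstance(right, ast.Name):
        return left.id == right.id
    return False


def is_tautological_assert(source_line: str) -> bool:
    """Hypothetical helper: flag assertions that cannot fail in practice."""
    try:
        tree = ast.parse(source_line.strip())
    except SyntaxError:
        # A fragment that does not parse on its own is left to other checks.
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Assert):
            test = node.test
            # assert True
            if isinstance(test, ast.Constant) and test.value is True:
                return True
            # assert 1 == 1, assert x == x, assert x is x
            if (
                isinstance(test, ast.Compare)
                and len(test.ops) == 1
                and isinstance(test.ops[0], (ast.Eq, ast.Is))
                and _same_operand(test.left, test.comparators[0])
            ):
                return True
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            name, args = node.func.attr, node.args
            # self.assertTrue(True)
            if (
                name == "assertTrue"
                and args
                and isinstance(args[0], ast.Constant)
                and args[0].value is True
            ):
                return True
            # self.assertEqual(a, a)
            if name == "assertEqual" and len(args) >= 2 and _same_operand(args[0], args[1]):
                return True
    return False
```

Under these assumptions, assert x == x and self.assertEqual(a, a) are flagged, while assert result == expected is not, which matches the "distinct operands stay clean" behaviour described above. Comparing parsed operands rather than raw text means differences in whitespace do not affect the result.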

Boundary validation (P1.6, P2.1–P2.4)

scripts/benchmark_harness.py: compare() now raises on a mismatched A/B
task set instead of silently reporting a delta; non-bool claimed_done /
verification_passed values and out-of-range repair / criteria counts are
rejected.

scripts/runtime_monitor.py: score parsing now handles 1e-3, negatives, and
malformed input without crashing; tests pin the exact intervention per scenario.

scripts/inspect_loop.py: a bounded shallow walk with a per-file read cap
replaces the unbounded full-tree traversal.
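
The score-parsing fix in scripts/runtime_monitor.py can be pictured as a parser that returns a sentinel instead of raising. The sketch below is an illustration only: parse_score is a hypothetical name, and the choices to return None for malformed input, to reject booleans, and to reject NaN and infinity are assumptions, since the release notes only state that 1e-3, negatives, and malformed input no longer crash the monitor.

```python
import math
from typing import Optional


def parse_score(raw: object) -> Optional[float]:
    """Hypothetical helper: return a finite float, or None if raw is not a usable score."""
    if isinstance(raw, bool):
        # bool is a subclass of int; treat it as malformed rather than 0.0 / 1.0.
        return None
    if isinstance(raw, str):
        raw = raw.strip()
    elif not isinstance(raw, (int, float)):
        return None
    try:
        # float() accepts scientific notation ("1e-3") and signed values ("-2.5").
        value = float(raw)
    except (ValueError, OverflowError):
        return None
    # Assumption: NaN and infinities are not meaningful scores.
    return value if math.isfinite(value) else None
```

A caller can then branch on None explicitly, for example by skipping the sample or choosing a conservative intervention, rather than letting an exception escape the monitor.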

Changed (P2.5)

README.md — present-tense install note corrected to "all 9 skills".
.claude-plugin/plugin.json — version 0.3.0 → 0.3.1.

Credits

The false-negative and boundary findings came from the GPT-5.5/xhigh
codex adversarial review over the v0.3.0 release diff.
