fix(coder): address #825 + #829 + #832 auto-review findings (final review pass) #834
Final cleanup pass to complete the coder branch for EM testing. Five Important + three Minor findings across the Phase 5/6/11 auto-reviews:

- tests/coder/test_self_fix/test_cli.py: replace try/except SystemExit with pytest.raises so an argparse regression that silently accepts the invalid severity can't silently pass the test. (#825, #829)
- tests/coder/test_integration_e2e.py: replace the no-op `os.environ["PATH"] = os.environ.get("PATH", "")` with an actual prepend of sys.executable's dir via monkeypatch.setenv so the change reverts after the test. (#829)
- tests/coder/test_fixes_827_828.py: drop the unused Path import. (#832)
- src/gaia/coder/self_fix/loop_driver.py: narrow the broad `except Exception` blocks around review_gate and notify_em to (RuntimeError, CalledProcessError, OSError) so programming errors surface. Preserves the loose-coupling intent. (#825)
- src/gaia/coder/self_fix/loop_driver.py + verifier.py: _append_notes / _append_note now raise ValueError on corrupted or wrong-type notes_json instead of silently replacing with []. The audit trail is canonical; silent replacement would mask regressions. (#825)

All 395 tests pass on coder HEAD.
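The first bullet is easier to judge with a concrete case. The sketch below is not the project's test code: `build_parser`, `loose_check`, and `strict_check` are hypothetical names, and the parser is a stand-in for the real CLI. It uses only the standard library to show why a bare try/except around `parse_args` cannot detect a parser that starts accepting an invalid severity, whereas a check that requires `SystemExit` (which is what `pytest.raises(SystemExit)` enforces) can.

```python
import argparse


def build_parser(strict: bool) -> argparse.ArgumentParser:
    """Hypothetical stand-in for the CLI parser.

    strict=False simulates the regression: the severity choices are lost,
    so any value is accepted.
    """
    parser = argparse.ArgumentParser(prog="self-fix")
    choices = ["minor", "important"] if strict else None
    parser.add_argument("--severity", choices=choices)
    return parser


def loose_check(parser: argparse.ArgumentParser) -> bool:
    """Old pattern: succeeds whether or not SystemExit is raised."""
    try:
        parser.parse_args(["--severity", "bogus"])
    except SystemExit:
        pass
    return True


def strict_check(parser: argparse.ArgumentParser) -> bool:
    """Strict pattern: succeeds only if SystemExit is actually raised."""
    try:
        parser.parse_args(["--severity", "bogus"])
    except SystemExit:
        return True
    return False
```

With the regressed parser, `loose_check` still reports success while `strict_check` reports failure, which is the gap the `pytest.raises` rewrite closes.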
Summary
Tight, well-scoped cleanup PR addressing auto-review findings from #825, #829, and #832. The two structural changes, narrowing

Issues Found
🟢 Minor: no unit tests for the new
The fail-loudly rewrite of

```python
# tests/coder/test_self_fix/test_notes_json.py
import pytest

from gaia.coder.self_fix.loop_driver import _append_notes
from gaia.coder.self_fix.verifier import _append_note


@pytest.mark.parametrize("fn", [_append_notes, _append_note])
def test_raises_on_corrupted_json(fn):
    with pytest.raises(ValueError, match="corrupted"):
        fn("{not-json", {"at": "t"})


@pytest.mark.parametrize("fn", [_append_notes, _append_note])
def test_raises_on_non_list_json(fn):
    with pytest.raises(ValueError, match="must be a JSON array"):
        fn('{"already": "an object"}', {"at": "t"})
```

Non-blocking; can land as a follow-up.

🟢 Minor: The narrowed

Strengths

Verdict
Approve: this is a clean finishing pass on the
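For readers without the source at hand, the fail-loudly behaviour that the suggested tests exercise might look roughly like the sketch below. This is an assumption, not the code in `loop_driver.py` or `verifier.py`: the function name `append_note`, its signature, and the exact error messages are hypothetical, chosen only so that they would satisfy the `match="corrupted"` and `match="must be a JSON array"` patterns in the proposed tests.

```python
import json


def append_note(notes_json: str, note: dict) -> str:
    """Append `note` to a JSON-array string, failing loudly on bad input.

    Hypothetical sketch: the real helpers may differ in signature and wording.
    """
    try:
        notes = json.loads(notes_json)
    except json.JSONDecodeError as exc:
        # JSONDecodeError already subclasses ValueError; re-raising adds a
        # message that names the field instead of a bare parser error.
        raise ValueError(f"notes_json is corrupted: {exc}") from exc
    if not isinstance(notes, list):
        # Valid JSON of the wrong shape is also rejected, never replaced by [].
        raise ValueError(
            f"notes_json must be a JSON array, got {type(notes).__name__}"
        )
    notes.append(note)
    return json.dumps(notes)
```

The design point is that neither failure path returns a fresh `[]`, so a damaged audit trail stops the run instead of being quietly overwritten.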
Summary
Final cleanup pass to complete the `coder` branch for EM testing. Five Important + three Minor findings across the Phase 5/6/11 auto-reviews. All 395 tests pass.

Changes
- `test_self_fix/test_cli.py`: use `pytest.raises` so a silent-pass regression in argparse can't pass the test. [#825, #829]
- `test_integration_e2e.py`: real `PATH` prepend via `monkeypatch.setenv` instead of a no-op assignment that leaked env. [#829]
- `test_fixes_827_828.py`: drop unused `Path` import. [#832]
- `loop_driver.py`: narrow broad `except Exception` around `review_gate` and `notify_em` to `(RuntimeError, CalledProcessError, OSError)`. Programming errors now surface per CLAUDE.md fail-loudly. [#825]
- `loop_driver.py` + `verifier.py`: `_append_notes` / `_append_note` raise `ValueError` on corrupted or wrong-type `notes_json` instead of silently replacing with `[]`. [#825]

Test plan
- `pytest tests/coder/ tests/eval/`: 395/395 pass
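The narrowed exception handling described in the changes can be illustrated with a small standalone sketch. It is not the project's `loop_driver.py`: `run_optional_step` and `EXPECTED_FAILURES` are hypothetical names, and it assumes that `CalledProcessError` refers to `subprocess.CalledProcessError`. The idea is that operational failures of a loosely coupled step are logged and tolerated, while programming errors such as `TypeError` propagate.

```python
import subprocess
from typing import Callable, List

# Assumption: the tuple mirrors the one named in the PR description, with
# CalledProcessError taken from the standard-library subprocess module.
EXPECTED_FAILURES = (RuntimeError, subprocess.CalledProcessError, OSError)


def run_optional_step(step: Callable[[], None], log: List[str]) -> bool:
    """Run a loosely coupled step, tolerating only expected failures.

    Returns True on success and False if an expected operational failure
    was caught and logged. Any other exception is left to propagate.
    """
    try:
        step()
    except EXPECTED_FAILURES as exc:
        log.append(f"{type(exc).__name__}: {exc}")
        return False
    return True
```

A missing executable (`FileNotFoundError`, a subclass of `OSError`) would be logged and skipped, whereas a wrong-argument bug inside the step would still crash the caller and show up in testing.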