fix: thread fixes_count_only through cycle verifiers (#0139) #167

Merged
fazxes merged 1 commit into main from feat/cycle-result-count-only-fallback
Apr 6, 2026

fazxes commented Apr 6, 2026

Summary

  • Adds fixes_count_only: int to the CycleResult TypedDict, set by _as_cycle_result when the agent returns a count-only payload (fixes_committed or fixes_applied) instead of a structured fixes[] list
  • Updates expected_fix_commits, allowed_total_cycle_commits, and expected_cycle_commits to fall back to fixes_count_only when fixes == []
  • Exports both verifier functions from nightshift/__init__.py
  • Adds 18 regression tests reproducing the Evaluation #15 payload shape
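
As a rough illustration of the first bullet, the normalization step might look like the sketch below. This is not the code from the PR: the `Fix` shape, the public name `as_cycle_result`, the `summary` field, and the precedence of `fixes_committed` over `fixes_applied` are all assumptions made for the example. Only `CycleResult`, `fixes`, `fixes_count_only`, `fixes_committed`, and `fixes_applied` come from the description above.

```python
from typing import Any, TypedDict


class Fix(TypedDict, total=False):
    # Hypothetical shape of one structured fix entry (not taken from the PR).
    title: str
    commit: str


class CycleResult(TypedDict):
    fixes: list[Fix]
    fixes_count_only: int  # fix count reported without a structured fixes[] list
    summary: str


def as_cycle_result(payload: dict[str, Any]) -> CycleResult:
    """Normalize an agent payload, keeping a count-only fix total if present."""
    fixes = payload.get("fixes") or []
    count_only = 0
    if not fixes:
        # Assumed precedence: fixes_committed first, then fixes_applied.
        for key in ("fixes_committed", "fixes_applied"):
            value = payload.get(key)
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, int) and not isinstance(value, bool) and value > 0:
                count_only = value
                break
    return CycleResult(
        fixes=list(fixes),
        fixes_count_only=count_only,
        summary=str(payload.get("summary", "")),
    )
```

In this sketch the count-only value is recorded only when the structured list is empty, so a payload that carries both keeps `fixes` as the source of truth.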

Root cause

Claude cycles returning {"fixes_committed": 1, "summary": "..."} instead of {"fixes": [...]} were falsely rejected with "structured output implies 0": the count landed only in prose notes, while all three verifier functions saw fixes == [] and therefore expected zero fix commits. This blocked the eval gate at 53/100.
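
The failure mode can be reproduced in a few lines. The two functions below are simplified stand-ins, not the real verifiers: their names, the plain-dict input, and the assumption of one expected commit per fix are illustrative only. They contrast reading `fixes` alone with falling back to `fixes_count_only` when the list is empty.

```python
def expected_fix_commits_before(result: dict) -> int:
    # Behaviour described in the root cause: only the structured list is read.
    return len(result.get("fixes", []))


def expected_fix_commits_after(result: dict) -> int:
    # Fallback: use the count-only total when no structured fixes exist.
    fixes = result.get("fixes", [])
    if fixes:
        return len(fixes)
    return int(result.get("fixes_count_only", 0))


# A normalized count-only cycle: one fix was committed, but fixes[] is empty.
count_only_cycle = {"fixes": [], "fixes_count_only": 1, "summary": "..."}

before = expected_fix_commits_before(count_only_cycle)
after = expected_fix_commits_after(count_only_cycle)
```

With the first function, a cycle that really committed one fix is compared against an expectation of zero, which is the mismatch reported as a violation. With the fallback, the expectation matches the reported count.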

Test plan

  • make check passes (1079 tests)
  • TestAsCycleResultCountOnlyPayload::test_eval_0015_single_fix_cycle_no_longer_produces_violation — end-to-end reproduction confirms no violations fire
  • TestAllowedTotalCycleCommits::test_count_only_one_fix_gives_range_1_to_3 — confirms range is (1,3) not (0,1)
  • TestExpectedFixCommits::test_returns_count_only_fallback_when_fixes_empty — confirms fallback returns 1 not 0

…e-reject (#139)

Claude cycles returning count-only payloads (fixes_committed/fixes_applied instead
of fixes[]) were false-rejected as "structured output implies 0" because the count
landed only in prose notes and all three verifier functions read fixes==[]. Adds
fixes_count_only: int to CycleResult; _as_cycle_result sets it; expected_fix_commits,
allowed_total_cycle_commits, and expected_cycle_commits fall back to it. 18 regression
tests reproduce the eval #15 payload shape and verify no violations fire.
fazxes merged commit accff7f into main on Apr 6, 2026
7 checks passed
fazxes deleted the feat/cycle-result-count-only-fallback branch on April 6, 2026 21:59
fazxes added a commit that referenced this pull request Apr 8, 2026
…clear

Two-cycle test run against Phractal confirms the three merged PRs fixed
the scored failures from eval #15. Both cycles accepted (no false
rejections), shift log persisted, counters positive. Score 86/100
exceeds the BUILD EVAL GATE threshold of >= 80.

- Guard rails: 4 -> 9 (PR #167 count-only payload fix working)
- Shift log: 0 -> 9 (durable artifact written and co-committed)
- Fix quality: 3 -> 8 (full metadata in accepted cycles)
- Discovery: 5 -> 8 (structured output preserved faithfully)