Skip to content

fix(tuning): actually apply approved proposals (codex P1)#34

Merged
dep0we merged 1 commit into
mainfrom
fix/p1-tuning-apply-reconcile
May 7, 2026
Merged

fix(tuning): actually apply approved proposals (codex P1)#34
dep0we merged 1 commit into
mainfrom
fix/p1-tuning-apply-reconcile

Conversation

@dep0we
Copy link
Copy Markdown
Owner

@dep0we dep0we commented May 7, 2026

Summary

  • Truth A confirmed: docs/spec/11-tuning.md is unambiguous — --apply must write approved diffs. The implementation guide shows the exact apply_diff(target, p.proposed_diff, helper) call. A previous session wrote a CHANGELOG entry claiming "v0.3 does not auto-write" but this was rationalization of a bug, not intent.
  • apply_proposals now reads each accepted proposal's proposed_diff and writes the target file via atomic_write, enforcing tools.md write_paths.
  • parse_report_proposals now also extracts proposed_diff from the diff code fence in the report body (was left empty before).
  • Instructional diffs (comment-heavy, multi-step like promotable-memory proposals) are auto-detected and flagged as manual apply required with a skip_reason in history rather than failing silently.
  • --dry-run with --apply previews what would change without writing.
  • CHANGELOG corrected.

Changed files

  • atomic_agents/tuning.pyparse_report_proposals (diff extraction), new helpers _load_write_paths, _diff_is_auto_applicable, _apply_diff_to_file, full rewrite of apply_proposals body, CLI output
  • tests/test_tuning.py — 14 new regression tests (310 total, all green)
  • CHANGELOG.md — corrects the wrong "v0.3 does not auto-write" claim

Test plan

  • uv run python -m pytest tests/test_tuning.py -v → 39/39 pass
  • uv run python -m pytest --tb=short → 310/310 pass
  • Core regression test_apply_writes_accepted_proposal_to_filesummary["applied"] == 1 and file actually contains the diff
  • test_apply_does_not_write_on_dry_run — file unchanged, history written
  • test_apply_respects_write_path_enforcement — skipped with write_path violation reason
  • test_apply_skips_instructional_diffs — skipped with manual-apply reason
  • test_parse_report_proposals_extracts_diff — diff round-trip works
  • test_apply_history_records_applied_true_when_writtenapplied=True in history

🤖 Generated with Claude Code

--apply previously only logged decisions; now writes the approved
persona/memory edits via atomic_write, respecting tools.md
write_paths. --dry-run added for safe preview.

parse_report_proposals now extracts proposed_diff from the diff
code fence in the report body so the applier has the actual change.
Instructional diffs (comment-heavy, multi-step) are flagged as
manual-apply with a skip_reason rather than failing silently.
The CHANGELOG entry that wrongly claimed v0.3 doesn't auto-write
has been corrected.

Adds 14 regression tests covering: diff applicability gating,
context-line insertion, new-file creation, the core apply→file-write
path, dry-run no-write guarantee, write_path enforcement, instructional
diff skipping, diff round-trip through parse_report_proposals, and
applied=True in history when a file is written.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dep0we dep0we merged commit 4ab1b23 into main May 7, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant