What variant of Codex are you using?
CLI
What feature would you like to see?
I'd like Codex CLI to support an experimental hash-anchored edit mode (alongside apply_patch) to improve edit reliability, especially when models struggle with strict patch formatting.
The core idea is to let read/search tools return stable per-line anchors (short content-hash IDs), and let an edit tool target those anchors directly instead of reproducing exact old text.
Research-driven motivation (from the Harness Problem write-up):
- Benchmark shape: 16 models, 3 edit formats, 180 tasks per run, 3 runs.
- Reported patch-format failure rates were high for several non-Codex models (examples in the post: Grok 4 at 50.7%, GLM-4.7 at 46.2%).
- The write-up reports that hashline-style anchored edits matched or beat replace-style edits for most tested models.
- Reported impact examples: Grok Code Fast 1 from 6.7% to 68.3%, and Grok 4 Fast output tokens down 61% due to fewer edit retry loops.
Proposed CLI PoC scope:
- Add an experimental edit tool (or mode), e.g. apply_anchored_edit, behind a feature flag.
- Return line anchors from read/search output (example format: lineNumber:hash|content).
- Support operations like replace_range, insert_after, and delete_range using anchor IDs.
- Enforce optimistic concurrency: if anchors no longer match current file content, reject safely with a clear conflict message.
- Keep full backward compatibility by preserving apply_patch and using anchored mode as opt-in (or adaptive fallback).
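To make the proposed scope concrete, here is a minimal Python sketch of the anchor format and the three operations. It is illustrative only and not taken from Codex CLI or the referenced write-up: the helper names (line_hash, render_with_anchors, resolve), the AnchorConflict exception, the use of SHA-1, and the 6-character hash length are all assumptions. An actual implementation would presumably live in the CLI's own codebase and could choose differently.

```python
import hashlib


class AnchorConflict(Exception):
    """Raised when an anchor no longer matches the current file content."""


def line_hash(text: str, length: int = 6) -> str:
    # Short content hash; SHA-1 and the length are illustrative assumptions.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:length]


def render_with_anchors(lines: list[str]) -> list[str]:
    # Emit "lineNumber:hash|content" for each line (1-indexed).
    return [f"{i}:{line_hash(line)}|{line}" for i, line in enumerate(lines, start=1)]


def resolve(lines: list[str], anchor: str) -> int:
    # An anchor is "lineNumber:hash". Return the 0-based index, or raise
    # if the line is gone or its content has changed since it was read.
    num_str, _, expected = anchor.partition(":")
    index = int(num_str) - 1
    if not 0 <= index < len(lines) or line_hash(lines[index]) != expected:
        raise AnchorConflict(
            f"anchor {anchor} does not match current content; re-read the file"
        )
    return index


def apply_anchored_edit(lines, op, start, end=None, new_lines=()):
    # Resolve every anchor before building the result, so a stale edit
    # is rejected without changing anything (optimistic concurrency).
    first = resolve(lines, start)
    last = resolve(lines, end) if end is not None else first
    if last < first:
        raise ValueError("end anchor precedes start anchor")
    if op == "replace_range":
        return lines[:first] + list(new_lines) + lines[last + 1:]
    if op == "insert_after":
        return lines[:first + 1] + list(new_lines) + lines[first + 1:]
    if op == "delete_range":
        return lines[:first] + lines[last + 1:]
    raise ValueError(f"unknown operation: {op}")


# Usage: the model copies the anchor from read output instead of the old text.
source = ["def add(a, b):", "    return a - b", ""]
anchored = render_with_anchors(source)
bug_anchor = anchored[1].split("|", 1)[0]
fixed = apply_anchored_edit(
    source, "replace_range", bug_anchor, new_lines=["    return a + b"]
)
```

Reusing bug_anchor against fixed raises AnchorConflict, because the hash was computed from the old line. That is the safe-rejection behavior described in the concurrency bullet.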
Why this matters:
Suggested acceptance criteria:
- Add a benchmark mode for edit-tool reliability (failure rate + retries + token usage).
- Demonstrate reduced failed edit attempts vs. patch-only baseline on the same task set.
- Show no correctness regressions on existing Codex CLI edit workflows.
- Keep the feature gated/experimental until reliability data is strong.
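As a rough illustration of what the benchmark mode could report, the sketch below aggregates the three metrics named above (failure rate, retries, token usage) per edit tool. The EditAttempt record and the summarize function are hypothetical names invented for this example. No such schema exists in Codex CLI today, as far as this proposal assumes.

```python
from dataclasses import dataclass


@dataclass
class EditAttempt:
    # Hypothetical per-task record a benchmark run might emit.
    task_id: str
    tool: str            # e.g. "apply_patch" or "apply_anchored_edit"
    succeeded: bool      # whether the edit was eventually applied
    retries: int         # extra edit attempts after the first
    output_tokens: int   # tokens the model produced for this task


def summarize(attempts: list[EditAttempt], tool: str) -> dict[str, float]:
    # Aggregate the metrics for one tool over the same task set.
    rows = [a for a in attempts if a.tool == tool]
    if not rows:
        raise ValueError(f"no attempts recorded for tool {tool!r}")
    n = len(rows)
    return {
        "failure_rate": sum(not a.succeeded for a in rows) / n,
        "mean_retries": sum(a.retries for a in rows) / n,
        "mean_output_tokens": sum(a.output_tokens for a in rows) / n,
    }
```

Comparing summarize(attempts, "apply_patch") against summarize(attempts, "apply_anchored_edit") on identical task IDs would give the patch-only baseline comparison described in the second criterion.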
Additional information
Primary reference: https://blog.can.ac/2026/02/12/the-harness-problem/
Related prior issue: #11601
If maintainers prefer a minimal first step, a benchmark-only branch (no default behavior change) would still be very useful to validate feasibility in Codex CLI.