@ammar-agent
Collaborator

Problem

The file_edit_insert tool was adding an extra newline when content already ended with \n, causing failures on simple tasks like terminal-bench's hello-world where file content precision matters.

Example failure:

  • Task: Create hello.txt with "Hello, world!\n"
  • Agent: file_edit_insert(content="Hello, world!\n")
  • Result: File contains "Hello, world!\n\n" (double newline!)
  • Test: ❌ FAILED - Expected exactly one newline

Root Cause

The tool's implementation (lines 96-97):

const newLines = [...lines.slice(0, line_offset), content, ...lines.slice(line_offset)];
const newContent = newLines.join("\n");

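When content is inserted as an array element, join("\n") places a "\n" separator between it and whatever element follows, so content effectively gains a trailing newline whenever it is not the last element. When content already ends with "\n", the two combine into a double newline.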
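
The doubling can be reproduced in isolation. The sketch below wraps the two quoted lines in a hypothetical helper, `insertNaive`, and assumes the file is split on "\n" before insertion; neither the helper name nor the split step is copied from the tool's source.

```typescript
// Hypothetical helper mirroring the two quoted lines; not the tool's actual code.
function insertNaive(fileContent: string, lineOffset: number, content: string): string {
  // Assumption: the file is split on "\n" before the insert.
  const lines = fileContent.split("\n");
  const newLines = [...lines.slice(0, lineOffset), content, ...lines.slice(lineOffset)];
  return newLines.join("\n");
}

// An empty file splits into [""], so content is followed by one empty element
// and join("\n") writes a separator after it.
const doubled = insertNaive("", 0, "Hello, world!\n");
console.log(JSON.stringify(doubled)); // "Hello, world!\n\n"
```

Printing through `JSON.stringify` keeps the newlines visible as escape sequences, which makes the extra "\n" easy to spot.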

The tool was designed to add a trailing newline, but this behavior was:

  1. Undocumented in the tool description
  2. Unexpected by the agent, which naturally includes \n when instructed to "make sure it ends in a newline"
  3. Causing subtle bugs where exact file content matters

Solution

Smart newline detection - If content already ends with \n, strip it before joining to prevent doubling:

const contentEndsWithNewline = content.endsWith("\n");
const normalizedContent = contentEndsWithNewline ? content.slice(0, -1) : content;

This matches the agent's natural expectation - if they explicitly add \n, it should be preserved as-is without doubling.
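
As a rough illustration of how the normalization slots into the insert path, here is a minimal sketch. The function name `insertNormalized` and the split on "\n" are assumptions made for the example; only the two `const` lines come from the change itself.

```typescript
// Hypothetical wrapper around the normalization; not the tool's actual code.
function insertNormalized(fileContent: string, lineOffset: number, content: string): string {
  // Assumption: the file is split on "\n" before the insert.
  const lines = fileContent.split("\n");
  // Drop one trailing "\n" so the separator added by join("\n") is the only one.
  const contentEndsWithNewline = content.endsWith("\n");
  const normalizedContent = contentEndsWithNewline ? content.slice(0, -1) : content;
  const newLines = [...lines.slice(0, lineOffset), normalizedContent, ...lines.slice(lineOffset)];
  return newLines.join("\n");
}

console.log(JSON.stringify(insertNormalized("", 0, "Hello, world!\n"))); // "Hello, world!\n"
console.log(JSON.stringify(insertNormalized("", 0, "a\nb\n")));          // "a\nb\n"
```

Only a single trailing "\n" is removed, so content that deliberately ends in a blank line ("...\n\n") still keeps one of its newlines in addition to the separator.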

Testing

Added 2 regression tests:

  • ✅ Content with trailing newline (the hello-world case)
  • ✅ Multiline content with trailing newline

All 14 tests pass (12 existing + 2 new).

Impact

Estimated: +5-10% accuracy on terminal-bench tasks

Direct fixes:

  • hello-world task (was failing 1/2 tests, now should pass both)
  • Likely fixes other tasks where exact file content matters

Indirect improvements:

  • Agent won't avoid tool after encountering unexpected behavior
  • Fewer edge case bugs in config files, data formats, etc.

Analysis Source

Full analysis: terminal-bench-results/analysis_run_18894357631.md

From run 18894357631:

  • 40% accuracy (16/40 tasks)
  • This bug identified as Priority 1 critical issue
  • Several other tasks showed partial failures (5/7 tests, 4/5 tests) that may also benefit from this fix

Generated with cmux

The file_edit_insert tool was adding an extra newline when content already
ended with \n, causing failures on simple tasks like terminal-bench's
'hello-world' where precision matters.

Root cause: The tool splits the file into lines, inserts content as one
element, and joins with "\n", which puts a newline after content whenever
another element follows. When content already ends with "\n", this
produces double newlines.

Fix: Detect if content ends with \n and normalize before joining. This
matches the agent's natural expectation - if they explicitly add \n, it
should be preserved as-is without doubling.

Test coverage:
- Added regression test for content with trailing newline
- Added test for multiline content with trailing newline
- All 14 tests pass including existing behavior tests

Impact: Fixes terminal-bench hello-world task and likely improves accuracy
on other tasks where exact file content matters (configs, data files).

Analysis: terminal-bench-results/analysis_run_18894357631.md


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Codex caught a regression: when inserting at EOF, the previous fix would
strip the trailing newline from content, but join() wouldn't add it back
since there's nothing after it.

Example:
  File: "line1\nline2" (no trailing newline)
  Insert "line3\n" at offset 2 (end of file)
  Previous: "line1\nline2\nline3" (newline lost!)
  Now: "line1\nline2\nline3\n" (preserved ✓)

Solution: Only strip trailing newline if we're NOT inserting at EOF.
When at EOF, preserve the content's trailing newline as-is.
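
The EOF rule can be sketched as follows. This is an illustration under assumptions, not the merged code: `insertEofAware` is a hypothetical name, the file is assumed to be split on "\n", and "at EOF" is assumed to mean that the offset is at or past the number of lines.

```typescript
// Hypothetical sketch of the EOF-aware rule; not the merged implementation.
function insertEofAware(fileContent: string, lineOffset: number, content: string): string {
  // Assumption: the file is split on "\n" before the insert.
  const lines = fileContent.split("\n");
  // Assumption: "at EOF" means no element follows the inserted content.
  const insertingAtEof = lineOffset >= lines.length;
  // Strip the trailing "\n" only when join("\n") will write a separator after content.
  const normalizedContent =
    !insertingAtEof && content.endsWith("\n") ? content.slice(0, -1) : content;
  const newLines = [...lines.slice(0, lineOffset), normalizedContent, ...lines.slice(lineOffset)];
  return newLines.join("\n");
}

// The example from this commit message: the newline on "line3\n" survives.
console.log(JSON.stringify(insertEofAware("line1\nline2", 2, "line3\n"))); // "line1\nline2\nline3\n"
// The hello-world case still yields exactly one newline.
console.log(JSON.stringify(insertEofAware("", 0, "Hello, world!\n")));     // "Hello, world!\n"
```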

Added regression test for this case.

Co-authored-by: chatgpt-codex-connector
@ammario ammario added this pull request to the merge queue Oct 29, 2025
Merged via the queue into main with commit 0d5f997 Oct 29, 2025
18 of 19 checks passed
@ammario ammario deleted the tb-fix-file-edit-insert branch October 29, 2025 15:57

2 participants