fix(file_write): non-greedy tag, last-block fence, schema content fallback, debug placeholder#246
Open
voidborne-d wants to merge 1 commit intolsdefine:mainfrom
Open
Conversation
…lback, debug placeholder Closes lsdefine#241. - ga.py: <file_content> tag match becomes non-greedy + uses last tag/fence to avoid swallowing prose between unrelated triple backticks. - ga.py: file_write accepts optional args.content as a direct fallback when reply body has no <file_content> or trailing code block. - agent_loop.py: _clean_content replaces <file_content> with a length placeholder instead of stripping silently, so writes are visible in logs. - tools_schema(_cn).json: declare optional content parameter on file_write.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #241.
Summary
Issue #241 reports 3 extraction bugs + 1 design gap in
file_write. Reproduced and fixed all four with a 12-line, 4-file change. Net code size change: 0 (LoC down by 1 inga.py, +1 inagent_loop.py, +1 each in the two schema files).Changes
Bug 1 — greedy
(.*)swallows multi-tag bodies (ga.py)<file_content[^>]*>(.*)</file_content>matched to the last</file_content>in the reply, so two adjacent<file_content>blocks (or one whose body contains the literal closing string) produced corrupt content. Switched tore.findall(... .*? ...)and take the last match — non-greedy + last-wins matches the LLM's most-recent intent.Bug 2 — first-fence-to-last-fence span (
ga.py)The fallback
text.find('\``'), text.rfind('```')returned the entire span between the first and last triple backtick — when the reply contained a prose code snippet *before* the file-content fence, the prose plus its closing fence got concatenated into the file. Replaced withre.findall(r"```[^\n\`]\n([\s\S]?)```", ...)` and take the last fence body. Single-fence behavior unchanged.Bug 3 —
<file_content>silently stripped from logs (agent_loop.py)_clean_contentremoved<file_content>...</file_content>entirely, so when a write turned out wrong there was no way to confirm what the model actually emitted. Now substitutes a<file_content: N chars>placeholder — visible in turn logs, zero token overhead, debugging restored.Bug 4 — schema lacks a
contentparameter (tools_schema*.json)file_write's parameter set was justpath+mode; if reply-body parsing missed the content (Bugs 1/2 territory or bad LLM formatting) there was no fallback. Added an optionalcontentstring.do_file_writenow doesargs.get(\"content\") or extract_robust_content(response.content)— body-extraction stays the canonical path, schema arg is a tail-end safety net.Verification (no test infra in repo, so behavior verified by direct run)
python -m py_compile ga.py agent_loop.pyclean. Both schemas parse as valid JSON.Notes
contentfield calls it[Optional fallback]so the LLM keeps preferring<file_content>(preserves token-efficient streaming).ga.py, +1 elsewhere). Aligns with the project's "ideally negative or zero" line-count guidance.