Skip to content

fix(opencode): repair common tool-input shape failures before retry#29412

Open
paymog wants to merge 1 commit into
anomalyco:devfrom
paymog:fix/tool-input-repair
Open

fix(opencode): repair common tool-input shape failures before retry#29412
paymog wants to merge 1 commit into
anomalyco:devfrom
paymog:fix/tool-input-repair

Conversation

@paymog
Copy link
Copy Markdown

@paymog paymog commented May 26, 2026

Issue for this PR

Closes #26498

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Adds a validate-then-repair layer for tool-call arguments. When the schema decode of an LLM-emitted tool call fails, the parse error's own issue tree is walked to locate the failing paths, four targeted shape repairs are tried at those paths, and the input is re-decoded. Successful inputs are never touched.

The four repairs cover the failure modes that repeatedly come up against open-weight models in opencode issues (#26498, #26913, #24566, #22329, #24722) and similar codebases:

  1. null at an optional field → drop the key
  2. JSON-array-shaped string like '["a","b"]' → parse to a real array
  3. empty-object placeholder {} at an optional field → drop the key
  4. bare scalar where the schema expected an array (InvalidType / AnyOf leaf) → wrap as [scalar]

Ordering is load-bearing: (2) must precede (4), or a stringified array would be double-wrapped into ['["a","b"]']. The order is encoded in a single repairAt switch so it stays a single source of truth.

Effect Schema short-circuits at the first failing element of an array or struct, so a single repair pass can only fix one path. recover() loops decode → repair → decode up to 6 rounds; if no repair applies or the bound is hit, it surfaces the original schema-level error to the model, never a repair-induced cascade.

On success, Effect.annotateCurrentSpan("tool.input_repaired", ...) records which repairs fired at which paths, so per-(model, tool) repair rates can be watched in telemetry and regressions on a specific contract surface before users notice.

Two adjacent fixes for related failure modes:

  • FilePathInput (in packages/core/src/schema.ts): a schema for fields that flow to filesystem operations. Some open-weight models occasionally emit paths wrapped in markdown auto-links like "[notes.md](http://notes.md)" — a post-training distribution leak from chat-style output applied where it makes no sense. FilePathInput unwraps only the degenerate case (link text equals URL with protocol stripped); real markdown such as [click](https://example.com) passes through untouched. Used by read, write, edit, and lsp tools. Encoding the intent at the schema level plugs the leak for every path field at once.

  • read tool pairing hint: the runtime already defaults offset=1 and limit=2000 when one is omitted, but previously the model had no signal that this happened and could not self-correct on the next turn. Adds an informational note (no Error: prefix, so the TUI does not paint it red) inside the <content> block when only one of the pair was provided, telling the model what default was applied.

How this differs from #26496

#26496 closes the same issue with a DeepSeek-specific prompt-section that tells the model the expected JSON shape. That helps but it depends on the model following the instruction, scales linearly with the number of models, and pays a token cost on every turn. This PR fixes the contract at the parse boundary instead — the schema is the prior, repairs run only at paths the schema explicitly disagreed with, valid inputs pay zero overhead, and the same code path covers any model whose failures fall in the same shape catalogue. The two approaches are not mutually exclusive; the prompt-side change could still ship alongside this for the cases where prevention is cheaper than repair.

How did you verify your code works?

Tests (all in this PR):

  • packages/opencode/test/tool/repair.test.ts — 13 tests covering each repair in isolation, the load-bearing ordering case ('["a","b"]' must parse, not wrap), the recover() loop converging on a multi-path bad payload, and the unrepairable-failure case surfacing the original error.
  • packages/core/test/schema.test.ts — 7 tests for FilePathInput covering: plain paths pass through, degenerate auto-link unwraps in three URL-shape variants, real markdown links preserved, embedded auto-links inside longer paths handled correctly, the description annotation reaching the JSON Schema output.

Existing tests:

  • bun test test/tool/ — full opencode tool test suite (303 tests including JSON Schema snapshot tests for every tool, parameter accept/reject, and the existing Tool.define regression for InvalidArgumentsError) passes unchanged.
  • bun typecheck clean across both packages/opencode and packages/core.

The FilePathInput description annotation is placed on the encoded side of the decodeTo transformation so the JSON Schema emitted to the LLM still carries it — the snapshot tests for read, write, edit, and lsp confirm this.

Screenshots / recordings

N/A — no UI change.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

@github-actions
Copy link
Copy Markdown
Contributor

The following comment was made by an LLM, it may be inaccurate:

Related PR Found:

Note on #29361:

  • fix(opencode): normalize file tool argument aliases may be loosely related as it also deals with tool argument handling, though it focuses on aliases rather than shape repair.

No duplicate PRs found with the same repair functionality.

Open-weight models (deepseek, glm, qwen, ...) emit a small, repeatable set
of shape mistakes in tool-call arguments: null at optional fields,
stringified JSON arrays, empty-object placeholders, and bare scalars in
array positions. The current "Invalid tool input" prose error is rarely
enough for the model to find the fix on its own; it loops on the same
malformed call.

Adds a validate-then-repair layer: schema decode runs unchanged, valid
inputs are never touched. On failure, the parse error's own issue tree
localizes the bug, four targeted shape repairs run at the failing paths,
and we re-decode. Loops until the input parses, no further repair
applies, or the round bound is hit; on terminal failure the original
schema-level error is what reaches the model.

Adds FilePathInput for fields that flow to fopen/stat. A small fraction
of model emissions wrap paths in markdown auto-links like
"[notes.md](http://notes.md)" - a post-training distribution leak from
chat output applied where it makes no sense. FilePathInput unwraps only
the degenerate case where link text equals the URL with the protocol
stripped; real markdown passes through.

read tool: surfaces the offset/limit pairing decision back to the model
when only one was provided. The runtime already defaults the missing
side; previously the model had no signal that this happened and could
not self-correct on the next turn.

Closes anomalyco#26498
@paymog paymog force-pushed the fix/tool-input-repair branch from eb2d3cc to f8df27a Compare May 26, 2026 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DeepSeek tool calls can use malformed argument JSON

1 participant