Stage A: harden provider against truncated structured responses by Anmolnoor · Pull Request #3 · Anmolnoor/fcli

Anmolnoor · 2026-05-27T10:02:56Z

Context

Part 1 of the plan to fix the Provider returned invalid JSON for a structured response crash (traced from session 0e54161d). Root cause: a file.write action carries the entire file body inline in the plan JSON; the model hits its output-token budget mid-string, the JSON never closes, and the turn dies with a confusing parse error.

This stage doesn't change where content lives yet (that's Stage B) — it makes the truncation failure legible and recoverable so it stops masquerading as an invalid-JSON error.

What changed

New settings provider.max_output_tokens / provider.num_ctx, threaded into both adapters (Ollama → num_predict/num_ctx options; OpenAI → max_output_tokens).
Explicit truncation detection: Ollama done_reason == "length" and OpenAI status == "incomplete" now raise the new ProviderErrorCode.TRUNCATED with the partial response_text attached, instead of falling through to json.loads.
Truncation-aware repair: the planner's retry tells the model its output was cut off and to emit a shorter plan (use content_brief for large bodies — landing in Stage B), rather than the generic "not valid JSON" nudge that previously re-sent the same oversized request.
Self-diagnosing logs: a capped (4 KB) response_text is now persisted into the EXCEPTION / plan_generation_failed event payloads, so the NDJSON event log captures the raw bad response without needing --debug.

Tests

5 new (385 total, ruff clean): num_predict/num_ctx + max_output_tokens wiring; TRUNCATED on done_reason=length and status=incomplete; orchestrator recovery from a truncated plan carrying the content_brief hint.

Sequencing

Stage A of A→B→C→D. Independently shippable; Stage B (decoupling large writes) builds on the max_output_tokens and content_brief groundwork here.

🤖 Generated with Claude Code

When the model hits its output-token budget mid-plan, the JSON object is left unterminated and the turn died with a confusing "invalid JSON" error. This makes truncation a first-class, legible failure: - Add provider.max_output_tokens / num_ctx settings, threaded into both adapters. Ollama gets num_predict/num_ctx options; OpenAI gets max_output_tokens. - Detect truncation explicitly: Ollama done_reason=="length" and OpenAI status=="incomplete" now raise ProviderErrorCode.TRUNCATED with the partial response_text attached, instead of falling through to json.loads. - Planner repair retry is truncation-aware: it tells the model its output was cut off and to emit a shorter plan (use content_brief for large file bodies) rather than the generic "not valid JSON" nudge. - Persist a capped response_text into the EXCEPTION / plan_generation_failed event payloads so the NDJSON log is self-diagnosing without --debug. Tests: num_predict/num_ctx and max_output_tokens wiring, TRUNCATED on done_reason=length and status=incomplete, and orchestrator recovery from a truncated plan carrying the content_brief hint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Anmolnoor merged commit 5168f9f into main May 27, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stage A: harden provider against truncated structured responses#3

Stage A: harden provider against truncated structured responses#3
Anmolnoor merged 1 commit into
mainfrom
stage-a/truncation-hardening

Anmolnoor commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Anmolnoor commented May 27, 2026

Context

What changed

Tests

Sequencing

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant