fix(llm): surface code, type, and nested fields on provider stream errors by kitlangton · Pull Request #28757 · anomalyco/opencode

kitlangton · 2026-05-22T02:03:54Z

Issue for this PR

Type of change

Bug fix
New feature
Refactor / code improvement
Documentation

What does this PR do?

OpenAI Responses and Anthropic Messages stream-error handlers were collapsing provider failures into opaque generic strings. That made rate limits, context-length errors, model overloads, and nested OpenAI response.error payloads hard to diagnose.

This PR surfaces richer provider-error messages:

OpenAI Responses reads top-level event.{code,message,param} and nested event.response.error.{code,message,param}.
Anthropic Messages prefixes error.type when available.
Empty OpenAI error messages are treated as absent so the code fallback still appears.
Partial Anthropic proxy/gateway error payloads still parse when one of type or message is missing.
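
The OpenAI lookup order listed above can be sketched as follows. This is a minimal illustration, not the code in the diff: the interface names, the `present` helper, and `describeOpenAIError` are all hypothetical, and the sketch leaves `param` out of the rendered message because the description does not say how it is displayed.

```typescript
// Hypothetical shapes for illustration; the real schema in packages/llm may differ.
interface ErrorDetail {
  code?: string | null
  message?: string | null
  param?: string | null
}

interface OpenAIStreamErrorEvent extends ErrorDetail {
  type: string
  response?: { error?: ErrorDetail | null }
}

// Treat null, empty, and whitespace-only strings as absent so a fallback can apply.
const present = (value: string | null | undefined): string | undefined =>
  value != null && value.trim() !== "" ? value : undefined

// Hypothetical helper: prefer top-level fields, then the nested response.error
// payload, then a generic fallback string supplied by the caller.
function describeOpenAIError(event: OpenAIStreamErrorEvent, fallback: string): string {
  const nested = event.response?.error ?? undefined
  const code = present(event.code) ?? present(nested?.code)
  const message = present(event.message) ?? present(nested?.message)
  if (code && message) return `${code}: ${message}`
  return message ?? code ?? fallback
}

// A response.failed event carries its details under response.error.
const failed: OpenAIStreamErrorEvent = {
  type: "response.failed",
  response: { error: { code: "rate_limit_exceeded", message: "Slow down" } },
}
console.log(describeOpenAIError(failed, "OpenAI Responses response failed"))
```

In this sketch each field falls back independently, so a top-level `code` can pair with a nested `message`. Whether the merged implementation resolves fields individually or picks one payload wholesale is not stated in the description.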

The branch has been rebased onto current dev after #28754 and #28755 merged, so the diff now only contains the provider-error changes.

How did you verify your code works?

packages/llm: `bun run test test/provider/openai-responses.test.ts` passed (35 pass)
packages/llm: `bun run test test/provider/anthropic-messages.test.ts` passed (22 pass)
packages/llm: `bun run test` passed (221 pass, 28 skip)
packages/llm: `bun run typecheck` passed
root: `bun run typecheck` passed (15 successful tasks)

Screenshots / recordings

Not applicable. This is an LLM protocol error-handling fix.

Checklist

I have tested my changes locally
I have not included unrelated changes in this PR

github-actions · 2026-05-22T02:04:04Z

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

Open an issue describing the bug/feature (if one doesn't exist)
Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

github-actions · 2026-05-22T02:04:06Z

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

github-actions · 2026-05-22T04:51:04Z

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

…rors

The OpenAI Responses and Anthropic Messages stream-error handlers collapsed every provider failure into one of two opaque strings:

- `OpenAI Responses stream error` / `OpenAI Responses response failed`
- `Anthropic Messages stream error`

In production this meant rate limits, context-length overflows, model overloads, and image-validation failures were all surfaced identically. The OpenAI `response.failed` path was the worst case: the error details live under `response.error`, not at the top level, so the previous `event.message ?? event.code` lookup was always undefined and the catch-all string was the only thing callers ever saw.

This commit:

- Reads OpenAI errors from `event.{code,message,param}` and falls back to `event.response.error.{code,message,param}` so `response.failed` finally surfaces the underlying cause. Adds `error` to the typed `response` schema instead of leaving it as an untyped rest field.
- Prefixes the failure code/type when both code and message are present (`rate_limit_exceeded: Slow down`, `overloaded_error: Overloaded`) so consumers can branch on the failure mode without parsing prose.
- Marks Anthropic's `error.{type,message}` fields optional so partial payloads from OpenAI-compatible proxies still parse, and falls back to whichever field is populated.
- Adds focused tests for every shape (nested code+message, code-only, empty error, missing error entirely) on both protocols.

Stacked on top of fix/openai-responses-tool-image so the OpenAI Responses error-event schema widening reuses the new `response.error` field.
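
The Anthropic side of the commit message, optional `error.type` and `error.message` with a fallback to whichever is populated, might look roughly like the sketch below. It assumes a hand-written parser over `unknown` rather than the schema library the package actually uses, and every name in it (`AnthropicErrorPayload`, `parseAnthropicError`, `describeAnthropicError`) is hypothetical.

```typescript
// Hypothetical shape: both fields optional, as the commit message describes.
interface AnthropicErrorPayload {
  type?: string | undefined
  message?: string | undefined
}

// Keep only non-empty strings; anything else counts as missing.
const asNonEmptyString = (value: unknown): string | undefined =>
  typeof value === "string" && value.trim() !== "" ? value : undefined

// Hypothetical parser: accepts any decoded JSON value and never throws,
// so a partial payload from a proxy or gateway still yields a result.
function parseAnthropicError(raw: unknown): AnthropicErrorPayload {
  if (typeof raw !== "object" || raw === null) return {}
  const error = (raw as { error?: unknown }).error
  if (typeof error !== "object" || error === null) return {}
  const { type, message } = error as { type?: unknown; message?: unknown }
  return { type: asNonEmptyString(type), message: asNonEmptyString(message) }
}

// Prefix the type when both fields exist; otherwise use whichever is present.
function describeAnthropicError(raw: unknown): string {
  const { type, message } = parseAnthropicError(raw)
  if (type && message) return `${type}: ${message}`
  return message ?? type ?? "Anthropic Messages stream error"
}

console.log(
  describeAnthropicError({ type: "error", error: { type: "overloaded_error", message: "Overloaded" } }),
)
console.log(describeAnthropicError({ type: "error", error: { message: "Upstream timeout" } }))
```

Keeping the `type: message` prefix in a fixed position lets a caller branch on the text before the first colon instead of matching on prose, which is the motivation the commit message gives for the format.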

…hain probe (#34340)

* fix(codex): surface error code in Responses 'failed' status errors

When a Codex Responses turn ends with status=failed, the response carries the failure details under `response.error` as `{code, message, param, ...}`. The previous extractor pulled only `message`, so users seeing a rate-limit failure got a bare "Slow down" string indistinguishable from a generic stream truncation; an internal_error with empty message degraded to a dict dump ("{'code': 'internal_error', 'message': ''}").

Extract a `_format_responses_error()` helper that:

- prefixes `code` when both code and message are present (e.g. 'rate_limit_exceeded: Slow down')
- falls back to the bare `code` when message is empty
- accepts both dict and attribute-style payloads (SDK and JSON-RPC paths)
- preserves the prior status-only fallback when no error payload exists

Apply the same helper at the sibling site in `codex_app_server_session.run_turn()` so codex-CLI subprocess turn failures get the same treatment.

Tests:

- 8 new unit tests for `_format_responses_error` covering both shapes, empty/missing fields, non-string fields, and the status-only fallback.
- 2 regression tests on `_normalize_codex_response` for failed status with and without a code, asserting the exact RuntimeError message.
- All 3603 tests in tests/agent/ pass.

Adapted from anomalyco/opencode#28757.

* feat(prompt): universal task-completion guidance + local Python toolchain probe

Two cross-model failure modes get a single-line answer in the cached system prompt. Both gated by config (default on), both add zero overhead when not needed, both verified via real AIAgent prompt builds.

## What changed

`TASK_COMPLETION_GUIDANCE`: short prompt block applied to ALL models. Targets two failure modes observed on a real Sarasota real-estate build task: (1) Opus stopped after writing an 85-byte stub and gave a prose response with finish_reason=stop on call #3 of 90; (2) DeepSeek pushed through a PEP-668 wall, then returned fabricated listings instead of admitting the blocker. Both behaviors are model-family-agnostic, so the guidance lives outside the existing tool_use_enforcement gate (~192 tokens, paid once per session via prefix cache).

`tools/env_probe.py`: local Python toolchain probe. Detects python3/pip/uv/PEP-668 state and emits ONE short line in the system prompt when something is non-default. Emits NOTHING when the env is clean (zero token cost for normal users). Skipped entirely for remote terminal backends (docker/modal/ssh); they have their own probe.

Example output on a broken environment (the actual case):

    Python toolchain: python3=3.11.15 (no pip module), python=missing (use python3), pip→python3.12 (mismatch), PEP 668=yes (use venv or uv).

## Config

Both flags live under `agent.` in config.yaml, default True:

    agent:
      task_completion_guidance: true  # universal "finish the job" block
      environment_probe: true         # local Python toolchain hints

Neither addition required a `_config_version` bump; deep-merge fills defaults in for existing user configs.

## Validation

| Test surface | Result |
|---|---|
| tests/tools/test_env_probe.py | 10/10 pass (probe unit) |
| tests/run_agent/test_run_agent.py (new classes) | 8/8 pass (integration) |
| TestToolUseEnforcementConfig | 17/17 pass (no regression) |
| TestBuildSystemPrompt | 9/9 pass (no regression) |
| TestInvalidateSystemPrompt | 2/2 pass (no regression) |
| tests/agent/test_prompt_builder.py | 124/124 pass (no regression) |
| tests/hermes_cli/ | 5662/5662 pass (config defaults) |
| E2E AIAgent build (broken env) | Both blocks present, 2,178 chars |
| E2E AIAgent build (clean env) | 771-char net overhead, env probe silent |

github-actions Bot added the needs:issue label May 22, 2026

github-actions Bot added the contributor and needs:compliance labels May 22, 2026

github-actions Bot mentioned this pull request May 22, 2026

📊 AI CLI Tools Community Daily Report 2026-05-22 ivanweng2077/big_model_radar#73

Open

github-actions Bot removed the needs:compliance label May 22, 2026

github-actions Bot closed this May 22, 2026

kitlangton reopened this May 22, 2026

This was referenced May 22, 2026

OpenAI Responses JSON-stringifies image media returned from tool results #28859

Closed

Native LLM stream errors hide provider error code and nested details #28860

Closed

github-actions Bot removed the needs:issue label May 22, 2026

kitlangton added 2 commits May 22, 2026 12:26

chore(llm): simplify provider stream error detail handling

952a610

kitlangton force-pushed the fix/openai-responses-provider-error-detail branch from d05e3f1 to 952a610 Compare May 22, 2026 16:28

kitlangton merged commit d0cb587 into anomalyco:dev May 22, 2026
10 checks passed

teknium1 mentioned this pull request May 29, 2026

fix(codex): surface error code in Responses 'failed' status errors NousResearch/hermes-agent#34200

Open

kitlangton merged 2 commits into anomalyco:dev from kitlangton:fix/openai-responses-provider-error-detail
