fix(agents): propagate provider errorMessage in transport stream throws #69091
CoKeFish wants to merge 1 commit into openclaw:main
When a provider stream sets `output.stopReason = "error"` with a concrete
`output.errorMessage` (e.g. `Provider finish_reason: MALFORMED_FUNCTION_CALL`),
the subsequent `throw new Error("An unknown error occurred")` discarded that
message. Callers like `pi-embedded-runner` only logged the generic string,
making post-mortems blind to the real provider signal.
Thread the real `output.errorMessage` into the throw, falling back to the
generic string when it is absent. No flow-control change — only the thrown
message.
Refs openclaw#1914 (root cause acknowledged in pi-ai, still swallowed here),
openclaw#59524 (model fallback treats finish_reason:error as success),
openclaw#60473 (sub-agent reports success on stopReason:error).
AI-assisted: Claude Opus 4.7 · lightly tested (new unit tests + existing
transport-stream-shared + openai-transport-stream suites green locally).
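The change described in this commit message can be illustrated with a minimal sketch. The types and the function names `finalizeBefore`, `finalizeAfter`, and `runStream` are hypothetical stand-ins, not the actual openclaw code; only the `output.errorMessage || "An unknown error occurred"` expression and the `stopReason === "error"` check come from the PR text.

```typescript
// Hypothetical, simplified shape; the real transport types may differ.
interface StreamOutput {
  stopReason: "stop" | "error" | "aborted";
  errorMessage?: string;
}

const GENERIC_ERROR = "An unknown error occurred";

// Before: the provider's message is dropped at the throw site.
function finalizeBefore(output: StreamOutput): void {
  if (output.stopReason === "error") {
    throw new Error(GENERIC_ERROR);
  }
}

// After: prefer the provider's message, fall back to the generic string.
function finalizeAfter(output: StreamOutput): void {
  if (output.stopReason === "error") {
    throw new Error(output.errorMessage || GENERIC_ERROR);
  }
}

// Stand-in for the outer catch, which writes the thrown message back
// onto the output and so overwrites whatever the stream had stored.
function runStream(
  output: StreamOutput,
  finalize: (o: StreamOutput) => void,
): StreamOutput {
  try {
    finalize(output);
  } catch (error) {
    output.stopReason = "error";
    output.errorMessage = error instanceof Error ? error.message : String(error);
  }
  return output;
}

const providerSignal = "Provider finish_reason: MALFORMED_FUNCTION_CALL";

const before = runStream(
  { stopReason: "error", errorMessage: providerSignal },
  finalizeBefore,
);
const after = runStream(
  { stopReason: "error", errorMessage: providerSignal },
  finalizeAfter,
);

console.log(before.errorMessage); // generic string: provider signal lost
console.log(after.errorMessage); // provider signal preserved
```

Under these assumptions the control flow is identical in both variants; only the string carried by the thrown `Error` differs.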
Greptile Summary
This PR fixes a bug where
Confidence Score: 5/5
Safe to merge: minimal, well-tested change with no P0/P1 findings. The fix is a targeted one-liner at three identical call sites, unit tests cover both the propagation and fallback branches, and existing test suites remain green. No logic regressions or security concerns identified. No files require special attention.
Reviews (1): Last reviewed commit: "fix(agents): propagate provider errorMes..."
Summary
- When `finalizeTransportStream` / the OpenAI Responses transport detects `output.stopReason === "error"` with a concrete `output.errorMessage` (e.g. `Provider finish_reason: MALFORMED_FUNCTION_CALL`), the subsequent `throw new Error("An unknown error occurred")` discards that message.
- Callers like `pi-embedded-runner` log only the generic string, leaving post-mortems blind to the real provider signal (safety blocks, malformed tool calls, upstream 5xx surfaced via SSE). Related to #1914 ("Gemini finishReason (SAFETY, RECITATION, MALFORMED_FUNCTION_CALL) swallowed as generic error"; fixed in pi-ai but still swallowed here), #59524 ("Model fallback treats finish_reason:error as success, stops fallback chain"), and #60473 ("Sub-agent reports 'completed successfully' when provider returns stopReason: error").
- Thread the real `output.errorMessage` into the throw, falling back to the generic string when absent. Three call sites updated (1 shared helper + 2 OpenAI Responses variants). Added unit tests locking in both branches.
- `classifyFailoverReason` reads `msg.errorMessage`, not the thrown `Error` message, so retry/fallback logic is untouched. No user-visible defaults change.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
- `transport-stream-shared.ts` and `openai-transport-stream.ts` hard-coded `"An unknown error occurred"` even when the stream had already populated `output.errorMessage` with the provider signal. The outer `catch` then overwrites `output.errorMessage` with the thrown string, so the real cause is lost end-to-end.
- No existing test asserted that `finalizeTransportStream` propagates `output.errorMessage` on the error branch.
- The problem was fixed in `@mariozechner/pi-ai`, but the openclaw transport layer kept its own generic throws.
Regression Test Plan (if applicable)
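The two scenarios in this plan can be sketched without a test runner. This is an illustration only: `finalizeTransportStreamSketch` and `thrownMessage` are hypothetical stand-ins, and the actual tests in the repository are run with vitest against the real helper, whose signature may differ.

```typescript
// Hypothetical, simplified shape of the stream output.
interface StreamOutput {
  stopReason: string;
  errorMessage?: string;
}

// Stand-in for the helper under test (assumed behavior after the fix).
function finalizeTransportStreamSketch(output: StreamOutput): void {
  if (output.stopReason === "error") {
    throw new Error(output.errorMessage || "An unknown error occurred");
  }
}

// Returns the message thrown by fn, or undefined if nothing was thrown.
function thrownMessage(fn: () => void): string | undefined {
  try {
    fn();
  } catch (error) {
    return error instanceof Error ? error.message : String(error);
  }
  return undefined;
}

const scenarios: Array<{ name: string; output: StreamOutput; expected: string }> = [
  {
    name: "propagates the provider errorMessage",
    output: {
      stopReason: "error",
      errorMessage: "Provider finish_reason: MALFORMED_FUNCTION_CALL",
    },
    expected: "Provider finish_reason: MALFORMED_FUNCTION_CALL",
  },
  {
    name: "falls back to the generic string when errorMessage is absent",
    output: { stopReason: "error" },
    expected: "An unknown error occurred",
  },
];

for (const scenario of scenarios) {
  const actual = thrownMessage(() => finalizeTransportStreamSketch(scenario.output));
  console.log(`${actual === scenario.expected ? "ok" : "FAIL"}: ${scenario.name}`);
}
```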
- `src/agents/transport-stream-shared.test.ts`
- `finalizeTransportStream` with `stopReason: "error"` throws the real `output.errorMessage`; with `errorMessage` absent, it falls back to the generic string.
User-visible / Behavior Changes
Error logs and UI that surface stream errors now show the provider-reported reason (e.g. `Provider finish_reason: MALFORMED_FUNCTION_CALL`) instead of the generic `"An unknown error occurred"`. No config/default changes.
Diagram (if applicable)
Security Impact (required)
No · No · No · No · No
Note: `output.errorMessage` carries provider error strings that may contain request identifiers or truncated payload snippets. This is the same surface `pi-embedded-runner` already logs via `output.errorMessage` on the non-throwing paths, so no new sensitive-data exposure is introduced.
Repro + Verification
Environment (original bug)
- `google-genai` transport
- The stream set `output.errorMessage = "Provider finish_reason: MALFORMED_FUNCTION_CALL"`, but the pi-embedded-runner log surfaced only `"An unknown error occurred"`. Root-caused by tracing the stream to the three throw sites in `transport-stream-shared.ts` and `openai-transport-stream.ts`.
Environment (fix verification)
- … `main`)
- `pnpm exec vitest run src/agents/transport-stream-shared.test.ts` → 6/6 pass
- `pnpm exec vitest run src/agents/openai-transport-stream.test.ts` → 65/65 pass
- New tests fail on `main` (they assert that the real errorMessage survives the throw) and pass after the change.
Evidence
Human Verification (required)
- `finalizeTransportStream` error branch with concrete `errorMessage` → throws the real message (new unit test)
- `finalizeTransportStream` error branch without `errorMessage` → throws the generic fallback (new unit test)
- `transport-stream-shared.test.ts` suite still green (4 pre-existing tests)
- `openai-transport-stream.test.ts` suite still green (65 tests, covers the two updated call sites)
- `errorMessage` set to empty string → falls back to generic (an empty string is falsy, so `||` selects the fallback)
- `stopReason === "aborted"` path unchanged (no `errorMessage` populated; still surfaces generic)
- Not verified: full `pnpm test` run on Windows (unrelated platform-specific flakiness in other suites)
Compatibility / Migration
Yes · No · No
Risks and Mitigations
- Risk: `errorMessage` may contain characters or patterns that downstream log parsers didn't expect (they used to always see the literal `"An unknown error occurred"`).
  - Mitigation: the same strings are already emitted via `output.errorMessage` on the non-throwing paths (`failTransportStream`), so any parser that reads error events already tolerates them. No new surface.
- Risk: `classifyFailoverReason` behavior could shift if it ever starts reading the thrown `Error.message`.
  - Mitigation: `pi-embedded-helpers/errors.ts:1272` reads `msg.errorMessage`, not the thrown string, so it is unchanged.
AI-assisted
- Lightly tested (did not run the full `pnpm test` on Windows due to unrelated platform flakiness).
- The change is `output.errorMessage || "An unknown error occurred"`; the added unit tests lock in both branches in the shared helper, whose logic the OpenAI variants duplicate.
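The choice of `||` for the fallback matters for the empty-string edge case checked above. The following sketch contrasts it with `??`, which is shown only for comparison and is not used by this PR; the helper names are hypothetical.

```typescript
const GENERIC_ERROR = "An unknown error occurred";

// `||` falls back for any falsy value, including the empty string.
function messageWithOr(errorMessage?: string): string {
  return errorMessage || GENERIC_ERROR;
}

// `??` falls back only for null or undefined, so "" would be thrown as-is.
function messageWithNullish(errorMessage?: string): string {
  return errorMessage ?? GENERIC_ERROR;
}

console.log(JSON.stringify(messageWithOr(""))); // "An unknown error occurred"
console.log(JSON.stringify(messageWithNullish(""))); // ""
console.log(JSON.stringify(messageWithOr(undefined))); // "An unknown error occurred"
```

With `||`, a provider that reports an empty message still produces a readable error rather than an `Error` with a blank message.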