fix(nim): NVIDIA NIM provider hardening and response normalization#27830
fix(nim): NVIDIA NIM provider hardening and response normalization#27830Zireael wants to merge 2 commits into
Conversation
|
The following comment was made by an LLM, it may be inaccurate: Potential Related PRs FoundHere are PRs addressing overlapping issues that PR #27830 consolidates:
Note: PR #27830 appears to be a comprehensive consolidation that addresses all these issues in a unified defense layer (nim-defense.ts). These older PRs may be superseded or complementary to the current PR's approach. |
|
Thanks for updating your PR! It now meets our contributing guidelines. 👍 |
a5be9f0 to
1829ec4
Compare
Issue for this PR
Closes #10884, #6290, #19947, #25786, #13900, #24264, #22493
Type of change
What does this PR do?
NVIDIA NIM returns OpenAI-incompatible responses in several ways that crash agent sessions: numeric or missing tool_call.id, malformed JSON arguments, thinking tokens leaking into message content, and HTTP 200 payloads that look like errors but pass through as valid responses. Reasoning models like DeepSeek V4 also hang without chat_template_kwargs in the right place.
This adds a defense layer in the provider fetch pipeline that fixes these at the transport level. It works in three parts: normalize outgoing requests (inject kwargs, fix model IDs), normalize incoming responses (fix IDs, repair JSON, strip thinking leakage), and retry transient failures (HTTP 429, 5xx, HTTP 200 errors).
The normalization intercept runs after the raw HTTP response comes back but before the AI SDK validates it. Response reconstruction preserves original headers. Streaming passes through untouched.
How did you verify your code works?
58 unit tests cover the normalizer, JSON repair, request enrichment, and retry wrapper. The full provider test suite passes (393/394, one pre-existing timeout in unrelated plugin config tests). Typecheck is clean.
Screenshots / recordings
Not a UI change.
Checklist