🤖 Add integration test for OpenAI reasoning error #72
The previous fix added `reasoning.encrypted_content` to the `include` option, but the root cause was that reasoning parts from history were being sent back to OpenAI's Responses API. When reasoning parts are included in messages sent to OpenAI, the SDK creates separate reasoning items with IDs (e.g. `rs_*`). These orphaned reasoning items cause errors:

> Item 'rs_*' of type 'reasoning' was provided without its required following item.

Solution: strip reasoning parts from CmuxMessages BEFORE converting to ModelMessages. Reasoning content is only for display/debugging and should never be sent back to the API on subsequent turns. The stripping happens in `filterEmptyAssistantMessages()`, which runs before `convertToModelMessages()`, ensuring reasoning parts never reach the API.
Per Anthropic documentation, reasoning content SHOULD be sent back to Anthropic models via the `sendReasoning` option (defaults to `true`). However, OpenAI's Responses API uses encrypted reasoning items (IDs like `rs_*`) that are managed automatically via `previous_response_id`. Anthropic-style text-based reasoning parts sent to OpenAI create orphaned reasoning items that cause "reasoning without following item" errors.

Changes:
- Reverted `filterEmptyAssistantMessages()` to only filter reasoning-only messages
- Added a new `stripReasoningForOpenAI()` function for OpenAI-specific stripping
- Apply reasoning stripping only for the OpenAI provider in `aiService.ts`
- Added detailed comments explaining the provider-specific differences
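The provider-gated stripping described above can be sketched as follows. The message and part types here are simplified stand-ins, not the real CmuxMessage shape, and the function signature is an assumption:

```typescript
// Simplified stand-in for the real CmuxMessage part types (assumption).
type MessagePart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "tool-call"; toolName: string; args: unknown };

interface ChatMessage {
  role: "user" | "assistant";
  parts: MessagePart[];
}

// Remove reasoning parts, then drop assistant messages that became empty.
// Applied only for OpenAI; Anthropic models should receive reasoning back.
function stripReasoningForOpenAI(
  messages: ChatMessage[],
  provider: string
): ChatMessage[] {
  if (provider !== "openai") return messages;
  return messages
    .map((m) => ({
      ...m,
      parts: m.parts.filter((p) => p.type !== "reasoning"),
    }))
    .filter((m) => m.role !== "assistant" || m.parts.length > 0);
}
```

Keeping the gate on the provider string (rather than stripping unconditionally) preserves the Anthropic behavior where reasoning is intentionally sent back.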
OpenAI's Responses API uses encrypted reasoning items (`rs_*`) managed via `previous_response_id`. Sending stale provider metadata from history causes:
- "Item 'rs_*' of type 'reasoning' was provided without its required following item"
- "referenced reasoning on a function_call was not provided"

Solution: blank out `providerMetadata` on all content parts for OpenAI after `convertToModelMessages()`. This preserves reasoning content while preventing metadata conflicts. Also fixed `splitMixedContentMessages` to treat reasoning parts as text parts (they stay together with text, not with tool calls).

Fixes #7099 (Vercel AI SDK issue)

Reference: https://github.com/gvkhna/vibescraper
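The metadata-blanking approach can be sketched with assumed minimal shapes (the real AI SDK model-message types carry more fields):

```typescript
// Assumed minimal shape of a converted model-message content part.
interface ContentPart {
  type: string;
  providerMetadata?: Record<string, unknown>;
  [key: string]: unknown;
}

interface SimpleModelMessage {
  role: "user" | "assistant" | "tool";
  content: string | ContentPart[];
}

// Blank out providerMetadata on every content part so stale OpenAI
// reasoning-item references (rs_*) from history are not sent back.
function clearProviderMetadata(
  messages: SimpleModelMessage[]
): SimpleModelMessage[] {
  return messages.map((m) => {
    if (!Array.isArray(m.content)) return m;
    return {
      ...m,
      content: m.content.map(({ providerMetadata, ...rest }) => rest),
    };
  });
}
```

Note that the reasoning text itself survives; only the provider-specific item IDs are dropped, which is what prevents the orphaned `rs_*` references.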
```
# Conflicts:
#	src/services/aiService.ts
#	src/utils/messages/modelMessageTransform.ts
```
- Change `let filteredMessages` to `const` (no longer reassigned)
- Remove the unused `provider` parameter from `transformModelMessages()`
- Fix `clearProviderMetadataForOpenAI` to actually clear reasoning parts (it was only checking `part.type === 'text'`; it now checks both `'text'` and `'reasoning'`)
- Update all test calls to remove the `provider` parameter
- Update docstrings to reflect the new behavior
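The bug fixed here can be illustrated with a small predicate (illustrative sketch, not the actual diff): the earlier check matched only `'text'` parts, so reasoning parts kept their stale metadata.

```typescript
// Before: only text parts were cleared (reasoning kept stale rs_* metadata).
const shouldClearOld = (partType: string): boolean => partType === "text";

// After: both text and reasoning parts are cleared.
const shouldClearNew = (partType: string): boolean =>
  partType === "text" || partType === "reasoning";
```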
Tool result messages (`role: 'tool'`) can also contain stale `providerMetadata` on `ToolResultPart` that references the parent tool call. This metadata can cause the same "reasoning without following item" errors when sent back to OpenAI. Extended `clearProviderMetadataForOpenAI()` to also process tool messages.

Evidence:
- `LanguageModelV3ToolResultPart` has a `providerOptions` field
- @kristoph noted the error occurs when items "immediately after reasoning" lack IDs
- Tool results are sent immediately after tool calls, completing the chain

This makes the fix comprehensive for all message types that can carry stale metadata.
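A sketch of the extension to tool messages, again with assumed minimal shapes (the real `LanguageModelV3ToolResultPart` carries more fields, such as the tool output itself):

```typescript
// Assumed minimal shape of a tool-result part; the real type also has
// output fields and uses providerOptions for provider-specific metadata.
interface ToolResultPart {
  type: "tool-result";
  toolCallId: string;
  providerOptions?: Record<string, unknown>;
}

interface ToolMessage {
  role: "tool";
  content: ToolResultPart[];
}

// Strip providerOptions so the tool result no longer references a stale
// reasoning/tool-call chain from a previous OpenAI response.
function clearToolResultMetadata(message: ToolMessage): ToolMessage {
  return {
    ...message,
    content: message.content.map(({ providerOptions, ...rest }) => rest),
  };
}
```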
This test attempts to reproduce the intermittent error:

> Item 'rs_*' of type 'reasoning' was provided without its required following item

The test:
- Uses an OpenAI reasoning model (`gpt-5-codex`)
- Sends a multi-turn conversation with reasoning + tool calls
- Runs multiple attempts (default 10, configurable via `OPENAI_REASONING_TEST_RUNS`)
- Checks for the specific error in stream events

Run with:

```
TEST_INTEGRATION=1 bun x jest tests/ipcMain/openaiReasoning.test.ts
```

The error is intermittent, so multiple attempts increase the chances of reproduction. Once reproduced, debug dumps can be examined to understand the root cause.

Generated with `cmux`
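The multi-attempt strategy above can be sketched as a plain retry loop. This is a hypothetical helper, not the real test, which drives the app through IPC and collects stream events:

```typescript
// The error message the test looks for in stream events.
const REASONING_ERROR =
  /of type 'reasoning' was provided without its required following item/;

// Run one streaming attempt at a time; report whether the intermittent
// error surfaced within the attempt budget. In the real test the attempt
// count would come from OPENAI_REASONING_TEST_RUNS (default 10).
async function reproduceReasoningError(
  runAttempt: () => Promise<string[]>, // collected stream-event/error strings
  attempts: number = 10
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    const events = await runAttempt();
    if (events.some((e) => REASONING_ERROR.test(e))) return true;
  }
  return false;
}
```

Stopping at the first reproduction keeps the run short while still giving the intermittent failure several chances to appear.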
The test was incorrectly waiting for stream-end events even when stream-error occurred. Now it catches timeout exceptions and checks for error events regardless of whether stream-end was received. This allows the test to properly detect the OpenAI reasoning error when it occurs.

Generated with `cmux`
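The fix amounts to treating a timeout as a possible error outcome rather than a hard failure. A sketch with an assumed event shape:

```typescript
type StreamEvent = { type: string; message?: string };

// Wait for stream-end, but if that wait times out, still classify the run
// by looking for a stream-error among the events collected so far.
async function classifyRun(
  waitForStreamEnd: () => Promise<void>,
  events: StreamEvent[]
): Promise<"ended" | "errored" | "timed-out"> {
  try {
    await waitForStreamEnd();
  } catch {
    // Timeout: stream-end never arrived; an error event may explain why.
    return events.some((e) => e.type === "stream-error") ? "errored" : "timed-out";
  }
  return events.some((e) => e.type === "stream-error") ? "errored" : "ended";
}
```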
Author (Member) commented: Closing to recreate with clean branch
## Summary

This PR adds an integration test to reproduce and track the intermittent OpenAI reasoning error:

> Item 'rs_*' of type 'reasoning' was provided without its required following item.

## Test Implementation

File: `tests/ipcMain/openaiReasoning.test.ts`

The test:
- Uses an OpenAI reasoning model (`gpt-5-codex`)
- Sends a multi-turn conversation with reasoning + tool calls
- Runs multiple attempts (default 10, configurable via `OPENAI_REASONING_TEST_RUNS`)
- Checks for the specific error in stream events

Run with:

```
TEST_INTEGRATION=1 bun x jest tests/ipcMain/openaiReasoning.test.ts

# Or with custom attempt count
OPENAI_REASONING_TEST_RUNS=20 TEST_INTEGRATION=1 bun x jest tests/ipcMain/openaiReasoning.test.ts
```

## Test Results

### First Test Run (3 attempts)

Reproduced the error:

> Item 'rs_05d1dd2ba9ba43270068e541ecb9ec81938b35eead69f3d8c3' of type 'reasoning' was provided without its required following item.

Event sequence: `[stream-start, reasoning-end, tool-call-start, stream-error]`

### Second Test Run (10 attempts)
## Key Findings

- **Error is intermittent:** not deterministic; likely depends on OpenAI's internal state or timing
- **Current fix MAY be working:** the `clearProviderMetadataForOpenAI()` function on this branch appears to reduce error frequency significantly
- **Error can occur on the FIRST message:** not just on follow-ups with conversation history
- **Test is functional:** successfully detects the error when it occurs
## Next Steps

- Compare error rates on `main` (without fix) vs `tokens` (with fix)
- Consider dropping `previous_response_id` entirely when errors occur
- Consider clearing `providerMetadata` more aggressively (on ALL part types)

## Related

- `src/utils/messages/modelMessageTransform.ts` (`clearProviderMetadataForOpenAI()`)

Generated with `cmux`