fix: Fix openai responses api issues by ncrispino · Pull Request #685 · massgen/MassGen

ncrispino · 2025-12-22T21:46:50Z

Fix GPT-5 Reasoning Token Errors and Double Voting

Summary

This PR fixes two critical issues affecting GPT-5 reasoning models and multi-agent voting:

1. GPT-5-mini (and other GPT-5-X models) reasoning token errors: fixed malformed Responses API requests that caused 400 errors
2. Double voting: gracefully handle reasoning models that make multiple vote calls

Closes MAS-181

Issues Fixed

Issue 1: GPT-5 Reasoning Token Errors

Error:

Error code: 400 - {'error': {'message': "Item 'rs_...' of type 'reasoning' was provided without its required following item."}}

Root Causes:

- The id field was being stripped from function_call items, which broke the reasoning-to-function-call pairing required by OpenAI's Responses API
- When previous_response_id was used, response items were added both manually and automatically, producing duplicate reasoning items

Fixes:

- massgen/formatter/_response_formatter.py: Preserve id field in function_call items (required for reasoning pairing per LangChain PR #9082)
- massgen/backend/response.py: Only add response items manually when NOT using previous_response_id to avoid duplicates
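The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `format_function_call` and `build_input` are hypothetical, and the exact set of fields kept is an assumption based on the usual shape of a Responses API function_call item.

```python
from typing import Any, Optional


def format_function_call(item: dict[str, Any]) -> dict[str, Any]:
    """Copy a function_call item, keeping `id` (hypothetical helper).

    Dropping `id` is what broke the pairing with the preceding
    reasoning item, so it is kept alongside the other fields.
    """
    keep = ("type", "id", "call_id", "name", "arguments")
    return {key: item[key] for key in keep if key in item}


def build_input(
    new_messages: list[dict[str, Any]],
    response_items: list[dict[str, Any]],
    previous_response_id: Optional[str],
) -> list[dict[str, Any]]:
    """Build the next request's input list (hypothetical helper)."""
    if previous_response_id is not None:
        # The earlier output items are already attached to the previous
        # response, so re-sending them would duplicate reasoning items.
        return list(new_messages)
    # No previous_response_id: replay the earlier items manually.
    return [*response_items, *new_messages]
```

With `previous_response_id` set, only the new messages (for example a function_call_output) are sent; without it, the earlier reasoning and function_call items are replayed ahead of them.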

Issue 2: Double Voting in Multi-Agent Workflows

Issue: Certain GPT-5 models (particularly gpt-5.1 and gpt-5.2) make multiple vote calls despite enforcement messages, causing workflow failures.

Model Behavior:

❌ gpt-5.1, gpt-5.2: Ignore tool error enforcement, repeat violations
✅ gpt-5(-X), gpt-5.1-codex: Respond to enforcement correctly

Fix: Instead of rejecting multiple votes, gracefully handle by taking the last vote as the agent's final decision.

Rationale:

- The last vote represents the agent's most refined thinking
- Reasoning models iterate through their logic, so later votes are more informed
- Simpler than vote counting and more aligned with how reasoning models work
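A minimal sketch of the last-vote-wins approach, assuming each vote call is a dict with an `agent_id` key holding the anonymous ID and that the orchestrator keeps an anonymous-to-real ID mapping. The function name `resolve_final_vote` and the data shapes are illustrative assumptions, not the actual orchestrator code.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_final_vote(
    vote_calls: list[dict[str, str]],
    anon_to_real: dict[str, str],
) -> Optional[str]:
    """Return the real agent ID chosen by the last vote call, if any."""
    if not vote_calls:
        return None
    if len(vote_calls) > 1:
        # Warn rather than fail: the last vote is taken as final.
        logger.warning(
            "Agent made %d votes - using last (final decision)",
            len(vote_calls),
        )
    anon_id = vote_calls[-1]["agent_id"]
    # Map the anonymous ID (e.g. agent1) back to the real one (e.g. agent_a);
    # fall back to the ID as given if it is not an anonymous alias.
    return anon_to_real.get(anon_id, anon_id)
```

For example, `resolve_final_vote([{"agent_id": "agent2"}, {"agent_id": "agent1"}], {"agent1": "agent_a", "agent2": "agent_b"})` ignores the first vote and resolves the second to `agent_a`.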

Changes

Files Modified

massgen/formatter/_response_formatter.py
- Preserve id field for reasoning item pairing
- Added reference to LangChain's similar fix
massgen/backend/response.py
- Conditional response item addition based on whether previous_response_id is set
- Deduplicate items by ID to prevent duplicate reasoning tokens
- Improved logging (debug level instead of info)
massgen/orchestrator.py
- Simplified multiple vote handling (38 lines → 14 lines)
- Take last vote instead of complex counting/enforcement
- Removed unnecessary debug logging
- Cleaner variable names
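The ID-based deduplication listed for massgen/backend/response.py could look something like the sketch below. The function name `dedupe_by_id` is hypothetical, and the choice to keep the first occurrence and to pass through items without an id is an assumption made for illustration.

```python
from typing import Any


def dedupe_by_id(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop later items whose `id` was already seen, preserving order."""
    seen: set[str] = set()
    result: list[dict[str, Any]] = []
    for item in items:
        item_id = item.get("id")
        if item_id is not None:
            if item_id in seen:
                # Same reasoning/function_call item added twice: skip it.
                continue
            seen.add(item_id)
        # Items without an id cannot be compared, so they are kept as-is.
        result.append(item)
    return result
```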

Lines Changed

massgen/backend/response.py              | +16 -3
massgen/formatter/_response_formatter.py | +5 -2
massgen/orchestrator.py                  | +14 -38
Total: +35 -43 (net -8 lines, cleaner code)

Testing

Test Case 1: Reasoning Models

uv run massgen --automation --model gpt-5-mini "Create a simple website"

Before: 400 error about reasoning tokens
After: ✅ Completes successfully

Test Case 2: Multi-Agent Voting with gpt-5.1/5.2

uv run massgen --config config.yaml "Create a simple website"

Before: Agent execution failed: 'agent1' (KeyError)
After:

⚠️ Agent made 2 votes - using last (final decision): agent1
🏆 Turn 1 winner: agent_a

Verified Behavior

✅ No reasoning token errors with GPT-5 models
✅ Warning shown when multiple votes detected
✅ Execution completes successfully using last vote
✅ Anonymous ID mapping works correctly (agent1 → agent_a)

Known Limitations

GPT-5 Model Variants: Some models (gpt-5.1, gpt-5.2) don't properly respond to tool error enforcement. This fix works around the limitation by handling multiple votes gracefully rather than relying on enforcement.

Recommendation: For critical voting scenarios, prefer models that respond to enforcement (gpt-5(-X), gpt-5.1-codex), or accept that when an agent votes more than once, only its last vote is used.

Migration Notes

No breaking changes. This is purely a bug fix that:

- Makes reasoning models work correctly
- Handles edge cases more gracefully
- Improves user experience with better messaging

Users will see informational warnings when multiple votes are detected, but workflows will continue successfully.

Fix openai issues

4759ab3

Henry-811 changed the base branch from main to dev/v0.1.29 December 24, 2025 16:17

Henry-811 merged commit 0f4b482 into dev/v0.1.29 Dec 24, 2025
21 checks passed

Henry-811 merged 1 commit into dev/v0.1.29 from fix_responses_api
