
Conversation

@bhaskargurram-ai

Fixes #8958

Summary

Fixed the AttributeError: 'dict' object has no attribute 'type' that occurred when using web_search tools with the Responses API, enabling proper web search functionality with GPT-5 and other reasoning models.

Problem

When web_search tools are used with the OpenAI Responses API, the API returns response.output items as dictionaries instead of objects with attributes. This caused the _process_response() method in base_lm.py to crash at line 233:

# Line 233 - Failed with web_search
output_item_type = output_item.type  
# AttributeError: 'dict' object has no attribute 'type'

Why This Happened

The Responses API changes its response format based on tool usage:

  • Without web_search: Returns objects with .type, .content attributes
  • With web_search: Returns dicts with "type", "content" keys

The existing code only handled the object format, causing crashes when web_search was enabled.
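For illustration, the same message item can arrive in either shape (abbreviated here; the object form is a Pydantic model from the OpenAI SDK, and the exact fields are simplified):

# Dict form, observed when web_search is enabled (abbreviated):
output_item = {
    "type": "message",
    "content": [{"type": "output_text", "text": "..."}],
}

# Object form, returned otherwise; the same data behind attribute access:
# output_item.type == "message"
# output_item.content[0].text == "..."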

Solution

Modified _process_response() in dspy/clients/base_lm.py to handle both formats using isinstance() checks:

# Handle both object and dict formats
if isinstance(output_item, dict):
    output_item_type = output_item.get("type")
else:
    output_item_type = output_item.type

Applied this dual-format handling pattern throughout the method for the following (a sketch appears after the list):

  • ✅ Message content extraction
  • ✅ Function call handling
  • ✅ Reasoning content processing
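For illustration, a sketch of how the same check extends to message-content extraction; the variable names here are placeholders rather than the exact identifiers in base_lm.py:

# Sketch only: dual-format handling for message content.
if output_item_type == "message":
    if isinstance(output_item, dict):
        content_items = output_item.get("content", [])
    else:
        content_items = output_item.content
    for content_item in content_items:
        if isinstance(content_item, dict):
            text = content_item.get("text", "")
        else:
            text = content_item.text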

Changes

File: dspy/clients/base_lm.py
Method: _process_response() (lines 220-275)
Type: Added isinstance checks for dict and object response formats

Code Changes Detail

  • Check if output_item is dict or object before accessing attributes
  • Use .get() for dict access, attribute access for objects
  • Handle all three output types: message, function_call, and reasoning
  • Maintain identical output structure for both input formats

Testing

Manual Testing with GPT-5 and Web Search

Verified the fix with actual GPT-5 API access and web_search tools:

Test Configuration:

import os

import dspy

gpt5_with_search = dspy.LM(
    "openai/gpt-5",
    model_type="responses",
    api_key=os.getenv("OPENAI_API_KEY"),
    temperature=1.0,
    max_tokens=16000,
    tools=[{"type": "web_search"}],
    reasoning={"effort": "medium"},
)

Test Query:
"What are the latest AI developments in October 2025?"

Results:

  • Web search now works! Successfully retrieved real-time information about:
    • OpenAI ChatGPT Atlas launch (Oct 21, 2025)
    • OpenAI-Broadcom AI accelerator partnership
    • Anthropic's Google Cloud TPU expansion
    • Microsoft 365 Copilot updates
    • European Commission AI strategies
    • Various other October 2025 AI developments
  • No AttributeError - Dict format handled correctly
  • All features work - Citations, reasoning content, function calls all processed correctly

Backward Compatibility Test

Tested normal responses without web_search:

gpt5_no_search = dspy.LM(
    "openai/gpt-5",
    model_type="responses",
    temperature=1.0,
    max_tokens=16000,
    reasoning={"effort": "medium"},
)

Result: ✅ Normal mode continues to work perfectly - backward compatible

Related Work

This PR works in conjunction with PR #8963; together, the two PRs fully enable web_search support with the Responses API.

Backward Compatibility

Fully backward compatible, with no breaking changes:

  • New: Handles dict format (with web_search tools)
  • Existing: Object format (without web_search) continues to work unchanged

The code gracefully detects and handles both formats at runtime.

Impact

This fix enables users to:

  • ✅ Use web_search tools with GPT-5 and the Responses API
  • ✅ Leverage real-time information retrieval in DSPy programs
  • ✅ Run GEPA optimization with web-enabled models
  • ✅ Build applications requiring current information beyond training cutoff

Implementation Notes

The key insight is that OpenAI's Responses API returns different data structures based on whether tools (specifically web_search) are enabled. This fix makes DSPy robust to both formats by:

  1. Checking response item type before attribute access
  2. Using appropriate accessor pattern (.get() vs attribute)
  3. Preserving the exact same output structure regardless of input format
  4. Maintaining all existing functionality for non-web-search cases

Testing Instructions for Reviewers

To test this fix (requires GPT-5 access; a condensed script follows the list):

  1. Configure GPT-5 LM with web_search tools
  2. Run a query requiring current information
  3. Verify no AttributeError occurs
  4. Confirm web search results are returned
  5. Test without web_search to verify backward compatibility
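For convenience, the steps above roughly collapse to the following script (assumes OPENAI_API_KEY is set in the environment; the query strings are only examples):

import dspy

# Steps 1-4: web_search enabled; should return current information with no AttributeError.
lm_search = dspy.LM(
    "openai/gpt-5",
    model_type="responses",
    temperature=1.0,
    max_tokens=16000,
    tools=[{"type": "web_search"}],
    reasoning={"effort": "medium"},
)
print(lm_search("What are the latest AI developments this month?"))

# Step 5: same model without web_search; should behave exactly as before the fix.
lm_plain = dspy.LM("openai/gpt-5", model_type="responses", temperature=1.0, max_tokens=16000)
print(lm_plain("What is the capital of France?"))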

@casper-hansen This fully resolves the web_search issue you reported! You should now be able to use web_search tools with your GEPA optimization workflow. 🎉

@TomeHirata @chenmoneygithub Would appreciate your review when you have a chance. The fix is straightforward - just adds proper type checking before attribute access.

- Modified _process_response to handle both object and dict formats
- web_search tools return response.output items as dicts instead of objects
- Added isinstance checks throughout the method for type safety
- Tested with real GPT-5 API and web_search - confirms fix works
- Maintains backward compatibility with normal responses
- Fixes AttributeError: 'dict' object has no attribute 'type'
- Resolves stanfordnlp#8958

Signed-off-by: Bhaskar <bhaskar@zasti.ai>
@TomeHirata
Collaborator

While this fix works, I wonder if this needs to be fixed on the LiteLLM side. It's probably not expected to change the output type based on the tool argument.

@bhaskargurram-ai
Author

@TomeHirata You're right that ideally LiteLLM should return a consistent format regardless of tool usage.

I checked the OpenAI Responses API directly (not through LiteLLM) and confirmed that the dict format is coming directly from OpenAI's API when web_search tools are used. So this appears to be an OpenAI API behavior rather than a LiteLLM transformation issue.

However, I agree this could potentially be normalized on the LiteLLM side for consistency. A few options:

Option 1: Keep this fix in DSPy (defensive coding)

  • ✅ Makes DSPy robust to API format variations
  • ✅ Works regardless of whether LiteLLM changes
  • ✅ Handles the issue immediately for users

Option 2: Report to LiteLLM

  • Open an issue with LiteLLM to normalize response formats
  • Wait for the upstream fix
  • Users stay blocked until then

Option 3: Both

  • Keep this defensive fix in DSPy
  • Also report to LiteLLM for long-term consistency
  • Best of both worlds

I'm happy to:

  1. Close this PR and open an issue with LiteLLM instead, OR
  2. Keep this fix as defensive programming while also reporting upstream, OR
  3. Any other approach you'd prefer

What would you recommend? I want to make sure we're following DSPy's preferred approach for handling upstream API inconsistencies.

Thanks for the thoughtful review!

@casper-hansen

LiteLLM should adhere to the responses API, making delegation there an impractical option. The parsing in DSPy incorporates specific assumptions about the structure of a response, which unfortunately fail to align with the format of web search results. In my opinion, this issue can and should be easily addressed within DSPy, as demonstrated in this PR.

@bhaskargurram-ai
Author

Thanks for the support, @casper-hansen! You're exactly right - since LiteLLM is correctly passing through the OpenAI Responses API format, the parsing logic in DSPy needs to handle both structures.

The fix is straightforward defensive programming: check the type before accessing attributes. This makes DSPy robust to API format variations without requiring upstream changes.

@TomeHirata Given Casper's input and the fact that this is OpenAI's native format (not a LiteLLM transformation), would you be comfortable merging this defensive fix? It solves the immediate issue for users while keeping DSPy resilient to API format variations.

Happy to make any adjustments you'd like to see!

@TomeHirata
Collaborator

Got it, it's interesting that the response type changes based on the presence of the tool argument. Sure, we can support the conversion on our side. Then, can we convert the response to a dict first regardless of tool presence? That would simplify the parse logic. Also, can you add a unit test?

Signed-off-by: Bhaskar <bhaskar@zasti.ai>
- Refactor _process_response to convert all outputs to dict format first
- Simplifies parsing logic (no isinstance checks throughout)
- Add comprehensive unit tests for both dict and object formats
- Tests cover: message, function_call, reasoning types
- Tests cover: object format (normal) and dict format (web_search)
- Addresses @TomeHirata feedback for cleaner implementation
- Fixes stanfordnlp#8958
@bhaskargurram-ai
Author

@TomeHirata Done! I've refactored the code as you suggested:

Changes:

  1. Normalize to dict first: Added a _normalize_output_item() helper that converts all response formats to dict at the start (a sketch appears after this list)
  2. Simplified parse logic: Removed all isinstance checks throughout - now just processes dicts
  3. Comprehensive unit tests: Added test_base_lm_response_formats.py with 11 test cases
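For reference, a minimal sketch of what such a normalization helper could look like; the actual helper in base_lm.py may differ (model_dump() is the standard Pydantic v2 method that OpenAI SDK response objects expose):

def _normalize_output_item(item):
    """Sketch: coerce a Responses API output item to a plain dict."""
    if isinstance(item, dict):
        return item  # Already the dict form (web_search case).
    if hasattr(item, "model_dump"):
        return item.model_dump()  # OpenAI SDK objects are Pydantic models.
    return vars(item)  # Last-resort fallback for simple objects.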

Unit Tests Coverage:
The test suite validates both response formats across all scenarios (an illustrative sketch of one case follows the list):

  • test_object_format_message - Object format (normal responses without tools)
  • test_dict_format_message - Dict format (responses with web_search)
  • test_dict_format_with_multiple_content - Multiple content items concatenation
  • test_object_format_function_call - Function calls in object format
  • test_dict_format_function_call - Function calls in dict format
  • test_object_format_reasoning - Reasoning content in object format
  • test_dict_format_reasoning - Reasoning content in dict format
  • test_dict_format_reasoning_with_summary - Reasoning with summary fallback
  • test_mixed_format_backwards_compatibility - Both formats in same response
  • test_empty_content - Handles empty content gracefully
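For a flavor of these tests, a condensed sketch of the dict-format message case; the call shape and expected return value are assumptions based on this PR's description, not the verbatim test code:

from types import SimpleNamespace

def test_dict_format_message(lm):
    # Dict-shaped output item, as returned when web_search is enabled.
    fake_response = SimpleNamespace(
        output=[{"type": "message", "content": [{"type": "output_text", "text": "hello"}]}],
        usage=None,
    )
    outputs = lm._process_response(fake_response)  # hypothetical call shape
    assert outputs == ["hello"]  # size 1: the Responses API yields one output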

Test Results:
All 11 new tests pass, plus all existing tests continue to pass, confirming backward compatibility.

The refactored implementation is cleaner, well-tested, and handles both formats transparently with a single normalization point.

Ready for review!

Inline review comments

On the new test file (hunk @@ -0,0 +1,204 @@), at the header docstring "Unit tests for _process_response method handling both dict and object formats.":

Collaborator:
Please don't create a new test file; use the existing test_base_lm.py.

Collaborator:
Also, I don't think we need to add such a large number of tests.

On def _normalize_output_item(item) in base_lm.py, just after the docstring line "List of processed outputs, which is always of size 1 because the Response API only supports one output.":

Collaborator:
Doesn't OpenAI provide this conversion out of the box?
