fix: handle pydantic ValidationError in beta.chat.completions.parse (fixes #1763) #2917

Open
giulio-leone wants to merge 10 commits into openai:main from giulio-leone:fix/issue-1763-parse-validation-error

@giulio-leone

Problem

When beta.chat.completions.parse() receives malformed or truncated JSON from the API (with finish_reason: "stop"), a raw pydantic.ValidationError is raised directly to the user with no context about what went wrong or what the original content was.

This is confusing because:

  • The error doesn't indicate it came from parsing the API response
  • The raw content that failed to parse is lost
  • Users can't distinguish between a bug in their code vs bad API output

Solution

Added a new ContentFormatError exception (extends OpenAIError, following the pattern of LengthFinishReasonError) that wraps both pydantic.ValidationError and json.JSONDecodeError in _parse_content().

The new error:

  • Provides a clear message explaining the response didn't match the expected format
  • Preserves the raw content string via raw_content attribute for debugging
  • Chains the original exception as __cause__ so the full traceback is available
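
The wrapping pattern described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the PR: the `OpenAIError` stand-in, the `parse_content` helper, and the message text are assumptions, and the `pydantic.ValidationError` branch is left out so the example runs without third-party packages.

```python
import json


class OpenAIError(Exception):
    """Stand-in for the SDK's base exception (assumption for this sketch)."""


class ContentFormatError(OpenAIError):
    """Raised when response content does not match the expected format."""

    def __init__(self, raw_content: str) -> None:
        # Message text is illustrative, not the wording used in the PR.
        super().__init__("Could not parse response content as the expected format")
        # Keep the unparsed string so callers can inspect what the API returned.
        self.raw_content = raw_content


def parse_content(raw_content: str) -> object:
    """Hypothetical helper: decode JSON, wrapping failures in ContentFormatError."""
    try:
        return json.loads(raw_content)
    except json.JSONDecodeError as err:
        # `from err` chains the original exception as __cause__.
        raise ContentFormatError(raw_content) from err


try:
    parse_content('{"name": "Ada", "age":')  # truncated JSON
except ContentFormatError as exc:
    caught = exc
```

In this sketch the caller only needs to catch one SDK-level exception type, and the decoder's own error stays reachable through `caught.__cause__`.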

Changes

  • src/openai/_exceptions.py: Added ContentFormatError class
  • src/openai/__init__.py: Exported ContentFormatError
  • src/openai/lib/_parsing/_completions.py: Wrapped _parse_content() with try/except
  • tests/lib/chat/test_completions.py: Added tests for malformed JSON and schema mismatch

Example

try:
    completion = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[...],
        response_format=MyModel,
    )
except openai.ContentFormatError as e:
    print(e.raw_content)  # The raw JSON string that failed to parse

Fixes #1763

When beta.chat.completions.parse() receives malformed or truncated JSON
from the API, pydantic.ValidationError was raised directly without any
context. Now catches pydantic.ValidationError and json.JSONDecodeError
in _parse_content() and wraps them in a new ContentFormatError that
includes the raw content string for debugging.

Fixes openai#1763
Copilot AI review requested due to automatic review settings March 2, 2026 13:23
@giulio-leone giulio-leone requested a review from a team as a code owner March 2, 2026 13:23

Copilot AI left a comment

Pull request overview

Adds a dedicated SDK error for structured-output parsing failures so users get a clear, actionable exception (with access to the raw model content) instead of a bare pydantic.ValidationError when chat.completions.parse() receives malformed/truncated JSON.

Changes:

  • Introduces ContentFormatError (subclass of OpenAIError) to represent response-format parsing failures.
  • Wraps Pydantic parsing in _parse_content() to raise ContentFormatError on pydantic.ValidationError / json.JSONDecodeError.
  • Adds regression tests covering malformed JSON and schema mismatch cases, and exports the new exception from openai.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File | Description
src/openai/lib/_parsing/_completions.py | Wraps response-format parsing to raise a consistent SDK exception instead of leaking raw Pydantic errors.
src/openai/_exceptions.py | Adds ContentFormatError carrying raw_content for debugging.
src/openai/__init__.py | Exports ContentFormatError from the top-level package.
tests/lib/chat/test_completions.py | Adds tests asserting ContentFormatError and raw_content preservation for malformed JSON and schema mismatch.

- Truncate raw_content in error message to 500 chars to prevent
  unbounded messages and reduce sensitive data exposure
- Remove redundant self.__cause__ assignment (raise...from sets it);
  store error on dedicated .error attribute instead
- Update docstring to cover both Pydantic models and dataclass-like
  types validated via TypeAdapter

Refs: openai#1763
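
As a rough illustration of the truncation described in this commit message, the message could be built along these lines. The helper name `build_message`, the message wording, and the trailing `...` marker are assumptions for this sketch; only the 500-character cap comes from the commit message.

```python
MAX_PREVIEW_CHARS = 500  # cap stated in the commit message


def build_message(raw_content: str, limit: int = MAX_PREVIEW_CHARS) -> str:
    """Hypothetical helper: embed at most `limit` chars of content in the message."""
    preview = raw_content[:limit]
    if len(raw_content) > limit:
        # Marker showing the preview was cut; an assumption, not from the PR.
        preview += "..."
    return f"Could not parse response content as the expected format: {preview}"


long_content = "0" * 2000
message = build_message(long_content)
```

The full string would still be stored separately (for example on a `raw_content` attribute), so truncation only affects what is printed in logs and tracebacks.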
@giulio-leone
Author

Thanks for the thorough review! All feedback has been addressed in 81f3e5b:

  • raw_content truncation: Error message now caps at 500 chars; full content still available via .raw_content
  • Redundant __cause__: Removed; original exception stored on .error attribute instead
  • Docstring accuracy: Updated to cover both Pydantic BaseModel and dataclass-like types
  • response_format type in error: Kept as-is — the ValidationError/JSONDecodeError already identifies the expected schema, and the caller knows the type they passed
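
The point about the redundant `__cause__` assignment follows from how Python's `raise ... from` works: the interpreter sets `__cause__` on the new exception itself. A small sketch, using a hypothetical `WrappedError` class rather than the PR's actual exception:

```python
class WrappedError(Exception):
    """Hypothetical wrapper that keeps the original exception on `.error`."""

    def __init__(self, message: str, error: Exception) -> None:
        super().__init__(message)
        # Explicit attribute for programmatic access; no manual __cause__ needed.
        self.error = error


def wrap() -> None:
    try:
        int("not a number")
    except ValueError as err:
        # `from err` makes Python set __cause__ on the new exception.
        raise WrappedError("parse failed", error=err) from err


try:
    wrap()
except WrappedError as exc:
    result = exc
```

Here `result.__cause__` and `result.error` refer to the same `ValueError` object, so assigning `self.__cause__` in the constructor would add nothing.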

giulio-leone and others added 2 commits March 2, 2026 16:26
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Include truncated raw_content (first 500 chars) in exception message
for better debugging while avoiding unbounded output. Full content
remains accessible via the raw_content attribute. Update tests to
reflect the new message format and add truncation coverage test.

Refs: openai#2917
@giulio-leone giulio-leone force-pushed the fix/issue-1763-parse-validation-error branch from 121799e to 4290c9a Compare March 2, 2026 16:24
giulio-leone and others added 6 commits March 2, 2026 17:45
Apply reviewer code suggestions from PR review.
…response_format parameter

- Remove dangling code block from _exceptions.py ContentFormatError class that caused IndentationError
- Fix _completions.py to pass response_format= instead of expected_type= to ContentFormatError
Development

Successfully merging this pull request may close these issues.

beta.chat.completions.parse returns unhandled ValidationError

2 participants