fix: resolve PydanticUserError with defer_build=True in multiprocess forkserver workers (pydantic >= 2.12)#3132

Open
SanskaarUndale21 wants to merge 1 commit into openai:main from SanskaarUndale21:fix/pydantic-defer-build-forkserver

@SanskaarUndale21

Summary

Fixes #3079

When defer_build=True (the default via DEFER_PYDANTIC_BUILD), pydantic wraps
__pydantic_core_schema__ in a MockCoreSchema. Accessing schema["type"] inside
_get_extra_fields_type triggers a lazy rebuild that uses a hardcoded stack depth
(5 frames) to locate forward-reference namespaces. In multiprocess forkserver
workers the call stack is shallower/different, so the heuristic cannot find the
openai types namespace. Before pydantic 2.12 this silent failure was mostly harmless;
starting in 2.12 MockCoreSchema._get_built() raises PydanticUserError when the
rebuild fails, causing ~0.6% of requests to crash in large forkserver deployments.
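
To illustrate why a fixed frame depth is fragile, here is a toy sketch. It is not pydantic's actual lookup code; `caller_locals`, `resolve`, `direct_caller` and `via_wrapper` are hypothetical names, and the depth used is 1 rather than 5. The same lookup succeeds or fails depending only on how many frames sit between the caller and the resolver.

```python
import sys


def caller_locals(depth: int) -> dict:
    """Copy the local names `depth` frames above this function (toy stand-in)."""
    return dict(sys._getframe(depth).f_locals)


def resolve(name: str, depth: int):
    # +1 skips resolve's own frame, so depth=1 means "whoever called resolve".
    return caller_locals(depth + 1).get(name)


def direct_caller():
    Target = "found"  # the name the lookup is expected to see
    return resolve("Target", depth=1)


def via_wrapper():
    Target = "found"  # one frame further away than the lookup assumes

    def wrapper():
        return resolve("Target", depth=1)

    return wrapper()


print(direct_caller())  # lookup lands in the frame that defines Target
print(via_wrapper())    # lookup lands in wrapper(), which has no Target
```

Passing the namespace explicitly, as this PR does, removes the dependency on call-stack shape.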

Changes

  • src/openai/_models.py, _get_extra_fields_type:
    • Checks cls.__pydantic_complete__ before touching the schema; if the model is
      still deferred, calls cls.model_rebuild(_types_namespace=vars(module)) using the
      class's own module from sys.modules. This bypasses the fragile stack-depth
      heuristic and resolves forward references correctly regardless of call-stack depth.
    • Wraps the subsequent schema access in a try/except so that any residual failure
      degrades gracefully (returning None) instead of propagating an unhandled exception.
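
A minimal sketch of the first guard, assuming a stand-in class instead of a real pydantic model so that it runs without pydantic installed. `FakeDeferredModel` and `ensure_built` are hypothetical names; only `__pydantic_complete__`, `model_rebuild` and the private `_types_namespace` keyword come from the description above.

```python
import sys
from typing import Optional


class FakeDeferredModel:
    """Hypothetical stand-in for a model created with defer_build=True."""

    __pydantic_complete__ = False
    rebuilt_with: Optional[dict] = None

    @classmethod
    def model_rebuild(cls, *, _types_namespace: Optional[dict] = None) -> None:
        # Record the namespace we were given and mark the model as built.
        cls.rebuilt_with = _types_namespace
        cls.__pydantic_complete__ = True


def ensure_built(cls) -> bool:
    """Rebuild a deferred model using its own module's namespace."""
    if getattr(cls, "__pydantic_complete__", True):
        return True  # already built, nothing to do
    module = sys.modules.get(cls.__module__)
    namespace = vars(module) if module is not None else None
    try:
        cls.model_rebuild(_types_namespace=namespace)
    except Exception:
        return False  # degrade gracefully, as the PR does by returning None
    return bool(getattr(cls, "__pydantic_complete__", False))


print(ensure_built(FakeDeferredModel))
```

The point of the sketch is that the namespace comes from `sys.modules[cls.__module__]`, not from inspecting the call stack.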

Test plan

  • Verify existing test suite passes (pytest tests/)
  • Manually reproduce with multiprocess forkserver + pydantic 2.12: create a pool
    with forkserver start method, import openai inside workers, deserialize a
    ChatCompletion response — confirm no PydanticUserError raised
  • Run with DEFER_PYDANTIC_BUILD=true (default) and DEFER_PYDANTIC_BUILD=false
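
The manual reproduction step could be scripted roughly as follows. This is a sketch under assumptions: `probe` and `run_forkserver_probe` are hypothetical helpers, the payload is a made-up partial response, and because the reported failure is intermittent a clean run does not prove the bug is absent.

```python
import multiprocessing as mp


def probe(payload: dict) -> str:
    """Run inside a worker: import openai lazily and build a ChatCompletion."""
    try:
        from openai.types.chat import ChatCompletion

        # construct() skips validation, so a partial payload is acceptable here.
        ChatCompletion.construct(**payload)
    except ImportError:
        return "openai not installed"
    except Exception as exc:
        # Report the exception class name, e.g. "PydanticUserError".
        return type(exc).__name__
    return "ok"


def run_forkserver_probe(payloads, processes: int = 2):
    """Map probe() over payloads in a forkserver pool."""
    # get_context raises ValueError on platforms without forkserver.
    ctx = mp.get_context("forkserver")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(probe, payloads)
```

Call `run_forkserver_probe([...])` from a script under an `if __name__ == "__main__":` guard, once with DEFER_PYDANTIC_BUILD=true and once with DEFER_PYDANTIC_BUILD=false, and check that no result equals "PydanticUserError".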

…server workers

When defer_build=True (the default), pydantic wraps __pydantic_core_schema__
in a MockCoreSchema. Accessing schema["type"] triggers a lazy rebuild that uses
a hardcoded stack depth (5 frames) to resolve forward-reference namespaces.
In multiprocess forkserver workers the call stack is different, so the heuristic
fails and pydantic >= 2.12 raises PydanticUserError.

_get_extra_fields_type now detects an unbuilt model via __pydantic_complete__ and
calls model_rebuild() with the class's own module namespace before touching the
schema. This bypasses the fragile stack-depth heuristic and avoids the error
entirely. A second guard wraps the schema access itself so that any residual
failure degrades gracefully instead of crashing.

Fixes openai#3079
Copilot AI review requested due to automatic review settings April 28, 2026 09:46
@SanskaarUndale21 SanskaarUndale21 requested a review from a team as a code owner April 28, 2026 09:46

Copilot AI left a comment

Pull request overview

Fixes an intermittent PydanticUserError triggered by Pydantic v2.12’s deferred schema rebuild (defer_build=True) in multiprocess forkserver workers by proactively rebuilding models using the model’s module namespace and by guarding schema access.

Changes:

  • Add proactive model_rebuild() when __pydantic_complete__ is false, using the class’s module namespace from sys.modules.
  • Wrap __pydantic_core_schema__ access in error handling to avoid propagating rebuild failures during construction.

Comment thread src/openai/_models.py
Comment on lines +446 to +450
# When defer_build=True, __pydantic_core_schema__ may be a MockCoreSchema whose
# __getitem__ triggers a lazy rebuild. That lazy rebuild uses a hardcoded stack
# depth (5 frames) to locate forward-reference namespaces, which is unreliable in
# multiprocess forkserver workers where the call stack is shallower. In pydantic
# >= 2.12 this manifests as a PydanticUserError at runtime.

Copilot AI Apr 28, 2026

This change targets a forkserver + deferred-build regression in pydantic >=2.12, but there doesn’t appear to be test coverage that exercises the deferred rebuild (MockCoreSchema) path in a forkserver worker. Adding a regression test (skipping platforms without forkserver) would help prevent reintroducing this crash.

Comment thread src/openai/_models.py
Comment on lines +457 to +460
try:
    cls.model_rebuild(_types_namespace=vars(module) if module is not None else None)
except Exception:
    return None

Copilot AI Apr 28, 2026

cls.model_rebuild(_types_namespace=...) relies on a private Pydantic kwarg; if that kwarg is absent/renamed in some supported pydantic v2 versions, this will raise (e.g. TypeError). Returning None immediately in that case skips the later schema-based path that might still succeed. Consider detecting whether _types_namespace is supported (e.g. via inspect.signature) and/or falling back to calling model_rebuild() without it, and avoid returning early so behavior degrades minimally across versions.

Suggested change
- try:
-     cls.model_rebuild(_types_namespace=vars(module) if module is not None else None)
- except Exception:
-     return None
+ model_rebuild = cls.model_rebuild
+ try:
+     supports_types_namespace = "_types_namespace" in inspect.signature(model_rebuild).parameters
+ except (TypeError, ValueError):
+     supports_types_namespace = False
+ try:
+     if supports_types_namespace:
+         model_rebuild(_types_namespace=vars(module) if module is not None else None)
+     else:
+         model_rebuild()
+ except TypeError:
+     try:
+         model_rebuild()
+     except Exception:
+         pass
+ except Exception:
+     pass
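
The signature check in this suggestion can be tried in isolation. The sketch below is standard library only; `call_with_optional_kwarg`, `rebuild_new` and `rebuild_old` are hypothetical names standing in for two versions of a rebuild function, one that accepts `_types_namespace` and one that does not.

```python
import inspect


def call_with_optional_kwarg(func, name, value):
    """Pass `name=value` only when func's signature declares that parameter."""
    try:
        supported = name in inspect.signature(func).parameters
    except (TypeError, ValueError):
        # Some callables expose no introspectable signature.
        supported = False
    return func(**{name: value}) if supported else func()


def rebuild_new(*, _types_namespace=None):
    # Stand-in for a version that accepts the private keyword.
    return ("with-namespace", _types_namespace)


def rebuild_old():
    # Stand-in for a version where the keyword is absent.
    return ("plain", None)


print(call_with_optional_kwarg(rebuild_new, "_types_namespace", {"a": 1}))
print(call_with_optional_kwarg(rebuild_old, "_types_namespace", {"a": 1}))
```

Note that a plain name check does not detect a function that accepts the keyword only through `**kwargs`.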

Comment thread src/openai/_models.py
Comment on lines +471 to +474
except Exception:
    # Accessing the schema failed (e.g. the deferred rebuild produced an error);
    # return None so construct() falls back to treating extra fields as untyped.
    return None

Copilot AI Apr 28, 2026

The broad except Exception around schema access will swallow unexpected errors (e.g. schema shape changes causing KeyError, or accidental AttributeError) and silently treat extras as untyped. To avoid masking real bugs, consider narrowing this to the specific failure you’re targeting (notably pydantic.errors.PydanticUserError from deferred rebuild) and letting truly unexpected exceptions surface.
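
One possible shape for the narrower handler, as a sketch only. `schema_type_or_none` and its `read_schema` callable are hypothetical, and the fallback exception class exists solely so the example runs where pydantic v2 is not installed.

```python
try:
    from pydantic.errors import PydanticUserError
except ImportError:
    class PydanticUserError(Exception):  # stand-in when pydantic v2 is absent
        def __init__(self, message, *, code=None):
            super().__init__(message)
            self.code = code


def schema_type_or_none(read_schema):
    """Return schema["type"], or None only when the deferred rebuild fails."""
    try:
        schema = read_schema()
    except PydanticUserError:
        # The targeted failure: treat extras as untyped.
        return None
    # A KeyError from an unexpected schema shape is deliberately not caught.
    return schema["type"]


print(schema_type_or_none(lambda: {"type": "model"}))
```

With this shape, a change in schema layout surfaces as an exception instead of silently producing untyped extras.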

Development

Successfully merging this pull request may close these issues.

PydanticUserError with defer_build=True and pydantic 2.12 in multiprocess forkserver workers

2 participants