types(generated): auto-enforce publisher-selector XOR at Pydantic parse time #759

@bokelley

Context

The publisher-property-selector JSON Schema (adcp#4504) enforces two XOR constraints:

  • publisher_domain XOR publisher_domains[]
  • property_ids XOR property_tags (gated by selection_type discriminator)

These are expressed in the schema via allOf[not[required[both]]] + anyOf[required[either]]. datamodel-code-generator cannot translate that construct into Pydantic field constraints, so the generated PublisherPropertySelector1 / …3 accept payloads the schema rejects when consumers parse directly:

# Silently passes Pydantic validation today — should not.
PublisherPropertySelector1.model_validate({"selection_type": "all"})
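
The construct reads as "not both, and at least one". A minimal plain-Python sketch of that reading follows, assuming the fragment has roughly the shape below. The exact nesting in adcp#4504 may differ, and `XOR_FRAGMENT` and `satisfies` are illustrative names, not part of the codebase.

```python
# Assumed shape of the XOR fragment; the real schema in adcp#4504 may differ.
XOR_FRAGMENT = {
    "allOf": [{"not": {"required": ["publisher_domain", "publisher_domains"]}}],
    "anyOf": [
        {"required": ["publisher_domain"]},
        {"required": ["publisher_domains"]},
    ],
}


def satisfies(schema: dict, payload: dict) -> bool:
    """Evaluate only the four keywords used here: required, not, allOf, anyOf."""
    if "required" in schema and not all(k in payload for k in schema["required"]):
        return False
    if "not" in schema and satisfies(schema["not"], payload):
        return False
    if "allOf" in schema and not all(satisfies(s, payload) for s in schema["allOf"]):
        return False
    if "anyOf" in schema and not any(satisfies(s, payload) for s in schema["anyOf"]):
        return False
    return True
```

Under this reading, a payload carrying neither field fails the anyOf arm and a payload carrying both fails the not arm. Neither rule survives as a Pydantic field constraint.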

What's already in place (PR #756)

validate_publisher_properties_item was extended to accept Pydantic model instances, so consumers have a one-call escape hatch:

selector = PublisherPropertySelector1.model_validate(payload)
validate_publisher_properties_item(selector)

This closes the gap when consumers remember to call the helper. It does not catch the case where someone parses an AuthorizedAgent (which carries a list[PublisherPropertySelector1 | …2 | …3]) and trusts Pydantic to have validated the array elements.
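
A minimal sketch of the nested case, assuming Pydantic v2. `Selector`, `Agent`, and `xor_ok` are hypothetical stand-ins for the generated selector, AuthorizedAgent, and the helper's XOR check; the real models carry more fields.

```python
from typing import List, Optional

from pydantic import BaseModel


class Selector(BaseModel):
    """Hypothetical stand-in for a generated selector arm: both fields optional."""

    publisher_domain: Optional[str] = None
    publisher_domains: Optional[List[str]] = None


class Agent(BaseModel):
    """Hypothetical stand-in for AuthorizedAgent."""

    publisher_properties: List[Selector]


def xor_ok(selector: Selector) -> bool:
    """Stand-in for the helper's check: exactly one of the two fields is set."""
    return (selector.publisher_domain is None) != (selector.publisher_domains is None)


# Parsing the container succeeds even though the element names no publisher.
agent = Agent.model_validate({"publisher_properties": [{}]})

# The gap only surfaces if the caller walks the list explicitly.
invalid = [s for s in agent.publisher_properties if not xor_ok(s)]
```

In this sketch the invalid element is caught only because the caller iterates over the list after parsing, which is the step consumers are likely to forget.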

What this issue tracks

Make Pydantic's own discriminated-union parser enforce the XOR — without violating the codebase's layering rule (CLAUDE.md → "Import Architecture for Generated Types"; nothing outside stable.py / aliases.py / _ergonomic.py may import from generated_poc/, and those modules cannot hand-patch generated content that will be overwritten on regen).

Approaches worth evaluating

A. Wrap-validator via __pydantic_core_schema__ override.
Add a wrapper in _ergonomic.py (or a sibling _post_init_validators.py) that calls cls.__pydantic_core_schema__.update(...) with a model-level wrap validator. Pydantic supports this via the __get_pydantic_core_schema__ hook, but registering one after class creation is undocumented and may break across minor versions.

B. Subclass-and-replace.
In aliases.py, define PublisherPropertySelector1WithXor(PublisherPropertySelector1) with a @model_validator(mode='after'). Re-point AuthorizedAgent.publisher_properties' discriminated union to the subclasses via model_rebuild. The list-variance issue means existing Pydantic instances of the base class won't auto-upgrade — but new model_validate calls go through the subclass.
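
A minimal sketch of the subclass half of this approach, assuming Pydantic v2. `GeneratedSelector` is a hypothetical stand-in for the generated class and `Container` for a model whose field is annotated with the subclass. Re-pointing the real AuthorizedAgent union via model_rebuild is not shown.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError, model_validator


class GeneratedSelector(BaseModel):
    """Hypothetical stand-in for the generated PublisherPropertySelector1."""

    publisher_domain: Optional[str] = None
    publisher_domains: Optional[List[str]] = None


class GeneratedSelectorWithXor(GeneratedSelector):
    @model_validator(mode="after")
    def _check_publisher_xor(self):
        # Reject "neither set" and "both set"; accept exactly one.
        has_single = self.publisher_domain is not None
        has_many = self.publisher_domains is not None
        if has_single == has_many:
            raise ValueError(
                "exactly one of publisher_domain / publisher_domains is required"
            )
        return self


class Container(BaseModel):
    """Hypothetical container whose list field uses the XOR-enforcing subclass."""

    publisher_properties: List[GeneratedSelectorWithXor]


def parses(payload: dict) -> bool:
    """Return True if the payload validates against Container."""
    try:
        Container.model_validate(payload)
        return True
    except ValidationError:
        return False
```

With the subclass in the annotation, the nested element is rejected during the container's own parse. The base class keeps its permissive behavior, so any code path that still references it directly is unaffected.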

C. Patch the codegen.
Teach scripts/generate_types.py to emit model_validator(mode='after') for selector arms when the source schema carries an allOf[not[required[both]]] + anyOf[required[either]] pattern. This solves the problem at the source and applies to every future XOR-shaped schema. Highest leverage, biggest blast radius.
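
A rough sketch of the detection step, in plain Python over a schema dict. How scripts/generate_types.py is structured is not known here, so `find_xor_pairs`, `render_validator`, and `EXAMPLE` are hypothetical, and the schema shape is an assumption. Only a top-level pattern is handled; arms gated by the selection_type discriminator would need the same check per arm.

```python
from typing import List, Tuple

# Assumed shape of the pattern; the real schema may nest it differently.
EXAMPLE = {
    "allOf": [{"not": {"required": ["publisher_domain", "publisher_domains"]}}],
    "anyOf": [
        {"required": ["publisher_domain"]},
        {"required": ["publisher_domains"]},
    ],
}


def find_xor_pairs(schema: dict) -> List[Tuple[str, str]]:
    """Return (a, b) pairs the schema marks as 'not both' and 'at least one'."""
    any_of = schema.get("anyOf", [])
    # Every anyOf arm must be exactly {"required": [one_field]}.
    if not any_of or any(
        list(arm.keys()) != ["required"] or len(arm["required"]) != 1
        for arm in any_of
    ):
        return []
    alternatives = {arm["required"][0] for arm in any_of}

    pairs = []
    for entry in schema.get("allOf", []):
        both = entry.get("not", {}).get("required", [])
        # The forbidden pair must be exactly the two anyOf alternatives.
        if len(both) == 2 and set(both) == alternatives:
            pairs.append((both[0], both[1]))
    return pairs


def render_validator(a: str, b: str) -> str:
    """Render source text for a Pydantic v2 after-validator enforcing a XOR b."""
    return (
        "    @model_validator(mode='after')\n"
        f"    def _xor_{a}_{b}(self):\n"
        f"        if (self.{a} is None) == (self.{b} is None):\n"
        f"            raise ValueError('exactly one of {a} / {b} is required')\n"
        "        return self\n"
    )
```

The rendered text is meant to be appended to the generated class body. Matching the pattern strictly, rather than heuristically, keeps the blast radius limited to schemas that really express a two-field XOR.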

D. Document the limitation and lean on the helper.
Status quo: ship #756, document the gap in the public API docs, accept the opt-in pattern. Pragmatic if no one's getting bitten in practice.

Recommendation

Probably C (codegen patch) — it's the only one that scales as schemas evolve and other XOR constraints land. B is a viable bridge if C is too big. A is brittle. D is the current state plus #756.

In scope

The selector / property_ids+property_tags XOR (the other one in the schema) gets the same treatment as part of this issue.
