Return Pydantic model instances through XCom for structured output #67644
Merged
…d output `LLMOperator`, `LLMAgentOperator`, `LLMFileAnalysisOperator`, and their `@task.llm` / `@task.agent` / `@task.llm_file_analysis` decorators stop calling `model_dump()` on Pydantic outputs before pushing to XCom. Downstream tasks now receive the model instance directly, so they can type-hint the class (`def downstream(result: MyModel)`) and use attribute access (`result.field`) instead of subscript access on a dict. To avoid forcing every DAG author to edit `[core] allowed_deserialization_classes`, the operators auto-register their `output_type` (and any `BaseModel` reachable from `Union`/`Optional`/`list` shapes) via a new `airflow.sdk.serde.allow_class(cls)` helper. The registration is process-local and runs in each worker's `__init__` -- same-DAG downstream tasks parse the DAG file when they start up, which re-runs the constructor and re-populates the per-process allow-list. The helper rejects classes that cannot be re-imported by qualname (defined in a function body, nested in another class, dynamically built with a mismatched `__name__`, or parametrised generics) so the failure surfaces at DAG parse time rather than at XCom-consume time. UI XCom viewer and cross-DAG `xcom_pull` are still gated by `[core] allowed_deserialization_classes` because the API server and other DAGs' workers don't import the producing DAG. Documented explicitly in the operator guides. Older Airflow versions that lack `allow_class` continue to get the dict form via a try/except fallback in each operator, so the provider keeps working on `apache-airflow>=3.0.0`.
…puts The UI's XCom viewer renders structured-output Pydantic instances via the ``stringify`` path (``airflow.serialization.stringify``) rather than the ``deserialize`` path, so user classes outside the ``airflow.*`` glob do not hit the allow-list gate -- they show up as ``module.MyModel@version=1(...)`` without any config change. Only cross-DAG ``xcom_pull`` is still gated. Also hoist a ``pydantic.create_model`` import to module scope in the serde test that was using it inline.
Strip DagBag's ``unusual_prefix_<sha>_`` module prefix from the displayed
classname and repr-quote string field values inside the ``classname@version=N(...)``
form. Before this change, an XCom value carrying a user-defined Pydantic class
rendered in the UI as:

    unusual_prefix_9ce9eb..._typed_xcom_demo.TicketAnalysis@version=1(
        priority=high,category=bug,summary=Nightly ETL...)

After:

    typed_xcom_demo.TicketAnalysis@version=1(
        priority='high', category='bug', summary='Nightly ETL...')

The prefix is a DagBag artifact (added to avoid ``sys.modules`` clashes
between same-named DAG files in different bundles) and has no value in the
human-readable XCom display. Quoting strings disambiguates ``field=value``
from a bare token and matches Pydantic/dataclass repr conventions.
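
A rough sketch of the display change described in this commit, not the actual `stringify` code: it strips the DagBag prefix from the module part of the classname and repr-quotes the field values. The function name `display_form`, the regex, and the assumption that the `<sha>` part is lowercase hex are illustrative guesses; the hash in the usage line is a made-up placeholder.

```python
import re

# Assumption: the DagBag prefix looks like "unusual_prefix_<hex sha>_".
_DAGBAG_PREFIX = re.compile(r"^unusual_prefix_[0-9a-f]+_")


def display_form(classname: str, version: int, fields: dict) -> str:
    """Build a human-readable `classname@version=N(...)` string (hypothetical helper)."""
    module, _, name = classname.rpartition(".")
    module = _DAGBAG_PREFIX.sub("", module)
    qualified = f"{module}.{name}" if module else name
    # repr() quotes strings, so `priority='high'` cannot be confused with a bare token.
    body = ", ".join(f"{key}={value!r}" for key, value in fields.items())
    return f"{qualified}@version={version}({body})"


print(
    display_form(
        "unusual_prefix_0123abcd_typed_xcom_demo.TicketAnalysis",  # placeholder hash
        1,
        {"priority": "high", "category": "bug"},
    )
)
```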
Three CI failures fixed:
1. Compat tests against Airflow 3.0.6 / 3.1.8: new tests assumed `allow_class` is importable and asserted on Pydantic instance shape. Gate the new tests behind a `requires_allow_class` marker so they skip cleanly on older Airflow (operators already fall back to `model_dump` there via the import-safe import).
2. Docs build failed with 12 RST errors in autoapi-generated `index.rst` for `example_dags` modules. Pydantic `BaseModel`'s inherited docstring leaks through autoapi rendering and breaks the Definition list. An explicit docstring on each module-level Pydantic class overrides the inherited one and keeps the RST valid.
3. Spell-check: `qualname` is a Python attribute name; backtick it in prose so the spell-checker treats it as code. Switched 'parametrised' (British) to 'parameterized' (American) to match wordlist.
Per Jed's review: some downstream consumers want the dict shape (e.g. forwarding the value to an external system that expects JSON-style payloads). Add serialize_output: bool = False to LLMOperator and AgentOperator (and via inheritance, LLMFileAnalysisOperator). When True the operator calls model_dump() before pushing to XCom, restoring the pre-PR behavior on demand without giving up the typed default. The class is not registered in _extra_allowed in that mode since the wire carries a plain dict and never hits the allow-list gate.
mypy was inferring the type from the first try-branch assignment (Callable[[type], None]) and then rejecting the except-branch's None fallback. Annotate the variable as object | None explicitly so both branches type-check.
Three more tests assert isinstance(result, BaseModel) from the execute_complete rehydration path. On older Airflow (no allow_class), the operator falls back to model_dump() and returns a dict, so those tests need the same compat gate.
gopidesupavan
approved these changes
May 28, 2026
Member
gopidesupavan
left a comment
LGTM thanks for adding this. only nit..
Per gopidesupavan's review: the model_validate_json + try/except + serialize logic was duplicated between LLMOperator.execute_complete and the AgentOperator HITL branch. Move it to a single rehydrate_pydantic_output helper in output_type.py and call from both sites.
Summary
When you pass `output_type=SomePydanticModel` to `LLMOperator` / `LLMAgentOperator` / `LLMFileAnalysisOperator` (or the matching `@task.llm` / `@task.agent` / `@task.llm_file_analysis` decorators), the operator used to call `model_dump()` on the result before pushing it to XCom. So even though the upstream declared a type, the downstream task got a dict and had to use `analysis["priority"]` instead of `analysis.priority`.

This PR drops that `model_dump()` call. The Pydantic instance flows through XCom as-is. Downstream tasks can type-hint the class and use attribute access.

Motivated by Jed's question :) "It would be nice if that could be `List[TicketAnalysis]` instead" of `list[dict]`.

What it looks like
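
A minimal sketch of the difference from the downstream task's point of view. To stay runnable without Airflow or Pydantic, it uses a stdlib dataclass as a stand-in for the Pydantic model and `dataclasses.asdict` as a stand-in for `model_dump()`. The field names follow the `TicketAnalysis` example shown elsewhere in this PR; the function names and field values are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass
class TicketAnalysis:  # stand-in for a Pydantic BaseModel subclass
    priority: str
    category: str
    summary: str


def downstream_before(analysis: dict) -> str:
    # Old behaviour: the operator pushed a dict, so the consumer used subscripts.
    return f"{analysis['priority']}/{analysis['category']}"


def downstream_after(analysis: TicketAnalysis) -> str:
    # New behaviour: the instance arrives as-is, so type hints and attributes work.
    return f"{analysis.priority}/{analysis.category}"


result = TicketAnalysis(priority="high", category="bug", summary="placeholder summary")
print(downstream_before(asdict(result)))  # dict shape (pre-PR)
print(downstream_after(result))           # instance shape (this PR)
```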
How it works (and why this design)
The Pydantic deserializer that turns the wire bytes back into a class instance already exists (see `task-sdk/.../serde/__init__.py:286`). What gates it is the allow-list check at `serde/__init__.py:264`, which only lets through classnames that match `[core] allowed_deserialization_classes` or live in the process-local `_extra_allowed` set.

Lifting that gate for all Pydantic models is what bolkedebruin pushed back on in PR #51059: `import_string` runs before any type check, and Pydantic validators (`@field_validator`, `@model_validator`) can execute arbitrary code during `model_validate`. So "trust any class as long as it inherits `BaseModel`" reopens an attack surface that was deliberately closed.

But in his same review he flagged the door:

This PR walks through that door. New helper in `airflow.sdk.serde`:

Each LLM operator calls `allow_class(output_type)` from `__init__`. The threat model is the same as a config edit: the DAG author put `output_type=MyModel` in code that's already trusted. An attacker who can change that argument already has DAG-file write access, which is RCE.

The reason same-DAG downstream tasks just work without any config: every worker that runs any task in the DAG parses the DAG file at startup, which re-runs every operator's `__init__`, which calls `allow_class` again. Process-local, idempotent.

`output_type` can be a single class, a Union, an Optional, a `list[Model]`, etc. (pydantic-ai accepts all of those). The new `iter_base_model_classes` helper walks the type tree and registers each reachable `BaseModel` so Union/Optional outputs work too.

Demo
Local run with a minimal DAG that mirrors what `LLMOperator(output_type=TicketAnalysis)` does internally -- registers the class via `allow_class`, returns the instance from one task, attribute-accesses it from the downstream task.

The producer task's log line confirms the value flows as a Pydantic instance, not a dict:

The consumer task receives it as `TicketAnalysis` and uses attribute access. The UI XCom viewer renders it via the existing `stringify` path:

Not pretty (no field-by-field rendering today), but the value shows without any allow-list edit on the deployment.
What doesn't get auto-registered
One path still needs `[core] allowed_deserialization_classes` updated: `xcom_pull` from a different DAG. The consumer DAG's worker only parses its own DAG file, so the producer's `LLMOperator.__init__` never runs there. That case is the same as today and is called out in the operator guides.

The UI XCom display goes through `stringify` (not `deserialize`), so it works without config -- I confirmed this in the live demo above. Earlier drafts of this PR overstated the limitation; the docs in this version match the actual behaviour.

Fail-fast on classes that can't round-trip
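
A minimal sketch of the kind of re-importability check this section is about, using only the standard library. It is not the helper's real implementation: `check_reimportable` and its error messages are hypothetical, and the real checks may differ in detail.

```python
import importlib


def check_reimportable(cls: type) -> None:
    """Raise ValueError if `cls` cannot be found again as `module.qualname`."""
    module, qualname = cls.__module__, cls.__qualname__
    if "<locals>" in qualname:
        raise ValueError(f"{qualname} is defined inside a function body")
    if "." in qualname:
        raise ValueError(f"{qualname} is nested inside another class")
    # Catches names that do not resolve to the same object, for example a class
    # built dynamically under a different name than the variable it is bound to.
    found = getattr(importlib.import_module(module), qualname, None)
    if found is not cls:
        raise ValueError(f"{module}.{qualname} does not resolve back to the same class")


def make_local_model() -> type:
    class LocalModel:  # defined in a function body, so it cannot be re-imported
        pass

    return LocalModel


try:
    check_reimportable(make_local_model())
except ValueError as exc:
    print(f"rejected: {exc}")
```

Running a check like this when the operator is constructed moves the error to the place where the offending class is named, rather than to a later import on the consumer side.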
`allow_class` rejects classes whose `qualname` can't be re-imported:

- Defined in a function body (`<locals>` in `__qualname__`)
- Nested in another class (dotted `__qualname__`)
- Dynamically built with a mismatched `__name__` (e.g. `MyModel = pydantic.create_model("Different", ...)`)
- Parameterized generics (`Result[int]`)

Without this guard the failure shows up at the downstream consumer's `import_string()` call with no hint at the root cause. With the guard it raises a clear `ValueError` at DAG parse time, pointing at the operator that owns the bad `output_type`. The example DAGs that previously defined their Pydantic class inside the `@dag` body are updated to put them at module scope.

Backwards compatibility
Provider is in `incubation` lifecycle (0.3.0), so the breaking change to the XCom value shape is permitted by the API contract reservation. Migration note added to the top of `providers/common/ai/docs/changelog.rst`.

The `allow_class` helper itself is new to `airflow.sdk.serde`. The provider still declares `apache-airflow>=3.0.0`, so I added a try/except import: when running against an older Airflow that doesn't have `allow_class`, the operators fall back to `model_dump()` (the previous behaviour). Users on the new Airflow get the typed path; users on older versions keep the dict path.