Add PydanticAIHook to common.ai provider #62546
Merged
kaxil merged 7 commits into apache:main on Feb 27, 2026
providers/common/ai/src/airflow/providers/common/ai/hooks/pydantic_ai.py
Author: All your hard work man 🙏
Adds a hook for LLM access via pydantic-ai to the common.ai provider. The hook manages connection credentials and creates pydantic-ai Model and Agent objects, supporting any provider (OpenAI, Anthropic, Google, Bedrock, Ollama, vLLM, etc.).

- get_conn() returns a pydantic-ai Model configured with credentials from the Airflow connection (api_key, base_url via provider_factory)
- create_agent() creates a pydantic-ai Agent with the hook's model
- test_connection() validates model resolution without an API call
- Connection UI fields: password (API Key), host (base URL), extra (model)
- Google Vertex/GLA providers delegate to default ADC auth

Co-Authored-By: GPK <gopidesupavan@gmail.com>
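The Connection UI fields listed above can be illustrated as a plain dict; the conn_id and all values here are hypothetical examples, not fixtures from the PR:

```python
import json

# Hypothetical Airflow connection using the UI fields from the hook:
# password -> API key, host -> base URL, extra -> JSON with the model id.
connection = {
    "conn_id": "pydantic_ai_default",     # hypothetical connection id
    "password": "sk-test",                # API Key (left empty for ADC/Bedrock auth chains)
    "host": "http://localhost:11434/v1",  # base URL, e.g. a local Ollama server
    "extra": json.dumps({"model": "openai:gpt-5"}),
}

# The hook would read the model id out of the `extra` field.
model_id = json.loads(connection["extra"])["model"]
print(model_id)  # openai:gpt-5
```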
TypeVar on create_agent() lets mypy propagate the output_type through Agent[None, OutputT] → RunResult[OutputT] → result.output, so callers like example_pydantic_ai_hook.py don't need type: ignore. Also fix black-docs blank line in RST code block.
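The overload/TypeVar pattern described above can be sketched in isolation. All names below are stand-ins (the real pydantic-ai Agent takes two type parameters, Agent[DepsT, OutputT]); this only shows how mypy propagates the output type without type: ignore:

```python
from typing import Generic, TypeVar, overload

OutputT = TypeVar("OutputT")

class Agent(Generic[OutputT]):
    """Stand-in for pydantic-ai's Agent; only remembers its output type."""
    def __init__(self, output_type: type) -> None:
        self.output_type = output_type

@overload
def create_agent() -> "Agent[str]": ...
@overload
def create_agent(output_type: "type[OutputT]") -> "Agent[OutputT]": ...

def create_agent(output_type: type = str) -> "Agent":
    # One runtime body; mypy selects the matching overload for callers,
    # so the output type flows through to result.output at the call site.
    return Agent(output_type)

print(create_agent(int).output_type)  # <class 'int'>
```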
- Move SQLResult inside task function so Sphinx autoapi doesn't document Pydantic BaseModel internals (fixes RST indentation errors)
- Add Groq, Ollama, vLLM to spelling wordlist
- Change "parseable" to "valid" in test_connection docstring
- Remove separate code-block from RST (class is now in exampleinclude)
- Import BaseHook from common.compat.sdk for Airflow 2.x/3.x compat
- Import dag/task from common.compat.sdk in example DAG
- Replace AirflowException with ValueError for model validation
- Use @overload for create_agent so mypy handles the default correctly
@dag-decorated functions must be invoked at module level for DagBag to discover them. Without the calls, DagBag finds 0 DAGs.
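Why the module-level call matters can be shown with a toy registry-based decorator; this mimics the shape of Airflow's @dag (the real DagBag scans module globals for DAG objects), not its actual implementation:

```python
# Toy stand-in for @dag: the DAG only becomes discoverable
# (registered) when the decorated function is actually invoked.
DAG_REGISTRY: list = []

def dag(fn):
    def wrapper(*args, **kwargs):
        fn(*args, **kwargs)               # build the tasks
        DAG_REGISTRY.append(fn.__name__)  # register the DAG
    return wrapper

@dag
def example_pydantic_ai_hook():
    pass  # task definitions would live here

# Without this module-level call, the registry (our DagBag) holds 0 DAGs.
example_pydantic_ai_hook()
print(len(DAG_REGISTRY))  # 1
```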
The grpcio>=1.70.0 pin was only applied for Python 3.13 when it was added in apache#61380, but yandexcloud>=0.328.0 ships generated protobuf stubs that require grpcio>=1.70.0 at runtime on all Python versions.
This was referenced Feb 27, 2026 (Closed)
AkshayArali pushed a commit to AkshayArali/airflow_630 that referenced this pull request on Feb 28, 2026.
dominikhei pushed a commit to dominikhei/airflow that referenced this pull request on Mar 11, 2026.
Adds PydanticAIHook to the common.ai provider: a hook for LLM access via pydantic-ai. This ships the connection and hook foundation for AIP-99. Future PRs will add operators (LLMSQLQueryOperator) and decorators (@task.llm_sql_query) on top.

The hook handles Airflow connection credentials and creates pydantic-ai Model and Agent objects. It works with any provider pydantic-ai supports: OpenAI, Anthropic, Google, Bedrock, Groq, Mistral, Ollama, vLLM, etc.

Usage

Connection fields:
- password: API key (sk-...)
- host: base URL (http://localhost:11434/v1 for Ollama)
- extra: {"model": "openai:gpt-5"}

Cloud providers (Bedrock, Vertex) that use native auth chains leave password empty; pydantic-ai picks up AWS_PROFILE, GOOGLE_APPLICATION_CREDENTIALS, etc. automatically.

Why these choices

get_conn() returns a pydantic-ai Model, not Agent or Connection. Airflow convention is that get_conn() returns a reusable SDK client (OpenAIHook → OpenAI client, DbApiHook → DBAPI connection). A pydantic-ai Model is the connection-level object (credentials + model ID). An Agent is session-level (it binds a model to task-specific config), so it lives in create_agent().

No abstract LLMHook base class. Every Airflow LLM hook (OpenAIHook, CohereHook, GenAIHook) extends BaseHook directly. LLMs don't share a stable interface beyond "send text, get text"; divergence starts immediately with structured output, tools, streaming, vision. Pydantic-ai's Model protocol already handles abstraction. We can extract a base class later if a second framework creates real evidence of what a shared interface should look like.

Credential injection via provider_factory. infer_model() doesn't accept api_key/base_url directly; it takes a provider_factory callback that creates provider instances with credentials. Google Vertex/GLA are special-cased since they use ADC and don't accept api_key.

Known issues

CI image build conflict: pydantic-ai-slim requires opentelemetry-api>=1.28.0, which transitively pulls protobuf>=5.0. This conflicts with yandexcloud (protobuf<5) when all providers are installed together with --resolution highest. The same issue was identified in #61794. The provider is state: not-ready, so this doesn't affect releases; it only affects the CI all-providers image build. Waiting for https://lists.apache.org/thread/qbx1b8p3296z5pj1hlg3qfggftgjw4m3

Co-Authored-By: GPK <gopidesupavan@gmail.com>
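The provider_factory pattern described under "Why these choices" can be sketched with stand-in functions. None of these names are the real pydantic-ai API; this only shows the shape of credential injection through a factory callback, including the Vertex/GLA special case:

```python
# Toy sketch: infer_model-style resolution takes a callback that builds
# the provider with credentials, instead of accepting api_key/base_url directly.
def infer_model(model_id: str, provider_factory):
    """Split 'provider:model' and build the provider through the callback."""
    provider_name, _, model_name = model_id.partition(":")
    provider = provider_factory(provider_name)  # credentials injected here
    return {"provider": provider, "model": model_name}

def make_provider_factory(api_key=None, base_url=None):
    def factory(provider_name):
        if provider_name in ("google-vertex", "google-gla"):
            # ADC-based providers take no api_key; delegate to default auth
            return {"name": provider_name}
        return {"name": provider_name, "api_key": api_key, "base_url": base_url}
    return factory

model = infer_model("openai:gpt-5", make_provider_factory(api_key="sk-test"))
print(model["model"])  # gpt-5
```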