Initial Checks
- I confirm that I'm using the latest version of Pydantic AI
- I confirm that I searched for my issue in https://github.com/pydantic/pydantic-ai/issues before opening this issue
Description
The vLLM implementation of the Responses API is currently stricter than other implementations: it produces an error if a function call item doesn't include all the fields it expects, such as the "status": None field.
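For reference, the kind of Responses API input item involved looks roughly like this (a sketch; the call id and arguments are placeholders, not taken from an actual request):

# Sketch of a Responses API "function_call" input item. Stricter vLLM-based
# servers reject the request unless the "status" field is present, even when
# its value is None. The call_id and arguments below are placeholders.
function_call_item = {
    "type": "function_call",
    "call_id": "call_123",
    "name": "get_weather",
    "arguments": '{"city": "Seattle"}',
    "status": None,  # the field the stricter server insists on
}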
Here's complete working code with the low-level openai package:
https://github.com/Azure-Samples/nim-on-azure-serverless-gpus-demos/blob/main/examples/openai_functioncalling_loop.py#L122
In other frameworks, we have resolved this by adding "status": None to the function call response, like in this PR for microsoft agent-framework:
https://github.com/microsoft/agent-framework/pull/1509/files
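To illustrate where that field ends up, here is a minimal sketch of a Responses API function-calling loop with the low-level openai package, in the spirit of the working example linked above. The endpoint, model name, and tool schema are assumptions, and the tool result is hard-coded:

import json
import os

from openai import OpenAI

client = OpenAI(base_url=os.environ["NIM_ENDPOINT"], api_key="none")
model_name = os.environ["NIM_MODEL"]

# Hypothetical tool schema matching the get_weather tool used in the repro below.
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Returns the weather for the given city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

messages = [{"role": "user", "content": "what's the weather in Seattle?"}]
response = client.responses.create(model=model_name, input=messages, tools=tools)

for item in response.output:
    if item.type == "function_call":
        # Echo the function call back into the input, explicitly including
        # "status": None so the stricter vLLM-based server accepts it.
        messages.append(
            {
                "type": "function_call",
                "call_id": item.call_id,
                "name": item.name,
                "arguments": item.arguments,
                "status": None,
            }
        )
        # Hard-coded stand-in for actually running the tool.
        result = {"city": "Seattle", "temperature": 60, "description": "Rainy"}
        messages.append(
            {"type": "function_call_output", "call_id": item.call_id, "output": json.dumps(result)}
        )

final = client.responses.create(model=model_name, input=messages, tools=tools)
print(final.output_text)

The agent-framework PR makes the equivalent change when it serializes function call content back into Responses API input items.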
I asked the vLLM maintainers whether they can relax the validation, and they are working on it, but I don't know how long that will take or when the fix will reach all the services that wrap vLLM.
I replicated the issue with a gpt-oss model deployed via NVIDIA NIM (which wraps vLLM), using the following code. If you ping me on Slack/LinkedIn/Twitter, I can share the endpoint URL.
import asyncio
import logging
import os
import random

from dotenv import load_dotenv
from openai import AsyncOpenAI
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.providers.openai import OpenAIProvider
from rich.logging import RichHandler

logging.basicConfig(level=logging.DEBUG, format="%(message)s", datefmt="[%X]", handlers=[RichHandler()])
logger = logging.getLogger("weekend_planner")

load_dotenv(override=True)

client = AsyncOpenAI(
    base_url=os.environ["NIM_ENDPOINT"],
    api_key="none",
)
model = OpenAIResponsesModel(os.environ["NIM_MODEL"], provider=OpenAIProvider(openai_client=client))


def get_weather(city: str) -> dict:
    """Returns the weather for the given city."""
    logger.info(f"Getting weather for {city}")
    if random.random() < 0.05:
        return {
            "city": city,
            "temperature": 72,
            "description": "Sunny",
        }
    else:
        return {
            "city": city,
            "temperature": 60,
            "description": "Rainy",
        }


agent = Agent(
    model,
    system_prompt="You are a helpful weather assistant.",
    tools=[get_weather],
)


async def main():
    result = await agent.run("what's the weather in Seattle?")
    print(result.output)


if __name__ == "__main__":
    logger.setLevel(logging.INFO)
    asyncio.run(main())
Python, Pydantic AI & LLM client version
pydantic==2.11.10
pydantic-ai==1.0.8
pydantic-ai-slim==1.0.8
pydantic-evals==1.0.8
pydantic-graph==1.0.8
pydantic-settings==2.11.0
pydantic_core==2.33.2