Support GPT-5 Freeform Function Calling and Context Free Grammar for tool args and output #3612
@@ -210,6 +210,137 @@ print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
```

### Freeform Function Calling

With freeform function calling, GPT-5 can send raw text payloads - anything from Python scripts to SQL queries - to your custom tool without wrapping the data in JSON. This differs from classic structured function calls and gives you greater flexibility when interacting with external runtimes such as:

* Code execution sandboxes (Python, C++, Java, …)
* SQL databases
* Shell environments
* Configuration generators

Note that freeform function calling does NOT support parallel tool calling.
You can enable freeform function calling for a tool by annotating its string parameter with [`FreeformText`][pydantic_ai.tools.FreeformText]. The tool must take a single string argument (other than the run context) and the model must be one of the GPT-5 Responses models. For example:

```python
from typing import Annotated

from pydantic_ai import Agent, FreeformText
from pydantic_ai.models.openai import OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5')  # (1)!
agent = Agent(model)


@agent.tool_plain
def freeform_tool(sql: Annotated[str, FreeformText()]): ...  # (2)!
```

1. The GPT-5 family (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`) all support freeform function calling.
2. If the tool or model cannot be used with freeform function calling, it will be invoked in the normal way.

You can read more about this function calling style in the [OpenAI documentation](https://cookbook.openai.com/examples/gpt-5/gpt-5_new_params_and_tools#2-freeform-function-calling).
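To make the flow concrete, here is a minimal end-to-end sketch (the SQLite file `app.db`, the tool body, and the prompt are illustrative assumptions, not part of this PR):

```python
import sqlite3
from typing import Annotated

from pydantic_ai import Agent, FreeformText
from pydantic_ai.models.openai import OpenAIResponsesModel

agent = Agent(OpenAIResponsesModel('gpt-5'))


@agent.tool_plain
def run_sql(sql: Annotated[str, FreeformText()]) -> str:
    """Run a SQL query against the local database and return the rows."""
    # With freeform function calling, `sql` arrives as raw text rather than
    # as a field inside a JSON arguments object.
    with sqlite3.connect('app.db') as conn:
        rows = conn.execute(sql).fetchall()
    return repr(rows)


result = agent.run_sync('How many users signed up this week?')
print(result.output)
```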
#### Context Free Grammar

A tool that queries an SQL database can only accept valid SQL. For this situation, GPT-5's freeform function calling can constrain the generated text with a context free grammar so that the model only produces valid SQL.

A context-free grammar is a collection of production rules that define which strings belong to a language. Each rule rewrites a non-terminal symbol into a sequence of terminals (literal tokens) and/or other non-terminals, independently of the surrounding context (hence "context-free"). CFGs can capture the syntax of most programming languages and, in OpenAI custom tools, serve as contracts that force the model to emit only strings that the grammar accepts.
##### Regular Expression

The grammar can be written either as a regular expression, using [`RegexGrammar`][pydantic_ai.tools.RegexGrammar]:
```python
from typing import Annotated

from pydantic_ai import Agent, RegexGrammar
from pydantic_ai.models.openai import OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5')  # (1)!
agent = Agent(model)

timestamp_pattern = r'^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]) (?:[01]\d|2[0-3]):[0-5]\d$'


@agent.tool_plain
def timestamp_accepting_tool(timestamp: Annotated[str, RegexGrammar(timestamp_pattern)]): ...  # (2)!
```

Collaborator: What do you think about supporting the Pydantic pattern

1. The GPT-5 family (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`) all support freeform function calling with context free grammar constraints. Unfortunately `gpt-5-nano` often struggles with these calls.
2. If the tool or model cannot be used with freeform function calling, it will be invoked in the normal way, which may lead to invalid input.
Collaborator: It would be super cool if we could perform our own agent-side validation of the input by defining … If we do that, then most of this can be documented outside of OpenAI context.
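As a quick local sanity check of the pattern, you can exercise it with Python's `re` before relying on the model (a hedged sketch; the hosted grammar engine may differ slightly in supported regex features):

```python
import re

timestamp_pattern = r'^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]) (?:[01]\d|2[0-3]):[0-5]\d$'

# Strings the constraint should accept and reject.
assert re.fullmatch(timestamp_pattern, '2024-01-15 09:30')
assert re.fullmatch(timestamp_pattern, '2024-13-01 25:61') is None
```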
##### Lark

Or as a [Lark](https://lark-parser.readthedocs.io/en/latest/how_to_use.html) grammar, using [`LarkGrammar`][pydantic_ai.tools.LarkGrammar]:
```python
from typing import Annotated

from pydantic_ai import Agent, LarkGrammar
from pydantic_ai.models.openai import OpenAIResponsesModel

model = OpenAIResponsesModel('gpt-5')  # (1)!
agent = Agent(model)

timestamp_grammar = r'''
start: timestamp

timestamp: YEAR "-" MONTH "-" DAY " " HOUR ":" MINUTE

%import common.DIGIT

YEAR: DIGIT DIGIT DIGIT DIGIT
MONTH: /(0[1-9]|1[0-2])/
DAY: /(0[1-9]|[12]\d|3[01])/
HOUR: /([01]\d|2[0-3])/
MINUTE: /[0-5]\d/
'''


@agent.tool_plain
def i_like_iso_dates(date: Annotated[str, LarkGrammar(timestamp_grammar)]): ...  # (2)!
```
1. The GPT-5 family (`gpt-5`, `gpt-5-mini`, `gpt-5-nano`) all support freeform function calling with context free grammar constraints. Unfortunately `gpt-5-nano` often struggles with these calls.
2. If the tool or model cannot be used with freeform function calling, it will be invoked in the normal way, which may lead to invalid input.

There is a limit to the grammar complexity that GPT-5 supports, so it is important to test your grammar.
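One way to test it locally is to parse some expected inputs with the `lark` package (a hedged sketch, assuming `lark` is installed and reusing `timestamp_grammar` from the example above; the hosted grammar engine may not support every Lark feature, so parsing locally is a necessary rather than sufficient check):

```python
from lark import Lark

# Build a local parser from the same grammar string passed to `LarkGrammar`.
parser = Lark(timestamp_grammar)

parser.parse('2024-01-15 09:30')  # accepted by the grammar
try:
    parser.parse('2024-1-15 9:30')  # missing zero-padding, so it is rejected
except Exception as exc:
    print(f'Rejected as expected: {exc!r}')
```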
Freeform function calling, with or without a context free grammar, can also be used for the agent's output type:

```python
from typing import Annotated

from pydantic_ai import Agent, LarkGrammar
from pydantic_ai.models.openai import OpenAIResponsesModel

sql_grammar = r'''
start: select_stmt
select_stmt: "SELECT" select_list "FROM" table ("WHERE" condition ("AND" condition)*)?
select_list: "*" | column ("," column)*
table: "users" | "orders"
column: "id" | "user_id" | "name" | "age"
condition: column ("=" | ">" | "<") (NUMBER | STRING)
%import common.NUMBER
%import common.ESCAPED_STRING -> STRING
%import common.WS
%ignore WS
'''  # (1)!

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model, output_type=Annotated[str, LarkGrammar(sql_grammar)])
```

Collaborator: Let's try to implement support for this feature in
1. An inline SQL grammar definition would be quite extensive, so this simplified version is used here; you can find an example SQL grammar [in the OpenAI example](https://cookbook.openai.com/examples/gpt-5/gpt-5_new_params_and_tools#33-example---sql-dialect--ms-sql-vs-postgresql). There are also example grammars in the [lark repo](https://github.com/lark-parser/lark/blob/master/examples/composition/json.lark). Remember that a simpler grammar that matches your DDL will be easier for GPT-5 to work with and will result in fewer semantically invalid results.
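A rough usage sketch of the agent above (the prompt and the printed query are illustrative, not captured output):

```python
result = agent.run_sync('Which users are older than 30?')
print(result.output)
# The output is a string accepted by `sql_grammar`, for example:
# SELECT name FROM users WHERE age > 30
```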
##### Best Practices

You can find recommended best practices in the [OpenAI Cookbook](https://cookbook.openai.com/examples/gpt-5/gpt-5_new_params_and_tools#35-best-practices).

* [Lark Docs](https://lark-parser.readthedocs.io/en/stable/)
* [Lark IDE](https://www.lark-parser.org/ide/)
* [OpenAI Cookbook on CFG](https://cookbook.openai.com/examples/gpt-5/gpt-5_new_params_and_tools#3-contextfree-grammar-cfg)
## OpenAI-compatible Models

Many providers and models are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI.
@@ -24,7 +24,7 @@
 from ._utils import check_object_json_schema, is_async_callable, is_model_like, run_in_executor

 if TYPE_CHECKING:
-    from .tools import DocstringFormat, ObjectJsonSchema
+    from .tools import DocstringFormat, ObjectJsonSchema, TextFormat


 __all__ = ('function_schema',)
@@ -44,6 +44,8 @@ class FunctionSchema:
     single_arg_name: str | None = None
     positional_fields: list[str] = field(default_factory=list)
     var_positional_field: str | None = None
+    text_format: TextFormat | None = None
+    """Text format annotation extracted from a string parameter, if present."""

     async def call(self, args_dict: dict[str, Any], ctx: RunContext[Any]) -> Any:
         args, kwargs = self._call_args(args_dict, ctx)
@@ -111,6 +113,7 @@ def function_schema( # noqa: C901
     positional_fields: list[str] = []
     var_positional_field: str | None = None
     decorators = _decorators.DecoratorInfos()
+    text_format: TextFormat | None = None

     description, field_descriptions = doc_descriptions(function, sig, docstring_format=docstring_format)
@@ -147,6 +150,13 @@
             errors.append('RunContext annotations can only be used as the first argument')
             continue

+        # Extract text format annotation if present
+        if extracted_format := _extract_text_format(annotation):
+            if text_format is not None:
+                errors.append('Only one parameter may have a TextFormat annotation')
Collaborator: We may be able to weaken this requirement and support multiple grammar-constrained str args, if we can do the validation on our side. Then we'd use OpenAI's custom tools functionality only if there is a single arg with a format annotation.
+            else:
+                text_format = extracted_format

         field_name = p.name
         if p.kind == Parameter.VAR_KEYWORD:
             var_kwargs_schema = gen_schema.generate_schema(annotation)
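For context on the check above, a hypothetical tool (not from this diff) with two annotated string parameters would hit the 'Only one parameter may have a TextFormat annotation' error path:

```python
from typing import Annotated

from pydantic_ai.tools import FreeformText, RegexGrammar


def bad_tool(
    sql: Annotated[str, FreeformText()],
    # A second text-format annotation on the same tool triggers the error above.
    note: Annotated[str, RegexGrammar(r'[a-z ]+')],
) -> str: ...
```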
@@ -222,6 +232,7 @@
         takes_ctx=takes_ctx,
         is_async=is_async_callable(function),
         function=function,
+        text_format=text_format,
     )
@@ -301,3 +312,39 @@
 def _is_call_ctx(annotation: Any) -> bool:
     """Return whether the annotation is the `RunContext` class, parameterized or not."""
     return annotation is RunContext or get_origin(annotation) is RunContext
+
+
+def _extract_text_format(annotation: Any) -> TextFormat | None:
+    """Extract a TextFormat annotation from an Annotated type hint.
+
+    Args:
+        annotation: The type annotation to check.
+
+    Returns:
+        The TextFormat instance if found, None otherwise.
+    """
+    from typing import Annotated, get_args, get_origin
Collaborator: Please move imports to the top of the file
+
+    from .tools import FreeformText, LarkGrammar, RegexGrammar
Collaborator: Maybe we should define the types in this file, and import them there
+
+    if get_origin(annotation) is not Annotated:
+        return None
+
+    args = get_args(annotation)
+    if len(args) < 2:
+        return None
+
+    # First arg is the base type, rest are metadata
+    base_type = args[0]
+    metadata = args[1:]
+
+    # Check if base type is str
+    if base_type is not str:
+        return None
+
+    # Look for TextFormat in metadata
+    for item in metadata:
+        if isinstance(item, (FreeformText, RegexGrammar, LarkGrammar)):
Collaborator: Or if we make these all subclasses of one type that's defined here, the more interesting subtypes can be defined in
+            return item
+
+    return None
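For reference, a rough sketch of how this helper behaves (the names come from this PR; the snippet itself is not part of the diff):

```python
from typing import Annotated

from pydantic_ai.tools import FreeformText, RegexGrammar

# A grammar-constrained `str` parameter yields its format marker...
assert isinstance(_extract_text_format(Annotated[str, RegexGrammar(r'\d+')]), RegexGrammar)
# ...while plain or non-str annotations yield None.
assert _extract_text_format(str) is None
assert _extract_text_format(Annotated[int, FreeformText()]) is None
```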
Collaborator: The fact that this can be used for output is a bit buried now, I'd like that to be clearer. If we do what I wrote in the other comment about validating on the agent side, this would warrant sections on the Output and Tool docs.