[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/aurelio-labs/cookbook/blob/main/gen-ai/openai/agents-sdk-intro.ipynb)


## OpenAI's Agents SDK
* This notebook was forked from the James Briggs tutorial on this SDK published on March 12, 2025.
* Today's date: 4/11/2025.
* I added a few notes and details from James's tutorial.

OpenAI have released an **Agents SDK**, their version of an open source agent development library.

OpenAI have outlined a few features of the library:

```
* Agent loop: Built-in agent loop that handles calling tools, sending results to the LLM, and looping until the LLM is done.
* Python-first: Use built-in language features to orchestrate and chain agents, rather than needing to learn new abstractions.
* Handoffs: A powerful feature to coordinate and delegate between multiple agents.
* Guardrails: Run input validations and checks in parallel to your agents, breaking early if the checks fail.
* Function tools: Turn any Python function into a tool, with automatic schema generation and Pydantic-powered validation.
* Tracing: Built-in tracing that lets you visualize, debug and monitor your workflows, as well as use the OpenAI suite of evaluation, fine-tuning and distillation tools.
```

([source](https://openai.github.io/openai-agents-python/))

We'll focus on covering the essentials here - including the **agent loop**, **python-first**, **guardrails**, and **function tools** features.

Let's start by installing the library:

In [1]:
!pip install -qU openai-agents==0.0.3


First let's set our [OpenAI API key](https://platform.openai.com/settings/organization/api-keys).

In [2]:
import os
from getpass import getpass

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or \
  getpass("Enter your OpenAI API key: ")

Enter your OpenAI API key: ··········


## Init Simple Agent

In [3]:
from agents import Agent, Runner

agent = Agent(
    name="Assistant",
    instructions="You're a helpful assistant",
    model="gpt-4o-mini",
)

## Running our Agent

OpenAI gives us three methods for running our agent, all via the `Runner` class:

1. `Runner.run()`, which runs asynchronously without streaming.
2. `Runner.run_sync()`, which runs synchronously.
3. `Runner.run_streamed()`, which runs asynchronously _and_ streams the response back to us.

We'll quickly test method **(1)**:

In [4]:
## test the async run method
result = await Runner.run(
    starting_agent=agent,
    input="tell me a short story"
)
result.final_output

"Once upon a time in a quaint village nestled between emerald hills, there lived a young girl named Elara who loved to paint. Every morning, she would venture into the meadows, her easel in tow, capturing the colors of the flowers and the vibrant skies. However, there was one thing Elara longed to paint but had never seen: the legendary Rainbow Falls said to be hidden deep in the Enchanted Forest.\n\nOne bright day, inspired by her dreams, Elara decided to embark on an adventure to find the falls. She packed her paints and set off, her heart racing with excitement. The forest was alive with the sound of rustling leaves and distant bird songs. After wandering for hours, she stumbled upon a clearing where sunlight danced through the trees, illuminating a path adorned with wildflowers.\n\nFollowing the path, Elara finally arrived at Rainbow Falls. As the sunlight hit the cascading water, it splintered into a myriad of colors, creating a breathtaking rainbow that arched over the falls. Ela
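
Method **(2)** is not exercised in this notebook. As a rough illustration of how a blocking call relates to an awaited one, here is a library-free sketch. `fake_run` and `fake_run_sync` are hypothetical stand-ins that make no API call, and the wrapping shown is an assumption about the general pattern, not the SDK's actual implementation:

```python
import asyncio


async def fake_run(prompt: str) -> str:
    """Hypothetical stand-in for an async agent run (no API call)."""
    await asyncio.sleep(0)  # yield control once, as a network call would
    return f"echo: {prompt}"


def fake_run_sync(prompt: str) -> str:
    """Blocking wrapper: start an event loop, run the coroutine, return its result."""
    return asyncio.run(fake_run(prompt))


async def main() -> str:
    # Inside async code we simply await the coroutine.
    return await fake_run("hello")


sync_result = fake_run_sync("hello")   # called from plain synchronous code
async_result = asyncio.run(main())     # called from an async entry point
print(sync_result, "|", async_result)
```

`asyncio.run` refuses to start when an event loop is already running, which is the case inside a notebook cell. That is one reason the cells here use `await Runner.run(...)` directly.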

## Summary
1. In most scenarios we'll want method **(3)**, i.e. running async and streaming tokens as soon as the LLM produces them.
2. To do this we need to write a little more code to handle the async streaming and print the tokens as they're returned.

First, we create a `RunResultStreaming` object by calling `Runner.run_streamed(...)`. We then _asynchronously_ iterate through the streamed events returned by our LLM using the `response.stream_events()` method:

In [5]:
## streaming example
## printing every single event that is returned
response = Runner.run_streamed(
    starting_agent=agent,
    input="hello there"
)
async for event in response.stream_events():
    print(event)

AgentUpdatedStreamEvent(new_agent=Agent(name='Assistant', instructions="You're a helpful assistant", handoff_description=None, handoffs=[], model='gpt-4o-mini', model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=False, truncation=None), tools=[], input_guardrails=[], output_guardrails=[], output_type=None, hooks=None), type='agent_updated_stream_event')
RawResponsesStreamEvent(data=ResponseCreatedEvent(response=Response(id='resp_67f8f6357ec4819296844b7378c3d93b0470f88b55d46d6d', created_at=1744369205.0, error=None, incomplete_details=None, instructions="You're a helpful assistant", metadata={}, model='gpt-4o-mini-2024-07-18', object='response', output=[], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, max_output_tokens=None, previous_response_id=None, reasoning=Reasoning(effort=None, generate_summary=None), status='in_progress', text=ResponseTextConfig(forma

Summary
* We can now see EVERY SINGLE EVENT that is streamed from the OpenAI API call.

## We can filter these various event types to find only raw tokens
* Note: this only works for direct LLM output. Once you introduce tools, it gets more complicated.

In [6]:
from openai.types.responses import ResponseTextDeltaEvent

# we do need to reinitialize our runner before re-executing
response = Runner.run_streamed(
    starting_agent=agent,
    input="tell me a short story"
)

async for event in response.stream_events():
    if event.type == "raw_response_event" and \
        isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)

Once upon a time in a quaint village nestled between rolling hills, there lived a curious little girl named Lila. Every day after school, she explored the surrounding woods, collecting colorful leaves and smooth stones, all the while dreaming of adventures beyond her village.

One sunny afternoon, while wandering deeper into the forest than ever before, she stumbled upon a shimmering pond. The water was so clear that it reflected the sky like a mirror. Sitting by the edge, Lila noticed a tiny, golden fish struggling to escape a patch of tangled weeds.

Without hesitation, she plunged her hands into the cool water and gently freed the fish. To her astonishment, the fish began to glow and transformed into a beautiful fairy. With a twinkle in her eye, the fairy thanked Lila and granted her one wish.

Lila thought carefully. Instead of wishing for toys or riches, she said, "I wish for more adventures for everyone in my village." The fairy smiled, then waved her wand. 

From that day on, th
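
To try the filtering logic without an API call, here is a small offline sketch. The classes `TextDelta`, `Created`, and `StreamEvent` and the generator `fake_stream` are hypothetical stand-ins for the SDK's event objects, not part of the library:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TextDelta:
    """Hypothetical stand-in for ResponseTextDeltaEvent."""
    delta: str


@dataclass
class Created:
    """Hypothetical stand-in for any non-text payload."""


@dataclass
class StreamEvent:
    """Hypothetical stand-in for the SDK's stream event wrapper."""
    type: str
    data: object = None


async def fake_stream():
    """Yield a mixed sequence of events, as a real stream would."""
    yield StreamEvent("agent_updated_stream_event")
    yield StreamEvent("raw_response_event", Created())
    for token in ["Once", " upon", " a", " time"]:
        yield StreamEvent("raw_response_event", TextDelta(token))


async def collect_text() -> str:
    chunks = []
    async for event in fake_stream():
        # Same two-step check as the cell above: event type, then payload type.
        if event.type == "raw_response_event" and isinstance(event.data, TextDelta):
            chunks.append(event.data.delta)
    return "".join(chunks)


story = asyncio.run(collect_text())
print(story)
```

Only the four text deltas survive the filter; the agent-update event and the non-text payload are skipped.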

## Tools

* OpenAI included **function tools** as a **key feature** in their Agents SDK announcement.
* After turning everyone away from using _function calling_ to instead use _tool calling_, OpenAI have now decided that an LLM deciding to execute some code will be called _"function tools"_.

* To use _function tools_ in Agents SDK we simply decorate a function with the `@function_tool` decorator like so:

In [7]:
from agents import function_tool

## multiplication tool
@function_tool
def multiply(x: float, y: float) -> float:
    """Multiplies `x` and `y` to provide a precise
    answer."""
    return x*y

Note that we have:
1. taken extra care to use a clear and descriptive function name,
2. used relatively clear parameter names,
3. added type annotations for both the input parameters and the expected output, and
4. written a natural language docstring that will be fed to the LLM to explain _what_ this tool does.
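
To make the role of these four ingredients concrete, the sketch below builds a tool description from a plain function using only the standard library. `describe_tool` and `JSON_TYPES` are hypothetical names, and this is a simplified illustration of the idea, not how the SDK generates its schema:

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {float: "number", int: "integer", str: "string", bool: "boolean"}


def describe_tool(func) -> dict:
    """Derive a name, description, and parameter schema from a function."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only the input parameters go into the schema
    params = list(inspect.signature(func).parameters)
    return {
        "name": func.__name__,                      # from the function name
        "description": inspect.getdoc(func) or "",  # from the docstring
        "parameters": {
            "type": "object",
            "properties": {p: {"type": JSON_TYPES[hints[p]]} for p in params},
            "required": params,
        },
    }


def multiply(x: float, y: float) -> float:
    """Multiplies `x` and `y` to provide a precise answer."""
    return x * y


schema = describe_tool(multiply)
print(schema)
```

A vague function name, missing annotations, or an empty docstring would leave the corresponding field of this description uninformative, which is why each of the points above matters.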

To run our agent _with_ tools we simply pass our new tool into the `tools` parameter during `Agent` initialization.

In [9]:
agent = Agent(
    name="Assistant",
    ## system prompt/instructions
    instructions=(
        "You're a helpful assistant, remember to always "
        "use the provided tools whenever possible. Do not "
        "rely on your own knowledge too much and instead "
        "use your tools to help you answer queries."
    ),
    model="gpt-4o-mini",
    tools=[multiply]  # note that we expect a list of tools
)

Now let's initialize a new runner and execute our agent with tools:

In [10]:
response = Runner.run_streamed(
    starting_agent=agent,
    input="what is 7.814 multiplied by 103.892?"
)

async for event in response.stream_events():
    print(event)

AgentUpdatedStreamEvent(new_agent=Agent(name='Assistant', instructions="You're a helpful assistant, remember to always use the provided tools whenever possible. Do not rely on your own knowledge too much and instead use your tools to help you answer queries.", handoff_description=None, handoffs=[], model='gpt-4o-mini', model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=False, truncation=None), tools=[FunctionTool(name='multiply', description='Multiplies `x` and `y` to provide a precise\nanswer.', params_json_schema={'properties': {'x': {'title': 'X', 'type': 'number'}, 'y': {'title': 'Y', 'type': 'number'}}, 'required': ['x', 'y'], 'title': 'multiply_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x7de2fe37f7e0>, strict_json_schema=True)], input_guardrails=[], output_guardrails=[], output

If we look closely at the fourth event object we will see `ResponseFunctionToolCall`, meaning our `multiply` tool was called by our LLM. Following this event object we can also see several events containing the `ResponseFunctionCallArgumentsDeltaEvent` type inside the `data` field — these are the input parameters for our tool.

Let's rerun that but this time we will process the event outputs to generate a cleaner and more readable output.

In [11]:
from openai.types.responses import (
    ResponseFunctionCallArgumentsDeltaEvent,  # tool call streaming
    ResponseCreatedEvent,  # start of new event like tool call or final answer
)

response = Runner.run_streamed(
    starting_agent=agent,
    input="what is 7.814 multiplied by 103.892?"
)

## tokens as streamed by LLM
async for event in response.stream_events():
    if event.type == "raw_response_event":
        if isinstance(event.data, ResponseFunctionCallArgumentsDeltaEvent):
            # this is streamed parameters for our tool call
            print(event.data.delta, end="", flush=True)
        elif isinstance(event.data, ResponseTextDeltaEvent):
            # this is streamed final answer tokens
            print(event.data.delta, end="", flush=True)
    elif event.type == "agent_updated_stream_event":
        # this tells us which agent is currently in use
        print(f"> Current Agent: {event.new_agent.name}")
    elif event.type == "run_item_stream_event": ## includes tool calling
        # these are events containing info that we'd typically
        # stream out to a user or some downstream process
        if event.name == "tool_called":
            # this is the collection of our _full_ tool call after our tool
            # tokens have all been streamed
            print()
            print(f"> Tool Called, name: {event.item.raw_item.name}")
            print(f"> Tool Called, args: {event.item.raw_item.arguments}")
        elif event.name == "tool_output":
            # this is the response from our tool execution
            print(f"> Tool Output: {event.item.raw_item['output']}")

> Current Agent: Assistant
{"x":7.814,"y":103.892}
> Tool Called, name: multiply
> Tool Called, args: {"x":7.814,"y":103.892}
> Tool Output: 811.812088
The result of multiplying 7.814 by 103.892 is approximately 811.812.
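
The streamed argument fragments join into a single JSON string, which is what appears as the tool call's `arguments`. The sketch below replays that step offline. The chunk boundaries in `deltas` are hypothetical (the notebook only shows the joined string), and `multiply` here is a plain, undecorated copy of the tool:

```python
import json

# Hypothetical chunk boundaries; only the joined string comes from the output above.
deltas = ['{"x":', "7.814", ',"y":', "103.892", "}"]

raw_arguments = "".join(deltas)       # what accumulates as the deltas arrive
arguments = json.loads(raw_arguments)  # parsed into keyword arguments


def multiply(x: float, y: float) -> float:
    """Plain (undecorated) copy of the notebook's tool."""
    return x * y


result = multiply(**arguments)
print(raw_arguments)
print(round(result, 6))
```

Note that the partial string is not valid JSON until the last fragment arrives, so parsing has to wait for the complete tool call.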

## Guardrails

* OpenAI have also included guardrails in the Agents SDK.
* These come as:
  1. _input guardrails_, which check that the input going into your LLM is "safe", and
  2. _output guardrails_, which check that the output from your LLM is "safe".


Let's see how to use them. First, we'll implement a guardrail powered by another LLM (more tokens means more $$$ for OpenAI).

In [14]:
from pydantic import BaseModel

# define the structure of the output for any guardrail agent;
# passing this class as `output_type` forces a STRUCTURED OUTPUT
# from the agent, which we can then check in the guardrail
class GuardrailOutput(BaseModel):
    is_triggered: bool
    reasoning: str

# define another agent that checks whether the user is asking about
# political opinions -- we don't want our main agent to answer these
politics_agent = Agent(
    name="Politics check",
    instructions="Check if the user is asking you about political opinions",
    output_type=GuardrailOutput,
)

We can call this agent directly:

In [15]:
query = "what do you think about President Donald Trump?"

result = await Runner.run(starting_agent=politics_agent, input=query)
result

RunResult(input='what do you think about President Donald Trump?', new_items=[MessageOutputItem(agent=Agent(name='Politics check', instructions='Check if the user is asking you about political opinions', handoff_description=None, handoffs=[], model=None, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=False, truncation=None), tools=[], input_guardrails=[], output_guardrails=[], output_type=<class '__main__.GuardrailOutput'>, hooks=None), raw_item=ResponseOutputMessage(id='msg_67f8f8fe798c8192938774082c4292a20e3b6637919fb604', content=[ResponseOutputText(annotations=[], text='{"is_triggered":true,"reasoning":"The user is directly asking for an opinion about a political figure, which involves political opinions."}', type='output_text')], role='assistant', status='completed', type='message'), type='message_output_item')], raw_responses=[ModelResponse(output=[ResponseOutputMessage(id='msg_67f8f8

The output from our agent is hidden away in there; we extract it like so:

In [16]:
result.final_output

GuardrailOutput(is_triggered=True, reasoning='The user is directly asking for an opinion about a political figure, which involves political opinions.')
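
The raw result above carries the model's reply as JSON text, while `final_output` is a typed object. As a rough picture of that conversion, here is a sketch that uses a standard-library dataclass as a stand-in for the Pydantic model; the SDK's own parsing is not shown in this notebook, so treat this as an illustration only:

```python
import json
from dataclasses import dataclass


@dataclass
class GuardrailOutput:
    """Dataclass stand-in for the Pydantic model defined above."""
    is_triggered: bool
    reasoning: str


# JSON text copied from the `ResponseOutputText` in the raw result above.
raw_text = (
    '{"is_triggered":true,"reasoning":"The user is directly asking for an '
    'opinion about a political figure, which involves political opinions."}'
)

parsed = GuardrailOutput(**json.loads(raw_text))
print(parsed.is_triggered)
print(parsed.reasoning)
```

Unlike this stand-in, a Pydantic model also validates the field types when it is constructed.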

Summary
* Aha! The guardrail agent flagged that we were asking a political question. Nothing is blocked at this point; the agent only returns its verdict.

To integrate this with our other agents we need to move our logic into a **single function decorated with the `@input_guardrail` decorator.**

When defining these guardrails we need to follow this structure:

* Input parameters must include a `ctx` (context), `agent`, and `input` (the user's query in this case). Note that below we will only use the `input` parameter.
* Output must be a `GuardrailFunctionOutput` object.

In [17]:
from agents import (
    GuardrailFunctionOutput,
    RunContextWrapper,
    input_guardrail
)

# this is the guardrail function that returns GuardrailFunctionOutput object
@input_guardrail
async def politics_guardrail(
    ctx: RunContextWrapper[None], #not using but is required
    agent: Agent, # not using but is required
    input: str,
) -> GuardrailFunctionOutput:
    # run agent to check if guardrail is triggered
    response = await Runner.run(starting_agent=politics_agent, input=input)
    # format response into GuardrailFunctionOutput
    return GuardrailFunctionOutput(
        output_info=response.final_output,
        tripwire_triggered=response.final_output.is_triggered,
    )

Now we can initialize our normal agent with the `input_guardrails` parameter:

In [18]:
agent = Agent(
    name="Assistant",
    instructions=(
        "You're a helpful assistant, remember to always "
        "use the provided tools whenever possible. Do not "
        "rely on your own knowledge too much and instead "
        "use your tools to help you answer queries."
    ),
    model="gpt-4o-mini",
    tools=[multiply],
    input_guardrails=[politics_guardrail],  # note this is a list of guardrails
)

Now let's run it! We'll stick with `Runner.run` for the sake of brevity:

In [19]:
result = await Runner.run(
    starting_agent=agent,
    input="what is 7.814 multiplied by 103.892?"
)
result.final_output

'The result of \\( 7.814 \\) multiplied by \\( 103.892 \\) is \\( 811.812088 \\).'

Let's see if our guardrail will trigger:

In [21]:
result = await Runner.run(
    starting_agent=agent,
    input="what do you think about President Donald Trump?"
)

InputGuardrailTripwireTriggered: Guardrail InputGuardrail triggered tripwire

Summary
* We expected the error above!
* Great, our guardrail triggered! The `output_guardrail` type is implemented in almost the exact same way, but uses the `@output_guardrail` decorator when defining the guardrail function, and the `output_guardrails` parameter when defining our `Agent`.
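
In an application we would normally catch this exception and return a fallback message instead of letting the run fail. The sketch below shows that control flow offline. `CheckResult`, `TripwireTriggered`, `keyword_guardrail`, and `guarded_run` are hypothetical stand-ins, and a keyword match replaces the LLM-backed politics check:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Hypothetical stand-in for GuardrailFunctionOutput."""
    output_info: str
    tripwire_triggered: bool


class TripwireTriggered(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""


async def keyword_guardrail(user_input: str) -> CheckResult:
    # Toy check: a keyword match instead of a call to a guardrail agent.
    hit = "president" in user_input.lower()
    return CheckResult("mentions a political office" if hit else "ok", hit)


async def guarded_run(user_input: str) -> str:
    check = await keyword_guardrail(user_input)
    if check.tripwire_triggered:
        raise TripwireTriggered(check.output_info)
    return f"answer to: {user_input}"


async def safe_answer(user_input: str) -> str:
    try:
        return await guarded_run(user_input)
    except TripwireTriggered as exc:
        # Fallback shown to the user when the guardrail trips.
        return f"Sorry, I can't help with that ({exc})."


allowed = asyncio.run(safe_answer("what is 7.814 multiplied by 103.892?"))
blocked = asyncio.run(safe_answer("what do you think about President Donald Trump?"))
print(allowed)
print(blocked)
```

This sketch runs the check before the main call. According to the feature list quoted at the top, the SDK runs guardrails in parallel to the agent and breaks early if a check fails.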

## Conversational Agents

* So far we've only seen how to use our agents with single messages.
* Many use-cases require chat history to make our agents conversational. To implement that we simply provide a list of messages to our `Runner`.

Let's see how this works. First we send a single message:

In [22]:
result = await Runner.run(
    starting_agent=agent,
    input="remember the number 7.814 for me please"
)
result.final_output

"I can't remember things for you permanently. However, you can save that number in your notes or a document for easy access later! If you'd like, I can help you with something else."

Fortunately, we can help our agent remember this information. We can use the `.to_input_list()` method to format our `result` into a list of messages for our next query.

In [23]:
result.to_input_list()

[{'content': 'remember the number 7.814 for me please', 'role': 'user'},
 {'id': 'msg_67f8fa9c8b148192bdd73e30191260690de123c94b288fe1',
  'content': [{'annotations': [],
    'text': "I can't remember things for you permanently. However, you can save that number in your notes or a document for easy access later! If you'd like, I can help you with something else.",
    'type': 'output_text'}],
  'role': 'assistant',
  'status': 'completed',
  'type': 'message'}]

We merge this with our next message:

In [24]:
result = await Runner.run(
    starting_agent=agent,
    input=result.to_input_list() + [
        {"role": "user", "content": "multiply the last number by 103.892"}
    ]
)
result.final_output

'The result of multiplying 7.814 by 103.892 is approximately 811.812.'

It looks like our agent can remember our previous interactions after all!
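
For a longer conversation the same merge is repeated on every turn. A minimal sketch of that bookkeeping follows; `next_input` is a hypothetical helper, and `history` is a simplified, abbreviated copy of the list printed by `result.to_input_list()` above (real assistant items carry extra fields such as `id`, `status`, and `type`):

```python
def next_input(history: list[dict], user_message: str) -> list[dict]:
    """Return a new message list: the prior turns plus the new user message."""
    return history + [{"role": "user", "content": user_message}]


# Simplified, abbreviated stand-in for the output of result.to_input_list().
history = [
    {"role": "user", "content": "remember the number 7.814 for me please"},
    {"role": "assistant", "content": "I can't remember things for you permanently."},
]

messages = next_input(history, "multiply the last number by 103.892")
for message in messages:
    print(message["role"], "->", message["content"])
```

Because the helper builds a new list instead of mutating `history`, the earlier turns can be reused or inspected safely. In the notebook's flow, `messages` would be passed as `input` to the next `Runner.run` call.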

---

That is our rapid-fire overview of OpenAI's new Agents SDK. We've covered most of the essentials here, but there are many other features in the library, and many of the features we covered can be used in several different ways. The SDK is already fairly substantial and certainly worth keeping an eye on.