## Agents in OpenAI: A Structured Overview

This document gives a structured overview of agents in the OpenAI Agents SDK for Python, drawing on the resources listed in Section 5.

### 1. Core Concepts

*   **Definition:** Agents are a fundamental building block, essentially LLMs configured with specific instructions and tools. (Reference: 2)

*   **Key Properties:**
    *   `instructions`: (also known as a developer message or system prompt) Directives that guide the Agent's behavior. (Reference: 2)
    *   `model`: The specific LLM to be used (e.g., "o3-mini"). (Reference: 2)
    *   `model_settings`: Configuration options to tune the LLM's behavior, such as `temperature` and `top_p`. (Reference: 2)
    *   `tools`: Functions or capabilities the Agent can use to achieve its goals. (Reference: 2)
    *   `context`: A dependency-injection mechanism allowing users to provide dependencies and state to each agent during the run. (Reference: 2)
    *   `output_type`: The desired format of the agent's output. If not set, the default output type is plain text (i.e., `str`). Another common choice is Pydantic objects. (Reference: 2)

### 2. Basic Agent Configuration (Reference: 2)

The following code snippet demonstrates the basic configuration of an agent:

```python
from agents import Agent, function_tool

@function_tool
def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny"

agent = Agent(
    name="Haiku agent",
    instructions="Always respond in haiku form",
    model="o3-mini",
    tools=[get_weather],
)
```

### 3. Advanced Features

*   **Handoffs:** Handoffs are specialized sub-agents that an agent can delegate tasks to. This allows for modular, specialized agent orchestration. (Reference: 2)

*   **Dynamic Instructions:** Instructions can be provided dynamically via a function that receives the agent and context and returns the prompt. (Reference: 2)

*   **Lifecycle Events (Hooks):** Hooks let you observe the agent's lifecycle. Set the `hooks` property to a subclass of `AgentHooks` that overrides the methods you are interested in. (Reference: 2)

*   **Guardrails:** Checks and validations that run on user input in parallel with the agent's execution. (Reference: 2)

*   **Cloning/Copying Agents:** Agents can be duplicated using the `clone()` method, with the option to modify properties during the cloning process. (Reference: 2)
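The dynamic-instructions idea above can be sketched in plain Python. This is only an illustration of the "string or function" pattern, not the SDK's implementation: `UserContext`, `SketchAgent`, and `resolve_instructions` are hypothetical stand-ins, and the sketch passes the context object directly to the function rather than through any wrapper type the SDK may use.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class UserContext:
    """Hypothetical run context carrying per-user state."""
    name: str


# Instructions are either a fixed string or a function of (context, agent).
Instructions = Union[str, Callable[[UserContext, "SketchAgent"], str]]


@dataclass
class SketchAgent:
    """Stand-in for an agent; this is NOT the SDK's Agent class."""
    name: str
    instructions: Instructions


def dynamic_instructions(context: UserContext, agent: SketchAgent) -> str:
    # Build the prompt from run-time state instead of hard-coding it.
    return f"The user's name is {context.name}. Help them with their questions."


def resolve_instructions(agent: SketchAgent, context: UserContext) -> str:
    """Return the prompt text, calling the function form if needed."""
    if callable(agent.instructions):
        return agent.instructions(context, agent)
    return agent.instructions


triage = SketchAgent(name="Triage agent", instructions=dynamic_instructions)
print(resolve_instructions(triage, UserContext(name="Ada")))
```

The same `resolve_instructions` call works for an agent whose `instructions` is a plain string, which is why the calling code never needs to know which form was supplied.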

### 4. Forcing Tool Use (Reference: 2)

The `ModelSettings.tool_choice` parameter can force the LLM to use tools. Valid values include:

*   `auto`: LLM decides whether to use a tool.
*   `required`: LLM must use a tool (it chooses which one).
*   `none`: LLM must not use a tool.
*   A specific tool name as a string (e.g., `my_tool`): LLM must use that tool.

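**Important Note:** When forcing tool use, consider setting `Agent.tool_use_behavior` (default: `run_llm_again`) so that the run can stop once a tool output is produced. Otherwise each tool result is sent back to an LLM that is again forced to call a tool, which can loop indefinitely.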
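To make the loop risk concrete, here is a minimal simulation in plain Python. It does not use the Agents SDK: `fake_llm`, `run_loop`, and the returned status strings are hypothetical stand-ins, and only the option values (`required`, `auto`, `run_llm_again`, `stop_on_first_tool`) mirror names used by the SDK.

```python
def fake_llm(tool_choice: str, history: list[str]) -> str:
    """Stand-in for a model: returns 'tool_call' or 'final_answer'."""
    if tool_choice == "required":
        # Forced tool use: this model is never allowed to answer directly.
        return "tool_call"
    # With "auto", this toy model answers once it has seen a tool result.
    return "final_answer" if "tool_result" in history else "tool_call"


def run_loop(tool_choice: str, tool_use_behavior: str,
             max_turns: int = 10) -> tuple[str, int]:
    """Run the toy agent loop and return (status, turns_used)."""
    history: list[str] = []
    for turn in range(1, max_turns + 1):
        if fake_llm(tool_choice, history) == "final_answer":
            return "final_answer", turn
        history.append("tool_result")  # pretend the tool ran
        if tool_use_behavior == "stop_on_first_tool":
            # Treat the first tool output as the final result.
            return "tool_output", turn
        # "run_llm_again": feed the tool result back to the model.
    return "max_turns_exceeded", max_turns


print(run_loop("required", "run_llm_again"))
print(run_loop("required", "stop_on_first_tool"))
print(run_loop("auto", "run_llm_again"))
```

In this toy model, forcing a tool while re-running the LLM only ends when the turn limit is hit, whereas stopping on the first tool output ends after one turn.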

### 5. Additional Resources (References)

*   [agents-sdk-intro.ipynb](https://github.com/aurelio-labs/cookbook/blob/main/gen-ai/openai/agents-sdk-intro.ipynb): A cookbook example demonstrating the Agents SDK. (Reference: 1)
*   [agents.md](https://github.com/openai/openai-agents-python/blob/main/docs/agents.md): Documentation on Agents in the OpenAI Agents Python library. (Reference: 2)
*   [Quickstart](https://openai.github.io/openai-agents-python/quickstart/): Quickstart guide for the OpenAI Agents Python library. (Reference: 3)
*   [Model Context Protocol](https://docs.continue.dev/customize/context-providers#model-context-protocol): Information on the Model Context Protocol. (Reference: 4)
*   [YouTube Video](https://www.youtube.com/watch?v=35nxORG1mtg): A video resource (unspecified content). (Reference: 5)
*   [End-to-End Examples](https://openai.github.io/openai-agents-python/mcp/#end-to-end-examples): End-to-end examples of using the Model Context Protocol. (Reference: 6)

In [None]:
import asyncio
from agents import Agent, Runner, OpenAIChatCompletionsModel, AsyncOpenAI
from openai.types.responses import ResponseTextDeltaEvent
import nest_asyncio
nest_asyncio.apply()  # This allows asyncio.run() to work in Jupyter

local_model = OpenAIChatCompletionsModel(
    model="qwen2.5-14b-instruct-1m",
    openai_client=AsyncOpenAI(base_url="http://localhost:1234/v1", 
                              api_key='lm-studio')
)

agent = Agent(
    name="Assistant",
    instructions="You are a helpful reasoning assistant",
    model=local_model
)

async def stream_response():
    result = Runner.run_streamed(agent, "What is higher between 9.19 and 9.9?")
    # Use stream_events() instead of stream
    async for event in result.stream_events():
        # Handle raw response events for text streaming
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

# Now you can use asyncio.run() as normal
asyncio.run(stream_response())

RunResultStreaming(input='What is higher between 9.19 and 9.9?', new_items=[], raw_responses=[], final_output=None, input_guardrail_results=[], output_guardrail_results=[], current_agent=Agent(name='Assistant', instructions='You are a helpful reasoning assistant', handoff_description=None, handoffs=[], model=<agents.models.openai_chatcompletions.OpenAIChatCompletionsModel object at 0x10e6908d0>, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=False, truncation=None, max_tokens=None), tools=[], mcp_servers=[], input_guardrails=[], output_guardrails=[], output_type=None, hooks=None, tool_use_behavior='run_llm_again', reset_tool_choice=True), current_turn=0, max_turns=10, is_complete=False)

In [13]:
result = await Runner.run(
    starting_agent=agent,
    input="tell me a short story"
)
result.final_output

'In the heart of a small, forgotten village, there was an old clock that stood tall in the town square. Its hands moved slowly, as if it had seen too much time pass by. The villagers believed the clock held a secret — that at midnight, on the stroke of 12, it would tell its most ancient tale.\n\nOne stormy night, a young girl named Lila, curious and brave, decided to stay up late to witness the moment. As the winds howled and rain lashed against her window, she watched the clock\'s hands inch closer to midnight. When the final chime tolled, the clock\'s face glowed faintly, and a soft voice filled the air.\n\n"It was built by a man who lost his love," the voice said. "He crafted it to remember every moment, hoping time would heal his heart."\n\nLila smiled, understanding that sometimes, the greatest stories aren’t told in words but in the quiet moments we cherish. She wrapped herself in her blanket, feeling a warmth that came from knowing there was magic in the world — even if it only 

In [25]:
# call Runner.run_streamed() again for every new run; a previous result cannot be re-streamed
response = Runner.run_streamed(
    starting_agent=agent,
    input="tell me a short story"
)

async for event in response.stream_events():
    if event.type == "raw_response_event" and \
        isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)

Once upon a time, in a small village by the sea, there lived an old fisherman named Tom. Every morning, he would set out in his creaky boat, hoping to catch just enough fish to feed his family. One foggy morning, as he cast his net into the misty waters, something unusual happened. The fish didn’t bite that day, but when he pulled his net back into the boat, it was filled with shimmering, golden coins instead.

Tom was puzzled but grateful. He knew he couldn’t keep the treasure, so he returned to his village and shared it with the townspeople. Together, they built a school, fixed the church roof, and helped those in need. From then on, whenever Tom cast his net, he always made sure to leave a little extra for those who needed it most. And though the coins never returned, something even more precious did—the warmth of a community that stood together.

And so, Tom learned that the greatest catch wasn’t in the ocean, but in the hearts of those around him.

In [15]:
from agents import function_tool

@function_tool
def multiply(x: float, y: float) -> float:
    """Multiplies `x` and `y` to provide a precise
    answer."""
    return x*y

In [17]:
math_agent = agent.clone(
    name="Math Agent",
    instructions=(
        "You're a helpful assistant, remember to always "
        "use the provided tools whenever possible. Do not "
        "rely on your own knowledge too much and instead "
        "use your tools to help you answer queries."
    ),
    tools=[multiply],  # note that we expect a list of tools
)

In [22]:
# baseline: the original `agent` has no tools, so the model does the arithmetic itself
response = Runner.run_streamed(
    starting_agent=agent,
    input="what is 7.814 multiplied by 103.909?"
)

async for event in response.stream_events():
    if event.type == "raw_response_event" and \
        isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)

To find the product of 7.814 and 103.909, you simply multiply the two numbers together:

\[ 7.814 \times 103.909 = 812.506726 \]

So, 7.814 multiplied by 103.909 is approximately **812.507** when rounded to three decimal places.

In [21]:
response = Runner.run_streamed(
    starting_agent=math_agent,
    input="what is 7.814 multiplied by 103.909?"
)

async for event in response.stream_events():
    if event.type == "raw_response_event" and \
        isinstance(event.data, ResponseTextDeltaEvent):
        print(event.data.delta, end="", flush=True)

The result of multiplying 7.814 by 103.909 is approximately 811.945.
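The two runs above disagree from the third significant digit onward, so it is worth checking the product independently. A small sketch using Python's standard `decimal` module (not part of the original notebook) computes the exact value and the gap to the figure quoted by the agent without tools:

```python
from decimal import Decimal

# Exact decimal arithmetic avoids binary floating-point rounding.
exact = Decimal("7.814") * Decimal("103.909")
print(exact)

# Figure quoted by the agent that had no tools available.
no_tool_answer = Decimal("812.506726")
print(no_tool_answer - exact)  # how far off that figure is
```

The exact product is 811.944926, which rounds to the 811.945 reported by the Math Agent; the answer produced without the `multiply` tool is off by roughly 0.56.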

In [23]:
from openai.types.responses import (
    ResponseFunctionCallArgumentsDeltaEvent,  # streamed tool-call arguments
    ResponseTextDeltaEvent,  # streamed final-answer text
)

response = Runner.run_streamed(
    starting_agent=math_agent,
    input="what is 7.814 multiplied by 103.892?"
)

async for event in response.stream_events():
    if event.type == "raw_response_event":
        if isinstance(event.data, ResponseFunctionCallArgumentsDeltaEvent):
            # this is streamed parameters for our tool call
            print(event.data.delta, end="", flush=True)
        elif isinstance(event.data, ResponseTextDeltaEvent):
            # this is streamed final answer tokens
            print(event.data.delta, end="", flush=True)
    elif event.type == "agent_updated_stream_event":
        # this tells us which agent is currently in use
        print(f"> Current Agent: {event.new_agent.name}")
    elif event.type == "run_item_stream_event":
        # these are events containing info that we'd typically
        # stream out to a user or some downstream process
        if event.name == "tool_called":
            # this is the collection of our _full_ tool call after our tool
            # tokens have all been streamed
            print()
            print(f"> Tool Called, name: {event.item.raw_item.name}")
            print(f"> Tool Called, args: {event.item.raw_item.arguments}")
        elif event.name == "tool_output":
            # this is the response from our tool execution
            print(f"> Tool Output: {event.item.raw_item['output']}")

> Current Agent: Math Agent
{"x": 7.814, "y": 103.892}
> Tool Called, name: multiply
> Tool Called, args: {"x": 7.814, "y": 103.892}
> Tool Output: 811.812088
The result of multiplying 7.814 by 103.892 is approximately 811.812.
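The trace above shows the tool arguments arriving as a JSON string, followed by the tool output. The step in between can be sketched in plain Python. The `TOOLS` registry and `dispatch_tool_call` helper are hypothetical names used only for illustration; the SDK performs this step internally and its actual mechanism may differ.

```python
import json


def multiply(x: float, y: float) -> float:
    """Multiplies `x` and `y` to provide a precise answer."""
    return x * y


# Hypothetical registry mapping tool names to plain Python callables.
TOOLS = {"multiply": multiply}


def dispatch_tool_call(name: str, arguments: str) -> float:
    """Parse the JSON argument string and call the named tool."""
    kwargs = json.loads(arguments)
    return TOOLS[name](**kwargs)


# Same name and argument string as in the streamed trace.
output = dispatch_tool_call("multiply", '{"x": 7.814, "y": 103.892}')
print(round(output, 6))
```

Because the model only supplies the name and arguments, the arithmetic itself is done by ordinary Python code, which is why the tool-backed answer is reliable.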