## Tool Calling + Agents (?)

### Import and setup

In [2]:
import openai
from rich import print
from getpass import getpass

oai_api_key = getpass()

 ········


In [3]:
client = openai.OpenAI(api_key=oai_api_key)

### Tool Calling

Most of us have heard about agents by now.

Tool calling is one of the foundations that agents are built on. A few earlier works laid the groundwork:

- The [ReAct paper](https://klu.ai/glossary/react-agent-model) proposed interleaving reasoning and acting with LLMs.
- OpenAI published the WebGPT paper in 2021.
- Toolformer, a paper from Meta, covered models learning to call tools.

![react](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd02b2eaa-16c3-4f92-8f97-06329fbcccd4_716x550.gif)



The example below is lifted directly from OpenAI's documentation (lazy, I know), but it is the simplest way to show function calling/tool calling without hand-writing messy JSON schemas myself.

Function calling (or tool calling) is simply a way to let the LLM request a call to a simple function like the ones below, or to an external tool (Exa, Tavily, other APIs, or even Moondream), so that its answer is based on that tool's exact output.

In [5]:
# The Responses API expects a flat tool definition: "name", "description",
# "parameters" and "strict" sit at the top level, not under a nested "function" key.
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get current temperature for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and country e.g. Bogotá, Colombia"
            }
        },
        "required": ["location"],
        "additionalProperties": False
    },
    "strict": True
}]


A similar python function would be:

In [6]:
def get_weather(location: str):
    # some processing
    extracted_location = "xxxx"
    return {"location": extracted_location}
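
The `required` and `additionalProperties` keys in the schema are what let us trust the arguments the model sends back. As a rough illustration (not how OpenAI validates anything internally), a hand-rolled check could look like the sketch below. `check_arguments` and `JSON_TO_PYTHON` are hypothetical names introduced here, and the check only covers required keys, unexpected keys and the handful of JSON types used in this notebook.

```python
import json

# Parameter schema for get_weather, in the same shape as the tool definition above.
WEATHER_PARAMETERS = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
    "required": ["location"],
    "additionalProperties": False,
}

# Only the JSON types used in this notebook are mapped (an assumption, not the full spec).
JSON_TO_PYTHON = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def check_arguments(parameters: dict, raw_arguments: str) -> dict:
    """Parse the JSON string the model returns and check it against the schema."""
    args = json.loads(raw_arguments)
    properties = parameters.get("properties", {})

    # Every key listed under "required" must be present.
    missing = [key for key in parameters.get("required", []) if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")

    # With additionalProperties set to False, unknown keys are rejected.
    if parameters.get("additionalProperties") is False:
        extra = [key for key in args if key not in properties]
        if extra:
            raise ValueError(f"unexpected arguments: {extra}")

    # Shallow type check for the declared properties.
    for key, value in args.items():
        expected = JSON_TO_PYTHON.get(properties.get(key, {}).get("type"))
        if expected is not None and not isinstance(value, expected):
            raise ValueError(f"argument {key!r} should be of type {properties[key]['type']}")
    return args


print(check_arguments(WEATHER_PARAMETERS, '{"location": "Paris, France"}'))
```

Only after a check like this passes would we hand the arguments to `get_weather(**args)`.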

Previously we used OpenAI's Chat Completions API (`client.chat.completions`); OpenAI now also offers a newer Responses API (`client.responses`), which we use here.

In [7]:
response = client.responses.create(
    model="gpt-4.1",
    input=[{"role": "user", "content": "What is the weather like in Paris today?"}],
    tools=tools
)

In [8]:
print(response)

In [9]:
print(response.output[0])

In [10]:
# Flat Responses-style definitions (no nested "function" key), now with two tools to choose from
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Bogotá, Colombia"
                }
            },
            "required": ["location"],
            "additionalProperties": False
        },
        "strict": True
    },
    {
        "type": "function",
        "name": "get_time",
        "description": "Get current local time for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Tokyo, Japan"
                }
            },
            "required": ["location"],
            "additionalProperties": False
        },
        "strict": True
    }
]


In [11]:
response = client.responses.create(
    model="gpt-4.1",
    input=[{"role": "user", "content": "What is the weather like in Paris today?"}],
    tools=tools
)

In [12]:
print(response.output[0])

In [13]:
response = client.responses.create(
    model="gpt-4.1",
    input=[{"role": "user", "content": "What is the time in London now?"}],
    tools=tools
)

In [14]:
print(response.output[0])

In [15]:
response = client.responses.create(
    model="gpt-3.5-turbo",
    input=[{"role": "user", "content": "What is the time in London now?"}],
    tools=tools
)

In [16]:
print(response.output[0])

In [17]:
response = client.responses.create(
    model="gpt-3.5-turbo",
    input=[{"role": "user", "content": "What can we add to the lasagna?"}],
    tools=tools
)

In [18]:
print(response.output[0])

The actual function can be written like this:

In [19]:
import requests

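def get_weather_api(latitude, longitude):
    # Ask Open-Meteo for the current conditions at the given coordinates
    response = requests.get(f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m", timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    data = response.json()
    return data['current']['temperature_2m']  # current temperature in °C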

In [20]:
tools = [{
    "type": "function",
    "name": "get_weather_api",
    "description": "Get current temperature for provided coordinates in celsius.",
    "parameters": {
        "type": "object",
        "properties": {
            "latitude": {"type": "number"},
            "longitude": {"type": "number"}
        },
        "required": ["latitude", "longitude"],
        "additionalProperties": False
    },
    "strict": True
}]

In [21]:
input_messages = [{"role": "user", "content": "What's the weather like in Paris today?"}]

response = client.responses.create(
    model="gpt-3.5-turbo",
    input=input_messages,
    tools=tools)

1. The first LLM call picks which of the available functions to call.

In [22]:
print(response.output[0])

In [23]:
import json 

tool_call = response.output[0]
args = json.loads(tool_call.arguments)

In [24]:
args

{'latitude': 48.8566, 'longitude': 2.3522}

2. We call the actual function ourselves (this is what hits the weather API).

In [25]:
result = get_weather_api(args["latitude"], args["longitude"])

In [26]:
result

15.8

3. A second LLM call turns the tool result into the final answer.

In [27]:
input_messages.append(tool_call)
input_messages.append({
    "type": "function_call_output",
    "call_id": tool_call.call_id,
    "output": str(result)
})

In [28]:
input_messages

[{'role': 'user', 'content': "What's the weather like in Paris today?"},
 ResponseFunctionToolCall(arguments='{"latitude":48.8566,"longitude":2.3522}', call_id='call_rAZqrluGwz2yJcFl5LIKfSjr', name='get_weather_api', type='function_call', id='fc_6829ae3a08d88191ab045d7d9ede04e70fab595beb736913', status='completed'),
 {'type': 'function_call_output',
  'call_id': 'call_rAZqrluGwz2yJcFl5LIKfSjr',
  'output': '15.8'}]

In [29]:
response_2 = client.responses.create(
    model="gpt-4.1",
    input=input_messages,
    tools=tools,
)
print(response_2.output_text)
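
Steps 1 to 3 above can be folded into one small helper. The sketch below is an assumption-laden illustration rather than anything from the OpenAI SDK: `build_tool_output` is a hypothetical name, and a `SimpleNamespace` stands in for the `ResponseFunctionToolCall` object so the cell runs offline.

```python
import json
from types import SimpleNamespace


def build_tool_output(tool_call, registry: dict) -> dict:
    """Run the function a tool call asks for and wrap the result for the next request."""
    handler = registry[tool_call.name]          # step 1: the model picked this function
    args = json.loads(tool_call.arguments)      # arguments arrive as a JSON string
    result = handler(**args)                    # step 2: we execute it ourselves
    return {                                    # step 3: payload for the second LLM call
        "type": "function_call_output",
        "call_id": tool_call.call_id,
        "output": str(result),
    }


# Stand-ins so the sketch runs offline: a fake weather function and a fake tool call.
def fake_weather(latitude: float, longitude: float) -> float:
    return 15.8


fake_call = SimpleNamespace(
    name="get_weather_api",
    arguments='{"latitude": 48.8566, "longitude": 2.3522}',
    call_id="call_demo",
)

print(build_tool_output(fake_call, {"get_weather_api": fake_weather}))
```

With a real response, the returned dictionary is what gets appended to `input_messages` (after the tool call itself) before the second request.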

Writing function schemas by hand is a tedious job, and there is a good chance of making a mistake (even an LLM gets them wrong). Another sore point is that you still have to do all the work yourself: the LLM only tells you, "Hey, use this function. Execute it and tell me the output."

We can wrap all of this up with a decorator, like below:

In [30]:
import json
import inspect
from typing import Any, Callable, Dict, List, Optional, Union
import openai

class ToolBasedChatCompleter:
    """
    A chat completion class that supports tool calling with the OpenAI API.
    This combines both the ChatCompleter and Tool functionality into one class.
    """
    
    def __init__(self, model: str, api_key: str):
        """
        Initialize the ToolBasedChatCompleter with the given model and API key.
        
        Args:
            model: The model to use for chat completion.
            api_key: The OpenAI API key.
        """
        self.model = model
        self.api_key = api_key
        self.client = openai.OpenAI(api_key=self.api_key)
        self.tools = {}
    
    def register_tool(self, description: Optional[str] = None):
        """
        Decorator to register a tool (function) with the completer.
        
        Args:
            description: A description of what the tool does.
            
        Returns:
            A decorator function that registers the decorated function as a tool.
        """
        def decorator(func: Callable):
            sig = inspect.signature(func)
            properties = {}
            for name, param in sig.parameters.items():
                if param.annotation is inspect.Parameter.empty:
                    properties[name] = {"type": "string"}
                elif param.annotation == str:
                    properties[name] = {"type": "string"}
                elif param.annotation == int:
                    properties[name] = {"type": "integer"}
                elif param.annotation == float:
                    properties[name] = {"type": "number"}
                elif param.annotation == bool:
                    properties[name] = {"type": "boolean"}
                else:
                    properties[name] = {"type": "string"}
            
            required = [name for name in sig.parameters]
            
            tool_desc = description or func.__doc__ or f"Call the {func.__name__} function"
            
            self.tools[func.__name__] = {
                "type": "function",
                "function": {
                    "name": func.__name__,
                    "description": tool_desc,
                    "parameters": {
                        "type": "object",
                        "properties": properties,
                        "required": required,
                    }
                },
                "handler": func
            }
            return func
        return decorator
    
    def chat(self, messages: List[Dict[str, str]]) -> str:
        """
        Process a chat with the given messages, executing tools as needed.
        
        Args:
            messages: A list of message dictionaries with role and content keys.
            
        Returns:
            The response text, or the result of a tool call.
        """
        # Prepare tools for the API
        api_tools = []
        for tool_name, tool_info in self.tools.items():
            api_tools.append({
                "type": tool_info["type"],
                "function": tool_info["function"]
            })
        
        # Call the OpenAI API; only pass tool options when at least one tool is registered
        tool_kwargs = {"tools": api_tools, "tool_choice": "auto"} if api_tools else {}
        response = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            **tool_kwargs
        )
        
        # Process the response
        response_message = response.choices[0].message
        
        # Check if the model wants to call a tool
        if response_message.tool_calls:
            # Record the assistant turn once, with every tool call it requested
            messages.append({
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": tool_call.id,
                        "type": "function",
                        "function": {
                            "name": tool_call.function.name,
                            "arguments": tool_call.function.arguments
                        }
                    }
                    for tool_call in response_message.tool_calls
                ]
            })

            # Execute each requested tool and append its result
            for tool_call in response_message.tool_calls:
                function_name = tool_call.function.name
                if function_name not in self.tools:
                    return f"Tool {function_name} not found"

                function_args = json.loads(tool_call.function.arguments)
                handler = self.tools[function_name]["handler"]
                result = handler(**function_args)

                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": str(result)
                })

            # Call the API again once all tool results are attached
            return self.chat(messages)

        # Return the text response if no tool calls
        return response_message.content
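
To see what the decorator's schema generation boils down to, here is a standalone sketch of the same idea. It differs from the class above in two ways, both assumptions of mine: it emits the flat Responses-style shape, and it treats parameters that have a default value as optional instead of marking everything as required. `schema_from_signature` and `get_forecast` are hypothetical names used only for illustration.

```python
import inspect

# Mapping from Python annotations to JSON Schema types; anything else falls back to "string".
PYTHON_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_signature(func) -> dict:
    """Build a flat, Responses-style tool definition from a function signature."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": PYTHON_TO_JSON.get(param.annotation, "string")}
        # Parameters with a default value are treated as optional.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "name": func.__name__,
        "description": inspect.getdoc(func) or f"Call the {func.__name__} function",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_forecast(latitude: float, longitude: float, days: int = 1) -> float:
    """Hypothetical example function, used only to show the generated schema."""
    return 0.0


print(schema_from_signature(get_forecast))
```

Note that unannotated parameters silently become `"string"` here as well, so annotating tool functions is worth the extra keystrokes.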

In [31]:
completer = ToolBasedChatCompleter(
            model="gpt-3.5-turbo",
            api_key=oai_api_key
        )

In [32]:
@completer.register_tool("Get temperature for a location")
def get_weather_api(latitude: float, longitude: float):
    # The annotations make the decorator emit "number" instead of its "string" fallback
    response = requests.get(f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m")
    data = response.json()
    return data['current']['temperature_2m']

In [33]:
from datetime import datetime
import pytz

@completer.register_tool("Get current local time for a location")
def get_time(location: str) -> str:
    try:
        city_to_timezone = {
            "new york": "America/New_York",
            "london": "Europe/London",
            "tokyo": "Asia/Tokyo",
            "paris": "Europe/Paris",
        }
        
        timezone_str = city_to_timezone.get(location.lower(), location)
        timezone = pytz.timezone(timezone_str)
        
        local_time = datetime.now(timezone)
        formatted_time = local_time.strftime("%I:%M %p")
        
        return f"The local time in {location} is {formatted_time}."
    
    except pytz.exceptions.UnknownTimeZoneError:
        return f"Sorry, couldn't find timezone data for {location}. Please use a valid city name or timezone identifier (e.g. 'America/New_York')."

In [34]:
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather like in Paris?"}
]

response = completer.chat(messages)

In [35]:
response

'The current temperature in Paris is 15.8°C.'

In [36]:
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the time in Delhi?"}
]

response = completer.chat(messages)
print(response)

All of the above is specific to OpenAI. What if I had to build it for other models? Writing a generic API that covers every provider is too much of a headache; it is worth doing only for academic purposes.
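
Part of what a generic layer has to absorb is that tool definitions come in different shapes. Even within OpenAI, the Responses API takes a flat function tool while Chat Completions nests the same fields under a `function` key. A minimal sketch of that one conversion, with `to_chat_completions_tool` being a hypothetical helper name and other providers' formats deliberately out of scope:

```python
def to_chat_completions_tool(tool: dict) -> dict:
    """Convert a flat Responses-style function tool into the nested Chat Completions shape."""
    # Everything except "type" moves under the "function" key.
    function = {key: value for key, value in tool.items() if key != "type"}
    return {"type": "function", "function": function}


responses_tool = {
    "type": "function",
    "name": "get_time",
    "description": "Get current local time for a given location.",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
        "additionalProperties": False,
    },
    "strict": True,
}

print(to_chat_completions_tool(responses_tool))
```

Multiply this by every provider's request and response format and the appeal of an existing framework becomes obvious.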

Or we can move to higher-level abstractions that do the same thing without us breaking our heads.

There are a bunch of these:
- LangChain
- LlamaIndex
- smolagents
- Pydantic AI

All of these have their strengths. LangChain is my least favourite: there is too much going on and you end up married to its API. LlamaIndex has a similar problem, but it is still manageable.

I like the bare-minimum `smolagents`, which integrates with Hugging Face models and supports tools. `Pydantic AI` sits slightly higher on the complexity curve, but it has a lot of good things going for it.

Again, choosing a framework is a personal choice. There are pros and cons with respect to complexity, token usage and a bunch of production constraints. Choose wisely.

### smolagents

Definition of an agent:

- Agent = LLM (Brain) + Tools (Limbs) + External Information (Organs)
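
Stripped of any framework, that definition is a loop: ask the model what to do, run the tool it picks, feed the observation back, and stop when it gives a final answer. The toy sketch below is my own illustration, not how `smolagents` is implemented; `run_agent` and `scripted_model` are hypothetical names, and the "model" is a scripted stand-in so no API key is needed.

```python
from typing import Callable


def run_agent(model: Callable[[list], dict], tools: dict, task: str, max_steps: int = 5) -> str:
    """Alternate between asking the model what to do and running the tool it picks."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = model(history)
        if "final" in decision:  # the model is done
            return decision["final"]
        # Otherwise run the requested tool and record what it returned.
        observation = tools[decision["tool"]](**decision["args"])
        history.append({"role": "tool", "name": decision["tool"], "content": str(observation)})
    return "Stopped: step limit reached"


def scripted_model(history: list) -> dict:
    """Stand-in for an LLM: call the tool once, then answer from its output."""
    if history[-1]["role"] == "tool":
        return {"final": f"The answer is {history[-1]['content']}"}
    return {"tool": "add", "args": {"a": 2, "b": 3}}


print(run_agent(scripted_model, {"add": lambda a, b: a + b}, "What is 2 + 3?"))
# prints: The answer is 5
```

The `max_steps` cap matters in practice: a real model can keep calling tools indefinitely, and every step costs tokens.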

`smolagents` makes tool calling easy through its `ToolCallingAgent`.

But first, let's try a simple prompt.

In [37]:
number = 102001

In [38]:
from math import isclose
from typing import Union

def calculate_square_root(number: float, precision: float = 1e-10, max_iterations: int = 100) -> Union[float, str]:
    # Handle edge cases
    if number < 0:
        return "Error: Cannot calculate square root of a negative number"
    
    if number == 0:
        return 0.0
    
    if number == 1:
        return 1.0
    
    # Initial guess
    guess = number / 2
    
    # Newton's method iteration
    for _ in range(max_iterations):
        new_guess = (guess + number / guess) / 2
        
        # Check if we've reached desired precision
        if isclose(new_guess, guess, rel_tol=precision):
            return new_guess
        
        guess = new_guess
    
    return f"Warning: Maximum iterations ({max_iterations}) reached. Last approximation: {guess}"

In [39]:
calculate_square_root(number)

319.37595401031683
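
For reference, the update inside the loop is Newton's method applied to $f(x) = x^2 - a$, where $a$ is `number` and $x_n$ is the current `guess`:

$$
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} = x_n - \frac{x_n^2 - a}{2x_n} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right)
$$

The last form is exactly `(guess + number / guess) / 2` in the code above.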

In [40]:
from smolagents import CodeAgent, LiteLLMModel

prompt = f"Can you find me the square root of {number}"
model = LiteLLMModel(model_id="gpt-3.5-turbo", api_key=oai_api_key)
agent = CodeAgent(tools=[], model=model, add_base_tools=True)

agent.run(prompt)

319.37595401031683

In [41]:
from smolagents import ToolCallingAgent

model = LiteLLMModel(model_id="gpt-3.5-turbo", api_key=oai_api_key)
agent = ToolCallingAgent(tools=[], model=model, add_base_tools=True)

agent.run(prompt)

'319.37595401031683'

In [44]:
print(agent.system_prompt)

In [42]:
# `agent.logs` is deprecated; the recorded steps now live in `agent.memory.steps`
print(agent.memory.steps)


In [45]:
agent.replay()  # replay() prints the steps itself, so wrapping it in print() is not needed

In [46]:
from smolagents import WebSearchTool

search_tool = WebSearchTool()
print(search_tool("Who's the current manager of Arsenal FC?"))

Using the weather and time tool

In [47]:
from smolagents import tool

@tool
def get_weather_api(latitude: float, longitude: float) -> float:
    """
    This is a tool that gives the current temperature of a location, given its latitude and longitude

    Args:
        latitude: latitude of the location
        longitude: longitude of the location
    """
    response = requests.get(f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m")
    data = response.json()
    return data['current']['temperature_2m']

In [48]:
@tool
def get_time(location: str) -> str:
    """
    This is a tool to give the local time given a location

    Args:
        location: String location of the place
    """
    try:
        city_to_timezone = {
            "new york": "America/New_York",
            "london": "Europe/London",
            "tokyo": "Asia/Tokyo",
            "paris": "Europe/Paris",
        }
        
        timezone_str = city_to_timezone.get(location.lower(), location)
        timezone = pytz.timezone(timezone_str)
        
        local_time = datetime.now(timezone)
        formatted_time = local_time.strftime("%I:%M %p")
        
        return f"The local time in {location} is {formatted_time}."
    
    except pytz.exceptions.UnknownTimeZoneError:
        return f"Sorry, couldn't find timezone data for {location}. Please use a valid city name or timezone identifier (e.g. 'America/New_York')."

In [49]:
agent = CodeAgent(tools=[get_weather_api, get_time], model=model)
agent.run(
    "What's the time in Delhi?"
)

'03:38 PM'

In [50]:
agent = ToolCallingAgent(tools=[get_weather_api, get_time], model=model)
agent.run(
    "What's the time in Delhi?"
)

'03:41 PM'

In [51]:
agent = ToolCallingAgent(tools=[get_weather_api, get_time], model=model)
agent.run(
    "What's the temperature in Paris?"
)

'16.0°C'

In [52]:
agent = CodeAgent(tools=[get_weather_api, get_time], model=model)
agent.run(
    "What's the temperature in Paris?"
)

16.0

**TASK: Find the token usage in the OpenAI SDK**

### Building the Fbref Scouting Bot

In the previous version, we did the following:

- Scraped the data from a URL
- Sent that data to the model for parsing
- Used the parsed data for some analysis, for example with spider charts

Instead of passing URLs, we can pass player names and fetch the URLs directly. The analysis can use matplotlib tooling to generate the charts automatically.

We'll use the default search tools and add a few of our own.

In [53]:
from firecrawl import FirecrawlApp, ScrapeOptions
from firecrawl.firecrawl import ScrapeResponse

from getpass import getpass
fc_api_key = getpass()

 ········


In [54]:
@tool
def firecrawl_scrape(url: str) -> ScrapeResponse:
    """
    Firecrawl scraping tool to extract data in markdown and html format
    
    Args:
        url: url that we need to scrape
    """
    app = FirecrawlApp(api_key=fc_api_key)
    
    # Scrape a website:
    return app.scrape_url(
      url, 
      formats=['markdown', 'html']
    )

In [55]:
gpt_41_model = LiteLLMModel(model_id="gpt-4.1-mini", api_key=oai_api_key)

agent = CodeAgent(tools=[firecrawl_scrape], model=gpt_41_model)
agent.run(
    "I want to scout Bukayo Saka as a player. Want to get a comprehensive report about him. Please use data from fbref.com"
)

"\nPlayer Report: Bukayo Saka\n\nPersonal Information:\n- Full Name: Bukayo Ayoyinka T.M. Saka\n- Position: DF-FW-MF (AM, right)\n- Footed: Left\n- Height: 178 cm (5-10)\n- Weight: 64 kg (143 lb)\n- Date of Birth: September 5, 2001 (Age: 23)\n- Nationality: England\n\nClub Information:\n- Current Club: Arsenal FC\n- Weekly Wage: £195,000 (expires June 2027)\n\n2024-2025 Season Performance:\n\nPremier League:\n- Matches Played: 23\n- Minutes Played: 1626\n- Goals: 6\n- Assists: 10\n- Expected Goals (xG): 6.7\n- Non-Penalty Expected Goals (npxG): 5.9\n- Expected Assists (xA): 7.0\n- Shot Creating Actions (SCA): 109\n- Goal Creating Actions (GCA): 22\n\nChampions League:\n- Matches Played: 9\n- Minutes Played: 760\n- Goals: 6\n- Assists: 2\n- Expected Goals (xG): 4.4\n- Non-Penalty Expected Goals (npxG): 2.9\n- Expected Assists (xAG): 3.3\n- Shot Creating Actions (SCA): 31\n- Goal Creating Actions (GCA): 7\n\nAdditional Resources:\n- Scouting Report (Last 365 Days Men's Big 5 Leagues, UCL

In [216]:
print(agent.provide_final_answer(agent.task))

Let's try out Pydantic AI and see whether it actually makes life easier.

In [61]:
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

number = 102001

model = OpenAIModel("gpt-4o-mini", provider=OpenAIProvider(api_key=oai_api_key))
agent = Agent(  
    model,
    system_prompt='Be concise, reply with one sentence.',
)

prompt = f"Can you find me square root of {number}?"

result = await agent.run(prompt)
print(result.output)

Okay, let's see how we can add tools here:

In [113]:
from pydantic_ai import RunContext
from httpx import AsyncClient  # async HTTP client, used here in place of requests
from dataclasses import dataclass

In [114]:
# standard python class: type

class enum():
    red = "RED"
    green = "GREEN"

@dataclass
class Deps:
    client: AsyncClient
    # firecrawl_api_key: str
    # model_name: Literal["moondream", "qwen"]

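# Deps carries whatever we want to pass into an agent's tools at run time.

# An agent can register tools in two ways:
#   @agent.tool_plain: the tool does not need the run context
#   @agent.tool: the tool receives the run context (RunContext[Deps]) as its first argument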

In [115]:
agent = Agent(  
    model,
    instructions=(
        "Be concise,reply with one sentence"
    ),
    deps_type=Deps
)

In [116]:
@agent.tool
async def get_weather_api(ctx: RunContext[Deps], latitude: float, longitude: float) -> float:
    """
    Retrieves the current temperature for the specified latitude and longitude
    using the Open-Meteo API.

    Args:
        latitude: Latitude of the location.
        longitude: Longitude of the location.

    Returns:
        The current temperature in degrees Celsius.
    """
    url = "https://api.open-meteo.com/v1/forecast"
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "current": "temperature_2m,wind_speed_10m",
        "hourly": "temperature_2m,relative_humidity_2m,wind_speed_10m"
    }

    response = await ctx.deps.client.get(url, params=params)
    response.raise_for_status()

    data = response.json()
    return data['current']['temperature_2m']


In [117]:
@agent.tool_plain
def get_time(location: str) -> str:
    """
    This is a tool to give the local time given a location

    Args:
        location: String location of the place
    """
    try:
        city_to_timezone = {
            "new york": "America/New_York",
            "london": "Europe/London",
            "tokyo": "Asia/Tokyo",
            "paris": "Europe/Paris",
        }
        
        timezone_str = city_to_timezone.get(location.lower(), location)
        timezone = pytz.timezone(timezone_str)
        
        local_time = datetime.now(timezone)
        formatted_time = local_time.strftime("%I:%M %p")
        
        return f"The local time in {location} is {formatted_time}."
    
    except pytz.exceptions.UnknownTimeZoneError:
        return f"Sorry, couldn't find timezone data for {location}. Please use a valid city name or timezone identifier (e.g. 'America/New_York')."

In [118]:
async with AsyncClient() as client:
    deps = Deps(client=client)
    result = await agent.run("What's the weather like in Paris today?", deps=deps)
    print(result.output)

11:30:38.849 agent run
11:30:38.850   chat gpt-4o-mini
11:30:38.852     POST api.openai.com/v1/chat/completions
11:30:39.964 Reading response body
             agent run
11:30:39.971   running 1 tool
11:30:39.972     running tool: get_weather_api
11:30:39.974       GET api.open-meteo.com/v1/forecast ? latitude='48.8566' & longitude='2.3522' & current='temperatu…speed_10m' & hourly='temperatu…speed_10m'
11:30:40.557 Reading response body
             agent run
11:30:40.566   chat gpt-4o-mini
11:30:40.569     POST api.openai.com/v1/chat/completions
11:30:41.929 Reading response body


In [119]:
result.usage()

Usage(requests=2, request_tokens=329, response_tokens=39, total_tokens=368, details={'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0, 'cached_tokens': 0})

In [120]:
async with AsyncClient() as client:
    deps = Deps(client=client)
    result = await agent.run("What's the time like Bogota now?", deps=deps)
    print(result.output)

11:30:41.977 agent run
11:30:41.978   chat gpt-4o-mini
11:30:41.980     POST api.openai.com/v1/chat/completions
11:30:43.086 Reading response body
             agent run
11:30:43.094   running 1 tool
11:30:43.097     running tool: get_time
11:30:43.101   chat gpt-4o-mini
11:30:43.108     POST api.openai.com/v1/chat/completions
11:30:44.477 Reading response body
             agent run
11:30:44.492   running 1 tool
11:30:44.493     running tool: get_time
11:30:44.496   chat gpt-4o-mini
11:30:44.500     POST api.openai.com/v1/chat/completions
11:30:45.481 Reading response body


A nice side effect is structured output, much like what `instructor` provides; `smolagents` doesn't have this at the moment.

In [121]:
from pydantic import BaseModel
from typing import Optional

class LocationInfo(BaseModel):
    place: str
    temperature: Optional[float]
    # time: Optional[str]

In [122]:
agent = Agent(  
    model,
    instructions=(
        "Be concise,reply with one sentence"
    ),
    deps_type=Deps,
    output_type=LocationInfo
)

In [123]:
@agent.tool
async def get_weather_api(ctx: RunContext[Deps], latitude: float, longitude: float) -> float:
    """
    Retrieves the current temperature for the specified latitude and longitude
    using the Open-Meteo API.

    Args:
        latitude: Latitude of the location.
        longitude: Longitude of the location.

    Returns:
        The current temperature in degrees Celsius.
    """
    url = "https://api.open-meteo.com/v1/forecast"
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "current": "temperature_2m,wind_speed_10m",
        "hourly": "temperature_2m,relative_humidity_2m,wind_speed_10m"
    }

    response = await ctx.deps.client.get(url, params=params)
    response.raise_for_status()

    data = response.json()
    return data['current']['temperature_2m']


In [124]:

@agent.tool_plain
def get_time(location: str) -> str:
    """
    This is a tool to give the local time given a location

    Args:
        location: String location of the place
    """
    try:
        city_to_timezone = {
            "new york": "America/New_York",
            "london": "Europe/London",
            "tokyo": "Asia/Tokyo",
            "paris": "Europe/Paris",
        }
        
        timezone_str = city_to_timezone.get(location.lower(), location)
        timezone = pytz.timezone(timezone_str)
        
        local_time = datetime.now(timezone)
        formatted_time = local_time.strftime("%I:%M %p")
        
        return f"The local time in {location} is {formatted_time}."
    
    except pytz.exceptions.UnknownTimeZoneError:
        return f"Sorry, couldn't find timezone data for {location}. Please use a valid city name or timezone identifier (e.g. 'America/New_York')."

In [125]:
async with AsyncClient() as client:
    deps = Deps(client=client)
    result = await agent.run("What's the weather like in Paris today?", deps=deps)
    print(result.output)

11:30:45.552 agent run
11:30:45.553   chat gpt-4o-mini
11:30:45.554     POST api.openai.com/v1/chat/completions
11:30:46.650 Reading response body
             agent run
11:30:46.663   running 1 tool
11:30:46.664     running tool: get_weather_api
11:30:46.666       GET api.open-meteo.com/v1/forecast ? latitude='48.8566' & longitude='2.3522' & current='temperatu…speed_10m' & hourly='temperatu…speed_10m'
11:30:47.233 Reading response body
             agent run
11:30:47.240   chat gpt-4o-mini
11:30:47.244     POST api.openai.com/v1/chat/completions
11:30:48.382 Reading response body


How did this run? Unlike with `smolagents`, I don't have a built-in way to trace the steps. Let's set up Logfire.

In [83]:
import logfire

logfire.configure(token=getpass())  # enter the Logfire write token at run time instead of hard-coding it
logfire.instrument_pydantic_ai()
logfire.instrument_httpx(capture_all=True)
# logfire.info('Hello, {place}!', place='World')

Logfire project URL: https://logfire-us.pydantic.dev/pratos/starter-project


In [84]:
logfire.info('Hello, {place}!', place='World')

11:03:20.350 Hello, World!


In [85]:
async with AsyncClient() as client:
    deps = Deps(client=client)
    result = await agent.run("What's the weather like in Paris today?", deps=deps)
    print(result.output)

11:03:43.989 agent run
11:03:43.990   chat gpt-4o-mini
11:03:43.991     POST api.openai.com/v1/chat/completions
11:03:45.162 Reading response body
             agent run
11:03:45.165   running 1 tool
11:03:45.165     running tool: get_weather_api
11:03:45.166       GET api.open-meteo.com/v1/forecast ? latitude='48.8566' & longitude='2.3522' & current='temperatu…speed_10m' & hourly='temperatu…speed_10m'
11:03:45.731 Reading response body
             agent run
11:03:45.745   chat gpt-4o-mini
11:03:45.748     POST api.openai.com/v1/chat/completions
11:03:46.732 Reading response body


Now that we are set up, we can start building our Fbref Scouting Bot.

This is where multi-step agents (workflows) come in to do the job for us.

In [86]:
# write the model
from typing import Literal, List
from pydantic import HttpUrl, Field

class PlayerPersonal(BaseModel):
    name: Optional[str] = None
    profile_pic: Optional[HttpUrl] = None
    position: Optional[List[str]] = None
    foot: Optional[Literal["Left", "Right"]] = None  # Optional so that the None default matches the type
    height: Optional[str]
    weight: Optional[str]
    birthday: Optional[str]
    birthplace: Optional[str]
    national_team: Optional[str]
    club: Optional[str]
    wages: Optional[str]
    contract_expiring_on: Optional[str]
    social_media: Optional[List[str]]

class AttackingStats(BaseModel):
    npg: float
    npg_percentile: int
    npxG: float
    npxG_percentile: int
    total_shots: float
    total_shots_percentile: int
    assists: float
    assists_percentile: int
    xAG: float
    xAG_percentile: int
    total_attacking_prowess: Optional[float] = Field(description="npxG + xAG")
    sca: Optional[float] = Field(description="Shot creating actions")
    sca_percentile: int
    passes_attempted: float
    passes_attempted_percentile: int
    pass_completion: float
    pass_completion_percentile: int
    progressive_passes: float
    progressive_passes_percentile: int
    progressive_carries: float
    progressive_carries_percentile: int
    successful_takeons: float
    successful_takeons_percentile: int
    touches: float
    touches_percentile: int

class DefensiveStats(BaseModel):
    tackles: float
    tackles_percentile: int
    interceptions: float
    interceptions_percentile: int
    blocks: float
    blocks_percentile: int
    clearances: float
    clearances_percentile: int
    aerials_won: float
    aerials_won_percentile: int

class SimilarPlayer(BaseModel):
    name: str
    url: Optional[HttpUrl]

class PlayerInfo(BaseModel):
    personal: PlayerPersonal
    attacking_stats: AttackingStats
    defensive_stats: DefensiveStats
    similar_players: List[SimilarPlayer] = Field(
        description="Similar players (compared against Att Mid/Wingers), each with the player's name and URL"
    )

The workflow has three steps:

- Get the FBref URL from the player name
- Scrape the data from that URL
- Send the data to another model to get the summary
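
The three steps above can be sketched as a plain chain of functions before any agent is involved. This only shows the shape of the workflow: `find_url`, `scrape` and `summarise` are hypothetical stand-ins (not part of `pydantic-ai`) that return canned values, and the URL is a placeholder. In the notebook, each one is replaced by an agent call.

```python
from dataclasses import dataclass


@dataclass
class ScoutingReport:
    player: str
    url: str
    summary: str


def find_url(player: str) -> str:
    # Hypothetical stand-in for the search agent: builds a placeholder URL.
    slug = player.replace(" ", "-")
    return f"https://example.com/players/{slug}"


def scrape(url: str) -> str:
    # Hypothetical stand-in for the scraping agent: returns fake page text.
    return f"<html>stats scraped from {url}</html>"


def summarise(page: str) -> str:
    # Hypothetical stand-in for the summary agent: reports the page size.
    return f"summary of {len(page)} characters of scraped data"


def scout(player: str) -> ScoutingReport:
    # Each step consumes the output of the previous one.
    url = find_url(player)
    page = scrape(url)
    return ScoutingReport(player=player, url=url, summary=summarise(page))


report = scout("Bukayo Saka")
print(report.url)
```

Keeping each step behind its own function makes it easy to swap a stub for a real agent one at a time.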

A good thing about `pydantic-ai` is that built-in tools (`web_search`, etc.) are available.

In [87]:
from openai.types.responses import WebSearchToolParam  
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

In [88]:
model_settings = OpenAIResponsesModelSettings(
    openai_builtin_tools=[WebSearchToolParam(type="web_search_preview")],
)

In [89]:
from pydantic import HttpUrl

class PlayerUrl(BaseModel):
    name: str
    fbref_url: Optional[HttpUrl] = Field(description="Url from fbref.com about player statistics")

In [90]:
model = OpenAIResponsesModel(
    "gpt-4o-mini",
    provider=OpenAIProvider(api_key=oai_api_key),
)
agent = Agent(
    model=model,
    # instructions=('Be Specific'), 
    model_settings=model_settings, 
    output_type=PlayerUrl,
    output_retries=5
)

In [91]:
await agent.run("Who's Arsenal FC's manager?")

11:05:04.324 agent run
11:05:04.329   chat gpt-4o-mini
11:05:04.334     POST api.openai.com/v1/responses
11:05:11.603 Reading response body


AgentRunResult(output=PlayerUrl(name='Mikel Arteta', fbref_url=HttpUrl('https://fbref.com/en/managers/59c52334/Mikel-Arteta')))
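
Note that the run above put a manager page inside a `PlayerUrl`, since nothing in the schema says the URL must point at a player. A minimal sketch of a guard is below. It assumes FBref player pages live under `/en/players/`, which is inferred from the `/en/managers/` path in the output rather than from any documented contract, and `is_player_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def is_player_url(url: str) -> bool:
    # Assumption: FBref player pages sit under /en/players/ on fbref.com.
    parsed = urlparse(url)
    host = parsed.netloc.lower().removeprefix("www.")
    return host == "fbref.com" and parsed.path.startswith("/en/players/")


print(is_player_url("https://fbref.com/en/managers/59c52334/Mikel-Arteta"))
```

A check like this could run on `result.output.fbref_url` before the scraping step, so a wrong page is rejected early.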

In [92]:
player = "Bukayo Saka"
result = await agent.run(f"Find {player} statistics source")
print(result.output)

11:05:11.626 agent run
11:05:11.628   chat gpt-4o-mini
11:05:11.630     POST api.openai.com/v1/responses
11:05:13.072 Reading response body


In [290]:
openrouter_key = getpass()
# OpenRouter API key, entered at the prompt

 ········


In [131]:
from pydantic_ai.common_tools.tavily import tavily_search_tool

In [132]:
model = OpenAIModel("gpt-4o-mini", provider=OpenAIProvider(api_key=oai_api_key))

In [95]:
tavily_api = getpass()
# Tavily API key, entered at the prompt

 ········


In [133]:
agent = Agent(  
    model,
    instructions=(
        "Be concise, reply with one sentence."
    ),
    tools=[tavily_search_tool(tavily_api)],
)

In [134]:
result = await agent.run(f"Find {player} statistics source")
print(result.output)

11:34:49.292 agent run
11:34:49.302   chat gpt-4o-mini
11:34:49.309     POST api.openai.com/v1/chat/completions
11:34:50.550 Reading response body
             agent run
11:34:50.557   running 1 tool
11:34:50.558     running tool: tavily_search
11:34:50.598       POST api.tavily.com/search
11:34:52.901 Reading response body
             agent run
11:34:52.916   chat gpt-4o-mini
11:34:52.921     POST api.openai.com/v1/chat/completions
11:34:55.485 Reading response body


In [135]:
agent = Agent(  
    model,
    tools=[tavily_search_tool(tavily_api)],
    output_type=PlayerUrl
)

In [136]:
player = "Benjamin Sesko"

result = await agent.run(f"Find {player} statistics source")
print(result.output)

11:34:55.519 agent run
11:34:55.522   chat gpt-4o-mini
11:34:55.523     POST api.openai.com/v1/chat/completions
11:34:56.532 Reading response body
             agent run
11:34:56.589   running 1 tool
11:34:56.590     running tool: tavily_search
11:34:56.626       POST api.tavily.com/search
11:34:59.615 Reading response body
             agent run
11:34:59.620   chat gpt-4o-mini
11:34:59.625     POST api.openai.com/v1/chat/completions
11:35:01.436 Reading response body


In [126]:
crawl_model = OpenAIModel("gpt-4.1", provider=OpenAIProvider(api_key=oai_api_key))

In [127]:
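from dataclasses import dataclass
from httpx import AsyncClient

# Dependencies injected into the crawl agent's tools at run time
@dataclass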
class Deps:
    client: AsyncClient
    firecrawl_api_key: str

In [128]:
crawl_agent = Agent(  
    crawl_model,
    output_type=PlayerInfo,
    deps_type=Deps
)

In [129]:
from firecrawl import FirecrawlApp
from firecrawl.firecrawl import ScrapeResponse

@crawl_agent.tool
def firecrawl_tool(ctx: RunContext[Deps], url: str) -> str:
    """Scrape a page with Firecrawl and return its HTML.

    Args:
        ctx: Pydantic AI run context carrying the injected dependencies
        url: URL to be scraped
    """
    app = FirecrawlApp(api_key=ctx.deps.firecrawl_api_key)

    # Request both formats; only the HTML is handed back to the model
    scrape_status: ScrapeResponse = app.scrape_url(
        url,
        formats=['markdown', 'html'],
    )
    return scrape_status.html


In [138]:
async with AsyncClient() as client:
    deps = Deps(client=client, firecrawl_api_key=fc_api_key)
    crawl_result = await crawl_agent.run(f"Crawl and format from {result.output.fbref_url}", deps=deps)
    print(crawl_result.output)

11:35:50.184 crawl_agent run
11:35:50.186   chat gpt-4.1
11:35:50.187     POST api.openai.com/v1/chat/completions
11:35:51.157 Reading response body
             crawl_agent run
11:35:51.167   running 1 tool
11:35:51.168     running tool: firecrawl_tool
11:36:00.887   chat gpt-4.1
11:36:00.899     POST api.openai.com/v1/chat/completions
11:36:54.636 Reading response body


The next step is to generate the player statistics summary.

In [139]:
summary_model = OpenAIModel("o3-mini", provider=OpenAIProvider(api_key=oai_api_key))

In [140]:
class PlayerSummary(BaseModel):
    overall: str = Field(description=(
        "Based on all the parameters, profile the player as a fwd, mid, def or gk. "
        "Write 2-3 lines about the same."))
    attacking: str = Field(description=(
        "Based on the attacking stats and position of the player, create a comprehensive "
        "profile of positives, negatives and what could be improved, if applicable."))
    defensive: str = Field(description=(
        "Based on the defensive stats and position of the player, create a comprehensive "
        "profile of positives, negatives and what could be improved, if applicable."))
    best_position: str = Field(description="Based on the stats, suggest the best position in the team. Give reasons why.")
    which_team_is_suited: str = Field(description="Which club across all the leagues suits this player?")

In [141]:
summary_agent = Agent(  
    summary_model,
    system_prompt=(
        "You are a forward-thinking coach like Mikel Arteta. You want to profile players for a 5-a-side team. "
        "Give your insights and suggestions to become a great 5-a-side player."
    ),
    output_type=PlayerSummary,
)

In [142]:
summary_result = await summary_agent.run(f"Can you profile this player {crawl_result.output}")

11:37:45.807 summary_agent run
11:37:45.812   chat o3-mini
11:37:45.819     POST api.openai.com/v1/chat/completions
11:37:50.985 Reading response body


In [143]:
print(summary_result.output)

**TASK: Build a research agent**

- If I give a topic, the agent should search for the latest relevant articles, blogs and websites it can find.
- It should scrape them and generate a nice summary with exact citations.
- Make sure that you also show the amount spent on this research.
- You can add any bells and whistles or extras that you need in your own dev life.
- Bonus points if it has a frontend (could be a bot, desktop app or web app).
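
For the amount-spent requirement, one possible approach is to add up token counts per model call and multiply by a price table. The sketch below is only a starting point: the prices in `PRICE_PER_MILLION` and the model names are made-up placeholders (check your provider's current pricing), and `CallUsage` is a hypothetical record you would fill from whatever usage numbers your framework reports.

```python
from dataclasses import dataclass

# Placeholder prices in USD per one million tokens: NOT real pricing.
PRICE_PER_MILLION = {
    "small-model": {"input": 1.0, "output": 2.0},
    "large-model": {"input": 10.0, "output": 20.0},
}


@dataclass
class CallUsage:
    model: str
    input_tokens: int
    output_tokens: int


def call_cost(usage: CallUsage) -> float:
    # Cost of one call: token counts times per-million prices, input plus output.
    price = PRICE_PER_MILLION[usage.model]
    return (usage.input_tokens * price["input"]
            + usage.output_tokens * price["output"]) / 1_000_000


def total_cost(calls: list[CallUsage]) -> float:
    # Sum the cost of every recorded model call in the research run.
    return sum(call_cost(c) for c in calls)


calls = [
    CallUsage("small-model", input_tokens=500_000, output_tokens=100_000),
    CallUsage("large-model", input_tokens=200_000, output_tokens=50_000),
]
print(f"Total spent: ${total_cost(calls):.2f}")
```

Recording one `CallUsage` per agent run keeps the search, scrape and summary steps separately accountable.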

`pydantic-ai` with MCPs

We'll replicate https://x.com/jasonzhou1993/status/1920089480817717379

In [352]:
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio(  
    'deno',
    args=[
        'run',
        '-N',
        '-R=node_modules',
        '-W=node_modules',
        '--node-modules-dir=auto',
        'jsr:@pydantic/mcp-run-python',
        'stdio',
    ]
)

mcp_model = OpenAIModel("gpt-3.5-turbo", provider=OpenAIProvider(api_key=oai_api_key))
mcp_agent = Agent(mcp_model, mcp_servers=[server])

In [None]:
async with mcp_agent.run_mcp_servers():
    mcp_result = await mcp_agent.run('How many days between 2000-01-01 and 2025-03-18?')
    print(mcp_result.output)
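
Since this cell has no recorded output, it helps to know the expected answer. The same calculation in plain Python is below. It is roughly the kind of snippet the model would need to send to the `mcp-run-python` sandbox, though the exact code it generates is not shown here and may differ.

```python
from datetime import date

# Subtracting two dates gives a timedelta; .days is the whole-day difference.
days_between = (date(2025, 3, 18) - date(2000, 1, 1)).days
print(days_between)  # → 9208
```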

- Build an MCP server
- Connect to a remote MCP server (use Zapier)

In [110]:
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerHTTP

server = MCPServerHTTP(url="https://mcp.zapier.com/api/mcp/s/YjZiNTRkM2MtYmE4Yi00NzBlLWFhY2QtMjdkMWFmZDczOGM1OjM4Y2Q3YTNiLTBiNTItNGMxNi04ZGMyLWY0YWVmZWY2NTcyZg==/sse")  
mcp_model = OpenAIModel("gpt-3.5-turbo", provider=OpenAIProvider(api_key=oai_api_key))
mcp_agent = Agent(mcp_model, mcp_servers=[server])

async def main():
    async with mcp_agent.run_mcp_servers():  
        mcp_result = await mcp_agent.run("Can you fetch me emails with subject line ?")
    return mcp_result

In [111]:
result = await main()

11:23:25.698 GET mcp.zapier.com/api/mcp/s/YjZiNTRkM2MtYmE4Yi00NzBlLWFhY2QtMjdkMWFmZDczOGM1OjM4Y2Q3YTNiLTBiNTItNGMxNi04ZGMyLWY0YWVmZWY2NTcyZg==/sse
11:23:26.029 POST mcp.zapier.com/api/mcp/s/YjZiNTRkM2MtYmE4Yi00NzBlLWFhY2QtMjdkMWFmZDczOGM1OjM4Y2Q3YTNiLTBiNTItNGMxNi04ZGMyLWY0YWVmZWY2NTcyZg==/message ? sessionId='bc30071c-…631565806'
11:23:26.346 Reading response body
11:23:26.351 POST mcp.zapier.com/api/mcp/s/YjZiNTRkM2MtYmE4Yi00NzBlLWFhY2QtMjdkMWFmZDczOGM1OjM4Y2Q3YTNiLTBiNTItNGMxNi04ZGMyLWY0YWVmZWY2NTcyZg==/message ? sessionId='bc30071c-…631565806'
11:23:26.353 mcp_agent run
11:23:26.641 Reading response body
11:23:26.646 POST mcp.zapier.com/api/mcp/s/YjZiNTRkM2MtYmE4Yi00NzBlLWFhY2QtMjdkMWFmZDczOGM1OjM4Y2Q3YTNiLTBiNTItNGMxNi04ZGMyLWY0YWVmZWY2NTcyZg==/message ? sessionId='bc30071c-…631565806'
11:23:26.932 Reading response body
             mcp_agent run
11:23:26.959   chat gpt-3.5-turbo
11:23:26.965     POST api.openai.com/v1/chat/completions
11:23:27.769 Reading response body
11:23:27.7

In [112]:
result

AgentRunResult(output="I couldn't find any emails from Vijayata.")