# Functions, tools and agents

We're going to improve two of the demos in this tutorial using what you've just learnt in this section:

- Structured outputs
- Tool calls

In the last section, we built a basic RAG bot with a search using embeddings. In the data set, there is a price field. 

Embedding models aren't designed for filtering, especially on numbers: prices tend to be treated as strongly similar to one another, so a phrase like "less than $10" does little to narrow the results.

Let's improve the search by requesting that the AI return some specific price filters.

In [6]:
# Setup code
from IPython.display import display, Markdown

import utils
from openai import OpenAI

# If you change the environment variables, you need to restart the kernel
base_url = utils.get_base_url()
api_key = utils.get_api_key()

if utils.MODE == "github":
    model = "openai/gpt-4.1-nano"  # A fast, small model
elif utils.MODE == "ollama":
    model = "llama3.1"  # llama and ollama are not related. It's a coincidence

# The OpenAI client is a class. Older versions of the library used module-level globals, so you may still see snippets written for the old API.

client = OpenAI(
    base_url=base_url,
    api_key=api_key,
)

# Defining tools

As with most things, OpenAI did this first and most other models copied. So even if you're not using OpenAI models, see [OpenAI Spec](https://platform.openai.com/docs/guides/function-calling?api-mode=responses#defining-functions) for usage.

Tools have:
 - A type (e.g. `function`)
 - A name, which should be in snake_case
 - A description. This is important, especially if there are multiple tools. Treat it like a prompt.
 - Parameters
 - Whether the function is "strict" i.e. function calls reliably adhere to the function schema, instead of being best effort. We recommend always enabling strict mode.

In [None]:
# A tool for filtering prices within a range. Neither bound is required: the user could say "less than $10" or "between $5 and $10".

tools = [
    {
        "type": "function",
        "function": {
            "name": "price_filter",
            "description": "Filter prices within a range.",
            "parameters": {
                "type": "object",
                "properties": {
                    "min_price": {
                        "type": "number",
                        "description": "Minimum price to filter.",
                    },
                    "max_price": {
                        "type": "number",
                        "description": "Maximum price to filter.",
                    },
                },
                # Both parameters are optional. But this is how you could specify them as required.
                # "required": ["min_price", "max_price"],
            },
        },
    }
]
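The tool above leaves strict mode off. As a rough sketch of the strict variant recommended earlier, following the OpenAI function-calling guide: strict mode expects every property to be listed in `required` and `additionalProperties` to be `False`, so an optional bound is expressed as a nullable type instead. The name `strict_tools` is only for illustration, and strict-mode support may vary between providers and local models.

```python
# Sketch: the same price filter with strict mode enabled (Chat Completions format).
# `strict_tools` is an illustrative name, not part of the tutorial code.
strict_tools = [
    {
        "type": "function",
        "function": {
            "name": "price_filter",
            "description": "Filter prices within a range.",
            "strict": True,
            "parameters": {
                "type": "object",
                "properties": {
                    # Every property must appear in "required" under strict mode,
                    # so "optional" is modelled as "number or null".
                    "min_price": {
                        "type": ["number", "null"],
                        "description": "Minimum price, or null if no lower bound was given.",
                    },
                    "max_price": {
                        "type": ["number", "null"],
                        "description": "Maximum price, or null if no upper bound was given.",
                    },
                },
                "required": ["min_price", "max_price"],
                "additionalProperties": False,
            },
        },
    }
]
```

With this shape, a query such as "less than $10" would be expected to come back with `min_price` set to `null` rather than missing.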


In [None]:
# Let's try that with a prompt

def query_with_filter(query):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant that can find products.",
            },
            {
                "role": "user",
                "content": query,
            }
        ],
        tools=tools,
        tool_choice="auto",
    )


    if response.choices[0].message.tool_calls:
        tool_call = response.choices[0].message.tool_calls[0]
        if tool_call.function.name == "price_filter":
            return tool_call.function.arguments

print(query_with_filter("Find me a product that costs between $5 and $10."))
print(query_with_filter("I'm looking for a hat that's less than $10."))
print(query_with_filter("Birthday scarf. At least $24.99."))
print(query_with_filter("Mothers day gift ideas for 20-30 bucks."))
print(query_with_filter("Find me a product that costs loads-a-money."))  # This one is a bit ambiguous, but let's see what it does
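Note that `tool_call.function.arguments` is a JSON string, not a dictionary, and `query_with_filter` returns `None` when the model makes no tool call. A minimal sketch of turning that value into two optional floats follows; the helper name `parse_price_filter` is made up for this example.

```python
import json
from typing import Optional


def parse_price_filter(arguments: Optional[str]) -> tuple[Optional[float], Optional[float]]:
    """Turn the JSON string of tool arguments into (min_price, max_price)."""
    if not arguments:
        # No tool call was made, so there is nothing to filter on.
        return None, None
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError:
        # Without strict mode the arguments are best effort, so be defensive.
        return None, None
    if not isinstance(parsed, dict):
        return None, None

    def as_float(value) -> Optional[float]:
        if value is None:
            return None
        try:
            return float(value)
        except (TypeError, ValueError):
            return None

    return as_float(parsed.get("min_price")), as_float(parsed.get("max_price"))


print(parse_price_filter('{"min_price": 5, "max_price": 10}'))  # (5.0, 10.0)
print(parse_price_filter('{"max_price": 10}'))  # (None, 10.0)
print(parse_price_filter(None))  # (None, None)
```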

## Task: 

1. Experiment with some different questions and see what the filters look like.
1. Modify the function below to filter the dataframe by price before sorting it by similarity
1. Update the `rag_chat` function and complete the two TODO items
1. Test the chat with various searches. Start with the suggestions above.
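As a hint for the second step, here is the "filter first, then rank" idea on a toy list of dictionaries rather than the real dataframe. The product names, prices and similarity scores are invented for illustration, and `filter_then_rank` is a hypothetical helper; with pandas the same idea would typically be written as boolean masks on the `price` column.

```python
def filter_then_rank(products, min_price=None, max_price=None, n=3):
    """Drop products outside the price range, then keep the n most similar."""
    in_range = [
        p for p in products
        # A bound of None means "no limit on that side".
        if (min_price is None or p["price"] >= min_price)
        and (max_price is None or p["price"] <= max_price)
    ]
    return sorted(in_range, key=lambda p: p["similarity"], reverse=True)[:n]


# Invented data: the most similar item is also the one that is too expensive.
toy_products = [
    {"name": "Wool beanie", "price": 18.0, "similarity": 0.91},
    {"name": "Cashmere hat", "price": 55.0, "similarity": 0.95},
    {"name": "Sun visor", "price": 9.0, "similarity": 0.40},
]

print([p["name"] for p in filter_then_rank(toy_products, max_price=20)])
# ['Wool beanie', 'Sun visor']
```

Filtering before taking the top `n` matters: taking the top `n` first and filtering afterwards could return fewer than `n` results even when enough in-range products exist.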

In [None]:
# Using the tool call responses
from typing import Optional
import utils
from utils.embeddings import get_embedding_client, cosine_similarity, get_embedding, load_clothing_data  # See utils/embeddings.py for the cosine similarity function (it's not complicated)

embedding_client, dimensions, embedding_model = get_embedding_client()

def search_df(df, product_description, n=3, min_price: Optional[float] = None, max_price: Optional[float] = None):
    embedding = get_embedding(embedding_client, model=embedding_model, dimensions=dimensions, input=product_description)
    df['similarities'] = df.embedding.apply(lambda x: cosine_similarity(x, embedding))
    # TODO : filter the dataframe by price

    res = df.sort_values('similarities', ascending=False).head(n)
    return res

data = load_clothing_data(embedding_model)


def rag_chat(query, n=3):
    # TODO: Get the filter parameters for `query`

    # TODO: Update the search function to filter by price
    matches = search_df(data, query, n=n)
    
    # Merge this into a prompt
    # TODO: Find your "winning prompt" from the last exercise
    system_prompt = f"""
    The user has asked about a product, you are a helpful assistant that can give suggestions about products we have. 

    The matching products are:
    """

    for _, match in matches.iterrows():
        system_prompt += f"""
        Name: {match['name']}
        Description: {match.description}
        URL: https://www.superpythonshop.com/products/{match.id}

        """

    # Step 2: Call the model with the prompt
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
        ],
        temperature=0.5,
        n=1,
    )

    # Step 3: Return the response
    return response.choices[0].message.content

from IPython.display import display, Markdown

display(Markdown(rag_chat("I need a warm hat for winter less than $20")))


# Elemental Clash AI ✨✨✨

I've taken 7 photos of playing cards on a table and we're going to get the AI to work out which player wins in our game **Elemental Clash**.

![Card picture](data/cards/IMG_9059.jpg)

If you want to add your own photos using a phone, please do!

First, let's define the data structure for the cards so we can use Structured Outputs to get the cards played as Pydantic models.

In [8]:
# from enum import StrEnum # Python 3.11 + otherwise use (str, Enum) as base class
from enum import Enum
from pydantic import BaseModel
from typing import cast


class CardSuit(str, Enum):
    hearts = "hearts"
    diamonds = "diamonds"
    clubs = "clubs"
    spades = "spades"


class CardValue(str, Enum):
    two = "2"
    three = "3"
    four = "4"
    five = "5"
    six = "6"
    seven = "7"
    eight = "8"
    nine = "9"
    ten = "10"
    jack = "J"
    queen = "Q"
    king = "K"
    ace = "A"

SUIT_GLYPHS = {
    CardSuit.hearts: "♥",
    CardSuit.diamonds: "♦",
    CardSuit.clubs: "♣",
    CardSuit.spades: "♠",
}

class PlayedCard(BaseModel):
    suit: CardSuit
    value: CardValue

    def __repr__(self):
        return f"{self.value.name} of {SUIT_GLYPHS[self.suit]}"


class PlayedCards(BaseModel):
    cards: list[PlayedCard]


def parse_played_cards(message) -> PlayedCards:
    completion = client.beta.chat.completions.parse(
        model=model,
        messages=[
            {"role": "system", "content": "What cards were played."},
            {"role": "user", "content": message},
        ],
        response_format=PlayedCards,
    )


    message = completion.choices[0].message
    if message.refusal:
        print(message.refusal)
        raise ValueError("Could not parse the message.")
    else:
        return cast(PlayedCards, message.parsed)

parse_played_cards("John played the ace of spades. Sarita played the queen of hearts.")

PlayedCards(cards=[ace of ♠, queen of ♥])

# Task

Next, let's combine this with a function called `determine_winner_programmatically` which will evaluate (using Python, not AI) which player wins a turn.

You will need to:

1. Add the player's name to the structured output by adding it as an attribute to `PlayedCard`
1. Verify it works
1. Call `determine_winner_programmatically` with a dictionary where `key` is player name and `value` is the card.
1. The value needs to be in the format of `[value] of [suit]` e.g. `"2 of spades"` or `"K of hearts"`
1. Print the winning player
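One detail to watch when building the `[value] of [suit]` string: an enum member has both a `.name` (`ace`) and a `.value` (`"A"`), and the format above needs the value. The sketch below uses cut-down stand-in enums (`Suit`, `Value`) and a hypothetical `format_card` helper so that it runs on its own; they are not the classes defined earlier.

```python
from enum import Enum


class Suit(str, Enum):
    hearts = "hearts"
    spades = "spades"


class Value(str, Enum):
    queen = "Q"
    ace = "A"


def format_card(value: Value, suit: Suit) -> str:
    """Build the "[value] of [suit]" string from the members' values."""
    return f"{value.value} of {suit.value}"


print(Value.ace.name)  # ace
print(Value.ace.value)  # A
print(format_card(Value.ace, Suit.spades))  # A of spades
print(format_card(Value.queen, Suit.hearts))  # Q of hearts
```

Using `.value` explicitly also avoids relying on how a `str`-based enum is rendered inside an f-string, which has differed between Python versions.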

In [None]:
from utils.game import determine_winner_programmatically

sample_query = "John played the ace of spades. Sarita played the queen of hearts."

# TODO: implement as per instructions
winner = ...

assert winner == "John"


# Adding a visual model

To recognise what cards are played on the table, we can use a visual, or "multi-modal" model.

The GPT-4o series is one of the most popular multi-modal models and it also supports structured outputs.

For local development, `gemma3`, `phi4` and `llama4:scout` are the best options (this week!).

Let's try it out.

In [None]:
import base64

image_model = "openai/gpt-4.1-mini" if utils.MODE == "github" else "gemma3:4b"

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# If we're using GitHub, use my copy of the photo on GitHub
if utils.MODE == "github":
    image_url = "https://raw.githubusercontent.com/tonybaloney/PyCon-AI-Crash-Course/refs/heads/main/data/cards/IMG_9059.jpg"
else:
    # If local, you can use that or you can base64 encode it
    base64_image = encode_image("data/cards/IMG_9059.jpg")
    image_url = f"data:image/jpeg;base64,{base64_image}"


response = client.chat.completions.create(
    model=image_model,
    messages=[
        {
            "role": "user",
            "content": [
                { "type": "text", 
                   "text": "What cards are on the table" },
                {
                    "type": "image_url",
                    "image_url": {"url": image_url}
                },
            ],
        }
    ],
)

result = response.choices[0].message.content
display(Markdown(result))


# Task

1. Adapt `parse_played_cards`, or add the structured output parameter (`response_format`) to this code, and see if you can get a list of cards as a Python object.
1. Write a program to determine the winner from any photo in `data/cards` by combining everything you've learned in this module. Since we don't know the player name in the image, just assign them a number and say what their card was.

NB: The smaller, local models might not be able to accurately get _all_ of the cards on the table.
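For the second task you will need to send each photo in turn. A minimal sketch of the plumbing, using only the standard library, is shown below; `image_to_data_url`, `build_image_message` and `card_photos` are hypothetical helper names, and the fallback to `image/jpeg` is an assumption about the files in `data/cards`.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(image_path) -> str:
    """Base64-encode a local image as a data URL."""
    path = Path(image_path)
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None:
        mime_type = "image/jpeg"  # Assumption: the card photos are JPEGs
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def build_image_message(prompt: str, image_url: str) -> dict:
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


def card_photos(folder="data/cards"):
    """Return the JPEG photos in a folder, sorted by file name."""
    return sorted(Path(folder).glob("*.jpg"))


for photo in card_photos():
    message = build_image_message("What cards are on the table", image_to_data_url(photo))
    print(photo.name, len(message["content"][1]["image_url"]["url"]))
```

Each `message` can then be passed in the `messages` list of a chat completion call, in the same way as the single-photo example above.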


# Agentic Frameworks

Everything you've done so far in this tutorial has led up to "agents".

Agentic programming is a way of connecting tasks, whether they be AI-driven, programmatic, or human-driven.

For our card game, the flow is:

- Someone deals each player 5 cards
- The players choose a card and place it face down
- The cards are turned over
- The cards are analysed (we just automated that part)
- The winner is decided

Let's build a deck of cards in Python using a generator to deal from a deck.
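The real `deal` function lives in `utils/game.py` and is not shown here. As a rough idea of what a generator-based dealer can look like, here is a stand-alone sketch; `deal_sketch` and its `seed` parameter are assumptions made for this example, not the tutorial's implementation.

```python
import random
from typing import Iterator, Optional

SUITS = ["hearts", "diamonds", "clubs", "spades"]
VALUES = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]


def deal_sketch(seed: Optional[int] = None) -> Iterator[str]:
    """Yield each of the 52 cards once, in shuffled order, as "[value] of [suit]"."""
    cards = [f"{value} of {suit}" for suit in SUITS for value in VALUES]
    # A private Random instance keeps the shuffle reproducible when a seed is given.
    random.Random(seed).shuffle(cards)
    yield from cards


sketch_deck = deal_sketch(seed=42)
sketch_hand = [next(sketch_deck) for _ in range(5)]
print(sketch_hand)
```

Because the generator keeps its position between calls to `next`, dealing a second hand from the same generator never repeats a card from the first.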

In [4]:
from utils.game import deal, determine_winner_programmatically

deck = deal()  # Generates next card

player1_hand = [next(deck) for _ in range(5)]
player2_hand = [next(deck) for _ in range(5)]
player3_hand = [next(deck) for _ in range(5)]

scores = {
    "Player 1": 0,
    "Player 2": 0,
    "Player 3": 0,
}

for turn in range(5):
    print("Player 1 played", player1_hand[turn])
    print("Player 2 played", player2_hand[turn])
    print("Player 3 played", player3_hand[turn])

    # Determine winner
    winner = determine_winner_programmatically(
        {
            "Player 1": player1_hand[turn], 
            "Player 2": player2_hand[turn], 
            "Player 3": player3_hand[turn]
        }
    )
    print(f"Winner: {winner}\n")
    # Update scores
    scores[winner] += 1

print("Final Scores:")
for player, score in scores.items():
    print(f"{player}: {score} points")


Player 1 played 10 of clubs
Player 2 played Q of hearts
Player 3 played 7 of diamonds
Winner: Player 3

Player 1 played J of clubs
Player 2 played K of clubs
Player 3 played 5 of hearts
Winner: Player 2

Player 1 played J of diamonds
Player 2 played 10 of hearts
Player 3 played 7 of spades
Winner: Player 3

Player 1 played A of hearts
Player 2 played A of clubs
Player 3 played 3 of hearts
Winner: Player 2

Player 1 played 10 of diamonds
Player 2 played 9 of diamonds
Player 3 played 7 of clubs
Winner: Player 3

Final Scores:
Player 1: 0 points
Player 2: 2 points
Player 3: 3 points


In [14]:
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider
from openai import AsyncOpenAI

# Pydantic AI wants async clients by default
pai_model = OpenAIModel(
    model_name=model,
    provider=OpenAIProvider(openai_client=AsyncOpenAI(base_url=base_url, api_key=api_key)),
)

cardplaying_agent = Agent(
    pai_model,
    deps_type=PlayedCards,
    output_type=str,
    system_prompt="You are a card game assistant. You determine the winner using determine_winner",
)

@cardplaying_agent.tool
async def determine_winner(context: RunContext[PlayedCards]) -> str:
    """
    Determine the winner of a card game based on the played cards.
    """
    # Read the cards from the run's dependencies rather than asking the model to
    # supply them, and format each one as "[value] of [suit]" (e.g. "A of spades").
    # repr(card) gives "ace of ♠", which is not the format the game function expects.
    return determine_winner_programmatically(
        {
            f"Player {i}": f"{card.value.value} of {card.suit.value}"
            for i, card in enumerate(context.deps.cards, start=1)
        }
    )


result = await cardplaying_agent.run('Who wins this turn', deps=PlayedCards(cards=[
    PlayedCard(suit=CardSuit.spades, value=CardValue.ace),
    PlayedCard(suit=CardSuit.hearts, value=CardValue.queen),
    PlayedCard(suit=CardSuit.clubs, value=CardValue.jack),
]))
print(result.output)