<div style="display: flex; align-items: flex-start; gap: 20px;">
    <div style="flex: 1;">
        <h1>A Tutorial on Function Calling with Gemma</h1>
        Function calling allows large language models like Gemma to use tools and interact with external APIs. It lets Gemma go beyond generating text and carry out real-world tasks for you, such as checking the weather, booking a flight, or keeping track of your expenses. Combined with Gemma's multimodal capabilities and its ability to run offline, function calling opens up many possibilities.<br><br>
        This tutorial will introduce you to function calling with Gemma 3 through a real-world example: we will use Gemma to keep track of our expenses and stay under a monthly budget.
        <h3>Introduction</h3>
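        Before we start writing code, let us understand how function calling works (see the flowchart alongside). We describe the available functions to the model along with our request; the model replies with a function call; we run the function and send its output back; and the model either makes further calls or answers us using the results.<br><br>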
        Now that we have a basic understanding of function calling in general, let's jump into implementing it with Gemma!
        <h3>Getting Started</h3>
        This tutorial assumes you have basic knowledge of Python. It uses <a href="https://ollama.com"><code>ollama</code></a> to run the models locally on your computer, and <a href="https://docs.astral.sh/uv/getting-started/"><code>uv</code></a> to run Python code and manage dependencies. If you haven't already, please follow the linked pages to download and install these tools on your computer.<br><br>
        Start Ollama by running the following command:<br><br>
        <pre>
        <code>ollama serve</code>
        </pre>
        In another terminal, clone this repository and open the notebook by running:<br><br>
        <pre>
        <code>git clone https://github.com/gamemaker1/gemma3-function-calling-tutorial.git</code>
        <code>cd gemma3-function-calling-tutorial/</code>
        <code>uv run --with jupyter jupyter lab tutorial.ipynb</code>
        </pre>
        This should open this tutorial notebook in your browser so you can follow along!
    </div>
    <div style="flex-shrink: 0;">
        <br><br><br>
        <img src="media/flowchart.svg" width="1250" />
    </div>
</div>

In [1]:
# (ignore) this code imports the toolkit and defines functions for rendering markdown in notebooks.

import re
import json
import toolkit

from hashlib import sha256
from datetime import datetime

from rich.console import Console
from rich.markdown import Markdown
from rich.live import Live
from rich.spinner import Spinner

console = Console(force_jupyter=True, width=100)


def render_markdown(text):
    text = re.sub(r"```thinking\b", "```markdown", text)

    text = re.sub(r"```function_spec\b", "```json", text)
    text = re.sub(r"```function_call\b", "```json", text)
    text = re.sub(r"```function_output\b", "```json", text)

    return Markdown(text)


def render_json(obj):
    return render_markdown(f"```json\n{json.dumps(obj, indent=2)}\n```\n")


def render_live(stream):
    response = ""

    with Live(
        Spinner("dots", text="Loading..."), refresh_per_second=10, console=console
    ) as live:
        for chunk in stream:
            response += chunk
            live.update(render_markdown(response))

    return response

### Selecting a Model

Function calling works best with the 12B and 27B parameter Gemma 3 models. If you do not have enough resources to run the 27B parameter model, please replace `gemma3:27b` with `gemma3:12b` in the following line and the `pull` command below.

In [2]:
model = "gemma3:27b"

Run the following command in a terminal to download the model (this might take a while, since the download is several GB in size):

```bash
ollama pull gemma3:27b
```

### Instructing the Model

Gemma 3 is not specifically trained for function calling and [does not have a system prompt](https://ai.google.dev/gemma/docs/core/prompt-structure#system-instructions). However, sending instructions that describe which functions are available and how to call them, alongside our messages, effectively enables function calling in Gemma.

The instructions we provide for function calling might vary based on the model we are using. For the 12B and 27B parameter variants of Gemma 3, the following prompt has been observed to give great results.

In [3]:
prompt = toolkit.prompts.function_calling(model)
render_markdown(prompt)

Notice the following key instructions given to the model in the above prompt:

- The concept of 'threads' is introduced to provide a consistent context for the model to manage multi-step tasks from start to completion.
- A strict three-option response model (clarify, act, or conclude) is established to structure the model's interactions and keep its behavior predictable.
- The model is required to externalize its reasoning in a `thinking` code block, making its chain of thought visible before every response.
- Function specifications, calls, and outputs are handled in distinctly labeled code blocks (`function_spec`, `function_call`, `function_output`) so that they can be parsed by a program.
- The model must explicitly analyze dependencies between sub-tasks to determine if function calls can be executed in parallel within a single turn or must be executed sequentially across multiple turns.
- A core rule is defined, preventing the model from assuming or hallucinating information, forcing it to ask for clarification if any detail required for the entire task is missing.
- The final response must be a user-friendly, natural-language synthesis of all gathered information, abstracting away the technical jargon from function outputs.

The function specifications are valid JSON objects, following [this schema](schemas/function.schema.json). The specifications must include a detailed description of what the function does, what parameters it accepts, the possible outputs it produces, and the possible errors it throws, along with examples of prompts that should trigger its invocation. This level of detail enables the model to make best use of the functions provided to it.
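
To make the shape of a specification concrete, here is a minimal sketch of one for a hypothetical `get_weather` function, together with a rough sanity check. The field names mirror the ones used for the expense functions in this tutorial. Both `get_weather` and `check_specification` are illustrative assumptions and not part of the toolkit, and the check is much looser than the actual schema.

```python
# Hypothetical example: `get_weather` and `check_specification` are not part of
# the toolkit; they only illustrate the overall shape of a specification.
REQUIRED_FIELDS = ("name", "description", "parameters", "responses", "errors")

example_spec = {
    "name": "get_weather",
    "description": "Fetch the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "The name of the city."},
        },
        "required": ["city"],
    },
    "responses": [
        {"type": "float", "description": "The temperature in degrees Celsius."}
    ],
    "errors": [
        {"name": "NoResults", "description": "The city could not be found."}
    ],
}


def check_specification(spec: dict) -> list[str]:
    """Return a list of problems found in the specification (empty if none)."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in spec]
    parameters = spec.get("parameters", {})
    properties = parameters.get("properties", {})
    # Every required parameter should also be described under `properties`.
    for name in parameters.get("required", []):
        if name not in properties:
            problems.append(f"required parameter not described: {name}")
    return problems


print(check_specification(example_spec))  # prints []
```

A check like this only catches missing pieces; the quality of the descriptions still matters most, since that is what the model reads.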

### Defining the Functions

Now that we have the instructions for the model, we can write out our functions' code (in Python) and specifications (as per the schema mentioned in the prompt).

In [4]:
# (click to open) the custom errors our functions raise.


class BadParameters(Exception):
    pass


class NoResults(Exception):
    pass

In [5]:
# (click to open) the code for our expense tracking functions.

transactions = []


def make_expense_entry(
    amount: float, category: str, date: str, notes: str = None
) -> str:
    global transactions

    try:
        datetime.strptime(date, "%Y-%m-%d")
    except ValueError:
        raise BadParameters("The `date` must be in yyyy-mm-dd format.")

    entry = {
        "amount": amount,
        "category": category,
        "date": date,
        "notes": notes,
        "created": datetime.now().isoformat(),
    }

    identifier = sha256(str.encode(json.dumps(entry))).hexdigest()[:8]
    entry["id"] = identifier

    transactions.append(entry)

    return identifier


def get_expense_entries(
    category: str = None, from_date: str = None, to_date: str = None
) -> dict:
    global transactions

    filtered = []
    for transaction in transactions:
        if category and transaction["category"].lower() != category.lower():
            continue
        if from_date and transaction["date"] < from_date:
            continue
        if to_date and transaction["date"] > to_date:
            continue
        filtered.append(transaction)

    if not filtered:
        raise NoResults("The search yielded no results with the given filters.")

    return {
        "transactions": filtered,
        "count": len(filtered),
        "total": sum(x["amount"] for x in filtered),
    }

In [6]:
# (click to open) the specifications for our two custom functions.

specifications = {
    "make_expense_entry": {
        "name": "make_expense_entry",
        "description": "Add an entry for expenditure of a certain amount on the given date to the specified category of the user's personal accounting records.",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {
                    "type": "float",
                    "description": "The amount spent by the user.",
                },
                "category": {
                    "type": "string",
                    "enum": [
                        "food",
                        "entertainment",
                        "clothing",
                        "travel",
                        "utilities",
                        "other",
                    ],
                    "description": "The category the expense falls under.",
                },
                "date": {
                    "type": "string",
                    "format": "yyyy-mm-dd",
                    "description": "The date of the expense",
                },
                "notes": {
                    "type": "string",
                    "description": "A short note about the expense, that will remind the user of it if needed.",
                },
            },
            "required": ["amount", "category", "date"],
        },
        "responses": [
            {"type": "string", "description": "The ID of the created expense entry."}
        ],
        "errors": [
            {"name": "BadParameters", "description": "The category or date is invalid."}
        ],
    },
    "get_expense_entries": {
        "name": "get_expense_entries",
        "description": "Fetches expense entries from the user's personal accounting records, filtering by category and date. To fetch expense entries for multiple categories at the same time, call this function in parallel.",
        "parameters": {
            "type": "object",
            "properties": {
                "category": {
                    "type": "string",
                    "enum": [
                        "food",
                        "entertainment",
                        "clothing",
                        "travel",
                        "utilities",
                        "other",
                    ],
                    "description": "The category the expense falls under, if filtering by category.",
                },
                "from_date": {
                    "type": "string",
                    "format": "yyyy-mm-dd",
                    "description": "If filtering by date, this is the start of the search period (inclusive).",
                },
                "to_date": {
                    "type": "string",
                    "format": "yyyy-mm-dd",
                    "description": "If filtering by date, this is the end of the search period (inclusive).",
                },
            },
            "required": [],
        },
        "responses": [
            {
                "type": "object",
                "properties": {
                    "transactions": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "amount": {
                                    "type": "float",
                                    "description": "The amount spent by the user.",
                                },
                                "category": {
                                    "type": "string",
                                    "enum": [
                                        "food",
                                        "entertainment",
                                        "clothing",
                                        "travel",
                                        "utilities",
                                        "other",
                                    ],
                                    "description": "The category the expense falls under.",
                                },
                                "notes": {
                                    "type": "string",
                                    "description": "A short note about the expense, that will remind the user of it if needed.",
                                },
                                "date": {
                                    "type": "string",
                                    "format": "yyyy-mm-dd",
                                    "description": "The date of the expense",
                                },
                                "created": {
                                    "type": "string",
                                    "format": "iso8601",
                                    "description": "The timestamp recording when the entry was made.",
                                },
                                "id": {
                                    "type": "string",
                                    "description": "The ID of the expense entry.",
                                },
                            },
                        },
                    },
                    "count": {
                        "type": "int",
                        "description": "The number of transactions returned.",
                    },
                    "total": {
                        "type": "float",
                        "description": "The sum total of the amount spent in the transactions returned.",
                    },
                },
            }
        ],
        "errors": [
            {
                "name": "BadParameters",
                "description": "The category, from date or to date is invalid.",
            },
            {
                "name": "NoResults",
                "description": "The search yielded no results with the given filters.",
            },
        ],
    },
}

The blocks above contain the code and the specifications for two functions:

1. `make_expense_entry`: allows us to enter the amount spent, which category the expense belongs to, the date of the transaction, and a short note about the expense (optional).
2. `get_expense_entries`: allows us to fetch all the expenses we have entered so far, and optionally filter them based on category or date.
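
The date filter in `get_expense_entries` compares dates as plain strings. This works because zero-padded `yyyy-mm-dd` strings sort in chronological order, but note that the function does not validate `from_date` and `to_date`, so other formats would silently give wrong results. The sketch below illustrates both points; `in_range` is a hypothetical helper that mirrors the filter and is not part of the tutorial's code.

```python
from datetime import date


def in_range(day: str, from_date: str = None, to_date: str = None) -> bool:
    """Mirror the inclusive date filter used by `get_expense_entries`."""
    if from_date and day < from_date:
        return False
    if to_date and day > to_date:
        return False
    return True


# Zero-padded yyyy-mm-dd strings sort in the same order as the dates themselves.
assert ("2025-04-30" < "2025-05-01") == (date(2025, 4, 30) < date(2025, 5, 1))

print(in_range("2025-05-19", "2025-05-01", "2025-05-31"))  # prints True
print(in_range("2025-06-01", "2025-05-01", "2025-05-31"))  # prints False

# Without zero padding, the string comparison no longer matches the calendar:
print("2025-5-9" < "2025-5-10")  # prints False, although 9 May precedes 10 May
```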

Next, we register the function code and specification with the toolkit so that we can send the model the function specifications, and call the functions when they are invoked by the model.

In [7]:
toolkit.functions.register(make_expense_entry, specifications["make_expense_entry"])
toolkit.functions.register(get_expense_entries, specifications["get_expense_entries"])

Let's try calling the functions ourselves in Python first:

In [8]:
make_expense_entry(78, "travel", "2025-05-19")
make_expense_entry(231.5, "food", "2025-05-19")
make_expense_entry(459, "food", "2025-05-20")
make_expense_entry(917, "clothing", "2025-05-20")

render_json(get_expense_entries())

It works! Our expenses are being stored whenever the `make_expense_entry` function is called, and a list of expense entries is returned whenever the `get_expense_entries` function is called.

The next thing we need to do is collate the function specifications into code blocks with the `function_spec` label, since that is the format the model expects them in. The toolkit provides a helper function that does this for all registered functions:

In [9]:
functions = toolkit.functions.specify()
render_markdown(functions)
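
For intuition, a helper like this could be sketched as follows. This is an assumption about the general approach, not the toolkit's actual implementation, and `to_function_spec_blocks` is a hypothetical name.

```python
import json

FENCE = "`" * 3  # built this way to avoid nesting literal fences in this example


def to_function_spec_blocks(specifications: dict) -> str:
    """Wrap each specification in a code block labelled `function_spec`."""
    blocks = [
        f"{FENCE}function_spec\n{json.dumps(spec, indent=2)}\n{FENCE}"
        for spec in specifications.values()
    ]
    return "\n\n".join(blocks)


# A deliberately tiny, hypothetical specification, just to show the output format.
print(to_function_spec_blocks({"ping": {"name": "ping", "description": "Check connectivity."}}))
```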

### Calling the Function

<div style="display: flex; align-items: flex-start; gap: 20px;">
    <div style="flex: 1;">
        Now that we have the prompt and the function specifications, we can ask the model whatever we want to. Say we just ate at a restaurant, and got a bill (shown on the right).<br><br>
        We can give this image to Gemma, and ask it to add this to our expenses. We can even ask it to check if we're going over our monthly budget for food!<br><br>
        All we have to do is set up the conversation: the first message must be the function calling prompt, the second must be the specifications of the functions that Gemma can call, and the third must be our task for the model.
    </div>
    <div style="flex-shrink: 0;">
        <img src="media/bill.jpg" width="400" height="100" />
    </div>
</div>

In [10]:
task = "Add this to my expenses, and let me know if I'm going over my food budget of 1000 this month."

In [11]:
messages = [
    {"role": "user", "content": prompt},
    {"role": "user", "content": functions},
    {"role": "user", "content": task, "images": ["media/bill.jpg"]},
]

The toolkit provides a helper function to chat with the model, which supports sending images and streaming responses. Use it to get the model's response and add that to the conversation.

In [12]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

Output()
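
A streaming chat helper of this kind could be sketched on top of the `ollama` Python package as shown below. This is a guess at the general approach rather than the toolkit's actual code. The `chat` parameter is a hypothetical addition that lets the sketch run without a model, and `fake_chat` is a stand-in used only for the demonstration.

```python
def stream_content(chunks):
    """Yield only the text carried by each streamed chunk."""
    for chunk in chunks:
        yield chunk["message"]["content"]


def get_response(model, messages, chat=None):
    """Return a generator over the pieces of the model's reply."""
    if chat is None:
        import ollama  # requires the `ollama` package and a running Ollama server

        chat = ollama.chat
    return stream_content(chat(model=model, messages=messages, stream=True))


def fake_chat(model, messages, stream):
    # Stand-in for a real model: streams a fixed reply in three chunks.
    for text in ["Hello", ", ", "world"]:
        yield {"message": {"content": text}}


print("".join(get_response("gemma3:27b", [], chat=fake_chat)))  # prints Hello, world
```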

The model's response contains two code blocks - the first with a `thinking` label, and the second with a `function_call` label. The first code block contains its thoughts on how to complete the task. The second code block contains the function to call, and the parameters to pass to it. It also contains an `id` field, which we must return along with the output of the function call. This lets the model know which function call produced the output it is receiving.

Notice how the model intelligently extracts parameters from the information available to it - it gets the amount, category, and date of the expense all from the bill we provided to it. It even assumes the current month is May, since the receipt is dated May 2025. 

Now that we have the complete model response, we need to parse it for function calls. The toolkit provides a function that extracts the JSON contents of code blocks labelled `function_call`.

In [13]:
calls = toolkit.functions.parse_calls(complete_response)
render_json(calls)
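
To see what such parsing involves, here is a minimal sketch using a regular expression. It is not the toolkit's implementation: `parse_function_calls` is a hypothetical name, and the key names in the example call (`function`, `parameters`) are assumed for illustration, since only the `id` field is described above.

```python
import json
import re

FENCE = "`" * 3  # built this way to avoid nesting literal fences in this example

# Matches the body of every code block labelled `function_call`.
CALL_BLOCK = re.compile(
    FENCE + r"function_call[ \t]*\n(.*?)\n[ \t]*" + FENCE, re.DOTALL
)


def parse_function_calls(response: str) -> list[dict]:
    """Extract the JSON body of each `function_call` block in a response."""
    calls = []
    for body in CALL_BLOCK.findall(response):
        try:
            calls.append(json.loads(body))
        except json.JSONDecodeError:
            continue  # skip blocks that do not contain valid JSON
    return calls


# A made-up response with one `thinking` block and one `function_call` block.
example = (
    f"{FENCE}thinking\nI should record the expense first.\n{FENCE}\n\n"
    f"{FENCE}function_call\n"
    '{"id": "call_1", "function": "make_expense_entry", '
    '"parameters": {"amount": 12.5, "category": "food", "date": "2025-05-21"}}\n'
    f"{FENCE}\n"
)
print(parse_function_calls(example)[0]["function"])  # prints make_expense_entry
```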

We can now loop through these function calls and execute them one by one using a helper function from the toolkit. It identifies the function to execute from the list of registered functions, invokes the function with the given named parameters, and formats the function's output in JSON.

Since functions can be asynchronous, the results are passed via a callback. Our callback, the `respond` function, simply stores the results of the function call as a message from the user.

In [14]:
def respond(result: str) -> None:
    global messages
    messages.append({"role": "user", "content": result})


for call in calls:
    toolkit.functions.execute_call(call, respond)
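
The dispatch step described above could look roughly like the sketch below. It is an approximation under stated assumptions, not the toolkit's code: the `registry`, the `register` decorator, and the key names `function`, `parameters`, `result`, and `error` are all hypothetical.

```python
import json

FENCE = "`" * 3  # built this way to avoid nesting literal fences in this example

registry = {}  # maps function names to Python callables


def register(function):
    """Make a function available for calls by name."""
    registry[function.__name__] = function
    return function


def execute_call(call: dict, respond) -> None:
    """Run the requested function and pass its labelled output to `respond`."""
    output = {"id": call.get("id")}
    try:
        function = registry[call["function"]]
        output["result"] = function(**call.get("parameters", {}))
    except Exception as error:
        # Report failures to the model instead of crashing the conversation.
        output["error"] = {"name": type(error).__name__, "message": str(error)}
    respond(f"{FENCE}function_output\n{json.dumps(output, indent=2)}\n{FENCE}")


@register
def add(a: float, b: float) -> float:
    return a + b


execute_call({"id": "call_1", "function": "add", "parameters": {"a": 1, "b": 2}}, print)
```

Catching the exception and returning its name lets the model compare the failure against the `errors` listed in the function's specification and decide what to do next.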

Once the above code finishes executing, the contents of the last message will be the output of the function.

In [15]:
render_markdown(messages[-1]["content"])

Notice that the output is wrapped in a JSON object whose `id` matches the `id` received in the function call, which is how the model ties the output back to the call that produced it.

Now that we have the function output, we can continue the conversation with the model:

In [16]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

Output()

We get two code blocks again: one with the model's thoughts, and the other with another function call. Notice how it acknowledges the output of the previous function: it knows the call was successful, and that the returned string is the ID of the expense entry. This is thanks to the detailed function specification we gave it.

Now, let us parse and execute the new function call and inspect its result:

In [17]:
calls = toolkit.functions.parse_calls(complete_response)
for call in calls:
    toolkit.functions.execute_call(call, respond)

In [18]:
render_markdown(messages[-1]["content"])

Nice! We see that the `get_expense_entries` function works as intended, returning all the expense entries in the `food` category for the month of May. Let's send this output back to the model and get its response:

In [19]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

Output()

Nice! The model acknowledged the output of the function call, and used it to answer our question of whether we had exceeded our food budget. This completes the task, and your first function-calling conversation with Gemma!
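
The steps we just ran by hand (get a response, parse it for calls, execute them, repeat) form a loop that ends when a response contains no function calls. Here is a minimal sketch of that loop. `run_until_done` and its parameters are hypothetical, and the stand-in functions in the demonstration replace the model and the toolkit so that the sketch runs on its own.

```python
def run_until_done(messages, get_reply, parse_calls, execute_call, max_turns=8):
    """Alternate between the model and the functions until no calls remain."""

    def respond(result):
        # Function outputs go back into the conversation as user messages.
        messages.append({"role": "user", "content": result})

    for _ in range(max_turns):
        reply = get_reply(messages)
        messages.append({"role": "assistant", "content": reply})
        calls = parse_calls(reply)
        if not calls:
            # No calls: the model is asking for clarification or has concluded.
            return reply
        for call in calls:
            execute_call(call, respond)
    raise RuntimeError("The model did not finish within the allowed number of turns.")


# Demonstration with scripted stand-ins instead of a real model.
scripted = iter(["CALL", "All done."])
reply = run_until_done(
    [{"role": "user", "content": "task"}],
    get_reply=lambda messages: next(scripted),
    parse_calls=lambda reply: [{"id": "1"}] if reply == "CALL" else [],
    execute_call=lambda call, respond: respond("output for " + call["id"]),
)
print(reply)  # prints All done.
```

The `max_turns` limit is a design choice of this sketch: it guards against a model that keeps issuing calls without ever concluding.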

Next, let's try parallel function calling by asking the model to check if we've exceeded our clothes and travel budgets too. Since the two function calls the model needs to make are independent of each other, the model _should_ make them in parallel, i.e., produce two `function_call` labelled code blocks in a single response.

In [20]:
task = "Oh no! Could you also check if I have gone above my clothes and travel budgets?"
messages.append({"role": "user", "content": task})

In [21]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

Output()

Ah! Instead of calling the functions, the model is requesting a clarification. This is because we haven't mentioned our budget for the clothing or travel category, and the model has rightly not made any assumptions. Let us respond with a clarification:

In [22]:
clarification = "My clothing budget is 500, and my travel budget is 2k."
messages.append({"role": "user", "content": clarification})

In [23]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

Output()

Nice! It generated two calls to the `get_expense_entries` function with different IDs - passing 'clothing' as the category in the first one, and 'travel' as the category in the second. Let's parse and execute these function calls.

In [24]:
calls = toolkit.functions.parse_calls(complete_response)
for call in calls:
    toolkit.functions.execute_call(call, respond)

The successful execution of the functions means two messages have been added to the conversation:

In [25]:
render_markdown(messages[-2]["content"])

In [26]:
render_markdown(messages[-1]["content"])
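
The loop above runs the two calls one after the other, so their outputs arrive in order. If independent calls were instead run concurrently, the `id` is what would tie each output back to its call. The sketch below shows one way to do that with a thread pool. It is an alternative, not what the toolkit does: `run_call` is a hypothetical stand-in for executing a real function, and the key names are assumed.

```python
from concurrent.futures import ThreadPoolExecutor


def run_call(call: dict) -> dict:
    # Stand-in for a real function call; the totals are made up for the demo.
    totals = {"clothing": 917, "travel": 78}
    return {"id": call["id"], "result": totals[call["parameters"]["category"]]}


calls = [
    {"id": "a1", "function": "get_expense_entries", "parameters": {"category": "clothing"}},
    {"id": "b2", "function": "get_expense_entries", "parameters": {"category": "travel"}},
]

# `map` runs the calls on worker threads and yields results in input order.
with ThreadPoolExecutor() as pool:
    outputs = list(pool.map(run_call, calls))

# Index the outputs by `id` so each one can be matched to its call.
by_id = {output["id"]: output for output in outputs}
print(by_id["b2"]["result"])  # prints 78
```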

Now that we have the outputs of both function calls, we can send them to the model and get its final response.

In [27]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

Output()

Nice! The model acknowledges both the outputs and uses them to construct a user-friendly response, answering our question about the clothing and travel budgets.
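
As a sanity check on the model's answer, the comparison can be worked out by hand. The sketch below assumes the only clothing and travel entries are the ones we created manually earlier (917 and 78), and uses the budgets from our clarification. `budget_status` is a hypothetical helper.

```python
def budget_status(spent: float, budget: float) -> str:
    """Describe how the amount spent compares with the budget."""
    difference = budget - spent
    if difference < 0:
        return f"over by {abs(difference)}"
    return f"under by {difference}"


budgets = {"clothing": 500, "travel": 2000}
spent = {"clothing": 917, "travel": 78}

for category, budget in budgets.items():
    print(f"{category}: {budget_status(spent[category], budget)}")
# prints:
# clothing: over by 417
# travel: under by 1922
```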

This concludes the tutorial on function calling with Gemma 3. Feel free to play around with different scenarios and functions using the toolkit, and start a discussion [here](https://github.com/gamemaker1/gemma3-function-calling-tutorial/discussions/new/choose) if you have any questions or just want to share something about function calling with Gemma :)