# Offline, Multimodal Function Calling with Gemma: A Tutorial

Function calling lets large language models like Gemma use tools and interact with external APIs. It allows Gemma to go beyond generating text and carry out real-world tasks for you, like checking the weather, booking a flight, keeping track of your expenses, and much more! Combined with Gemma's multimodal capabilities and its ability to run offline, function calling opens up a lot of possibilities.

This tutorial will introduce you to the concept of function calling, and how to make use of it with Gemma 3's multimodal capabilities running offline on your computer. Here's a peek at a conversation you'll be having by the end of this tutorial:

<em>user</em>
> [[image of bill](media/bill.jpg)]
> 
> Add this to my expenses, and let me know if I'm going over my food budget of 1000 this month.

<em>model</em>
> You've spent a total of 1278.5 on food this month, which exceeds your budget of 1000 by 278.5. It looks like that dinner at TSC put you over the limit!

### Introduction

Let's take an example to work with for this tutorial. Say you've already written some code for a simple expense tracker in Python. Your code has two functions:

- `make_expense_entry` (to record expenses)
- `get_expense_entries` (to filter expenses)

Normally, using this code would require building a separate application to handle user input (things like the amount, category, and date of each transaction) and display data (like the stored transactions).

But what if we could make this process much easier? What if you could simply send a picture of a bill to Gemma and have it automatically add the expense to your tracker? Or ask Gemma to summarize your spending and help you stay within a monthly budget?

That's exactly what we'll explore today - using Gemma to discover and call your existing Python functions, turning natural language requests into automated actions. We will:

1. Install and set up the necessary tools
2. Write the Python functions for our simple expense tracker
3. Write specifications for our functions so the model can discover them
4. Select and download a multimodal Gemma model
5. Instruct the model on how to call our functions
6. Bring it all together and use the now intelligent expense tracker

Let's start with the first step - installing the necessary tools!

### Installing and setting up tools

This tutorial assumes you have basic knowledge of Python. In addition, this tutorial uses the following tools:

- [`ollama`](https://ollama.com) to run the models locally on your computer
- [`uv`](https://docs.astral.sh/uv/getting-started/) to run Python code and manage dependencies

If you haven't already, please follow the links above to get them set up.

First, let's make sure everything is installed correctly. Open your terminal and check the versions:

```bash
uv --version
ollama --version
```

Next, start `ollama` by running the following command:

```bash
ollama serve
```

In another terminal, clone this repository and open the notebook by running:

```bash
git clone https://github.com/gamemaker1/gemma3-function-calling-tutorial.git
cd gemma3-function-calling-tutorial/
uv run --with jupyter jupyter lab tutorial.ipynb
```

This should open this tutorial notebook in your browser so you can follow along!

In [1]:
# (ignore) this code imports the toolkit and defines functions for rendering markdown in notebooks.

import re
import json
import toolkit

from hashlib import sha256
from datetime import datetime

from IPython.display import clear_output, display, Markdown


def render_markdown(text):
    pattern = r"```thinking\n(.*?)\n```"

    def replacement(match):
        content = match.group(1)
        return f"<blockquote>\n{content}\n</blockquote>\n\n"

    text = re.sub(pattern, replacement, text, flags=re.DOTALL)

    text = re.sub(r"```function_spec\b", "```json", text)
    text = re.sub(r"```function_call\b", "```json", text)
    text = re.sub(r"```function_output\b", "```json", text)

    display(Markdown(text))


def render_json(data):
    return render_markdown(f"```json\n{json.dumps(data, indent=2)}\n```\n")


def render_live(stream):
    response = ""
    render_markdown("> Loading...")

    for chunk in stream:
        response += chunk
        clear_output(wait=True)
        render_markdown(response)

    return response

> You'll notice our code uses a helper library called `toolkit`. This is not a standard Python library but a custom set of functions we've written specifically for this tutorial. It's included in the repository you just cloned and is designed to simplify the process of communicating with the model, so we can focus on the core concepts of function calling.

### Writing our expense tracking functions

Now that we have everything set up, let us quickly write the code for our tiny expense tracker :)

In [2]:
# (click to open) the code for our expense tracking functions.

transactions = []


class BadParameters(Exception):
    pass


class NoResults(Exception):
    pass


def make_expense_entry(
    amount: float, category: str, date: str, notes: str = None
) -> str:
    global transactions

    # Reject dates that cannot be parsed as yyyy-mm-dd.
    try:
        datetime.strptime(date, "%Y-%m-%d")
    except ValueError:
        raise BadParameters("The `date` must be in yyyy-mm-dd format.")

    entry = {
        "amount": amount,
        "category": category,
        "date": date,
        "notes": notes,
        "created": datetime.now().isoformat(),
    }

    # Derive a short ID from a hash of the entry's contents.
    identifier = sha256(str.encode(json.dumps(entry))).hexdigest()[:8]
    entry["id"] = identifier

    transactions.append(entry)

    return identifier


def get_expense_entries(
    category: str = None, from_date: str = None, to_date: str = None
) -> dict:
    global transactions

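    filtered = []
    for transaction in transactions:
        # Category matching is case-insensitive.
        if category and transaction["category"].lower() != category.lower():
            continue
        # Dates are yyyy-mm-dd strings, so comparing them as strings
        # is equivalent to comparing them chronologically.
        if from_date and transaction["date"] < from_date:
            continue
        if to_date and transaction["date"] > to_date:
            continue
        filtered.append(transaction)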

    if not filtered:
        raise NoResults("The search yielded no results with the given filters.")

    return {
        "transactions": filtered,
        "count": len(filtered),
        "total": sum(x["amount"] for x in filtered),
    }

Let's test it out:

In [3]:
make_expense_entry(78, "travel", "2025-05-19")
make_expense_entry(231.5, "food", "2025-05-19")
make_expense_entry(459, "food", "2025-05-20")
make_expense_entry(917, "clothing", "2025-05-20")

render_json(get_expense_entries())

```json
{
  "transactions": [
    {
      "amount": 78,
      "category": "travel",
      "date": "2025-05-19",
      "notes": null,
      "created": "2025-06-20T17:02:19.603784",
      "id": "0b7fe740"
    },
    {
      "amount": 231.5,
      "category": "food",
      "date": "2025-05-19",
      "notes": null,
      "created": "2025-06-20T17:02:19.603926",
      "id": "ba6c9c0e"
    },
    {
      "amount": 459,
      "category": "food",
      "date": "2025-05-20",
      "notes": null,
      "created": "2025-06-20T17:02:19.604021",
      "id": "0be11f7b"
    },
    {
      "amount": 917,
      "category": "clothing",
      "date": "2025-05-20",
      "notes": null,
      "created": "2025-06-20T17:02:19.604100",
      "id": "1b0ec625"
    }
  ],
  "count": 4,
  "total": 1685.5
}
```


It works! Our expenses are being stored whenever the `make_expense_entry` function is called, and a list of expense entries is returned whenever the `get_expense_entries` function is called.
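
One detail worth noting: `get_expense_entries` compares dates as plain strings. This works because zero-padded `yyyy-mm-dd` strings sort in the same order as the dates they represent. The following is a small sketch of that property; the sample dates are made up for illustration.

```python
from datetime import date

# ISO-formatted (yyyy-mm-dd) strings, deliberately out of order.
samples = ["2025-05-20", "2025-04-30", "2025-05-19", "2024-12-31"]

# Sorting the raw strings gives the same order as sorting real date objects.
sorted_as_strings = sorted(samples)
sorted_as_dates = [d.isoformat() for d in sorted(date.fromisoformat(s) for s in samples)]
same_order = sorted_as_strings == sorted_as_dates

# An inclusive range check on strings, like the one in `get_expense_entries`.
in_may = [s for s in samples if "2025-05-01" <= s <= "2025-05-31"]

print(same_order)
print(in_may)
```

The comparison relies on the month and day being zero-padded; a value like `2025-5-4` would not sort correctly as a string.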

> Normally, using this code would require building a separate application to handle user input (things like the amount, category, and date of each transaction) and display data (like the stored transactions).

Now, instead of doing this...

> But what if we could make this process much easier? What if you could simply send a picture of a bill to Gemma and have it automatically add the expense to your tracker? Or ask Gemma to summarize your spending and help you stay within a monthly budget?

...let's start doing this!

### Writing specifications to make our functions discoverable

For a model like Gemma to use our functions, it needs to know that they exist. More importantly, it needs to understand what each function does, what parameters it accepts, the possible outputs it produces, and the possible errors it throws. The model needs a 'manual' for each function.

For LLMs, this manual comes in the form of a function specification. It is a structured JSON object written in accordance with [this schema](schemas/function.schema.json) that tells the model everything it needs to know.

For our two expense tracker functions, the specifications look like this:

In [4]:
# (click to open) the specifications for our two functions.

specifications = {
    "make_expense_entry": {
        "name": "make_expense_entry",
        "description": "Add an entry for expenditure of a certain amount on the given date to the specified category of the user's personal accounting records.",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {
                    "type": "float",
                    "description": "The amount spent by the user.",
                },
                "category": {
                    "type": "string",
                    "enum": [
                        "food",
                        "entertainment",
                        "clothing",
                        "travel",
                        "utilities",
                        "other",
                    ],
                    "description": "The category the expense falls under.",
                },
                "date": {
                    "type": "string",
                    "format": "yyyy-mm-dd",
                    "description": "The date of the expense",
                },
                "notes": {
                    "type": "string",
                    "description": "A short note about the expense, that will remind the user of it if needed.",
                },
            },
            "required": ["amount", "category", "date"],
        },
        "responses": [
            {"type": "string", "description": "The ID of the created expense entry."}
        ],
        "errors": [
            {"name": "BadEntry", "description": "The category or date are invalid."}
        ],
    },
    "get_expense_entries": {
        "name": "get_expense_entries",
        "description": "Fetches expense entries from the user's personal accounting records, filtering by category and date. To fetch expense entries for multiple categories at the same time, call this function in parallel.",
        "parameters": {
            "type": "object",
            "properties": {
                "category": {
                    "type": "string",
                    "enum": [
                        "food",
                        "entertainment",
                        "clothing",
                        "travel",
                        "utilities",
                        "other",
                    ],
                    "description": "The category the expense falls under, if filtering by category.",
                },
                "from_date": {
                    "type": "string",
                    "format": "yyyy-mm-dd",
                    "description": "If filtering by date, this is the start of the search period (inclusive).",
                },
                "to_date": {
                    "type": "string",
                    "format": "yyyy-mm-dd",
                    "description": "If filtering by date, this is the end of the search period (inclusive).",
                },
            },
            "required": [],
        },
        "responses": [
            {
                "type": "object",
                "properties": {
                    "entries": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "amount": {
                                    "type": "float",
                                    "description": "The amount spent by the user.",
                                },
                                "category": {
                                    "type": "string",
                                    "enum": [
                                        "food",
                                        "entertainment",
                                        "clothing",
                                        "travel",
                                        "utilities",
                                        "other",
                                    ],
                                    "description": "The category the expense falls under.",
                                },
                                "notes": {
                                    "type": "string",
                                    "description": "A short note about the expense, that will remind the user of it if needed.",
                                },
                                "date": {
                                    "type": "string",
                                    "format": "yyyy-mm-dd",
                                    "description": "The date of the expense",
                                },
                                "created": {
                                    "type": "string",
                                    "format": "iso8601",
                                    "description": "The timestamp recording when the entry was made.",
                                },
                                "id": {
                                    "type": "string",
                                    "description": "The ID of the expense entry.",
                                },
                            },
                        },
                    },
                    "count": {
                        "type": "int",
                        "description": "The number of transactions returned.",
                    },
                    "total": {
                        "type": "float",
                        "description": "The sum total of the amount spent in the transactions returned.",
                    },
                },
            }
        ],
        "errors": [
            {
                "name": "BadEntry",
                "description": "The category, from date or to date are invalid.",
            },
            {
                "name": "NoResults",
                "description": "The search yielded no results with the given filters.",
            },
        ],
    },
}
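
Besides informing the model, a specification of this shape can also be used on the Python side, for example to sanity-check the parameters of a call before running it. The sketch below assumes a hypothetical helper named `check_parameters` (it is not part of the tutorial's toolkit) and only looks at the `required` list and any `enum` values.

```python
def check_parameters(spec: dict, parameters: dict) -> list[str]:
    """Return a list of problems found; an empty list means the call looks valid."""
    schema = spec["parameters"]
    properties = schema.get("properties", {})
    problems = []

    # Every required parameter must be present.
    for name in schema.get("required", []):
        if name not in parameters:
            problems.append(f"missing required parameter `{name}`")

    # Every provided parameter must be known, and respect its enum if it has one.
    for name, value in parameters.items():
        if name not in properties:
            problems.append(f"unknown parameter `{name}`")
            continue
        allowed = properties[name].get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"`{name}` must be one of {allowed}")

    return problems


# A trimmed-down specification in the same shape as the ones above.
example_spec = {
    "name": "make_expense_entry",
    "parameters": {
        "type": "object",
        "properties": {
            "amount": {"type": "float"},
            "category": {"type": "string", "enum": ["food", "travel", "other"]},
            "date": {"type": "string", "format": "yyyy-mm-dd"},
        },
        "required": ["amount", "category", "date"],
    },
}

good = check_parameters(example_spec, {"amount": 588.0, "category": "food", "date": "2025-05-04"})
bad = check_parameters(example_spec, {"amount": 12, "category": "snacks"})

print(good)
print(bad)
```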

Now we just need to 'register' our Python functions and their corresponding JSON 'manuals' with our toolkit. This lets our helper library know which functions are available to be called.

In [5]:
toolkit.functions.register(make_expense_entry, specifications["make_expense_entry"])
toolkit.functions.register(get_expense_entries, specifications["get_expense_entries"])
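
The toolkit's internals are not shown in this tutorial. As a rough mental model only, registration could be as simple as storing each callable and its specification in a dictionary keyed by name. The `registry` dictionary, this `register` function, and the `add` example below are all hypothetical stand-ins, not the toolkit's actual code.

```python
# Hypothetical registry: maps a function's name to its callable and its spec.
registry = {}


def register(function, specification: dict) -> None:
    """Store a callable and its specification under the name the model will use."""
    name = specification["name"]
    if name != function.__name__:
        raise ValueError(f"spec is named `{name}` but the function is `{function.__name__}`")
    registry[name] = {"function": function, "specification": specification}


def add(a: float, b: float) -> float:
    return a + b


register(add, {"name": "add", "description": "Add two numbers.", "parameters": {}})

# Look a function up by the name found in a function call, then invoke it
# with named parameters.
result = registry["add"]["function"](a=2, b=3)
print(result)
```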

To let the model know what tools it can use, we provide it with these specifications in a special format, wrapped in code blocks labeled `function_spec`. Our toolkit has a helper function that compiles all our registered function specs into the right format for the model:

In [6]:
functions = toolkit.functions.specify()
render_markdown(functions)

```json
{
  "name": "make_expense_entry",
  "description": "Add an entry for expenditure of a certain amount on the given date to the specified category of the user's personal accounting records.",
  "parameters": {
    "type": "object",
    "properties": {
      "amount": {
        "type": "float",
        "description": "The amount spent by the user."
      },
      "category": {
        "type": "string",
        "enum": [
          "food",
          "entertainment",
          "clothing",
          "travel",
          "utilities",
          "other"
        ],
        "description": "The category the expense falls under."
      },
      "date": {
        "type": "string",
        "format": "yyyy-mm-dd",
        "description": "The date of the expense"
      },
      "notes": {
        "type": "string",
        "description": "A short note about the expense, that will remind the user of it if needed."
      }
    },
    "required": [
      "amount",
      "category",
      "date"
    ]
  },
  "responses": [
    {
      "type": "string",
      "description": "The ID of the created expense entry."
    }
  ],
  "errors": [
    {
      "name": "BadEntry",
      "description": "The category or date are invalid."
    }
  ]
}
```

```json
{
  "name": "get_expense_entries",
  "description": "Fetches expense entries from the user's personal accounting records, filtering by category and date. To fetch expense entries for multiple categories at the same time, call this function in parallel.",
  "parameters": {
    "type": "object",
    "properties": {
      "category": {
        "type": "string",
        "enum": [
          "food",
          "entertainment",
          "clothing",
          "travel",
          "utilities",
          "other"
        ],
        "description": "The category the expense falls under, if filtering by category."
      },
      "from_date": {
        "type": "string",
        "format": "yyyy-mm-dd",
        "description": "If filtering by date, this is the start of the search period (inclusive)."
      },
      "to_date": {
        "type": "string",
        "format": "yyyy-mm-dd",
        "description": "If filtering by date, this is the end of the search period (inclusive)."
      }
    },
    "required": []
  },
  "responses": [
    {
      "type": "object",
      "properties": {
        "entries": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "amount": {
                "type": "float",
                "description": "The amount spent by the user."
              },
              "category": {
                "type": "string",
                "enum": [
                  "food",
                  "entertainment",
                  "clothing",
                  "travel",
                  "utilities",
                  "other"
                ],
                "description": "The category the expense falls under."
              },
              "notes": {
                "type": "string",
                "description": "A short note about the expense, that will remind the user of it if needed."
              },
              "date": {
                "type": "string",
                "format": "yyyy-mm-dd",
                "description": "The date of the expense"
              },
              "created": {
                "type": "string",
                "format": "iso8601",
                "description": "The timestamp recording when the entry was made."
              },
              "id": {
                "type": "string",
                "description": "The ID of the expense entry."
              }
            }
          }
        },
        "count": {
          "type": "int",
          "description": "The number of transactions returned."
        },
        "total": {
          "type": "float",
          "description": "The sum total of the amount spent in the transactions returned."
        }
      }
    }
  ],
  "errors": [
    {
      "name": "BadEntry",
      "description": "The category, from date or to date are invalid."
    },
    {
      "name": "NoResults",
      "description": "The search yielded no results with the given filters."
    }
  ]
}
```
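
For intuition, compiling specifications into labelled blocks like these takes very little code. The sketch below is an assumption about how it could be done, not the toolkit's implementation; the `specify` function here takes a list of specs directly, and the `ping` spec is invented for the example.

```python
import json

# Built programmatically so this example does not nest literal code fences.
FENCE = "`" * 3


def specify(specifications: list[dict]) -> str:
    """Render each specification as a code block labelled `function_spec`."""
    blocks = []
    for spec in specifications:
        body = json.dumps(spec, indent=2)
        blocks.append(f"{FENCE}function_spec\n{body}\n{FENCE}")
    return "\n\n".join(blocks)


text = specify([{"name": "ping", "description": "Check connectivity.", "parameters": {}}])
print(text)
```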


### Selecting and downloading a multimodal model

Next, we'll select and download a model (the 'brain' of our setup) using `ollama`. For function calling, the larger models work best, so we recommend the 27B parameter version of Gemma 3 (`gemma3:27b`).

In [7]:
model = "gemma3:27b"

This next command uses `ollama` to pull the selected model. The download is large (about 17 GB), so it might take a while.

In [8]:
! ollama pull {model}

pulling e796792eba26: 100% ▕██████████████████▏  17 GB     

### Instructing the model to use our functions

Gemma 3 is not specifically trained for function calling and [does not have a system prompt](https://ai.google.dev/gemma/docs/core/prompt-structure#system-instructions). However, providing instructions on what functions it can use and how to use them at the beginning of the conversation effectively enables function calling in Gemma. Think of this as the 'rulebook' or 'job description' for the assistant we want Gemma to be.

The instructions we provide for function calling might vary based on the model we are using. The prompt we're using has been carefully designed to produce the best results for the selected model.

In [9]:
prompt = toolkit.prompts.function_calling(model)
render_markdown(prompt)

You are a helpful assistant to me, the user.

You have access to programmatic functions that you can invoke to better assist me. These functions are described via function specifications - essentially, your manual to using them. Each specification contains the name of the function, a detailed description on what it does, the parameters it accepts, the possible responses and errors it produces, and examples of when to use the given function. A function specification is a JSON object, provided by me in a special code block labelled `function_spec`, like so:

```json
{
    "name": the name of the function,
    "description": a detailed description of what the function does,
    "parameters": a dictionary of arguments to pass to the function,
    "responses": a list of possible outputs the function can produce,
    "errors": a list of possible errors the function can raise,
    "examples": a list of examples of when calling the function is required
}
```

You can call these functions by including special code blocks labelled `function_call` in your responses, like so:

```json
{
	"id": a string used to pair the function call with its output,
	"function": the name of the function to call,
	"parameters": a dictionary of parameter names and values to pass to the function
}
```

I will execute the function and the result of the function call will be returned in a special code block labelled `function_output` in their response, like so:

```json
{
	"id": the same string provided when calling the function,
	"result": the output of the function - present only when the function execution succeeds,
	"error": the error thrown by the function - present only when the function raises an exception or throws an error
}
```

Please remember the following guidelines while responding.

If you need information that I haven't provided in the conversation uptil now, *ask me for it*. You must never, ever, guess or make assumptions. This is a critical safeguard that prevents hallucinations.

Before responding, briefly consider:

- what information do you need to fulfill the request?
- what functions, if any, can help?
- what are the potential dependencies between functions?

You are encouraged to combine subtasks and call functions in parallel by including multiple `function_call` blocks in a single response IF AND ONLY IF:

- the functions do not affect each other's execution,
- the result of one or more function call(s) is not needed for the others to run, and
- a function does not need to read the data that another function has just created, updated, or deleted.

Please note that the order in which functions execute is NOT guaranteed. If all the above conditions are not satisfied, do not call the dependent functions in parallel. It is generally safe to call the same function with different parameters in parallel when you need to fetch, update, create or delete multiple of the same kind of resources.

Ensure your final response fully addresses my request, and is helpful to me.

Please preface all your responses with a special code block labelled `thinking` that include your thoughts, observations and justification for actions that you are taking. Include the guidelines you have followed and not followed while producing your response as well.

This prompt looks long, but it's doing some very important things to ensure Gemma behaves predictably:

- It casts the model as a helpful assistant with access to programmatic functions, and explains what a function specification contains.
- It asks the model to consider, before responding, what information it needs, which functions can help, and how those functions depend on each other.
- It requires the model to explain its reasoning in a `thinking` block before every action. This is like asking someone to 'show their work' and is invaluable for understanding and debugging.
- It defines special code blocks (`function_spec`, `function_call`, `function_output`) so our Python code can easily parse the model's intent without getting it mixed up with the conversational text.
- It instructs the model to analyze tasks and run function calls in parallel, but only when the calls do not depend on each other.
- It includes a crucial rule: never assume or hallucinate information. If a detail is missing (like a budget amount), the model must stop and ask for it.
- It demands that the final answer be a friendly, natural-language summary, not just a raw dump of data from a function.
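
To see why the prompt insists that parallel calls be independent and carry an `id`, here is a minimal sketch of running two calls concurrently. The `get_total` function and the `totals` data are stand-ins invented for this example; the thread pool is just one possible way to run calls in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in data and function; in the tutorial this role is played by
# `get_expense_entries`.
totals = {"food": 690.5, "travel": 78, "clothing": 917}


def get_total(category: str) -> float:
    return totals[category]


# Two calls from a single model response that do not depend on each other.
calls = [
    {"id": "1", "function": "get_total", "parameters": {"category": "food"}},
    {"id": "2", "function": "get_total", "parameters": {"category": "travel"}},
]


def run(call: dict) -> dict:
    # The `id` travels with the result, so the order in which calls
    # finish does not matter.
    return {"id": call["id"], "result": get_total(**call["parameters"])}


with ThreadPoolExecutor() as pool:
    outputs = {output["id"]: output["result"] for output in pool.map(run, calls)}

print(outputs)
```

If the second call needed the result of the first, this approach would break, which is exactly the situation the prompt tells the model to avoid.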

### Bringing it all together

With our 'manuals' written, and our 'brain' downloaded and set up with our 'rulebook', we're ready to put it all together!

We can now engage in a back-and-forth conversation with Gemma to complete our tasks. First, it might be helpful to understand the flow of this conversation, which we'll call the user-model interaction loop:

```yaml
    1. user: gives an instruction or task to the model (e.g., log this expense).
    
    2. model: thinks about the task and decides if it is necessary to use a function.
       model: if yes, it produces a response with a 'function_call' code block.
    
    3. code: parses the model response to extract function calls.
       code: executes the actual Python function with the parameters the model provided.
       code: feeds the result of that function back to the model in a 'function_output' code block.
    
    4. model: jumps back to step 2, but thinks about the result instead of the task.
```

For a diagrammatic representation of this flow, take a look at [this flowchart](media/flowchart.svg).
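
The loop can also be sketched in code. The example below is a self-contained illustration under heavy assumptions: the model is replaced by a `fake_model` that returns canned replies, `record` stands in for a real function, and the parsing is a simple regular expression. None of these names come from the toolkit.

```python
import json
import re

FENCE = "`" * 3
CALL_PATTERN = re.compile(FENCE + r"function_call\n(.*?)\n" + FENCE, re.DOTALL)


def record(amount: float) -> str:
    """Stand-in for a real function such as `make_expense_entry`."""
    return f"recorded {amount}"


FUNCTIONS = {"record": record}

# Canned replies standing in for a real model: a function call, then a final answer.
SCRIPT = [
    FENCE + 'function_call\n{"id": "1", "function": "record", "parameters": {"amount": 588.0}}\n' + FENCE,
    "Done! I have recorded your expense of 588.0.",
]


def fake_model(messages: list) -> str:
    # Pick the next canned reply based on how many turns the model has taken.
    turns_taken = sum(1 for message in messages if message["role"] == "assistant")
    return SCRIPT[turns_taken]


def run_loop(task: str) -> list:
    messages = [{"role": "user", "content": task}]  # the user states the task
    while True:
        reply = fake_model(messages)  # the model thinks and responds
        messages.append({"role": "assistant", "content": reply})

        # The code extracts any function calls from the reply.
        calls = [json.loads(body) for body in CALL_PATTERN.findall(reply)]
        if not calls:
            return messages  # no calls means this was the final answer

        # The code executes each call and feeds the result back to the model.
        for call in calls:
            result = FUNCTIONS[call["function"]](**call["parameters"])
            output = json.dumps({"id": call["id"], "result": result})
            messages.append(
                {"role": "user", "content": f"{FENCE}function_output\n{output}\n{FENCE}"}
            )


history = run_loop("Log an expense of 588.")
print(history[-1]["content"])
```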

### Trying it out

<div style="display: flex; align-items: flex-start;">
    <div style="flex: 1;">
        Let's try this out. Say we just ate at a restaurant, and got a bill (shown on the right). Let's ask Gemma to add it as an expense, and check if this puts us over our monthly budget for food.<br><br>
        We'll give Gemma the image of the bill and our request. Notice how we are bundling everything into a list of messages: our prompt (the rulebook), the function specifications (the manuals), and then the task itself.
    </div>
    <div style="flex-shrink: 0;">
        <img src="media/bill.jpg" width="300" height="100" />
    </div>
</div>

In [10]:
task = "Add this bill to my expenses, and let me know if I'm going over my food budget of 1000 this month."

In [11]:
messages = [
    {"role": "user", "content": prompt},
    {"role": "user", "content": functions},
    {"role": "user", "content": task, "images": ["media/bill.jpg"]},
]
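
How the toolkit talks to Ollama is not shown here. As a rough sketch of what a message list like this could become if you sent it to Ollama's REST API yourself (`POST /api/chat`, served on `http://localhost:11434` by default), note that the API expects images as base64-encoded strings rather than file paths. The `build_chat_payload` helper below is hypothetical and only builds the request body; it does not contact the server.

```python
import base64
import json
import tempfile
from pathlib import Path


def build_chat_payload(model: str, messages: list) -> dict:
    """Build a request body for Ollama's /api/chat, encoding image files as base64."""
    prepared = []
    for message in messages:
        prepared_message = {"role": message["role"], "content": message["content"]}
        if "images" in message:
            prepared_message["images"] = [
                base64.b64encode(Path(path).read_bytes()).decode("ascii")
                for path in message["images"]
            ]
        prepared.append(prepared_message)
    return {"model": model, "messages": prepared, "stream": True}


# Demonstrate with a throwaway file rather than a real image.
with tempfile.TemporaryDirectory() as folder:
    image_path = Path(folder) / "bill.jpg"
    image_path.write_bytes(b"not really a jpeg")
    payload = build_chat_payload(
        "gemma3:27b",
        [{"role": "user", "content": "Add this bill.", "images": [str(image_path)]}],
    )

print(json.dumps(payload, indent=2))
```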

Our toolkit's `get_response` function handles sending this to the model. Let's see what the model's first response is.

In [12]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

<blockquote>
Okay, I need to add the bill details to the user's expenses and then check if their total food expenses this month are over 1000. I will first add the bill as a new expense entry, then fetch all food expenses for the current month, and finally compare the total to 1000.

I will use the `make_expense_entry` function to add the expense. The amount is 588.00, the category is "food", and the date is 2025-05-04.

After that, I will use `get_expense_entries` to retrieve all expenses for the "food" category for the current month (May 2025). I will need to determine the start and end dates for May 2025. Then, I will sum up the amounts from those entries and check if the total exceeds 1000.
</blockquote>



```json
{
	"id": "1",
	"function": "make_expense_entry",
	"parameters": {
		"amount": 588.00,
		"category": "food",
		"date": "2025-05-04",
		"notes": "Dinner bill"
	}
}
```

### Making sense of the model's response

The model's response contains two key parts - a `thinking` block, and a `function_call` block. The `thinking` block shows us its reasoning - it understood it needs to first add the expense, and then check the total. The `function_call` block is its chosen action - it contains the function to call, and the parameters to pass to it.

Notice how the model intelligently extracts parameters from the information available to it - it gets the amount, category, and date of the expense all from the bill we provided to it. It even assumes the current month is May, since the receipt is dated May 2025. 

Now, let's follow the user-model loop and move on to step 3 - parsing this response to get the function call.

In [13]:
calls = toolkit.functions.parse_calls(complete_response)
render_json(calls)

```json
[
  {
    "id": "1",
    "function": "make_expense_entry",
    "parameters": {
      "amount": 588.0,
      "category": "food",
      "date": "2025-05-04",
      "notes": "Dinner bill"
    }
  }
]
```
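
The toolkit's parser is not shown, but the idea behind it is straightforward: find every code block labelled `function_call` and decode its JSON body. The following is a minimal sketch under that assumption; this `parse_calls` is a hypothetical re-implementation, and silently skipping malformed JSON is a simplification.

```python
import json
import re

# Built programmatically so this example does not nest literal code fences.
FENCE = "`" * 3
CALL_BLOCK = re.compile(FENCE + r"function_call\s*\n(.*?)\n\s*" + FENCE, re.DOTALL)


def parse_calls(response: str) -> list:
    """Extract the JSON body of every `function_call` block in a response."""
    calls = []
    for body in CALL_BLOCK.findall(response):
        try:
            calls.append(json.loads(body))
        except json.JSONDecodeError:
            # Skip malformed blocks; a fuller implementation might report
            # them back to the model instead.
            continue
    return calls


response = "\n".join([
    FENCE + "thinking",
    "I will add the expense first.",
    FENCE,
    "",
    FENCE + "function_call",
    '{"id": "1", "function": "make_expense_entry", "parameters": {"amount": 588.0}}',
    FENCE,
])

calls = parse_calls(response)
print(calls)
```

Because the pattern only matches blocks labelled `function_call`, the `thinking` block is ignored.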


### Executing the function calls

We can now loop through these function calls and execute them one by one using a helper function from the toolkit. It identifies the function to execute from the list of registered functions, invokes the function with the given named parameters, and formats the function's output in JSON.

Since functions can be asynchronous, the results are passed via a callback. Our callback, the `respond` function, simply stores the results of the function call as a message from the user.

In [14]:
def respond(result: str) -> None:
    global messages
    messages.append({"role": "user", "content": result})


for call in calls:
    toolkit.functions.execute_call(call, respond)

Let's look at the result that was added to our messages. It's a `function_output` block containing the ID of the newly created expense.

In [15]:
render_markdown(messages[-1]["content"])

```json
{
    "id": "1",
    "result": "14fad560"
}
```

Notice that the output is wrapped in a JSON object, with the `id` being the same as the `id` received in the function call. This lets the model know which function call produced the output it is receiving.
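
To make the `result` and `error` fields concrete, here is a minimal sketch of what an execution helper could look like. It is an assumption for illustration, not the toolkit's code: the `find` function, the `FUNCTIONS` dictionary, and the format of the error string are all invented here.

```python
import json


class NoResults(Exception):
    pass


def find(category: str) -> dict:
    """Stand-in for a registered function; raises when nothing matches."""
    if category != "food":
        raise NoResults("The search yielded no results with the given filters.")
    return {"count": 2, "total": 690.5}


FUNCTIONS = {"find": find}


def execute_call(call: dict, callback) -> None:
    # Echo the call's `id` so the model can pair the output with its call.
    output = {"id": call["id"]}
    try:
        output["result"] = FUNCTIONS[call["function"]](**call["parameters"])
    except Exception as error:
        # Report the failure to the model instead of crashing the loop.
        output["error"] = f"{type(error).__name__}: {error}"
    callback(json.dumps(output, indent=4))


outputs = []
execute_call({"id": "1", "function": "find", "parameters": {"category": "food"}}, outputs.append)
execute_call({"id": "2", "function": "find", "parameters": {"category": "travel"}}, outputs.append)
print("\n".join(outputs))
```

Returning errors as data gives the model a chance to react, for example by retrying with different filters or asking the user for clarification.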

### Continuing the conversation

Now that we have the result of the first function call, we can provide it to the model. Let's send the updated conversation history back to Gemma and see what it does for its next step.

In [16]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

<blockquote>
The `make_expense_entry` function call was successful and returned the ID "14fad560". Now I need to fetch all food expenses for May 2025 to check if the budget has been exceeded. I'll call the `get_expense_entries` function with the category set to "food", the from_date set to 2025-05-01, and the to_date set to 2025-05-31.
</blockquote>



```json
{
	"id": "2",
	"function": "get_expense_entries",
	"parameters": {
		"category": "food",
		"from_date": "2025-05-01",
		"to_date": "2025-05-31"
	}
}
```

With this, we go back to the second step of the loop and again receive two blocks - one with the model's thoughts, and the other with a new function call. Notice how it acknowledges the output of the previous function - it knows the call was successful, and that the returned string is the ID of the expense entry. This is thanks to the detailed function specification ('manual') we gave it.

The model also indicates that it's now moving on to the second part of our request: checking the food budget. It has correctly decided to call the `get_expense_entries` function. Let's parse and execute this new call.

In [17]:
calls = toolkit.functions.parse_calls(complete_response)
for call in calls:
    toolkit.functions.execute_call(call, respond)

Let's inspect the result of this second function call:

In [18]:
render_markdown(messages[-1]["content"])

```json
{
    "id": "2",
    "result": {
        "transactions": [
            {
                "amount": 231.5,
                "category": "food",
                "date": "2025-05-19",
                "notes": null,
                "created": "2025-06-20T17:02:19.603926",
                "id": "ba6c9c0e"
            },
            {
                "amount": 459,
                "category": "food",
                "date": "2025-05-20",
                "notes": null,
                "created": "2025-06-20T17:02:19.604021",
                "id": "0be11f7b"
            },
            {
                "amount": 588.0,
                "category": "food",
                "date": "2025-05-04",
                "notes": "Dinner bill",
                "created": "2025-06-20T17:03:36.627404",
                "id": "14fad560"
            }
        ],
        "count": 3,
        "total": 1278.5
    }
}
```

The function returned a list of all food expenses for May 2025, including the one we just added, along with their count and total. Now for the final step: let's send this information to Gemma and get our final answer.

In [19]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

<blockquote>
I have successfully retrieved the food expenses for May 2025. The total amount spent on food is 1278.5. Since this amount is greater than 1000, the user has exceeded their food budget. I will inform them of this.
</blockquote>



Your total food expenses for May 2025 are 1278.5, which exceeds your budget of 1000.

Nice! The model's final thinking block shows that it has all the information it needs. It compares the total spending to our budget and generates a helpful natural-language response.
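
The model's arithmetic is easy to verify by hand. The short check below uses the three amounts from the `get_expense_entries` output and the budget of 1000 from the original request; the variable names are just for illustration.

```python
# Amounts copied from the `get_expense_entries` output above.
amounts = [231.5, 459, 588.0]
budget = 1000  # food budget stated in the user's request

total = sum(amounts)
overspend = total - budget

print(total)           # 1278.5
print(overspend)       # 278.5
print(total > budget)  # True
```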

Congratulations! You've just completed your first function-calling conversation with Gemma!

### Trying out parallel function calls

Let's give Gemma a slightly more complex task. We'll ask about two different budgets at once: clothing and travel. Since checking the clothing budget doesn't depend on the travel budget, the model should be smart enough to call the function for both categories in parallel.

In [20]:
task = "Oh no! Could you also check if I have gone above my clothes and travel budgets?"
messages.append({"role": "user", "content": task})

In [21]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

<blockquote>
Okay, the user wants me to check their spending against their clothing and travel budgets as well. I'll need to call the `get_expense_entries` function twice - once for "clothing" and once for "travel" - for the current month (May 2025). I'll compare the total for each category with their respective budgets (which I currently do not know, so I need to ask the user for these).

First, I'll ask the user for their monthly clothing and travel budgets.
</blockquote>



Could you please tell me your monthly budget for clothing and travel?

Ah, look at that! Instead of calling a function, Gemma responded by asking a question. This is perfect! We never told it our budgets for clothing and travel. Because of the "no guessing" rule in our prompt, it correctly identified that it was missing information and chose to ask for clarification. Let us respond with a clarification:

In [22]:
clarification = "My clothing budget is 500, and my travel budget is 2k."
messages.append({"role": "user", "content": clarification})

Now let's see what it does with this new information.

In [23]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

<blockquote>
Okay, I now have the user's clothing budget (500) and travel budget (2000). I will now call `get_expense_entries` for "clothing" and "travel" for May 2025 and compare the results with these budgets.
</blockquote>



```json
{
	"id": "3",
	"function": "get_expense_entries",
	"parameters": {
		"category": "clothing",
		"from_date": "2025-05-01",
		"to_date": "2025-05-31"
	}
}
```

```json
{
	"id": "4",
	"function": "get_expense_entries",
	"parameters": {
		"category": "travel",
		"from_date": "2025-05-01",
		"to_date": "2025-05-31"
	}
}
```

Exactly what we wanted! It generated two calls to the `get_expense_entries` function with different IDs - passing 'clothing' as the category in the first one, and 'travel' as the category in the second. This is parallel function calling in action!

Let's parse and execute both these function calls.

In [24]:
calls = toolkit.functions.parse_calls(complete_response)
for call in calls:
    toolkit.functions.execute_call(call, respond)

We can inspect the last two messages to see the results for our clothing and travel expenses.

In [25]:
render_markdown(messages[-2]["content"])

```json
{
    "id": "3",
    "result": {
        "transactions": [
            {
                "amount": 917,
                "category": "clothing",
                "date": "2025-05-20",
                "notes": null,
                "created": "2025-06-20T17:02:19.604100",
                "id": "1b0ec625"
            }
        ],
        "count": 1,
        "total": 917
    }
}
```

In [26]:
render_markdown(messages[-1]["content"])

```json
{
    "id": "4",
    "result": {
        "transactions": [
            {
                "amount": 78,
                "category": "travel",
                "date": "2025-05-19",
                "notes": null,
                "created": "2025-06-20T17:02:19.603784",
                "id": "0b7fe740"
            }
        ],
        "count": 1,
        "total": 78
    }
}
```
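
Because each output carries the `id` of the call that produced it, the order in which independent calls finish does not matter. The notebook executes them one after another, but as a rough sketch, they could also run concurrently with `asyncio`. Everything here is an assumption made for illustration: `fetch_total` is a hypothetical async stand-in for `get_expense_entries`, and `run_parallel` is not a toolkit function.

```python
import asyncio

# Hypothetical async stand-in for `get_expense_entries`; totals copied from the outputs above.
TOTALS = {"clothing": 917, "travel": 78}


async def fetch_total(category: str) -> int:
    await asyncio.sleep(0.01)  # simulate I/O such as a database query
    return TOTALS[category]


async def run_parallel(calls: list[dict]) -> dict[str, int]:
    # Start every call at once; gather returns results in the order of its arguments.
    results = await asyncio.gather(
        *(fetch_total(call["parameters"]["category"]) for call in calls)
    )
    # Key each result by the call's id so it can be matched to its request.
    return {call["id"]: result for call, result in zip(calls, results)}


calls = [
    {"id": "3", "parameters": {"category": "clothing"}},
    {"id": "4", "parameters": {"category": "travel"}},
]
results_by_id = asyncio.run(run_parallel(calls))
print(results_by_id)
```

Inside a Jupyter notebook an event loop is already running, so `asyncio.run` raises an error there; use `await run_parallel(calls)` in a cell instead.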

Finally, let's send both of these results back to the model to get its final summary.

In [27]:
streaming_response = toolkit.model.get_response(model, messages)
complete_response = render_live(streaming_response)
messages.append({"role": "assistant", "content": complete_response})

<blockquote>
I have retrieved the clothing and travel expenses for May 2025. The clothing expense is 917, and the travel expense is 78. Now, I will compare these amounts with the user's budgets (500 for clothing and 2000 for travel) and inform them.
</blockquote>



Here's a summary of your budget performance for May 2025:

*   **Food:** You spent 1278.5, exceeding your 1000 budget.
*   **Clothing:** You spent 917, exceeding your 500 budget.
*   **Travel:** You spent 78, which is well within your 2000 budget.

And there we have it! The model correctly processed both function outputs and provided a final, user-friendly summary comparing our spending in each category to the budgets we provided.
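
Every exchange in this section followed the same cycle: get a response, parse it for function calls, execute them, and repeat until the model replies without a call. That cycle could be folded into a single loop. The sketch below is one possible shape, not part of the toolkit: `run_conversation` is a hypothetical helper, and the model, parser, and executor are passed in as plain callables (here replaced by scripted stand-ins so the example runs without a model).

```python
from typing import Callable

Message = dict[str, str]


def run_conversation(
    messages: list[Message],
    get_reply: Callable[[list[Message]], str],
    parse: Callable[[str], list[dict]],
    execute: Callable[[dict], str],
    max_turns: int = 10,
) -> str:
    """Repeat the respond-parse-execute cycle until the model stops calling functions."""
    for _ in range(max_turns):
        reply = get_reply(messages)
        messages.append({"role": "assistant", "content": reply})
        calls = parse(reply)
        if not calls:
            return reply  # a final answer or a clarifying question for the user
        for call in calls:
            messages.append({"role": "user", "content": execute(call)})
    raise RuntimeError("model kept calling functions past max_turns")


# Scripted stand-ins so the sketch runs without a model.
script = iter(["CALL get_expense_entries", "You spent 1278.5 on food."])
history: list[Message] = [{"role": "user", "content": "How much did I spend on food?"}]
final = run_conversation(
    history,
    get_reply=lambda msgs: next(script),
    parse=lambda text: [{"id": "1"}] if text.startswith("CALL") else [],
    execute=lambda call: '{"id": "1", "result": {"total": 1278.5}}',
)
print(final)
```

The `max_turns` cap is a safety net against a model that never stops calling functions. A reply with no calls ends the loop whether it is a final answer or a clarifying question, like the one Gemma asked about the clothing and travel budgets.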

### Conclusion and next steps

This brings us to the end of the tutorial. Congratulations! You've successfully built a multimodal, function-calling agent that runs entirely on your own machine.

Feel free to play around with the code in this notebook, try different scenarios, and see what you can build. If you have any questions or want to share what you've built, please start a discussion [here](https://github.com/gamemaker1/gemma3-function-calling-tutorial/discussions/new/choose)!