# Build with AI
## GDG Antananarivo
### Authors: Tommy & Mickaël


# Multimodal LiveAPI: Tool use

For more information, see the [Live API documentation](https://ai.google.dev/gemini-api/docs/live#pyhon).

## Install SDK
More details about this new SDK are available in the [documentation](https://ai.google.dev/gemini-api/docs/sdks) or in the [Getting started](../quickstarts/Get_started.ipynb) notebook.

In [1]:
pip install -U -q google-genai


## Set up your API key

In [2]:
from google.colab import userdata
import os

os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')

## Initialize SDK client
The client will pick up your API key from the `GOOGLE_API_KEY` environment variable.

In [3]:
from google import genai

client = genai.Client()
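Outside Colab, `userdata` is not available, so the key has to reach the environment some other way. A minimal sketch, assuming a hypothetical helper named `require_api_key` (not part of the SDK), that fails early with a readable message when the variable is missing:

```python
import os


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error.

    `require_api_key` is a hypothetical helper, not part of the SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before creating the client."
        )
    return key
```

The returned value could then be passed explicitly, for example `genai.Client(api_key=require_api_key())`.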

## Select a model
You can see more Gemini models [here](https://ai.google.dev/gemini-api/docs/models#live-api)

In [4]:
#MODEL = "gemini-2.0-flash-exp"
MODEL = "gemini-2.0-flash-live-001"

### Import all necessary modules

In [5]:
import asyncio
import contextlib
import json
import wave

from IPython import display

from google import genai
from google.genai import types

### Utilities

You're going to use the Live API's audio output. The easiest way to hear it in Colab is to write the `PCM` data out as a `WAV` file:

In [7]:
@contextlib.contextmanager
def wave_file(filename, channels=1, rate=24000, sample_width=2):
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        yield wf
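As a quick offline check of this helper, the sketch below writes one second of a synthetic 440 Hz tone with the same settings (mono, 24 kHz, 16-bit samples) and reads the header back. The tone, the `sine_pcm` function, and the file name `tone_check.wav` are illustrative assumptions; the Live API is not involved.

```python
import contextlib
import math
import os
import struct
import tempfile
import wave


@contextlib.contextmanager
def wave_file(filename, channels=1, rate=24000, sample_width=2):
    # Same helper as above, repeated so this block runs on its own.
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        yield wf


def sine_pcm(freq_hz=440.0, seconds=1.0, rate=24000, amplitude=0.3):
    """Return little-endian signed 16-bit PCM bytes for a sine tone."""
    n_samples = int(seconds * rate)
    samples = (
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
        for i in range(n_samples)
    )
    return b"".join(struct.pack("<h", s) for s in samples)


# Write to the temp directory so the check leaves the notebook folder alone.
path = os.path.join(tempfile.gettempdir(), "tone_check.wav")
with wave_file(path) as wf:
    wf.writeframes(sine_pcm())

# Read the header back: channels, bytes per sample, sample rate, frame count.
with wave.open(path, "rb") as check:
    info = (check.getnchannels(), check.getsampwidth(),
            check.getframerate(), check.getnframes())

print(info)  # (1, 2, 24000, 24000)
```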

In [8]:
import logging
logger = logging.getLogger('Live')
#logger.setLevel('DEBUG')  # Switch between "INFO" and "DEBUG" to toggle debug messages.
logger.setLevel('INFO')

### Get started

In [27]:
n = 0
async def run(prompt, modality="TEXT", tools=None):
  global n
  if tools is None:
    tools=[]

  config = {
          "tools": tools,
          "response_modalities": [modality]
  }

  async with client.aio.live.connect(model=MODEL, config=config) as session:
    display.display(display.Markdown(prompt))
    display.display(display.Markdown('-------------------------------'))
    await session.send_client_content(
      turns={"role": "user", "parts": [{"text": prompt}]}, turn_complete=True
    )

    audio = False
    filename = f'audio_{n}.wav'
    with wave_file(filename) as wf:
      async for response in session.receive():
        logger.debug(str(response))
        if text:=response.text:
          display.display(display.Markdown(text))
          continue

        if data:=response.data:
          print('.', end='')
          wf.writeframes(data)
          audio = True
          continue

        server_content = response.server_content
        if server_content is not None:
          handle_server_content(wf, server_content)
          continue

        tool_call = response.tool_call
        if tool_call is not None:
          await handle_tool_call(session, tool_call)


  if audio:
    display.display(display.Audio(filename, autoplay=True))
    n = n+1

Since this tutorial demonstrates several tools, you'll need more code to handle the different types of objects the API returns.

- The `function_declarations` tool is described in the [function calling documentation](https://ai.google.dev/gemini-api/docs/function-calling?example=meeting).
- The `code_execution` tool can return `executable_code` and `code_execution_result` parts.
- The `google_search` tool may attach a `grounding_metadata` object.

In [20]:
def handle_server_content(wf, server_content):
  model_turn = server_content.model_turn
  if model_turn:
    for part in model_turn.parts:
      executable_code = part.executable_code
      if executable_code is not None:
        print("Executable code")
        display.display(display.Markdown('-------------------------------'))
        display.display(display.Markdown(f'``` python\n{executable_code.code}\n```'))
        display.display(display.Markdown('-------------------------------'))

      code_execution_result = part.code_execution_result
      if code_execution_result is not None:
        print("Code execution result")
        display.display(display.Markdown('-------------------------------'))
        display.display(display.Markdown(f'```\n{code_execution_result.output}\n```'))
        display.display(display.Markdown('-------------------------------'))

  grounding_metadata = getattr(server_content, 'grounding_metadata', None)
  if grounding_metadata is not None:
    display.display(
        display.HTML(grounding_metadata.search_entry_point.rendered_content))

  return

- Finally, with the `function_declarations` tool, the API may return `tool_call` objects. To keep this code minimal, the `tool_call` handler just replies to every function call with a response of `"ok"`.

In [14]:
async def handle_tool_call(session, tool_call):
  function_responses = []
  for fc in tool_call.function_calls:
    function_response = types.FunctionResponse(
        id=fc.id,
        name=fc.name,
        response={"result": "ok"},
    )
    function_responses.append(function_response)
  print('\n>>> ', function_responses)
  await session.send_tool_response(function_responses=function_responses)
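In a real application the handler would run something instead of always answering `"ok"`. A minimal sketch of that dispatch step, kept offline: the `FakeCall` stand-in, the `LIGHTS` state, the `REGISTRY` mapping, and `build_responses` are hypothetical names for illustration, and the plain dicts mirror the `id`, `name`, and `response` fields passed to `types.FunctionResponse` above. It also assumes each function call exposes its arguments as `args`.

```python
from dataclasses import dataclass, field

LIGHTS = {"on": False}  # Hypothetical device state for the demo.


def turn_on_the_lights():
    LIGHTS["on"] = True
    return {"lights": "on"}


def turn_off_the_lights():
    LIGHTS["on"] = False
    return {"lights": "off"}


# Map declared function names to local Python callables.
REGISTRY = {
    "turn_on_the_lights": turn_on_the_lights,
    "turn_off_the_lights": turn_off_the_lights,
}


@dataclass
class FakeCall:
    """Stand-in with the fields the handler reads from a function call."""
    id: str
    name: str
    args: dict = field(default_factory=dict)


def build_responses(function_calls, registry=REGISTRY):
    """Run each requested function and collect one response per call."""
    responses = []
    for fc in function_calls:
        func = registry.get(fc.name)
        if func is None:
            # Report unknown names back instead of raising.
            result = {"error": f"unknown function: {fc.name}"}
        else:
            result = func(**(fc.args or {}))
        responses.append({"id": fc.id, "name": fc.name, "response": result})
    return responses


calls = [FakeCall("call-1", "turn_on_the_lights"),
         FakeCall("call-2", "dim_the_lights")]
for r in build_responses(calls):
    print(r)
```

Inside `handle_tool_call`, each dict could be turned into `types.FunctionResponse(**r)` before calling `session.send_tool_response`.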

Try running it for the first time:

In [15]:
await run(prompt="Hello?", tools=None, modality = "TEXT")

Hello?

-------------------------------

--------- text response --------------


Hello! How can

--------- text response --------------


 I help you today?
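The reply above arrives as two separate text chunks. If the full sentence is needed in one piece, the chunks can be collected and joined once the turn is over. A small sketch, where `join_chunks` is a hypothetical helper and the chunk list is copied from the output above:

```python
def join_chunks(chunks):
    """Concatenate streamed text chunks, skipping empty or missing ones."""
    return "".join(chunk for chunk in chunks if chunk)


chunks = ["Hello! How can", " I help you today?"]
print(join_chunks(chunks))  # Hello! How can I help you today?
```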


## Simple function call

Note that in the Live API, function calls are independent of the chat turns: the conversation can continue while a function call is being processed.
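The idea in this note can be pictured with plain `asyncio`, without touching the API. The sketch below is a simulation under stated assumptions, not the SDK's internals: `slow_tool` and `simulate_turn` are hypothetical names, and the text chunks are made up. A tool runs as a background task while text events keep being recorded.

```python
import asyncio


async def slow_tool(name, delay_s):
    """Pretend to run a tool that takes `delay_s` seconds."""
    await asyncio.sleep(delay_s)
    return {"name": name, "response": {"result": "ok"}}


async def simulate_turn():
    """Interleave streamed text with a tool call running in the background."""
    events = []
    # Start the tool without awaiting it, so the loop below is not blocked.
    task = asyncio.create_task(slow_tool("turn_on_the_lights", 0.05))
    for chunk in ["One moment, ", "switching them on."]:
        events.append(("text", chunk))
        await asyncio.sleep(0)  # Yield control, as a receive loop would.
    # Only now wait for the tool and record its response.
    result = await task
    events.append(("tool_response", result["name"]))
    return events
```

In a notebook, `await simulate_turn()` returns the two text events followed by the tool response; in a plain script, wrap the call in `asyncio.run(...)`.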

In [18]:
turn_on_the_lights = {'name': 'turn_on_the_lights'}
turn_off_the_lights = {'name': 'turn_off_the_lights'}
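These two declarations carry only a name. A declaration can also describe its arguments. The sketch below adds a hypothetical `set_light_brightness` function with one integer parameter, written as a plain dict in the OpenAPI-style schema that the function calling documentation describes. The function name, the description text, and the parameter are assumptions for illustration.

```python
import json

# Hypothetical declaration; not part of the original notebook.
set_light_brightness = {
    "name": "set_light_brightness",
    "description": "Set the brightness of the lights as a percentage.",
    "parameters": {
        "type": "object",
        "properties": {
            "brightness": {
                "type": "integer",
                "description": "Brightness from 0 (off) to 100 (full).",
            },
        },
        "required": ["brightness"],
    },
}

tools_with_params = [{"function_declarations": [set_light_brightness]}]

# The declaration is plain data, so it can be inspected or logged as JSON.
print(json.dumps(tools_with_params, indent=2))
```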

In [21]:
prompt = "Turn on the lights"

tools = [
    {'function_declarations': [turn_on_the_lights, turn_off_the_lights]}
]

await run(prompt, tools=tools, modality = "TEXT")

Turn on the lights

-------------------------------



Executable code


-------------------------------

``` python
print(default_api.turn_on_the_lights())

```

-------------------------------




>>>  [FunctionResponse(will_continue=None, scheduling=None, id='function-call-17523228502071487970', name='turn_on_the_lights', response={'result': 'ok'})]
Code execution result


-------------------------------

```
{'result': 'ok'}

```

-------------------------------

--------- text response --------------


OK

--------- text response --------------


. I've turned on the lights.


### Code execution

The `code_execution` tool lets the model write and run Python code.

In [22]:
prompt="Can you compute the largest prime palindrome under 100000."

tools = [
    {'code_execution': {}}
]

await run(prompt, tools=tools, modality="TEXT")

Can you compute the largest prime palindrome under 100000.

-------------------------------

--------- text response --------------


Okay

--------- text response --------------


, I can help you with that. Here's my plan:

1

--------- text response --------------


.  **Generate Palindromes:** Create a list of all palindromes under 100000.
2.  **Check for Primality:**

--------- text response --------------


 Iterate through the palindromes and check if each one is prime.
3.  **Find the Largest:** Keep track of the largest prime palindrome found so far.



--------- text response --------------


Here's the code to do that:





Executable code


-------------------------------

``` python
def is_palindrome(n):
  """Checks if a number is a palindrome."""
  return str(n) == str(n)[::-1]


def is_prime(n):
  """Checks if a number is prime."""
  if n < 2:
    return False
  for i in range(2, int(n**0.5) + 1):
    if n % i == 0:
      return False
  return True


largest_prime_palindrome = 0
for i in range(100000):
  if is_palindrome(i) and is_prime(i):
    largest_prime_palindrome = i

print(largest_prime_palindrome)

```

-------------------------------



Code execution result


-------------------------------

```
98689

```

-------------------------------

--------- text response --------------


The largest prime palindrome

--------- text response --------------


 under 100000 is 98689.
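Because the model both writes and runs the code, it can be worth cross-checking the answer locally. A minimal sketch that searches downward from the limit, so the first hit is the largest one; `largest_prime_palindrome_below` is a hypothetical name, and this is an independent check rather than the code the model produced:

```python
def is_palindrome(n):
    """Return True if the decimal digits of n read the same both ways."""
    text = str(n)
    return text == text[::-1]


def is_prime(n):
    """Trial division by 2 and then by odd numbers up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def largest_prime_palindrome_below(limit):
    """Search downward so the first hit is the largest one."""
    for candidate in range(limit - 1, 1, -1):
        if is_palindrome(candidate) and is_prime(candidate):
            return candidate
    return None


print(largest_prime_palindrome_below(100000))  # 98689
```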


### Compositional Function Calling

Compositional function calling refers to the ability to combine user-defined functions with the `code_execution` tool.

In [24]:
prompt="Can you turn on the lights wait 10s and then turn them off?"

tools = [
    {'code_execution': {}},
    {'function_declarations': [turn_on_the_lights, turn_off_the_lights]}
]

await run(prompt, tools=tools, modality="TEXT")

Can you turn on the lights wait 10s and then turn them off?

-------------------------------



Executable code


-------------------------------

``` python
import time

default_api.turn_on_the_lights()
time.sleep(10)
default_api.turn_off_the_lights()

```

-------------------------------


>>>  [FunctionResponse(will_continue=None, scheduling=None, id='function-call-11501730627214656751', name='turn_on_the_lights', response={'result': 'ok'})]

>>>  [FunctionResponse(will_continue=None, scheduling=None, id='function-call-14939870685884538964', name='turn_off_the_lights', response={'result': 'ok'})]
--------- text response --------------


OK

--------- text response --------------


. I have turned on the lights, waited 10 seconds, and then turned them

--------- text response --------------


 off.


## Google search

The `google_search` tool lets the model run Google searches. For example, try asking it about events that are too recent to be in the training data.

In [26]:
prompt = "When is the latest time that people from Madagascar were happy?"

tools = [
   {'google_search': {}}
]

await run(prompt, tools=tools, modality="TEXT")

When is the latest time that people from Madagascar were happy?

-------------------------------



Executable code


-------------------------------

``` python
print(google_search.search(queries=["Madagascar happiness index recent years", "recent news reports Madagascar positive", "Madagascar quality of life"]))

```

-------------------------------



Code execution result


-------------------------------

```
Looking up information on Google Search.

```

-------------------------------

--------- text response --------------


Determining a

--------- text response --------------


 specific moment when "people from Madagascar were happy" is challenging because happiness is subjective

--------- text response --------------


 and varies among individuals. However, I can provide information based on available reports and data:

*   **Happiness Index:** According to the 2024 World

--------- text response --------------


 Happiness Report (published on March 20, 2025), Madagascar's happiness index has remained stable at around 4 out of 10 since

--------- text response --------------


 2021. In 2021, the happiness index reached its maximum value of 4.34 points. While this is below the global

--------- text response --------------


 average, it represents a relatively recent period where the index was at its highest in the 2013-2024 period.
*   **Af

--------- text response --------------


robarometer Survey (November 2024):** A survey indicated that a majority of respondents felt the country's economic situation and their living conditions had worsened compared

--------- text response --------------


 to the previous year.
*   **Legatum Prosperity Index 2023:** Madagascar ranked 137th overall. It performs most strongly in Natural

--------- text response --------------


 Environment and Personal Freedom but is weakest in Living Conditions.

