# Build with AI
## GDG Antananarivo
### Authors: Tommy & Mickaël


# Getting started with the Live API

For more information about the Live API, see the [documentation](https://ai.google.dev/gemini-api/docs/live#pyhon).

## Install SDK
More details about this SDK are available in the [documentation](https://ai.google.dev/gemini-api/docs/sdks) or in the [Getting started](../quickstarts/Get_started.ipynb) notebook.

In [1]:
%pip install -U -q google-genai


## Set up your API key

In [2]:
from google.colab import userdata
import os

os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')
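Outside Colab, `google.colab.userdata` is not available. The following is a minimal sketch of one possible fallback, assuming the key may already be exported as `GOOGLE_API_KEY`; the helper name `resolve_api_key` and its parameters are hypothetical and not part of any SDK.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, prompt=getpass):
    """Return GOOGLE_API_KEY from `env`, prompting only when it is unset."""
    key = env.get("GOOGLE_API_KEY")
    if not key:
        # Hypothetical fallback: ask interactively instead of hard-coding the key.
        key = prompt("GOOGLE_API_KEY: ")
    return key


# Demonstration with stand-in values, so no real secret or prompt is needed.
demo_key = resolve_api_key(env={"GOOGLE_API_KEY": "demo-key"})
prompted_key = resolve_api_key(env={}, prompt=lambda _: "typed-key")
print(demo_key, prompted_key)  # demo-key typed-key
```

In a local script you could then set `os.environ["GOOGLE_API_KEY"] = resolve_api_key()` before creating the client.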

## Initialize SDK client
The client picks up your API key from the `GOOGLE_API_KEY` environment variable.

In [3]:
from google import genai

client = genai.Client()

## Select a model
You can see more Gemini models [here](https://ai.google.dev/gemini-api/docs/models#live-api)

In [32]:
#MODEL = "gemini-2.0-flash-exp"
MODEL = "gemini-2.0-flash-live-001"

### Import all necessary modules

In [33]:
import asyncio
import base64
import contextlib
import datetime
import os
import json
import wave
import itertools

from IPython.display import display, Audio
from google import genai
from google.genai import types

# Text to Text
The simplest way to use the Live API is as a text-to-text chat interface, but it can do a lot more than this.

In [34]:
# Ask the model to answer with text only.
config = {"response_modalities": ["TEXT"]}

async with client.aio.live.connect(model=MODEL, config=config) as session:
  message = "Hello? Gemini are you there?"
  print(">", message, "\n")
  await session.send_client_content(
      turns={"role": "user", "parts": [{"text": message}]},
      turn_complete=True
  )

  # For text responses, the loop ends when the model's turn is complete.
  turn = session.receive()
  async for chunk in turn:
    if chunk.text is not None:
      print(f"- {chunk.text}")

> Hello? Gemini are you there? 

- Yes, I am
-  here! How can I help you today?
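The reply above arrives in several chunks. If you want the whole answer as one string, the chunks can be joined once the turn ends. This is a small offline sketch: `fake_turn` is a hypothetical stand-in for `session.receive()` that only mimics the `.text` attribute, so it runs without an API key.

```python
import asyncio
from types import SimpleNamespace


async def fake_turn():
    """Hypothetical stand-in for session.receive(); yields objects with .text."""
    for text in ["Yes, I am", " here! How can I help you today?", None]:
        yield SimpleNamespace(text=text)


async def collect_text(turn):
    """Concatenate the non-empty text chunks of one turn into a single reply."""
    parts = []
    async for chunk in turn:
        if chunk.text is not None:
            parts.append(chunk.text)
    return "".join(parts)


# In a notebook, use `reply = await collect_text(fake_turn())` instead.
reply = asyncio.run(collect_text(fake_turn()))
print(reply)  # Yes, I am here! How can I help you today?
```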



## Simple text to audio
The simplest way to play back the audio in Colab is to write it out to a `.wav` file. Here is a simple wave file writer:

In [36]:
# The following explanation is from ChatGPT (Thanks ChatGPT ^-^)

# This defines a function that returns a context manager for creating a .wav audio file.
# Arguments:
#   filename: The name of the output .wav file.
#   channels: Number of audio channels (default is 1, i.e., mono).
#   rate: Sampling rate in Hz (default 24000).
#   sample_width: Number of bytes per sample (default 2 → 16-bit audio).

@contextlib.contextmanager
def wave_file(filename, channels=1, rate=24000, sample_width=2):
  with wave.open(filename, "wb") as wf:
    wf.setnchannels(channels)
    wf.setsampwidth(sample_width)
    wf.setframerate(rate)
    yield wf
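To check the writer without calling the API, you can feed it synthetic audio. The sketch below repeats the `wave_file` helper so that it is self-contained, writes half a second of a 440 Hz tone, and reads the file back. The tone, the amplitude, and the temporary file name are illustrative choices, not part of the tutorial.

```python
import contextlib
import math
import os
import struct
import tempfile
import wave


@contextlib.contextmanager
def wave_file(filename, channels=1, rate=24000, sample_width=2):
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        yield wf


RATE = 24000
# Half a second of a 440 Hz sine wave as signed 16-bit little-endian samples.
samples = [int(12000 * math.sin(2 * math.pi * 440 * i / RATE))
           for i in range(RATE // 2)]
pcm = struct.pack(f"<{len(samples)}h", *samples)

path = os.path.join(tempfile.gettempdir(), "tone_check.wav")
with wave_file(path) as wav:
    wav.writeframes(pcm)

# Read the header back to confirm what was written.
with wave.open(path, "rb") as check:
    n_frames = check.getnframes()
    duration = n_frames / check.getframerate()

print(n_frames, duration)  # 12000 0.5
```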

The next step is to tell the model to return audio by setting `"response_modalities": ["AUDIO"]` in the `LiveConnectConfig`.  

As response chunks arrive from the model, write their audio data to the `.wav` file.

In [37]:
# Ask the model to answer with audio only.
config = {"response_modalities": ["AUDIO"]}

async def async_enumerate(aiterable):
  """Async counterpart of enumerate(): yields (index, item) pairs."""
  n = 0
  async for item in aiterable:
    yield n, item
    n += 1

async with client.aio.live.connect(model=MODEL, config=config) as session:
  file_name = "audio.wav"
  with wave_file(file_name) as wav:
    message = "Hello? Gemini are you there?"
    print(">", message, "\n")
    await session.send_client_content(
        turns={"role": "user", "parts": [{"text": message}]}, turn_complete=True
    )
    turn = session.receive()
    async for n, response in async_enumerate(turn):
      if response.data is not None:
        wav.writeframes(response.data)

        if n == 0:
          print(response.server_content.model_turn.parts[0].inline_data.mime_type)
        print(".", end=" ")

  display(Audio(file_name, autoplay=True))

> Hello? Gemini are you there? 

audio/pcm;rate=24000
. . . . . . . . . . . . . . . . . 
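The MIME type printed above carries the sample rate, which matches the default `rate=24000` of `wave_file`. As a rough sketch, the string can be parsed instead of hard-coding the rate. The helper names `parse_pcm_mime` and `pcm_duration_seconds` are hypothetical, and mono 16-bit samples are an assumption taken from the `wave_file` defaults.

```python
def parse_pcm_mime(mime_type):
    """Split a MIME string such as 'audio/pcm;rate=24000' into type and parameters."""
    media_type, *params = [part.strip() for part in mime_type.split(";")]
    options = dict(param.split("=", 1) for param in params if "=" in param)
    return media_type, options


def pcm_duration_seconds(num_bytes, rate, channels=1, sample_width=2):
    """Duration of raw PCM audio, given its size in bytes and its layout."""
    return num_bytes / (rate * channels * sample_width)


media_type, options = parse_pcm_mime("audio/pcm;rate=24000")
rate = int(options["rate"])
print(media_type, rate)                   # audio/pcm 24000
print(pcm_duration_seconds(96000, rate))  # 2.0
```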

## Towards Async Tasks

The real power of the Live API is that it is real-time and interruptible. You can't get that full power in a simple sequence of steps. To really use the functionality, move the `send` and `receive` operations (and others) into their own [async tasks](https://docs.python.org/3/library/asyncio-task.html).

Because of the limitations of Colab, this tutorial doesn't fully implement the interactive async tasks, but it does implement the next step in that direction:

- It separates `send` and `receive` into their own methods, but still runs them sequentially.
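For comparison, here is a minimal offline sketch of what fully concurrent send and receive tasks could look like. `FakeSession` is a hypothetical stand-in that echoes messages through an `asyncio.Queue`; it is not the Live API session object, and its method names are assumptions made for illustration only.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for a live session; it echoes each message back."""

    def __init__(self):
        self._outbox = asyncio.Queue()

    async def send(self, text):
        await self._outbox.put(f"echo: {text}")

    async def receive(self):
        # Yield replies until the None sentinel signals that the session closed.
        while True:
            item = await self._outbox.get()
            if item is None:
                break
            yield item

    async def close(self):
        await self._outbox.put(None)


async def sender(session, messages):
    for text in messages:
        await session.send(text)
        await asyncio.sleep(0)  # yield control so the receiver can run
    await session.close()


async def receiver(session, received):
    async for reply in session.receive():
        received.append(reply)


async def main():
    session = FakeSession()
    received = []
    # Each coroutine becomes its own task; they interleave on the event loop.
    send_task = asyncio.create_task(sender(session, ["hello", "are you there?"]))
    recv_task = asyncio.create_task(receiver(session, received))
    await asyncio.gather(send_task, recv_task)
    return received


# In a notebook, use `replies = await main()` instead.
replies = asyncio.run(main())
print(replies)  # ['echo: hello', 'echo: are you there?']
```

Because neither task waits for the other to finish a whole turn, the sender could in principle push a new message while a reply is still streaming, which is the kind of interruption the paragraph above describes.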


In [None]:
import logging

logger = logging.getLogger('Live')
logger.setLevel('INFO')

In [None]:
class AudioLoop:
    def __init__(self, config=None):
        self.session = None
        self.index = 0
        self.queue = asyncio.Queue()
        self.can_ask = asyncio.Event()
        self.config = config or {
            "response_modalities": ["AUDIO"]
        }

    async def run(self):
        async with client.aio.live.connect(model=MODEL, config=self.config) as session:
            self.session = session
            self.can_ask.set()  # allow first prompt
            await asyncio.gather(
                self.send(),
                self.recv()
            )

    async def send(self):
        print("Type 'q' to quit")
        while True:
            await self.can_ask.wait()  # wait until audio finishes
            text = await asyncio.to_thread(input, "message > ")
            if text.lower() == 'q':
                await self.queue.put(None)
                break
            self.can_ask.clear()  # stop further inputs until audio finishes
            await self.session.send_client_content(
                turns={"role": "user", "parts": [{"text": text}]},
                turn_complete=True
            )
            await self.queue.put(self.index)
            self.index += 1

    async def recv(self):
        while True:
            index = await self.queue.get()
            if index is None:
                break

            file_name = f"audio_{index}.wav"
            with wave_file(file_name) as wav:
                turn = self.session.receive()
                async for n, response in async_enumerate(turn):
                    if response.data:
                        wav.writeframes(response.data)
                        if n == 0:
                            print(response.server_content.model_turn.parts[0].inline_data.mime_type)
                        print('.', end='', flush=True)
                print('\n<Turn complete>')

            display(Audio(file_name, autoplay=True))
            await asyncio.sleep(2)  # brief pause before prompting again
            self.can_ask.set()  # allow the next prompt

In [None]:
await AudioLoop().run()