# Gemini API: Audio Quickstart

This notebook shows how to prompt a Gemini 1.5 model with an audio file. The example uses a [sound recording](https://www.jfklibrary.org/asset-viewer/archives/jfkwha-006) of President John F. Kennedy’s 1961 State of the Union address.

In [1]:
!pip install -q -U google-generativeai


In [2]:
import google.generativeai as genai

## Configure your API key

To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example.

In [7]:
import os
from google.colab import userdata

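# Read the Colab Secret named GOOGLE_API_KEY and expose it as an environment
# variable, which genai.configure() uses when no api_key argument is passed.
os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')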

genai.configure()
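
Outside Colab, `google.colab.userdata` is not available. The following is a minimal sketch of one way to fall back to an environment variable in that case. The helper name `resolve_api_key` and its `env` parameter are hypothetical and not part of the SDK.

```python
import os


def resolve_api_key(secret_name="GOOGLE_API_KEY", env=None):
    """Return the API key from a Colab Secret if possible, else from the environment.

    Hypothetical helper for illustration; not part of google-generativeai.
    """
    env = os.environ if env is None else env

    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(secret_name)
            if key:
                return key
        except Exception:
            # Secret missing or access not granted: fall through to the environment.
            pass

    key = env.get(secret_name)
    if not key:
        raise RuntimeError(f"No API key found under the name {secret_name!r}.")
    return key


# Possible usage (requires google-generativeai to be installed):
# genai.configure(api_key=resolve_api_key())
```

Passing the key explicitly through `api_key` avoids writing it into `os.environ`, which may be preferable when the process spawns other programs.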

In [8]:
import textwrap

from IPython.display import Markdown


def to_markdown(text):
  """Render text as a Markdown blockquote, turning '•' bullets into '*' list items."""
  text = text.replace('•', '  *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
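
To see what the `to_markdown` helper does to a string, here is a small sketch of the same transformation without the IPython `Markdown` wrapper. The function name `quote_text` and the sample input are made up for illustration.

```python
import textwrap


def quote_text(text):
    # Same string transformation as to_markdown, minus the Markdown wrapper:
    # bullets become '*' list items and every line gets a '> ' prefix.
    text = text.replace('•', '  *')
    return textwrap.indent(text, '> ', predicate=lambda _: True)


quoted = quote_text('• first point\n\n• second point')
print(quoted)
# >   * first point
# > 
# >   * second point
```

The `predicate=lambda _: True` argument makes `textwrap.indent` prefix blank lines as well, so the blockquote is not broken up by empty lines in the model's response.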

## Upload an audio file with the File API

To use an audio file in your prompt, you must first upload it using the [File API](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb).


In [9]:
URL = "https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3"

In [5]:
!wget -q $URL -O sample.mp3

In [10]:
your_file = genai.upload_file(path='sample.mp3')
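
`upload_file` also accepts an optional `mime_type` argument. The sketch below shows one possible way to determine the type from the file name with the standard library and reject non-audio files before uploading. The helper names `guess_audio_mime_type` and `upload_audio` are hypothetical.

```python
import mimetypes


def guess_audio_mime_type(path):
    """Guess a MIME type from the file name and require it to be audio."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith('audio/'):
        raise ValueError(f"{path!r} does not look like an audio file")
    return mime_type


def upload_audio(path):
    """Upload an audio file with an explicit MIME type (hypothetical wrapper)."""
    # Imported lazily so the helper above can be used without the SDK installed.
    import google.generativeai as genai

    return genai.upload_file(path=path, mime_type=guess_audio_mime_type(path))


print(guess_audio_mime_type('sample.mp3'))
```

With this in place, `your_file = upload_audio('sample.mp3')` would stand in for the direct `genai.upload_file` call above.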

In [None]:
# Optional cleanup: uncomment to delete the uploaded file once you no longer need it.
# genai.delete_file(your_file.name)

## Use the file in your prompt

In [11]:
prompt = "Listen carefully to the following audio file. Provide a brief summary."

# Flash is used here; 'models/gemini-1.5-pro-latest' is an alternative.
model = genai.GenerativeModel('models/gemini-1.5-flash-latest')
response = model.generate_content([prompt, your_file])
Markdown(response.text)

President John F. Kennedy delivered his State of the Union address to a joint session of Congress on January 30, 1961. He discussed the state of the economy, which he said was troubling, and outlined proposals to improve unemployment compensation, provide more food for the unemployed, re-develop areas with labor surplus, and other measures. He also discussed the international situation, saying the US needs to strengthen its military, improve economic tools, and work with allies to achieve peace. 

## Count audio tokens

To count the number of tokens in your audio file, pass the file to `count_tokens`:

In [12]:
model.count_tokens([your_file])
# Author's cost estimate: 6 USD per 1,000 minutes of audio.

total_tokens: 83552
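
The token count can be turned into an approximate audio duration. The sketch below assumes a rate of 32 tokens per second of audio, a figure taken from the Gemini documentation rather than from this notebook. The function name `estimate_audio_seconds` is hypothetical.

```python
# Assumption: Gemini represents each second of audio as 32 tokens.
AUDIO_TOKENS_PER_SECOND = 32


def estimate_audio_seconds(total_tokens, tokens_per_second=AUDIO_TOKENS_PER_SECOND):
    """Estimate audio length in seconds from a token count."""
    return total_tokens / tokens_per_second


seconds = estimate_audio_seconds(83552)
minutes, remainder = divmod(seconds, 60)
print(f"{int(minutes)} min {int(remainder)} s")
# 43 min 31 s
```

Under that assumption, the 83,552 tokens reported above correspond to about 43.5 minutes of audio.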

## Learning more

* Learn more about the [File API](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb) with the quickstart.

* Learn more about prompting with [media files](https://ai.google.dev/tutorials/prompting_with_media) in the docs, including the supported formats and maximum length for audio files.