### Explore audio capabilities with the Gemini API
Gemini can respond to prompts about audio. For example, Gemini can:

- Describe, summarize, or answer questions about audio content.
- Provide a transcription of the audio.
- Provide answers or a transcription about a specific segment of the audio.

Note: You can't generate audio output with the Gemini API.

This guide demonstrates different ways to interact with audio files and audio content using the Gemini API.
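For the segment use case listed above, one possible approach is to name the segment by its start and end offsets in the text prompt. The sketch below only builds the prompt string. The helper names `format_timestamp` and `segment_prompt` are hypothetical, and the `MM:SS` timestamp format is an assumption rather than something this guide specifies.

```python
def format_timestamp(seconds: int) -> str:
    """Format a non-negative offset in seconds as MM:SS (assumed format)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def segment_prompt(start_seconds: int, end_seconds: int) -> str:
    """Build a text prompt that asks for a transcript of one audio segment."""
    if end_seconds <= start_seconds:
        raise ValueError("end_seconds must be greater than start_seconds")
    return (
        "Provide a transcript of the speech from "
        f"{format_timestamp(start_seconds)} to {format_timestamp(end_seconds)}."
    )
```

The resulting string would be passed next to an uploaded audio file in the same list given to `model.generate_content`, in place of a general prompt such as "Describe this audio clip".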

# Supported audio formats
Gemini supports the following audio format MIME types:

- WAV: audio/wav
- MP3: audio/mp3
- AIFF: audio/aiff
- AAC: audio/aac
- OGG Vorbis: audio/ogg
- FLAC: audio/flac
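
As a minimal sketch, the list above can be turned into a lookup that checks a file name before upload. `SUPPORTED_AUDIO_MIME_TYPES` and `audio_mime_type` are hypothetical names, not part of the Gemini SDK, and the file-extension keys are assumptions based on common naming.

```python
from pathlib import Path

# MIME types copied from the list above; the extension keys are assumptions.
SUPPORTED_AUDIO_MIME_TYPES = {
    ".wav": "audio/wav",
    ".mp3": "audio/mp3",
    ".aiff": "audio/aiff",
    ".aac": "audio/aac",
    ".ogg": "audio/ogg",
    ".flac": "audio/flac",
}


def audio_mime_type(filename: str) -> str:
    """Return the listed MIME type for filename, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED_AUDIO_MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported audio format: {suffix or filename!r}") from None
```

The upload output later in this guide reports `audio/mpeg` for an MP3 file, so the service may label the file differently from this list.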

# Technical details about audio
Gemini imposes the following rules on audio:

- Gemini represents each second of audio as 25 tokens; for example, one minute of audio is represented as 1,500 tokens.
- Gemini can only infer responses to English-language speech.
- Gemini can "understand" non-speech components, such as birdsong or sirens.
- The maximum supported length of audio data in a single prompt is 9.5 hours. Gemini doesn't limit the number of audio files in a single prompt; however, the total combined length of all audio files in a single prompt cannot exceed 9.5 hours.
- Gemini downsamples audio files to a 16 Kbps data resolution.
- If the audio source contains multiple channels, Gemini combines those channels down to a single channel.
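
The token rate and the 9.5-hour limit above lend themselves to a quick pre-flight estimate. The sketch below is only arithmetic on those two figures. `estimate_audio_tokens` is a hypothetical helper, and rounding fractional seconds to the nearest token is an assumption.

```python
TOKENS_PER_SECOND = 25          # rate stated in the rules above
MAX_AUDIO_SECONDS = 9.5 * 3600  # 9.5 hours, combined across all files


def estimate_audio_tokens(durations_seconds):
    """Estimate the token count for all audio files in one prompt.

    durations_seconds is an iterable of clip lengths in seconds.
    Raises ValueError if the combined length exceeds the 9.5-hour limit.
    """
    total = sum(durations_seconds)
    if total > MAX_AUDIO_SECONDS:
        raise ValueError(
            f"Combined audio length {total} s exceeds {MAX_AUDIO_SECONDS} s"
        )
    # Rounding to the nearest whole token is an assumption.
    return round(total * TOKENS_PER_SECOND)
```

For example, `estimate_audio_tokens([60])` gives 1,500, which matches the one-minute figure in the list.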

# Before you begin: Set up your project and API key
Before calling the Gemini API, you need to set up your project and configure your API key.
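
One way to catch a missing key early is to fail with a clear message before any request is made. This is a minimal sketch: `require_api_key` is a hypothetical helper, and it assumes the key is stored in the `GEMINI_API_KEY` environment variable, the name used in this guide.

```python
import os


def require_api_key(env=os.environ, name="GEMINI_API_KEY"):
    """Return the API key stored under name, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; add it to your environment or .env file"
        )
    return key
```

The returned value could then be passed to `genai.configure(api_key=...)` in place of a bare `os.getenv` call, which would otherwise pass `None` when the variable is absent.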

In [1]:
pip install -q -U google-generativeai

Note: you may need to restart the kernel to use updated packages.


In [3]:
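import os

import google.generativeai as genai
from dotenv import load_dotenv

# Load GEMINI_API_KEY from a local .env file into the environment.
load_dotenv()
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-1.5-flash")

# Upload the audio file and print the metadata of the stored file.
myfile = genai.upload_file("motivational.mp3")
print(f"{myfile=}")

# Send the uploaded file and a text prompt together in one request.
result = model.generate_content([myfile, "Describe this audio clip"])
print(result.text)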

myfile=genai.File({
    'name': 'files/25qvpa7wotx3',
    'display_name': 'motivational.mp3',
    'mime_type': 'audio/mpeg',
    'sha256_hash': 'OTMzNDk0ZmMyYmU5NWRiMzhjNmFjYTc5NjkxZGQ0NDY2MjIzZDgyMDI2NzUwN2FhYWU1NGU2MGIzYWE3NTIyZQ==',
    'size_bytes': '21739928',
    'state': 'ACTIVE',
    'uri': 'https://generativelanguage.googleapis.com/v1beta/files/25qvpa7wotx3',
    'create_time': '2025-01-23T19:50:14.527006Z',
    'expiration_time': '2025-01-25T19:50:14.444817898Z',
    'update_time': '2025-01-23T19:50:14.527006Z'})
This audio clip is a motivational speech focusing on the importance of perseverance, discipline, and taking ownership of one's life.  The speaker uses several analogies and examples to illustrate his points.  Here's a summary:


**Core Themes:**

* **Embrace Failure:** The speech begins by emphasizing that failure is an inevitable part of life, a stepping stone to eventual success.  It's not about avoiding failure, but learning from it.
* **Action over Inaction:**  T