# Gemini API: Audio Quickstart

This notebook shows how to prompt Gemini 1.5 Pro with an audio file, then store the generated description in SingleStoreDB so it can be retrieved later.

In [159]:
!pip install -q -U google-generativeai langchain langchain-google-genai langchain-openai singlestoredb

In [110]:
import google.generativeai as genai

## Configure your API key

Create an API key at aistudio.google.com and set it below.

In [111]:
import os

# Placeholder value; replace it with your own API key
os.environ['GOOGLE_API_KEY'] = 'ap'

genai.configure(api_key=os.environ['GOOGLE_API_KEY'])

## Upload an audio file with the File API

To use an audio file in your prompt, you must first upload it using the [File API](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb).


In [112]:
URL = "https://ia803402.us.archive.org/14/items/lp_mozart-divertimento17-k-334-horn-quintet-k_wolfgang-amadeus-mozart-members-of-the-ber/disc1/01.03.%20Divertmento%20In%20D%20Major%2C%20K.%20334%20Menuetto.mp3"

In [114]:
!wget -q "$URL" -O sample.mp3

In [115]:
your_file = genai.upload_file(path='sample.mp3')
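
Before uploading, it can help to check locally that the path really points to an audio file. The following is a minimal sketch using only the standard library; `guess_audio_mime_type` is a hypothetical helper introduced here for illustration and is not part of `google.generativeai`.

```python
import mimetypes


def guess_audio_mime_type(path):
    """Return the MIME type guessed from the file extension, or raise if it is not audio."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("audio/"):
        raise ValueError(f"{path!r} does not look like an audio file (guessed {mime_type})")
    return mime_type


# The guess is based on the file name only; the file contents are not inspected.
print(guess_audio_mime_type("sample.mp3"))
```

If the guess is wrong for an unusual extension, the type can instead be passed explicitly through the `mime_type` argument of `genai.upload_file`.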

## Use the file in your prompt

In [122]:
prompt = "Listen carefully to the following audio file. Provide a one sentence summary."
model = genai.GenerativeModel('models/gemini-1.5-pro-latest')
response = model.generate_content([prompt, your_file])
print(response.text)

The audio file contains a series of classical music pieces featuring piano, strings, and woodwinds. 



## RAG over audio files using SingleStoreDB

Next, we embed the text description that Gemini generated for each audio file and store it in SingleStoreDB. Searching over these embeddings lets us retrieve the relevant files later for retrieval-augmented generation (RAG).
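
To make the idea of searching over embeddings concrete, here is a rough sketch of a similarity search in plain Python. The three-dimensional vectors and the file name `podcast.mp3` are made up for illustration; real embeddings have far more dimensions, and the vector store may score matches with a different metric than the cosine similarity assumed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=1):
    """Return the k (score, path) pairs with the highest similarity to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), path) for path, vec in stored.items()]
    return sorted(scored, reverse=True)[:k]


# Toy vectors standing in for the embeddings of two audio descriptions (hypothetical data).
stored = {
    "sample.mp3": [0.9, 0.1, 0.0],
    "podcast.mp3": [0.0, 0.2, 0.9],
}

# A query vector that points in nearly the same direction as the "sample.mp3" vector.
print(top_k([1.0, 0.0, 0.0], stored))
```

A vector store does the same thing at scale: it keeps one vector per document and returns the documents whose vectors score highest against the embedded query.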

In [172]:
from langchain.vectorstores import SingleStoreDB
import os

from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

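# The connection_* variables must already be defined before this cell runs
# (for example, by the notebook environment or an earlier cell).
os.environ["SINGLESTOREDB_URL"] = f'{connection_user}:{connection_password}@{connection_host}:{connection_port}/{connection_default_database}'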

In [244]:
vectorstore=SingleStoreDB(table_name="audio1", embedding=embeddings)

In [248]:
from langchain_core.documents import Document

mozart_doc = Document(page_content=response.text, metadata={'path': 'sample.mp3'})

vectorstore.add_documents([mozart_doc])

# Add two unrelated placeholder texts alongside the audio description
vectorstore.add_texts(['foo', 'bar'])

[]

In [259]:
query = "beethoven"
docs = vectorstore.similarity_search(query)  # Find documents that correspond to the query

In [260]:
print(docs[-1].page_content)

The audio file contains a series of classical music pieces featuring piano, strings, and woodwinds. 
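
The retrieval step above returns descriptions; to complete the RAG loop, those descriptions would be placed into a prompt for the model. The following is one possible sketch of that assembly step. `RetrievedDoc` is a hypothetical stand-in for a LangChain `Document`, and `build_rag_prompt` and its prompt wording are assumptions, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievedDoc:
    """Minimal stand-in for a retrieved document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_rag_prompt(question, retrieved):
    """Combine the retrieved audio descriptions and the question into one prompt string."""
    context_lines = [
        f"- {doc.metadata.get('path', 'unknown')}: {doc.page_content.strip()}"
        for doc in retrieved
    ]
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the audio descriptions below.\n\n"
        f"Descriptions:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical retrieved result, shaped like the document stored earlier.
example_docs = [
    RetrievedDoc("A series of classical music pieces.", {"path": "sample.mp3"}),
]
print(build_rag_prompt("Which file contains classical music?", example_docs))
```

The resulting string could then be passed to `model.generate_content`, in the same way as the earlier prompt.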

