##### Copyright 2025 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# RAG example with Gemini and ChromaDB

Taken from the Kaggle 5-day Generative AI course 2025.


## Setup / Python environment

First, install ChromaDB and the Gemini API Python SDK.

In [2]:
!pip uninstall -qqy kfp  # Remove unused conflicting packages
!pip install -qU "google-genai==1.7.0" "chromadb==0.6.3"


In [1]:
from google import genai
from google.genai import types

from IPython.display import Markdown

genai.__version__

'1.7.0'

## Setup / Google API key
1. Create an API key at https://aistudio.google.com.
2. Export it as the `GOOGLE_API_KEY` environment variable; the example code below reads it from there.


In [2]:
import os

GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']
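
Indexing `os.environ` directly raises a bare `KeyError` when the variable is missing. The following is a minimal sketch of a friendlier lookup; the helper name `load_api_key` is hypothetical and not part of the original notebook.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Create a key at https://aistudio.google.com "
            "and export it before starting the notebook."
        )
    return key
```

With this helper, the assignment above could be written as `GOOGLE_API_KEY = load_api_key()`.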

## Explore available models

You will be using the [`embedContent`](https://ai.google.dev/api/embeddings#method:-models.embedcontent) API method to calculate embeddings in this guide. Find a model that supports it through the [`models.list`](https://ai.google.dev/api/models#method:-models.list) endpoint. You can also find more information about the embedding models on [the models page](https://ai.google.dev/gemini-api/docs/models/gemini#text-embedding).

`text-embedding-004` is the most recent generally-available embedding model, so you will use it for this exercise, but try out the experimental `gemini-embedding-exp-03-07` model too.

In [3]:
client = genai.Client(api_key=GOOGLE_API_KEY)

for m in client.models.list():
    if "embedContent" in m.supported_actions:
        print(m.name)

models/embedding-001
models/text-embedding-004
models/gemini-embedding-exp-03-07
models/gemini-embedding-exp


## Data
The example company is a hat shop offering three kinds of hats: caps, bands and top hats.

Here are 20 example question-and-answer pairs that serve as the knowledge base of a RAG system for the hat company's customer support. They cover sizing, colors, materials, care, shipping, and returns across the product types (caps, bands, top hats).

In [4]:
documents = [
    # --- Sizing Questions ---
    "Question: How do I determine the correct size for a top hat?\nAnswer: To find your perfect top hat size, measure the circumference of your head just above your ears, where the hat will sit. Compare this measurement (in cm or inches) to the size chart available on each top hat product page. If you're between sizes, we generally recommend sizing up for comfort.",

    "Question: My head measures 58cm. What size cap should I order?\nAnswer: A 58cm head circumference typically corresponds to a Large (L) in most of our cap styles. However, please always consult the specific size chart on the product page for the cap you're interested in, as fit can vary slightly between styles (e.g., fitted vs. adjustable).",

    "Question: Are your head bands one-size-fits-all?\nAnswer: Most of our bands are made with stretchy materials designed to fit a wide range of head sizes comfortably, typically accommodating circumferences from 55cm to 60cm. Please check the product description for any specific sizing notes or variations for particular band styles.",

    "Question: What if I order a hat and the size is wrong?\nAnswer: No problem! We offer a 30-day return and exchange policy for unworn hats with tags still attached. You can easily initiate an exchange for a different size through our online returns portal. Please see our Returns & Exchanges page for full details.",

    # --- Color Questions ---
    "Question: Is the 'Forest Green' color for the cap accurate on the website?\nAnswer: We strive to represent colors as accurately as possible online. However, screen calibration can vary. The 'Forest Green' is a deep, rich green. If you have concerns, customer photos in the reviews section might offer additional perspectives.",

    "Question: Do you offer top hats in colors other than black?\nAnswer: While classic black is our most popular top hat color, we do offer limited runs in other colors like charcoal grey and navy blue depending on the season and style. Please check the Top Hats category on our website for current color availability.",

    "Question: Are the colors for the bands vibrant?\nAnswer: Yes, our bands come in a variety of colors, many of which are designed to be vibrant and eye-catching. The specific vibrancy can depend on the material used. Check the product description and images for the best representation of each color.",

    # --- Material & Care Questions ---
    "Question: What material are your standard baseball caps made from?\nAnswer: Our standard baseball caps are typically made from 100% cotton twill for breathability and comfort. Some performance or speciality caps might use synthetic blends. Material composition is always listed on the product details page.",

    "Question: How should I clean my wool top hat?\nAnswer: Wool top hats require careful cleaning. We recommend spot cleaning minor marks with a slightly damp cloth. For more thorough cleaning or reshaping, professional hat cleaning is advised. Avoid submerging the hat in water.",

    "Question: Can I machine wash the head bands?\nAnswer: Care instructions vary by band material. Cotton or synthetic blend bands can often be machine washed on a gentle cycle with cold water and air-dried. However, bands with delicate embellishments or specific materials might require hand washing. Always check the care label or product page instructions.",

    "Question: Are your hats suitable for rainy weather?\nAnswer: Material matters most here. Cotton caps offer minimal water resistance. Some performance caps might have a water-repellent finish. Wool top hats can handle light drizzle but should not be saturated. Bands are generally not designed for rain protection. Check product descriptions for specific water-resistance information.",

    # --- Product Specific Questions ---
    "Question: Are your caps structured or unstructured?\nAnswer: We offer both! Structured caps have a stiff buckram lining in the front panels to maintain their shape, while unstructured caps have a softer, more relaxed fit. This information is specified in the product description for each cap style.",

    "Question: What's the difference between a cap and a band?\nAnswer: Caps cover the entire top of the head and usually feature a brim or visor for sun protection. Bands are typically strips of fabric worn around the head primarily for style, sweat absorption during activities, or keeping hair back.",

    "Question: Are top hats only for formal events?\nAnswer: Traditionally, top hats are associated with formal wear (like morning dress or white tie). However, they can also be a bold fashion statement for less formal, stylish occasions depending on the specific design and how you style them.",

    # --- Availability & Ordering ---
    "Question: The medium red cap I want is out of stock. When will it be back?\nAnswer: We restock popular items regularly. You can sign up for a 'Back in Stock' notification directly on the product page. Enter your email, and we'll notify you as soon as the medium red cap is available again.",

    "Question: Do you offer any discounts for first-time buyers?\nAnswer: Yes! We often have a welcome discount for new customers who sign up for our email newsletter. Please check the banner at the top or bottom of our website for current promotions.",

    # --- Shipping & Returns ---
    "Question: How long does shipping usually take within the UK?\nAnswer: Standard shipping within the UK typically takes 3-5 business days after processing. We also offer expedited shipping options at checkout, usually delivering within 1-2 business days. You'll receive tracking information once your order ships.",

    "Question: Can I return a hat purchased on sale?\nAnswer: Our return policy generally applies to sale items as well, provided they are returned unworn with tags attached within the 30-day window. However, items marked as 'Final Sale' are non-returnable. Please check the item's description and our full Return Policy page.",

    # --- Miscellaneous ---
    "Question: Can I get a hat customized with my logo?\nAnswer: We currently do not offer individual customization services through our website. However, for bulk orders (typically 50+ units), we may be able to accommodate custom embroidery. Please contact our corporate sales team for inquiries.",

    "Question: Where are your hats manufactured?\nAnswer: We partner with various manufacturers globally to produce our hats, ensuring high quality standards. Specific country of origin information can usually be found on the care label inside the hat."
]

## Creating the embedding database with ChromaDB

Create a [custom function](https://docs.trychroma.com/guides/embeddings#custom-embedding-functions) to generate embeddings with the Gemini API. In this task, you are implementing a retrieval system, so the `task_type` for generating the *document* embeddings is `retrieval_document`. Later, you will use `retrieval_query` for the *query* embeddings. Check out the [API reference](https://ai.google.dev/api/embeddings#v1beta.TaskType) for the full list of supported tasks.

Terminology: *documents* are the items stored in the database; they are inserted first and retrieved later. *Queries* are the textual search terms, which can be simple keywords or textual descriptions of the desired documents.

In [8]:
!pip install -qU "google-api_core"


In [5]:
from chromadb import Documents, EmbeddingFunction, Embeddings
from google.api_core import retry

from google.genai import types


# Define a helper to retry when per-minute quota is reached.
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})


class GeminiEmbeddingFunction(EmbeddingFunction):
    # Specify whether to generate embeddings for documents, or queries
    document_mode = True

    @retry.Retry(predicate=is_retriable)
    def __call__(self, input: Documents) -> Embeddings:
        if self.document_mode:
            embedding_task = "retrieval_document"
        else:
            embedding_task = "retrieval_query"

        response = client.models.embed_content(
            model="models/text-embedding-004",
            contents=input,
            config=types.EmbedContentConfig(
                task_type=embedding_task,
            ),
        )
        return [e.values for e in response.embeddings]
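
The embedding function above sends all inputs in a single request, which is fine for 20 documents. For larger collections the API may reject oversized batches, so one option is to split the input into chunks first. The sketch below is a plain-Python helper; the name `batched` and the limit of 100 items per request are assumptions, so check the current API documentation for the actual limit.

```python
from typing import Iterator, List, Sequence

# Assumed per-request limit; verify against the current API documentation.
MAX_BATCH_SIZE = 100


def batched(items: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive chunks of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Inside `__call__`, you could then loop over `batched(input)`, call `embed_content` once per chunk, and concatenate the resulting embedding lists in order.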

Now create a [Chroma database client](https://docs.trychroma.com/getting-started) that uses the `GeminiEmbeddingFunction` and populate the database with the documents you defined above.

In [6]:
import chromadb

DB_NAME = "emaildb"

embed_fn = GeminiEmbeddingFunction()
embed_fn.document_mode = True

chroma_client = chromadb.Client()
db = chroma_client.get_or_create_collection(name=DB_NAME, embedding_function=embed_fn)

db.add(documents=documents, ids=[str(i) for i in range(len(documents))])
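
Each document is a single string in the form `Question: ...\nAnswer: ...`. If you want to store the two parts separately, for example as per-document metadata, a small parser is enough. This is a sketch only; the helper name `split_qa` is hypothetical and it assumes every document follows exactly that format.

```python
def split_qa(document: str) -> dict:
    """Split a 'Question: ...\\nAnswer: ...' string into its two parts."""
    question, _, answer = document.partition("\nAnswer: ")
    prefix = "Question: "
    if question.startswith(prefix):
        question = question[len(prefix):]
    return {"question": question.strip(), "answer": answer.strip()}
```

Chroma's `add` method accepts a `metadatas` list alongside `documents`, so the parsed fields could be passed as `metadatas=[split_qa(d) for d in documents]`.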

Confirm that the data was inserted by looking at the database.

In [7]:
db.count()
# You can peek at the data too.
# db.peek(1)

20

In [8]:
db.peek(1)

{'ids': ['0'],
 'embeddings': array([[-1.78592559e-02,  7.11010071e-03, -1.39390246e-03,
         -4.97947708e-02, -6.95656566e-03,  3.63590457e-02,
          1.81056075e-02, -5.80168068e-02,  4.23502810e-02,
          3.08909100e-02, -3.40557247e-02, -2.03881916e-02,
          3.81441675e-02, -1.00490395e-02, -2.61711255e-02,
         -1.15821818e-02,  1.06935657e-03, -4.10843035e-03,
         -8.73519778e-02, -2.11254489e-02,  3.32121588e-02,
          5.33391442e-03,  3.58477160e-02,  1.75748821e-02,
          1.99162234e-02,  4.56856079e-02, -3.83582637e-02,
          5.88292778e-02, -3.69718708e-02, -6.24979883e-02,
          4.83883172e-02,  3.58200185e-02,  4.95013362e-03,
         -6.89058080e-02,  4.96878400e-02,  1.08250100e-02,
          3.65690049e-03,  6.78346753e-02,  5.21615259e-02,
         -6.29851967e-02, -3.36982980e-02,  3.54755372e-02,
         -1.11675709e-02,  5.81644475e-03, -3.67455892e-02,
          5.09840809e-02, -5.40083786e-03,  4.05586585e-02,
         -6

## Retrieval: Find relevant documents

To search the Chroma database, call the `query` method. Note that you also switch to the `retrieval_query` mode of embedding generation.


In [17]:
# Switch to query mode when generating embeddings.
embed_fn.document_mode = False

# Search the Chroma DB using the specified query.
query = "Hi! Do you sell caps?"

result = db.query(query_texts=[query], n_results=2)
[all_passages] = result["documents"]

Markdown(all_passages[0])

Question: Are your caps structured or unstructured?
Answer: We offer both! Structured caps have a stiff buckram lining in the front panels to maintain their shape, while unstructured caps have a softer, more relaxed fit. This information is specified in the product description for each cap style.

In [18]:
# Search the Chroma DB using the specified query.
query1 = "I want to buy a car."

result1 = db.query(query_texts=[query1], n_results=2)
[all_passages1] = result1["documents"]

Markdown(all_passages1[0])

Question: Do you offer any discounts for first-time buyers?
Answer: Yes! We often have a welcome discount for new customers who sign up for our email newsletter. Please check the banner at the top or bottom of our website for current promotions.
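
The car query shows that a nearest-neighbour search always returns the closest passages, however unrelated they are. Chroma's `query` response includes a `distances` list alongside `documents` by default, so one way to drop weak matches is a distance cutoff. The sketch below works on a result dictionary of that shape; the helper name `filter_by_distance` is hypothetical, and a suitable cutoff depends on the collection's distance metric and has to be found by experiment.

```python
from typing import Dict, List


def filter_by_distance(result: Dict[str, list], max_distance: float) -> List[str]:
    """Keep the passages of the first query whose distance is at most max_distance."""
    passages = result["documents"][0]
    distances = result["distances"][0]
    return [p for p, d in zip(passages, distances) if d <= max_distance]
```

For example, `filter_by_distance(result1, max_distance=...)` with a cutoff of your choosing would return an empty list when nothing in the database is close enough to the query.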

## Augmented generation: Answer the question

Now that you have found relevant passages from the set of documents (the *retrieval* step), you can assemble a generation prompt to have the Gemini API *generate* a final answer. Note that in this example only two passages were retrieved (`n_results=2`). In practice, especially when the underlying data is large, you will want to retrieve more results and let the Gemini model determine which passages are relevant to the question. For this reason it's OK if some retrieved passages are not directly related to the question; the generation step should ignore them.

In [19]:
query_oneline = query.replace("\n", " ")

# This prompt is where you can specify any guidance on tone, or what topics the model should stick to, or avoid.
prompt = f"""You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: {query_oneline}
"""

# Add the retrieved documents to the prompt.
for passage in all_passages:
    passage_oneline = passage.replace("\n", " ")
    prompt += f"PASSAGE: {passage_oneline}\n"

print(prompt)

You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: Hi! Do you sell caps?
PASSAGE: Question: Are your caps structured or unstructured? Answer: We offer both! Structured caps have a stiff buckram lining in the front panels to maintain their shape, while unstructured caps have a softer, more relaxed fit. This information is specified in the product description for each cap style.
PASSAGE: Question: The medium red cap I want is out of stock. When will it be back? Answer: We restock popular items regularly. You can sign up for a 'Back in Stock' notification directly on the product page. Enter your em

Now use the `generate_content` method to generate an answer to the question.

In [20]:
answer = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt)

Markdown(answer.text)

We offer both structured and unstructured caps; structured caps have a stiff buckram lining in the front panels to maintain their shape, while unstructured caps have a softer, more relaxed fit.


## Another example query

In [21]:
query_oneline1 = query1.replace("\n", " ")

# This prompt is where you can specify any guidance on tone, or what topics the model should stick to, or avoid.
prompt1 = f"""You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: {query_oneline1}
"""

# Add the retrieved documents to the prompt.
for passage in all_passages1:
    passage_oneline = passage.replace("\n", " ")
    prompt1 += f"PASSAGE: {passage_oneline}\n"

print(prompt1)

You are a helpful and informative bot that answers questions using text from the reference passage included below. 
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. 
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and 
strike a friendly and conversational tone. If the passage is irrelevant to the answer, you may ignore it.

QUESTION: I want to buy a car.
PASSAGE: Question: Do you offer any discounts for first-time buyers? Answer: Yes! We often have a welcome discount for new customers who sign up for our email newsletter. Please check the banner at the top or bottom of our website for current promotions.
PASSAGE: Question: The medium red cap I want is out of stock. When will it be back? Answer: We restock popular items regularly. You can sign up for a 'Back in Stock' notification directly on the product page. Enter your email, and we'll notify you as soon as the medium red c

In [22]:
answer = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=prompt1)

Markdown(answer.text)

Based on the reference passage, while I do not sell cars, if you are a new customer and sign up for an email newsletter, you may receive a welcome discount. Please check the banner at the top or bottom of the website for current promotions.
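
The two prompt cells above repeat the same template with different variables. A small helper can build the prompt for any query and list of passages. This is a sketch, not part of the original notebook; the function name `build_prompt` and its parameters are hypothetical.

```python
from typing import Iterable


def build_prompt(query: str, passages: Iterable[str], instructions: str) -> str:
    """Assemble instructions, the question, and one PASSAGE line per retrieved document."""
    query_oneline = query.replace("\n", " ")
    prompt = f"{instructions}\n\nQUESTION: {query_oneline}\n"
    for passage in passages:
        passage_oneline = passage.replace("\n", " ")
        prompt += f"PASSAGE: {passage_oneline}\n"
    return prompt
```

With the instruction text stored once in a variable, each example reduces to a call such as `build_prompt(query1, all_passages1, instructions)` followed by `generate_content`.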
