# Getting Started
Retrieval-Augmented Generation (RAG) is a technique for making the answers of Large Language Models more accurate and easier to verify. It works by "grounding" the model: relevant passages are retrieved from a document collection and supplied in the prompt, so the model bases its answer on that retrieved text rather than only on its internal knowledge. This tutorial walks step by step through building a RAG pipeline, using a vector database (Chroma) for retrieval and the Gemini API for the final, context-aware generation step.
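
To make the retrieve-then-generate flow concrete before any real service is involved, here is a minimal toy sketch. The word-overlap scoring and the names `toy_retrieve` and `build_grounded_prompt` are illustrative assumptions, not part of the pipeline built in this tutorial, which relies on embeddings and a vector database instead.

```python
from typing import List


def toy_retrieve(query: str, passages: List[str]) -> str:
    """Return the passage that shares the most lowercase words with the query."""
    query_words = set(query.lower().split())
    return max(passages, key=lambda p: len(query_words & set(p.lower().split())))


def build_grounded_prompt(query: str, passage: str) -> str:
    """Place the retrieved passage in the prompt so the answer is grounded in it."""
    return f"Answer using only this passage.\nPASSAGE: {passage}\nQUESTION: {query}"


# Made-up passages, for illustration only
passages = [
    "Chroma is a vector database that stores embeddings.",
    "Gemini is a family of generative models from Google.",
]
best = toy_retrieve("What is a vector database?", passages)
print(build_grounded_prompt("What is a vector database?", best))
```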

Credits: [Willy Zhuang](https://medium.com/data-on-cloud-genai-data-science-and-data/building-a-rag-system-with-gemini-and-chromadb-6ead6452bcf5)'s Medium post

In [None]:
!pip install --quiet langchain chromadb langchain-google-genai langchain_community pypdf

## Using the Gemini API Key
To store your Gemini API key in Colab secrets, click the key icon (🔑) in the left sidebar to open the "Secrets" tab.
Then enter a name for your secret (e.g., GOOGLE_API_KEY), paste your API key into the "Value" field, and make sure the "Notebook access" toggle is turned on. The code below reads the secret under the name GOOGLE_API_KEY.

In [None]:
import google.generativeai as genai
from google.colab import userdata
from langchain_google_genai import ChatGoogleGenerativeAI


# Get the API key from Colab secrets
API_KEY = userdata.get('GOOGLE_API_KEY')

# Configure the genai library with the API key
genai.configure(api_key=API_KEY)
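
Outside Colab, `google.colab` is not importable, so the cell above would fail. A minimal sketch of a fallback is shown here, assuming the key may instead be exported as an environment variable with the same name; the helper `load_api_key` is hypothetical and not part of any library.

```python
import os


def load_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Read the key from Colab secrets if possible, else from the environment."""
    try:
        # Only importable inside a Colab runtime
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        # Not in Colab, or the secret is missing or not shared with the notebook
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"Set {name} as a Colab secret or an environment variable.")
    return key
```

With this helper, `API_KEY = load_api_key()` could stand in for the `userdata.get` call.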

In [None]:
# Optional: authenticate the Colab user with their Google account.
# In Colab this opens a prompt asking the user to sign in and grant access,
# which gives the notebook credentials for other Google Cloud services
# (such as Google Drive or BigQuery) on the user's behalf.
# The Gemini calls in this tutorial authenticate with the API key set above.
from google.colab import auth

auth.authenticate_user()

### Test the Gemini API Key

In [None]:
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", google_api_key=API_KEY)
try:
    response = llm.invoke("What is LLM?")
    print(response)
except Exception as e:
    print(f"An error occurred: {e}")


content='**LLM stands for Large Language Model.**\n\nIn simple terms, an LLM is a type of **artificial intelligence (AI) model** that is designed to **understand, generate, and process human-like text**.\n\nHere\'s a breakdown of what that means:\n\n1.  **Large:**\n    *   **Parameters:** These models are "large" because they contain billions, or even trillions, of parameters. Parameters are the values (like weights and biases in a neural network) that the model learns during training, essentially representing its knowledge and understanding. More parameters generally allow for more complex learning and better performance.\n    *   **Training Data:** They are trained on truly massive datasets of text and code from the internet (books, articles, websites, conversations, code repositories, etc.). This vast amount of data is crucial for them to learn the nuances of language, facts, reasoning, and various writing styles.\n\n2.  **Language:**\n    *   Their primary domain is **human languag

## Import Required Libraries

In [None]:
from langchain import PromptTemplate
from langchain import hub
from langchain.docstore.document import Document
from langchain.document_loaders import WebBaseLoader
from langchain.schema import StrOutputParser
from langchain.schema.prompt_template import format_document
from langchain.schema.runnable import RunnablePassthrough
from langchain.vectorstores import Chroma
import requests
from pypdf import PdfReader
import os
import re
from typing import List
import google.generativeai as genai
from chromadb import Documents, EmbeddingFunction, Embeddings
import chromadb

### Download a Document  

In [None]:
url = "https://services.google.com/fh/files/misc/ai_adoption_framework_whitepaper.pdf"
save_path = "white_paper.pdf"

# Download the PDF and save it locally; fail early on an HTTP error
response = requests.get(url, timeout=30)
response.raise_for_status()
with open(save_path, 'wb') as f:
    f.write(response.content)

In [None]:
# To use your own document, replace the document path
reader = PdfReader(save_path)

#### Document parsing

In [None]:
text = ""
for page in reader.pages:
    page_text = page.extract_text()
    if page_text:
        text += page_text

### Chunking

In [None]:
# Split the text into chunks based on double newlines
def split_text(text):
    return [i for i in re.split('\n\n', text) if i.strip()]


In [None]:
chunked_text = split_text(text)
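
Text extracted from a PDF often contains few double newlines, and the output further down ("Added document 1/1") suggests that the whole white paper ends up as a single chunk here. As a hedged alternative, the sketch below splits by character count with some overlap between neighbouring chunks. The function name `chunk_by_size` and the default sizes are assumptions chosen for illustration, not tuned values.

```python
from typing import List


def chunk_by_size(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.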

### Generating Embeddings

In [None]:
# Define a custom embedding function using Gemini API
class GeminiEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        genai.configure(api_key=API_KEY)
        model = "gemini-embedding-001"
        title = "Custom query"
        return genai.embed_content(model=model, content=input, task_type="retrieval_document", title=title)["embedding"]
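
An embedding maps each chunk to a vector of numbers so that texts with similar meaning land close together. The sketch below illustrates one common closeness measure, cosine similarity, on made-up three-dimensional vectors. It is for intuition only: real embeddings have far more dimensions, and the vector store computes distances internally with whatever metric it is configured to use.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, not real Gemini embeddings
query_vec = [1.0, 0.0, 1.0]
chunk_a = [2.0, 0.0, 2.0]   # same direction as the query
chunk_b = [0.0, 3.0, 0.0]   # orthogonal to the query
print(cosine_similarity(query_vec, chunk_a) > cosine_similarity(query_vec, chunk_b))
```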

In [None]:
# Create directory for database if it doesn't exist
db_folder = "chroma_db"
os.makedirs(db_folder, exist_ok=True)


### Create a vector store

In [None]:
# Create a Chroma database with the given documents
import time

def create_chroma_db(documents: List[str], path: str, name: str):
    chroma_client = chromadb.PersistentClient(path=path)
    # Attempt to get the collection, if it exists
    try:
        db = chroma_client.get_collection(name=name, embedding_function=GeminiEmbeddingFunction())
        print(f"Collection '{name}' already exists. Using existing collection.")
    except Exception:  # If the collection does not exist, create it
        db = chroma_client.create_collection(name=name, embedding_function=GeminiEmbeddingFunction())
        print(f"Collection '{name}' created.")

    # Add documents to the collection with a delay
    for i, d in enumerate(documents):
        try:
            db.add(documents=[d], ids=[str(i)])
            print(f"Added document {i+1}/{len(documents)}")
            time.sleep(1) # Add a 1-second delay
        except Exception as e:
            print(f"Error adding document {i+1}: {e}")
            # You might want to add more sophisticated error handling here,
            # like retrying or logging the error.

    return db, name
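
The comment in the loop above mentions retrying as a possible improvement. One way this could look is a small helper that retries a call with exponentially growing delays. This is a sketch under assumptions: `with_retries` is a hypothetical helper, and the attempt count and delays are placeholders rather than values recommended for the Gemini API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn(); on failure wait base_delay, then 2x, 4x, ... and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            # Re-raise the last error once all attempts are used up
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Inside the loop, the insertion could then be written as `with_retries(lambda: db.add(documents=[d], ids=[str(i)]))`.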

In [None]:
# Specify the path and collection name for Chroma database
db_name = "rag_experiment"
db_path = os.path.join(os.getcwd(), db_folder)
db, db_name = create_chroma_db(chunked_text, db_path, db_name)


Collection 'rag_experiment' already exists. Using existing collection.
Added document 1/1


In [None]:
# Load an existing Chroma collection
def load_chroma_collection(path: str, name: str):
    chroma_client = chromadb.PersistentClient(path=path)
    return chroma_client.get_collection(name=name, embedding_function=GeminiEmbeddingFunction())


In [None]:
db = load_chroma_collection(db_path, db_name)


### Retrieve the context

In [None]:
# Retrieve the most relevant passages for the query
def get_relevant_passage(query: str, db, n_results: int):
    results = db.query(query_texts=[query], n_results=n_results)
    # 'documents' holds one list per query text; return the passages for our single query
    return results['documents'][0]


In [None]:
query = "What is the AI Maturity Scale?"
relevant_text = get_relevant_passage(query, db, n_results=1)
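
A Chroma query returns its fields as one list per query text, so `results['documents']` and `results['distances']` are lists of lists. The sketch below pairs each passage with its distance using a hand-written dictionary that mimics that shape. The values are made up, and `passages_with_distances` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def passages_with_distances(results: Dict[str, list],
                            query_index: int = 0) -> List[Tuple[str, float]]:
    """Pair each retrieved passage with its distance for one query."""
    docs = results["documents"][query_index]
    dists = results["distances"][query_index]
    return list(zip(docs, dists))


# Hand-written stand-in for a query result with one query and two hits
fake_results = {
    "ids": [["3", "7"]],
    "documents": [["passage about phases", "passage about themes"]],
    "distances": [[0.21, 0.35]],
}
for passage, distance in passages_with_distances(fake_results):
    print(f"{distance:.2f}  {passage}")
```

Looking at the distances is a quick way to judge whether the top hit is actually close to the query or merely the least distant chunk available.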


### Answer Generation

In [None]:
# Construct a prompt for the generation model based on the query and retrieved data
def make_rag_prompt(query: str, relevant_passage: str):
    escaped_passage = relevant_passage.replace("'", "").replace('"', "").replace("\n", " ")
    prompt = f"""You are a helpful and informative bot that answers questions using text from the reference passage included below.
Be sure to respond in a complete sentence, being comprehensive, including all relevant background information.
However, you are talking to a non-technical audience, so be sure to break down complicated concepts and
strike a friendly and conversational tone.
QUESTION: '{query}'
PASSAGE: '{escaped_passage}'

ANSWER:
"""
    return prompt
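
When more than one passage is retrieved, joining them with an empty string runs the texts together. A minimal sketch of an alternative is to number each passage before placing it in the prompt; `format_passages` is a hypothetical helper, and the `[1]`, `[2]` labelling is just one possible convention.

```python
from typing import List


def format_passages(passages: List[str]) -> str:
    """Collapse whitespace in each passage and number the non-empty ones."""
    cleaned = [" ".join(p.split()) for p in passages if p.strip()]
    return "\n".join(f"[{i}] {p}" for i, p in enumerate(cleaned, start=1))


print(format_passages(["First\npassage", "   ", "Second   passage"]))
```

The resulting string could be passed as the `relevant_passage` argument of `make_rag_prompt`, although that function replaces newlines with spaces, so the numbered passages would then sit on one line.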


In [None]:
# Generate an answer using the Gemini API
def generate_answer(prompt: str):

    genai.configure(api_key=API_KEY)
    model = genai.GenerativeModel('gemini-2.5-flash')
    result = model.generate_content(prompt)
    return result.text

In [None]:
# Construct the prompt and generate the answer
final_prompt = make_rag_prompt(query, "".join(relevant_text))
answer = generate_answer(final_prompt)
print(answer)

The AI Maturity Scale is a helpful tool that combines different aspects of how an organization uses Artificial Intelligence, called "themes," with various stages of progress, known as "phases," to show a complete picture of an organization's AI journey. Essentially, it's a guide that evaluates six key themes—like how well your team learns and leads AI initiatives, how easily they can access and share data, how effectively they can scale AI projects, how securely they protect information, and how much they automate AI processes—and then places your organization's practices within one of three phases: "Tactical" (just starting out with simple, short-term uses), "Strategic" (using AI for sustainable business value with a broader vision), or "Transformational" (where AI plays a central role in driving innovation and continuous learning). This allows organizations to understand where they are currently in their AI adoption and helps them map out the necessary steps to evolve and build more 

### Testing

In [None]:
# Interactive function to process user input and generate an answer
def process_query_and_generate_answer():
    query = input("Please enter your query: ")
    if not query:
        print("No query provided.")
        return
    db = load_chroma_collection(db_path, db_name)
    relevant_text = get_relevant_passage(query, db, n_results=1)
    if not relevant_text:
        print("No relevant information found for the given query.")
        return
    final_prompt = make_rag_prompt(query, "".join(relevant_text))
    answer = generate_answer(final_prompt)
    print("Generated Answer:", answer)
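
Because the function above reads from `input()` and calls live services, it is awkward to check automatically. One possible approach is a non-interactive variant that receives the retrieval and generation steps as arguments, so they can be replaced by stubs. The name `answer_query` and the simplified prompt are assumptions for illustration only.

```python
from typing import Callable, List, Optional


def answer_query(query: str,
                 retrieve: Callable[[str], List[str]],
                 generate: Callable[[str], str]) -> Optional[str]:
    """Run retrieve -> prompt -> generate; return None if there is nothing to answer."""
    if not query.strip():
        return None
    passages = retrieve(query)
    if not passages:
        return None
    context = " ".join(passages)
    prompt = f"QUESTION: {query}\nPASSAGE: {context}\nANSWER:"
    return generate(prompt)


# Stubs standing in for the vector store and the model
stub_retrieve = lambda q: ["AI is the development of systems that perform human-like tasks."]
stub_generate = lambda prompt: f"(stub answer based on {len(prompt)} prompt characters)"
print(answer_query("what is AI", stub_retrieve, stub_generate))
```

With the functions defined in this notebook, `retrieve` could be `lambda q: get_relevant_passage(q, db, n_results=1)` and `generate` could be `generate_answer`.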


In [None]:
# Invoke the function to interact with user
process_query_and_generate_answer()

Please enter your query: what is AI


Generated Answer: Artificial Intelligence, or AI, is all about the theory and development of computer systems that can perform tasks we usually think require human intelligence! Think of things like recognizing faces and objects (visual perception), understanding what we say (speech recognition), or even making smart choices (decision-making). The passage also tells us that Machine Learning, or ML, is a really effective way to build these AI systems by teaching them to find useful patterns in data all by themselves, instead of us having to give them a long list of specific rules to follow.
