<a href="https://colab.research.google.com/github/sahiti3636/AIML_SESSION/blob/main/Module2_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Retrieval-Augmented Generation (RAG) — Simple Demo

In this notebook, we build a simple RAG system:
1. Store documents
2. Convert text into embeddings
3. Retrieve relevant context
4. Generate an answer using an LLM (Gemini)
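
Before wiring up real libraries, the four steps above can be sketched with the standard library alone. This is a toy illustration, not the pipeline used later in the notebook: the `embed` and `similarity` helpers and the sample sentences are hypothetical, and a bag-of-words count stands in for a real embedding model.

```python
from collections import Counter

def embed(text):
    """Stand-in 'embedding': a bag-of-words count vector (hypothetical helper)."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Count the word occurrences shared by two bag-of-words vectors."""
    return sum((a & b).values())

# 1. Store documents
toy_docs = [
    "SaaS applications are supplied by the service provider",
    "Chroma is a vector database",
]

# 2. Convert text into embeddings
toy_doc_vectors = [embed(d) for d in toy_docs]

# 3. Retrieve relevant context: pick the document most similar to the question
toy_question = "Who supplies SaaS applications"
toy_q_vec = embed(toy_question)
best_index = max(
    range(len(toy_docs)),
    key=lambda i: similarity(toy_q_vec, toy_doc_vectors[i]),
)
toy_context = toy_docs[best_index]

# 4. Generate an answer: here we only assemble the prompt an LLM would receive
toy_prompt = f"Context:\n{toy_context}\n\nQuestion:\n{toy_question}\n\nAnswer:"
print(toy_prompt)
```

The rest of the notebook replaces each stand-in with a real component: PDF loading, a sentence-transformer embedding model, a Chroma vector store, and Gemini.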

## Install Dependencies



In [1]:
!pip install langchain langchain-google-genai langchain_community langchain_core chromadb sentence-transformers pypdf


## Importing necessary libraries

In [1]:
from langchain_google_genai import GoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

In [2]:
import os
from getpass import getpass

os.environ["GOOGLE_API_KEY"] = getpass("Google API key: ")  # never hardcode keys in a shared notebook

## Our Knowledge Base

Upload the PDF files

In [3]:
from google.colab import files

uploaded = files.upload()

Saving RAG_PDF2.pdf to RAG_PDF2.pdf
Saving RAG_PDF1.pdf to RAG_PDF1.pdf


Load each PDF as Documents

In [4]:
from langchain_community.document_loaders import PyPDFLoader

all_documents = []

for pdf_path in uploaded.keys():
    loader = PyPDFLoader(pdf_path)
    docs = loader.load()
    all_documents.extend(docs)



Split into Chunks

In [5]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=100
)

documents = text_splitter.split_documents(all_documents)
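
To see what `chunk_size` and `chunk_overlap` mean, here is a minimal sketch of overlapping fixed-size windows over a string. It is a simplification under stated assumptions: `sliding_window_chunks` is a hypothetical helper, and `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraph breaks, newlines, and spaces, so its real chunks will differ.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap.

    Illustration only: the real splitter also prefers to cut at
    natural separators rather than at arbitrary character offsets.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(demo_chunks)
# ['abcdefghij', 'ghijklmnop', 'mnopqrstuv', 'stuvwxyz']
```

Each chunk repeats the last 4 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk. The notebook's settings (500 and 100) apply the same idea at a larger scale.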

## Create Embeddings + Store in Chroma

In [6]:
embeddings = HuggingFaceEmbeddings(
    model_name="all-MiniLM-L6-v2"
)


In [7]:
vectorstore = Chroma.from_documents(
    documents=documents,
    embedding=embeddings
)

## Retriever (Semantic Search)

In [8]:
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
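
Conceptually, a retriever with `k=2` embeds the query and returns the two stored chunks whose vectors are closest to it. The sketch below shows that idea with cosine similarity over tiny hand-written vectors. The vectors and the `top_k` helper are hypothetical, and cosine similarity is an assumption chosen for illustration: the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical 2-dimensional "embeddings" for three chunks
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_query = [1.0, 0.0]

top_indices = top_k(toy_query, toy_vectors, k=2)
print(top_indices)  # [0, 1]
```

A real vector store avoids scoring every chunk by using an approximate nearest-neighbor index, but the ranking idea is the same.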

In [15]:
query = "Explain the 'Software as a Service' (SaaS) model in simple terms based on the document"

retrieved_docs = retriever.invoke(query)

In [16]:
retrieved_docs

[Document(metadata={'creator': 'PyPDF', 'page_label': '20', 'producer': 'Mac OS X 10.12.6 Quartz PDFContext', 'moddate': "D:20220125183455Z00'00'", 'total_pages': 32, 'creationdate': "D:20220125183455Z00'00'", 'source': 'RAG_PDF2.pdf', 'page': 19}, page_content='Software-as-a-Service (SaaS)§Applications are supplied by the service provider.§The user does not manage or control the underlying Cloud infrastructure or individual application capabilities. §Services offered include: §Enterprise services such as: workflow management, communications, digital signature, customer relationship management (CRM), desktop software, financial management, geo-spatial, and search.§Not suitable for real-time applications or for those where data is not allowed to be'),
 Document(metadata={'source': 'RAG_PDF2.pdf', 'creator': 'PyPDF', 'page': 16, 'page_label': '17', 'creationdate': "D:20220125183455Z00'00'", 'moddate': "D:20220125183455Z00'00'", 'producer': 'Mac OS X 10.12.6 Quartz PDFContext', 'total_pag

In [17]:
len(retrieved_docs)

2

## Gemini LLM

In [18]:
llm = GoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0
)

## Final RAG Answer

In [19]:
context = "\n".join(doc.page_content for doc in retrieved_docs)

prompt = f"""
You are an assistant that answers questions using the given context.

Rules:
- Use ONLY the information in the context.
- Do NOT use outside knowledge.
- If the context is insufficient, ask the user to rephrase the question
  or provide more information.

Answer Guidelines:
- Write a clear and complete explanation.
- Use 3–5 sentences.
- Prefer simple language suitable for a beginner.

Context:
{context}

Question:
{query}

Answer:
"""

In [20]:
response = llm.invoke(prompt)
print(response)

Software as a Service (SaaS) is a cloud delivery model where applications are supplied by a service provider. In this model, the user does not manage or control the underlying cloud infrastructure or individual application capabilities. SaaS offers various enterprise services, such as workflow management, customer relationship management (CRM), and financial management. However, it is not suitable for real-time applications.

