# From LLMs to RAG (Student Exercise - Colab Version)

<a href="https://colab.research.google.com/github/suthekshan/Agentic-Ai-Foundations/blob/main/04_LLM_RAG/04_LLM_RAG_colab_student_version.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook is a 'tunable' version of the retrieval-augmented generation (RAG) implementation, designed for Google Colab.
**Your Task:** The logic is implemented for you, but the **parameters** are missing or need tuning. 
Fill in the variables marked with `___` to get the code running, then experiment with different values to optimize the results.

## 1. Installation
Since we are in Colab, we need to install the necessary libraries first.

In [None]:
!pip install -q langchain langchain-community langchain-chroma langchain-text-splitters langchain-groq langchain-huggingface sentence-transformers pypdf python-dotenv

## 2. API Key Setup
Enter your Groq API key below.

In [None]:
import os
from google.colab import userdata

# Best practice in Colab: Use the "Secrets" feature (key icon on the left).
# Name your secret 'GROQ_API_KEY'.
try:
    os.environ["GROQ_API_KEY"] = userdata.get('GROQ_API_KEY')
except Exception:
    # Fallback if the secret is missing or this notebook has no access to it
    import getpass
    os.environ["GROQ_API_KEY"] = getpass.getpass("Enter your GROQ_API_KEY: ")

## 3. Download Data
We will download the sample PDF directly from the repository so you don't have to upload it manually.
You can also try downloading any other PDF; if you do, update `pdf_path` in the next section to match the file name.

In [None]:
!wget https://raw.githubusercontent.com/suthekshan/Agentic-Ai-Foundations/main/04_LLM_RAG/pdf1.pdf

## 4. Document Loading
We use `PyPDFLoader` to load the PDF.

In [None]:
from langchain_community.document_loaders import PyPDFLoader

pdf_path = "pdf1.pdf"

# Logic is provided, just run it
loader = PyPDFLoader(pdf_path)
documents = loader.load()

# Check how many pages we have
if documents:
    print(f"Loaded {len(documents)} pages")

## 5. Text Chunking (Tunable)

This step strongly affects retrieval quality. You define how large each chunk is and how much adjacent chunks overlap, both measured in characters.

In [None]:
from langchain_text_splitters import CharacterTextSplitter

# TODO: Tuning Step 1
# Experiment with different chunk sizes (e.g., 500, 1000, 2000)
chunk_size_val = ___ 

# TODO: Tuning Step 2
# Experiment with overlap (e.g., 0, 100, 200). Some overlap helps preserve context across chunk boundaries.
chunk_overlap_val = ___

# The logic uses your parameters:
text_splitter = CharacterTextSplitter(
    chunk_size=chunk_size_val,
    chunk_overlap=chunk_overlap_val
)

texts = text_splitter.split_documents(documents)

if texts:
    print(f"Created {len(texts)} chunks")
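
To build intuition for the two parameters, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is a simplified illustration and not how `CharacterTextSplitter` works internally (that class splits on a separator first and then merges the pieces); `sliding_chunks` is a hypothetical helper name and the sample text is made up.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
no_overlap = sliding_chunks(sample, chunk_size=10, chunk_overlap=0)
with_overlap = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(len(no_overlap), no_overlap)
print(len(with_overlap), with_overlap)
```

More overlap produces more chunks and repeats text at the boundaries. That costs storage and embedding time, but lowers the chance that a sentence is cut in half between two chunks.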

## 6. Embeddings
Select the embedding model.

In [None]:
from langchain_huggingface import HuggingFaceEmbeddings

# TODO: Specify the model name
# Common option: "sentence-transformers/all-MiniLM-L6-v2"
# Explore any other model from https://huggingface.co/models?pipeline_tag=sentence-transformers&sort=downloads
model_name_val = "___"

embeddings = HuggingFaceEmbeddings(
    model_name=model_name_val
)
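
An embedding model maps each piece of text to a vector of numbers, and retrieval works by comparing those vectors. The sketch below shows cosine similarity, one common way to compare them. The vectors are made-up toy values, not real model outputs, and the vector store may use a different distance metric by default.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (made-up numbers for illustration only).
query_vec = [1.0, 0.0, 1.0]
similar_vec = [2.0, 0.0, 2.0]    # same direction as the query, different length
unrelated_vec = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query_vec, similar_vec))
print(cosine_similarity(query_vec, unrelated_vec))
```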

## 7. Vector Store
Initialize ChromaDB with your parameters.

In [None]:
from langchain_chroma import Chroma

# TODO: Name your collection
collection_name_val = "___"

# Logic to create/load the vector store
vector_store = Chroma(
    collection_name=collection_name_val,
    embedding_function=embeddings,
    persist_directory="./chroma_db_colab" 
)

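# Only embed and store the chunks if the collection is empty, so re-running this cell
# does not add duplicates. If you change the chunking or embedding settings, use a new
# collection name; otherwise the previously stored chunks are reused.
if vector_store._collection.count() == 0:
    vector_store.add_documents(texts)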

## 8. Retriever (Tunable)
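The answer quality is highly sensitive to `k`, the number of chunks retrieved: too few may miss the relevant passage, while too many adds irrelevant text to the prompt.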

In [None]:
# TODO: Tuning Step 3
# Set 'k' - the number of documents to retrieve.
# Try small values (1) and larger values (5).
k_val = ___ 

retriever = vector_store.as_retriever(search_kwargs={"k": k_val})
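
Conceptually, the retriever scores every stored chunk against the query and keeps the `k` best. The sketch below shows only that selection step. The scores and chunk labels are made up, and `top_k` is a hypothetical helper; a real retriever computes the scores from embeddings.

```python
import heapq


def top_k(scores: dict[str, float], k: int) -> list[str]:
    """Return the k chunk labels with the highest similarity score, best first."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return heapq.nlargest(k, scores, key=scores.get)


# Made-up similarity scores between one query and four stored chunks.
chunk_scores = {
    "chunk about pricing": 0.82,
    "chunk about refunds": 0.64,
    "chunk about shipping": 0.31,
    "table of contents": 0.12,
}

print(top_k(chunk_scores, k=1))
print(top_k(chunk_scores, k=3))
```

With `k=1` the answer rests on a single chunk. With a larger `k`, lower-scoring chunks also enter the prompt, which can help or add noise depending on the question.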

## 9. LLM Setup (Tunable)
Configure the LLM generation parameters.

In [None]:
from langchain_groq import ChatGroq

# TODO: Tuning Step 4
# Experiment with temperature (0.0 = near-deterministic, 1.0 = more varied/creative)
temperature_val = ___ 

llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=temperature_val
)
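
Temperature rescales the model's token scores (logits) before a token is sampled. The sketch below uses the common softmax-with-temperature formulation on made-up logits. How a provider handles a temperature of exactly 0 is an implementation detail; here it is treated as always picking the top-scoring token, which is an assumption of this sketch.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Assumption for this sketch: treat 0 as greedy decoding (all mass on the top score).
        best = logits.index(max(logits))
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, probability shifts from the top token toward the alternatives, which is why higher values give more varied answers.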

## 10. Prompt Template (Tunable)
The prompt instructions can be changed to improve results.

In [None]:
from langchain_core.prompts import PromptTemplate

# TODO: Tuning Step 5
# Define the prompt template. It MUST include {context} and {query}.
prompt_template_str = """
Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context:
{context}

Question:
{query}

Answer:
"""

prompt = PromptTemplate.from_template(prompt_template_str)
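
At run time the two placeholders are filled in: `{context}` with the retrieved chunks and `{query}` with the user's question. The sketch below uses Python's built-in `str.format` as a stand-in to show the resulting text. The shortened template, the context, and the question are all made up for illustration.

```python
# Shortened stand-in template with the same two placeholders.
template = """Answer using only the context below.

Context:
{context}

Question:
{query}

Answer:
"""

# Made-up values standing in for retrieved chunks and a user question.
example_context = (
    "Chunk 1: The warranty lasts two years.\n\n"
    "Chunk 2: Returns are accepted within 30 days."
)
example_query = "How long is the warranty?"

filled = template.format(context=example_context, query=example_query)
print(filled)
```

Printing the filled prompt like this is a quick way to check what the model actually receives when you change the instructions.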

## 11. RAG Chain Construction
The pipeline logic is pre-built for you.

In [None]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# The Chain Logic
rag_chain = (
    {"context": retriever | format_docs, "query": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
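
The chain passes data through four steps: retrieve, build the prompt, call the model, and extract the text. The plain-Python sketch below mirrors that data flow with hypothetical stand-ins (`fake_retriever`, `fake_llm`, `join_chunks`, `build_prompt`) in place of the real vector store and model. It illustrates only the order of operations, not the LangChain internals.

```python
def fake_retriever(query: str) -> list[str]:
    """Hypothetical stand-in: a real retriever would return the k most similar chunks."""
    return ["Paris is the capital of France.", "France is in Europe."]


def join_chunks(chunks: list[str]) -> str:
    """Join retrieved chunks into one context string, separated by blank lines."""
    return "\n\n".join(chunks)


def build_prompt(context: str, query: str) -> str:
    """Fill a minimal prompt with the context and the question."""
    return f"Context:\n{context}\n\nQuestion:\n{query}\n\nAnswer:\n"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in: a real LLM would generate an answer from the prompt."""
    return f"(model output for a prompt of {len(prompt)} characters)"


def run_pipeline(query: str) -> str:
    # Step 1: the query goes to the retriever and is also passed through unchanged.
    inputs = {"context": join_chunks(fake_retriever(query)), "query": query}
    # Steps 2-4: fill the prompt, call the model, return plain text.
    return fake_llm(build_prompt(**inputs))


print(run_pipeline("What is the capital of France?"))
```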

## 12. Testing
Run the chain with your parameters.

In [None]:
question = "What is the main topic of the document?"

# Invoke the chain
response = rag_chain.invoke(question)

print(response)