# Building a Naive RAG Application with LangChain and Groq 📚

This notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) system using LangChain and Groq. We'll create a PDF question-answering system that can:
1. Process PDF documents
2. Generate embeddings
3. Perform similarity search
4. Generate context-aware responses

## Prerequisites
- A Groq API key (get one from [Groq Console](https://console.groq.com))
- Python 3.9+
- Google Colab or local Jupyter environment

## Setup Instructions
1. Install the required packages
2. Set your Groq API key in the environment
3. Use the sample PDF URL provided below for testing, or substitute your own

In [None]:
# Install required packages
!pip install -q langchain-core langchain-community langchain-groq langchain-huggingface faiss-cpu pypdf python-dotenv requests

In [None]:
# Import necessary libraries
import os
from typing import List

# LangChain components
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.documents import Document
from langchain_groq import ChatGroq
from langchain_huggingface import HuggingFaceEmbeddings

# Utilities for downloading the PDF to a temporary file
import requests
import tempfile

## Set Environment Variables

In [None]:
import os
from google.colab import userdata

os.environ['GROQ_API_KEY'] = userdata.get('GROQ_API_KEY')

print("Groq API key has been set!")

Groq API key has been set!
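
The cell above relies on `google.colab`, which is only importable inside Colab. For a local Jupyter environment, a minimal sketch of a fallback is shown below. The helper name `resolve_api_key` is hypothetical (it is not part of this notebook or of any library), and the sketch assumes the key has been exported as an environment variable, for example via a `.env` file loaded with `python-dotenv`.

```python
import os


def resolve_api_key(name: str = "GROQ_API_KEY") -> str:
    """Return the key from Colab secrets if available, else from the environment.

    Hypothetical helper for illustration only.
    """
    try:
        # Only importable inside Google Colab
        from google.colab import userdata
        value = userdata.get(name)
    except ImportError:
        # Local Jupyter: fall back to an environment variable
        value = os.environ.get(name)

    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the notebook")
    return value
```

With this in place, `os.environ['GROQ_API_KEY'] = resolve_api_key()` would work in both environments.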


## Step 1: Document Processing

First, we'll create functions to load and process PDF documents. The process includes:
1. Loading the PDF
2. Splitting text into manageable chunks
3. Creating embeddings and building a vector store

In [None]:
# def load_pdf(file_path: str) -> List[Document]:
#     """Load a PDF file and convert it to a list of documents."""
#     loader = PyPDFLoader(file_path)
#     documents = loader.load()
#     print(f"Loaded PDF with {len(documents)} pages")
#     return documents

def load_pdf(url: str) -> List[Document]:
    """Load a PDF file from a URL and convert it to a list of documents."""
    if not url.lower().endswith(".pdf"):
        print("Error: Please provide a valid PDF URL ending with '.pdf'")
        return []

    # Download the PDF; the timeout keeps a stalled request from hanging the cell
    response = requests.get(url, timeout=30)
    response.raise_for_status()

    # Write the content to a temporary file, since PyPDFLoader reads from a path
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as temp_file:
        temp_file.write(response.content)
        temp_file_path = temp_file.name

    try:
        loader = PyPDFLoader(temp_file_path)
        documents = loader.load()
        print(f"Loaded PDF with {len(documents)} pages")
    finally:
        # Remove the temporary file even if parsing fails
        os.remove(temp_file_path)

    return documents

def create_vector_store(documents: List[Document]) -> FAISS:
    """Create a FAISS vector store from documents."""
    # Initialize text splitter
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len
    )

    # Split documents into chunks
    docs = text_splitter.split_documents(documents)
    print(f"Split documents into {len(docs)} chunks")

    # Initialize embeddings (using HuggingFaceEmbeddings here)
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2"
    )

    # Create vector store with FAISS
    vector_store = FAISS.from_documents(docs, embeddings)
    print("Vector store created successfully")
    return vector_store
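
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified sketch of overlapping chunking in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also prefers to break on separators such as paragraphs and sentences); the function name `sliding_window_chunks` is hypothetical and only illustrates how consecutive chunks share text.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks


sample = "abcdefghij" * 250  # 2,500 characters of made-up text
chunks = sliding_window_chunks(sample)
print([len(chunk) for chunk in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.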

## Step 2: Process a PDF

Load a PDF document from a URL and create a vector store for content retrieval.

The example URL below points to the paper "Attention Is All You Need". Replace it with the URL of any PDF you want to query.
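
When you swap in your own URL, note that `load_pdf` only accepts URLs whose full string ends in `.pdf`, so a link such as `paper.pdf?download=1` is rejected. A slightly more forgiving check, sketched below, inspects only the path component. The function `looks_like_pdf_url` and the `example.com` URLs are hypothetical, and a suffix check remains a heuristic either way.

```python
from urllib.parse import urlparse


def looks_like_pdf_url(url: str) -> bool:
    """Heuristic: an http(s) URL whose path (ignoring query and fragment) ends in .pdf."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and parsed.path.lower().endswith(".pdf")


print(looks_like_pdf_url("https://arxiv.org/pdf/1706.03762.pdf"))
print(looks_like_pdf_url("https://example.com/paper.pdf?download=1"))
print(looks_like_pdf_url("https://example.com/index.html"))
```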

In [None]:
# Example PDF: "Attention Is All You Need". Replace with your own PDF URL.
pdf_url = "https://arxiv.org/pdf/1706.03762.pdf"


# Process the document
documents = load_pdf(pdf_url)
vector_store = create_vector_store(documents)

Loaded PDF with 15 pages
Split documents into 52 chunks
Vector store created successfully


## Step 3: Initialize Groq LLM

We'll use Groq's `deepseek-r1-distill-llama-70b` model to generate responses.

In [None]:
def init_llm():
    """Initialize the Groq LLM."""
    return ChatGroq(
        temperature=0.0,  # Lower temperature for focused responses
        model_name="deepseek-r1-distill-llama-70b"
    )

llm = init_llm()
print("Groq LLM initialized.")

Groq LLM initialized.


## Step 4: Create the Question-Answering Function

This function:
1. Takes a user question
2. Retrieves relevant context from the vector store
3. Generates a response using the LLM


In [None]:
from langchain_core.messages import BaseMessage

def get_answer(question: str, vector_store: FAISS, llm: ChatGroq) -> BaseMessage:
    """Answer a question using RAG; the answer text is in the returned message's `.content`."""
    # Retrieve the k chunks most similar to the question
    relevant_docs = vector_store.similarity_search(question, k=4)
    print(f"Found {len(relevant_docs)} relevant documents")

    # Assemble the context from the retrieved chunks
    context = "\n\n".join(doc.page_content for doc in relevant_docs)

    # Construct the prompt
    prompt = "Answer the question based only on the following context:\n\n"
    prompt += f"Context:\n{context}\n\n"
    prompt += f"Question: {question}\n\n"
    prompt += "Answer the question concisely and accurately. If you cannot answer the question based on the context, say \"I cannot answer this question based on the provided context.\"\n\n"
    prompt += "Answer:"

    # Get the response from the LLM
    response = llm.invoke(prompt)
    return response
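
To inspect or tweak the prompt without calling the API, the assembly step can be pulled out into a pure function. The sketch below mirrors the prompt wording used in `get_answer`; the name `build_prompt` and the sample question and chunks are made up for illustration.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble the RAG prompt from a question and the retrieved chunk texts."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question based only on the following context:\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Answer the question concisely and accurately. If you cannot answer the question "
        'based on the context, say "I cannot answer this question based on the provided context."\n\n'
        "Answer:"
    )


# Made-up inputs, just to show the resulting layout
prompt = build_prompt("What is attention?", ["chunk one", "chunk two"])
print(prompt)
```

Keeping prompt construction separate from retrieval and generation makes it easier to compare prompt variants side by side.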

## Step 5: Try It Out!

Now you can ask questions about your PDF document. Let's try a few examples.

In [None]:
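import html
from IPython.display import display, HTML

def show_rag_output(question, response):
    """Render the question, the model's reasoning, and the final answer as HTML."""
    content = response.content  # text of the message returned by the LLM

    # Separate the <think>...</think> reasoning section, if present, from the answer
    if "<think>" in content and "</think>" in content:
        think_text = content.split("<think>")[1].split("</think>")[0].strip()
        # Only look for "Answer:" after the reasoning, so a match inside it is ignored
        answer_text = content.split("</think>")[-1].strip()
        if "Answer:" in answer_text:
            answer_text = answer_text.split("Answer:")[-1].strip()
    else:
        think_text = ""
        answer_text = content.strip()

    # Escape the text so characters such as "<" render literally in the HTML
    question, think_text, answer_text = (
        html.escape(text) for text in (question, think_text, answer_text)
    )
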
    html_content = f"""
    <div style="border:1px solid #ccc; padding:15px; margin:15px 0; border-radius:8px; max-height:500px; overflow:auto; font-family:Arial;">
        <h2 style="margin-top:0; color:#333;">Question:</h2>
        <p style="font-size:16px;">{question}</p>
        <hr style="border:1px solid #eee;">
        <h2 style="margin-top:0; color:#555;">Thinking Process:</h2>
        <pre style="background:#f9f9f9; border:1px solid #ddd; padding:10px; border-radius:5px; font-size:14px; white-space:pre-wrap;">{think_text}</pre>
        <hr style="border:1px solid #eee;">
        <h2 style="margin-top:0; color:#555;">Final Answer:</h2>
        <pre style="background:#e8f4ff; border:1px solid #cce7ff; padding:10px; border-radius:5px; font-size:14px; white-space:pre-wrap;">{answer_text}</pre>
    </div>
    """
    display(HTML(html_content))

# Assumes vector_store and llm were initialized in the cells above
questions = [
    # "What is the main topic of this document?",
    "What are the key findings discussed in the document?",
    # "Can you summarize the conclusion?",
]

for question in questions:
    response = get_answer(question, vector_store, llm)
    show_rag_output(question, response)

Found 4 relevant documents
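
The `<think>` handling in `show_rag_output` can also be checked offline on sample strings. Below is a minimal regex-based sketch; `split_thinking` is a hypothetical helper, and the sample outputs are made up rather than real model responses.

```python
import re

# Non-greedy match of the first <think>...</think> section, across newlines
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output."""
    match = THINK_PATTERN.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything after the closing tag is the answer; drop a leading "Answer:" label
    answer = text[match.end():].strip().removeprefix("Answer:").strip()
    return reasoning, answer


sample = "<think>\nCheck the context.\n</think>\n\nAnswer: The Transformer."
print(split_thinking(sample))
print(split_thinking("Plain reply"))
```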


## Understanding the Flow

1. **Document Processing**:
   - PDF is loaded and split into pages
   - Text is chunked into smaller pieces
   - Each chunk is converted to embeddings

2. **Vector Store**:
   - Embeddings are stored in FAISS for fast similarity search
   - Retrieves relevant context for the given question

3. **Question Answering**:
   - The question is embedded, and the most similar chunks are retrieved and joined to form the context
   - The LLM generates an answer based only on that context
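
The similarity search in step 2 can be pictured as a nearest-neighbor lookup over vectors. The sketch below does this by brute force on tiny made-up 3-dimensional vectors, assuming Euclidean (L2) distance, which to our knowledge is the default for LangChain's FAISS wrapper. Real embeddings from `all-mpnet-base-v2` have 768 dimensions, and FAISS uses optimized index structures rather than this loop; `top_k` is a hypothetical name.

```python
import math


def top_k(query: list[float], vectors: list[list[float]], k: int = 4) -> list[int]:
    """Return the indices of the k vectors closest to the query by L2 distance."""
    scored = sorted((math.dist(query, vector), index) for index, vector in enumerate(vectors))
    return [index for _, index in scored[:k]]


# Toy "embeddings" for four chunks (made up for illustration)
chunk_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query_vector = [1.0, 0.0, 0.0]
print(top_k(query_vector, chunk_vectors, k=2))
```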

## Next Steps

- Experiment with different chunk sizes
- Adjust the number of retrieved documents (k value)
- Modify prompt templates and temperature settings to optimize responses
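
When experimenting with chunk size and `k` together, it helps to keep an eye on how much text ends up in the prompt. The sketch below enumerates combinations and estimates an upper bound on context length as `chunk_size * k` characters (ignoring the separators between chunks). The function name `context_budget` and the candidate values are assumptions chosen for illustration.

```python
from itertools import product


def context_budget(chunk_sizes: list[int], k_values: list[int]) -> list[tuple[int, int, int]]:
    """Return (chunk_size, k, max_context_chars) for every combination."""
    return [(size, k, size * k) for size, k in product(chunk_sizes, k_values)]


for size, k, chars in context_budget([500, 1000], [2, 4]):
    print(f"chunk_size={size:>4}  k={k}  max context ~ {chars} characters")
```

Each combination would then be run through `create_vector_store` and `get_answer` with the corresponding settings to compare answer quality.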

## Common Issues and Solutions

1. **API Key Errors**: Ensure `GROQ_API_KEY` is set correctly before initializing the LLM
2. **Memory or Context-Length Errors**: Reduce the chunk size or the number of retrieved documents (`k`)
3. **Poor Responses**: Refine the prompt or increase the amount of retrieved context (`k`)

Happy exploring!