# RAG Q&A with Google Gemini 2.5 Flash

This notebook walks through a Retrieval-Augmented Generation (RAG) pipeline over a single PDF:
1. Load a PDF document
2. Split it into overlapping chunks
3. Embed the chunks and build a FAISS vector store
4. Test retrieval
5. Set up Gemini 2.5 Flash and define a Q&A helper
6. Ask questions

## Step 1: Load PDF Document

In [None]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("Transformers.pdf")
data = loader.load()
print(f"Loaded {len(data)} pages")
data[0]

## Step 2: Split Documents into Chunks

In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = text_splitter.split_documents(data)

print(f"Total number of chunks: {len(docs)}")
docs[0]
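
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunking. It is not how `RecursiveCharacterTextSplitter` works internally (the library prefers to break on separators such as paragraph breaks and newlines before falling back to individual characters). `sliding_chunks` is a hypothetical helper that only illustrates how consecutive chunks share text.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Toy input (an assumption for illustration): 30 characters, windows of 10 with 4 shared
sample = "abcdefghij" * 3
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(repr(chunk))
```

With `chunk_size=1000` and `chunk_overlap=200` as in the cell above, the same idea means neighbouring chunks can share up to 200 characters, so a sentence that straddles a boundary is less likely to be cut off from its context.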

## Step 3: Create Embeddings & Vector Store

This step embeds each chunk with the Hugging Face `sentence-transformers/all-MiniLM-L6-v2` model (384-dimensional vectors) and indexes the vectors in a FAISS vector store.

In [None]:
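# Note: newer LangChain releases deprecate this import path in favor of
# `from langchain_huggingface import HuggingFaceEmbeddings` (package: langchain-huggingface).
from langchain_community.embeddings import HuggingFaceEmbeddings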
from langchain_community.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Test embedding
vector = embeddings.embed_query("hello, world!")
print(f"Embedding dimension: {len(vector)}")
vector[:5]

In [None]:
vectorstore = FAISS.from_documents(documents=docs, embedding=embeddings)
print("Vector store created successfully!")
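
Conceptually, a similarity search compares the query vector with every stored vector and returns the closest ones. The sketch below does this by brute force over toy 3-dimensional vectors using Euclidean (L2) distance, which is assumed here to match the default metric of LangChain's FAISS wrapper. `l2_distance`, `top_k`, and the toy data are illustrative assumptions, not part of the notebook's pipeline.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query: list[float], vectors: list[list[float]], k: int) -> list[int]:
    """Return the indices of the k stored vectors closest to the query."""
    scored = [(l2_distance(query, vec), idx) for idx, vec in enumerate(vectors)]
    scored.sort()  # smallest distance first
    return [idx for _, idx in scored[:k]]


# Toy "embeddings" (made up for illustration; real ones have 384 dimensions)
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.05, 0.0]

nearest = top_k(toy_query, toy_vectors, k=2)
print(nearest)
```

FAISS exists to make this lookup fast when there are many vectors; the result it returns for a flat index is the same kind of "k nearest" list.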

## Step 4: Test Retrieval

In [None]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10})

retrieved_docs = retriever.invoke("What is the main topic of this paper?")
print(f"Retrieved {len(retrieved_docs)} chunks")

# Preview first 3 chunks
for i, doc in enumerate(retrieved_docs[:3]):
    print(f"\n--- Chunk {i+1} (Page {doc.metadata.get('page', '?')}) ---")
    print(doc.page_content[:200] + "...")

## Step 5: Set Up Gemini 2.5 Flash & Define the Q&A Helper

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
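from dotenv import load_dotenv

# Load environment variables (e.g. GOOGLE_API_KEY) from a local .env file
load_dotenv()

# Low temperature keeps answers close to the retrieved context; max_tokens caps the response length
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.3, max_tokens=500)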
print("Gemini 2.5 Flash LLM ready!")
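
If the model cell fails with an authentication error, the usual cause is a missing API key. The sketch below is one way to check for it up front. It assumes the key is supplied through the `GOOGLE_API_KEY` environment variable (for example via the `.env` file); `require_env` is a hypothetical helper, not part of any library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it before running this notebook."
        )
    return value


# Usage (run after load_dotenv(); left commented out so the cell does not fail without a key):
# require_env("GOOGLE_API_KEY")
```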

In [None]:
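def ask_question(question: str) -> str:
    """Answer a question with the RAG pipeline: retrieve context, then prompt the LLM."""
    # Retrieve the chunks most similar to the question (k=10, set on the retriever above)
    retrieved = retriever.invoke(question)
    context = "\n\n".join(doc.page_content for doc in retrieved)
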
    # Build prompt
    prompt = f"""You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, say that you don't know.
Use three sentences maximum and keep the answer concise.

Context: {context}

Question: {question}

Answer:"""
    
    response = llm.invoke(prompt)
    return response.content

print("ask_question() function ready!")
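
One possible refinement is to label each chunk with its index and page number when building the context, so an answer can be traced back to the source pages. The sketch below shows only the prompt-building part and runs without the retriever or the LLM. `Chunk` is a hypothetical stand-in for a LangChain document (it mirrors only `page_content` and `metadata`), and `format_context` and `build_prompt` are illustrative names, not functions from the notebook.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(chunks: list[Chunk]) -> str:
    """Join chunks into one context string, tagging each with its index and page."""
    parts = []
    for i, chunk in enumerate(chunks, start=1):
        page = chunk.metadata.get("page", "?")  # "?" when no page metadata exists
        parts.append(f"[{i}] (page {page})\n{chunk.page_content}")
    return "\n\n".join(parts)


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Assemble a prompt with the same instructions as ask_question()."""
    return (
        "You are an assistant for question-answering tasks.\n"
        "Use the following pieces of retrieved context to answer the question.\n"
        "If you don't know the answer, say that you don't know.\n"
        "Use three sentences maximum and keep the answer concise.\n\n"
        f"Context:\n{format_context(chunks)}\n\n"
        f"Question: {question}\n\n"
        "Answer:"
    )


# Placeholder chunks (made up for illustration, not taken from the PDF)
demo_chunks = [
    Chunk("First placeholder chunk.", {"page": 2}),
    Chunk("Second placeholder chunk without page metadata."),
]
demo_prompt = build_prompt("What is this placeholder about?", demo_chunks)
print(demo_prompt)
```

Keeping prompt construction in its own function also makes it easy to inspect exactly what is sent to the model before spending an API call.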

## Step 6: Ask Questions!

In [None]:
answer = ask_question("What is the main contribution of this paper?")
print(answer)

In [None]:
answer = ask_question("What methods or models are discussed in this paper?")
print(answer)

In [None]:
answer = ask_question("What are the key results or findings?")
print(answer)