<a href="https://colab.research.google.com/github/anuradha1105/RAG-Assignment/blob/main/RAG_Quickstart_LangChain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Retrieval-Augmented Generation (RAG) — Zero-to-One Notebook
**Stack:** LangChain • OpenAI (swap-friendly) • Chroma (FAISS optional)

This is a **step-by-step** notebook for absolute beginners. Run each cell **top to bottom**.  
It creates a tiny sample knowledge base so you can get a working demo **without any extra files**.



## ✅ What you'll do
1. Install libraries  
2. Set API keys  
3. Create sample documents (or point to your own)  
4. Split into chunks  
5. Build a **Chroma** vector store (FAISS optional)  
6. Create a retriever + LLM chain  
7. Ask grounded questions  
8. (Optional) Inspect retrieved sources, try FAISS, or switch providers (Claude/Cohere/Bedrock)

**Screenshots to capture:**
- Installation success
- Vector store summary (num docs/chunks)
- First grounded answer
- (Optional) FAISS run + answer


## 1) Install libraries

In [None]:

# If in Colab, this will take ~1-2 minutes
!pip -q install --upgrade pip
!pip -q install "langchain>=0.2.11" "langchain-community>=0.2.9" "langchain-openai>=0.1.7" "chromadb>=0.5.3" "tiktoken>=0.7.0" "pypdf>=4.2.0" "faiss-cpu>=1.8.0.post1" "langchain-anthropic>=0.1.19" "langchain-cohere>=0.1.9"
print("✅ Installs complete")



## 2) Set your API key(s)

- Default path uses **OpenAI** (you can swap later).  
- Replace `"paste-your-openai-key-here"` with your real key from https://platform.openai.com/.  
- For Claude/Cohere, uncomment and paste keys.


In [None]:

import os
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY", "paste-your-openai-key-here")

# Optional: other providers
# os.environ["ANTHROPIC_API_KEY"] = "paste-your-claude-key-here"
# os.environ["COHERE_API_KEY"] = "paste-your-cohere-key-here"
print("🔑 Keys set (replace placeholders before running LLM cells).")
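Because the cell above falls back to a placeholder string, a forgotten key only surfaces later as an authentication error. The sketch below is one possible pre-flight check; the helper name `key_is_configured` and the `PLACEHOLDER_PREFIX` constant are hypothetical and not part of any library.

```python
import os

# Assumption: every placeholder in this notebook starts with "paste-your-".
PLACEHOLDER_PREFIX = "paste-your-"

def key_is_configured(name: str) -> bool:
    """Return True if env var `name` holds a non-empty, non-placeholder value."""
    value = os.environ.get(name, "").strip()
    return bool(value) and not value.startswith(PLACEHOLDER_PREFIX)

# Report the status of each provider key without printing the key itself.
for name in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "COHERE_API_KEY"):
    status = "set" if key_is_configured(name) else "missing or placeholder"
    print(f"{name}: {status}")
```

Only the OpenAI key needs to report `set` for the default path; the other two matter only if you switch providers later.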



## 3) Create a tiny sample knowledge base (you can also add your own .txt/.pdf later)

Any `.txt` or `.pdf` files you add to the `data_rag_demo/` folder are loaded automatically in step 4.


In [None]:

from pathlib import Path
DATA_DIR = Path("data_rag_demo")
DATA_DIR.mkdir(exist_ok=True)

samples = {
    "rag_intro.txt": """Retrieval-Augmented Generation (RAG) couples information retrieval with text generation.
The retriever pulls relevant chunks from a knowledge base; the LLM uses them to produce grounded answers.""",

    "vector_stores.txt": """Vector stores like Chroma and FAISS index embeddings so we can search by similarity.
They store pairs of (vector, metadata) to find context quickly at query time.""",

    "best_practices.txt": """Good RAG practice: reasonable chunk sizes with overlap, clean text, track sources,
persist your DB for reuse, and evaluate faithfulness & relevance of answers."""
}
for name, text in samples.items():
    (DATA_DIR / name).write_text(text, encoding="utf-8")

print("✅ Sample files:", [p.name for p in DATA_DIR.iterdir()])



## 4) Load and split documents into chunks
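We use `RecursiveCharacterTextSplitter`, which tries to split on paragraph breaks first, then line breaks, then spaces, and only then individual characters, so chunks tend to end at natural boundaries. With the settings below, each chunk holds at most 500 characters and can share up to 100 characters with its neighbor.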


In [None]:

from langchain_community.document_loaders import DirectoryLoader, TextLoader, PyPDFLoader
# langchain-text-splitters is installed as a dependency of langchain
from langchain_text_splitters import RecursiveCharacterTextSplitter

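# Read text files as UTF-8 to match how the sample files were written
txt_loader = DirectoryLoader(str(DATA_DIR), glob="**/*.txt", loader_cls=TextLoader, loader_kwargs={"encoding": "utf-8"}, show_progress=True)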
docs_txt = txt_loader.load()

pdf_loader = DirectoryLoader(str(DATA_DIR), glob="**/*.pdf", loader_cls=PyPDFLoader, show_progress=True)
docs_pdf = pdf_loader.load()

docs = docs_txt + docs_pdf
print(f"📄 Loaded {len(docs)} documents")

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
splits = splitter.split_documents(docs)
print(f"🔪 Created {len(splits)} chunks")
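To make `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sliding-window splitter in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that class also looks for separators such as paragraph breaks); the function name `sliding_window_chunks` is hypothetical and only illustrates how overlap repeats the tail of one chunk at the head of the next.

```python
def sliding_window_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Cut `text` into fixed-size windows that overlap by `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks

demo = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_window_chunks(demo, chunk_size=10, chunk_overlap=3):
    print(chunk)
# prints:
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk starts with the last three characters of the previous one, which is what keeps a sentence that straddles a boundary retrievable from either side.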


## 5) Build a Chroma vector store

In [None]:

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

PERSIST_DIR = "chroma_db_demo"
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

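# Chroma 0.4+ writes to persist_directory automatically, so no explicit persist() call is needed.
# Note: re-running this cell adds the same chunks to the persisted collection again.
vectordb = Chroma.from_documents(documents=splits, embedding=embeddings, persist_directory=PERSIST_DIR)
print("✅ Chroma ready. Persist dir:", PERSIST_DIR)
try:
    # _collection is a private attribute, so guard against it changing between versions
    print("Collection count:", vectordb._collection.count())
except Exception:
    print("Collection ready (count unavailable).")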
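Conceptually, a vector store holds (vector, metadata) pairs and returns the entries closest to a query vector. The following plain-Python sketch shows that idea with cosine similarity and a brute-force scan. The three-dimensional vectors are made-up toy numbers rather than real embeddings, the helper names are hypothetical, and real stores such as Chroma or FAISS use their own index structures and distance metrics.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Score every (vector, metadata) pair against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), meta) for vec, meta in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy index: one made-up vector per sample file.
toy_index = [
    ([1.0, 0.0, 0.0], {"source": "rag_intro.txt"}),
    ([0.0, 1.0, 0.0], {"source": "vector_stores.txt"}),
    ([0.0, 0.0, 1.0], {"source": "best_practices.txt"}),
]
query = [0.1, 0.9, 0.2]  # points mostly along the second axis

for score, meta in top_k(query, toy_index, k=2):
    print(f"{score:.3f}", meta["source"])
```

The query points mostly along the second axis, so `vector_stores.txt` ranks first and `best_practices.txt` second.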



## 6) Create the retriever and the RAG chain
We'll use a simple **stuff** chain: retrieved docs → inserted into a prompt → answered by the LLM.


In [None]:

from langchain_openai import ChatOpenAI
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

retriever = vectordb.as_retriever(search_kwargs={"k": 4})
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_template(
    "You are a helpful assistant. Use ONLY the context below to answer. "
    "If the answer isn't in the context, say you don't know.\n\n"
    "Context:\n{context}\n\nQuestion: {input}"
)
doc_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, doc_chain)

print("✅ RAG chain is ready. Ask questions in the next cell.")
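The "stuff" step itself is simple string assembly: the retrieved chunks are concatenated and substituted into the `{context}` slot. Below is a minimal plain-Python sketch of that idea, reusing the prompt text from the cell above. The function name `stuff_prompt` is hypothetical, and joining chunks with a blank line is an assumption about the chain's default behavior rather than something this notebook configures.

```python
PROMPT_TEMPLATE = (
    "You are a helpful assistant. Use ONLY the context below to answer. "
    "If the answer isn't in the context, say you don't know.\n\n"
    "Context:\n{context}\n\nQuestion: {input}"
)

def stuff_prompt(question: str, chunks: list[str]) -> str:
    """Join all retrieved chunks and place them in the prompt's context slot."""
    context = "\n\n".join(chunks)  # assumed separator: one blank line between chunks
    return PROMPT_TEMPLATE.format(context=context, input=question)

example = stuff_prompt(
    "What does a retriever do?",
    [
        "The retriever pulls relevant chunks from a knowledge base.",
        "Vector stores index embeddings so we can search by similarity.",
    ],
)
print(example)
```

Because every retrieved chunk lands in a single prompt, the `k` value on the retriever directly controls how much text the LLM receives per question.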


## 7) Ask a question

In [None]:

question = "What is RAG and why do we use a vector store?"
result = rag_chain.invoke({"input": question})
print("Q:", question)
print("\nA:", result["answer"])


## 8) (Optional) See retrieved sources

In [None]:

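# Inspect the top retrieved chunks for transparency.
# invoke() replaces the deprecated get_relevant_documents(); a separate variable
# name avoids overwriting the `docs` list loaded in step 4.
retrieved_docs = retriever.invoke("Explain the purpose of vector stores in RAG.")
for i, d in enumerate(retrieved_docs, 1):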
    print(f"--- Source #{i} ---")
    print("Metadata:", d.metadata)
    print(d.page_content[:400], "...\n")



## ⭐ Optional: Use FAISS instead of Chroma
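This block builds an in-memory FAISS index and queries it. It reuses `splits`, `embeddings`, and `doc_chain` from steps 4 to 6, so run those cells first. Run it if you want a second screenshot.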


In [None]:

from langchain_community.vectorstores import FAISS

faiss_db = FAISS.from_documents(documents=splits, embedding=embeddings)
faiss_retriever = faiss_db.as_retriever(search_kwargs={"k": 4})

faiss_chain = create_retrieval_chain(faiss_retriever, doc_chain)
ans = faiss_chain.invoke({"input": "Give two best practices for RAG chunking"})
print(ans["answer"])



## ⭐ Optional: Try other LLM providers (Claude / Cohere / Bedrock)
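To try an alternative chat model, set its API key in step 2 and replace `ChatOpenAI` in step 6 with that provider's chat model class, for example `ChatAnthropic` from `langchain-anthropic` or `ChatCohere` from `langchain-cohere` (both installed in step 1). Bedrock additionally requires AWS credentials, which this notebook does not set up.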
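One way to organize the swap is a small factory that imports a provider's package only when it is requested. This is a sketch under assumptions: the function name `make_chat_model` is hypothetical, the model ID is left for you to supply, and Bedrock is omitted because step 1 does not install anything for it.

```python
def make_chat_model(provider: str, model: str, temperature: float = 0):
    """Return a LangChain chat model for `provider`, importing its package lazily."""
    provider = provider.lower()
    if provider == "openai":
        from langchain_openai import ChatOpenAI
        return ChatOpenAI(model=model, temperature=temperature)
    if provider == "anthropic":
        from langchain_anthropic import ChatAnthropic
        return ChatAnthropic(model=model, temperature=temperature)
    if provider == "cohere":
        from langchain_cohere import ChatCohere
        return ChatCohere(model=model, temperature=temperature)
    raise ValueError(f"Unsupported provider: {provider!r}")

# Usage (commented out because it needs the matching API key from step 2):
# llm = make_chat_model("anthropic", "<your-claude-model-id>")
# doc_chain = create_stuff_documents_chain(llm, prompt)
# rag_chain = create_retrieval_chain(retriever, doc_chain)
```

Only the chat model changes here. The embeddings and the vector store built in step 5 stay on OpenAI unless you swap those separately.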
