## Step 2.1: Load and Chunk Text Data

In [6]:
import os

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Simulate a text file (or use any local .txt file)
os.makedirs("../data", exist_ok=True)  # create the data folder if it is missing
with open("../data/data.txt", "w", encoding="utf-8") as f:
    f.write("""
    Natural Language Processing (NLP) is a subfield of artificial intelligence concerned with the interaction between computers and human language. 
    It enables machines to understand, interpret, and generate text in a meaningful way. NLP powers applications like chatbots, sentiment analysis, 
    machine translation, and question answering systems.
    """)

# Load the document
loader = TextLoader("../data/data.txt")
documents = loader.load()

# Split into chunks of at most 300 characters with a 50-character overlap,
# so the retriever can return short, focused passages instead of the whole file
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
docs = text_splitter.split_documents(documents)
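
To make the effect of chunk_size and chunk_overlap concrete, here is a minimal sketch of overlapping character windows. It is a simplification: RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs, lines, and spaces, so its real output will differ. The function name sliding_window_chunks and the sample string are made up for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample: the alphabet, so the overlap is easy to see
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the previous one. That repetition is what keeps a sentence that straddles a chunk boundary retrievable from either side.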

## Step 2.2: Embed Documents + Store in FAISS

In [7]:
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

# Load HF embedding model
embedding_model = HuggingFaceEmbeddings(
    model_name="all-MiniLM-L6-v2"
)

# Create FAISS index
vectorstore = FAISS.from_documents(docs, embedding_model)
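
Conceptually, the index stores one vector per chunk and answers a query by finding the stored vectors closest to the query vector. The sketch below shows that idea with a brute-force search in plain Python. It assumes Euclidean (L2) distance, which to my understanding is the default for LangChain's FAISS wrapper, and it uses made-up 3-dimensional vectors, whereas all-MiniLM-L6-v2 produces 384-dimensional ones. The names l2_distance, top_k, and toy_store are illustrative only.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 2):
    """Return the k (text, distance) pairs closest to query_vec, nearest first."""
    scored = [(text, l2_distance(query_vec, vec)) for text, vec in stored]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical toy "embeddings", not real model output
toy_store = [
    ("NLP is a subfield of AI", [0.9, 0.1, 0.0]),
    ("Chatbots use NLP", [0.7, 0.3, 0.1]),
    ("Bananas are yellow", [0.0, 0.1, 0.9]),
]
toy_query = [0.8, 0.2, 0.0]

nearest = top_k(toy_query, toy_store, k=2)
for text, dist in nearest:
    print(f"{dist:.3f}  {text}")
```

The two NLP sentences come back first because their vectors sit close to the query vector, while the unrelated sentence is far away. FAISS does the same job with optimized index structures rather than a Python loop.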

# Block 3 — Build RetrievalQA Chain (Latest Syntax)

## Step 3.1: Create a Retriever from FAISS

In [8]:
retriever = vectorstore.as_retriever()

## Step 3.2: Load HuggingFace LLM (Flan-T5)

In [9]:
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load model
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

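# Build the generation pipeline (sampling is not enabled here, so decoding is deterministic; output is capped at max_length=100 tokens)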
pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer, max_length=100)

# LangChain-compatible LLM wrapper
llm = HuggingFacePipeline(pipeline=pipe)

Device set to use mps:0


## Step 3.3: Build and Run the RetrievalQA Chain

In [10]:
from langchain.chains import RetrievalQA

# Combine retriever and model
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type="stuff"  # simple chain: concatenate context + ask
)

# Ask a question
query = "What is NLP?"
result = qa_chain.invoke({"query": query})

print("🔍 Question:", query)
print("📘 Answer:", result['result'])

🔍 Question: What is NLP?
📘 Answer: Natural Language Processing


What's happening:
	•	retriever: pulls the chunks most similar to the question from the FAISS index
	•	llm: generates the answer with the Hugging Face model (Flan-T5)
	•	stuff chain: places all retrieved chunks into a single prompt, followed by the question
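
As a rough illustration of the "stuff" strategy, the sketch below joins retrieved chunks into one context block and appends the question. The template wording and the function name build_stuff_prompt are hypothetical; LangChain uses its own prompt template internally, which is not reproduced here.

```python
def build_stuff_prompt(chunks: list[str], question: str) -> str:
    """Join all retrieved chunks into one context block, then append the question."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical retrieved chunks, standing in for what the retriever returns
retrieved = [
    "NLP is a subfield of artificial intelligence.",
    "NLP powers chatbots and machine translation.",
]
prompt = build_stuff_prompt(retrieved, "What is NLP?")
print(prompt)
```

Because every chunk lands in the same prompt, this approach only works while the combined text fits within the model's input limit. With many or long chunks, a different chain type or a smaller number of retrieved chunks may be needed.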


# Block 4 — Save and Reload FAISS Index

## Step 4.1: Save the FAISS Index to Disk

In [14]:
# Save FAISS vectorstore to disk
vectorstore.save_local("../data/vectorstores/nlp_index")

This creates the folder ../data/vectorstores/nlp_index/ containing:
	•	index.faiss: the vector index
	•	index.pkl: the pickled documents and their metadata

## Step 4.2: Load FAISS Index Later

In [17]:

# Reuse the same embedding model
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Load the index
vectorstore = FAISS.load_local(
    "../data/vectorstores/nlp_index",
    embeddings=embedding_model,
    allow_dangerous_deserialization=True  # index.pkl is a pickle file; only load indexes you created yourself
)

# Rebuild retriever + QA chain
retriever = vectorstore.as_retriever()
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, chain_type="stuff")

You can now call qa_chain.invoke({"query": ...}) again without rerunning the chunking and embedding steps.
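
A common follow-up is to rebuild the index only when no saved copy exists. The helper below is a minimal sketch of that check, assuming the default index_name of "index" used by save_local, which produces index.faiss and index.pkl. The function name faiss_index_exists is made up, and the demo runs against a temporary folder with empty placeholder files rather than a real index.

```python
import tempfile
from pathlib import Path


def faiss_index_exists(folder: str, index_name: str = "index") -> bool:
    """Return True if both files written by save_local are present in folder."""
    base = Path(folder)
    return (base / f"{index_name}.faiss").is_file() and (base / f"{index_name}.pkl").is_file()


# Demo with empty placeholder files (not a loadable index)
with tempfile.TemporaryDirectory() as tmp:
    before = faiss_index_exists(tmp)
    (Path(tmp) / "index.faiss").touch()
    (Path(tmp) / "index.pkl").touch()
    after = faiss_index_exists(tmp)

print(before, after)
# False True
```

In the notebook, you could call faiss_index_exists("../data/vectorstores/nlp_index") and run FAISS.load_local when it returns True, falling back to FAISS.from_documents followed by save_local otherwise.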