Install and Import Libraries:

In [4]:
!pip install openai sentence-transformers faiss-cpu hf_xet 


Load and Chunk your Document:

In [16]:
enc = 'utf-8'
with open('winnie_the_pooh.txt', 'r', encoding=enc) as file:
    # Read the entire content of the file into a string
    text = file.read()

# Split the text into fixed-size chunks of 200 characters each
chunk_size = 200
chunks = [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]
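
Fixed-size character windows can cut a sentence or even a word in half, which is visible in the retrieved chunks further down. One common mitigation is to let neighbouring chunks overlap. The sketch below is an assumption, not part of the original notebook: the helper name `chunk_with_overlap` and its `size` and `overlap` parameters are hypothetical, and the sample string stands in for the book text.

```python
def chunk_with_overlap(text, size=200, overlap=50):
    """Split text into windows of `size` characters that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far the window start advances each time
    return [text[i:i + size] for i in range(0, len(text), step)]


# Small stand-in text so the behaviour is easy to inspect
sample = "abcdefghij" * 3  # 30 characters
overlapping_chunks = chunk_with_overlap(sample, size=10, overlap=4)
for chunk in overlapping_chunks:
    print(repr(chunk))
```

With the notebook's data this would be called as `chunks = chunk_with_overlap(text)`. The overlap increases the number of chunks (and embeddings), so it trades index size for a better chance that a full sentence lands inside at least one chunk.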

Generate Embeddings with SentenceTransformers:

In [17]:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(chunks)

Store Embeddings in a FAISS Index for Similarity Search:

In [21]:
import faiss
import numpy as np

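# Build a flat (exact, brute-force) L2 index sized to the embedding dimension
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(np.array(embeddings))

# Embed the query and retrieve the 3 nearest chunks
# (D holds the distances, I holds the matching chunk indices)
query = "Who is always sad?"
query_embedding = model.encode([query])
D, I = index.search(np.array(query_embedding), k=3)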

In [22]:
for i in I[0]:
    print(chunks[i])
    print("....")


"You seem so sad, Eeyore."

"Sad? Why should I be sad? It's my birthday. The happiest day of the
year."

"Your birthday?" said Pooh in great surprise.

"Of course it is. Can't you see? Look at all th
....
eyore is in
a Very Sad Condition, because it's his birthday, and nobody has taken
any notice of it, and he's very Gloomy--you know what Eeyore is--and
there he was, and----What a long time whoever liv
....
ing miserable
myself, what with no presents and no cake and no candles, and no proper
notice taken of me at all, but if everybody else is going to be
miserable too----"

This was too much for Pooh. "S
....
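
To make the search step less of a black box, here is a minimal plain-Python sketch of what an exact L2 search does conceptually: compute the squared Euclidean distance from the query to every stored vector and keep the `k` smallest. This illustrates the idea only and is not FAISS's implementation; the function name `l2_search` and the toy vectors are made up for the example.

```python
def l2_search(vectors, query, k=3):
    """Return (distances, indices) of the k vectors closest to `query`
    by squared L2 distance, closest first."""
    scored = []
    for idx, vec in enumerate(vectors):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2-dimensional "embeddings" standing in for the real 384-dimensional ones
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
toy_distances, toy_indices = l2_search(toy_vectors, [0.9, 0.1], k=2)
print(toy_indices)
```

The two returned lists play the same roles as `D` and `I` in the notebook, except that FAISS returns 2-D arrays with one row per query.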


Build the Prompt from Retrieved Chunks:

In [23]:

retrieved_chunks = [chunks[i] for i in I[0]]

# Format the prompt
context = "\n\n".join(retrieved_chunks)
# `query` is reused from the search cell above

prompt = f"""You are a helpful assistant. Use the following context to answer the question.

Context:
{context}

Question:
{query}

Answer:"""

print(prompt)

You are a helpful assistant. Use the following context to answer the question.

Context:

"You seem so sad, Eeyore."

"Sad? Why should I be sad? It's my birthday. The happiest day of the
year."

"Your birthday?" said Pooh in great surprise.

"Of course it is. Can't you see? Look at all th

eyore is in
a Very Sad Condition, because it's his birthday, and nobody has taken
any notice of it, and he's very Gloomy--you know what Eeyore is--and
there he was, and----What a long time whoever liv

ing miserable
myself, what with no presents and no cake and no candles, and no proper
notice taken of me at all, but if everybody else is going to be
miserable too----"

This was too much for Pooh. "S

Question:
Who is always sad?

Answer:
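
The prompt assembly can also be wrapped in a small function that caps how much context is included, which helps when the downstream model has a short input limit. This is a sketch under assumptions: `build_prompt` and `max_context_chars` are hypothetical names, and the budget is counted in characters rather than tokens for simplicity.

```python
def build_prompt(question, passages, max_context_chars=1000):
    """Join passages into a context block, stopping before the first
    passage that would push the context over the character budget."""
    kept, used = [], 0
    for passage in passages:
        passage = passage.strip()
        if used + len(passage) > max_context_chars:
            break
        kept.append(passage)
        used += len(passage)
    context = "\n\n".join(kept)
    return (
        "You are a helpful assistant. "
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion:\n{question}\n\nAnswer:"
    )


# The second passage is too long for the budget and is left out
demo_prompt = build_prompt("Who is always sad?", ["  Eeyore is gloomy. ", "x" * 2000])
print(demo_prompt)
```

In the notebook this would be called as `build_prompt(query, retrieved_chunks)`.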


Generate an Answer Using a Lightweight Language Model:

In [24]:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# Load a small, instruction-tuned model.
# Use a distinct name (gen_model) so the SentenceTransformer stored in `model`
# is not overwritten and the search cell above can still be re-run.
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
gen_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Build prompt from chunks
retrieved_chunks = [chunks[i] for i in I[0]]
context = "\n\n".join(retrieved_chunks)


# Simple instruction-style prompt for T5
prompt = f"Answer the question based on the context.\n\nContext:\n{context}\n\nQuestion:\n{query}"

# Tokenize input, truncating anything beyond the tokenizer's maximum length
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)

# Generate output
with torch.no_grad():
    outputs = gen_model.generate(**inputs, max_new_tokens=100)

# Decode and print
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Answer:", answer)

Answer: Eeyore
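
The retrieve, prompt, and generate steps can be tied together in one function so that new questions do not require re-running several cells. The sketch below is an assumption about how that might look: `answer_question`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the two stand-in functions replace the embedding search and the language model so the example runs without downloading anything.

```python
def answer_question(question, retrieve, generate, k=3):
    """Retrieve k passages, build a T5-style prompt, and return the generated answer.

    `retrieve(question, k)` must return a list of strings;
    `generate(prompt)` must return a string.
    """
    passages = retrieve(question, k)
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question based on the context.\n\n"
        f"Context:\n{context}\n\nQuestion:\n{question}"
    )
    return generate(prompt)


# Stand-ins for the real retriever and generator (illustration only)
def fake_retrieve(question, k):
    return ["Eeyore is in a Very Sad Condition."][:k]


def fake_generate(prompt):
    return "Eeyore" if "Eeyore" in prompt else "unknown"


demo_answer = answer_question("Who is always sad?", fake_retrieve, fake_generate)
print("Answer:", demo_answer)
```

In the notebook, `retrieve` would wrap the query encoding and `index.search` call, and `generate` would wrap the tokenizer, `generate`, and decode steps from the cell above.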
