# Imports

In [1]:
# Import the Chroma vector store wrapper from LangChain
from langchain_community.vectorstores import Chroma

# Import HuggingFaceEmbeddings, which uses pre-trained Hugging Face models to convert text into numerical vectors.
# These embeddings support semantic search, similarity comparison, and vector databases such as Chroma.
from langchain_community.embeddings import HuggingFaceEmbeddings

# Import a text splitter that breaks large texts into smaller chunks based on character count (useful for chunking documents before embedding)
from langchain.text_splitter import CharacterTextSplitter

# Import a PDF loader that reads PDF files and converts each page into a LangChain Document object
from langchain_community.document_loaders import PDFPlumberLoader

# Import the prompt template system to structure chat prompts with variables and roles (e.g., system/human)
from langchain.prompts import ChatPromptTemplate

# Import the Ollama-compatible LLM wrapper for chatting with local models like LLaMA via LangChain
from langchain_community.chat_models import ChatOllama

# Suppress all warnings (including deprecation notices) to keep the notebook output clean
import warnings
warnings.filterwarnings("ignore")

## 📥 Step 1: Document Ingestion
We load the source documents (e.g., a PDF) to extract text content for further processing.

In [2]:
# Directory for our Vector Database
directory = "my_chroma_db"

# Choose a sentence transformer model (like 'all-MiniLM-L6-v2')
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Path to the PDF containing William Shakespeare's historical biography (loaded in the next cell)
pdf_path = "pdf_used.pdf"

In [3]:
# Load the PDF into a list of Document objects (one per page)
loader = PDFPlumberLoader(pdf_path)
documents = loader.load()

In [4]:
# Optional: preview the first three loaded pages
for i, doc in enumerate(documents[0:3]):
    print(f"Document #{i}:")
    print(doc.page_content)
    print("-" * 50)

Document #0:
# **The Life, Times, and Works of William Shakespeare: An Exhaustive History**
### **Introduction: The Paradox of the Bard**
William Shakespeare stands as the undisputed titan of world literature, a cultural monument so vast and enduring that his name is
synonymous with the very art of writing. His works—a staggering collection of some 39 plays, 154 sonnets, and several long narrative
poems—represent the pinnacle of achievement in the English language. They have been translated into every major tongue,
performed on countless stages from high school auditoriums to the grandest national theatres, and adapted into every conceivable
medium. He is, as his contemporary Ben Jonson so presciently declared, "not of an age, but for all time."
Yet, behind this colossal literary legacy stands a man whose life is known to us only in frustratingly sparse detail. We have the public
records: a baptismal certificate, a marriage license, property deeds, a last will and testament, and a few 

## ✂️ Text Chunking
We break the loaded documents into smaller chunks so that each one can be embedded and retrieved on its own.

In [5]:
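# Split the loaded documents into chunks with a target size of 500 characters and no overlap.
# CharacterTextSplitter cuts only at its separator ("\n\n" by default), so a chunk can exceed
# chunk_size when the text has no separator to split on.
# Smaller chunks are easier to embed and to retrieve from a vector store.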
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
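
To make the splitter's behaviour concrete, here is a minimal pure-Python sketch of greedy separator-based chunking. It is a simplified stand-in, not LangChain's actual implementation; the function name `split_by_separator` and the sample text are made up for illustration. It assumes the text is cut only at a separator (`"\n\n"` is the default for `CharacterTextSplitter`), which would explain why a chunk can end up longer than `chunk_size` when the text contains few blank lines.

```python
def split_by_separator(text, chunk_size=500, separator="\n\n"):
    """Greedy separator-based chunking (illustrative stand-in, not LangChain's code)."""
    pieces = [p for p in text.split(separator) if p.strip()]
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either the piece still fits, or it is a single oversized piece
            # that cannot be cut further because it contains no separator.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Hypothetical sample: two short paragraphs followed by one long paragraph.
sample_text = "a" * 30 + "\n\n" + "b" * 30 + "\n\n" + "c" * 120
demo_chunks = split_by_separator(sample_text, chunk_size=70)
print([len(c) for c in demo_chunks])  # [62, 120]
```

The second chunk is 120 characters long even though `chunk_size` is 70, because it contains no separator. If strictly bounded chunks matter, `RecursiveCharacterTextSplitter` may be worth trying, since it falls back to finer separators.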

## 🧠 Embedding and Storing in Vector Database (ChromaDB)
Each chunk is converted into a vector (embedding) and stored in a vector database for similarity search.

In [6]:
# Create a Chroma vector database from the list of document chunks.
# - docs: List of text chunks to embed and store.
# - embedding_model: Embedding function used to convert text into vector representations.
# - persist_directory: Folder where the vector store will be saved for later use.
db = Chroma.from_documents(
    docs,
    embedding_model,
    persist_directory=directory,
)
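
As a rough illustration of what "similarity" between embeddings means, the sketch below computes cosine similarity by hand. The three-dimensional vectors and their names are made up for the example (`all-MiniLM-L6-v2` produces 384-dimensional vectors), and cosine similarity is only one common choice; the metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings" (hypothetical values, not real model output)
query_vec = [1.0, 0.0, 1.0]
birth_vec = [2.0, 0.0, 2.0]    # points in the same direction as the query
theatre_vec = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query_vec, birth_vec))    # close to 1: same direction
print(cosine_similarity(query_vec, theatre_vec))  # 0.0: nothing in common
```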

In [7]:
# Ask a question
query = input("Ask a question about the document: ")

## 🔍 Step 2: Retrieval
Based on a user query, the top relevant chunks are retrieved from the vector store using semantic similarity.

In [8]:
# Perform semantic similarity search on the vector database.
# Retrieves the top 3 document chunks most similar to the user's query.
# These chunks will be used as context for answering the query via the LLM.
# Note: this rebinds `docs` from the full chunk list to the three retrieved chunks.
docs = db.similarity_search(query, k=3)
# Combine the page content from all retrieved documents into a single context string
context = "\n\n".join([doc.page_content for doc in docs])

In [9]:
# Print each retrieved document's content
for idx, doc in enumerate(docs):
    print(f"Retrieved Data #{idx+1}")
    print(doc.page_content)
    print("-" * 25)

Retrieved Data #1
### **Chapter 1: The Man from Stratford**
Every aspect of Shakespeare's work is colored by his origins. His understanding of nature, his grasp of social hierarchy, his fascination
with legal and financial matters, and the very texture of his language are rooted in his life in the prosperous market town of Stratford-
upon-Avon.
**Section 1.1: Birth, Parents, and Early Stratford Life**
**William Shakespeare was born in Stratford-upon-Avon, Warwickshire, in April 1564.** While his exact birthdate is not recorded,
tradition holds it to be **April 23, 1564**, a date that conveniently aligns with the date of his death 52 years later. The official record we
possess is that of his baptism at Holy Trinity Church on **April 26, 1564**. He was the third of eight children, and the eldest surviving
son, of John Shakespeare and Mary Arden.
His father, John, was a figure of significant local importance and a model of the era's potential for social mobility. He was a skilled
craftsma
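
The retrieval step boils down to "score every chunk against the query, keep the best `k`, and join them into one context string". The sketch below shows that flow with a deliberately simple word-overlap (Jaccard) score swapped in for embedding similarity, so it runs without any model. The helper names and sample sentences are illustrative assumptions, and real semantic search will rank chunks differently.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, text):
    """Jaccard word overlap: a crude stand-in for embedding similarity."""
    q, t = tokenize(query), tokenize(text)
    if not q or not t:
        return 0.0
    return len(q & t) / len(q | t)


def top_k(query, chunks, k=3):
    """Return the k chunks with the highest score, best first."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    return ranked[:k]


# Sample sentences adapted from the PDF excerpt shown above
sample_chunks = [
    "He was the third of eight children.",
    "His father, John, was a figure of significant local importance.",
    "William Shakespeare was born in Stratford-upon-Avon, Warwickshire, in April 1564.",
    "The official record is that of his baptism at Holy Trinity Church.",
]
sample_query = "Where was Shakespeare born?"

best = top_k(sample_query, sample_chunks, k=2)
sample_context = "\n\n".join(best)  # same joining step as in the notebook
print(sample_context)
```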

## 🧾 Step 3: Augmentation
The retrieved context is combined with the user query to form a prompt for the LLM.

In [10]:
# Define a teacher-style prompt using system/user roles with placeholders for context and query
messages = [
    ("system", "You are a knowledgeable teacher. Answer the question using only the provided context:\n\n{context}\n\nIf the answer isn't in the context, say: 'Sorry, I couldn’t find the answer in the given material.'"),
    ("human", "{query}")
]
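
Conceptually, augmentation is just placeholder substitution: each `{context}` and `{query}` slot in the message list is filled with a concrete value. The following sketch mimics that with plain `str.format`; it is not how `ChatPromptTemplate` is implemented, and `fill_messages` and the shortened `demo_messages` are hypothetical names used only for this example.

```python
# Shortened, hypothetical version of the message list defined above
demo_messages = [
    ("system", "Answer using only this context:\n\n{context}"),
    ("human", "{query}"),
]


def fill_messages(messages, **variables):
    """Substitute the placeholders in every (role, template) pair."""
    return [(role, template.format(**variables)) for role, template in messages]


filled = fill_messages(
    demo_messages,
    context="Shakespeare was born in April 1564.",
    query="When was Shakespeare born?",
)
for role, text in filled:
    print(f"[{role}] {text}")
```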

In [11]:
# Define a chat prompt template from the structured message list
prompt_template = ChatPromptTemplate.from_messages(messages)

# Connect to a local Llama 3.2 model served by Ollama
# (the Ollama server must be running and the model already pulled)
llm = ChatOllama(model="llama3.2")

## 🧠 Step 4: Generation 
The prompt is passed to the LLM (e.g., LLaMA via Ollama) to generate a final answer using both query and retrieved context.

In [12]:
# Pipe the prompt into the LLM using LangChain's chaining syntax
chain = prompt_template | llm

# Invoke the chain with input variables
response = chain.invoke({"context": context, "query": query})

# Print the generated answer from the model
print("\nAnswer:\n")
print(response.content)


Answer:

Shakespeare died on April 23, 1616.
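
For readers curious about the `prompt_template | llm` syntax: Python lets a class define what `|` means through the `__or__` method, and LangChain uses this to compose steps into a sequence. The toy sketch below shows the general idea only. The `Step` class, `format_prompt`, and `shout_model` are made-up names, and the "model" here simply upper-cases its input instead of calling an LLM.

```python
class Step:
    """Wraps a function so that steps can be chained with `|` (illustrative only)."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Build a new step that runs `self` first, then feeds its output to `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt template and the LLM
format_prompt = Step(lambda v: f"Context: {v['context']}\nQuestion: {v['query']}")
shout_model = Step(lambda prompt: prompt.upper())

demo_chain = format_prompt | shout_model
demo_result = demo_chain.invoke({"context": "born 1564", "query": "when?"})
print(demo_result)
```

Calling `invoke` on the composed object runs the steps from left to right, which mirrors how the notebook's chain passes the filled prompt on to the model.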
