## 2. `ChatPromptTemplate`

### Description:
- `ChatPromptTemplate` is a high-level template designed for **chat-based models**.
- Instead of formatting one single string, it organizes a sequence of messages (system, human, AI, etc.).
- It produces a list of message objects (for example `SystemMessage`, `HumanMessage`, `AIMessage`), which is then sent to chat models such as GPT-3.5, GPT-4, or Claude.

### Key Use-Cases:
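- Useful whenever a prompt needs distinct roles, such as multi-turn conversations or a system instruction combined with a user question.
- Common in advanced RAG pipelines built on chat models such as `ChatOpenAI`, `ChatAnthropic`, or `ChatGroq` (used in this notebook).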

In [1]:
# ===================== INSTALL DEPENDENCIES =====================
!pip install -q langchain sentence-transformers faiss-cpu pypdf groq langchain-community langchain-groq scikit-learn

In [3]:
# ===================== IMPORTS =====================
import os
import torch
import numpy as np
import pandas as pd
from IPython.display import display, Markdown
from sklearn.decomposition import PCA
from sentence_transformers.cross_encoder import CrossEncoder

# LangChain components, imported from their current package locations
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_core.messages import SystemMessage, AIMessage
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
)
from langchain_groq import ChatGroq
from langchain_text_splitters import CharacterTextSplitter

In [4]:
# ===================== LOAD & SPLIT PDF =====================
loader = PyPDFLoader("/content/solid-python.pdf")
documents = loader.load_and_split()

splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = splitter.split_documents(documents)
print(f"Total Chunks Created: {len(docs)}")

Total Chunks Created: 22


In [36]:
# ===================== EMBEDDINGS + VECTORSTORE =====================
from typing import List

from langchain_core.embeddings import Embeddings
from sentence_transformers import SentenceTransformer

# Sentence-transformers model that produces 384-dimensional embeddings
st_model = SentenceTransformer("all-MiniLM-L6-v2")
# Step 1: Extract the text of each chunk
texts = [doc.page_content for doc in docs]
# Step 2: Generate 384D embeddings as a NumPy array of shape (n_chunks, 384)
original_embeddings = st_model.encode(texts, convert_to_numpy=True)

In [37]:
original_embeddings.shape

(22, 384)

In [38]:
# ===================== PCA DIM REDUCTION =====================
# Reduce 384D → 18D with PCA. n_components cannot exceed
# min(n_samples, n_features) = 22 here, because only 22 chunks are available.
pca = PCA(n_components=18)
# original_embeddings is already a NumPy array, so PCA can be fit on it directly
reduced_embeddings = pca.fit_transform(original_embeddings)

# Custom embedding wrapper that applies the fitted PCA to every new vector
class PCAEmbeddings(Embeddings):
    def __init__(self, model, pca):
        self.model = model
        self.pca = pca

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        vectors = self.model.encode(texts, convert_to_numpy=True)
        return self.pca.transform(vectors).tolist()

    def embed_query(self, text: str) -> List[float]:
        vector = self.model.encode([text], convert_to_numpy=True)
        return self.pca.transform(vector)[0].tolist()
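
# Wrap the model and the fitted PCA. FAISS re-embeds the chunks through this
# wrapper, so documents and queries end up in the same 18D space.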
embedding_wrapper = PCAEmbeddings(st_model, pca)
vectorstore = FAISS.from_documents(docs, embedding_wrapper)
retriever = vectorstore.as_retriever()

In [39]:
reduced_embeddings.shape

(22, 18)
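
The shape above follows from how PCA bounds its output: the number of components cannot exceed `min(n_samples, n_features)`. The sketch below illustrates that limit with random stand-in data (an assumption; these are not the notebook's embeddings) and also reports how much variance the 18 components retain.

```python
import numpy as np
from sklearn.decomposition import PCA

# Random stand-in for 22 chunk embeddings of dimension 384 (not real data).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(22, 384))

# 18 components is valid because 18 <= min(22, 384) = 22.
pca_demo = PCA(n_components=18)
reduced = pca_demo.fit_transform(fake_embeddings)
print(reduced.shape)

# Fraction of the total variance kept by the 18 components.
retained = float(pca_demo.explained_variance_ratio_.sum())
print(f"retained variance: {retained:.3f}")

# Asking for more components than samples is rejected by scikit-learn.
try:
    PCA(n_components=23).fit(fake_embeddings)
    too_many_rejected = False
except ValueError:
    too_many_rejected = True
print(too_many_rejected)
```

On the real embeddings, inspecting `pca.explained_variance_ratio_` in the same way is a quick check of how much information the 18D vectors keep.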

In [40]:
# ===================== DEFINE LLM =====================
from google.colab import userdata
llm = ChatGroq(
    model_name="llama-3.3-70b-versatile",
    api_key=userdata.get('GROQ_API_KEY')  # Read the key from Colab secrets instead of hard-coding it
)

In [41]:
# ===================== DEFINE PROMPT ===================
chat_prompt = ChatPromptTemplate.from_messages([
    SystemMessage(content="You are a helpful assistant that answers based only on the given context."),
    HumanMessagePromptTemplate.from_template("Given the following context:\n\n{context}\n\nAnswer the question:\n\n{question}")
])

In [42]:
# ===================== RERANKER =====================
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")


In [43]:
question = "What is the main objective of the document?"
# Step 1: Retrieve the top chunks (as_retriever() defaults to similarity search, not MMR)
retrieved_docs = retriever.invoke(question)


In [44]:
# Show pre-reranked chunks
print("\n🔹 Top K Retrieved Chunks (Before Reranking):")
for i, doc in enumerate(retrieved_docs):
    print(f"\n--- Chunk {i+1} ---")
    print(f"Page: {doc.metadata.get('page', 'Unknown')}")
    print(f"Content:\n{doc.page_content[:300]}...")


🔹 Top K Retrieved Chunks (Before Reranking):

--- Chunk 1 ---
Page: 18
Content:
Aspects of a Class
Thursday, Feb 22nd 2024 19/22
The 5 aspects of the class are:
a
responsibility towards parent
interface towards callers
interface towards callees
responsibility towards inheritors
class'
purpose
a
Mike Lindner: The Five Principles For SOLID Software Design...

--- Chunk 2 ---
Page: 19
Content:
The 5 Principles
Thursday, Feb 22nd 2024 20/22
The 5 corresponding principles are:
a
Liskov substitution principle
single
responsibility
principle
interface segregation principle
dependency inversion principle
open-closed principle
a
Mike Lindner: The Five Principles For SOLID Software Design...

--- Chunk 3 ---
Page: 1
Content:
Motivation
Thursday, Feb 22nd 2024 2/22
Find guiding design principles to
maintain software quality over
time....

--- Chunk 4 ---
Page: 4
Content:
SOLID Authors
Thursday, Feb 22nd 2024 5/22
Robert C. Martin
• Author of Clean Code, Functional Design,
and more books
• Author

In [45]:
# Step 2: Rerank the retrieved chunks using cross-encoder
pairs = [[question, doc.page_content] for doc in retrieved_docs]
scores = reranker.predict(pairs)
scored_docs = list(zip(retrieved_docs, scores))
sorted_docs = sorted(scored_docs, key=lambda x: x[1], reverse=True)

In [46]:
# Show reranked chunks
print("\n🔸 Reranked Chunks (CrossEncoder):")
for i, (doc, score) in enumerate(sorted_docs):
    print(f"\n--- Reranked Chunk {i+1} ---")
    print(f"Page: {doc.metadata.get('page', 'Unknown')}")
    print(f"Score: {score:.4f}")
    print(f"Content:\n{doc.page_content[:300]}...")


🔸 Reranked Chunks (CrossEncoder):

--- Reranked Chunk 1 ---
Page: 18
Score: -10.3713
Content:
Aspects of a Class
Thursday, Feb 22nd 2024 19/22
The 5 aspects of the class are:
a
responsibility towards parent
interface towards callers
interface towards callees
responsibility towards inheritors
class'
purpose
a
Mike Lindner: The Five Principles For SOLID Software Design...

--- Reranked Chunk 2 ---
Page: 19
Score: -10.8815
Content:
The 5 Principles
Thursday, Feb 22nd 2024 20/22
The 5 corresponding principles are:
a
Liskov substitution principle
single
responsibility
principle
interface segregation principle
dependency inversion principle
open-closed principle
a
Mike Lindner: The Five Principles For SOLID Software Design...

--- Reranked Chunk 3 ---
Page: 1
Score: -10.9173
Content:
Motivation
Thursday, Feb 22nd 2024 2/22
Find guiding design principles to
maintain software quality over
time....

--- Reranked Chunk 4 ---
Page: 4
Score: -11.0552
Content:
SOLID Authors
Thursday, Feb 22nd 2024
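
The scores printed above are all strongly negative, which suggests the cross-encoder is returning raw logits rather than probabilities. As a rough sketch, applying a sigmoid makes them easier to interpret. The score values are copied from the output above; treating them as logits is an assumption.

```python
import numpy as np

# Scores copied from the reranked output above, listed by page number.
pages = [18, 19, 1, 4]
scores = np.array([-10.3713, -10.8815, -10.9173, -11.0552])

# The sigmoid maps a logit to the (0, 1) range.
probabilities = 1.0 / (1.0 + np.exp(-scores))

for page, score, prob in zip(pages, scores, probabilities):
    print(f"page {page}: logit={score:.4f}, sigmoid={prob:.2e}")
```

Every value lands far below 0.5, so under this reading the reranker does not consider any of the four chunks a strong match for such a broad question.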

In [48]:
# Answer using the top 3 chunks in their original retrieval order
context_before = "\n\n".join([doc.page_content for doc in retrieved_docs[:3]])
messages_before = chat_prompt.format_messages(context=context_before, question=question)
answer_before = llm.invoke(messages_before)

In [50]:
display(Markdown("### Final Answer (Before Reranking):"))
display(Markdown(answer_before.content))

### Final Answer (Before Reranking):

The main objective of the document is to find guiding design principles to maintain software quality over time.

In [51]:
# Answer using reranked top chunks
top_reranked_docs = [doc for doc, _ in sorted_docs[:3]]
context_after = "\n\n".join([doc.page_content for doc in top_reranked_docs])
messages_after = chat_prompt.format_messages(context=context_after, question=question)
answer_after = llm.invoke(messages_after)

In [52]:
display(Markdown("### Final Answer (After Reranking):"))
display(Markdown(answer_after.content))

### Final Answer (After Reranking):

The main objective of the document is to find guiding design principles to maintain software quality over time.
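
Both answers are identical, which is expected here: reranking left the chunk order unchanged, so both prompts received the same context. The small check below makes that explicit using the page numbers from the printed outputs; the helper name `top_k_changed` is hypothetical.

```python
def top_k_changed(before, after, k=3):
    """Return True if the first k items differ between two rankings."""
    return before[:k] != after[:k]

# Page numbers copied from the printed outputs above.
pages_before = [18, 19, 1, 4]
pages_after = [18, 19, 1, 4]

print(top_k_changed(pages_before, pages_after))
```

A question that targets a specific detail of the document is more likely to produce a different order after reranking, and therefore a different answer.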