# Re-Ranking Methods in RAG Systems
## Overview And Motivation
LLMs often hallucinate or fail to answer questions that depend on information they were not trained on. RAG mitigates this by retrieving supporting documents, but the answer can only be as good as the documents retrieved. Reranking aims to improve the relevance and quality of those documents. The primary motivation for reranking in RAG systems is that it allows a more sophisticated relevance assessment than the initial similarity search, taking into account nuanced relationships between the query and each document.

## Key Components
1. Initial/Base Retriever: a vector store using similarity-based search
2. Re-Ranking Model:
  - Can be an LLM or a cross-encoder model
  - This notebook uses Cohere's reranker (`CohereRerank`)

## Method Details
1. Initial Retrieval: Fetch 'n' candidate documents from the base retriever
2. Pair Creation: Form a query-document pair for each retrieved document
3. Scoring: Feed each pair directly into the cross-encoder model, then parse and normalize the relevance scores
4. Reordering and Selection: Sort the documents by score and keep the top 'k' (k ≤ n)
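
The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions: `overlap_score` is a hypothetical word-overlap stand-in for a real cross-encoder, `rerank` is a made-up helper name, and the candidate list plays the role of the initial retrieval (step 1). It is not how Cohere's reranker works internally.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query words present in the doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the top-k (document, normalized score) tuples, best first."""
    if not candidates:
        return []
    # Step 2: pair creation (one query-document pair per candidate)
    pairs = [(query, doc) for doc in candidates]
    # Step 3: scoring, followed by min-max normalization into [0, 1]
    raw = [score_fn(q, d) for q, d in pairs]
    lo, hi = min(raw), max(raw)
    span = hi - lo
    normalized = [(s - lo) / span if span else 0.0 for s in raw]
    # Step 4: reorder by score and keep the k best
    ranked = sorted(zip(candidates, normalized), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Step 1 (initial retrieval) is assumed to have produced these n = 3 candidates.
candidates = [
    "Hogwarts is a school of witchcraft and wizardry.",
    "A horcrux is an object that hides a fragment of a soul.",
    "Quidditch is played on broomsticks.",
]
top = rerank("what is a horcrux", candidates, k=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

In a real pipeline the scoring function would be a trained model that reads the query and document together, which is what lets it capture relationships that a single embedding similarity misses.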

Source of images and theory (Credits): https://colab.research.google.com/github/NirDiamant/RAG_Techniques/blob/main/all_rag_techniques/reranking_with_llamaindex.ipynb#scrollTo=75NUYNBXI5Bk


Figure: reranking-visualization.svg

Figure: Flowchart.svg

# CODING AND IMPLEMENTATION


In [16]:
%pip install langchain langchain_cohere langchain_community langchain_google_genai chromadb python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [17]:
# Importing the required modules
import os, dotenv
from textwrap import fill
from langchain.prompts import ChatPromptTemplate
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_cohere import CohereEmbeddings, CohereRerank
from langchain_community.vectorstores import Chroma
from langchain.schema.output_parser import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever


In [18]:
# Loading environment Variables
dotenv.load_dotenv()
COHERE_KEY = os.environ["COHERE_API_KEY"]
GENAI_KEY = os.environ["GOOGLE_API_KEY"]

In [19]:
# Setting up the LLM, Embedding Model and Vector Store
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash-001", temperature=1, api_key=GENAI_KEY)
embedding_model = CohereEmbeddings(model="embed-v4.0", cohere_api_key=COHERE_KEY)
file_path = './HarryPotter.txt'
loader = TextLoader(file_path)
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
vectorstore = Chroma.from_documents(
    documents=splitter.split_documents(loader.load()),
    collection_name="ReRankingRAG",
    embedding=embedding_model,
)

In [20]:
# Making the prompt and RAG pipeline
prompt = ChatPromptTemplate.from_template('''
You are a very sincere Potterhead who knows everything about Harry Potter and is willing to answer questions about it.
Context: {context}
Now answer this question in a very crisp manner - the same way Albus Dumbledore would answer.
Be elaborate, yap a lot of wisdom like Albus and end conversation warmly
Question: {question}
'''
)
rag = prompt | llm | StrOutputParser()

In [21]:
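# Setting up re-ranking
# Base retriever: fetch 15 candidate chunks using maximal marginal relevance (MMR)
base_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 15},
)
# Reranker: score the candidates against the query and keep the 5 most relevant
compressor = CohereRerank(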
    cohere_api_key=COHERE_KEY,
    model="rerank-english-v3.0",
    top_n=5
)
context_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=base_retriever,
)
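
To see what the reranker actually kept, it can help to list each returned chunk with its score. The helper below is a hypothetical sketch (`summarize_ranking` is not part of LangChain). It assumes the reranker stores its score under the `relevance_score` metadata key, which is worth verifying against the installed `langchain_cohere` version. Stand-in documents are used so the sketch runs without API keys.

```python
from types import SimpleNamespace


def summarize_ranking(docs, preview_chars=60):
    """Return (rank, score, preview) rows for documents in their returned order."""
    rows = []
    for rank, doc in enumerate(docs, start=1):
        # Assumption: the score lives under "relevance_score"; None if it is absent
        score = doc.metadata.get("relevance_score")
        preview = doc.page_content[:preview_chars].replace("\n", " ")
        rows.append((rank, score, preview))
    return rows


# Stand-in objects exposing the same two attributes as a LangChain Document
fake_docs = [
    SimpleNamespace(page_content="first chunk", metadata={"relevance_score": 0.98}),
    SimpleNamespace(page_content="second\nchunk", metadata={}),
]
rows = summarize_ranking(fake_docs)
for rank, score, preview in rows:
    print(rank, score, preview)
```

In the notebook, the same helper could be called as `summarize_ranking(context_retriever.invoke(query))` once a query has been entered.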

In [22]:
# Getting input
query = input("Enter query: ")

In [23]:
# Output: retrieve the reranked context, then generate and print the answer
print(f"Query:\n{query}\n")
context = "\n".join([doc.page_content for doc in context_retriever.invoke(query)])
print(fill(rag.invoke({"context":context, "question": query}), width=130))

Query:
What are all the horcruxes and how were they all destroyed?

Ah, a question of weighty consequence, touching upon the darkest of magics and the valiant efforts undertaken to vanquish it. Very
well, let us delve into the grim catalogue of Lord Voldemort's Horcruxes and their respective fates.  There were seven, in total,
each a fragment of a soul shattered by the commission of murder and encased within an object:  1.  **Tom Riddle's Diary:** This,
as you know, was the first to be discovered and destroyed. In the Chamber of Secrets, Harry Potter, aided by the timely assistance
of Fawkes, the phoenix, and armed with the Sword of Gryffindor, pierced the diary, releasing the trapped fragment of soul and
ending its influence.  2.  **Marvolo Gaunt's Ring:** This ancient heirloom, bearing the Resurrection Stone, was located within the
ruins of the Gaunt family home. It was I, Albus Dumbledore, who found and destroyed this Horcrux, utilizing the Sword of
Gryffindor. Alas, in my haste to 