# Lesson 5: Reranking Retrieved Documents

**Objective**: Enhance the relevance of retrieved documents to improve the quality of generated responses.

**Topics**:
- Reranking techniques: monoT5, monoBERT, RankLLaMA, TILDEv2, Cohere ReRanker
- Trade-offs between speed and accuracy

**Practical Task**: Implement a reranking model and evaluate its impact on retrieval performance.

**Resources**:
- Cohere reranker
- Open-source alternative


In [1]:
from langchain_community.document_loaders import PyPDFLoader

file_path = "../data/Regulaciones cacao y chocolate 2003.pdf"

# Load the PDF and split it into chunks with the loader's default text splitter
loader = PyPDFLoader(file_path)
splitted_doc = loader.load_and_split()

In [None]:
from dotenv import load_dotenv

load_dotenv()

In [None]:
from langchain_huggingface import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

In [6]:
from langchain_qdrant import QdrantVectorStore, RetrievalMode

qdrant = QdrantVectorStore.from_documents(
    splitted_doc,
    embedding=embedding_model,
    location=":memory:",
    collection_name="my_documents",
    retrieval_mode=RetrievalMode.DENSE,
)

query = "What is chocolate"
found_docs = qdrant.similarity_search(query)

In [None]:
found_docs

In [17]:
# Fetch 10 candidates with maximal marginal relevance (MMR); these are the inputs to the reranker
retriever = qdrant.as_retriever(search_type="mmr", search_kwargs={"k": 10})
retrieved_docs = retriever.invoke("What is chocolate")

In [None]:
import os

from dotenv import load_dotenv
from rerankers import Reranker

# Read COHERE_API_KEY from the local .env file before creating the reranker
load_dotenv()
ranker = Reranker("cohere", lang="en", api_key=os.getenv("COHERE_API_KEY"))

In [None]:
retrieved_docs

In [21]:
# The reranker takes plain strings, so extract the text of each retrieved document
str_docs = [doc.page_content for doc in retrieved_docs]

In [None]:
str_docs

In [None]:
ranker.rank("What is chocolate", str_docs)
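
The practical task asks for an evaluation of the reranker's impact, which the cells above do not yet cover. A minimal sketch of one way to do it follows, assuming each retrieved chunk has been labelled by hand as relevant (1) or not (0) for the query. The label lists below are hypothetical placeholders, not results from this notebook, and the helper names are illustrative. The idea is to compute the same ranking metrics on the order returned by the retriever and on the order returned by the reranker.

```python
import math


def reciprocal_rank(relevance):
    """Return 1 / position of the first relevant item, or 0.0 if none is relevant."""
    for position, rel in enumerate(relevance, start=1):
        if rel > 0:
            return 1.0 / position
    return 0.0


def dcg_at_k(relevance, k):
    """Discounted cumulative gain over the first k positions."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevance[:k], start=1)
    )


def ndcg_at_k(relevance, k):
    """DCG normalised by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels, listed in ranked order (1 = relevant, 0 = not).
labels_before = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]  # order from the retriever
labels_after = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # same chunks, order from the reranker

for name, labels in [("retriever", labels_before), ("reranker", labels_after)]:
    rr = reciprocal_rank(labels)
    ndcg = ndcg_at_k(labels, k=5)
    print(f"{name}: reciprocal rank = {rr:.3f}, nDCG@5 = {ndcg:.3f}")
```

With real labels, averaging these values over several queries gives mean reciprocal rank and mean nDCG@k, which makes the before/after comparison less sensitive to a single query.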

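
The topics list mentions the trade-off between speed and accuracy. One rough way to look at the speed side is to time the reranking call. The sketch below is an assumption-laden illustration: `time_call` is a hypothetical helper, and `overlap_rerank` is a toy word-overlap scorer that stands in for a real reranker so the example runs without an API key.

```python
import time
from statistics import mean


def time_call(fn, repeats=3):
    """Return the mean wall-clock seconds taken by fn() over `repeats` runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return mean(durations)


def overlap_rerank(query, docs):
    """Toy stand-in reranker: order docs by how many query words they contain."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in docs]
    # sorted() is stable, so docs with equal scores keep their original order
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked]


# Hypothetical sample texts, used only to exercise the helpers.
sample_docs = [
    "cocoa butter rules",
    "chocolate is a cocoa product",
    "labelling requirements",
]

latency = time_call(lambda: overlap_rerank("What is chocolate", sample_docs))
print(f"mean latency: {latency * 1000:.3f} ms")
```

In this notebook, the same helper could wrap the real call, for example `time_call(lambda: ranker.rank("What is chocolate", str_docs))`. For a hosted reranker the measured time includes network latency, so it is only comparable with timings taken under similar conditions.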