# Document Reranking Demo

Document reranking improves retrieval quality by adding a second pass over the initial retrieval results. Instead of relying solely on embedding similarity from the vector store, a dedicated reranking model (such as Cohere's rerank model) scores each retrieved document against the query, and only the highest-scoring documents are kept. This post-processing step refines the results and reduces noise in the retrieved context.
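The retrieve-then-rerank idea described above can be sketched in plain Python. This is only an illustration under stated assumptions: `cosine_score` is a bag-of-words stand-in for embedding similarity, `toy_rerank_score` is a hypothetical stand-in for a real reranking model (it is not how Cohere's model works), and the sample documents are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for a real embedding model)."""
    return text.lower().split()


def cosine_score(query, doc):
    """Stage 1: cheap similarity over bag-of-words term counts."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def toy_rerank_score(query, doc):
    """Stage 2: hypothetical reranker that looks at the query and document together.

    Scores query-term coverage and adds a bonus when the exact query phrase appears.
    """
    q_terms, d_terms = set(tokenize(query)), set(tokenize(doc))
    coverage = len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0
    phrase_bonus = 1.0 if query.lower() in doc.lower() else 0.0
    return coverage + phrase_bonus


def retrieve_then_rerank(query, docs, k=3, top_n=1):
    """Fetch k candidates with the cheap scorer, then keep the top_n after reranking."""
    candidates = sorted(docs, key=lambda d: cosine_score(query, d), reverse=True)[:k]
    reranked = sorted(
        candidates, key=lambda d: toy_rerank_score(query, d), reverse=True
    )
    return reranked[:top_n]


# Invented sample data for the sketch.
query = "document reranking"
docs = [
    "reranking reranking reranking",
    "document reranking reorders retrieved results by relevance",
    "vector embeddings capture semantic similarity",
    "a document store holds chunks",
]

first_stage_top = max(docs, key=lambda d: cosine_score(query, d))
final = retrieve_then_rerank(query, docs, k=3, top_n=1)
```

In this toy data the first stage ranks the keyword-stuffed document highest, while the second stage promotes the document that actually covers the whole query. A real reranker would replace `toy_rerank_score` with a model call.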

## What this notebook contains
- Loading and chunking documents from web sources.
- Creating a vector database with OpenAI embeddings for initial retrieval.
- Setting up a base retriever to fetch top-k candidate documents.
- Using Cohere's reranking model to score and reorder retrieved documents.
- Building a `ContextualCompressionRetriever` that combines initial retrieval with reranking.
- Executing queries to retrieve and rerank documents based on relevance.
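
As a rough illustration of the chunking step in the list above, the sketch below splits text into fixed-size windows that overlap. It is a deliberate simplification: `chunk_text`, `chunk_size`, and `overlap` are hypothetical names for this example, and `RecursiveCharacterTextSplitter` works differently (it tries a sequence of separators rather than cutting at fixed offsets).

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (simplified, hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

The overlap means neighbouring chunks share some text, so a sentence that straddles a boundary is less likely to be cut off from its context in every chunk.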

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.load import dumps, loads
from langchain_community.llms import Cohere
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank
import numpy as np
import yaml
import bs4
import os

In [None]:
# Get the current working directory
cwd = os.getcwd()

# Build the path to config.yaml
config_path = os.path.join(cwd, '..', 'configs', 'config.yaml')

# Normalize the path
config_path = os.path.abspath(config_path)

# Load credentials from the config file
with open(config_path, 'r') as file:
    config = yaml.safe_load(file)

# Set environment variables
os.environ['LANGCHAIN_API_KEY'] = config['API']['LANGCHAIN']
os.environ['OPENAI_API_KEY'] = config['API']['OPENAI']
os.environ['TAVILY_API_KEY'] = config['API']['TAVILY']

# Configure chat LLM (temperature=0 for more repeatable output)
llm = ChatOpenAI(temperature=0)