# Lab 4: Enhancing RAG with Voyage-AI Reranking

This optional lab demonstrates how to integrate Voyage-AI's reranking capabilities into your RAG pipeline. A reranker rescores the documents returned by vector search against the user query, so that the most relevant chunks are the ones passed to the LLM, which tends to produce higher-quality responses.

## Objectives
- Understand why reranking is beneficial in RAG.
- Implement Voyage-AI's reranking API to re-order search results.
- Observe the impact of reranking on the context provided to the LLM.

## Prerequisites
- Complete Lab 1, Lab 2, and Lab 3.
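- Python environment set up with `pymongo`, `voyageai`, and `python-dotenv` installed (`langchain-openai` is installed in the next cell).
- `.env` file containing `MONGODB_URI`, `VOYAGEAI_API_KEY`, and the Azure OpenAI variables listed under "How to Run and Test" at the end of this notebook.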

In [None]:
%pip install langchain-openai --quiet

## Step 1: Load Environment Variables and Initialize Clients

In [None]:
from dotenv import load_dotenv
import os
import voyageai
from pymongo import MongoClient
from langchain_openai import AzureChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Load environment variables
load_dotenv()

# Initialize Voyage-AI Client
voyageai_api_key = os.environ.get("VOYAGEAI_API_KEY")
if not voyageai_api_key:
    raise ValueError("VOYAGEAI_API_KEY not found in .env file or environment variables.")
vo = voyageai.Client(api_key=voyageai_api_key)

# Initialize MongoDB Client
mongodb_uri = os.environ.get("MONGODB_URI")
if not mongodb_uri:
    raise ValueError("MONGODB_URI not found in .env file or environment variables.")
client = MongoClient(mongodb_uri)

azure_openai_api_key = os.environ.get("AZURE_OPENAI_API_KEY")
if not azure_openai_api_key:
    raise ValueError("AZURE_OPENAI_API_KEY not found in .env file or environment variables.")

azure_openai_endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
if not azure_openai_endpoint:
    raise ValueError("AZURE_OPENAI_ENDPOINT not found in .env file or environment variables.")

azure_openai_api_version = os.environ.get("AZURE_OPENAI_API_VERSION", "2023-05-15") # Default to a common version
azure_openai_deployment_name = os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME")
if not azure_openai_deployment_name:
    raise ValueError("AZURE_OPENAI_DEPLOYMENT_NAME not found in .env file or environment variables.")

# Initialize Azure OpenAI LLM
llm = AzureChatOpenAI(
    openai_api_version=azure_openai_api_version,
    azure_deployment=azure_openai_deployment_name,
    azure_endpoint=azure_openai_endpoint,
    api_key=azure_openai_api_key,
    temperature=0
)

# Select your database and collection
db = client['rag_db']
collection = db['documents']

print("Clients initialized successfully.")

## Step 2: Define User Query and Perform Initial Vector Search

We start with a user query and perform an initial vector search, potentially retrieving more documents than strictly needed, as reranking will help us select the best ones.
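The over-retrieval idea can be sketched as a small helper that derives the `$vectorSearch` parameters from the number of documents you want to hand to the reranker. The helper name `build_vector_search_pipeline` and the `candidate_multiplier` parameter are illustrative assumptions, not part of the lab; the field names mirror the pipeline used in this notebook.

```python
def build_vector_search_pipeline(query_vector, rerank_pool_size=10,
                                 candidate_multiplier=3,
                                 index_name="vector_index", path="embedding"):
    """Build an Atlas $vectorSearch pipeline that over-retrieves for reranking.

    Hypothetical helper: `rerank_pool_size` becomes `limit`, and
    `numCandidates` is set to rerank_pool_size * candidate_multiplier.
    """
    if rerank_pool_size < 1 or candidate_multiplier < 1:
        raise ValueError("rerank_pool_size and candidate_multiplier must be >= 1")
    return [
        {
            "$vectorSearch": {
                "queryVector": query_vector,
                "path": path,
                "numCandidates": rerank_pool_size * candidate_multiplier,
                "limit": rerank_pool_size,
                "index": index_name,
            }
        },
        {
            "$project": {
                "text_chunk": 1,
                "source": 1,
                "score": {"$meta": "vectorSearchScore"},
                "_id": 0,
            }
        },
    ]
```

With the defaults, `build_vector_search_pipeline(query_embedding)` yields `numCandidates` of 30 and `limit` of 10, the same values used in the cell below.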

In [None]:
user_query = "What are the latest security features?"
print(f"\nUser Query: {user_query}")

print("Generating query embedding with Voyage-AI...")
try:
    query_embedding_response = vo.embed(
        texts=[user_query],
        model="voyage-3-large", 
        input_type="query" 
    )
    query_embedding = query_embedding_response.embeddings[0]
    print("Query embedding generated.")
except Exception as e:
    # Raise so the notebook stops at this cell with a traceback
    raise RuntimeError(f"Error generating query embedding: {e}") from e

pipeline = [
  {
    '$vectorSearch': {
      'queryVector': query_embedding,
      'path': 'embedding',          
      'numCandidates': 30,         # Search more candidates for reranking
      'limit': 10,                  # Retrieve top 10 for reranking
      'index': 'vector_index'            
    }
  },
  {
    '$project': {
      'text_chunk': 1,
      'source': 1,
      'score': { '$meta': 'vectorSearchScore' },
      '_id': 0 
    }
  }
]

print("Performing initial vector search in MongoDB Atlas...")
initial_retrieved_documents = list(collection.aggregate(pipeline))

if initial_retrieved_documents:
    print(f"Retrieved {len(initial_retrieved_documents)} initial documents for reranking.")
    for i, doc in enumerate(initial_retrieved_documents):
        print(f"  {i+1}. Score: {doc['score']:.4f}, Source: {doc['source']}, Text: {doc['text_chunk'][:50]}...")
else:
    # Nothing to rerank: stop at this cell with a clear error
    raise RuntimeError("No initial documents found.")

## Step 3: Rerank Documents with Voyage-AI

We pass the original `user_query` and the `text_chunk` values from our initial retrieval to Voyage-AI's reranker. It returns a relevance score for each document; results come back ordered from most to least relevant, and each result includes the `index` of the document in the list we sent.
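A minimal sketch of how rerank results can be mapped back to the documents retrieved from MongoDB. To stay runnable without an API key, it uses a stand-in `FakeRerankResult` class (an assumption made for illustration) that carries only `index` and `relevance_score`, which we assume mirror the fields exposed by the Voyage-AI Python client's reranking results. `attach_rerank_scores` is likewise a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class FakeRerankResult:
    """Stand-in for one reranking result (illustrative only)."""
    index: int              # position of the document in the input list
    relevance_score: float  # higher means more relevant to the query


def attach_rerank_scores(original_docs, rerank_results):
    """Pair each result with its original document via `index`.

    The pairs are sorted by score as a safeguard, so the output order
    does not depend on the order in which results arrive.
    """
    paired = [(original_docs[r.index], r) for r in rerank_results]
    return sorted(paired, key=lambda p: p[1].relevance_score, reverse=True)


docs = [{"text_chunk": "alpha"}, {"text_chunk": "beta"}, {"text_chunk": "gamma"}]
results = [
    FakeRerankResult(index=2, relevance_score=0.91),
    FakeRerankResult(index=0, relevance_score=0.40),
    FakeRerankResult(index=1, relevance_score=0.12),
]
ranked = attach_rerank_scores(docs, results)
print([doc["text_chunk"] for doc, _ in ranked])  # ['gamma', 'alpha', 'beta']
```

Looking documents up by `index` matters because the order of the results generally differs from the order of the input list; pairing the two lists position by position would attach scores to the wrong documents.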

In [None]:
documents_to_rerank = [doc['text_chunk'] for doc in initial_retrieved_documents]

print("\nReranking documents with Voyage-AI...")
try:
    rerank_result = vo.rerank(
        query=user_query,
        documents=documents_to_rerank,
        model="rerank-2.5-lite" # Or another reranking model
    )

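    # Voyage-AI returns results already sorted by relevance (highest first).
    # Each result's `index` is the position of that document in the input list,
    # so use it to pair every score with the correct original document.
    reranked_documents_with_scores = [
        (initial_retrieved_documents[result.index], result)
        for result in rerank_result.results
    ]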

    print(f"Reranked {len(reranked_documents_with_scores)} documents.")
    print("Top reranked documents:")
    for i, (original_doc, reranked_item) in enumerate(reranked_documents_with_scores[:5]): # Show top 5
        print(f"  {i+1}. Rerank Score: {reranked_item.relevance_score:.4f}, Original Score: {original_doc['score']:.4f}, Source: {original_doc['source']}, Text: {original_doc['text_chunk'][:50]}...")

except Exception as e:
    # Raise so the notebook stops at this cell with a traceback
    raise RuntimeError(f"Error reranking documents: {e}") from e

## Step 4: Build Augmented Prompt with Reranked Context

Now we take the top documents after reranking and use them to build the context for our LLM prompt.
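One possible way to assemble the context, sketched under the assumption that each reranked entry is an `(original_doc, rerank_result)` pair as produced in Step 3. The helper name `build_context`, the `[Source: ...]` label, and the blank-line separator are illustrative choices rather than part of the lab.

```python
def build_context(reranked_pairs, top_n=3):
    """Join the top_n reranked chunks into one context string.

    Hypothetical helper: each chunk is prefixed with its source so the
    LLM (and the reader) can tell where a passage came from.
    """
    chunks = []
    for doc, _result in reranked_pairs[:top_n]:
        source = doc.get("source", "unknown")
        chunks.append(f"[Source: {source}]\n{doc['text_chunk']}")
    return "\n\n".join(chunks)


# Tiny stand-in data; the second tuple element is unused by this helper.
example_pairs = [
    ({"text_chunk": "MFA is now enforced.", "source": "release_notes.md"}, None),
    ({"text_chunk": "Audit logs were added.", "source": "changelog.md"}, None),
    ({"text_chunk": "Unrelated pricing note."}, None),
]
print(build_context(example_pairs, top_n=2))
```

Separating chunks with a blank line and a source label keeps passage boundaries visible in the prompt; the plain `"\n".join(...)` used in the next cell works as well when that is not needed.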

In [None]:
top_n_after_rerank = 3 # Choose how many top reranked documents to send to the LLM
final_context_chunks = [doc[0]['text_chunk'] for doc in reranked_documents_with_scores[:top_n_after_rerank]]
context_reranked = "\n".join(final_context_chunks)

print("\n--- Reranked Context for LLM ---")
print(context_reranked)
print("-------------------------------")

if not context_reranked:
    print("Warning: No reranked context was available.")

In [None]:
if context_reranked:
    prompt_template = ChatPromptTemplate.from_messages(
        [
            ("system", "You are a helpful assistant. Answer the user's question based on the provided context only. If you cannot find the answer in the context, politely state that the information is not available."),
            ("human", "Context:\n{context}\n\nQuestion: {question}"),
        ]
    )
    chain = prompt_template | llm
    response = chain.invoke({"context": context_reranked, "question": user_query})
    print("\n--- LLM Response (with Reranking) ---")
    print(response.content)
    print("------------------------------------------")

else:
    prompt_template = ChatPromptTemplate.from_messages(
        [
            ("system", "You are a helpful assistant. I couldn't find relevant information for the following question. Please state that the information is not available in the provided knowledge base."),
            ("human", "Question: {question}"),
        ]
    )
    chain = prompt_template | llm
    response = chain.invoke({"question": user_query})
    print("\n--- LLM Response (no context available) ---")
    print(response.content)
    print("------------------------------------------")

## Conclusion

Reranking with Voyage-AI provides an effective way to refine the search results before feeding them to an LLM, potentially leading to more accurate and relevant responses in your RAG application. Because the reranker only sees the candidates returned by vector search, retrieve a wider set than you intend to send to the LLM, then keep the top few after reranking.

## How to Run and Test (Azure OpenAI and LangChain Integration)

To run this notebook and test the Azure OpenAI and LangChain integration, set the following variables in your `.env` file or environment:

- `MONGODB_URI`: Your MongoDB Atlas connection URI.
- `VOYAGEAI_API_KEY`: Your Voyage-AI API key.
- `AZURE_OPENAI_API_KEY`: Your Azure OpenAI API key.
- `AZURE_OPENAI_ENDPOINT`: Your Azure OpenAI endpoint (e.g., `https://YOUR_RESOURCE_NAME.openai.azure.com/`).
- `AZURE_OPENAI_DEPLOYMENT_NAME`: The name of your Azure OpenAI model deployment (e.g., `gpt-4`, `gpt-35-turbo`).
- `AZURE_OPENAI_API_VERSION`: The API version for Azure OpenAI (e.g., `2023-05-15`). Optional; defaults to `2023-05-15` if not provided.
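
The variables above can be checked in one pass before any client is created. This is a sketch: `find_missing_env_vars` is a hypothetical helper, and it accepts the environment as a plain mapping so that it can be exercised without touching the real `os.environ`. `AZURE_OPENAI_API_VERSION` is left out of the required list because the notebook falls back to a default for it.

```python
import os

# Variables the notebook raises on when they are absent.
REQUIRED_ENV_VARS = (
    "MONGODB_URI",
    "VOYAGEAI_API_KEY",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
)


def find_missing_env_vars(env=None, required=REQUIRED_ENV_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with a stand-in mapping instead of the real environment.
print(find_missing_env_vars({"MONGODB_URI": "mongodb+srv://example"}))
```

In the notebook, calling `find_missing_env_vars()` right after `load_dotenv()` and raising a single `ValueError` that lists every missing name would report all configuration gaps at once instead of one per run.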

**Steps to run:**
1.  Ensure all prerequisites (including `langchain-openai`) are installed (`pip install -r requirements.txt` if you have one, or `%pip install langchain-openai --quiet` in a notebook cell).
2.  Open the notebook in a Jupyter environment.
3.  Run all cells sequentially. Observe the output from the LangChain-powered Azure OpenAI LLM.

In [None]:
# Don't forget to close the MongoDB client connection
client.close()
print("MongoDB client connection closed.")