# BGE Reranker v2 m3 - Local Deployment and Testing

This notebook shows how to load and test the BGE Reranker v2 m3 model from Hugging Face on a local machine. A reranker scores each candidate document against a query, so the candidates can be reordered by relevance to improve search results.

## 1. Installation

First, let's install the necessary packages:

In [None]:
# UNCOMMENT THE NEXT LINE TO INSTALL DEPENDENCIES

# !pip install transformers torch

## 2. Loading the BGE Reranker Model

Next, import the required libraries and load the BGE Reranker v2 m3 model and its tokenizer from Hugging Face:

In [2]:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Set model name
model_name = "BAAI/bge-reranker-v2-m3"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Move model to the appropriate device
model = model.to(device)

Using device: cpu


## 3. Creating a Reranking Function

Now we'll create a function to rerank documents based on a query:

In [3]:
def rerank(query, documents, model, tokenizer, device, top_n=None):
    """Rerank documents based on their relevance to a query.
    
    Args:
        query (str): The query string
        documents (list): List of document strings to rerank
        model: The reranker model
        tokenizer: The tokenizer for the model
        device: The device to run inference on
        top_n (int, optional): Number of top documents to return. If None, return all documents.
    
    Returns:
        list: List of (document, score) pairs sorted by relevance score in descending order
    """
    # Nothing to score: return early rather than tokenizing an empty batch
    if not documents:
        return []

    # Prepare (query, document) pairs
    pairs = [[query, doc] for doc in documents]
    
    # Tokenize the pairs
    with torch.no_grad():
        inputs = tokenizer(
            pairs,
            padding=True,
            truncation=True,
            return_tensors="pt",
            max_length=512
        ).to(device)
        
        # Get scores from the model
        scores = model(**inputs).logits.flatten()
    
    # Convert scores to list and create (document, score) pairs
    scores = scores.cpu().tolist()
    doc_score_pairs = list(zip(documents, scores))
    
    # Sort by score in descending order
    ranked_results = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
    
    # Return top_n results if specified
    if top_n is not None:
        return ranked_results[:top_n]
    return ranked_results
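
The function above tokenizes and scores every pair in a single forward pass, which can exhaust memory when the candidate list is long. A minimal sketch of one way to bound memory is to score the pairs in fixed-size chunks. The names `score_in_batches`, `score_fn`, and `toy_score_fn` are hypothetical and not part of the notebook: `score_fn` stands for any callable that turns a list of pairs into a list of scores (for example, a thin wrapper around the tokenizer and model call), and `toy_score_fn` is only a word-overlap stand-in so the sketch runs without loading the model.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def score_in_batches(
    pairs: Sequence[Pair],
    score_fn: Callable[[List[Pair]], List[float]],
    batch_size: int = 16,
) -> List[float]:
    """Score (query, document) pairs in fixed-size chunks, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    scores: List[float] = []
    for start in range(0, len(pairs), batch_size):
        # Only this slice is passed to the scorer at once
        batch = list(pairs[start:start + batch_size])
        scores.extend(score_fn(batch))
    return scores


def toy_score_fn(batch: List[Pair]) -> List[float]:
    """Illustrative stand-in scorer (assumption): counts words shared by query and document."""
    return [
        float(len(set(q.lower().split()) & set(d.lower().split())))
        for q, d in batch
    ]


pairs = [
    ("red apple", "a red apple pie"),
    ("red apple", "blue sky"),
    ("red apple", "apple juice"),
]
batched_scores = score_in_batches(pairs, toy_score_fn, batch_size=2)
print(batched_scores)
```

Because the scores come back in input order, they can be zipped with `documents` and sorted exactly as `rerank` does. A suitable `batch_size` depends on the available memory and the document lengths, so treat the default here as a placeholder.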

## 4. Testing with a Simple Example

Let's test our reranker with a simple example:

In [4]:
# Define a query and some documents
query = "What are the benefits of regular exercise?"

documents = [
    "Regular exercise improves cardiovascular health, boosts mood, and helps maintain a healthy weight.",
    "The capital of France is Paris, which is known for its beautiful architecture.",
    "Exercise has been shown to reduce the risk of chronic diseases such as diabetes and heart disease.",
    "Coffee contains caffeine which can improve alertness and concentration.",
    "Physical activity strengthens muscles and bones, and can improve sleep quality."
]

# Rerank the documents
ranked_results = rerank(query, documents, model, tokenizer, device)

# Display the ranked results
print(f"Query: {query}\n")
print("Ranked Documents (most to least relevant):")
for i, (doc, score) in enumerate(ranked_results):
    print(f"\n{i+1}. Score: {score:.4f}")
    print(f"   Document: {doc}")

Query: What are the benefits of regular exercise?

Ranked Documents (most to least relevant):

1. Score: 6.7963
   Document: Regular exercise improves cardiovascular health, boosts mood, and helps maintain a healthy weight.

2. Score: 1.7936
   Document: Exercise has been shown to reduce the risk of chronic diseases such as diabetes and heart disease.

3. Score: 0.7743
   Document: Physical activity strengthens muscles and bones, and can improve sleep quality.

4. Score: -9.6014
   Document: Coffee contains caffeine which can improve alertness and concentration.

5. Score: -11.0410
   Document: The capital of France is Paris, which is known for its beautiful architecture.
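
The scores printed above are raw logits, so they are unbounded and can be negative. If values in a fixed range are easier to work with (for example, to apply a cutoff), a common option is to pass each logit through a sigmoid, which maps it into (0, 1) without changing the ranking. The sketch below uses only the standard library; the `sigmoid` helper is written here for illustration and is not taken from the notebook or from any library API.

```python
import math


def sigmoid(logit: float) -> float:
    """Numerically stable logistic function mapping a real-valued logit into (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    # For negative inputs, avoid computing exp of a large positive number
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)


# Raw scores copied from the output above
raw_scores = [6.7963, 1.7936, 0.7743, -9.6014, -11.0410]
normalized = [sigmoid(s) for s in raw_scores]

for raw, norm in zip(raw_scores, normalized):
    print(f"{raw:>9.4f} -> {norm:.4f}")
```

Since the sigmoid is strictly increasing, sorting by the normalized values gives the same order as sorting by the raw logits. The normalized values are not calibrated probabilities of relevance, so any threshold should be chosen empirically for the data at hand.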


## 5. Testing with a More Complex Example

Now let's test with a more complex example to demonstrate the reranker's capabilities:

In [5]:
# Define a more specific query and some potentially relevant documents
query = "What programming language is best for machine learning?"

documents = [
    "Python is widely used in machine learning due to its simplicity and the availability of libraries like TensorFlow and PyTorch.",
    "Java is an object-oriented programming language that's been around since the 1990s.",
    "R is a programming language specifically designed for statistical computing, making it popular for data analysis and machine learning.",
    "JavaScript is primarily used for web development and creating interactive web pages.",
    "Machine learning engineers often prefer Python because of its extensive ecosystem of ML libraries and frameworks.",
    "C++ can be used for machine learning when performance is critical, especially in production environments.",
    "The best programming language for a project depends on specific requirements and constraints."
]

# Rerank the documents
ranked_results = rerank(query, documents, model, tokenizer, device)

# Display the ranked results
print(f"Query: {query}\n")
print("Ranked Documents (most to least relevant):")
for i, (doc, score) in enumerate(ranked_results):
    print(f"\n{i+1}. Score: {score:.4f}")
    print(f"   Document: {doc}")

Query: What programming language is best for machine learning?

Ranked Documents (most to least relevant):

1. Score: 3.1123
   Document: R is a programming language specifically designed for statistical computing, making it popular for data analysis and machine learning.

2. Score: 2.6278
   Document: Python is widely used in machine learning due to its simplicity and the availability of libraries like TensorFlow and PyTorch.

3. Score: 2.2042
   Document: Machine learning engineers often prefer Python because of its extensive ecosystem of ML libraries and frameworks.

4. Score: 2.1942
   Document: C++ can be used for machine learning when performance is critical, especially in production environments.

5. Score: -2.4278
   Document: Java is an object-oriented programming language that's been around since the 1990s.

6. Score: -2.5449
   Document: The best programming language for a project depends on specific requirements and constraints.

7. Score: -4.0099
   Document: JavaScript is