# Rerankers

This notebook shows how to use RedisVL to rerank search results (documents, chunks, or records) by their relevance to an input query. RedisVL currently supports reranking through:

- A reranker that uses pre-trained [Cross-Encoders](https://sbert.net/examples/applications/cross-encoder/README.html), loading either one of the [Hugging Face cross encoder models](https://huggingface.co/cross-encoder) or another Hugging Face model that implements a cross-encoder function ([example: BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)).
- The [Cohere /rerank API](https://docs.cohere.com/docs/rerank-2).
- The [VoyageAI /rerank API](https://docs.voyageai.com/docs/reranker).

Before running this notebook, be sure to:
1. Have the RedisVL JAR on your classpath
2. Have a running Redis Stack instance with RediSearch > 2.4 enabled

For example, you can run Redis Stack locally with Docker:

```bash
docker run -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
```

This will run Redis on port 6379 and RedisInsight at http://localhost:8001.
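Before running the remaining cells, it can help to confirm that something is actually listening on the Redis port. The following is a minimal sketch that uses only the JDK; the class name `PortCheck` and the timeout value are illustrative assumptions, and a successful TCP connection only shows that the port is open, not that the RediSearch module is loaded.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or the host could not be resolved
            return false;
        }
    }
}
```

With the Docker command above, `PortCheck.isReachable("localhost", 6379, 500)` should return `true` once the container has started.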

In [1]:
// Load Maven dependencies
%maven com.microsoft.onnxruntime:onnxruntime:1.20.0
%maven com.google.code.gson:gson:2.11.0
%maven org.slf4j:slf4j-nop:2.0.16
%maven com.squareup.okhttp3:okhttp:4.12.0
%maven ai.djl.huggingface:tokenizers:0.30.0
%maven com.cohere:cohere-java:1.8.1

// Note: RedisVL JAR must be in classpath (loaded automatically by Docker container)

In [2]:
// Import RedisVL reranking classes
import com.redis.vl.utils.rerank.*;

// Import Java standard libraries
import java.util.*;
import java.nio.file.*;
import java.io.*;

## Simple Reranking

Reranking provides a relevance boost to search results generated by traditional (lexical) or semantic search strategies.

As a simple demonstration, take the passages and user query below:

In [3]:
String query = "What is the capital of the United States?";

List<String> docs = Arrays.asList(
    "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.",
    "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.",
    "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.",
    "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment."
);

System.out.println("Query: " + query);
System.out.println("Number of documents: " + docs.size());

Query: What is the capital of the United States?
Number of documents: 5


The goal of reranking is to refine the ordering of an initial result set so that the most relevant items come first. With RedisVL, that initial set would typically come from a search operation such as full-text or vector search.
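Conceptually, a reranker scores each candidate against the query, sorts by score, and keeps the top few. The sketch below illustrates that shape with a deliberately crude term-overlap score standing in for a real model. `ToyReranker`, `Scored`, and the scoring rule are hypothetical names and choices made for illustration; they are not part of RedisVL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class ToyReranker {
    /** A document paired with its relevance score. */
    static final class Scored {
        final String doc;
        final double score;
        Scored(String doc, double score) { this.doc = doc; this.score = score; }
    }

    /** Lowercases the text and splits it into a set of alphanumeric terms. */
    static Set<String> terms(String text) {
        Set<String> out = new HashSet<>(
            Arrays.asList(text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")));
        out.remove(""); // split() can yield an empty leading token
        return out;
    }

    /** Toy score: fraction of query terms that also occur in the document. */
    static double score(String query, String doc) {
        Set<String> q = terms(query);
        if (q.isEmpty()) return 0.0;
        Set<String> d = terms(doc);
        long hits = q.stream().filter(d::contains).count();
        return (double) hits / q.size();
    }

    /** Scores every document, sorts by descending score, and keeps the top `limit`. */
    static List<Scored> rank(String query, List<String> docs, int limit) {
        List<Scored> scored = new ArrayList<>();
        for (String doc : docs) {
            scored.add(new Scored(doc, score(query, doc)));
        }
        scored.sort(Comparator.comparingDouble((Scored s) -> s.score).reversed());
        return new ArrayList<>(scored.subList(0, Math.min(limit, scored.size())));
    }
}
```

A term-overlap score cannot tell "capital city" from "capital punishment", which is exactly the kind of distinction the model-based rerankers in the rest of this notebook are meant to handle.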

### Using the Cross-Encoder Reranker

To use the cross-encoder reranker, we create an instance of `HFCrossEncoderReranker` and pass it a suitable model (if no model is provided, the `cross-encoder/ms-marco-MiniLM-L-6-v2` model is used).

In [4]:
// Use BAAI/bge-reranker-base to match Python notebook
HFCrossEncoderReranker crossEncoderReranker = HFCrossEncoderReranker.builder()
    .model("BAAI/bge-reranker-base")
    .build();

System.out.println("HFCrossEncoderReranker initialized with model: " + crossEncoderReranker.getModel());

HFCrossEncoderReranker initialized with model: BAAI/bge-reranker-base


### Rerank documents with HFCrossEncoderReranker

With the obtained reranker instance we can rerank and truncate the list of documents based on relevance to the initial query.

In [5]:
RerankResult result = crossEncoderReranker.rank(query, docs);

// Java API: result.getDocuments() and result.getScores()
// Python API: results, scores = reranker.rank(...)

List<?> results = result.getDocuments();
List<Double> scores = result.getScores();

for (int i = 0; i < results.size(); i++) {
    System.out.println(scores.get(i) + " -- " + results.get(i));
}

0.9999381303787231 -- Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.
0.38023582100868225 -- Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.
0.07461141049861908 -- Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.


Notice that the document about Washington, D.C. (the correct answer) is ranked first!
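The scores above drop sharply after the first result, so a downstream step may want to discard low-scoring documents rather than always keeping a fixed number. Here is a minimal sketch of such a filter over parallel document and score lists. `ScoreFilter` is a hypothetical helper, and any cutoff value is an assumption that would need tuning per model, since scores from different rerankers are not necessarily comparable.

```java
import java.util.ArrayList;
import java.util.List;

final class ScoreFilter {
    /** Returns the documents whose score at the same index is >= minScore, preserving order. */
    static <T> List<T> keepAbove(List<T> docs, List<Double> scores, double minScore) {
        if (docs.size() != scores.size()) {
            throw new IllegalArgumentException("docs and scores must have the same length");
        }
        List<T> kept = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (scores.get(i) >= minScore) {
                kept.add(docs.get(i));
            }
        }
        return kept;
    }
}
```

In this notebook it could be applied as `List<?> confident = ScoreFilter.keepAbove(results, scores, 0.5);`, which would keep only the Washington, D.C. passage for the scores printed above.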

### Using the Cohere Reranker

To initialize the Cohere reranker, you'll need the Cohere Java SDK (the `com.cohere:cohere-java` dependency loaded above) and a valid Cohere API key.

In [6]:
// Load API key from .env file
String cohereApiKey = null;
try {
    Path envPath = Paths.get(".env");
    if (Files.exists(envPath)) {
        List<String> lines = Files.readAllLines(envPath);
        for (String line : lines) {
            if (line.startsWith("COHERE_API_KEY=")) {
                cohereApiKey = line.substring("COHERE_API_KEY=".length()).trim();
                break;
            }
        }
    }
} catch (IOException e) {
    System.err.println("Error reading .env file: " + e.getMessage());
}

if (cohereApiKey == null || cohereApiKey.isEmpty()) {
    cohereApiKey = System.getenv("COHERE_API_KEY");
}

if (cohereApiKey != null && !cohereApiKey.isEmpty()) {
    System.out.println("Cohere API key loaded successfully!");
} else {
    System.out.println("WARNING: COHERE_API_KEY not found in .env file or environment variables");
    System.out.println("Please set COHERE_API_KEY to use CohereReranker");
}

Cohere API key loaded successfully!
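The same lookup is needed again later for the VoyageAI key, so it can be factored into a small helper. The sketch below mirrors the logic of the cell above (`.env` file first, then the process environment). `ApiKeys` and its method names are hypothetical, and the parser is intentionally simple: it does not handle quoted values, `export` prefixes, or comments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

final class ApiKeys {
    /** Looks up NAME=value in a .env-style file; empty if the file, key, or value is missing. */
    static Optional<String> fromEnvFile(Path envFile, String name) {
        if (!Files.exists(envFile)) {
            return Optional.empty();
        }
        String prefix = name + "=";
        try {
            for (String line : Files.readAllLines(envFile)) {
                String trimmed = line.trim();
                if (trimmed.startsWith(prefix)) {
                    String value = trimmed.substring(prefix.length()).trim();
                    return value.isEmpty() ? Optional.empty() : Optional.of(value);
                }
            }
        } catch (IOException e) {
            System.err.println("Error reading " + envFile + ": " + e.getMessage());
        }
        return Optional.empty();
    }

    /** Tries the .env file first, then falls back to the process environment. */
    static Optional<String> load(Path envFile, String name) {
        Optional<String> fromFile = fromEnvFile(envFile, name);
        if (fromFile.isPresent()) {
            return fromFile;
        }
        String fromEnv = System.getenv(name);
        return (fromEnv == null || fromEnv.isEmpty()) ? Optional.empty() : Optional.of(fromEnv);
    }
}
```

Usage would look like `ApiKeys.load(Paths.get(".env"), "COHERE_API_KEY").orElse(null)`.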


In [7]:
Map<String, String> apiConfig = Map.of("api_key", cohereApiKey);

CohereReranker cohereReranker = CohereReranker.builder()
    .limit(3)
    .apiConfig(apiConfig)
    .build();

System.out.println("CohereReranker initialized with model: " + cohereReranker.getModel());
System.out.println("Limit: " + cohereReranker.getLimit());

CohereReranker initialized with model: rerank-english-v3.0
Limit: 3
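One detail worth knowing about the cell above: `Map.of` rejects `null` values, so if the key was not found, building `apiConfig` fails with a bare `NullPointerException`. A small guard can turn that into a clearer error. This is a sketch under the assumption that the reranker expects a map with an `api_key` entry, as the cell above uses; `ApiConfig` is a hypothetical helper name.

```java
import java.util.Map;

final class ApiConfig {
    /** Builds the {"api_key": key} map, failing early with a readable message if the key is absent. */
    static Map<String, String> of(String providerName, String apiKey) {
        if (apiKey == null || apiKey.trim().isEmpty()) {
            throw new IllegalStateException(
                providerName + " API key is missing; set it before building the reranker");
        }
        return Map.of("api_key", apiKey);
    }
}
```

For example, `ApiConfig.of("Cohere", cohereApiKey)` could replace the direct `Map.of(...)` call.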


### Rerank documents with CohereReranker

Below we will use the `CohereReranker` to rerank and truncate the list of documents above based on relevance to the initial query.

In [8]:
RerankResult cohereResult = cohereReranker.rank(query, docs);

List<?> cohereResults = cohereResult.getDocuments();
List<Double> cohereScores = cohereResult.getScores();

System.out.println("\n=== Cohere Reranking Results (String Docs) ===");
for (int i = 0; i < cohereResults.size(); i++) {
    System.out.println(cohereScores.get(i) + " -- " + cohereResults.get(i));
}


=== Cohere Reranking Results (String Docs) ===
0.9990563988685608 -- Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.
0.7516481280326843 -- Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment.
0.08882029354572296 -- The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.


### Working with semi-structured documents

Often the initial result set includes other metadata and fields that can be used to steer reranking relevance. To take advantage of this, set the `rankBy` option on the builder (`rank_by` in the Python API) and provide documents that contain those fields.

In [9]:
// Create structured documents with "passage" and "source" fields
List<Map<String, Object>> cohereStructuredDocs = new ArrayList<>();

cohereStructuredDocs.add(Map.of(
    "source", "wiki",
    "passage", "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274."
));

cohereStructuredDocs.add(Map.of(
    "source", "encyclopedia",
    "passage", "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan."
));

cohereStructuredDocs.add(Map.of(
    "source", "textbook",
    "passage", "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas."
));

cohereStructuredDocs.add(Map.of(
    "source", "textbook",
    "passage", "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America."
));

cohereStructuredDocs.add(Map.of(
    "source", "wiki",
    "passage", "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment."
));

System.out.println("Created " + cohereStructuredDocs.size() + " structured documents for Cohere");

Created 5 structured documents for Cohere


In [10]:
// Rerank with the rankBy builder option
// Note: CohereReranker requires rankBy to be specified for dictionary documents
CohereReranker cohereRerankBy = CohereReranker.builder()
    .limit(3)
    .rankBy(Arrays.asList("passage", "source"))
    .apiConfig(apiConfig)
    .build();

RerankResult cohereStructuredResult = cohereRerankBy.rank(query, cohereStructuredDocs);

@SuppressWarnings("unchecked")
List<Map<String, Object>> cohereRerankedResults = (List<Map<String, Object>>) cohereStructuredResult.getDocuments();
List<Double> cohereStructuredScores = cohereStructuredResult.getScores();

System.out.println("\n=== Cohere Reranking Results (Structured Docs) ===");
for (int i = 0; i < cohereRerankedResults.size(); i++) {
    System.out.println(cohereStructuredScores.get(i) + " -- " + cohereRerankedResults.get(i));
}


=== Cohere Reranking Results (Structured Docs) ===
0.9988120794296265 -- {source=textbook, passage=Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.}
0.5974904298782349 -- {source=wiki, passage=Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment.}
0.05910154804587364 -- {source=encyclopedia, passage=The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.}
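Because the ranking depends on the fields named in `rankBy`, it can be useful to verify that every document actually carries them before calling the API. The following is a minimal pre-flight check; `RankByCheck` is a hypothetical helper, and how the reranker itself reacts to a missing field is not covered here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RankByCheck {
    /** Returns one message per (document index, field) pair where the field is absent or null. */
    static List<String> missingFields(List<Map<String, Object>> docs, List<String> rankBy) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            for (String field : rankBy) {
                if (docs.get(i).get(field) == null) {
                    problems.add("document " + i + " is missing field '" + field + "'");
                }
            }
        }
        return problems;
    }
}
```

For the documents above, `RankByCheck.missingFields(cohereStructuredDocs, Arrays.asList("passage", "source"))` returns an empty list, since all five documents define both fields.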


### Using the VoyageAI Reranker

To initialize the VoyageAI reranker, you'll need a valid VoyageAI API key.

In [11]:
// Load VoyageAI API key from .env file
String voyageApiKey = null;
try {
    Path envPath = Paths.get(".env");
    if (Files.exists(envPath)) {
        List<String> lines = Files.readAllLines(envPath);
        for (String line : lines) {
            if (line.startsWith("VOYAGE_API_KEY=")) {
                voyageApiKey = line.substring("VOYAGE_API_KEY=".length()).trim();
                break;
            }
        }
    }
} catch (IOException e) {
    System.err.println("Error reading .env file: " + e.getMessage());
}

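// Fall back to the environment variable, as done for the Cohere key
if (voyageApiKey == null || voyageApiKey.isEmpty()) {
    voyageApiKey = System.getenv("VOYAGE_API_KEY");
}

if (voyageApiKey != null && !voyageApiKey.isEmpty()) {
    System.out.println("VoyageAI API key loaded successfully!");
} else {
    System.err.println("Warning: VOYAGE_API_KEY not found in .env file or environment variables");
}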

VoyageAI API key loaded successfully!


### Rerank documents with VoyageAIReranker

Below we will use the `VoyageAIReranker` to rerank and truncate the list of documents above based on relevance to the initial query.

The example below uses VoyageAI's `rerank-lite-1` model and keeps the top 3 results.

In [12]:
Map<String, String> voyageApiConfig = Map.of("api_key", voyageApiKey);

VoyageAIReranker voyageReranker = VoyageAIReranker.builder()
    .model("rerank-lite-1")
    .limit(3)
    .apiConfig(voyageApiConfig)
    .build();

System.out.println("VoyageAI reranker initialized with model: " + voyageReranker.getModel());

VoyageAI reranker initialized with model: rerank-lite-1


In [13]:
RerankResult voyageResult = voyageReranker.rank(query, docs);

List<?> voyageRankedDocs = voyageResult.getDocuments();
List<Double> voyageScores = voyageResult.getScores();

System.out.println("\n=== VoyageAI Reranked Results ===");
for (int i = 0; i < voyageRankedDocs.size(); i++) {
    String doc = (String) voyageRankedDocs.get(i);
    String preview = doc.substring(0, Math.min(60, doc.length()));
    System.out.println(String.format("%.6f -- %s...", voyageScores.get(i), preview));
}


=== VoyageAI Reranked Results ===
0.796875 -- Washington, D.C. (also known as simply Washington or D.C., a...
0.578125 -- Charlotte Amalie is the capital and largest city of the Unit...
0.562500 -- Carson City is the capital city of the American state of Nev...


In [14]:
// VoyageAI requires documents to have a 'content' field for structured docs
// Let's create compatible structured docs

List<Map<String, Object>> voyageStructuredDocs = new ArrayList<>();
for (Map<String, Object> doc : cohereStructuredDocs) {
    // Convert 'passage' field to 'content' field for VoyageAI
    Map<String, Object> voyageDoc = new HashMap<>();
    voyageDoc.put("source", doc.get("source"));
    voyageDoc.put("content", doc.get("passage"));
    voyageStructuredDocs.add(voyageDoc);
}

RerankResult voyageStructuredResult = voyageReranker.rank(query, voyageStructuredDocs);

List<?> voyageStructuredResults = voyageStructuredResult.getDocuments();
List<Double> voyageStructuredScores = voyageStructuredResult.getScores();

System.out.println("\n=== VoyageAI Reranked Results (Structured Docs) ===");
for (int i = 0; i < voyageStructuredResults.size(); i++) {
    System.out.println(String.format("%.6f -- %s", voyageStructuredScores.get(i), voyageStructuredResults.get(i)));
}


=== VoyageAI Reranked Results (Structured Docs) ===
0.796875 -- {source=textbook, content=Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.}
0.578125 -- {source=textbook, content=Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.}
0.562500 -- {source=wiki, content=Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.}
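The conversion loop in the cell above copies only the `source` and `passage` fields. When documents carry more metadata, a generic rename that preserves every other field may be more convenient. This is a sketch with a hypothetical helper named `FieldRename`; it copies each map so the original documents are left untouched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FieldRename {
    /** Copies each document, moving the value under `from` to `to`; other fields are kept as-is. */
    static List<Map<String, Object>> rename(List<Map<String, Object>> docs, String from, String to) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Map<String, Object> copy = new HashMap<>(doc); // mutable copy of the source map
            if (copy.containsKey(from)) {
                copy.put(to, copy.remove(from));
            }
            out.add(copy);
        }
        return out;
    }
}
```

With this helper, the loop above could be written as `FieldRename.rename(cohereStructuredDocs, "passage", "content")`.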


## Summary

This notebook demonstrated how to use RedisVL's reranking capabilities with:

- **HFCrossEncoderReranker**: Local ONNX-based reranking using Hugging Face models
- **CohereReranker**: Cloud-based reranking with Cohere's API, supporting structured documents
- **VoyageAIReranker**: Cloud-based reranking with VoyageAI's API

All rerankers successfully ranked Washington, D.C. as the most relevant document for the query "What is the capital of the United States?".
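The agreement on the top result can also be checked programmatically rather than by reading the printed output. The sketch below compares the first element of several rankings; `TopAgreement` is a hypothetical helper, and it only makes sense for rankings over the same document representation (for instance, the plain string documents).

```java
import java.util.List;
import java.util.Objects;

final class TopAgreement {
    /** True if all rankings are non-empty and share the same first element. */
    static boolean sameTopResult(List<? extends List<?>> rankings) {
        if (rankings.isEmpty()) {
            return false;
        }
        for (List<?> ranking : rankings) {
            if (ranking.isEmpty()) {
                return false;
            }
        }
        Object top = rankings.get(0).get(0);
        for (List<?> ranking : rankings) {
            if (!Objects.equals(top, ranking.get(0))) {
                return false;
            }
        }
        return true;
    }
}
```

In this notebook, `TopAgreement.sameTopResult(List.of(results, cohereResults, voyageRankedDocs))` would compare the three string-document rankings. Note that the rerankers differ below the first position, so agreement at rank 1 says nothing about the rest of the ordering.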