# Improving Semantic Search with Rerankers
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/togethercomputer/together-cookbook/blob/main/Search_with_Reranking.ipynb)

## Introduction

This notebook demonstrates how to enhance search results using **reranking** - a two-stage retrieval process that significantly improves search quality and relevance.

<img src="../images/reranking.png" width="1000">

**What is Reranking?**

Reranking is a two-stage process:
1. **Initial Retrieval**: Use fast embedding-based search to get candidate documents
2. **Reranking**: Use a more sophisticated model to reorder candidates based on query relevance

This approach combines the speed of embedding search with the precision of cross-attention models.
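
As a rough illustration of the two stages, here is a minimal sketch on toy data. The 2-D vectors, the `retrieve_then_rerank` helper, and the keyword-based `toy_rerank_score` are hypothetical stand-ins for the embedding model and the reranker used later in this notebook, not their actual behavior.

```python
import numpy as np

def retrieve_then_rerank(query_vec, doc_vecs, docs, rerank_score, k=3, top_n=2):
    """Stage 1: keep the top-k docs by cosine similarity. Stage 2: re-score only those k."""
    # Normalize so that a dot product equals cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    candidates = np.argsort(-(d @ q))[:k]  # cheap first pass over all docs
    # More expensive second pass (hypothetical scorer) over the k candidates only
    reranked = sorted(candidates, key=lambda i: rerank_score(docs[i]), reverse=True)
    return [docs[i] for i in reranked[:top_n]]

def toy_rerank_score(doc):
    # Hypothetical scorer: a real reranker would read the query and document together
    return 1.0 if "batman" in doc.lower() else 0.0

docs = ["Batman Begins", "Interstellar", "Despicable Me 2", "Batman Returns"]
doc_vecs = np.array([[0.9, 0.3], [0.1, 0.9], [0.9, 0.1], [0.8, 0.4]])  # made-up embeddings
query_vec = np.array([1.0, 0.0])

top_docs = retrieve_then_rerank(query_vec, doc_vecs, docs, toy_rerank_score)
print(top_docs)
```

In this toy setup the first stage ranks "Despicable Me 2" highest, and the second stage moves the two Batman titles above it.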

### Install relevant libraries

In [1]:
!pip install together

Collecting together
  Downloading together-1.3.3-py3-none-any.whl.metadata (11 kB)
Downloading together-1.3.3-py3-none-any.whl (68 kB)
Installing collected packages: together
Successfully installed together-1.3.3


In [2]:
import together, os

# Paste in your Together AI API Key or load it
TOGETHER_API_KEY = os.environ.get("TOGETHER_API_KEY")

## Download and view the Dataset

In [None]:
# Let's get the movies dataset
!wget https://raw.githubusercontent.com/togethercomputer/together-cookbook/refs/heads/main/datasets/movies.json
!mkdir -p datasets
!mv movies.json datasets/movies.json

In [None]:
import json

with open('datasets/movies.json', 'r') as file:
    movies_data = json.load(file)

movies_data[10:13]

[{'title': 'Terminator Genisys',
  'overview': "The year is 2029. John Connor, leader of the resistance continues the war against the machines. At the Los Angeles offensive, John's fears of the unknown future begin to emerge when TECOM spies reveal a new plot by SkyNet that will attack him from both fronts; past and future, and will ultimately change warfare forever.",
  'director': 'Alan Taylor',
  'genres': 'Science Fiction Action Thriller Adventure',
  'tagline': 'Reset the future'},
 {'title': 'Captain America: Civil War',
  'overview': 'Following the events of Age of Ultron, the collective governments of the world pass an act designed to regulate all superhuman activity. This polarizes opinion amongst the Avengers, causing two factions to side with Iron Man or Captain America, which causes an epic battle between former allies.',
  'director': 'Anthony Russo',
  'genres': 'Adventure Action Science Fiction',
  'tagline': 'Divided We Fall'},
 {'title': 'Whiplash',
  'overview': 'Unde

## Implement Semantic Search Pipeline

Below we implement a simple semantic search pipeline:
1. Embed movie documents + query
2. Obtain a list of movies ranked based on cosine similarities between the query and movie vectors.
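
To make step 2 concrete, here is a minimal NumPy sketch of cosine similarity and the ranking it produces. The `cosine_scores` helper and the toy vectors are illustrative assumptions; the notebook itself uses scikit-learn's `cosine_similarity` on real embeddings.

```python
import numpy as np

def cosine_scores(query_vec, doc_vecs):
    """Cosine similarity between one query vector and each row of doc_vecs."""
    q = np.asarray(query_vec, dtype=float)
    d = np.asarray(doc_vecs, dtype=float)
    # Dot product divided by the product of the vector norms
    return (d @ q) / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))

toy_query = [1.0, 0.0]
toy_docs = [[2.0, 0.0],   # same direction as the query
            [0.0, 3.0],   # orthogonal to the query
            [1.0, 1.0]]   # 45 degrees from the query

scores = cosine_scores(toy_query, toy_docs)
ranking = np.argsort(-scores)  # document indices, most similar first
print(scores, ranking)
```

Cosine similarity ignores vector length, so the first toy document scores 1.0 even though it is twice as long as the query vector.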

In [4]:
# This function will be used to access the Together API to generate embeddings for the movie plots

from typing import List

def generate_embeddings(input_texts: List[str], model_api_string: str) -> List[List[float]]:
    """Generate embeddings from Together python library.

    Args:
        input_texts: a list of string input texts.
        model_api_string: str. An API string for a specific embedding model of your choice.

    Returns:
        embeddings_list: a list of embeddings. Each element corresponds to one input text.
    """
    together_client = together.Together(api_key = TOGETHER_API_KEY)
    outputs = together_client.embeddings.create(
        input=input_texts,
        model=model_api_string,
    )
    return [x.embedding for x in outputs.data]


In [6]:
# Concatenate the title, overview, and tagline of each movie.
# This makes the text embedded for each movie more informative,
# so the resulting embeddings capture more about the movie.
to_embed = []
for movie in movies_data[:1000]:
    text = ''
    for field in ['title', 'overview', 'tagline']:
        value = movie.get(field, '')
        text += str(value) + ' '
    to_embed.append(text.strip())

to_embed[:10]

['Minions Minions Stuart, Kevin and Bob are recruited by Scarlet Overkill, a super-villain who, alongside her inventor husband Herb, hatches a plot to take over the world. Before Gru, they had a history of bad bosses',
 'Interstellar Interstellar chronicles the adventures of a group of explorers who make use of a newly discovered wormhole to surpass the limitations on human space travel and conquer the vast distances involved in an interstellar voyage. Mankind was born on Earth. It was never meant to die here.',
 'Deadpool Deadpool tells the origin story of former Special Forces operative turned mercenary Wade Wilson, who after being subjected to a rogue experiment that leaves him with accelerated healing powers, adopts the alter ego Deadpool. Armed with his new abilities and a dark, twisted sense of humor, Deadpool hunts down the man who nearly destroyed his life. Witness the beginning of a happy ending',
 'Guardians of the Galaxy Light years from Earth, 26 years after being abducted,

In [7]:
# Use bge-base-en-v1.5 model to generate embeddings
embeddings = generate_embeddings(to_embed, 'BAAI/bge-base-en-v1.5')

In [9]:
# bge-base-en-v1.5 model generates 768-dimensional embeddings
len(embeddings[0])

768

In [10]:
# Generate the vector embeddings for the query
query = "super hero mystery action movie about bats"

query_embedding = generate_embeddings([query], 'BAAI/bge-base-en-v1.5')[0]

In [11]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Calculate cosine similarity between the query embedding and each movie embedding
similarity_scores = cosine_similarity([query_embedding], embeddings)

In [12]:
# We get a similarity score for each of our 1000 movies - the higher the score, the more similar the movie is to the query
similarity_scores.shape

(1, 1000)

In [13]:
similarity_scores[0, :50]

array([0.40771197, 0.38049766, 0.41696283, 0.45355013, 0.42999108,
       0.38060093, 0.45121266, 0.45003822, 0.3967806 , 0.48520776,
       0.40464459, 0.42882582, 0.38652118, 0.57960979, 0.34677179,
       0.46018015, 0.35667508, 0.54620629, 0.46283555, 0.38756721,
       0.45288009, 0.43207754, 0.44227613, 0.41876672, 0.49631512,
       0.41780368, 0.45989605, 0.34773829, 0.38279619, 0.44958772,
       0.47529959, 0.33805641, 0.35754316, 0.55132521, 0.3921352 ,
       0.45416451, 0.43980237, 0.40724228, 0.38742214, 0.38542083,
       0.34656122, 0.30797263, 0.3550458 , 0.34403634, 0.39187938,
       0.34535796, 0.30361464, 0.43121332, 0.36820356, 0.45294667])

In [16]:
# Sort the similarity scores in descending order, obtain the index of the movies
indices = np.argsort(-similarity_scores)

indices[0, :50]

array([ 13, 265, 451,  33,  56,  17, 140, 450,  58, 828, 227,  62, 337,
       172, 724, 424, 585, 696, 933, 996, 932, 433, 883, 420, 744, 407,
       633, 775, 746, 723, 312, 119, 325, 688, 606, 400, 653, 647, 175,
       655, 187, 613, 948, 580, 772,  24, 751, 835, 476, 961])
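
The sorting step can be sketched on a small array. `np.argsort` sorts along the last axis, which is why the `(1, 1000)` score matrix yields a `(1, 1000)` index matrix and the notebook reads `indices[0]`. The toy scores below are made up, and the `np.argpartition` variant is an optional alternative for when only the top k are needed, not something the notebook uses.

```python
import numpy as np

# Shape (1, 5), mirroring the (1, 1000) shape of similarity_scores
toy_scores = np.array([[0.41, 0.58, 0.35, 0.55, 0.49]])

# Negating the scores turns an ascending sort into a descending one
order = np.argsort(-toy_scores)
top_3 = order[0, :3]

# Alternative: argpartition selects the top k without fully sorting every score
k = 3
part = np.argpartition(-toy_scores[0], k - 1)[:k]    # top k, in arbitrary order
top_3_alt = part[np.argsort(-toy_scores[0][part])]   # sort only those k

print(top_3, top_3_alt)
```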

In [17]:
# Get the top 25 movie titles that are most similar to the query - these will be passed to the reranker
top_25_sorted_titles = [movies_data[index]['title'] for index in indices[0]][:25]

top_25_sorted_titles

['The Dark Knight',
 'Watchmen',
 'Predator',
 'Despicable Me 2',
 'Night at the Museum: Secret of the Tomb',
 'Batman v Superman: Dawn of Justice',
 'Penguins of Madagascar',
 'Batman & Robin',
 'Batman Begins',
 'Super 8',
 'Megamind',
 'The Dark Knight Rises',
 'Batman Returns',
 'The Incredibles',
 'The Raid',
 'Die Hard: With a Vengeance',
 'Kick-Ass',
 'Fantastic Mr. Fox',
 'Commando',
 'Tremors',
 'The Peanuts Movie',
 'Kung Fu Panda 2',
 'Crank: High Voltage',
 'Men in Black 3',
 'ParaNorman']

Notice that not all movies in our top 25 are relevant to our query, `super hero mystery action movie about bats`. This is because semantic search captures only the approximate meaning of the query and the movies.

A reranker can score each of these 25 candidates against the query more precisely and decide which ones deserve to be at the top of the list.

## Use Llama Rank to Rerank Top 25 Movies

We treat the top 25 matches as good candidates that may still include irrelevant false positives. The reranker model scores each candidate against the query and reorders them by relevance.

In [19]:
from together import Together

client = Together(api_key = TOGETHER_API_KEY)

query = "super hero mystery action movie about bats" # we keep the same query - can change if we want

response = client.rerank.create(
  model="Salesforce/Llama-Rank-V1",
  query=query,
  documents=top_25_sorted_titles,
  top_n=5 # we only want the top 5 results
)

for result in response.results:
    print(f"Document Index: {result.index}")
    print(f"Document: {top_25_sorted_titles[result.index]}")
    print(f"Relevance Score: {result.relevance_score}")

Document Index: 12
Document: Batman Returns
Relevance Score: 0.35380946383813044
Document Index: 8
Document: Batman Begins
Relevance Score: 0.339339115127178
Document Index: 7
Document: Batman & Robin
Relevance Score: 0.33013392395016167
Document Index: 5
Document: Batman v Superman: Dawn of Justice
Relevance Score: 0.3289763252445171
Document Index: 9
Document: Super 8
Relevance Score: 0.258483721657576


Here we can see that the reranker improved the list by demoting irrelevant movies like `'Watchmen'`, `'Predator'`, `'Despicable Me 2'`, `'Night at the Museum: Secret of the Tomb'`, and `'Penguins of Madagascar'` and promoting `'Batman Returns'`, `'Batman Begins'`, `'Batman & Robin'`, and `'Batman v Superman: Dawn of Justice'` to the top of the list!

The `bge-base-en-v1.5` embedding model gives us a fuzzy match to concepts mentioned in the query; the `Llama-Rank-V1` reranker then improves the quality of our list further by spending more compute to re-sort the candidate movies.

Learn more about how to use reranker models in the [docs](https://docs.together.ai/docs/rerank-overview)!
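
Note that each `result.index` is a position in the candidate list passed to the reranker, not a row in `movies_data`. If you want the full movie record for each reranked hit, one option is to map positions back through the retrieval indices. The helper name `rerank_to_dataset_indices` is hypothetical, and the `SimpleNamespace` objects only stand in for `response.results` so the sketch runs without an API call.

```python
from types import SimpleNamespace

def rerank_to_dataset_indices(results, candidate_indices):
    """Map reranker positions (into the candidate list) back to dataset row indices."""
    return [(int(candidate_indices[r.index]), r.relevance_score) for r in results]

# Stand-ins for response.results, using the top two hits above (scores rounded)
fake_results = [
    SimpleNamespace(index=12, relevance_score=0.354),  # 'Batman Returns'
    SimpleNamespace(index=8, relevance_score=0.339),   # 'Batman Begins'
]

# First 13 entries of indices[0] from the retrieval step
candidate_indices = [13, 265, 451, 33, 56, 17, 140, 450, 58, 828, 227, 62, 337]

mapped = rerank_to_dataset_indices(fake_results, candidate_indices)
print(mapped)
```

In the notebook this would be called as `rerank_to_dataset_indices(response.results, indices[0, :25])`, after which `movies_data[i]` gives the full record for each returned index `i`.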