<a target="_blank" href="https://colab.research.google.com/github/cohere-ai/notebooks/blob/main/notebooks/llmu/co_aws_ch5_rerank_sm.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Reranking Using Cohere Rerank on Amazon SageMaker

Reranking is an essential technique in information retrieval systems, especially in large-scale search applications. It is a process of reordering a set of initially retrieved documents based on their relevance to a user's query.
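As a rough illustration of the idea (not Cohere's implementation), reranking can be sketched as scoring each first-stage candidate against the query and sorting by that score. The `rerank_candidates` helper and the word-overlap `overlap_score` below are hypothetical stand-ins; a real reranker replaces the toy scorer with a learned relevance model.

```python
def overlap_score(query, text):
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank_candidates(query, candidates, score_fn, top_n=None):
    """Return (original_index, score) pairs sorted from most to least relevant."""
    scored = [(i, score_fn(query, text)) for i, text in enumerate(candidates)]
    # Python's sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_n is None else scored[:top_n]


# Illustrative first-stage results (made up for this sketch).
candidates = [
    "reset my password",
    "return policy for a defective product",
    "wrong item received",
]
ranked = rerank_candidates("return policy", candidates, overlap_score, top_n=2)
print(ranked)
```

Returning the original index alongside the score mirrors how the results are mapped back to the input list later in this notebook.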

Reranking can substantially improve search results, yet adding Cohere’s Rerank models to an existing search system takes only one line of code, whether that system uses semantic search or a traditional keyword-based approach.

In this notebook, we'll explore how to use the Cohere Rerank endpoint on Amazon SageMaker. In particular, we'll look at an example of a multi-aspect search on semi-structured data, and walk through how to perform reranking on email data that contains multiple fields: “title” and “content.”

# Setup

First, let's install and import the necessary libraries and set up our Cohere client.

We'll need to create a SageMaker endpoint that exposes access to a Cohere model (Rerank v3 in our case). For this, we’ll use the cohere_aws SDK which makes it easy to set up the endpoint, together with AWS’s boto3 library.

In [None]:
! pip install cohere cohere-aws boto3

In [3]:
import os
import boto3
import cohere
import cohere_aws
from cohere_aws import Client

In [None]:
import cohere

# Create SageMaker client via the native Cohere SDK
# Contact your AWS administrator for the credentials
co = cohere.SagemakerClient(
    aws_region="YOUR_AWS_REGION",
    aws_access_key="YOUR_AWS_ACCESS_KEY_ID",
    aws_secret_key="YOUR_AWS_SECRET_ACCESS_KEY",
    aws_session_token="YOUR_AWS_SESSION_TOKEN",
)

# To create an endpoint, use the cohere_aws client.
# Set the AWS credentials as environment variables for it:
os.environ['AWS_ACCESS_KEY_ID'] = "YOUR_AWS_ACCESS_KEY_ID"
os.environ['AWS_SECRET_ACCESS_KEY'] = "YOUR_AWS_SECRET_ACCESS_KEY"
os.environ['AWS_SESSION_TOKEN'] = "YOUR_AWS_SESSION_TOKEN"
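Hardcoding credentials in a notebook makes them easy to leak. As a hedged alternative, the sketch below reads them from environment variables that are assumed to be set outside the notebook. `load_aws_credentials` is a hypothetical helper, not part of either SDK; the keys it returns are chosen to match the keyword arguments passed to `cohere.SagemakerClient` above.

```python
import os


def load_aws_credentials(env=None):
    """Collect AWS credentials from environment variables.

    Raises KeyError naming any required variable that is missing or empty.
    The session token and region are optional and may come back as None.
    """
    env = os.environ if env is None else env
    required = {
        "aws_access_key": "AWS_ACCESS_KEY_ID",
        "aws_secret_key": "AWS_SECRET_ACCESS_KEY",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")

    creds = {arg: env[var] for arg, var in required.items()}
    # Only present when using temporary credentials.
    creds["aws_session_token"] = env.get("AWS_SESSION_TOKEN")
    # Assumption: the region is exported as AWS_DEFAULT_REGION.
    creds["aws_region"] = env.get("AWS_DEFAULT_REGION")
    return creds
```

Under these assumptions, the client could then be created with `cohere.SagemakerClient(**load_aws_credentials())`.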

# Create Endpoint

With SageMaker, we’ll need to create an endpoint via an AWS instance. The marketplace listing provides more details, including pricing, on the recommended instance type for a particular model.

To create the endpoint, we define:

- arn: The model package ARN for our AWS region (looked up from the region-to-ARN map in the code below)
- endpoint_name: A name we can give as an identifier
- instance_type: The instance type to be used
- n_instances: The number of instances

We pass the arguments to the create_endpoint method from the cohere_aws library. 

In [None]:
# Create SageMaker endpoint via the cohere_aws SDK
cohere_package = "cohere-rerank-english-v3-01-d3687e0d2e3a366bb904275616424807"
model_package_map = {
    "us-east-1": f"arn:aws:sagemaker:us-east-1:865070037744:model-package/{cohere_package}",
    "us-east-2": f"arn:aws:sagemaker:us-east-2:057799348421:model-package/{cohere_package}",
    "us-west-1": f"arn:aws:sagemaker:us-west-1:382657785993:model-package/{cohere_package}",
    "us-west-2": f"arn:aws:sagemaker:us-west-2:594846645681:model-package/{cohere_package}",
    "ca-central-1": f"arn:aws:sagemaker:ca-central-1:470592106596:model-package/{cohere_package}",
    "eu-central-1": f"arn:aws:sagemaker:eu-central-1:446921602837:model-package/{cohere_package}",
    "eu-west-1": f"arn:aws:sagemaker:eu-west-1:985815980388:model-package/{cohere_package}",
    "eu-west-2": f"arn:aws:sagemaker:eu-west-2:856760150666:model-package/{cohere_package}",
    "eu-west-3": f"arn:aws:sagemaker:eu-west-3:843114510376:model-package/{cohere_package}",
    "eu-north-1": f"arn:aws:sagemaker:eu-north-1:136758871317:model-package/{cohere_package}",
    "ap-southeast-1": f"arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/{cohere_package}",
    "ap-southeast-2": f"arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/{cohere_package}",
    "ap-northeast-2": f"arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/{cohere_package}",
    "ap-northeast-1": f"arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/{cohere_package}",
    "ap-south-1": f"arn:aws:sagemaker:ap-south-1:077584701553:model-package/{cohere_package}",
    "sa-east-1": f"arn:aws:sagemaker:sa-east-1:270155090741:model-package/{cohere_package}",
}

region = boto3.Session().region_name

# Stop early if the current region has no model package listed above
if region not in model_package_map:
    raise ValueError(f"Unsupported region: {region}")

model_package_arn = model_package_map[region]

co_aws = Client(region_name=region)

co_aws.create_endpoint(arn=model_package_arn, endpoint_name="my-rerank-v3", instance_type="ml.g5.xlarge", n_instances=1)
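The region lookup can also be wrapped in a small function that explains what went wrong. This is a minimal sketch; `resolve_model_package_arn` is a hypothetical helper. It also covers the case where `boto3.Session().region_name` returns `None` because no default region is configured.

```python
def resolve_model_package_arn(region, package_map):
    """Return the model package ARN for a region, or raise a descriptive error."""
    if region is None:
        raise ValueError("No AWS region is configured for this session.")
    if region not in package_map:
        supported = ", ".join(sorted(package_map))
        raise ValueError(
            f"Region '{region}' is not supported. Supported regions: {supported}"
        )
    return package_map[region]


# One entry copied from the map above, to keep the sketch short.
example_map = {
    "us-east-1": "arn:aws:sagemaker:us-east-1:865070037744:model-package/"
                 "cohere-rerank-english-v3-01-d3687e0d2e3a366bb904275616424807",
}
example_arn = resolve_model_package_arn("us-east-1", example_map)
```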

# Retrieve Documents

Let’s assume that the first stage of retrieval has already been performed, whether it’s through a semantic, keyword, or any other type of search system.

Here we have a list of nine documents that represent the search results of that first stage. Each document is an email stored as a dictionary with two fields, Title and Content. Keeping the fields separate preserves the email's semi-structured format, which the Rerank endpoint can take advantage of.

In [48]:
documents = [
    {"Title":"Incorrect Password","Content":"Hello, I have been trying to access my account for the past hour and it keeps saying my password is incorrect. Can you please help me?"},
    {"Title":"Confirmation Email Missed","Content":"Hi, I recently purchased a product from your website but I never received a confirmation email. Can you please look into this for me?"},
    {"Title":"Questions about Return Policy","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"Customer Support is Busy","Content":"Good morning, I have been trying to reach your customer support team for the past week but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Received Wrong Item","Content":"Hi, I have a question about my recent order. I received the wrong item and I need to return it."},
    {"Title":"Customer Service is Unavailable","Content":"Hello, I have been trying to reach your customer support team for the past hour but I keep getting a busy signal. Can you please help me?"},
    {"Title":"Return Policy for Defective Product","Content":"Hi, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title":"Wrong Item Received","Content":"Good morning, I have a question about my recent order. I received the wrong item and I need to return it."},
    {"Title":"Return Defective Product","Content":"Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."}
]
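Because the reranking step relies on specific field names, it can help to confirm that every document has them before calling the endpoint. The check below is a minimal sketch; `find_missing_fields` is a hypothetical helper, and the second sample document is deliberately incomplete to show what a failure looks like.

```python
def find_missing_fields(documents, required_fields):
    """Map each document index to the required fields it lacks or leaves empty."""
    problems = {}
    for i, doc in enumerate(documents):
        missing = [field for field in required_fields if not doc.get(field)]
        if missing:
            problems[i] = missing
    return problems


# Shortened sample data for illustration; the second entry has no Content.
sample_documents = [
    {"Title": "Received Wrong Item", "Content": "I received the wrong item and I need to return it."},
    {"Title": "Wrong Item Received"},
]
problems = find_missing_fields(sample_documents, ["Title", "Content"])
print(problems)
```

An empty dictionary means every document carries all of the required fields.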

# Rerank Documents

To call the endpoint, we now switch to the cohere SDK. Adding the reranking step takes a single call to co.rerank, where the model argument is the name of the endpoint we created.

In [49]:
query = 'What emails have been about refunds?'

response = co.rerank(documents=documents,
                     query=query,
                     rank_fields=["Title","Content"],
                     top_n=3,
                     model="my-rerank-v3")

# View Results

Since we defined top_n=3, we’ll get the top three most relevant documents to the query. For each document, the response contains the index of its position in the original list and its relevance score against the query.

In [50]:
print("Documents","\n")

for idx,doc in enumerate(response.results):
    print(f"#{idx+1}:\n{documents[doc.index]}\n")

Documents 

#1:
{'Title': 'Questions about Return Policy', 'Content': 'Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective.'}

#2:
{'Title': 'Return Policy for Defective Product', 'Content': 'Hi, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective.'}

#3:
{'Title': 'Return Defective Product', 'Content': 'Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective.'}
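The output above shows only the documents. To also display each relevance score, the results can be summarized as in the sketch below. It assumes each result exposes `index` and `relevance_score` attributes; the stand-in results and their scores are invented so the code runs without an endpoint, and `summarize_results` is a hypothetical helper.

```python
from types import SimpleNamespace


def summarize_results(results, documents):
    """Build one display line per result: rank, relevance score, and document title."""
    lines = []
    for rank, result in enumerate(results, start=1):
        title = documents[result.index]["Title"]
        lines.append(f"#{rank} (score={result.relevance_score:.2f}): {title}")
    return lines


# Stand-in data; the score is made up and is not real model output.
demo_documents = [
    {"Title": "Incorrect Password"},
    {"Title": "Questions about Return Policy"},
]
demo_results = [SimpleNamespace(index=1, relevance_score=0.9)]
demo_lines = summarize_results(demo_results, demo_documents)
print("\n".join(demo_lines))
```

With a live endpoint, the same function could be called as `summarize_results(response.results, documents)`.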



The search query was looking for emails about refunds. But none of the documents mention the word “refunds” specifically.

However, the Rerank model was able to retrieve the right documents. Some of the documents mention the word “return.” The Rerank model can capture semantically similar meanings between two pieces of text, so it surfaces the documents that mention “return,” which is very close in meaning to “refund.”
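To see why a purely keyword-based filter would miss these emails, consider the minimal sketch below. It runs a case-insensitive substring match over three of the documents from this notebook; `keyword_match` is a hypothetical helper used only for this comparison.

```python
def keyword_match(documents, keyword):
    """Return indices of documents whose field values contain the keyword."""
    keyword = keyword.lower()
    return [
        i
        for i, doc in enumerate(documents)
        if any(keyword in str(value).lower() for value in doc.values())
    ]


# Three of the nine emails used earlier in this notebook.
sample_emails = [
    {"Title": "Incorrect Password", "Content": "Hello, I have been trying to access my account for the past hour and it keeps saying my password is incorrect. Can you please help me?"},
    {"Title": "Questions about Return Policy", "Content": "Hello, I have a question about the return policy for this product. I purchased it a few weeks ago and it is defective."},
    {"Title": "Received Wrong Item", "Content": "Hi, I have a question about my recent order. I received the wrong item and I need to return it."},
]

refund_hits = keyword_match(sample_emails, "refund")
return_hits = keyword_match(sample_emails, "return")
```

Here `refund_hits` is empty while `return_hits` is not, so an exact-match search on the query term finds nothing even though relevant emails exist.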

# Delete Endpoint

Important Note: You will continue to incur charges for as long as an endpoint is running, so remember to delete the endpoint when your usage ends.

In [None]:
co_aws.delete_endpoint()
co_aws.close()

# Conclusion

Reranking is a valuable technique used in information retrieval systems to enhance the relevance of search results. Cohere's Rerank endpoint, including its latest model, Rerank 3, offers improved capabilities for enterprise search.

In our example, adding reranking took a single line of code, and the model successfully identified the relevant documents even though the query's keyword, “refunds,” did not appear in any of them. This highlights the potential benefits of integrating reranking into existing search systems to enhance search accuracy and user satisfaction.