# Re-ranking

Amazon Bedrock provides access to reranker models that you can use when querying to improve the relevance of the retrieved results. A reranker model calculates the relevance of chunks to a query and reorders the results based on the scores that it calculates. By using a reranker model, you can return responses that are better suited to answering the query.

Reranker models are trained to identify relevance signals based on a query and then use those signals to rank documents. Because of this, the models can provide more relevant, more accurate results.
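
To make the score-then-reorder idea concrete, here is a toy sketch. The word-overlap score is only a stand-in chosen for illustration (an assumption, not how any Bedrock reranker works); real reranker models use learned relevance signals.

```python
def overlap_score(query: str, chunk: str) -> float:
    """Toy relevance score: fraction of query words that also appear in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & chunk_words) / len(query_words)


def rerank(query: str, chunks: list[str], top_k: int = 3) -> list[tuple[str, float]]:
    """Score every chunk against the query and return the top_k chunks, best first."""
    scored = [(chunk, overlap_score(query, chunk)) for chunk in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


chunks = [
    "AWS revenue grew in 2022",
    "Amazon created jobs in 2020",
    "The weather was mild",
]
for chunk, score in rerank("jobs Amazon created in 2020", chunks, top_k=2):
    print(f"{score:.2f}  {chunk}")
```

The pattern is the same one the notebook relies on later: retrieve a generous candidate set, score each candidate against the query, and keep only the few highest-scoring chunks.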

If you're using Amazon Bedrock Knowledge Bases to build your Retrieval Augmented Generation (RAG) application, use a reranker model when calling the `Retrieve` or `RetrieveAndGenerate` operation. The results from reranking override the default ranking that Amazon Bedrock Knowledge Bases determines.

This notebook demonstrates how to use a **reranking model** with Amazon Bedrock Knowledge Bases, through the Rerank API, to further improve the accuracy and relevance of RAG applications. With a reranker model, you can retrieve fewer, but more relevant, results. By feeding these results to the foundation model that you use to generate a response, you can also decrease cost and latency.
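
As a rough sketch of what a direct call to the Rerank API could look like, the helper below builds the request for the `rerank` operation of the `bedrock-agent-runtime` client and maps the scores back to the input documents. The helper names (`build_rerank_request`, `rerank_documents`) are hypothetical, and the request and response field names reflect my reading of the boto3 documentation, so verify them against the current API reference before relying on them.

```python
def build_rerank_request(query, documents, model_arn, top_n=3):
    """Build keyword arguments for the bedrock-agent-runtime `rerank` operation."""
    return {
        "queries": [{"type": "TEXT", "textQuery": {"text": query}}],
        "sources": [
            {
                "type": "INLINE",
                "inlineDocumentSource": {
                    "type": "TEXT",
                    "textDocument": {"text": doc},
                },
            }
            for doc in documents
        ],
        "rerankingConfiguration": {
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": {
                # Cautious choice: never ask for more results than documents sent.
                "numberOfResults": min(top_n, len(documents)),
                "modelConfiguration": {"modelArn": model_arn},
            },
        },
    }


def rerank_documents(client, query, documents, model_arn, top_n=3):
    """Call `rerank` and return (document, relevance score) pairs, best first.

    `client` is assumed to be boto3.client("bedrock-agent-runtime").
    """
    response = client.rerank(**build_rerank_request(query, documents, model_arn, top_n))
    return [(documents[r["index"]], r["relevanceScore"]) for r in response["results"]]
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real account.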

Let's explore how to implement and utilize reranking models with Amazon Bedrock Knowledge Bases for an example use case.

## 1. Setup
Before running the rest of this notebook, run the cells below to install the required libraries and connect to Amazon Bedrock.

You can ignore any pip dependency errors that appear while the libraries are installing.

In [None]:
%pip install --upgrade pip --quiet
%pip install -r ../requirements.txt --no-deps --quiet
%pip install -r ../requirements.txt --upgrade --quiet

In [None]:
%pip install --upgrade boto3

In [None]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [None]:
import boto3
print(boto3.__version__)

In [None]:
import warnings
warnings.filterwarnings('ignore')

This setup code:
- Adds the parent directory to the Python system path
- Imports a custom module (`BedrockKnowledgeBase`) from `utils` that later cells depend on

In [None]:
import os
import sys
import time
import boto3
import logging
import pprint
import json

# Set the path to import module
from pathlib import Path
current_path = Path().resolve()
current_path = current_path.parent
if str(current_path) not in sys.path:
    sys.path.append(str(current_path))
# Print sys.path to verify
# print(sys.path)

from utils.knowledge_base import BedrockKnowledgeBase

In [None]:
#Clients
s3_client = boto3.client('s3')
sts_client = boto3.client('sts')
session = boto3.session.Session(region_name = 'us-west-2')
region =  session.region_name
account_id = sts_client.get_caller_identity()["Account"]
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime') 
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)
region, account_id

In [None]:
import time

# Get the current timestamp
current_time = time.time()

# Format the timestamp as a string
timestamp_str = time.strftime("%Y%m%d%H%M%S", time.localtime(current_time))[-7:]
# Create the suffix using the timestamp
suffix = f"{timestamp_str}"
knowledge_base_name = 'reranking-kb'
knowledge_base_description = "Knowledge Base for re-ranking."
bucket_name = f'{knowledge_base_name}-{suffix}'
foundation_model = "anthropic.claude-3-sonnet-20240229-v1:0"

# Define data sources
data_source=[{"type": "S3", "bucket_name": bucket_name}]

## 2. Create a knowledge base with fixed chunking strategy
Let's start by creating an [Amazon Bedrock Knowledge Base](https://aws.amazon.com/bedrock/knowledge-bases/) to store Amazon's annual reports. Knowledge Bases can integrate with different vector databases, including [Amazon OpenSearch Serverless](https://aws.amazon.com/opensearch-service/features/serverless/), [Amazon Aurora](https://aws.amazon.com/rds/aurora/), [Pinecone](http://app.pinecone.io/bedrock-integration), Redis Enterprise, and MongoDB Atlas. For this example, we will integrate the knowledge base with Amazon OpenSearch Serverless. To do so, we will use the helper class `BedrockKnowledgeBase`, which creates the knowledge base and all of its prerequisites:
1. IAM roles and policies
2. S3 bucket
3. Amazon OpenSearch Serverless encryption, network and data access policies
4. Amazon OpenSearch Serverless collection
5. Amazon OpenSearch Serverless vector index
6. Knowledge base
7. Knowledge base data source

We will create a knowledge base using the fixed-size chunking strategy.

You can choose a different chunking strategy by changing the parameter value below:
```
"chunkingStrategy": "FIXED_SIZE | NONE | HIERARCHICAL | SEMANTIC"
```
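
For intuition about what fixed-size chunking does, here is a minimal sketch that splits text into overlapping windows. It is a simplification under stated assumptions: it counts whitespace-separated words rather than tokens, and the function name and the `max_words` and `overlap_pct` parameters are hypothetical, not the service's configuration.

```python
def fixed_size_chunks(text, max_words=300, overlap_pct=20):
    """Split text into chunks of at most max_words words with a percentage overlap."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap_pct must be in [0, 100)")

    words = text.split()
    overlap = max_words * overlap_pct // 100
    step = max_words - overlap  # always >= 1 because overlap < max_words

    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


print(fixed_size_chunks("a b c d e f g h i j", max_words=4, overlap_pct=50))
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.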

In [None]:
knowledge_base_metadata = BedrockKnowledgeBase(
    kb_name=f'{knowledge_base_name}-{suffix}',
    kb_description=knowledge_base_description,
    data_sources=data_source, 
    chunking_strategy = "FIXED_SIZE", 
    suffix = suffix
)

### 2.1 Download Amazon's 2019, 2020, 2021, 2022, and 2023 annual reports and upload them to Amazon S3

Now that we have created the knowledge base, let's populate it with the `sec-10-k` reports dataset. The data is downloaded from [here](https://ir.aboutamazon.com/annual-reports-proxies-and-shareholder-letters/default.aspx), a page that hosts Amazon's annual reports, proxies, and shareholder letters. This notebook uses the annual reports.

In [None]:
import os

def create_directory(directory_name):    
    if not os.path.exists(directory_name):
        os.makedirs(directory_name)
        print(f"Directory '{directory_name}' created successfully.")
    else:
        print(f"Directory '{directory_name}' already exists.")

# Call the function to create the directory
create_directory("sec-10-k")

In [None]:
import requests

def download_file(url, filename):
    # Send a GET request to the URL
    response = requests.get(url)
    
    # Check if the request was successful
    if response.status_code == 200:
        # Open the file in write-binary mode
        with open(filename, 'wb') as file:
            # Write the content of the response to the file
            file.write(response.content)
        print(f"File downloaded successfully: {filename}")
    else:
        print(f"Failed to download file. Status code: {response.status_code}")

# URL of the files to download
urls = ["https://s2.q4cdn.com/299287126/files/doc_financials/2024/ar/Amazon-com-Inc-2023-Annual-Report.pdf",
        "https://s2.q4cdn.com/299287126/files/doc_financials/2023/ar/Amazon-2022-Annual-Report.pdf",
        "https://s2.q4cdn.com/299287126/files/doc_financials/2022/ar/Amazon-2021-Annual-Report.pdf",
        "https://s2.q4cdn.com/299287126/files/doc_financials/2021/ar/Amazon-2020-Annual-Report.pdf",
        "https://s2.q4cdn.com/299287126/files/doc_financials/2020/ar/2019-Annual-Report.pdf"]


for url in urls:
    # Name for the downloaded file
    filename = url.split('/')[-1]

    # Path to save the downloaded file
    filepath = f"./sec-10-k/{filename}"

    # Call the function to download the file
    download_file(url, filepath)

Let's upload the annual reports in the `sec-10-k` folder to Amazon S3.

In [None]:
def upload_directory(path, bucket_name):
        for root,dirs,files in os.walk(path):
            for file in files:
                if not file.startswith('.DS_Store'):
                    file_to_upload = os.path.join(root,file)
                    print(f"uploading file {file_to_upload} to {bucket_name}")
                    s3_client.upload_file(file_to_upload,bucket_name,file)

# upload the annual reports to S3
upload_directory("sec-10-k", bucket_name)

Now start the ingestion job, which syncs the documents uploaded to the S3 bucket into the knowledge base.

In [None]:
# ensure that the kb is available
time.sleep(30)
# sync knowledge base
knowledge_base_metadata.start_ingestion_job()
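
A fixed `time.sleep(30)` is a guess about how long a resource needs. One alternative is to poll until a status is reached, as in the sketch below. `wait_for_status` is a hypothetical helper, and whether `BedrockKnowledgeBase.start_ingestion_job()` already waits for completion is not shown in this notebook, so treat this as an optional pattern rather than a required step.

```python
import time


def wait_for_status(get_status, done=("COMPLETE",), failed=("FAILED",),
                    interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until it returns a 'done' value, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"Job ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout} seconds")
        sleep(interval)


# Assumed usage with the bedrock-agent client; the three IDs are placeholders
# that you would take from your own knowledge base and ingestion job:
# wait_for_status(lambda: bedrock_agent_client.get_ingestion_job(
#     knowledgeBaseId=kb_id, dataSourceId=data_source_id, ingestionJobId=job_id
# )["ingestionJob"]["status"])
```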

Finally, we save the knowledge base ID so that we can test the solution at a later stage.

In [None]:
kb_id = knowledge_base_metadata.get_knowledge_base_id()

## 3. Evaluate the relevance of query responses with and without Re-ranking (using Ragas)

Define models for generation, evaluation and re-ranking

In [None]:
from langchain_aws import ChatBedrock
from langchain_aws import BedrockEmbeddings

bedrock_client = boto3.client('bedrock-runtime')

TEXT_GENERATION_MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"
EVALUATION_MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"
EMBEDDING_MODEL_ID = "amazon.titan-embed-text-v2:0"

# Reranker model: there are two reranker models available at launch
AMAZON_RERANKER_MODEL_ID = "amazon.rerank-v1:0"
COHERE_RERANKER_MODEL_ID = "cohere.rerank-v3-5:0"


llm_for_evaluation = ChatBedrock(model_id=EVALUATION_MODEL_ID, client=bedrock_client)
bedrock_embeddings = BedrockEmbeddings(model_id=EMBEDDING_MODEL_ID, client=bedrock_client)


#### 3.1 Update Knowledge Bases execution role

In [None]:
# Before using a reranker model, update the knowledge base execution IAM role with the right permissions

iam = boto3.resource('iam')
client = boto3.client('iam')

def get_attached_policies(role_name):
    response = client.list_attached_role_policies(RoleName=role_name)
    attached_policies = response['AttachedPolicies']
    return attached_policies

# get the knowledge base IAM role name
get_kb_response = bedrock_agent_client.get_knowledge_base(knowledgeBaseId = kb_id)
role_arn = get_kb_response['knowledgeBase']['roleArn']
role_name = role_arn.split('/')[-1]

# get attached policies
attached_policies = get_attached_policies(role_name)
attached_policies

def update_kb_execution_role(attached_policies, region_name):
    
    for policy in attached_policies:

        print(policy['PolicyArn'])
        policy_name = policy['PolicyName']
        policy_arn = policy['PolicyArn']

        if 'FoundationModel' in policy_arn:
            print('Updating FoundationModel policy: ',policy_arn)
            policy = iam.Policy(policy_arn)
            version = policy.default_version
            policyJson = version.document
            # Use the region passed to the function rather than the global variable
            policyJson['Statement'][0]['Resource'].append(f'arn:aws:bedrock:{region_name}::foundation-model/{TEXT_GENERATION_MODEL_ID}')
            policyJson['Statement'][0]['Resource'].append(f'arn:aws:bedrock:{region_name}::foundation-model/{EVALUATION_MODEL_ID}')
            policyJson['Statement'][0]['Resource'].append(f'arn:aws:bedrock:{region_name}::foundation-model/{AMAZON_RERANKER_MODEL_ID}')
            policyJson['Statement'][0]['Resource'].append(f'arn:aws:bedrock:{region_name}::foundation-model/{COHERE_RERANKER_MODEL_ID}')
        
            client.detach_role_policy(RoleName=role_name,
                PolicyArn=policy_arn)
            
            response = client.delete_policy(
                PolicyArn=policy_arn
            )
            print(response)
           
            response = client.create_policy(
            PolicyName= policy_name,
            PolicyDocument=json.dumps(policyJson)
            )
            print(response)
        
        client.attach_role_policy(
            RoleName=role_name,
            PolicyArn=policy_arn
        )

In [None]:
update_kb_execution_role(attached_policies, region)
# time.sleep(30)
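
The core of the role update is adding foundation-model ARNs to the `Resource` list of a policy statement. The sketch below isolates that step as a pure function so it can be checked without touching IAM. `add_model_arns` is a hypothetical helper, and the sample policy document in the usage lines is illustrative, not the policy the helper class creates.

```python
import copy


def add_model_arns(policy_document, region_name, model_ids, statement_index=0):
    """Return a copy of the policy with foundation-model ARNs added, skipping duplicates."""
    updated = copy.deepcopy(policy_document)
    statement = updated["Statement"][statement_index]

    resources = statement["Resource"]
    if isinstance(resources, str):
        # IAM allows a single resource to be written as a bare string
        resources = [resources]

    for model_id in model_ids:
        arn = f"arn:aws:bedrock:{region_name}::foundation-model/{model_id}"
        if arn not in resources:
            resources.append(arn)

    statement["Resource"] = resources
    return updated


sample_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["bedrock:InvokeModel"],
        "Resource": ["arn:aws:bedrock:us-west-2::foundation-model/amazon.titan-embed-text-v2:0"],
    }],
}
print(add_model_arns(sample_policy, "us-west-2", ["amazon.rerank-v1:0"])["Statement"][0]["Resource"])
```

Skipping ARNs that are already present keeps the policy from growing if the cell is run more than once.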

#### 3.2 Customize retrieve and generate configuration

In [None]:
def retrieve_and_generate(query, reranker_model=None, kb_id=None, TEXT_GENERATION_MODEL_ID=None, metadata_filters=None):
    
    # Prepare retrieval configuration
    retrieval_config = {
        "vectorSearchConfiguration": {
            "numberOfResults": 30 if reranker_model else 3
        }
    }

    if reranker_model:
        retrieval_config["vectorSearchConfiguration"]["rerankingConfiguration"] = {
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": {
                "modelConfiguration": {
                    "modelArn": f'arn:aws:bedrock:{region}::foundation-model/{reranker_model}',
                },
                "numberOfRerankedResults": 3
            }
        }

        if metadata_filters:
            retrieval_config["vectorSearchConfiguration"]["rerankingConfiguration"]["bedrockRerankingConfiguration"]["metadataConfiguration"] = {
                                                                "selectionMode" : "SELECTIVE",
                                                                "selectiveModeConfiguration" : {
                                                                    "fieldsToInclude": [{
                                                                        "fieldName": "year",
                                                                    }]
                                                                }
                                                            }
                    

    # Call the retrieve and generate API
    start = time.time()
    response = bedrock_agent_runtime_client.retrieve_and_generate(
        input={'text': query},
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                'modelArn': f'arn:aws:bedrock:{region}::foundation-model/{TEXT_GENERATION_MODEL_ID}',
                'retrievalConfiguration': retrieval_config,
            },
        }
    )
    time_spent = time.time() - start

    print(f"[Response] : {response['output']['text']}\n")
    print(f"[Invocation time] : {time_spent}\n")

    return response
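
The same reranking configuration can also be passed to the `Retrieve` operation when you only want the reranked chunks and no generated answer. The sketch below assumes `client` is `boto3.client("bedrock-agent-runtime")`; the function name `retrieve_with_reranker` and its parameters are hypothetical, and the configuration mirrors the one used in `retrieve_and_generate` above.

```python
def retrieve_with_reranker(client, kb_id, query, reranker_arn, candidates=30, top_n=3):
    """Retrieve `candidates` chunks, rerank them, and return the top_n as (text, score)."""
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                # Cast a wide net first; the reranker narrows it down
                "numberOfResults": candidates,
                "rerankingConfiguration": {
                    "type": "BEDROCK_RERANKING_MODEL",
                    "bedrockRerankingConfiguration": {
                        "modelConfiguration": {"modelArn": reranker_arn},
                        "numberOfRerankedResults": top_n,
                    },
                },
            }
        },
    )
    return [
        (result["content"]["text"], result.get("score"))
        for result in response["retrievalResults"]
    ]
```

Inspecting the returned chunks directly is a quick way to see how the ordering changes before any generation step is involved.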


#### 3.3 Prepare dataset for evaluation

In [None]:
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    context_relevancy,
    context_recall,
    context_precision,
    context_entity_recall,
    answer_correctness,
)

#specify the metrics here
metrics = [
    context_relevancy,
    context_recall,
    context_precision,
    context_entity_recall,
    answer_correctness,
]

questions = [
    "How many jobs did Amazon create in 2020, and what was its total global workforce after this expansion?",
    "How does the 2023 net sales mix reflect Amazon's global priorities and strategic investments across segments?",
    "How did foreign exchange rate fluctuations impact Amazon's net sales and the corresponding segment's performance in 2023?",
    "What is the cumulative growth contribution of AWS and Advertising segments to Amazon's 2022 consolidated revenue?",
    "How did Amazon's investments in technology infrastructure and fulfillment operations affect its cash flows and operating expenses in 2022?",
    "What types of securities does Amazon invest its excess cash in 2019 and how are these investments classified in the balance sheet?"
]
ground_truths = [
    "Amazon added 500,000 jobs in 2020, bringing its total workforce to approximately 1.3 million employees worldwide.",
    "Amazon's 2023 net sales mix highlights its global priorities, with North America contributing 61%, International 23%, and AWS 16% of total sales. Year-over-year growth in each segment—12% for North America, 11% for International, and 13% for AWS—was driven by increased unit sales, advertising services, and subscription offerings. These trends reflect Amazon's balanced approach to expanding its core markets, strengthening its international presence, and investing in AWS's innovative cloud services to sustain long-term growth.",
    "In 2023, foreign exchange rate fluctuations had a mixed impact on Amazon's financial performance. While these changes reduced consolidated net sales by $71 million, they positively influenced the International segment, increasing its net sales by $88 million. This highlights the nuanced effects of currency fluctuations, where gains in specific regions, such as the International segment, helped offset broader challenges at the consolidated level.",
    "In 2022, AWS achieved a 29% year-over-year revenue growth, increasing from $62.2 billion in 2021 to $80.1 billion. Similarly, the Advertising segment experienced a 25% year-over-year growth, reaching $31 billion in revenue for the year. Together, these segments contributed significantly to Amazon's total consolidated revenue of $434 billion. AWS accounted for 19.59%, while Advertising contributed 7.14%, resulting in a cumulative contribution of approximately 26.73%.",
    "In 2022, Amazon's substantial investments in technology infrastructure and fulfillment operations significantly impacted its cash flows and operating expenses. The company allocated $58.3 billion in cash capital expenditures to support AWS growth and expand its fulfillment network, resulting in a 31% increase in technology and content expenses due to higher payroll costs for technical teams and infrastructure spending on servers, networking equipment, and data centers. Fulfillment costs rose by 12%, driven by increased product sales volume, inventory levels, and wage rate incentives. These investments led to a decline in free cash flow to $(11,569) million, compared to $(9,069) million in 2021. Despite the higher costs, these expenditures were crucial for scaling operations, enhancing the customer experience, and sustaining long-term growth, particularly in AWS and global fulfillment capacity, highlighting Amazon's commitment to maintaining its competitive edge in a rapidly evolving market.",
    "Amazon typically invests its excess cash in AAA-rated money market funds and investment-grade short- to intermediate-term fixed income securities, which are classified as either Cash and cash equivalents or Marketable securities on its consolidated balance sheets. In 2019, Amazon's marketable securities portfolio included a variety of assets such as money market funds, equity securities, foreign government and agency securities, U.S. government and agency securities, corporate debt securities, asset-backed securities, and other fixed income securities. These marketable securities were categorized as either Level 1 or Level 2 securities on the balance sheet."
]
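
To see why the order of retrieved chunks matters for a metric such as `context_precision`, consider this simplified, rank-aware precision score. It is a sketch in the spirit of that metric, not the Ragas implementation: Ragas uses an LLM to judge which chunks are relevant, and its exact formula may differ between versions. Here the relevance flags are supplied by hand.

```python
def rank_aware_precision(relevance_flags):
    """Average precision@k over the positions that hold a relevant chunk.

    relevance_flags: booleans ordered by rank; True marks a relevant chunk.
    """
    hits = 0
    weighted_sum = 0.0
    for rank, relevant in enumerate(relevance_flags, start=1):
        if relevant:
            hits += 1
            weighted_sum += hits / rank  # precision at this rank
    return weighted_sum / hits if hits else 0.0


# The same single relevant chunk scores higher when it is ranked first
print(rank_aware_precision([True, False, False]))
print(rank_aware_precision([False, False, True]))
```

A reranker that moves relevant chunks toward the top raises this kind of score even when the set of retrieved chunks stays the same.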

In [None]:
def prepare_eval_dataset(questions, ground_truths, kb_id=None, TEXT_GENERATION_MODEL_ID=None, reranker_model=None, metadata_filters = None):
    answers = []
    contexts = []
    
    for query in questions:
        response = retrieve_and_generate(
            query,
            reranker_model=reranker_model,
            kb_id=kb_id,
            TEXT_GENERATION_MODEL_ID=TEXT_GENERATION_MODEL_ID,
            metadata_filters=metadata_filters
        )
        
        answers.append(response["output"]["text"])
        
        context_group = []
        for citation in response["citations"]:
            context_group.extend([
                ref["content"]["text"]
                for ref in citation["retrievedReferences"]
                if "content" in ref and "text" in ref["content"]
            ])
        contexts.append(context_group)
        time.sleep(15)

    # Create dictionary
    data = {
        "question": questions,
        "answer": answers,
        "contexts": contexts,
        "ground_truth": ground_truths
    }

    # Convert dict to dataset
    dataset = Dataset.from_dict(data)
    return dataset


#### 3.4 Evaluate dataset - without re-ranker

In [None]:
without_reranker_dataset = prepare_eval_dataset(questions, ground_truths, kb_id, TEXT_GENERATION_MODEL_ID, reranker_model=None)

In [None]:
without_reranker_result = evaluate(
    dataset=without_reranker_dataset,
    metrics=metrics,
    llm=llm_for_evaluation,
    embeddings=bedrock_embeddings,
)

without_reranker_result_df = without_reranker_result.to_pandas()

#### 3.5 Evaluate dataset - with re-ranker

In [None]:
with_reranker_dataset = prepare_eval_dataset(questions, ground_truths, kb_id, TEXT_GENERATION_MODEL_ID, reranker_model=AMAZON_RERANKER_MODEL_ID)

In [None]:
with_reranker_result = evaluate(
    dataset=with_reranker_dataset,
    metrics=metrics,
    llm=llm_for_evaluation,
    embeddings=bedrock_embeddings,
)

with_reranker_result_df = with_reranker_result.to_pandas()

#### 3.6 Evaluate dataset - with re-ranker + metadata configuration

##### 3.6.1 Prepare metadata for ingestion


In [None]:
import json
import re

def generate_metadata(data_dir):
    
    # Loop through all PDF files in the directory
    for filename in os.listdir(data_dir):
        # Skip anything that is not a PDF (e.g. .DS_Store or metadata files from an earlier run)
        if filename.endswith('.pdf'):
            # Define the metadata dictionary
            metadata = {}
            
            filepath = f'{data_dir}/{filename}'
            print(filepath)
            
            # Create metadata
            metadata["company"] = "Amazon"
            metadata["ticker"] = "AMZN"
            # The first run of digits in the file name is the report year
            metadata["year"] = re.search(r'\d+', filename).group(0)

            # Create a JSON object
            json_data = {"metadataAttributes": metadata}

            # print(json_data)

            # Write the JSON object to a <file>.pdf.metadata.json file next to the PDF
            with open(f"{filepath}.metadata.json", "w") as f:
                json.dump(json_data, f)


In [None]:
data_dir = './sec-10-k'
generate_metadata(data_dir)

In [None]:
# upload metadata file to S3
upload_directory("sec-10-k", bucket_name)
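
For reference, this sketch shows the shape of one metadata file and the name it is stored under, following the same convention as the function above (`<file>.pdf.metadata.json` with a top-level `metadataAttributes` object). The helper names are hypothetical, and the stricter four-digit year pattern is my own variation on the notebook's `\d+`.

```python
import json
import re


def build_metadata(pdf_name):
    """Build the metadata document for one annual report from its file name."""
    match = re.search(r"\d{4}", pdf_name)
    if match is None:
        raise ValueError(f"No four-digit year found in {pdf_name!r}")
    return {
        "metadataAttributes": {
            "company": "Amazon",
            "ticker": "AMZN",
            "year": match.group(0),
        }
    }


def metadata_filename(pdf_name):
    """Name of the metadata file that sits next to the PDF."""
    return f"{pdf_name}.metadata.json"


name = "Amazon-2022-Annual-Report.pdf"
print(metadata_filename(name))
print(json.dumps(build_metadata(name), indent=2))
```

The `year` attribute written here is the field that the reranking `metadataConfiguration` in section 3.2 selects with `fieldsToInclude`.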

##### 3.6.2 Ingest metadata into Knowledge Bases


Now start the ingestion job again so that the knowledge base picks up the metadata files that were just uploaded to the S3 bucket.

In [None]:
# ensure that the kb is available
time.sleep(30)
# sync knowledge base
knowledge_base_metadata.start_ingestion_job()

In [None]:
with_reranker_metadata_filters_dataset = prepare_eval_dataset(questions, ground_truths, kb_id, TEXT_GENERATION_MODEL_ID, reranker_model=AMAZON_RERANKER_MODEL_ID, metadata_filters=True)

In [None]:
with_reranker_metadata_filters_result = evaluate(
    dataset=with_reranker_metadata_filters_dataset,
    metrics=metrics,
    llm=llm_for_evaluation,
    embeddings=bedrock_embeddings,
)

# Use the metadata run's own result (not the plain re-ranker result)
with_reranker_metadata_filters_result_df = with_reranker_metadata_filters_result.to_pandas()

#### 3.7 Prepare comparison data frame

In [None]:
import pandas as pd

# Create the side-by-side DataFrame
comparison_df = pd.DataFrame({
    'question': without_reranker_result_df['question'],
    'without_reranker_answer': without_reranker_result_df['answer'],
    'with_reranker_answer': with_reranker_result_df['answer'],
    'with_reranker_metadata_answer': with_reranker_metadata_filters_result_df['answer'],
    
    'without_reranker_answer_correctness': without_reranker_result_df['answer_correctness'],
    'with_reranker_answer_correctness': with_reranker_result_df['answer_correctness'],
    'with_reranker_metadata_correctness': with_reranker_metadata_filters_result_df['answer_correctness'],
    })

In [None]:
pd.options.display.max_colwidth = 1000
comparison_df

In [None]:
# Calculate average correctness
without_reranker_avg_correctness = without_reranker_result_df['answer_correctness'].mean()
with_reranker_avg_correctness = with_reranker_result_df['answer_correctness'].mean()
with_reranker_metadata_avg_correctness = with_reranker_metadata_filters_result_df['answer_correctness'].mean()

print(f"\nAverage Correctness without Reranker: {without_reranker_avg_correctness:.4f}")
print(f"Average Correctness with Reranker: {with_reranker_avg_correctness:.4f}")
print(f"Average Correctness with Reranker and metadata filter: {with_reranker_metadata_avg_correctness:.4f}")
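
When comparing the averages, a relative change is often easier to read than two raw scores. The helper below is a small sketch for that; `relative_change` is a hypothetical name, and the commented usage refers to the averages computed in the cell above.

```python
def relative_change(baseline, candidate):
    """Percentage change of candidate relative to baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# Assumed usage with the averages from the previous cell:
# change = relative_change(without_reranker_avg_correctness, with_reranker_avg_correctness)
# print(f"Change in average correctness with reranker: {change:+.1f}%")
```

With only six questions in the evaluation set, small differences between runs should be read with caution.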

## 4. Clean up
Run the cells below to delete the resources created in this notebook.

In [None]:
# delete local directory
import shutil

dir_path = "sec-10-k" # Replace with the actual path

try:
    shutil.rmtree(dir_path)
    print(f"Directory '{dir_path}' and its contents have been deleted successfully.")
except FileNotFoundError:
    print(f"Directory '{dir_path}' not found.")
except Exception as e:
    print(f"An error occurred: {e}")

In [None]:
## Empty and delete S3 Bucket

objects = s3_client.list_objects(Bucket=bucket_name)  
if 'Contents' in objects:
    for obj in objects['Contents']:
        s3_client.delete_object(Bucket=bucket_name, Key=obj['Key']) 
s3_client.delete_bucket(Bucket=bucket_name)
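
A single `list_objects` call returns at most 1,000 keys, which is enough for the handful of files in this notebook. For larger buckets, a paginated variant along these lines could be used instead. `empty_bucket` is a hypothetical helper, `s3` is assumed to be `boto3.client("s3")`, and the sketch does not handle versioned buckets.

```python
def empty_bucket(s3, bucket_name):
    """Delete every object in the bucket, one listing page at a time.

    Returns the number of objects for which deletion was requested.
    """
    deleted = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name):
        # An empty bucket yields a page without a "Contents" key
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3.delete_objects(Bucket=bucket_name, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted


# Assumed usage with the client and bucket from this notebook:
# empty_bucket(s3_client, bucket_name)
# s3_client.delete_bucket(Bucket=bucket_name)
```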

In [None]:
print("===============================Knowledge base==============================")
knowledge_base_metadata.delete_kb(delete_s3_bucket=True, delete_iam_roles_and_policies=True)