# Bedrock Knowledge Base Retrieval and Generation with Reranking

The Rerank API in Amazon Bedrock is a new feature that improves the accuracy and relevance of responses in Retrieval-Augmented Generation (RAG) applications. It supports reranker models that rank a set of retrieved documents based on their relevance to a user's query, helping to prioritize the most relevant content for response generation.

## Key features and use cases

1. **Enhancing RAG applications**: The Rerank API addresses challenges in semantic search, particularly with complex or ambiguous queries. For example, it can help a customer service chatbot focus on return policies rather than shipping guidelines when asked about returning an online purchase.

2. **Improving search relevance**: It adds a relevance-based ranking stage on top of an existing search system, which makes enterprise-grade search technology more accessible to developers.

3. **Optimizing context window usage**: Reranking lets you send only the highest-ranked passages to the foundation model, which can reduce input-token costs and improve response accuracy.

4. **Flexible integration**: The Rerank API can be used independently to rerank documents even if you're not using Amazon Bedrock Knowledge Bases.

5. **Multiple model support**: At launch, it supports Amazon Rerank 1.0 and Cohere Rerank 3.5 models.

6. **Customizable configurations**: Developers can specify additional model configurations as key-value pairs for more tailored reranking.

The Rerank API is available in select AWS Regions, including US West (Oregon), Canada (Central), Europe (Frankfurt), and Asia Pacific (Tokyo). It can be integrated into existing systems at scale, whether keyword-based or semantic, through a single API call in Amazon Bedrock.
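
As a rough illustration of features 4 and 6 above, the sketch below assembles the arguments for a standalone rerank call over inline text, with no knowledge base involved. `build_rerank_request` and its parameters are hypothetical names introduced here, and `additionalModelRequestFields` is assumed to be the key that carries model-specific key-value settings; check the current API reference before relying on it. The block only builds the request dictionary, so it runs without AWS credentials.

```python
from typing import Any, Dict, List, Optional


def build_rerank_request(
    query: str,
    documents: List[str],
    model_arn: str,
    top_n: int = 3,
    extra_model_fields: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a standalone rerank call (hypothetical helper)."""
    model_config: Dict[str, Any] = {"modelArn": model_arn}
    if extra_model_fields:
        # Assumption: model-specific key-value settings are passed under this key.
        model_config["additionalModelRequestFields"] = extra_model_fields

    return {
        "queries": [{"type": "TEXT", "textQuery": {"text": query}}],
        "sources": [
            {
                "type": "INLINE",
                "inlineDocumentSource": {
                    "type": "TEXT",
                    "textDocument": {"text": doc},
                },
            }
            for doc in documents
        ],
        "rerankingConfiguration": {
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": {
                "modelConfiguration": model_config,
                # Never ask for more results than there are documents.
                "numberOfResults": min(top_n, len(documents)),
            },
        },
    }


request = build_rerank_request(
    query="How do I return an online purchase?",
    documents=[
        "Shipping takes three to five business days.",
        "Returns are accepted within 30 days of delivery.",
    ],
    model_arn="arn:aws:bedrock:us-west-2::foundation-model/cohere.rerank-v3-5:0",
)

# With AWS credentials configured, the request could then be sent as:
#   boto3.client("bedrock-agent-runtime", region_name="us-west-2").rerank(**request)
```

Keeping the request construction separate from the network call makes it easy to inspect or unit-test the payload before sending it.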


## 1: Import and Load Variables

In [None]:
import json

# Load the configuration variables from a JSON file
with open("../Lab 1/variables.json", "r") as f:
    variables = json.load(f)

variables


## 2: Define ARN and Configuration Details

In [None]:
# Set up configuration for Bedrock
regionName = variables['regionName']
accountNumber = variables['accountNumber']
knowledge_base_id = variables['kbFixedChunk']
model_id = 'us.amazon.nova-pro-v1:0'

# Define ARNs (Amazon Resource Names) in the same Region that the clients below use
model_arn = f"arn:aws:bedrock:{regionName}:{accountNumber}:inference-profile/{model_id}"
rerank_model_arn = f"arn:aws:bedrock:{regionName}::foundation-model/cohere.rerank-v3-5:0"
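
The two ARNs in this cell follow different shapes: the inference-profile ARN includes the account number, while the foundation-model ARN leaves the account field empty. A minimal sketch of helpers that make the difference explicit is shown below; the function names and the placeholder account number are illustrative and not part of the lab.

```python
def inference_profile_arn(region: str, account: str, profile_id: str) -> str:
    """ARN for an inference profile, which is scoped to an account."""
    return f"arn:aws:bedrock:{region}:{account}:inference-profile/{profile_id}"


def foundation_model_arn(region: str, model_id: str) -> str:
    """ARN for a foundation model; the account field stays empty."""
    return f"arn:aws:bedrock:{region}::foundation-model/{model_id}"


# Placeholder values for illustration only.
example_model_arn = inference_profile_arn(
    "us-west-2", "123456789012", "us.amazon.nova-pro-v1:0"
)
example_rerank_arn = foundation_model_arn("us-west-2", "cohere.rerank-v3-5:0")

print(example_model_arn)
print(example_rerank_arn)
```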


## 3: Set Up Bedrock Client

In [None]:
import boto3

# Configure the Bedrock Agent Runtime client in the Region loaded from variables.json
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=regionName)


## 4: Define Function for Reranking

In [None]:
import boto3
import json

def search_kb_with_optional_rerank(query, kb_id, model_arn=None, use_reranking=False):
    """Search KB and optionally rerank results"""
    client = boto3.client("bedrock-agent-runtime", region_name=regionName)
    
    # 1. Retrieve from knowledge base
    kb_response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}}
    )
    
    # Extract documents
    documents = []
    original_results = []
    
    for i, result in enumerate(kb_response.get("retrievalResults", [])):
        # The Retrieve API returns each chunk's text as a plain string under content.text
        text = result.get("content", {}).get("text", "")

        # Store original result (the similarity score is returned under "score")
        original_results.append({
            "position": i + 1,
            "score": result.get("score", 0),
            "text": text[:300] + "..." if len(text) > 300 else text
        })
        documents.append(text)
    
    # 2. Rerank if enabled
    if use_reranking and model_arn and documents:
        reranked = client.rerank(
            queries=[{"textQuery": {"text": query}, "type": "TEXT"}],
            rerankingConfiguration={
                "bedrockRerankingConfiguration": {
                    "modelConfiguration": {"modelArn": model_arn},
                    # Never request more results than there are documents to rank
                    "numberOfResults": min(5, len(documents))
                },
                "type": "BEDROCK_RERANKING_MODEL"
            },
            sources=[{
                "inlineDocumentSource": {"textDocument": {"text": doc}, "type": "TEXT"},
                "type": "INLINE"
            } for doc in documents]
        )
        
        # Process reranked results
        reranked_results = []
        for result in reranked.get("results", []):
            idx = result.get("index", 0)
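            doc_text = documents[idx]
            reranked_results.append({
                "original_position": idx + 1,
                "new_position": len(reranked_results) + 1,
                "relevance_score": result.get("relevanceScore", 0),  # Full precision score
                # Append an ellipsis only when the text was actually truncated
                "text": doc_text[:300] + "..." if len(doc_text) > 300 else doc_text
            })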
        return {"original_results": original_results, "reranked_results": reranked_results}
        
    return {"original_results": original_results}
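
The position bookkeeping in the function above can be checked offline. Each rerank result carries a 0-based `index` into the submitted sources, and the order of the results list is the new ranking. The sketch below applies the same mapping to a hand-written stand-in for the `results` list; the scores are made up, and `summarize_rank_changes` is a hypothetical helper rather than part of the lab.

```python
from typing import Any, Dict, List


def summarize_rank_changes(rerank_results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Translate rerank results into 1-based positions and movement."""
    summary = []
    for new_position, result in enumerate(rerank_results, start=1):
        original_position = result["index"] + 1  # "index" is 0-based
        summary.append({
            "original_position": original_position,
            "new_position": new_position,
            # Positive means the document moved toward the top.
            "moved_up_by": original_position - new_position,
            "relevance_score": result["relevanceScore"],
        })
    return summary


# Hand-written stand-in for the "results" list of a rerank response (not real output).
mock_results = [
    {"index": 2, "relevanceScore": 0.91},
    {"index": 0, "relevanceScore": 0.40},
    {"index": 1, "relevanceScore": 0.05},
]

changes = summarize_rank_changes(mock_results)
for change in changes:
    print(change)
```

In this mock, the document originally retrieved third moves to the top, and the other two each drop one place.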

## 5: Define Function for Retrieve and Generate

In [None]:
def retrieve_and_generate(query, kb_id, rerank_model_arn, use_reranking=True):
    """Full RAG pipeline with optional reranking"""
    # 1. Search and get documents
    results = search_kb_with_optional_rerank(
        query, kb_id, rerank_model_arn, use_reranking
    )
    
    # 2. Use the appropriate results
    if use_reranking and "reranked_results" in results:
        docs = [doc["text"] for doc in results["reranked_results"]]
        source_type = "reranked"
    else:
        docs = [doc["text"] for doc in results["original_results"]]
        source_type = "vector search"
    
    # 3. Generate answer with context from docs
    context = "\n\n".join([f"Document {i+1}: {doc[:300]}..." for i, doc in enumerate(docs[:3])])
    prompt = f"Query: {query}\n\nContext from {source_type}:\n{context}\n\nAnswer:"
    
    # Call your LLM of choice (simplified here)
    client = boto3.client("bedrock-runtime", region_name=regionName)
    response = client.invoke_model(
        modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 500,
            "messages": [{"role": "user", "content": prompt}]
        })
    )
    
    response_body = json.loads(response["body"].read())
    return response_body["content"][0]["text"]
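
To see what the model receives in step 3 of the function above, the prompt assembly can be pulled out and run without calling Bedrock. The sketch below is a hypothetical variant, not the notebook's code: it keeps the same three-document, 300-character limits but appends an ellipsis only when a document is actually cut. The sample documents are invented.

```python
from typing import List


def build_prompt(query: str, docs: List[str], source_type: str,
                 max_docs: int = 3, max_chars: int = 300) -> str:
    """Assemble the RAG prompt from the top documents (illustrative helper)."""
    parts = []
    for i, doc in enumerate(docs[:max_docs], start=1):
        # Truncate long documents and mark the cut with an ellipsis.
        snippet = doc if len(doc) <= max_chars else doc[:max_chars] + "..."
        parts.append(f"Document {i}: {snippet}")
    context = "\n\n".join(parts)
    return f"Query: {query}\n\nContext from {source_type}:\n{context}\n\nAnswer:"


# Invented sample documents for illustration only.
sample_docs = [
    "Revenue grew year over year.",
    "Operating expenses were flat.",
    "Headcount increased in engineering.",
    "This fourth document is beyond the cutoff.",
]

prompt = build_prompt("Compare the results between 2022 and 2023", sample_docs, "reranked")
print(prompt)
```

Because only the first three documents reach the model, the order produced by the reranker directly determines which passages appear in the context.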

## 6: Compare the Retrieved Results With and Without Reranking

In [None]:
# Example usage
query = "Compare the results between 2022 and 2023"

# Without reranking
print("WITHOUT RERANKING:")
results_no_rerank = search_kb_with_optional_rerank(
    query, knowledge_base_id, rerank_model_arn, use_reranking=False
)

# Display original results
print("\nTOP 3 DOCUMENTS WITHOUT RERANKING:")
for doc in results_no_rerank["original_results"][:3]:
    print(f"Position {doc['position']} (Score: {doc['score']}):")
    print(f"{doc['text']}\n")


In [None]:
# With reranking
print("\nWITH RERANKING:")
results_with_rerank = search_kb_with_optional_rerank(
    query, knowledge_base_id, rerank_model_arn, use_reranking=True
)

# Show reranked results with full precision scores
print("\nTOP 3 DOCUMENTS AFTER RERANKING:")
for doc in results_with_rerank["reranked_results"][:3]:
    print(f"Moved from position {doc['original_position']} to {doc['new_position']}")
    print(f"Relevance score: {doc['relevance_score']}")  # Full precision
    print(f"{doc['text']}\n")
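
Beyond reading the two printouts side by side, it can help to summarize which documents entered or left the top three. The sketch below works on lists shaped like `original_results` and `reranked_results` from section 4; `top_k_changes` is a hypothetical helper, and the mock lists are invented so the block runs without AWS access.

```python
from typing import Any, Dict, List


def top_k_changes(original_results: List[Dict[str, Any]],
                  reranked_results: List[Dict[str, Any]],
                  k: int = 3) -> Dict[str, List[int]]:
    """Compare the top-k sets before and after reranking by original position."""
    before = [doc["position"] for doc in original_results[:k]]
    after = [doc["original_position"] for doc in reranked_results[:k]]
    return {
        "kept": [p for p in after if p in before],
        "promoted": [p for p in after if p not in before],
        "dropped": [p for p in before if p not in after],
    }


# Invented stand-ins for the dictionaries returned in section 4.
mock_original = [{"position": p} for p in range(1, 6)]
mock_reranked = [{"original_position": p} for p in (4, 1, 5, 2, 3)]

summary = top_k_changes(mock_original, mock_reranked)
print(summary)
```

With these mock lists, one document stays in the top three, two are promoted into it, and two drop out.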


## 7: Compare the Generated Results With and Without Reranking

In [None]:
print("\nGENERATED ANSWER WITHOUT RERANKING:")
answer_no_rerank = retrieve_and_generate(query, knowledge_base_id, rerank_model_arn, use_reranking=False)
print(answer_no_rerank)


In [None]:
print("\nGENERATED ANSWER WITH RERANKING:")
answer_with_rerank = retrieve_and_generate(query, knowledge_base_id, rerank_model_arn, use_reranking=True)
print(answer_with_rerank)
