# Bedrock Knowledge Base Retrieval and Generation with Guardrails

## Description
This notebook demonstrates how to add Amazon Bedrock Guardrails to a Retrieval-Augmented Generation (RAG) pipeline built on a Bedrock knowledge base. It walks through retrieving data from the knowledge base, narrowing the results with a metadata filter, and applying a guardrail to control the generated response.

![Guardrails](./guardrail.png)

## 1: Import and Load Variables

In [1]:
import json

# Load the configuration variables from a JSON file
with open("../Lab 1/variables.json", "r") as f:
    variables = json.load(f)

variables


{'accountNumber': '307297743176',
 'regionName': 'us-west-2',
 'collectionArn': 'arn:aws:aoss:us-west-2:307297743176:collection/h7cmj732p9d3v91spkhd',
 'collectionId': 'h7cmj732p9d3v91spkhd',
 'vectorIndexName': 'ws-index-',
 'bedrockExecutionRoleArn': 'arn:aws:iam::307297743176:role/advanced-rag-workshop-bedrock_execution_role-us-west-2',
 's3Bucket': '307297743176-us-west-2-advanced-rag-workshop',
 'kbFixedChunk': '4P6PBDDEGL',
 'kbSemanticChunk': 'IC3ZCBORXT',
 'kbCustomChunk': 'Q2T9CZ5VFA',
 'kbHierarchicalChunk': '1YIFVW0Z5E',
 'sagemakerLLMEndpoint': 'endpoint-llama-3-2-3b-instruct-2025-04-07-16-05-17',
 'guardrail_id': 'fe7ryshi7i7b',
 'guardrail_version': '1'}
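
The later cells read `accountNumber`, `kbFixedChunk`, `guardrail_id`, and `guardrail_version` from this dictionary, so a missing entry only surfaces later as a `KeyError`. A minimal sketch of an early check follows; `REQUIRED_KEYS`, `missing_keys`, and the sample values are hypothetical names and placeholders introduced for illustration, not part of the workshop code.

```python
# Keys that the later cells in this notebook read from the config dictionary
REQUIRED_KEYS = ("accountNumber", "kbFixedChunk", "guardrail_id", "guardrail_version")


def missing_keys(config: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty in config."""
    return [key for key in required if not config.get(key)]


# Placeholder config for illustration only (guardrail_version deliberately left out)
sample_config = {
    "accountNumber": "123456789012",
    "kbFixedChunk": "EXAMPLEKB1",
    "guardrail_id": "abc123example",
}
print(missing_keys(sample_config))
```

In the notebook itself, the same check could be run as `missing_keys(variables)` right after loading the file.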

## 2: Define ARN and Configuration Details

In [2]:
# Set up configuration for Bedrock and Guardrails
accountNumber = variables['accountNumber']
regionName = variables['regionName']
knowledge_base_id = variables['kbFixedChunk']
model_id = 'us.amazon.nova-pro-v1:0'
guardrail_version = variables['guardrail_version']
guardrail_id = variables['guardrail_id']

# Define ARNs (Amazon Resource Names) for the model and guardrail
model_arn = f"arn:aws:bedrock:{regionName}:{accountNumber}:inference-profile/{model_id}"
# Kept for reference only; the API call below identifies the guardrail by ID and version
guardrail_arn = f"arn:aws:bedrock:{regionName}:{accountNumber}:guardrail/{guardrail_id}"
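
Both ARNs above follow the same `arn:aws:bedrock:<region>:<account>:<resource-type>/<resource-id>` layout. As a rough sketch, the string assembly could be factored into one helper; `build_bedrock_arn` is a hypothetical function and the account and guardrail values below are placeholders.

```python
def build_bedrock_arn(region: str, account_id: str, resource_type: str, resource_id: str) -> str:
    """Assemble a Bedrock ARN of the form arn:aws:bedrock:<region>:<account>:<type>/<id>."""
    parts = {
        "region": region,
        "account_id": account_id,
        "resource_type": resource_type,
        "resource_id": resource_id,
    }
    # Fail early on empty components instead of producing a malformed ARN
    for name, value in parts.items():
        if not value:
            raise ValueError(f"{name} must be a non-empty string")
    return f"arn:aws:bedrock:{region}:{account_id}:{resource_type}/{resource_id}"


# Placeholder values for illustration only
example_model_arn = build_bedrock_arn(
    "us-west-2", "123456789012", "inference-profile", "us.amazon.nova-pro-v1:0"
)
example_guardrail_arn = build_bedrock_arn(
    "us-west-2", "123456789012", "guardrail", "abc123example"
)
print(example_model_arn)
print(example_guardrail_arn)
```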


## 3: Set Up Bedrock Client

In [3]:
import boto3

# Configure the Bedrock client in the region recorded in the config file
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=variables['regionName'])


## 4: Define Function for Retrieval with Guardrails

In [4]:
def retrieve_and_generate_with_conditional_guardrails(
    query, 
    knowledge_base_id, 
    model_arn, 
    metadata_filter=None,
    use_guardrails=False,
    guardrail_id=None,
    guardrail_version=None
):
    """
    Retrieves and generates a response with optional Guardrails application.
    
    Parameters:
    - query (str): The input query.
    - knowledge_base_id (str): The ID of the knowledge base.
    - model_arn (str): The ARN of the model.
    - metadata_filter (dict, optional): The filter for the vector search configuration.
    - use_guardrails (bool, optional): Whether to apply guardrails. Defaults to False.
    - guardrail_id (str, optional): The ID of the guardrail to apply. Required if use_guardrails is True.
    - guardrail_version (str, optional): The version of the guardrail. Required if use_guardrails is True.
    
    Returns:
    - response: The response from the retrieve_and_generate method.
    """
    # Start with base configuration
    kb_config = {
        'knowledgeBaseId': knowledge_base_id,
        "modelArn": model_arn,
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": 5
            }
        }
    }
    
    # Add metadata filter if provided
    if metadata_filter:
        kb_config["retrievalConfiguration"]["vectorSearchConfiguration"]["filter"] = metadata_filter
    
    # Add generation configuration with prompt template
    kb_config["generationConfiguration"] = {
        "promptTemplate": {
            "textPromptTemplate": "Answer the following question based on the context:\n$search_results$\n\nQuestion: $query$"
        }
    }
    
    # Add guardrail configuration only if requested
    if use_guardrails:
        # Validate required parameters: the guardrail is identified by both ID and version
        if not guardrail_id or not guardrail_version:
            raise ValueError(
                "guardrail_id and guardrail_version are required when use_guardrails is True"
            )

        # Add to generation configuration
        kb_config["generationConfiguration"]["guardrailConfiguration"] = {
            "guardrailId": guardrail_id,
            "guardrailVersion": guardrail_version
        }
    
    # Make the API call
    response = bedrock_agent_runtime.retrieve_and_generate(
        input={
            "text": query
        },
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": kb_config
        }
    )
    
    return response
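
Besides `output`, the `retrieve_and_generate` response carries a `citations` list that ties parts of the answer to the retrieved chunks. The sketch below shows one way to flatten it for inspection. `summarize_citations` is a hypothetical helper, and the mock response is hand-written to mirror the response shape rather than taken from a real call.

```python
def summarize_citations(response: dict) -> list:
    """Return one dict per retrieved reference: its S3 URI (if any) and a text snippet."""
    summaries = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            text = ref.get("content", {}).get("text", "")
            summaries.append({"source": uri, "snippet": text[:80]})
    return summaries


# Hand-written mock that mimics the response structure (illustrative values only)
mock_response = {
    "output": {"text": "Example answer."},
    "citations": [
        {
            "generatedResponsePart": {"textResponsePart": {"text": "Example answer."}},
            "retrievedReferences": [
                {
                    "content": {"text": "Example chunk text from a filing."},
                    "location": {
                        "type": "S3",
                        "s3Location": {"uri": "s3://example-bucket/example.pdf"},
                    },
                }
            ],
        }
    ],
}

for item in summarize_citations(mock_response):
    print(item["source"], "->", item["snippet"])
```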

## 5: Define Metadata Filter

In [5]:
# Define a metadata filter that keeps only chunks tagged as 10K reports from 2023
one_group_filter = {
    "andAll": [
        {
            "equals": {
                "key": "docType",
                "value": '10K Report'
            }
        },
        {
            "equals": {
                "key": "year",
                "value": 2023
            }
        }
    ]
}
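
Filters of this shape can also be generated from plain key/value pairs. The following is a sketch only: `build_equals_filter` is a hypothetical helper, and it assumes every condition is an equality test that should be combined with `andAll`. A single condition is returned as a bare `equals` clause because `andAll` is meant to combine two or more clauses.

```python
def build_equals_filter(**conditions):
    """Build a metadata filter in which every key must equal the given value."""
    if not conditions:
        return None  # No filter; the retrieval function above skips falsy filters
    clauses = [{"equals": {"key": key, "value": value}} for key, value in conditions.items()]
    # A lone clause is used directly; several clauses are combined with andAll
    return clauses[0] if len(clauses) == 1 else {"andAll": clauses}


generated_filter = build_equals_filter(docType="10K Report", year=2023)
print(generated_filter)
```

With the two conditions shown, the result has the same structure as `one_group_filter` above.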


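## 6: Validate That the Guardrail Blocks Investment Advice
Ask the foundation model for investment advice. The guardrail was configured to block investment advice, so with the guardrail applied Bedrock should return the guardrail's preconfigured blocked message instead of an answer (for example, "This request cannot be processed due to safety protocols"; the exact wording depends on how the guardrail was configured). The first call below runs without the guardrail to establish a baseline.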

In [6]:
# Define the query that will be sent to the model
query = "based on your amazon's results should I buy amazon stock?"


In [7]:
response_without_guardrails = retrieve_and_generate_with_conditional_guardrails(
    query=query, 
    knowledge_base_id=knowledge_base_id, 
    model_arn=model_arn,
    metadata_filter=one_group_filter,
    use_guardrails=False  # Explicitly disable guardrails for the baseline response
)

print(response_without_guardrails['output']['text'])

It's important to note that making investment decisions should be based on thorough research, analysis, and consideration of various factors, including your own financial goals, risk tolerance, and investment horizon. While I can provide information about Amazon's financial performance, I cannot give specific investment advice.

That being said, here are some key points to consider when evaluating Amazon as a potential investment:

1. **Financial Performance**:
   - **Revenue Growth**: Amazon has shown consistent revenue growth over the years. For the fiscal year ended December 31, 2022, Amazon reported total net sales of $513.983 billion, up from $469.822 billion in 2021.
   - **Net Income**: Amazon reported a net loss of $2.722 billion in 2022, compared to a net income of $33.364 billion in 2021. This loss was influenced by various factors, including equity-method investment activity and other non-operating expenses.
   - **Operating Income**: Operating income decreased to $12.248 bi

## 7: Retrieve Response with Guardrails

In [8]:
response_with_guardrails = retrieve_and_generate_with_conditional_guardrails(
    query=query, 
    knowledge_base_id=knowledge_base_id, 
    model_arn=model_arn,
    metadata_filter=one_group_filter,
    use_guardrails=True,
    guardrail_id=guardrail_id,
    guardrail_version=guardrail_version
)
print(response_with_guardrails['output']['text'])

I can only process questions related to AWS services and Amazon 10K filings.
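
Rather than comparing the output text against the blocked message, the response can be checked for its `guardrailAction` field, which reports whether the guardrail intervened. A minimal sketch, assuming the field is either absent or set to `NONE` when nothing was blocked; `guardrail_intervened` is a hypothetical helper and both responses below are hand-written mocks.

```python
def guardrail_intervened(response: dict) -> bool:
    """Return True when the response reports that the guardrail stepped in."""
    return response.get("guardrailAction") == "INTERVENED"


# Hand-written mocks for illustration; a real response contains more fields
blocked_response = {
    "output": {"text": "I can only process questions related to AWS services and Amazon 10K filings."},
    "guardrailAction": "INTERVENED",
}
plain_response = {"output": {"text": "Example answer."}, "guardrailAction": "NONE"}

print(guardrail_intervened(blocked_response))
print(guardrail_intervened(plain_response))
```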


## 8: Guardrails for PII Data

In [9]:
query = "Who is the current CFO of Amazon?"


In [10]:
response_without_guardrails = retrieve_and_generate_with_conditional_guardrails(
    query=query, 
    knowledge_base_id=knowledge_base_id, 
    model_arn=model_arn,
    metadata_filter=one_group_filter,
    use_guardrails=False  # Explicitly disable guardrails for the baseline response
)

print(response_without_guardrails['output']['text'])

Brian T. Olsavsky is the current Senior Vice President and Chief Financial Officer (CFO) of Amazon.com, Inc. This information is provided in Passage %[3]%, which lists the executive officers of Amazon and specifies Mr. Olsavsky's role as CFO.


In [11]:
# Re-run the same query with the guardrail enabled so that PII in the answer is masked
response_with_guardrails = retrieve_and_generate_with_conditional_guardrails(
    query=query, 
    knowledge_base_id=knowledge_base_id, 
    model_arn=model_arn,
    metadata_filter=one_group_filter,
    use_guardrails=True,
    guardrail_id=guardrail_id,
    guardrail_version=guardrail_version
)
print(response_with_guardrails['output']['text'])

The current CFO of Amazon is {NAME}. This information is found in Passage %[3]%, which lists the executive officers of Amazon, including {NAME} as the Senior Vice President and Chief Financial Officer.
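
In the guarded answer the name is replaced by a `{NAME}` placeholder. To confirm masking programmatically, one option is to scan the text for such placeholders. This sketch assumes masks always look like an upper-case label in braces, as in the output above; `MASK_PATTERN` and `find_masked_entities` are hypothetical names.

```python
import re
from collections import Counter

# Assumed mask format: an upper-case label (letters and underscores) wrapped in braces
MASK_PATTERN = re.compile(r"\{([A-Z][A-Z_]*)\}")


def find_masked_entities(text: str) -> dict:
    """Count how often each mask label appears in the text."""
    return dict(Counter(MASK_PATTERN.findall(text)))


masked_answer = (
    "The current CFO of Amazon is {NAME}. This information is found in Passage %[3]%, "
    "which lists the executive officers of Amazon, including {NAME} as the "
    "Senior Vice President and Chief Financial Officer."
)
print(find_masked_entities(masked_answer))
```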
