# Build a simplified Corrective RAG assistant with Amazon Bedrock

Large language models (LLMs) inevitably exhibit hallucinations since the accuracy of generated texts cannot be secured solely by the parametric knowledge they encapsulate. Although Retrieval Augmented Generation (RAG) is a practicable complement to LLMs, it relies heavily on the relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong.

Advanced RAG techniques like [Corrective RAG](https://arxiv.org/pdf/2401.15884.pdf) were proposed to improve the robustness of generation. In CRAG, a lightweight retrieval evaluator is designed to assess the overall quality of retrieved documents for a query, returning a confidence degree based on which different knowledge retrieval actions can be triggered. Since retrieval from static and limited corpora can only return sub-optimal documents, large-scale web searches are utilized as an extension for augmenting the retrieval results. CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches.
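
As a rough illustration of the corrective idea described above, the following minimal sketch shows a simplified control flow: retrieve, score each result, and fall back to a web search when nothing is relevant enough. All the callables (`retrieve`, `score`, `generate`, `web_search`) and the default threshold are hypothetical stand-ins, not the paper's evaluator or this notebook's helper functions.

```python
from typing import Callable, List


def corrective_rag_answer(
    query: str,
    retrieve: Callable[[str], List[str]],
    score: Callable[[str, str], float],
    generate: Callable[[str, List[str]], str],
    web_search: Callable[[str], List[str]],
    threshold: float = 0.7,
) -> str:
    """Simplified corrective RAG flow; every callable is a hypothetical stand-in."""
    # Look up candidate passages in the knowledge base
    passages = retrieve(query)
    # Keep only the passages the evaluator considers relevant enough
    relevant = [p for p in passages if score(query, p) >= threshold]
    if not relevant:
        # Corrective action: fall back to a web search when retrieval misses
        relevant = web_search(query)
    # Generate the final answer grounded in the selected context
    return generate(query, relevant)


# Toy stand-ins so the sketch runs without any AWS calls
kb = {"hallucination": "LLMs can produce plausible but unsupported text."}
answer = corrective_rag_answer(
    "What is a hallucination?",
    retrieve=lambda q: list(kb.values()),
    score=lambda q, p: 0.9 if "LLMs" in p else 0.1,
    generate=lambda q, ctx: "Answer based on: " + " | ".join(ctx),
    web_search=lambda q: ["(web result placeholder)"],
)
print(answer)
```

Passing the steps in as callables keeps the decision logic separate from the services that implement each step, which is also what makes it easy to exercise without network access.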

This notebook will walk you through the process of building a simplified CRAG-based assistant using the Anthropic Claude 3 LLM hosted on [Amazon Bedrock](https://aws.amazon.com/bedrock/). We will also use [Knowledge Bases for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/) with Titan Embeddings G1 - Text as the embeddings model and [Agents for Amazon Bedrock](https://aws.amazon.com/bedrock/agents/).

We will use [LangChain](https://www.langchain.com/) to simplify the process of constructing the prompts and interacting with the LLMs and Knowledge Bases (KBs). In the process of working through this notebook, you will learn how to set up the Amazon Bedrock client environment, configure security permissions, and use prompt templates in LangChain. Invocations that involve LangChain will be explicitly mentioned.

![](./images/flowchart.png)

<div class="alert alert-block alert-info">
<b>Note:</b>
    <ul>
        <li>This notebook should only be run from within an <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html">Amazon SageMaker Notebook instance</a> or within an <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated.html">Amazon SageMaker Studio Notebook</a>.</li>
        <li>This notebook uses text based models along with their versions that were available at the time of writing. Update these as required.</li>
        <li>At the time of writing this notebook, Amazon Bedrock was only available in <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-regions.html">these supported AWS Regions</a>. If you are running this notebook from any other AWS Region, then you have to change the Amazon Bedrock client's region and/or endpoint URL parameters to one of those supported AWS Regions that has Anthropic Claude 3 and Titan Embeddings G1 - Text. Follow the guidance in the <i>Organize imports</i> section of this notebook.</li>
        <li>This notebook is recommended to be run with a minimum instance size of <i>ml.t3.medium</i> and
            <ul>
                <li>With <i>Amazon Linux 2, Jupyter Lab 3</i> as the platform identifier on an Amazon SageMaker Notebook instance.</li>
                <li>(or)</li>
                <li>With <i>Data Science 3.0</i> as the image on an Amazon SageMaker Studio Notebook.</li>
            </ul>
        </li>
        <li>At the time of writing, the recommended kernel for running this notebook,
            <ul>
                <li>On an Amazon SageMaker Notebook instance was <i>conda_python3</i></li>
                <li>On an Amazon SageMaker Studio Notebook was <i>Python 3</i></li>
            </ul>
        </li>
    </ul>
</div>

**Table of Contents:**

1. [Complete prerequisites](#Complete%20prerequisites)

    1. [Check and configure access to the Internet](#Check%20and%20configure%20access%20to%20the%20Internet)

    2. [Install required software libraries](#Install%20required%20software%20libraries)
    
    3. [Configure logging](#Configure%20logging)
        
        1. [System logs (Optional)](#Configure%20system%20logs%20(Optional))
        
        2. [Application logs](#Configure%20application%20logs)
    
    4. [Organize imports](#Organize%20imports)
    
    5. [Set AWS Region and boto3 config](#Set%20AWS%20Region%20and%20boto3%20config)
    
    6. [Enable model access in Amazon Bedrock](#Enable%20model%20access%20in%20Amazon%20Bedrock)
    
    7. [Check and configure security permissions](#Check%20and%20configure%20security%20permissions)
    
    8. [Create common objects](#Create%20common%20objects)
    
    9. [Get Knowledge Base details](#Get%20Knowledge%20Base%20details)
    
    10. [Get Bedrock Agent details](#Get%20Bedrock%20Agent%20details)

 2. [Load data to Knowledge Base](#Load%20data%20to%20Knowledge%20Base)
 
    1. [Step 0a: Download data](#Load%20to%20KB%20Step0a)
    
    2. [Step 0b: Copy downloaded data to S3](#Load%20to%20KB%20Step0b)
    
    3. [Steps 0c to 0e: Sync to Knowledge Base](#Load%20to%20KB%20Steps0c%20to%200e)
 
 3. [Scenario 1 - match found in KB](#Match%20found%20in%20KB)
 
     1. [Step 1: User query](#Scenario%201%20Step%201)
     
     2. [Steps 2a and 2b: Query lookup](#Scenario%201%20Steps%202a%20and%202b)
     
     3. [Steps 3a and 3b: Determine the query results relevancy](#Scenario%201%20Steps%203a%20and%203b)
     
     4. [Steps 4a through 5: Process to completion](#Scenario%201%20Steps%204a%20through%205)
 
 4. [Scenario 2 - match not found in KB](#Match%20not%20found%20in%20KB)
 
     1. [Step 1: User query](#Scenario%202%20Step%201)
     
     2. [Steps 2a and 2b: Query lookup](#Scenario%202%20Steps%202a%20and%202b)
     
     3. [Steps 3a and 3b: Determine the query results relevancy](#Scenario%202%20Steps%203a%20and%203b)
     
     4. [Steps 4a through 7: Process to completion](#Scenario%202%20Steps%204a%20through%207)
 
 5. [Associate Bedrock Agent with KB (Optional)](#Associate%20Bedrock%20Agent%20with%20KB%20(Optional))
 
 6. [Cleanup](#Cleanup)
 
 7. [Conclusion](#Conclusion)
 
 8. [Frequently Asked Questions (FAQs)](#FAQs)

##  1. Complete prerequisites <a id ='Complete%20prerequisites'> </a>

Check and complete the prerequisites.

###  A. Check and configure access to the Internet <a id ='Check%20and%20configure%20access%20to%20the%20Internet'> </a>
This notebook requires outbound access to the Internet to download the required software updates and the dataset. You can either provide direct Internet access (default) or provide Internet access through an [Amazon VPC](https://aws.amazon.com/vpc/). For more information, refer to [this page](https://docs.aws.amazon.com/sagemaker/latest/dg/appendix-notebook-and-internet-access.html).

### B. Install required software libraries <a id ='Install%20required%20software%20libraries'> </a>
This notebook requires the following software:
* [SageMaker Python SDK version 2.x](https://sagemaker.readthedocs.io/en/stable/v2.html)
* [Python 3.10.x](https://www.python.org/downloads/release/python-3100/)
* [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html)
* [LangChain](https://www.langchain.com/)

Run the following cell to install the required libraries.

<div class="alert alert-block alert-warning">  
    <b>Note:</b> At the end of the installation, the Kernel will be forcefully restarted immediately. Please wait 10 seconds for the kernel to come back before running the next cell.
</div>

In [None]:
!pip install boto3==1.34.109
!pip install langchain==0.2.0
!pip install langchain-aws==0.1.4
!pip install langchain-community==0.2.0
!pip install requests==2.32.1
!pip install sagemaker==2.221.0

import IPython

IPython.Application.instance().kernel.do_shutdown(True)

### C. Configure logging <a id ='Configure%20logging'> </a>

####  a. System logs (Optional) <a id='Configure%20system%20logs%20(Optional)'></a>

System logs refer to the logs generated by the notebook's interactions with the underlying notebook instance. Some examples of these are the logs generated when loading or saving the notebook.

These logs are automatically set up when the notebook instance is launched.

These logs can be accessed through the [Amazon CloudWatch Logs](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html) console in the same AWS Region where this notebook is running.
* When running this notebook in an Amazon SageMaker Notebook instance, navigate to the following location,
    * <i>CloudWatch > Log groups > /aws/sagemaker/NotebookInstances > {notebook-instance-name}/jupyter.log</i>
* When running this notebook in an Amazon SageMaker Studio Notebook, navigate to the following locations,
    * <i>CloudWatch > Log groups > /aws/sagemaker/studio > {sagemaker-domain-name}/{user-name}/KernelGateway/{notebook-instance-name}</i>
    * <i>CloudWatch > Log groups > /aws/sagemaker/studio > {sagemaker-domain-name}/{user-name}/JupyterServer/default</i>

If you want to find out the name of the underlying instance where this notebook is running, uncomment the following code cell and run it.

In [None]:
'''
import json

notebook_name = ''
resource_metadata_path = '/opt/ml/metadata/resource-metadata.json'
with open(resource_metadata_path, 'r') as metadata:
    notebook_name = (json.load(metadata))['ResourceName']
print("Notebook instance name: '{}'".format(notebook_name))
'''

####  b. Application logs <a id='Configure%20application%20logs'></a>

Application logs refer to the logs generated by running the various code cells in this notebook. To set this up, instantiate the [Python logging service](https://docs.python.org/3/library/logging.html) by running the following cell. You can configure the default log level and format as required.

By default, this notebook will only print the logs to the corresponding cell's output console.

In [None]:
import logging
import os

# Set the logging level and format
log_level = logging.INFO
log_format = '%(asctime)s - %(levelname)s - %(message)s'
logging.basicConfig(level=log_level, format=log_format)

# Save these in the environment variables for use in the helper scripts
os.environ['LOG_LEVEL'] = str(log_level)
os.environ['LOG_FORMAT'] = log_format
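
The helper scripts are not shown in this notebook, so the following is only a minimal sketch of how a script could pick up the two environment variables set above. The function name `read_log_settings` and its fallback behavior are assumptions for illustration.

```python
import logging
import os

DEFAULT_LOG_FORMAT = '%(asctime)s - %(levelname)s - %(message)s'


def read_log_settings(default_level=logging.INFO, default_format=DEFAULT_LOG_FORMAT):
    """Return (level, format) from LOG_LEVEL / LOG_FORMAT, with safe fallbacks."""
    raw_level = os.environ.get('LOG_LEVEL', '')
    try:
        # The notebook stores the numeric level as a string, e.g. '20' for INFO
        level = int(raw_level)
    except ValueError:
        # Unset or malformed value: use the default level instead
        level = default_level
    log_format = os.environ.get('LOG_FORMAT') or default_format
    return level, log_format


# Hypothetical usage at the top of a helper script
level, log_format = read_log_settings()
logging.basicConfig(level=level, format=log_format)
```

Storing the level as a numeric string means the reader has to convert it back with `int()`; falling back to a default keeps the script usable when it is run outside the notebook.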

###  D. Organize imports <a id ='Organize%20imports'> </a>

Organize all the library and module imports for later use.

In [None]:
import boto3
import langchain
import requests
import sagemaker
import sys
import time
from botocore.config import Config

# Import the helper functions from the 'scripts' folder
sys.path.append(os.path.join(os.getcwd(), "scripts"))
#logging.info("Updated sys.path: {}".format(sys.path))
from helper_functions import *

Print the installed versions of some of the important libraries.

In [None]:
logging.info("Python version : {}".format(sys.version))
logging.info("Boto3 version : {}".format(boto3.__version__))
logging.info("SageMaker Python SDK version : {}".format(sagemaker.__version__))
logging.info("LangChain version : {}".format(langchain.__version__))
logging.info("Requests version : {}".format(requests.__version__))

###  E. Set AWS Region and boto3 config <a id ='Set%20AWS%20Region%20and%20boto3%20config'> </a>

Get the boto3 session, the notebook's IAM role, and the current AWS Region (where this notebook is running). These will be used to initialize some of the clients to AWS services using the boto3 APIs.

<div class="alert alert-block alert-info">  
<b>Note:</b> All the AWS services used by this notebook except Amazon Bedrock will use the current AWS Region. For Bedrock, follow the guidance in the next cell.
</div>

<div class="alert alert-block alert-warning">  
<b>Note:</b> At the time of writing this notebook, Amazon Bedrock was only available in <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-regions.html">these supported AWS Regions</a>. If you are running this notebook from any other AWS Region, then you have to change the Amazon Bedrock client's region and/or endpoint URL parameters to one of those supported AWS Regions that has Anthropic Claude 3 and Titan Embeddings G1 - Text. In order to do this, this notebook will use the value specified in the environment variable named <mark>AMAZON_BEDROCK_REGION</mark>. If this is not specified, then the notebook will default to <mark>us-west-2 (Oregon)</mark> for Amazon Bedrock.
</div>



In [None]:
# Get the boto3 Session, IAM Role and AWS Region references
my_session = boto3.session.Session()
logging.info("Boto3 Session: {}".format(my_session))
my_iam_role = sagemaker.get_execution_role()
logging.info("Notebook IAM Role: {}".format(my_iam_role))
my_region = my_session.region_name
logging.info("Current AWS Region: {}".format(my_region))

# Explicitly set the AWS Region for Amazon Bedrock clients;
# fall back to the default when the environment variable is unset or empty
AMAZON_BEDROCK_DEFAULT_REGION = "us-west-2"
br_region = os.environ.get('AMAZON_BEDROCK_REGION') or AMAZON_BEDROCK_DEFAULT_REGION
logging.info("AWS Region for Amazon Bedrock: {}".format(br_region))

Set the timeout and retry configurations that will be applied to all the boto3 clients used in this notebook.

In [None]:
# Increase the standard time out limits in the boto3 client from 1 minute to 3 minutes
# and set the retry limits
my_boto3_config = Config(
    connect_timeout = (60 * 3),
    read_timeout = (60 * 3),
    retries = {
        'max_attempts': 10,
        'mode': 'standard'
    }
)

###  F. Enable model access in Amazon Bedrock <a id ='Enable%20model%20access%20in%20Amazon%20Bedrock'> </a>

<div class="alert alert-block alert-danger">
    <b>Note:</b> Before proceeding further with this notebook, you must enable access to the 'Anthropic Claude 3' and 'Titan Embeddings G1 - Text' models on Amazon Bedrock by following the instructions <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html">here</a>. You need to submit the use case details. Otherwise, you will get an authorization error.
</div>

<div class="alert alert-block alert-warning">  
<b>Note:</b> You will have to do this manually after reading the End User License Agreement (EULA) for each of the models that you want to enable. Unless you explicitly disable it, this is a one-time setup for each model in an AWS account.
</div>

Run the following cell to print the Amazon Bedrock model access page URL for the AWS Region that was selected earlier.

In [None]:
# Print the Amazon Bedrock model access page URL
logging.info("Amazon Bedrock model access page - https://{}.console.aws.amazon.com/bedrock/home?region={}#/modelaccess"
             .format(br_region, br_region))

<div class="alert alert-block alert-info">  
<b>Note:</b> For running this notebook, you need access to only the Anthropic Claude 3 Haiku and Sonnet models, and the Titan Embeddings G1 - Text model. 
</div>

###  G. Check and configure security permissions <a id ='Check%20and%20configure%20security%20permissions'> </a>
This notebook uses the IAM role attached to the underlying notebook instance.  To view the name of this role, run the following cell. This IAM role should have the following permissions,
1. Full access to invoke Large Language Models (LLMs) on Amazon Bedrock.
2. Full access to use Knowledge Bases for Amazon Bedrock.
3. Full access to use Agents for Amazon Bedrock.
4. Full access to read and write to the Amazon OpenSearch Serverless collection associated with the Knowledge Base.
5. Full access to read and write to the Amazon S3 bucket associated with the Knowledge Base.
6. Full access to read, write and invoke the AWS Lambda function associated with the Agent.
7. Access to write to Amazon CloudWatch Logs.

The Bedrock Agent that this notebook will invoke to perform the web search requires the following permissions,
1. Access to invoke the associated AWS Lambda function.
2. Access to invoke Large Language Models (LLMs) on Amazon Bedrock.
3. Access to write to Amazon CloudWatch Logs.

The AWS Lambda function associated with the above Bedrock Agent requires the following permissions,
1. Outbound Internet access.
2. Access to write to Amazon CloudWatch Logs.

<div class="alert alert-block alert-info">
<b>Note:</b> If you are running this notebook as part of a workshop session, by default, all these permissions will be set up.
</div>

Run the following cell to print the details of the IAM role attached to the underlying notebook instance.

In [None]:
# Print the IAM role ARN and console URL
logging.info("This notebook's IAM role is '{}'".format(my_iam_role))
# The role name is the last path segment of the role ARN
arn_parts = my_iam_role.split('/')
logging.info("Details of this IAM role are available at https://{}.console.aws.amazon.com/iamv2/home?region={}#/roles/details/{}?section=permissions"
             .format(my_region, my_region, arn_parts[-1]))

###  H. Create common objects <a id='Create%20common%20objects'></a>

To begin with, create the boto3 clients.

In [None]:
# Create the Amazon S3 client
s3_client = boto3.client("s3", region_name = br_region, config = my_boto3_config)

# Create the Amazon OpenSearch Serverless client
aoss_client = boto3.client("opensearchserverless", region_name = br_region, config = my_boto3_config)

# Create the Amazon Bedrock client
bedrock_client = boto3.client("bedrock", region_name = br_region, endpoint_url = "https://bedrock.{}.amazonaws.com"
                              .format(br_region), config = my_boto3_config)

# Create the Amazon Bedrock runtime client
bedrock_rt_client = boto3.client("bedrock-runtime", region_name = br_region, config = my_boto3_config)

# Create the Agents for Amazon Bedrock client
bedrock_agt_client = boto3.client("bedrock-agent", region_name = br_region, config = my_boto3_config)

# Create the Agents for Amazon Bedrock runtime client
bedrock_agt_rt_client = boto3.client("bedrock-agent-runtime", region_name = br_region, config = my_boto3_config)

List all Anthropic Claude 3 LLMs on Amazon Bedrock that are offered through the On-Demand throughput pricing model. This will help you pick the model-ids that you will use further down in this notebook.

For more information on this, refer to [this page](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html#model-ids-arns).

In [None]:
# Note: 'print_claude_3_llm_info' available through ./scripts/helper_functions.py
print_claude_3_llm_info(bedrock_client, 'ON_DEMAND')

Create the common objects.

In [None]:
# Specify the Anthropic Claude 3 model-id
#model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
model_id = "anthropic.claude-3-haiku-20240307-v1:0"

# Specify the invocation interval between Bedrock calls
invocation_interval_in_secs = 2

# Specify the URL of the data to be used for RAG
rag_data_url = "https://arxiv.org/pdf/2401.15884.pdf"

# Specify the Amazon S3 bucket key prefix
# Note: The bucket name will be automatically determined
# from the Knowledge Base data source when executing cells
# further down in this notebook
s3_key_prefix = "rag_data"

# Specify the name and location of the prompt templates
prompt_templates_dir = os.path.join(os.getcwd(), "prompt_templates")
query_result_relevancy_user_prompt_template = 'query_result_relevancy_user_prompt_template.txt'
query_result_relevancy_system_prompt_template = 'query_result_relevancy_system_prompt_template.txt'
final_system_prompt_template = 'final_system_prompt_template.txt'
final_user_prompt_template = 'final_user_prompt_template.txt'

# Specify and create the required output directories
rag_data_dir = os.path.join(os.getcwd(), "rag_data")
os.makedirs(rag_data_dir, exist_ok = True)
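
The prompt template files themselves are not reproduced here. As a minimal sketch, assuming each file is plain text with `{name}` placeholders (the same default syntax LangChain's prompt templates use), loading and filling one could look like the following. The placeholder names `query` and `query_result`, the demo file name, and both function names are hypothetical.

```python
import os
import tempfile


def load_prompt_template(templates_dir, file_name):
    """Read a prompt template file and return its text."""
    with open(os.path.join(templates_dir, file_name), 'r', encoding='utf-8') as f:
        return f.read()


def render_prompt(template_text, **values):
    """Fill the {name} placeholders; raises KeyError if a value is missing."""
    return template_text.format(**values)


# Demo with a throwaway template so the sketch runs on its own
with tempfile.TemporaryDirectory() as demo_dir:
    demo_file = 'demo_user_prompt_template.txt'
    with open(os.path.join(demo_dir, demo_file), 'w', encoding='utf-8') as f:
        f.write('Query: {query}\nQuery result: {query_result}')
    demo_prompt = render_prompt(load_prompt_template(demo_dir, demo_file),
                                query='What are hallucinations in LLMs?',
                                query_result='(retrieved text)')
print(demo_prompt)
```

Note that `str.format` treats every brace as a placeholder, so a template containing literal braces (for example, a JSON snippet) would need them doubled.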

###  I. Get Knowledge Base details <a id='Get%20Knowledge%20Base%20details'></a>

<div class="alert alert-block alert-info">
<b>Note:</b> For the purpose of running this notebook, a new Knowledge Base (KB) must be created in the same AWS Region as Amazon Bedrock that was configured in Step 1E of this notebook.
<p>This KB must meet the following requirements:
    <ul>
        <li>KB must be in 'ACTIVE' status.</li>
        <li>Must have an Amazon OpenSearch Serverless collection as the vector index.</li>
        <li>Must have an Amazon S3 bucket as the data source.</li>
        <li>Data source must be in 'AVAILABLE' status.</li>
        <li>Embeddings model must be 'Titan Embeddings G1 - Text'.</li>
    </ul>
</p>
<p>If you are running this notebook as part of a workshop session, by default, a KB that meets all these requirements will be pre-created and ready to use.</p>
</div>

<div class="alert alert-block alert-danger">
    <b>Note:</b> If you are running this notebook outside of a workshop session, then, you must create a KB as specified above. Otherwise, this notebook will fail. You can follow the procedure described <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html">here</a> to create a KB.
</div>

If you have created a KB that meets all the requirements mentioned above and want to use it, then enter the id of that KB in the code cell below. If not, this notebook will automatically retrieve the id of the first available KB that meets all the requirements when you run the cells further down.

In [None]:
kb_id = ''

Run the following cell to verify the specified KB or to retrieve the first available KB that meets all the requirements. In the process, retrieve the S3 bucket name that is set as the data source for this KB.

In [None]:
# Note: 'get_kb_that_meets_requirements' available through ./scripts/helper_functions.py
kb_id, ds_id, s3_bucket_name, aoss_collection_arn = get_kb_that_meets_requirements(bedrock_agt_client, kb_id)

###  J. Get Bedrock Agent details <a id='Get%20Bedrock%20Agent%20details'></a>

<div class="alert alert-block alert-info">
<b>Note:</b> For the purpose of running this notebook, a new Bedrock Agent along with an agent alias must be created in the same AWS Region as Amazon Bedrock that was configured in Step 1E of this notebook. This Agent will be standalone and not connected to the Knowledge Base (KB).
<p>This agent and its alias must meet the following requirements:
    <ul>
        <li>Both the agent and its alias must be in 'PREPARED' status.</li>
    </ul>
</p>
<p>
In addition, the agent must have an <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-action-create.html">Action Group</a> associated with it. This Action Group must have an AWS Lambda function associated with it to perform the web search.
</p>
<p>
Although not mandatory, it is strongly recommended to use Anthropic Claude 3 as the LLM for this agent. If you use any other LLM, make sure the model access is enabled by following the instructions mentioned in prior steps.
</p>
<p>If you are running this notebook as part of a workshop session, by default, an agent along with its alias that meet all these requirements will be pre-created and ready to use.</p>
</div>

<div class="alert alert-block alert-danger">
    <b>Note:</b> If you are running this notebook outside of a workshop session, then you must create a Bedrock Agent along with its alias as specified above. Otherwise, this notebook will fail. You can follow the procedure described <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-create.html">here</a> to create a Bedrock Agent and the procedure described <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/agents-alias-manage.html">here</a> to create an agent alias.
</div>

If you have created a Bedrock Agent and its alias that meet all the requirements mentioned above and want to use them, then enter their ids in the code cell below. If not, this notebook will automatically retrieve the id of the first available Bedrock Agent and its alias that meet all the requirements when you run the cells further down.

In [None]:
br_agent_id = ''
br_agent_version = ''
br_agent_alias_id = ''

Run the following cell to verify the specified Bedrock Agent (and its alias) or to retrieve the first available Bedrock Agent (and its alias) that meet all the requirements.

In [None]:
# Note: 'get_br_agent_that_meets_requirements' available through ./scripts/helper_functions.py
br_agent_id, br_agent_version = get_br_agent_that_meets_requirements(bedrock_agt_client, br_agent_id, br_agent_version)
if len(br_agent_id) > 0:
    # Note: 'get_br_agent_alias_that_meets_requirements' available through ./scripts/helper_functions.py
    br_agent_alias_id = get_br_agent_alias_that_meets_requirements(bedrock_agt_client, br_agent_id, br_agent_alias_id)

## 2. Load data to Knowledge Base <a id ='Load%20data%20to%20Knowledge%20Base'> </a>

![](./images/load_to_KB.png)

###  A. Step 0a: Download data <a id='Load%20to%20KB%20Step0a'></a>

Download the data, i.e. the Corrective RAG research paper specified in <i>rag_data_url</i>, as a PDF file to a local directory.

In [None]:
# Note: 'download_file' available through ./scripts/helper_functions.py
downloaded_file_name = download_file(rag_data_url, rag_data_dir)

###  B. Step 0b: Copy downloaded data to S3 <a id='Load%20to%20KB%20Step0b'></a>

Copy the downloaded PDF file to an Amazon S3 bucket. This bucket will be the [data source](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-ds.html) for the Knowledge Base (KB).

In [None]:
# Note: 'upload_to_s3' available through ./scripts/helper_functions.py
upload_to_s3(rag_data_dir, s3_bucket_name, s3_key_prefix)
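
The `upload_to_s3` helper is not shown here. A minimal sketch of the copy step, assuming it walks the local directory and uploads each file under the key prefix with the S3 client's `upload_file` call, might look like this. The function name `upload_dir_to_s3`, the explicit client parameter, and the recording fake client are assumptions for illustration.

```python
import os
import tempfile


def upload_dir_to_s3(s3_client, local_dir, bucket, key_prefix):
    """Upload every file under local_dir to s3://bucket/key_prefix/ and return the keys."""
    uploaded_keys = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            # Build the object key from the path relative to local_dir
            relative_path = os.path.relpath(local_path, local_dir).replace(os.sep, '/')
            key = "{}/{}".format(key_prefix, relative_path)
            s3_client.upload_file(local_path, bucket, key)
            uploaded_keys.append(key)
    return uploaded_keys


class FakeS3Client:
    """Stand-in that records calls instead of contacting Amazon S3."""
    def __init__(self):
        self.calls = []

    def upload_file(self, filename, bucket, key):
        self.calls.append((os.path.basename(filename), bucket, key))


# Demo against the fake client; pass a real boto3 S3 client in practice
fake_s3 = FakeS3Client()
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, 'paper.pdf'), 'wb').close()
    demo_keys = upload_dir_to_s3(fake_s3, demo_dir, 'my-bucket', 'rag_data')
print(demo_keys)
```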

###  C. Steps 0c to 0e: Sync to Knowledge Base <a id='Load%20to%20KB%20Steps0c%20to%200e'></a>

Trigger the [sync operation](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-ingest.html) on that Knowledge Base to load the PDF file from that data source (S3) to the [vector index](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html) configured for that Knowledge Base (KB). In the process of loading, the data will be chunked and converted to vectors using the embeddings model specified for that KB. The status of the ingestion is checked every 5 seconds until it fails or completes.

In [None]:
# Note: 'sync_to_kb' available through ./scripts/helper_functions.py
sync_to_kb(bedrock_agt_client, ds_id, kb_id, 'Sync PDF from S3 to vector index')
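
The start-and-poll pattern described above can be sketched as follows. This is not the `sync_to_kb` helper itself; it is a minimal illustration that assumes the `start_ingestion_job` and `get_ingestion_job` calls of the Agents for Amazon Bedrock client. The function name, the `max_polls` guard, and the fake client used for the demo are hypothetical.

```python
import time

TERMINAL_STATUSES = ('COMPLETE', 'FAILED')


def sync_data_source(agent_client, kb_id, ds_id, description,
                     poll_secs=5, max_polls=120, sleep=time.sleep):
    """Start an ingestion job and poll until it completes, fails, or polling runs out."""
    job = agent_client.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=ds_id, description=description
    )['ingestionJob']
    status = job['status']
    polls = 0
    while status not in TERMINAL_STATUSES and polls < max_polls:
        sleep(poll_secs)
        polls += 1
        status = agent_client.get_ingestion_job(
            knowledgeBaseId=kb_id, dataSourceId=ds_id,
            ingestionJobId=job['ingestionJobId']
        )['ingestionJob']['status']
    return status


class FakeAgentClient:
    """Stand-in that replays a fixed sequence of job statuses."""
    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def start_ingestion_job(self, **kwargs):
        return {'ingestionJob': {'ingestionJobId': 'job-1', 'status': next(self._statuses)}}

    def get_ingestion_job(self, **kwargs):
        return {'ingestionJob': {'ingestionJobId': 'job-1', 'status': next(self._statuses)}}


# Demo without AWS access; the no-op sleep keeps it instant
final_status = sync_data_source(FakeAgentClient(['STARTING', 'IN_PROGRESS', 'COMPLETE']),
                                'kb-id', 'ds-id', 'demo sync', sleep=lambda secs: None)
print(final_status)
```

Capping the number of polls avoids an endless loop if a job never reaches a terminal status; the caller can treat any non-terminal return value as a timeout.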

<div class="alert alert-block alert-info">
    <b>Note:</b> Even if the sync operation is in 'COMPLETE' status, it may take 30 to 60 seconds for the data to be available for reading from the vector index.
</div>

## 3. Scenario 1 - match found in KB <a id ='Match%20found%20in%20KB'> </a>

![](./images/scenario_1.png)

###  A. Step 1: User query <a id='Scenario%201%20Step%201'></a>

Set the user query along with configurations for the max query results and relevancy score threshold.

In [None]:
scenario_1_query = 'What are hallucinations in LLMs?'
max_query_results = 10
relevancy_score_threshold = 0.7

###  B. Steps 2a and 2b: Query lookup <a id='Scenario%201%20Steps%202a%20and%202b'></a>

Perform a semantic search on the vector store in the Knowledge Base (KB) for the user query and retrieve the results. The following cell shows you two ways to do this - one using the LangChain API and the other using the boto3 API.

In [None]:
# Use the LangChain API
# Note: 'retrieve_from_kb_using_lc' available through ./scripts/helper_functions.py
scenario_1_query_results = retrieve_from_kb_using_lc(bedrock_agt_rt_client, kb_id, scenario_1_query, max_query_results)

# Use the boto3 API
# Note: 'retrieve_from_kb_using_boto3' available through ./scripts/helper_functions.py
#scenario_1_query_results = retrieve_from_kb_using_boto3(bedrock_agt_rt_client, kb_id, scenario_1_query, max_query_results)

# Print the retrieval count
logging.info("Retrieved {} result(s) for the specified query '{}' from the Knowledge Base '{}'.".format(len(scenario_1_query_results),
                                                                                                        scenario_1_query,
                                                                                                        kb_id))
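
For reference, the boto3 route can be sketched as below, assuming the `retrieve` call of the Agents for Amazon Bedrock runtime client. This is an illustration rather than the actual `retrieve_from_kb_using_boto3` helper: the `query_result` key mirrors the one used later in this notebook, while the function name, the `kb_score` key, and the fake client are assumptions.

```python
def retrieve_from_kb(agent_rt_client, kb_id, query, max_results):
    """Run a semantic search against the KB and return a list of result dicts."""
    response = agent_rt_client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={'text': query},
        retrievalConfiguration={
            'vectorSearchConfiguration': {'numberOfResults': max_results}
        },
    )
    # Keep the chunk text and the vector-search score for each hit
    return [
        {'query_result': item['content']['text'], 'kb_score': item.get('score')}
        for item in response.get('retrievalResults', [])
    ]


class FakeAgentRuntimeClient:
    """Stand-in that returns canned hits instead of querying a real KB."""
    def retrieve(self, **kwargs):
        limit = kwargs['retrievalConfiguration']['vectorSearchConfiguration']['numberOfResults']
        hits = [
            {'content': {'text': 'chunk one'}, 'score': 0.61},
            {'content': {'text': 'chunk two'}, 'score': 0.47},
        ]
        return {'retrievalResults': hits[:limit]}


# Demo without AWS access
demo_results = retrieve_from_kb(FakeAgentRuntimeClient(), 'kb-id', 'demo query', 1)
print(demo_results)
```

The vector-search score returned by the KB is a similarity measure; the LLM-based relevancy score computed in the next step is a separate value.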

###  C. Steps 3a and 3b: Determine the query results relevancy <a id='Scenario%201%20Steps%203a%20and%203b'></a>

For each retrieved result, invoke the LLM to determine how relevant the query result is to the user query and capture the results. The following cell uses the LangChain API for invoking the LLM (Anthropic Claude 3 on Amazon Bedrock).

<div class="alert alert-block alert-info">
    <b>Note:</b> The relevancy between the specified query and a query result is computed by passing these two values and asking the LLM a simple question - <i>On a scale of 0 to 1, with 0 being irrelevant and 1 being the most relevant, how relevant is the query result to the query?</i>
</div>

<div class="alert alert-block alert-info">
    <b>Note:</b> If you want to restrict Amazon Bedrock invocations to a certain number of requests per second, you can change the <i>invocation_interval_in_secs</i> variable as required. The notebook calls sleep() between invocations, which controls the request rate.
</div>

In [None]:
# Invoke the LLM to find the relevancy between the query and each retrieved result and capture the response
for scenario_1_query_result in scenario_1_query_results:
    # Note: 'instruct_llm_to_find_relevancy' available through ./scripts/helper_functions.py
    scenario_1_query_result['relevancy_score'] = (instruct_llm_to_find_relevancy(model_id,
                                                                                 bedrock_rt_client,
                                                                                 prompt_templates_dir,
                                                                                 query_result_relevancy_system_prompt_template,
                                                                                 query_result_relevancy_user_prompt_template,
                                                                                 scenario_1_query,
                                                                                 scenario_1_query_result['query_result']))
    # Check for invocation interval, print the log and sleep
    if (invocation_interval_in_secs > 0):
        logging.info("Sleeping for {} second(s)...".format(invocation_interval_in_secs))
        time.sleep(invocation_interval_in_secs)
        logging.info("Completed sleeping.")
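
Because the LLM replies in free text, the score has to be extracted from its response before it can be compared with a threshold. The helper's actual parsing is not shown, so the following is only a minimal sketch under the assumption that the reply contains the score as the first number. The function name `parse_relevancy_score` and the fallback default are hypothetical.

```python
import re

# Matches an optionally signed integer or decimal number
_NUMBER = re.compile(r'[-+]?\d*\.?\d+')


def parse_relevancy_score(llm_text, default=0.0):
    """Return the first number in llm_text if it lies in [0, 1], else default."""
    match = _NUMBER.search(llm_text)
    if match is None:
        return default
    value = float(match.group())
    # Treat out-of-range values as an unusable answer
    return value if 0.0 <= value <= 1.0 else default


print(parse_relevancy_score('Relevancy: 0.85'))
print(parse_relevancy_score('I cannot tell.'))
```

Taking the first number is fragile if the model restates the scale before answering, so this approach works best when the prompt asks for the number only.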

###  D. Steps 4a through 5: Process to completion <a id='Scenario%201%20Steps%204a%20through%205'></a>

Determine if at least one of the query results is relevant to the user query based on whether it meets or exceeds the specified relevancy threshold. In scenario 1, at least one result is expected to meet or exceed the threshold. So, invoke the LLM (Anthropic Claude 3 on Amazon Bedrock) using the LangChain API for the final generation and print the results.

In [None]:
# Analyze the relevancy scores
# Note: 'print_results_relevancy_score_stats', 'filter_query_results_by_threshold' and 'process_final_prompt' available through ./scripts/helper_functions.py
print_results_relevancy_score_stats(scenario_1_query, scenario_1_query_results, relevancy_score_threshold)

# Filter the query results that meet or exceed the specified threshold relevancy score
scenario_1_filtered_query_results = filter_query_results_by_threshold(scenario_1_query_results, relevancy_score_threshold)
# Check the filtered query results and process
if len(scenario_1_filtered_query_results) > 0:
    logging.info("At least one of the relevancy scores meets or exceeds the specified threshold of '{}'.".format(relevancy_score_threshold))
    # Construct the final prompt with the filtered query results and invoke the LLM
    scenario_1_final_response = process_final_prompt(model_id, bedrock_rt_client, prompt_templates_dir,
                                                     final_system_prompt_template, final_user_prompt_template,
                                                     scenario_1_query, scenario_1_filtered_query_results)
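else:
    logging.warning("None of the relevancy scores meets or exceeds the specified threshold of '{}'.".format(relevancy_score_threshold))
    # Define the response so that the next cell does not raise a NameError
    scenario_1_final_response = None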

Print the final response for the query.

In [None]:
logging.info("\n\nQUERY: {}\n\nRESPONSE: {}".format(scenario_1_query, scenario_1_final_response))

## 4. Scenario 2 - match not found in KB <a id ='Match%20not%20found%20in%20KB'> </a>

![](./images/scenario_2.png)

###  A. Step 1: User query <a id='Scenario%202%20Step%201'></a>

Set the user query along with configurations for the max query results and relevancy score threshold.

In [None]:
scenario_2_query = 'When is the 2024 Summer Olympics?'
max_query_results = 10
relevancy_score_threshold = 0.7

###  B. Steps 2a and 2b: Query lookup <a id='Scenario%202%20Steps%202a%20and%202b'></a>

Perform a semantic search on the vector store in the Knowledge Base (KB) for the user query and retrieve the results. The following cell shows you two ways to do this - one using the LangChain API and the other using the boto3 API.
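
For reference, the boto3 route roughly corresponds to the `retrieve` operation of the `bedrock-agent-runtime` client. The following is a minimal sketch under stated assumptions, not the code in `./scripts/helper_functions.py`: the function name `retrieve_sketch`, the `retrieval_score` key and the stub client are hypothetical, and the stub only exists so that the sketch runs without AWS credentials.

```python
from typing import Any, Dict, List


def retrieve_sketch(agent_runtime_client: Any, kb_id: str, query: str,
                    max_results: int) -> List[Dict[str, Any]]:
    """Run a vector search against a Knowledge Base and flatten the results."""
    response = agent_runtime_client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={'text': query},
        retrievalConfiguration={
            'vectorSearchConfiguration': {'numberOfResults': max_results}
        }
    )
    # Keep the passage text and the vector search score of each hit
    return [{'query_result': item['content']['text'],
             'retrieval_score': item.get('score')}
            for item in response.get('retrievalResults', [])]


class _StubAgentRuntimeClient:
    """Hypothetical stand-in for the boto3 client, used only for this demo."""

    def retrieve(self, **kwargs: Any) -> Dict[str, Any]:
        self.last_request = kwargs
        return {'retrievalResults': [
            {'content': {'text': 'sample passage'}, 'score': 0.42}
        ]}


stub_client = _StubAgentRuntimeClient()
results = retrieve_sketch(stub_client, 'KB_ID_PLACEHOLDER', 'sample query', 10)
print(results)
```

With a real client, you would pass the object returned by `boto3.client('bedrock-agent-runtime')` in place of the stub.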

In [None]:
# Use the LangChain API
# Note: 'retrieve_from_kb_using_lc' available through ./scripts/helper_functions.py
scenario_2_query_results = retrieve_from_kb_using_lc(bedrock_agt_rt_client, kb_id, scenario_2_query, max_query_results)

# Use the boto3 API
# Note: 'retrieve_from_kb_using_boto3' available through ./scripts/helper_functions.py
#scenario_2_query_results = retrieve_from_kb_using_boto3(bedrock_agt_rt_client, kb_id, scenario_2_query, max_query_results)

# Print the retrieval count
logging.info("Retrieved {} result(s) for the specified query '{}' from the Knowledge Base '{}'.".format(len(scenario_2_query_results),
                                                                                                        scenario_2_query,
                                                                                                        kb_id))

###  C. Steps 3a and 3b: Determine the query results relevancy <a id='Scenario%202%20Steps%203a%20and%203b'></a>

For each retrieved result, invoke the LLM to determine how relevant the query result is to the user query and capture the results. The following cell uses the LangChain API for invoking the LLM (Anthropic Claude 3 on Amazon Bedrock).

<div class="alert alert-block alert-info">
    <b>Note:</b> The relevancy between the specified query and a query result is computed by passing these two values and asking the LLM a simple question - <i>In a scale of 0 to 1, with 0 being irrelevant and 1 being the most relevant, how relevant is the query result to the query?</i>
</div>

<div class="alert alert-block alert-info">
    <b>Note:</b> If you want to restrict Amazon Bedrock invocations to a certain number of requests per second, you can change the <i>invocation_interval_in_secs</i> variable as required. This will execute the sleep() function between calls, thereby controlling the requests per second.
</div>
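
Since the LLM returns its answer as text, the reply has to be converted into a number before it can be compared with the threshold. The sketch below shows one possible way to do that. It is an assumption for illustration, not the logic inside `instruct_llm_to_find_relevancy`: the function name `parse_relevancy_score` is hypothetical, and it assumes the prompt instructs the model to reply with the score first.

```python
import re


def parse_relevancy_score(llm_text: str, default: float = 0.0) -> float:
    """Extract the first number from the LLM reply and clamp it to [0, 1]."""
    match = re.search(r'[-+]?\d*\.?\d+', llm_text)
    if match is None:
        # No number found; fall back to treating the result as irrelevant
        return default
    return min(1.0, max(0.0, float(match.group())))


print(parse_relevancy_score('0.85'))
print(parse_relevancy_score('Score: 1.7'))
print(parse_relevancy_score('n/a'))
```

Clamping keeps an out-of-range reply from distorting the threshold comparison, and the default means an unparseable reply never passes the filter.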

In [None]:
# Invoke the LLM to find the relevancy between the query and each retrieved result and capture the response
for scenario_2_query_result in scenario_2_query_results:
    # Note: 'instruct_llm_to_find_relevancy' available through ./scripts/helper_functions.py
    scenario_2_query_result['relevancy_score'] = (instruct_llm_to_find_relevancy(model_id,
                                                                                 bedrock_rt_client,
                                                                                 prompt_templates_dir,
                                                                                 query_result_relevancy_system_prompt_template,
                                                                                 query_result_relevancy_user_prompt_template,
                                                                                 scenario_2_query,
                                                                                 scenario_2_query_result['query_result']))
    # Check for invocation interval, print the log and sleep
    if (invocation_interval_in_secs > 0):
        logging.info("Sleeping for {} second(s)...".format(invocation_interval_in_secs))
        time.sleep(invocation_interval_in_secs)
        logging.info("Completed sleeping.")

###  D. Steps 4a through 7: Process to completion <a id='Scenario%202%20Steps%204a%20through%207'></a>

Determine whether at least one of the query results is relevant to the user query, that is, whether its relevancy score meets or exceeds the specified relevancy threshold. In scenario 2, none of the results will meet or exceed the threshold, so print a warning and fall back to a web search. Then, use the results from the web search and invoke the LLM (Anthropic Claude 3 on Amazon Bedrock) for the final generation and print the results.

In this notebook, for the purpose of performing a web search, we will search Wikipedia using the [Wikimedia API](https://api.wikimedia.org/). This search will be performed through an AWS Lambda function that will be invoked by the Bedrock Agent.

<div class="alert alert-block alert-info">
    <b>Note: </b>Wikipedia permits attribution through the hyperlink or URL to the article(s) referenced.
</div>
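
To give an idea of what invoking the Bedrock Agent involves, here is a minimal sketch based on the `invoke_agent` operation of the `bedrock-agent-runtime` client, which streams the reply back as chunks. It is not the `perform_web_search` helper from `./scripts/helper_functions.py`: the function name `ask_agent_sketch`, the placeholder IDs and the stub client are hypothetical, and the stub only exists so that the sketch runs without AWS credentials.

```python
import uuid
from typing import Any, Dict


def ask_agent_sketch(agent_runtime_client: Any, agent_id: str,
                     agent_alias_id: str, query: str) -> str:
    """Send the query to a Bedrock Agent and join the streamed reply chunks."""
    response = agent_runtime_client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=str(uuid.uuid4()),  # a fresh session for every call
        inputText=query
    )
    parts = []
    for event in response['completion']:
        # Only 'chunk' events carry reply text; other events are skipped
        chunk = event.get('chunk')
        if chunk is not None:
            parts.append(chunk['bytes'].decode('utf-8'))
    return ''.join(parts)


class _StubAgentRuntimeClient:
    """Hypothetical stand-in for the boto3 client, used only for this demo."""

    def invoke_agent(self, **kwargs: Any) -> Dict[str, Any]:
        return {'completion': [
            {'chunk': {'bytes': b'Hello, '}},
            {'trace': {}},
            {'chunk': {'bytes': b'world.'}},
        ]}


reply = ask_agent_sketch(_StubAgentRuntimeClient(), 'AGENT_ID_PLACEHOLDER',
                         'ALIAS_ID_PLACEHOLDER', 'sample query')
print(reply)
```

In this notebook, the agent itself decides to call the AWS Lambda function that searches Wikipedia, so the caller only needs to send the query and read the streamed completion.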

In [None]:
# Analyze the relevancy scores
# Note: 'print_results_relevancy_score_stats' and 'filter_query_results_by_threshold' available through ./scripts/helper_functions.py
print_results_relevancy_score_stats(scenario_2_query, scenario_2_query_results, relevancy_score_threshold)

# Filter the query results that meet or exceed the specified threshold relevancy score
scenario_2_filtered_query_results = filter_query_results_by_threshold(scenario_2_query_results, relevancy_score_threshold)
# Check the filtered query results and process
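if len(scenario_2_filtered_query_results) > 0:
    logging.info("At least one of the relevancy scores meets or exceeds the specified threshold of '{}'.".format(relevancy_score_threshold))
    # Not expected in scenario 2; set a placeholder so the print cell below does not raise a NameError
    scenario_2_final_response = None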
else:
    logging.warning("None of the relevancy scores meets or exceeds the specified threshold of '{}'.".format(relevancy_score_threshold))
    # Perform web search through Agents for Amazon Bedrock and retrieve the response
    scenario_2_final_response = perform_web_search(br_agent_alias_id, br_agent_id, bedrock_agt_rt_client, scenario_2_query)
    logging.info("Prompt response: {}".format(scenario_2_final_response))

Print the final response for the query.

In [None]:
logging.info("\n\nQUERY: {}\n\nRESPONSE: {}".format(scenario_2_query, scenario_2_final_response))

## 5. Associate Bedrock Agent with KB (Optional) <a id='Associate%20Bedrock%20Agent%20with%20KB%20(Optional)'></a>

Although not required for this notebook, if you want to explore how to associate a Bedrock Agent with a Knowledge Base, follow the steps mentioned [here](https://docs.aws.amazon.com/bedrock/latest/userguide/agents-kb-add.html).

## 6. Cleanup <a id='Cleanup'></a>

As a best practice, you should delete AWS resources that are no longer required. This will help you avoid incurring unnecessary costs.

<div class="alert alert-block alert-info">
<b>Note:</b> If you are running this notebook as part of a workshop session, by default, all resources will be cleaned up at the end of the session. If you are running this notebook outside of a workshop session, you can clean up the resources associated with this notebook by uncommenting the following code cell and running it.
</div>

Running the following cell will delete the following resources:
* Knowledge Base.
* Amazon OpenSearch Serverless Collection.
* The file that was uploaded to S3; not the S3 bucket itself.

In [None]:
'''
# Note: 'delete_kb' available through ./scripts/helper_functions.py
delete_kb(bedrock_agt_client, kb_id)

# Note: 'delete_aoss_collection' available through ./scripts/helper_functions.py
delete_aoss_collection(aoss_client, collection_id)

# Note: 'delete_s3_object' available through ./scripts/helper_functions.py
delete_s3_object(s3_client, s3_bucket_name, s3_key_prefix + '/' + downloaded_file_name)
'''

## 7. Conclusion <a id='Conclusion'></a>

We have now seen how to build a simplified Corrective RAG based assistant using Amazon Bedrock. This is an advanced RAG technique that works on improving the quality of the retrieved documents prior to the generation process. We have also seen how Amazon Bedrock, with its LLMs, Knowledge Bases (KBs) and Agents, makes it easy for you to build generative AI applications.

## 8. Frequently Asked Questions (FAQs) <a id='FAQs'></a>

**Q: What AWS services are used in this notebook?**

Amazon Bedrock, Amazon OpenSearch Serverless, Amazon S3, AWS Lambda, AWS Identity and Access Management (IAM), Amazon CloudWatch, and Amazon SageMaker Notebook instance (or) Amazon SageMaker Studio Notebook depending on what you use to run the notebook.

**Q: Will Amazon Bedrock capture and store my data?**

Amazon Bedrock doesn't use your prompts and continuations to train any AWS models or distribute them to third parties. Your training data isn't used to train the base Amazon Titan models or distributed to third parties. Other usage data, such as usage timestamps, logged account IDs, and other information logged by the service, is also not used to train the models.

Amazon Bedrock uses the fine tuning data you provide only for fine tuning an Amazon Titan model. Amazon Bedrock doesn't use fine tuning data for any other purpose, such as training base foundation models.

Each model provider has an escrow account that they upload their models to. The Amazon Bedrock inference account has permissions to call these models, but the escrow accounts themselves don't have outbound permissions to Amazon Bedrock accounts. Additionally, model providers don't have access to Amazon Bedrock logs or access to customer prompts and continuations.

Amazon Bedrock doesn’t store or log your data in its service logs.

**Q: What models are supported by Amazon Bedrock?**

Go [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).

**Q: What is the difference between On-demand and Provisioned Throughput in Amazon Bedrock?**

With the On-Demand mode, you only pay for what you use, with no time-based term commitments. For text generation models, you are charged for every input token processed and every output token generated. For embeddings models, you are charged for every input token processed. A token consists of a few characters and is the basic unit that a model uses to understand user input and prompts and to generate results. For image generation models, you are charged for every image generated.

With the Provisioned Throughput mode, you can purchase model units for a specific base or custom model. The Provisioned Throughput mode is primarily designed for large consistent inference workloads that need guaranteed throughput. Custom models can only be accessed using Provisioned Throughput. A model unit provides a certain throughput, which is measured by the maximum number of input or output tokens processed per minute. With this Provisioned Throughput pricing, charged by the hour, you have the flexibility to choose between 1-month or 6-month commitment terms.

**Q: Where can I find customer references for Amazon Bedrock?**

Go [here](https://aws.amazon.com/bedrock/testimonials/).

**Q: Where can I find resources for prompt engineering?**

[Prompt Engineering Guide](https://www.promptingguide.ai/).

**Q: Where can I learn more about Corrective RAG?**

Go [here](https://arxiv.org/abs/2401.15884).

**Q: Is LangChain mandatory to use Amazon Bedrock?**

No. You can interact with Amazon Bedrock using the [Bedrock API](https://docs.aws.amazon.com/bedrock/latest/APIReference/welcome.html) or language-specific [AWS SDKs](https://aws.amazon.com/developer/tools/). 

**Q: How do I get started with LangChain?**

Go [here](https://python.langchain.com/docs/get_started/introduction).

**Q: Where can I find pricing information for the AWS services used in this notebook?**

- Amazon Bedrock pricing - go [here](https://aws.amazon.com/bedrock/pricing/).
- Amazon OpenSearch Serverless pricing - go [here](https://aws.amazon.com/opensearch-service/pricing/) and navigate to the <i>Serverless</i> section.
- Amazon S3 pricing - go [here](https://aws.amazon.com/s3/pricing/).
- AWS Lambda pricing - go [here](https://aws.amazon.com/lambda/pricing/).
- AWS Identity and Access Management (IAM) pricing - free.
- Amazon CloudWatch pricing - go [here](https://aws.amazon.com/cloudwatch/pricing/).
- Amazon SageMaker Notebook instance (or) Amazon SageMaker Studio Notebook pricing - go [here](https://aws.amazon.com/sagemaker/pricing/).