<center><img src="images/2024_reInvent_Logo_wDate_Black_V3.png" alt="drawing" width="400" style="background-color:white; padding:1em;" /></center> <br/>

# <a name="0">re:Invent 2024 | Lab 1: Build your RAG powered chatbot  </a>
## <a name="0">Build a chatbot with Knowledge Bases and Guardrails to detect and remediate hallucinations </a>

## Lab Overview
In this lab, you will:
1. Take a deeper look at which LLM parameters influence or control for model hallucinations
2. Set up Retrieval Augmented Generation and understand how it can control for hallucinations
3. Apply contextual grounding in Amazon Bedrock Guardrails to intervene when a model hallucinates
4. Use RAGAS evaluation and understand which metrics help us measure hallucinations

## Dataset
For this workshop, we will use the [Bedrock User Guide](https://docs.aws.amazon.com/pdfs/bedrock/latest/userguide/bedrock-ug.pdf) available as a PDF file.

## Use-Case Overview
In this lab, we want to develop a chatbot that can answer questions about Amazon Bedrock as factually as possible. We will set up Retrieval Augmented Generation using [Amazon Bedrock Knowledge Bases](https://aws.amazon.com/bedrock/knowledge-bases/) and apply [Amazon Bedrock Guardrails](https://aws.amazon.com/bedrock/guardrails/) to intervene when hallucinations are detected.


#### Lab Sections

This lab notebook has the following sections:
    
Please work through this notebook from top to bottom and don't skip sections, as later cells depend on code defined in earlier ones and will raise errors without it.


----

# Star the GitHub repository for future reference

In [1]:
%%html

<a class="github-button" href="https://github.com/aws-samples/responsible_ai_aim325_reduce_hallucinations_for_genai_apps" data-color-scheme="no-preference: light; light: light; dark: dark;" data-icon="octicon-star" data-size="large" data-show-count="true" aria-label="Star Reduce Hallucinations workshop on GitHub">Star</a>
<script async defer src="https://buttons.github.io/buttons.js"></script>

# Environment Setup

In [2]:
#%pip install --upgrade --quiet pip sagemaker boto3 ragas==0.1.7 pydantic==2.6.1 langchain-core==0.1.40 langchain langchain-aws

In [None]:
#%%capture
!pip3 install -r ../requirements.txt --quiet

In [4]:
!pip list | grep datasets

datasets                                3.1.0


In [5]:
# restart kernel
#from IPython.core.display import HTML
#HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [6]:
import time
import os
import json
import boto3
from time import gmtime, strftime, sleep
import pprint
import random
import zipfile
#from retrying import retry
from rag_setup.create_kb_utils import *
import warnings
warnings.filterwarnings('ignore')
from botocore.config import Config

import numpy as np  
import pandas as pd 
import sagemaker
from botocore.exceptions import ClientError

(sagemaker.__version__,boto3.__version__)





sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


('2.227.0', '1.35.15')

## Set constants

In [7]:
# Get some variables you need to interact with SageMaker service
boto_session = boto3.Session()
region = boto_session.region_name
bucket_name = sagemaker.Session().default_bucket()
bucket_prefix = "reduce-hallucinations-in-genai-apps"  
sm_session = sagemaker.Session()
sm_client = boto_session.client("sagemaker")
sm_role = sagemaker.get_execution_role()

initialized = True

print(sm_role)
print(bucket_name)

arn:aws:iam::996757723911:role/cfn-SageMakerExecutionRole-bxMsce3pMtli
sagemaker-us-west-2-996757723911


In [8]:
embedding_model_id="amazon.titan-embed-text-v2:0"
llm_model_id="anthropic.claude-3-sonnet-20240229-v1:0"

In [9]:
# Store some variables to keep the value between the notebooks
%store bucket_name
%store bucket_prefix
%store sm_role
%store region
%store initialized

Stored 'bucket_name' (str)
Stored 'bucket_prefix' (str)
Stored 'sm_role' (str)
Stored 'region' (str)
Stored 'initialized' (bool)


In [10]:
#test if bedrock model access has been enabled 
input_prompt = "Who was the first person to land on the sun?"
test_llm_call(input_prompt)

  response = llm(messages)


"No one has ever landed on the sun. The sun is a star with extremely hot temperatures and harsh conditions that make landing on its surface impossible with current technology.\n\nThe sun's surface temperature is around 5,500°C (9,940°F). Its powerful gravitational pull and lack of a solid surface also make landing unfeasible. Any spacecraft would burn up long before reaching the sun's surface due to the intense heat and radiation.\n\nSpace exploration has focused on studying the sun from a safe distance using spacecraft like the Parker Solar Probe, which has flown through the sun's outer atmosphere but not landed. Landing a crew or rover on the sun remains the stuff of science fiction for now. The extreme conditions simply make it impossible to achieve with present technology and materials."

# 1. Chat with Anthropic Claude 3 Sonnet through Bedrock

In [11]:
boto_session = boto3.Session()
region = boto_session.region_name

RETRY_CONFIG = Config(
    retries={
        'max_attempts': 5,            # Maximum number of retry attempts
        'mode': 'adaptive'            # Adaptive mode adjusts based on request limits
    },
    read_timeout=1000,
    connect_timeout=1000
)

bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name=region,
    config=RETRY_CONFIG)

def generate_message_claude(
    query, system_prompt="", max_tokens=1000, 
    model_id='anthropic.claude-3-sonnet-20240229-v1:0',
    temperature=0.9, top_p=0.99, top_k=100
):
    # Prompt with user turn only.
    user_message = {"role": "user", "content": query}
    messages = [user_message]
    body = json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "system": system_prompt,
            "messages": messages,
            "temperature": temperature,
            "top_p": top_p,
            "top_k": top_k
        }
    )

    response = bedrock_runtime.invoke_model(body=body, modelId=model_id)
    response_body = json.loads(response.get('body').read())
    return response_body

In [12]:
query = 'How do Amazon Bedrock Guardrails work?'

response = generate_message_claude(query)
print("User turn only.")
print(json.dumps(response, indent=4))

User turn only.
{
    "id": "msg_bdrk_01SY4qBYRaf7eknVNyeRdHws",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-sonnet-20240229",
    "content": [
        {
            "type": "text",
            "text": "Amazon Bedrock Guardrails is a service provided by AWS that helps organizations govern their AWS environments at scale. It allows organizations to define and enforce rules, known as guardrails, across their AWS accounts and resources.\n\nHere's how Amazon Bedrock Guardrails works:\n\n1. Guardrail Definition: Organizations can define guardrails using AWS CloudFormation templates. These templates specify the desired configuration for AWS resources, services, and account settings. Guardrails can cover various aspects, such as resource naming conventions, security controls, cost optimization, and compliance requirements.\n\n2. Guardrail Deployment: Once the guardrails are defined, they are deployed across the organization's AWS accounts and regions using AWS Organ

## 1.1 Apply System Prompt

In [13]:
query = 'Is it possible to purchase provisioned throughput for Anthropic Claude models on Amazon Bedrock?'
system_prompt = 'You are a helpful AI assistant. You try to answer the user queries to the best of your knowledge. If you are unsure of the answer, do not make up any information.'

response = generate_message_claude(query, system_prompt)
print("User turn only.")
print(json.dumps(response, indent=4))

User turn only.
{
    "id": "msg_bdrk_01BgvsJKdMHV4MXiKQ9D7VVX",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-sonnet-20240229",
    "content": [
        {
            "type": "text",
            "text": "Unfortunately, I do not have any specific information about provisioning throughput or using Anthropic's models, including myself, on Amazon Bedrock. Amazon Bedrock appears to be an internal Amazon service, and I do not have details about its capabilities or offerings related to AI models from Anthropic or other providers. My knowledge is limited to what is publicly available about me and Anthropic's products and services."
        }
    ],
    "stop_reason": "end_turn",
    "stop_sequence": null,
    "usage": {
        "input_tokens": 65,
        "output_tokens": 89
    }
}


In [14]:
query = 'How do Amazon Bedrock Guardrails work?'
system_prompt = 'You are a helpful AI assistant. You try to answer the user queries to the best of your knowledge. If you are unsure of the answer, do not make up any information.'

response = generate_message_claude(query, system_prompt)
print("User turn only.")
print(json.dumps(response, indent=4))

User turn only.
{
    "id": "msg_bdrk_01SWStSGNUMVqkSpwVDAuS5K",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-sonnet-20240229",
    "content": [
        {
            "type": "text",
            "text": "Amazon Bedrock Guardrails is a service provided by AWS that allows you to establish mandatory, organization-wide governance guardrails for your AWS accounts and resources. It helps ensure your workloads conform to your organization's policies before resources are provisioned.\n\nHere's a high-level overview of how Amazon Bedrock Guardrails works:\n\n1. Guardrail Definition: You define guardrails as code using AWS CloudFormation templates or Terraform configurations. These guardrails encode your organization's policies and best practices.\n\n2. Guardrail Deployment: The guardrail definitions are deployed as stacks across your AWS accounts and regions using a centralized deployment mechanism provided by Bedrock Guardrails.\n\n3. Preventive Controls: The deployed

## 1.2 Understanding LLM generation parameters
#### 1. temperature: The amount of randomness injected into the response.
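
To make the effect of temperature more concrete, here is a minimal sketch using made-up scores for three candidate tokens. The scores and the helper name `softmax_with_temperature` are assumptions for illustration only; the actual sampling happens inside the model service and is not exposed to the notebook. The sketch divides each score by the temperature before applying softmax, which is the usual way temperature is described.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after scaling by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (not real model output).
toy_logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution. A temperature of 0 is commonly treated as always picking the most likely token, which this sketch does not model.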

In [15]:
query = 'What is Amazon Bedrock?'
system_prompt = 'You are a helpful AI assistant. You try to answer the user queries to the best of your knowledge. If you are unsure of the answer, do not make up any information.'

response = generate_message_claude(query, system_prompt, temperature=1)
print("User turn only.")
print(json.dumps(response, indent=4))

User turn only.
{
    "id": "msg_bdrk_015s1hUdXGR5RgLrPfzFLqvX",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-sonnet-20240229",
    "content": [
        {
            "type": "text",
            "text": "Amazon Bedrock is a real-time operating system developed by Amazon for use in embedded systems and internet of things (IoT) devices.\n\nSome key points about Amazon Bedrock:\n\n- It is a Linux-based operating system optimized for secure IoT applications and microcontroller-based devices.\n\n- It provides a minimal trusted code base with real-time performance for constrained devices with limited memory and storage.\n\n- It includes built-in support for over-the-air (OTA) updates to allow remote and secure updates of the software on IoT devices.\n\n- It supports C and C++ programming languages.\n\n- Amazon Bedrock aims to provide a secure, real-time foundation for building IoT products across various sectors like industrial, automotive, consumer, etc.\n\n- It wa

In [16]:
query = 'What is Amazon Bedrock?'
system_prompt = 'You are a helpful AI assistant. You try to answer the user queries to the best of your knowledge. If you are unsure of the answer, do not make up any information.'

response = generate_message_claude(query, system_prompt, temperature=0)
print("User turn only.")
print(json.dumps(response, indent=4))

User turn only.
{
    "id": "msg_bdrk_01Xfi93y9pwQBkhE36pJWfW6",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-sonnet-20240229",
    "content": [
        {
            "type": "text",
            "text": "Amazon Bedrock is a real-time operating system developed by Amazon for running applications on resource-constrained devices like microcontrollers and sensors.\n\nSome key points about Amazon Bedrock:\n\n- It is designed to be a secure, real-time operating system for internet of things (IoT) devices and embedded applications.\n\n- It provides a lightweight environment with real-time performance for running multiple software components concurrently.\n\n- It supports C and C++ programming languages.\n\n- It includes built-in security features like memory protection, encrypted communication, secure boot, and code signing.\n\n- It aims to simplify development and deployment of IoT applications across different hardware platforms.\n\n- Bedrock is open source and ava

#### 2. top_p – Use nucleus sampling.

In nucleus sampling, Anthropic Claude computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches a particular probability specified by top_p. You should alter either temperature or top_p, but not both.
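
The cutoff described above can be sketched on a toy distribution. The token names, probabilities, and the helper `top_p_filter` are illustrative assumptions, not part of the Bedrock API; the real filtering happens inside the model service.

```python
def top_p_filter(probs, top_p):
    """Keep the highest-probability tokens until their cumulative mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Hypothetical next-token distribution.
toy_probs = {"service": 0.5, "platform": 0.3, "system": 0.15, "rock": 0.05}

print(top_p_filter(toy_probs, 0.9))
```

With `top_p=0.9`, the three most likely tokens are kept and the long tail is dropped before sampling.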

In [17]:
query = 'What is Amazon Bedrock?'
system_prompt = 'You are a helpful AI assistant. You try to answer the user queries to the best of your knowledge. If you are unsure of the answer, do not make up any information.'

response = generate_message_claude(query, system_prompt, temperature=1, top_p=1)
print("User turn only.")
print(json.dumps(response, indent=4))

User turn only.
{
    "id": "msg_bdrk_019nwMrWdiHtJnLuvnsXcgum",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-sonnet-20240229",
    "content": [
        {
            "type": "text",
            "text": "Amazon Bedrock is a real-time operating system (RTOS) developed by Amazon Web Services (AWS) for running lightweight IoT applications on resource-constrained devices.\n\nSome key features of Amazon Bedrock:\n\n1) Small footprint: It is designed to have a minimal memory footprint, making it suitable for microcontrollers and other devices with limited memory and compute resources.\n\n2) Real-time performance: As an RTOS, it provides real-time scheduling and execution capabilities required for time-sensitive IoT applications.\n\n3) Secure: It includes security features like memory protection, cryptographic libraries, and secure boot capabilities.\n\n4) Connected: It supports connectivity to AWS IoT services, allowing devices to easily integrate with the AWS cloud

#### 3. top_k: Only sample from the top K options for each subsequent token.

Use top_k to remove long-tail, low-probability responses.
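
As a rough illustration, top_k filtering can be sketched as keeping a fixed number of the most likely tokens and renormalizing. The distribution and the helper `top_k_filter` below are assumptions made up for this example.

```python
def top_k_filter(probs, top_k):
    """Keep only the top_k most likely tokens and renormalize their probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}

# Hypothetical next-token distribution.
toy_probs = {"service": 0.5, "platform": 0.3, "system": 0.15, "rock": 0.05}

print(top_k_filter(toy_probs, 2))
```

Unlike top_p, the number of surviving candidates is fixed regardless of how the probability mass is spread.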

In [18]:
query = 'What is Amazon Bedrock?'
system_prompt = 'You are a helpful AI assistant. You try to answer the user queries to the best of your knowledge. If you are unsure of the answer, do not make up any information.'

response = generate_message_claude(query, system_prompt, temperature=0, top_p=1, top_k=100)
print("User turn only.")
print(json.dumps(response, indent=4))

User turn only.
{
    "id": "msg_bdrk_01SkA6egzjn9mcCuSTZVAgZB",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-sonnet-20240229",
    "content": [
        {
            "type": "text",
            "text": "Amazon Bedrock is a real-time operating system developed by Amazon for running applications on resource-constrained devices like microcontrollers and sensors.\n\nSome key points about Amazon Bedrock:\n\n- It is designed to be a secure, real-time operating system for internet of things (IoT) devices and embedded applications.\n\n- It provides a lightweight environment with real-time performance for running multiple software components concurrently.\n\n- It supports C and C++ programming languages.\n\n- It includes built-in security features like memory protection, encrypted communication, secure boot, and code signing.\n\n- It aims to simplify development and deployment of IoT applications across different hardware platforms.\n\n- Bedrock is open source and ava

# Retrieval Augmented Generation
We are using the Retrieval Augmented Generation (RAG) technique with Amazon Bedrock. A RAG implementation consists of two parts:

    1. A data pipeline that ingests data from documents (typically stored in Amazon S3) into a Knowledge Base, i.e. a vector database such as Amazon OpenSearch Service Serverless (AOSS), so that it is available for lookup when a question is received.

The data pipeline represents undifferentiated heavy lifting and can be implemented using Amazon Bedrock Knowledge Bases. We can connect an S3 bucket to a vector database such as AOSS and have Bedrock Knowledge Bases read the objects (HTML, PDF, text, etc.), chunk them, convert the chunks into embeddings using the Amazon Titan Embeddings model, and store the embeddings in AOSS, all without having to build, deploy, and manage the data pipeline.

<center><img src="images/fully_managed_ingestion.png" alt="This image shows how Amazon Bedrock Knowledge Bases ingests objects in an S3 bucket into the Knowledge Base for use in a RAG setup. The objects are chunked, embedded, and then stored in a vector index." height="700" width="700" style="background-color:white; padding:1em;" /></center> <br/>
    

    2. An application that receives a question from the user, looks up the knowledge base for relevant pieces of information (context) and then creates a prompt that includes the question and the context and provides it to an LLM for generating a response.
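
The two parts above can be sketched end to end in plain Python. This is a toy illustration, not how Bedrock Knowledge Bases works internally: the word-window chunking, the bag-of-words "embedding", and all function names are assumptions standing in for the managed chunking, the Titan embedding model, and the AOSS vector search.

```python
import math
import re
from collections import Counter

def chunk_text(text, chunk_size=12, overlap=4):
    """Split text into overlapping word windows (stand-in for managed chunking)."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]

def embed(text):
    """Toy 'embedding': word counts. Real systems use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question, contexts):
    """Augment the question with retrieved context before sending it to an LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {question}"
    )

# Hypothetical mini-corpus used only for this example.
document = (
    "Amazon Bedrock is a fully managed service that provides access to foundation models. "
    "Guardrails can apply contextual grounding checks to model responses. "
    "Knowledge Bases chunk documents, embed the chunks, and store them in a vector index."
)
chunks = chunk_text(document)
question = "How do Knowledge Bases store documents?"
prompt = build_prompt(question, retrieve(question, chunks, k=1))
print(prompt)
```

The printed prompt is what would be passed to the LLM; grounding the answer in retrieved text is what gives RAG its leverage against hallucinations.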






Once the data is available in the Bedrock knowledge base, then user questions can be answered using the following system design:

<center><img src="images/retrieveAndGenerate.png" alt="This image shows the retrieval augmented generation (RAG) system design setup with knowledge bases, S3, and AOSS. The knowledge corpus is ingested into a vector database using Amazon Bedrock Knowledge Bases, and the RAG approach is then used for question answering. The question is converted into embeddings, followed by a semantic similarity search to find similar documents. The user prompt is augmented with the search results, and the LLM is invoked to generate the final response for the user." height="700" width="700" style="background-color:white; padding:1em;" /></center> <br/>


# Data
Let's use the publicly available [Bedrock User Guide](https://docs.aws.amazon.com/pdfs/bedrock/latest/userguide/bedrock-ug.pdf) as the knowledge source for the chatbot.

In [19]:
!wget -P data/ -N https://docs.aws.amazon.com/pdfs/bedrock/latest/userguide/bedrock-ug.pdf --no-check-certificate

--2024-11-13 20:31:22--  https://docs.aws.amazon.com/pdfs/bedrock/latest/userguide/bedrock-ug.pdf
Resolving docs.aws.amazon.com (docs.aws.amazon.com)... 18.238.238.32, 18.238.238.78, 18.238.238.98, ...
Connecting to docs.aws.amazon.com (docs.aws.amazon.com)|18.238.238.32|:443... connected.
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 13669946 (13M) [application/pdf]
Saving to: ‘data/bedrock-ug.pdf’


2024-11-13 20:31:23 (112 MB/s) - ‘data/bedrock-ug.pdf’ saved [13669946/13669946]



In [20]:
# Upload data to S3
dataset_file_local_path = 'data/bedrock-ug.pdf'
input_s3_url = sagemaker.Session().upload_data(
    path=dataset_file_local_path,
    bucket=bucket_name
)
print(f"Upload the dataset to {input_s3_url}")

%store input_s3_url

Upload the dataset to s3://sagemaker-us-west-2-996757723911/data/bedrock-ug.pdf
Stored 'input_s3_url' (str)


# Steps

1. Create an Amazon Bedrock Knowledge Base execution role with the necessary policies for reading data from S3 and writing embeddings into AOSS.
2. Create an empty OpenSearch Serverless index.
3. Create an Amazon Bedrock knowledge base.
4. Create a data source within the knowledge base that connects to Amazon S3.
5. Start an ingestion job using the KB APIs, which reads data from S3, chunks it, converts the chunks into embeddings using the Amazon Titan Embeddings model, and stores the embeddings in AOSS.
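
In this lab, the `setup_knowledge_base` helper takes care of these steps. As a hedged sketch of what step 5 might look like on its own, the function below starts an ingestion job and polls until it finishes. It assumes a boto3 `bedrock-agent` client and a `data_source_id` obtained when the data source was created (neither is defined in this notebook), and it treats `COMPLETE` and `FAILED` as the terminal statuses; it is not the workshop's actual implementation.

```python
import time

def wait_for_ingestion(bedrock_agent_client, kb_id, data_source_id,
                       poll_seconds=10, max_polls=60):
    """Start an ingestion job and poll until it reaches a terminal status."""
    job = bedrock_agent_client.start_ingestion_job(
        knowledgeBaseId=kb_id, dataSourceId=data_source_id
    )["ingestionJob"]
    for _ in range(max_polls):
        # Assumed terminal statuses for this sketch.
        if job["status"] in ("COMPLETE", "FAILED"):
            return job["status"]
        time.sleep(poll_seconds)
        job = bedrock_agent_client.get_ingestion_job(
            knowledgeBaseId=kb_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    raise TimeoutError("ingestion job did not finish in time")

# Example usage (requires AWS credentials and real IDs):
# client = boto3.client("bedrock-agent")
# status = wait_for_ingestion(client, kb_id, data_source_id)
```

Polling for a terminal status is generally more reliable than waiting a fixed amount of time for ingestion to complete.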

In [21]:
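# Note: `!export` runs in a temporary subshell, so it does not change the import path of this kernel.
# If imports from the lab directory fail, uncomment the two lines below instead.
!export PYTHONPATH='./lab1/'
#import sys
#sys.path.insert(0,'./lab1/')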

In [22]:
kb_db_file_uri='data'

# if a kb already exists we can use the same, else the infra setup code will create one by itself using the bedrock user guide.
use_existing_kb = False
existing_kb_id = None

In [23]:
%load_ext autoreload
%autoreload 2
from rag_setup.create_kb_utils import *

In [24]:
%%time

# For a new KB, this setup takes about 6 minutes to complete on a t2.medium instance.
infra_response = setup_knowledge_base(bucket_name, kb_db_file_uri, use_existing_kb, existing_kb_id)
infra_response

agent_bedrock_policy :: None
agent_s3_schema_policy :: None
kb_aws_bedrock_policy :: None
kb_db_s3_policy :: None
Creating collection...

Collection successfully created:

Creating index:
Knowledge base status -> is it READY ? :: ACTIVE
knowledge_base_db_id :: DSALKIXRIV
CPU times: user 275 ms, sys: 33.1 ms, total: 308 ms
Wall time: 5min 49s


{'prefix_infra': 'l2ad2e45',
 'bucket_name': 'sagemaker-us-west-2-996757723911',
 'knowledge_base_db_id': 'DSALKIXRIV',
 'agent_bedrock_policy': None,
 'agent_s3_schema_policy': None,
 'kb_db_collection_name': 'l25929-kbdb-996757723911',
 'agent_kb_schema_policy': None,
 'kb_db_aoss_policy': None,
 'kb_db_s3_policy': None,
 'kb_db_role_name': 'AmazonBedrockExecutionRoleForAgentsAIAssistant05',
 'kb_db_opensearch_collection_response': {'createCollectionDetail': {'arn': 'arn:aws:aoss:us-west-2:996757723911:collection/zrswlgcemuvny02ltvph',
   'createdDate': 1731529886011,
   'description': 'OpenSearch collection for Amazon Bedrock Latest User guide Knowledge Base',
   'id': 'zrswlgcemuvny02ltvph',
   'kmsKeyArn': 'auto',
   'lastModifiedDate': 1731529886011,
   'name': 'l25929-kbdb-996757723911',
   'standbyReplicas': 'DISABLED',
   'status': 'CREATING',
   'type': 'VECTORSEARCH'},
  'ResponseMetadata': {'RequestId': 'b33b1a24-31b0-450f-b481-c1bd338f9ebf',
   'HTTPStatusCode': 200,
   'H

In [25]:
kb_id = infra_response['knowledge_base_db_id']
random_id = infra_response['prefix_infra']
# keep the kb_id for invocation later in the invoke request
%store kb_id
%store bucket_name

Stored 'kb_id' (str)
Stored 'bucket_name' (str)


In [26]:
kb_id

'DSALKIXRIV'

In [27]:
# allow time for KB to be ready
time.sleep(180)

# Chat with the model using the knowledge base by providing the generated KB_ID
### Using RetrieveAndGenerate API
Behind the scenes, the RetrieveAndGenerate API converts the query into an embedding, searches the knowledge base, augments the foundation model prompt with the search results as context, and returns the FM-generated response. For multi-turn conversations, Knowledge Bases manages short-term memory of the conversation to provide more contextual results. The output of the RetrieveAndGenerate API includes the generated response, source attribution, and the retrieved text chunks.
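
To show how those output fields fit together, here is a small sketch that pulls the answer, the retrieved chunks, and the session ID out of a response dictionary. The helper name `summarize_rag_response` is hypothetical, and `sample_response` is hand-written to mimic the fields used in this notebook; it is not real API output.

```python
def summarize_rag_response(response):
    """Extract the answer text, retrieved chunks, and session ID from a response dict."""
    contexts = [
        ref["content"]["text"]
        for citation in response.get("citations", [])
        for ref in citation.get("retrievedReferences", [])
    ]
    return {
        "answer": response["output"]["text"],
        "contexts": contexts,
        "session_id": response.get("sessionId"),
    }

# Hand-written sample shaped like a RetrieveAndGenerate response (illustrative only).
sample_response = {
    "sessionId": "example-session",
    "output": {"text": "Amazon Bedrock is a fully managed service."},
    "citations": [
        {
            "retrievedReferences": [
                {"content": {"text": "What is Amazon Bedrock? Amazon Bedrock is a fully managed service ..."}}
            ]
        }
    ],
}

summary = summarize_rag_response(sample_response)
print(summary["answer"])
print(len(summary["contexts"]), "retrieved chunk(s)")
```

For a follow-up question in the same conversation, the returned session ID can be passed back as the `sessionId` argument of `retrieve_and_generate`.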

In [28]:
pp = pprint.PrettyPrinter(indent=2)

In [29]:
kb_id

'DSALKIXRIV'

In [30]:
bedrock_agent_runtime_client = boto3.client("bedrock-agent-runtime", region_name=region)


def ask_bedrock_llm_with_knowledge_base(query,
                                        kb_id=kb_id,
                                        model_arn=llm_model_id,
                                        ) -> dict:
    response = bedrock_agent_runtime_client.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kb_id,
                'modelArn': model_arn
            }
        },
    )

    return response

In [31]:
query = "What is Amazon Bedrock?"

response = ask_bedrock_llm_with_knowledge_base(query, kb_id)
generated_text = response['output']['text']
citations = response["citations"]
contexts = []
for citation in citations:
    retrievedReferences = citation["retrievedReferences"]
    for reference in retrievedReferences:
        contexts.append(reference["content"]["text"])
print(f"---------- Generated using Anthropic Claude 3 Sonnet:")
pp.pprint(generated_text )
print(f'---------- The citations for the response:')
pp.pprint(contexts)
print(kb_id)

---------- Generated using Anthropic Claude 3 Sonnet:
('Amazon Bedrock is a fully managed service that provides access to '
 'high-performing foundation models (FMs) from leading AI companies and Amazon '
 'through a unified API. It allows you to experiment with and evaluate '
 'different foundation models, customize them with your own data using '
 'techniques like fine-tuning and Retrieval Augmented Generation (RAG), and '
 'build agents that can execute tasks using your systems and data sources. '
 "With Amazon Bedrock's serverless experience, you can get started quickly, "
 'customize foundation models with your data, and easily integrate and deploy '
 'them into your applications using AWS tools without managing any '
 'infrastructure.')
---------- The citations for the response:
[ '............ 2009     xvAmazon Bedrock User Guide     What is Amazon '
  'Bedrock?     Amazon Bedrock is a fully managed service that makes '
  'high-performing foundation models (FMs) from leading AI 

In [32]:
query = "Is it possible to purchase provisioned throughput for Anthropic Claude Sonnet on Amazon Bedrock?"

response = ask_bedrock_llm_with_knowledge_base(query, kb_id)
generated_text = response['output']['text']
citations = response["citations"]
contexts = []
for citation in citations:
    retrievedReferences = citation["retrievedReferences"]
    for reference in retrievedReferences:
        contexts.append(reference["content"]["text"])
print(f"---------- Generated using Anthropic Claude 3 Sonnet:")
pp.pprint(generated_text )
print(f'---------- The citations for the response:')
pp.pprint(contexts)
print()

---------- Generated using Anthropic Claude 3 Sonnet:
('Yes, it is possible to purchase provisioned throughput for Anthropic Claude '
 'Sonnet models on Amazon Bedrock. Specifically, you can purchase provisioned '
 'throughput for the following Anthropic Claude Sonnet models:\n'
 '\n'
 '- Anthropic Claude 3 Sonnet 28K\n'
 '- Anthropic Claude 3 Sonnet 200K\n'
 '- Anthropic Claude 3.5 Sonnet 18K (only available in US West (Oregon) '
 'region)\n'
 '- Anthropic Claude 3.5 Sonnet 51K (only available in US West (Oregon) '
 'region)\n'
 '- Anthropic Claude 3.5 Sonnet 200K (only available in US West (Oregon) '
 'region)')
---------- The citations for the response:
[ 'Virginia)     US West (Oregon)     Asia Pacific (Mumbai)     Asia Pacific '
  '(Sydney)     Canada (Central)     Europe (London)     Europe (Paris)     '
  'Europe (Ireland)     South America (São Paulo)     AWS GovCloud (US-West) '
  '(only for custom models with no commitment)     If you purchase Provisioned '
  'Throughput thro

# Contextual Grounding with Amazon Bedrock Guardrails

In [33]:
# Create guardrail
bedrock_client = boto3.client('bedrock')
guardrail_name = f"bedrock-rag-grounding-guardrail-{random_id}"
print(guardrail_name)
guardrail_response = bedrock_client.create_guardrail(
    name=guardrail_name,
    description='Guardrail for ensuring relevance and grounding of model responses in RAG powered chatbot',
    contextualGroundingPolicyConfig={
        'filtersConfig': [
            {
                'type': 'GROUNDING',
                'threshold': 0.5
            },
            {
                'type': 'RELEVANCE',
                'threshold': 0.5
            },
        ]
    },
    blockedInputMessaging='Can you please rephrase your question?',
    blockedOutputsMessaging='Sorry, I am not able to find the correct answer to your query - Can you try reframing your query to be more specific'
)

bedrock-rag-grounding-guardrail-l2ad2e45


In [34]:
guardrailId = guardrail_response['guardrailId']
guardrail_response

{'ResponseMetadata': {'RequestId': '8d13ae15-11c1-4baa-9cf5-850bb9bf54d5',
  'HTTPStatusCode': 202,
  'HTTPHeaders': {'date': 'Wed, 13 Nov 2024 20:40:26 GMT',
   'content-type': 'application/json',
   'content-length': '172',
   'connection': 'keep-alive',
   'x-amzn-requestid': '8d13ae15-11c1-4baa-9cf5-850bb9bf54d5'},
  'RetryAttempts': 0},
 'guardrailId': '5qsclikqvcxn',
 'guardrailArn': 'arn:aws:bedrock:us-west-2:996757723911:guardrail/5qsclikqvcxn',
 'version': 'DRAFT',
 'createdAt': datetime.datetime(2024, 11, 13, 20, 40, 26, 553590, tzinfo=tzlocal())}

In [35]:
guardrail_version = bedrock_client.create_guardrail_version(
    guardrailIdentifier=guardrail_response['guardrailId'],
    description='Working version of RAG app guardrail with higher thresholds for contextual grounding'
)
print(guardrail_version)
# Use the version that was just published rather than the 'DRAFT' returned by create_guardrail
guardrailVersion = guardrail_version['version']
print(guardrailId)
%store guardrailId

{'ResponseMetadata': {'RequestId': '501376b1-218d-4d0b-9aff-a57a9a42444b', 'HTTPStatusCode': 202, 'HTTPHeaders': {'date': 'Wed, 13 Nov 2024 20:40:27 GMT', 'content-type': 'application/json', 'content-length': '44', 'connection': 'keep-alive', 'x-amzn-requestid': '501376b1-218d-4d0b-9aff-a57a9a42444b'}, 'RetryAttempts': 0}, 'guardrailId': '5qsclikqvcxn', 'version': '1'}
5qsclikqvcxn
Stored 'guardrailId' (str)


In [36]:
# Retrieve and Generate using Guardrail

bedrock_agent_runtime_client = boto3.client("bedrock-agent-runtime", region_name=region)


def retrieve_and_generate_with_guardrail(
    query,
    kb_id,
    model_arn=llm_model_id,
    session_id=None
):

    prompt_template = 'You are a helpful AI assistant to help users understand documented risks in various projects. \
    Answer the user query based on the context retrieved. If you dont know the answer, dont make up anything. \
    Only answer based on what you know from the provided context. You can ask the user for clarifying questions if anything is unclear\
    But generate an answer only when you are confident about it and based on the provided context.\
    User Query: $query$\
    Context: $search_results$'

    response = bedrock_agent_runtime_client.retrieve_and_generate(
        input={
            'text': query
        },
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'generationConfiguration': {
                    'guardrailConfiguration': {
                        'guardrailId': guardrailId,
                        'guardrailVersion': guardrailVersion
                    },
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'topP': 0.25
                        }
                    },
                    'promptTemplate': {
                        'textPromptTemplate': prompt_template
                    }
                },
                'knowledgeBaseId': kb_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'overrideSearchType': 'SEMANTIC'
                    }
                }
            }
        }
    )
    return response

In [37]:
# Query the knowledge base with the guardrail applied

query = 'What is Amazon Bedrock?'
#query = "Is it possible to purchase provisioned throughput for Anthropic Claude Sonnet on Amazon Bedrock?"

model_response = retrieve_and_generate_with_guardrail(query, kb_id)

print(model_response)

{'ResponseMetadata': {'RequestId': 'c527ebad-1b09-45e7-9ed3-2c6fb0e059ae', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Wed, 13 Nov 2024 20:40:36 GMT', 'content-type': 'application/json', 'content-length': '1383', 'connection': 'keep-alive', 'x-amzn-requestid': 'c527ebad-1b09-45e7-9ed3-2c6fb0e059ae'}, 'RetryAttempts': 0}, 'citations': [], 'guardrailAction': 'NONE', 'output': {'text': 'According to the context provided, Amazon Bedrock is a fully managed service from AWS that provides access to high-performing foundation models (FMs) from leading AI companies and Amazon through a unified API.\n\nSome key points about Amazon Bedrock:\n\n- It allows you to choose from a wide range of foundation models to find the best one for your use case.\n- It offers capabilities to build generative AI applications with security, privacy, and responsible AI principles.\n- You can experiment with and evaluate different foundation models, privately customize them with your own data using techniques lik

# Evaluating RAG with RAGAS

In [38]:
import boto3
import pprint
from botocore.client import Config
from langchain.llms.bedrock import Bedrock
from langchain_community.chat_models.bedrock import BedrockChat
from langchain.embeddings import BedrockEmbeddings
from langchain.retrievers.bedrock import AmazonKnowledgeBasesRetriever
from langchain.chains import RetrievalQA

pp = pprint.PrettyPrinter(indent=2)

bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})
bedrock_client = boto3.client('bedrock-runtime')
bedrock_agent_client = boto3.client("bedrock-agent-runtime",
                              config=bedrock_config
                              )

llm_for_text_generation = BedrockChat(model_id=llm_model_id, client=bedrock_client)

llm_for_evaluation = BedrockChat(model_id=llm_model_id, client=bedrock_client)

bedrock_embeddings = BedrockEmbeddings(model_id=embedding_model_id,client=bedrock_client)



In [39]:
import pandas as pd

# Load the evaluation questions and reference answers, dropping incomplete rows
test = pd.read_csv('data/bedrock-user-guide-test.csv')
test = test.dropna()

# Show full cell contents instead of truncating long answers
with pd.option_context("display.max_colwidth", None):
    pretty_print(test)

Unnamed: 0,Question/prompt,Correct answer
0,Are all models accessible on Amazon Bedrock by default?,"Access to Amazon Bedrock foundation models isn't granted by default. You can request access, or modify access, to foundation models only by using the Amazon Bedrock console. First, make sure the IAM role that you use has sufficent IAM permissions to manage access to foundation models. Then, add or remove access to a model by following the instructions at Add or remove access to Amazon Bedrock foundation models."
1,What is the Model ID of Amazon Titan Text Premier,amazon.titan-text-premier-v1:0
2,With which Anthropic Claude models can I use the Text Completions API?,"Anthropic Claude Instant v1.2, Anthropic Claude v2, Anthropic Claude v2.1"
3,What policies can I configure in Amazon Bedrock guardrails?,"You can configure the following policies in a guardrail to avoid undesirable and harmful content and remove sensitive information for privacy protection. Content filters – Adjust filter strengths to block input prompts or model responses containing harmful content. Denied topics – Define a set of topics that are undesirable in the context of your application. These topics will be blocked if detected in user queries or model responses. Word filters – Configure filters to block undesirable words, phrases, and profanity. Such words can include offensive terms, competitor names etc. Sensitive information filters – Block or mask sensitive information such as personally identifiable information (PII) or custom regex in user inputs and model responses. Contextual grounding check – Detect and filter hallucinations in model responses based on grounding in a source and relevance to the user query."
4,Which built in datasets are available on Amazon Bedrock for model evaluation of text generation?,"The following built-in datasets contain prompts that are well-suited for use in general text generation tasks. Bias in Open-ended Language Generation Dataset (BOLD) The Bias in Open-ended Language Generation Dataset (BOLD) is a dataset that evaluates fairness in general text generation, focusing on five domains: profession, gender, race, religious ideologies, and political ideologies. It contains 23,679 different text generation prompts. RealToxicityPrompts RealToxicityPrompts is a dataset that evaluates toxicity. It attempts to get the model to generate racist, sexist, or otherwise toxic language. This dataset contains 100,000 different text generation prompts. T-Rex : A Large Scale Alignment of Natural Language with Knowledge Base Triples (TREX) TREX is dataset consisting of Knowledge Base Triples (KBTs) extracted from Wikipedia. KBTs are a type of data structure used in natural language processing (NLP) and knowledge representation. They consist of a subject, predicate, and object, where the subject and object are linked by a relation. An example of a Knowledge Base Triple (KBT) is ""George Washington was the president of the United States"". The subject is ""George Washington"", the predicate is ""was the president of"", and the object is ""the United States"". WikiText2 WikiText2 is a HuggingFace dataset that contains prompts used in general text generation."


In [40]:
from datasets import Dataset

questions = test['Question/prompt'].tolist()
ground_truths = [[gt] for gt in test['Correct answer'].tolist()]

answers = []
contexts = []

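for query in questions:
    response = ask_bedrock_llm_with_knowledge_base(query, kb_id)
    generatedResult = response['output']['text']
    answers.append(generatedResult)
    # Guard against responses that come back without citations
    citations = response.get('citations', [])
    references = citations[0]['retrievedReferences'] if citations else []
    contexts.append([doc['content']['text'] for doc in references])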

# To dict
data = {
    "question": questions,
    "answer": answers,
    "contexts": contexts,
    "ground_truths": ground_truths
}

# Convert dict to dataset
dataset = Dataset.from_dict(data)
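
Before evaluating, it can help to confirm that the columns are aligned and that every row has both an answer and at least one retrieved context, since rows without contexts tend to distort context-based metrics. The check below is a minimal sketch under that assumption; `find_incomplete_rows` and the `toy` dictionary are hypothetical names used for illustration, and the column names mirror the `data` dictionary built above.

```python
def find_incomplete_rows(data: dict) -> list:
    """Return indices of rows with an empty answer or no retrieved contexts."""
    lengths = {key: len(values) for key, values in data.items()}
    # All columns must describe the same number of questions
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    incomplete = []
    for i, (answer, ctx) in enumerate(zip(data["answer"], data["contexts"])):
        if not answer.strip() or not ctx:
            incomplete.append(i)
    return incomplete


# Toy data with the same column layout as the evaluation dictionary
toy = {
    "question": ["q0", "q1"],
    "answer": ["a0", ""],
    "contexts": [["c0"], []],
    "ground_truths": [["g0"], ["g1"]],
}

incomplete = find_incomplete_rows(toy)
print(incomplete)
```

In the notebook this would be called as `find_incomplete_rows(data)`; any returned indices point at questions worth re-running or inspecting by hand before trusting the scores.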

In [41]:
#!pip install datasets -U



In [1]:
!pip list | grep datasets

datasets                                3.1.0


In [None]:
#%%capture
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    context_entity_recall,
    answer_similarity,
    answer_correctness
)

# Specify the metrics to compute; only answer relevancy is used for now, and more can be added.
metrics = [
    answer_relevancy
]

result = evaluate(
    dataset=dataset,
    metrics=metrics,
    llm=llm_for_evaluation,
    embeddings=bedrock_embeddings,
    raise_exceptions=False
)

ragas_df = result.to_pandas()
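
To build intuition for the `answer_relevancy` score: RAGAS asks the LLM to generate questions from the answer, embeds them, and averages their cosine similarity to the embedding of the original question. The sketch below illustrates only that averaging step with hand-made two-dimensional vectors. It is a simplified illustration, not the RAGAS implementation, and `toy_answer_relevancy` and the vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def toy_answer_relevancy(question_vec, generated_question_vecs):
    """Mean similarity between the original question and questions derived from the answer."""
    sims = [cosine_similarity(question_vec, g) for g in generated_question_vecs]
    return sum(sims) / len(sims)


# Hand-made embeddings: one generated question matches the original, one is unrelated
question_vec = [1.0, 0.0]
generated = [[1.0, 0.0], [0.0, 1.0]]

score = toy_answer_relevancy(question_vec, generated)
print(score)
```

An answer that stays on topic yields generated questions close to the original one, so the score approaches 1. An answer that drifts or pads with unrelated material pulls the average down.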

In [43]:
# Show full cell contents so long contexts and responses are not truncated
with pd.option_context("display.max_colwidth", None):
    pretty_print(ragas_df)

Unnamed: 0,user_input,retrieved_contexts,response,answer_relevancy
0,Are all models accessible on Amazon Bedrock by default?,"[aws-marketplace:Unsubscribe ? aws-marketplace:ViewSubscriptions For information creating the policy, see I already have an AWS account. For the aws-marketplace:Subscribe action only, you can use the aws- marketplace:ProductId condition key to restrict subscription to specific models. Grant permissions to request access to foundation models 29 https://aws.amazon.com/bedrock/pricing/ https://docs.aws.amazon.com/service-authorization/latest/reference/list_awsmarketplace.html#awsmarketplace-actions-as-permissions https://docs.aws.amazon.com/service-authorization/latest/reference/list_awsmarketplace.html#awsmarketplace-policy-keysAmazon Bedrock User Guide Note You can't remove request access from the Amazon Titan, Mistral AI, and Meta Llama 3 Instruct models. You can prevent users from making inference calls to these models by using an IAM policy and specifying the model ID. For more information, see Deny access for inference on specific models. 
The following table lists product IDs for Amazon Bedrock foundation models: The following is the format of the IAM policy you can attach to a role to control model access permissions: Model Product ID AI21 Labs Jurassic-2 Mid 1d288c71-65f9-489a-a3e2-9c7f4f6e6a85 AI21 Labs Jurassic-2 Ultra cc0bdd50-279a-40d8-829c-4009b77a1fcc AI21 Jamba-Instruct prod-dr2vpvd4k73aq AI21 Labs Jamba 1.5 Large prod-evcp4w4lurj26 AI21 Labs Jamba 1.5 Mini prod-ggrzjm65qmjhm Anthropic Claude c468b48a-84df-43a4-8c46-8870630108a7 Anthropic Claude Instant b0eb9475-3a2c-43d1-94d3-56756fd43737 Anthropic Claude 3 Sonnet prod-6dw3qvchef7zy Anthropic Claude 3.5 Sonnet prod-m5ilt4siql27k Anthropic Claude 3.5 Sonnet v2 prod-cx7ovbu5wex7g Anthropic Claude 3 Haiku prod-ozonys2hmmpeu Anthropic Claude 3.5 Haiku prod-5oba7y7jpji56 Anthropic Claude 3 Opus prod-fm3feywmwerog Grant permissions to request access to foundation models 30Amazon Bedrock User Guide Model Product ID Cohere Command a61c46fe-1747-41aa-9af0-2e0ae8a9ce05 Cohere Command Light 216b69fd-07d5-4c7b-866b-936456d68311 Cohere Command R prod-tukx4z3hrewle Cohere Command R+ prod-nb4wqmplze2pm Cohere Embed (English) b7568428-a1ab-46d8-bab3-37def50f6f6a Cohere Embed (Multilingual) 38e55671-c3fe-4a44-9783-3584906e7cad Stable Diffusion XL 1.0 prod-2lvuzn4iy6n6o Stable Image Core 1.0 prod-eacdrmv7zfc5e Stable Diffusion 3 Large 1.0 prod-cqfmszl26sxu4 Stable Image Ultra 1.0 prod-7boen2z2wnxrg { ""Version"": ""2012-10-17"", ""Statement"": [ { ""Effect"": ""Allow|Deny"", ""Action"": [ ""aws-marketplace:Subscribe"" ], ""Resource"": ""*"", ""Condition"": { ""ForAnyValue:StringEquals"": { ""aws-marketplace:ProductId"": [ model-product-id-1, model-product-id-2, ... 
] } } }, Grant permissions to request access to foundation models 31Amazon Bedrock User Guide { ""Effect"": ""Allow|Deny"", ""Action"": [ ""aws-marketplace:Unsubscribe"" ""aws-marketplace:ViewSubscriptions"" ], ""Resource"": ""*"" } ] } To see an example policy, refer to Allow access to third-party model subscriptions. Add or remove access to Amazon Bedrock foundation models Before you can use a foundation model in Amazon Bedrock, you must request access to it. If you no longer need access to a model, you can remove access from it. Note You can't remove request access from the Amazon Titan, Mistral AI, and Meta Llama 3 Instruct models., Once access is provided to a model, it is available for all users in the AWS account. To add or remove access to foundation models 1. Make sure you have permissions to request access, or modify access, to Amazon Bedrock foundation models. 2. Sign into the Amazon Bedrock console at https://console.aws.amazon.com/bedrock/. 3. In the left navigation pane, under Bedrock configurations, choose Model access. 4. On the Model access page, choose Modify model access. 5. Select the models that you want the account to have access to and unselect the models that you don't want the account to have access to. You have the following options: Add or remove access to foundation models 32 https://console.aws.amazon.com/bedrock/Amazon Bedrock User Guide Be sure to review the End User License Agreement (EULA) for terms and conditions of using a model before requesting access to it. ? Select the check box next to an individual model to check or uncheck it. ? Select the top check box to check or uncheck all models. ? Select how the models are grouped and then check or uncheck all the models in a group by selecting the check box next to the group. For example, you can choose to Group by provider and then select the check box next to Cohere to check or uncheck all Cohere models. 6. Choose Next. 7. 
If you add access to Anthropic models, you must describe your use case details. Choose Submit use case details, fill out the form, and then select Submit form. Notification of access is granted or denied based on your answers when completing the form for the provider. 8. Review the access changes you're making, and then read the Terms. Note Your use of Amazon Bedrock foundation models is subject to the seller's pricing terms, EULA, and the AWS service terms. 9. If you agree with the terms, choose Submit. The changes can take several minutes to be reflected in the console. Note If you revoke access to a model, it can still be accessed through the API for some time after you complete this action while the changes propagate. To immediately remove access in the meantime, add an IAM policy to a role to deny access to the model. 10. If your request is successful, the Access status changes to Access granted or Available to request. Note For AWS GovCloud (US) customers, follow these steps to access models that are available in AWS GovCloud (US): Add or remove access to foundation models 33 https://aws.amazon.com/bedrock/pricing/ https://aws.amazon.com/service-termsAmazon Bedrock User Guide ? AWS GovCloud (US) users must locate their standard AWS account ID associated with their AWS GovCloud (US) account ID. AWS GovCloud (US) users can follow this guide Finding your associated standard AWS account ID, if they don't already know their ID. Navigate to the model access page on Amazon Bedrock console. Select the model(s) that you want to enable. Select Request model access and follow the step-by-step subscription flow. ? AWS GovCloud (US) customers use their standard AWS account ID (which is linked to their AWS GovCloud (US) account ID) to first enable model access. Navigate to the model access page on Amazon Bedrock console in either us-east-1 or us-west-2. Select the model(s) that you want to enable. Select Request model access and follow the step-by- step subscription flow. 
? Log into your AWS GovCloud (US) account and navigate to Amazon Bedrock in us-gov- west-1 and follow the same model access sign-up steps. This will grant you a regional entitlement to access the models in us-gov-west-1. ? The model will be accessible to the linked AWS GovCloud (US) account on us-gov- west-1. If you don't have permissions to request access to a model, an error banner appears. Contact your account administrator to ask them to request access to the model for you or to provide you permissions to request access to the model. Add or remove access to foundation models 34 https://docs.aws.amazon.com/govcloud-us/latest/UserGuide/govcloud-account-ID-alias.html#find-standard-idAmazon Bedrock User Guide Amazon Bedrock foundation model information A foundation model is an Artificial Intellgence model with a large number of parameters and trained on a massive amount of diverse data. A foundation model can generate a variety of responses for a wide range of use cases. Foundation models can generate text or image, and can also convert input into embeddings. This section provides information about the foundation models (FM) that you can use in Amazon Bedrock, such as the features that models support and the AWS regions in which models are available.]","No, not all models are accessible on Amazon Bedrock by default. Before you can use a foundation model in Amazon Bedrock, you must request access to it. Some models require you to go through an approval process before access is granted. You cannot remove access from certain models like Amazon Titan, Mistral AI, and Meta Llama 3 Instruct models. However, you can prevent users from making inference calls to these models by using an IAM policy and specifying the model ID.",1.0
1,What is the Model ID of Amazon Titan Text Premier,"[Amazon Titan Text Embeddings models ? Amazon Titan Multimodal Embeddings G1 model ? Amazon Titan Image Generator G1 models Amazon Titan Text models Amazon Titan text models include Amazon Titan Text G1 - Premier, Amazon Titan Text G1 - Express and Amazon Titan Text G1 - Lite. Amazon Titan Text G1 - Premier Amazon Titan Text G1 - Premier is a large language model for text generation. It is useful for a wide range of tasks including open-ended and context-based question answering, code generation, and summarization. This model is integrated with Amazon Bedrock Knowledge Base and Amazon Bedrock Agents. The model also supports Custom Finetuning in preview. ? Model ID ? amazon.titan-text-premier-v1:0 ? Max tokens ? 32K ? Languages ? English Amazon Titan Text 1399Amazon Bedrock User Guide ? Supported use cases ? 32k context window, open-ended text generation, brainstorming, summarizations, code generation, table creation, data formatting, paraphrasing, chain of thought, rewrite, extraction, QnA, chat, Knowledge Base support, Agents support, Model Customization (preview). ? Inference parameters ? Temperature, Top P (defaults: Temperature = 0.7, Top P = 0.9) AWS AI Service Card - Amazon Titan Text Premier Amazon Titan Text G1 - Express Amazon Titan Text G1 - Express is a large language model for text generation. It is useful for a wide range of advanced, general language tasks such as open-ended text generation and conversational chat, as well as support within Retrieval Augmented Generation (RAG). At launch, the model is optimized for English, with multilingual support for more than 30 additional languages available in preview. ? Model ID ? amazon.titan-text-express-v1 ? Max tokens ? 8K ? Languages ? English (GA), 100 additional languages (Preview) ? Supported use cases ? 
Retrieval augmented generation, open-ended text generation, brainstorming, summarizations, code generation, table creation, data formatting, paraphrasing, chain of thought, rewrite, extraction, QnA, and chat. Amazon Titan Text G1 - Lite Amazon Titan Text G1 - Lite is a light weight efficient model, ideal for fine-tuning of English- language tasks, including like summarizations and copy writing, where customers want a smaller, more cost-effective model that is also highly customizable. ? Model ID ? amazon.titan-text-lite-v1 ? Max tokens ? 4K ? Languages ? English ? Supported use cases ? Open-ended text generation, brainstorming, summarizations, code generation, table creation, data formatting, paraphrasing, chain of thought, rewrite, extraction, QnA, and chat. Amazon Titan Text G1 - Express 1400 https://aws.amazon.com/machine-learning/responsible-machine-learning/titan-text-premier/Amazon Bedrock User Guide Amazon Titan Text Model Customization For more information on customizing Amazon Titan text models, see the following pages. ? Prepare the datasets ? Amazon Titan text model customization hyperparameters Amazon Titan Text Prompt Engineering Guidelines Amazon Titan text models can be used in a wide variety of applications for different use cases. Amazon Titan Text models have prompt engineering guidelines for the following applications including: ? Chatbot ? Text2SQL ? Function Calling ? RAG (Retrieval Augmented Generation) For more information on Amazon Titan Text prompt engineering guidelines, see Amazon Titan Text Prompt Engineering Guidelines. For general prompt engineering guidelines, see Prompt Engineering Guidelines. AWS AI Service Card - Amazon Titan Text AI Service Cards provide transparency and document the intended use cases and fairness considerations for our AWS AI services. 
AI Service Cards provide a single place to find information on the intended use cases, responsible AI design choices, best practices, and performance for a set of AI service use cases. Amazon Titan Text Embeddings models Amazon Titan Embeddings text models include Amazon Titan Text Embeddings v2 and Titan Text Embeddings G1 model. Text embeddings represent meaningful vector representations of unstructured text such as documents, paragraphs, and sentences. You input a body of text and the output is a (1 x n) vector. You can use embedding vectors for a wide variety of applications.]",The Model ID of Amazon Titan Text Premier is amazon.titan-text-premier-v1:0.,0.995229
2,With which Anthropic Claude models can I use the Text Completions API?,"[Anthropic Claude 3 model, such as Anthropic Claude 3 Opus For information about creating system prompts, see https://docs.anthropic.com/claude/ docs/how-to-use-system-prompts in the Anthropic Claude documentation. To avoid timeouts with Anthropic Claude version 2.1, we recommend limiting the input token count in the prompt field to 180K. We expect to address this timeout issue soon. In the inference call, fill the body field with a JSON object that conforms the type call you want to make, Anthropic Claude Text Completions API or Anthropic Claude Messages API. Topics ? Anthropic Claude Text Completions API ? Anthropic Claude Messages API Anthropic Claude models 131 https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags https://docs.anthropic.com/en/docs/welcome https://docs.anthropic.com/claude/docs/how-to-use-system-prompts https://docs.anthropic.com/claude/docs/how-to-use-system-promptsAmazon Bedrock User Guide Anthropic Claude Text Completions API This section provides inference parameters and code examples for using Anthropic Claude models with the Text Completions API. Topics ? Anthropic Claude Text Completions API overview ? Supported models ? Request and Response ? Code example Anthropic Claude Text Completions API overview Use the Text Completion API for single-turn text generation from a user supplied prompt. For example, you can use the Text Completion API to generate text for a blog post or to summarize text input from a user. For information about creating prompts for Anthropic Claude models, see Introduction to prompt design. If you want to use your existing Text Completions prompts with the Anthropic Claude Messages API, see Migrating from Text Completions. Supported models You can use the Text Completions API with the following Anthropic Claude models. ? 
Anthropic Claude Instant v1.2 ? Anthropic Claude v2 ? Anthropic Claude v2.1 Request and Response The request body is passed in the body field of a request to InvokeModel or InvokeModelWithResponseStream. For more information, see https://docs.anthropic.com/claude/reference/complete_post in the Anthropic Claude documentation. Anthropic Claude models 132 https://docs.anthropic.com/claude/docs/introduction-to-prompt-design https://docs.anthropic.com/claude/docs/introduction-to-prompt-design https://docs.anthropic.com/claude/reference/migrating-from-text-completions-to-messages https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html https://docs.anthropic.com/claude/reference/complete_postAmazon Bedrock User Guide Request Anthropic Claude has the following inference parameters for a Text Completion inference call. { ""prompt"": "" Human:<prompt> Assistant:"", ""temperature"": float, ""top_p"": float, ""top_k"": int, ""max_tokens_to_sample"": int, ""stop_sequences"": [string] } The following are required parameters. ? prompt ? (Required) The prompt that you want Claude to complete. For proper response generation you need to format your prompt using alternating Human: and Assistant: conversational turns. For example: "" Human: {userQuestion} Assistant:"" For more information, see Prompt validation in the Anthropic Claude documentation. ? max_tokens_to_sample ? (Required) The maximum number of tokens to generate before stopping. We recommend a limit of 4,000 tokens for optimal performance. Note that Anthropic Claude models might stop generating tokens before reaching the value of max_tokens_to_sample. Different Anthropic Claude models have different maximum values for this parameter. For more information, see Model comparison in the Anthropic Claude documentation. Default Minimum Maximum 200 0 4096 The following are optional parameters. ? 
stop_sequences ? (Optional) Sequences that will cause the model to stop generating. Anthropic Claude models 133 https://docs.anthropic.com/claude/reference/prompt-validation https://docs.anthropic.com/claude/docs/models-overview#model-comparisonAmazon Bedrock User Guide Anthropic Claude models stop on "" Human:"", and may include additional built-in stop sequences in the future. Use the stop_sequences inference parameter to include additional strings that will signal the model to stop generating text. ? temperature ?]",You can use the Text Completions API with the following Anthropic Claude models: - Anthropic Claude Instant v1.2 - Anthropic Claude v2 - Anthropic Claude v2.1,0.676724
3,What policies can I configure in Amazon Bedrock guardrails?,"[Specify DENY in the action field. ? (Optional) Provide up to five examples that you would categorize as belonging to the topic in the examples list. ? Specify filter strengths for the harmful categories defined in Amazon Bedrock in the contentPolicy object. Each item in the filters list pertains to a harmful category. For more information, see Block harmful words and conversations with content filters. For more information about the fields in a content filter, see ContentFilter. ? Specify the category in the type field. ? Specify the strength of the filter for prompts in the strength field of the textToTextFiltersForPrompt field and for model responses in the strength field of the textToTextFiltersForResponse. ? (Optional) Attach any tags to the guardrail. For more information, see Tagging Amazon Bedrock resources. ? (Optional) For security, include the ARN of a KMS key in the kmsKeyId field. The response format is as follows: HTTP/1.1 202 Content-type: application/json { ""createdAt"": ""string"", ""guardrailArn"": ""string"", ""guardrailId"": ""string"", ""version"": ""string"" } Create a guardrail 464 https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Topic.html https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ContentFilter.htmlAmazon Bedrock User Guide Set up permissions to use guardrails for content filtering To set up a role with permissions for guardrails, create an IAM role and attach the following permissions by following the steps at Creating a role to delegate permissions to an AWS service. If you are using guardrails with an agent, attach the permissions to a service role with permissions to create and manage agents. You can set up this role in the console or create a custom role by following the steps at Create a service role for Amazon Bedrock Agents. ? Permissions to invoke guardrails with foundation models ? Permissions to create and manage guardrails ? 
(Optional) Permissions to decrypt your customer-managed AWS KMS key for the guardrail Permissions to create and manage guardrails for the policy role Append the following statement to the Statement field in the policy for your role to use guardrails. { ""Version"": ""2012-10-17"", ""Statement"": [ { ""Sid"": ""CreateAndManageGuardrails"", ""Effect"": ""Allow"", ""Action"": [ ""bedrock:CreateGuardrail"", ""bedrock:CreateGuardrailVersion"", ""bedrock:DeleteGuardrail"", ""bedrock:GetGuardrail"", ""bedrock:ListGuardrails"", ""bedrock:UpdateGuardrail"" ], ""Resource"": ""*"" } ] } Permissions for Amazon Bedrock Guardrails 465 https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.htmlAmazon Bedrock User Guide Permissions you need to invoke guardrails to filter content Append the following statement to the Statement field in the policy for the role to allow for model inference and to invoke guardrails. { ""Version"": ""2012-10-17"", ""Statement"": [ { ""Sid"": ""InvokeFoundationModel"", ""Effect"": ""Allow"", ""Action"": [ ""bedrock:InvokeModel"", ""bedrock:InvokeModelWithResponseStream"" ], ""Resource"": [ ""arn:aws:bedrock:region::foundation-model/*"" ] }, { ""Sid"": ""ApplyGuardrail"", ""Effect"": ""Allow"", ""Action"": [ ""bedrock:ApplyGuardrail"" ], ""Resource"": [ ""arn:aws:bedrock:region:account-id:guardrail/guardrail-id"" ] } ] } (Optional) Create a customer managed key for your guardrail for additional security Any user with CreateKey permissions can create customer managed keys using either the AWS Key Management Service (AWS KMS) console or the CreateKey operation. Make sure to create a symmetric encryption key. After you create your key, set up the following permissions. Permissions you need to invoke guardrails to filter content 466 https://docs.aws.amazon.com/kms/latest/APIReference/API_CreateKey.htmlAmazon Bedrock User Guide 1. Follow the steps at Creating a key policy to create a resource-based policy for your KMS key. 
Add the following policy statements to grant permissions to guardrails users and guardrails creators. Replace each role with the role that you want to allow to carry out the specified actions., b. In the Relevance field, select Enable relevance check to check if model responses are relevant.. c. Select Next. 10. Review and create ? Review the settings for your guardrail. a. Select Edit in any section you want to make changes to. b. When you are satisfied with the settings for your guardrail, select Create to create the guardrail. API To create a guardrail, send a CreateGuardrail request. The request format is as follows: POST /guardrails HTTP/1.1 Content-type: application/json { ""blockedInputMessaging"": ""string"", ""blockedOutputsMessaging"": ""string"", ""contentPolicyConfig"": { ""filtersConfig"": [ { ""inputStrength"": ""NONE | LOW | MEDIUM | HIGH"", ""outputStrength"": ""NONE | LOW | MEDIUM | HIGH"", ""type"": ""SEXUAL | VIOLENCE | HATE | INSULTS | MISCONDUCT | PROMPT_ATTACK"" } ] }, ""wordPolicyConfig"": { ""wordsConfig"": [ Create a guardrail 462 https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateGuardrail.htmlAmazon Bedrock User Guide { ""text"": ""string"" } ], ""managedWordListsConfig"": [ { ""type"": ""string"" } ] }, ""sensitiveInformationPolicyConfig"": { ""piiEntitiesConfig"": [ { ""type"": ""string"", ""action"": ""string"" } ], ""regexesConfig"": [ { ""name"": ""string"", ""description"": ""string"", ""regex"": ""string"", ""action"": ""string"" } ] }, ""description"": ""string"", ""kmsKeyId"": ""string"", ""name"": ""string"", ""tags"": [ { ""key"": ""string"", ""value"": ""string"" } ], ""topicPolicyConfig"": { ""topicsConfig"": [ { ""definition"": ""string"", ""examples"": [ ""string"" ], ""name"": ""string"", ""type"": ""DENY"" } ] Create a guardrail 463Amazon Bedrock User Guide } } ? Specify a name and description for the guardrail. ? 
Specify messages for when the guardrail successfully blocks a prompt or a model response in the blockedInputMessaging and blockedOutputsMessaging fields. ? Specify topics for the guardrail to deny in the topicPolicy object. Each item in the topics list pertains to one topic. For more information about the fields in a topic, see Topic. ? Give a name and description so that the guardrail can properly identify the topic. ? Specify DENY in the action field. ? (Optional) Provide up to five examples that you would categorize as belonging to the topic in the examples list. ? Specify filter strengths for the harmful categories defined in Amazon Bedrock in the contentPolicy object. Each item in the filters list pertains to a harmful category. For more information, see Block harmful words and conversations with content filters. For more information about the fields in a content filter, see ContentFilter. ? Specify the category in the type field. ? Specify the strength of the filter for prompts in the strength field of the textToTextFiltersForPrompt field and for model responses in the strength field of the textToTextFiltersForResponse. ? (Optional) Attach any tags to the guardrail. For more information, see Tagging Amazon Bedrock resources. ? (Optional) For security, include the ARN of a KMS key in the kmsKeyId field. The response format is as follows: HTTP/1.1 202 Content-type: application/json { ""createdAt"": ""string"", ""guardrailArn"": ""string"", ""guardrailId"": ""string"", ""version"": ""string"" } Create a guardrail 464 https://docs.aws.amazon.com/bedrock/latest/APIReference/API_Topic.html https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ContentFilter.htmlAmazon Bedrock User Guide Set up permissions to use guardrails for content filtering To set up a role with permissions for guardrails, create an IAM role and attach the following permissions by following the steps at Creating a role to delegate permissions to an AWS service. 
If you are using guardrails with an agent, attach the permissions to a service role with permissions to create and manage agents. You can set up this role in the console or create a custom role by following the steps at Create a service role for Amazon Bedrock Agents. ? Permissions to invoke guardrails with foundation models ? Permissions to create and manage guardrails ? (Optional) Permissions to decrypt your customer-managed AWS KMS key for the guardrail Permissions to create and manage guardrails for the policy role Append the following statement to the Statement field in the policy for your role to use guardrails.]","In Amazon Bedrock guardrails, you can configure the following policies: 1. Content filters: You can specify filter strengths for harmful categories like sexual content, violence, hate speech, insults, misconduct, and prompt attacks. 2. Topic policy: You can specify topics to deny, provide a definition and examples for each topic. 3. Word policy: You can specify custom words or managed word lists (e.g. profanity) to block. 4. Sensitive information policy: You can specify personally identifiable information (PII) entities like addresses, ages, etc. to block or anonymize, and regular expressions to match sensitive information.",0.996602
4,Which built in datasets are available on Amazon Bedrock for model evaluation of text generation?,"[It contains 23,679 different text generation prompts. RealToxicityPrompts RealToxicityPrompts is a dataset that evaluates toxicity. It attempts to get the model to generate racist, sexist, or otherwise toxic language. This dataset contains 100,000 different text generation prompts. T-Rex : A Large Scale Alignment of Natural Language with Knowledge Base Triples (TREX) TREX is dataset consisting of Knowledge Base Triples (KBTs) extracted from Wikipedia. KBTs are a type of data structure used in natural language processing (NLP) and knowledge representation. They consist of a subject, predicate, and object, where the subject and object are linked by a relation. An example of a Knowledge Base Triple (KBT) is ""George Washington was the president of the United States"". The subject is ""George Washington"", the predicate is ""was the president of"", and the object is ""the United States"". Model evaluation task types 533Amazon Bedrock User Guide WikiText2 WikiText2 is a HuggingFace dataset that contains prompts used in general text generation. The following table summarizes the metrics calculated, and recommended built-in dataset that are available for automatic model evaluation jobs. To successfully specify the available built-in datasets using the AWS CLI, or a supported AWSSDK use the parameter names in the column, Built- in datasets (API). 
Available built-in datasets for general text generation in Amazon Bedrock Task type Metric Built-in datasets (Console) Built-in datasets (API) Computed metric Accuracy TREX Builtin.T-REx Real world knowledge (RWK) score BOLD Builtin.BOLD WikiText2 Builtin.W ikiText2 Robustnes s TREX Builtin.T-REx Word error rate RealToxicityPrompts Builtin.R ealToxici tyPrompts General text generation Toxicity BOLD Builtin.Bold Toxicity To learn more about how the computed metric for each built-in dataset is calculated, see Review model evaluation job reports and metrics in Amazon Bedrock Text summarization for model evaluation in Amazon Bedrock Text summarization is used for tasks including creating summaries of news, legal documents, academic papers, content previews, and content curation. The ambiguity, coherence, bias, and Model evaluation task types 534 https://hadyelsahar.github.io/t-rex/ https://github.com/amazon-science/bold https://huggingface.co/datasets/wikitext https://hadyelsahar.github.io/t-rex/ https://github.com/allenai/real-toxicity-prompts https://github.com/amazon-science/boldAmazon Bedrock User Guide fluency of the text used to train the model as well as information loss, accuracy, relevance, or context mismatch can influence the quality of responses. Important For text summarization, there is a known system issue that prevents Cohere models from completing the toxicity evaluation successfully. The following built-in dataset is supported for use with the task summarization task type. Gigaword The Gigaword dataset consists of news article headlines. This dataset is used in text summarization tasks. The following table summarizes the metrics calculated, and recommended built-in dataset. To successfully specify the available built-in datasets using the AWS CLI, or a supported AWSSDK use the parameter names in the column, Built-in datasets (API). 
Available built-in datasets for text summarization in Amazon Bedrock Task type Metric Built-in datasets (console) Built-in datasets (API) Computed metric Accuracy Gigaword Builtin.Gigaword BERTScore Toxicity Gigaword Builtin.Gigaword Toxicity Text summariza tion Robustnes s Gigaword Builtin.Gigaword BERTScore and deltaBERT Score To learn more about how the computed metric for each built-in dataset is calculated, see Review model evaluation job reports and metrics in Amazon Bedrock Model evaluation task types 535 https://huggingface.co/datasets/gigaword?row=3 https://huggingface.co/datasets/gigaword?row=3 https://huggingface.co/datasets/gigaword?row=3Amazon Bedrock User Guide Question and answer for model evaluation in Amazon Bedrock Question and answer is used for tasks including generating automatic help-desk responses, information retrieval, and e-learning. If the text used to train the foundation model contains issues including incomplete or inaccurate data, sarcasm or irony, the quality of responses can deteriorate. Important For question and answer, there is a known system issue that prevents Cohere models from completing the toxicity evaluation successfully. The following built-in datasets are recommended for use with the question andg answer task type.]",The following built-in datasets are available on Amazon Bedrock for model evaluation of general text generation tasks: - Bias in Open-ended Language Generation Dataset (BOLD) - RealToxicityPrompts - T-Rex: A Large Scale Alignment of Natural Language with Knowledge Base Triples (TREX) - WikiText2,0.965784


### <a >Challenge Exercise :: Try it Yourself! </a>


<div style="border: 4px solid coral; text-align: left; margin: auto;">
    <br>
    <p style="text-align: center; margin: auto;"><b>Try the following exercises in this lab and note your observations.</b></p>
<p style=" text-align: left; margin: auto;">
<ol>
    <li>Test the RAG-based LLM with more questions about Amazon Bedrock. </li>
<li>Look at the citations or retrieved references and check whether the answer generated by the RAG chatbot aligns with these retrieved contexts. What response do you get when the retrieved context comes up empty? </li>
<li>Apply system prompts to RAG as well as Amazon Bedrock Guardrails, and test which is more consistent in blocking hallucinated model responses. </li>
<li>Run the RAGChecker tutorial and compare it with the RAGAS evaluation framework: https://github.com/amazon-science/RAGChecker/blob/main/tutorial/ragchecker_tutorial_en.md </li>
</ol>
<br>
</p>
</div>
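
For the second exercise, a small helper can make it easier to compare each generated passage with the contexts it cites. The sketch below is only a starting point: it assumes the response dictionary follows the shape returned by the Knowledge Bases `retrieve_and_generate` call (an `output` text plus a `citations` list with `generatedResponsePart` and `retrievedReferences`). The function names `summarize_citations` and `has_retrieved_context`, as well as the `sample_response` data, are hypothetical and are not part of the lab code.

```python
from typing import Any, Dict, List


def summarize_citations(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each generated text span with the text of its retrieved references."""
    summary = []
    for citation in response.get("citations", []):
        # Text of the answer fragment this citation supports
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        # Raw text of every chunk retrieved for that fragment
        references = [
            ref.get("content", {}).get("text", "")
            for ref in citation.get("retrievedReferences", [])
        ]
        summary.append(
            {"generated_text": part.get("text", ""), "references": references}
        )
    return summary


def has_retrieved_context(response: Dict[str, Any]) -> bool:
    """Return True if at least one citation carries a retrieved reference."""
    return any(item["references"] for item in summarize_citations(response))


# Hypothetical response, hand-written for illustration (not real model output)
sample_response = {
    "output": {"text": "Guardrails support content filters and denied topics."},
    "citations": [
        {
            "generatedResponsePart": {
                "textResponsePart": {
                    "text": "Guardrails support content filters and denied topics."
                }
            },
            "retrievedReferences": [
                {"content": {"text": "Specify topics for the guardrail to deny ..."}}
            ],
        }
    ],
}

# Same shape, but nothing was retrieved for the answer
empty_response = {
    "output": {"text": "Sorry, I am unable to assist."},
    "citations": [{"retrievedReferences": []}],
}

for item in summarize_citations(sample_response):
    print("ANSWER PART:", item["generated_text"])
    for reference in item["references"]:
        print("  CONTEXT:", reference)

print("Context found:", has_retrieved_context(sample_response))
print("Context found:", has_retrieved_context(empty_response))
```

Passing the actual response from your chatbot to `summarize_citations` lets you read each answer fragment next to its supporting chunks, and `has_retrieved_context` gives a quick flag for the questions where retrieval came up empty.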


## Conclusion
We now understand which parameters influence hallucinations in Large Language Models. We learned how to set up Retrieval Augmented Generation to give the model context when answering.
We used contextual grounding in Amazon Bedrock Guardrails to intervene when hallucinations are detected.
Finally, we looked at the RAGAS metrics and how to use them to measure hallucinations in your RAG-powered chatbot.

In the next lab, we will:
1. Build a custom hallucination detector
2. Use Amazon Bedrock Agents to intervene when hallucinations are detected
3. Call a human for support when the LLM hallucinates
