## ApplyGuardrail for Amazon Bedrock
#### Examples using streaming and long context input
---
Guardrails for Amazon Bedrock evaluates user inputs and FM responses based on use case specific policies, and provides an additional layer of safeguards regardless of the underlying FM. Guardrails can be applied across all large language models (LLMs) on Amazon Bedrock, including fine-tuned models and even Generative AI applications outside of Amazon Bedrock. Customers can create multiple guardrails, each configured with a different combination of controls, and use these guardrails across different applications and use cases. Guardrails allows you to configure denied topics, filter harmful content, and remove sensitive information.



In this notebook we will showcase how to use the `ApplyGuardrail` API from Amazon Bedrock. This is an independent API that applies guardrails to any generative AI application, at both the input and the output level.

### Overview
The new `ApplyGuardrail` API allows customers to assess any text using their pre-configured Bedrock Guardrails, without invoking the foundation models.

Key Features:
1. Content Validation: Send any text input or output to the `ApplyGuardrail` API to have it evaluated against your defined topic avoidance rules, content filters, word blocklists, PII detectors, regular expressions, profanity, and contextual grounding. You can evaluate user inputs and FM generated outputs independently.
2. Flexible Deployment: Integrate the Guardrails API anywhere in your application flow to validate data before processing or serving results to users. For example, in a RAG application you can now evaluate the user input before performing the retrieval instead of waiting until the final response generation.
3. Decoupled from Foundation Models: `ApplyGuardrail` is decoupled from foundation models, so you can use Guardrails without invoking them.

You can use the assessment results to design the experience in your generative AI application.
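The RAG example in item 2 can be sketched as a small pipeline that checks the user input before retrieval and the generated answer before it is returned. This is only an illustration: `guarded_rag_answer` and the `check`, `retrieve`, and `generate` callables are hypothetical names, and the stubs below stand in for a real guardrail call, retriever, and model.

```python
from typing import Callable, List, Tuple

# Hypothetical guardrail callable: (text, source) -> (is_blocked, text_to_use)
GuardrailCheck = Callable[[str, str], Tuple[bool, str]]

def guarded_rag_answer(question: str,
                       check: GuardrailCheck,
                       retrieve: Callable[[str], List[str]],
                       generate: Callable[[str, List[str]], str]) -> str:
    # 1. Evaluate the user input before any retrieval work is done
    blocked, safe_question = check(question, "INPUT")
    if blocked:
        return safe_question  # the guardrail's blocked message; retrieval is skipped
    # 2. Retrieve and generate only for input that passed the guardrail
    passages = retrieve(safe_question)
    draft = generate(safe_question, passages)
    # 3. Evaluate the generated answer before serving it
    _, safe_answer = check(draft, "OUTPUT")
    return safe_answer

# Stubs for illustration only (assumptions, not Bedrock calls)
def fake_check(text: str, source: str) -> Tuple[bool, str]:
    blocked = "invest" in text.lower()
    return blocked, ("Sorry, I cannot help with that." if blocked else text)

def fake_retrieve(query: str) -> List[str]:
    return ["A bank is a licensed financial institution."]

def fake_generate(query: str, passages: List[str]) -> str:
    return " ".join(passages)

print(guarded_rag_answer("What is a bank?", fake_check, fake_retrieve, fake_generate))
print(guarded_rag_answer("What should I invest in?", fake_check, fake_retrieve, fake_generate))
```

Passing the guardrail check in as a callable keeps the flow testable without AWS credentials; in a real application it could wrap a call to `ApplyGuardrail`.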

In [None]:
#Start by installing the dependencies to ensure we have a recent version
!pip install --upgrade --force-reinstall boto3 botocore awscli
import boto3
print(boto3.__version__)

Guardrails in Amazon Bedrock is priced per text unit. A text unit can contain up to 1000 characters. If a text input is more than 1000 characters, it is processed as multiple text units, each containing 1000 characters or less. For example, a text input of 5600 characters is charged as 6 text units. Read more about Guardrails pricing here: https://aws.amazon.com/bedrock/pricing/

The `ApplyGuardrail` API has a default limit of 25 text units/second.
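The text-unit arithmetic above can be sketched with a small helper. The names `CHARS_PER_TEXT_UNIT`, `TEXT_UNITS_PER_SECOND`, `count_text_units`, and `min_seconds_at_default_quota` are illustrative and not part of the Bedrock API; the values come from the description above.

```python
import math

CHARS_PER_TEXT_UNIT = 1000    # one text unit holds up to 1000 characters
TEXT_UNITS_PER_SECOND = 25    # default ApplyGuardrail limit quoted above

def count_text_units(text: str) -> int:
    """Number of text units a string occupies (ceiling division)."""
    return math.ceil(len(text) / CHARS_PER_TEXT_UNIT)

def min_seconds_at_default_quota(text: str) -> float:
    """Lower bound on the time needed to submit `text` at the default quota."""
    return count_text_units(text) / TEXT_UNITS_PER_SECOND

print(count_text_units("x" * 5600))                # 6
print(min_seconds_at_default_quota("x" * 50000))   # 2.0
```

Estimating text units up front is a cheap way to decide whether a document needs to be chunked before it is sent to the API.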

In [None]:
# Let's specify the parameters needed for execution later

REGION_NAME = "us-east-1"
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"
TEXT_UNIT = 1000 # characters
LIMIT_TEXT_UNIT = 25

In [None]:

from textwrap import wrap
import boto3

# Make sure you have AWS credentials or AWS profile setup before running this cell
bedrock_client = boto3.client("bedrock", region_name=REGION_NAME)
bedrock_runtime = boto3.client("bedrock-runtime", region_name=REGION_NAME)


### Creating a Guardrail

Guardrails for Amazon Bedrock has multiple components, including Content Filters, Denied Topics, Word and Profanity Filters, and Sensitive Information (PII & Regex) Filters. For a full list, check out the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-create.html).

We are creating a guardrail which prevents providing fiduciary advice.

In [None]:
response = bedrock_client.create_guardrail(
    name='fiduciary-advice',
    description='Prevents our model from providing fiduciary advice.',
    topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'Fiduciary Advice',
                'definition': 'Providing personalized advice or recommendations on managing financial assets, investments, or trusts in a fiduciary capacity or assuming related obligations and liabilities.',
                'examples': [
                    'What stocks should I invest in for my retirement?',
                    'Is it a good idea to put my money in a mutual fund?',
                    'How should I allocate my 401(k) investments?',
                    'What type of trust fund should I set up for my children?',
                    'Should I hire a financial advisor to manage my investments?'
                ],
                'type': 'DENY'
            }
        ]
    },
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'SEXUAL', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'VIOLENCE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'INSULTS', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'MISCONDUCT', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'PROMPT_ATTACK', 'inputStrength': 'HIGH', 'outputStrength': 'NONE'}
        ]
    },
    wordPolicyConfig={
        'wordsConfig': [
            {'text': 'fiduciary advice'},
            {'text': 'investment recommendations'},
            {'text': 'stock picks'},
            {'text': 'financial planning guidance'},
            {'text': 'portfolio allocation advice'},
            {'text': 'retirement fund suggestions'},
            {'text': 'wealth management tips'},
            {'text': 'trust fund setup'},
            {'text': 'investment strategy'},
            {'text': 'financial advisor recommendations'}
        ],
        'managedWordListsConfig': [
            {'type': 'PROFANITY'}
        ]
    },
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'EMAIL', 'action': 'ANONYMIZE'},
            {'type': 'PHONE', 'action': 'ANONYMIZE'},
            {'type': 'NAME', 'action': 'ANONYMIZE'},
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'BLOCK'},
            {'type': 'US_BANK_ACCOUNT_NUMBER', 'action': 'BLOCK'},
            {'type': 'CREDIT_DEBIT_CARD_NUMBER', 'action': 'BLOCK'}
        ],
        'regexesConfig': [
            {
                'name': 'Account Number',
                'description': 'Matches account numbers in the format XXXXXX1234',
                'pattern': r'\b\d{6}\d{4}\b',
                'action': 'ANONYMIZE'
            }
        ]
    },
    blockedInputMessaging='I apologize, but I am not able to provide fiduciary advice. Additionally, it seems that you may have included some sensitive personal or financial information in your request. For your privacy and security, please modify your input and try again without including any personal, financial, or restricted details.',
    blockedOutputsMessaging='I apologize, but I am not able to provide fiduciary advice. Additionally, it seems that you may have included some sensitive personal or financial information in your request. For your privacy and security, please modify your input and try again without including any personal, financial, or restricted details.',
)
print(response)

In [None]:
guardrail_id = response['guardrailId']
guardrail_version = response['version'] 

print(f"Guardrail ID: {guardrail_id}")
print(f"Guardrail Version: {guardrail_version}")

### Usage
The `ApplyGuardrail` request allows customers to pass all the content that should be guarded by their defined guardrail. Set the `source` field to `INPUT` when the content to be evaluated comes from a user, typically the LLM prompt. Set it to `OUTPUT` when the guardrail should be enforced on model output, typically an LLM response.
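As a minimal sketch of the request shape described above, the helper below assembles the keyword arguments that this notebook passes to `apply_guardrail`. The helper name `build_apply_guardrail_request` and the placeholder guardrail ID are assumptions for illustration; nothing is sent to AWS here.

```python
VALID_SOURCES = {"INPUT", "OUTPUT"}

def build_apply_guardrail_request(text, source, guardrail_id, guardrail_version="DRAFT"):
    """Assemble keyword arguments for an ApplyGuardrail call (illustrative helper)."""
    if source not in VALID_SOURCES:
        raise ValueError(f"source must be one of {sorted(VALID_SOURCES)}, got {source!r}")
    return {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": guardrail_version,
        "source": source,  # 'INPUT' for user content, 'OUTPUT' for model responses
        "content": [{"text": {"text": text}}],
    }

# "example-guardrail-id" is a placeholder, not a real guardrail
request = build_apply_guardrail_request("What is a savings account?", "INPUT", "example-guardrail-id")
print(request["source"], request["guardrailVersion"])

# With credentials configured, the request could then be sent as:
# response = bedrock_runtime.apply_guardrail(**request)
```

Validating `source` locally catches a mistyped value before a network round trip.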

### Strategy
We will showcase how you can apply a guardrail in multiple scenarios:
1. Small input content (<25 text units)
2. Large input content (>25 text units)
3. Streaming LLM output

If the content is larger than the quota limit of the `ApplyGuardrail` API, we have to split it into smaller chunks to avoid being throttled.
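One possible way to chunk long content is sketched below. It cuts at whitespace where it can and keeps every character, so the chunks can be joined back into the original text. The function name `chunk_text` and the 25,000-character default (25 text units of 1000 characters) are assumptions for illustration, not part of the API.

```python
def chunk_text(text, max_chars=25_000):
    """Split text into pieces of at most max_chars, preferring whitespace
    boundaries and keeping every character of the original."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Prefer to cut just after the last space or newline in the window
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1
        chunks.append(text[start:end])
        start = end
    return chunks

sample = "word " * 10                      # 50 characters
pieces = chunk_text(sample, max_chars=12)
print(len(pieces))                         # 5
print("".join(pieces) == sample)           # True
```

Because no whitespace is dropped at the boundaries, guardrail outputs for consecutive chunks can be concatenated directly.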

In the streaming case, individual chunks may contain only a few tokens. Applying the guardrail to every new chunk would be wasteful, while waiting for the entire output before applying it would defeat the purpose of streaming. Instead, we buffer the stream and apply the guardrail whenever roughly one text unit of content has accumulated. This keeps cost under control and gives the guardrail enough content in each chunk to find potentially violating text.
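The buffering idea can be sketched independently of Bedrock as a generator that groups small streamed pieces into buffers of about one text unit. The name `buffer_stream` and its `unit_chars` parameter are hypothetical; in practice each yielded buffer would be the text sent to the guardrail.

```python
from typing import Iterable, Iterator

def buffer_stream(pieces: Iterable[str], unit_chars: int = 1000) -> Iterator[str]:
    """Group streamed text pieces into buffers of roughly `unit_chars` characters."""
    buffer = ""
    for piece in pieces:
        # Emit the current buffer before it would grow past one text unit
        if buffer and len(buffer) + len(piece) > unit_chars:
            yield buffer
            buffer = piece
        else:
            buffer += piece
    # Flush whatever is left once the stream ends
    if buffer:
        yield buffer

tokens = ["Hello ", "world, ", "this ", "is ", "a ", "stream."]
print(list(buffer_stream(tokens, unit_chars=12)))
# ['Hello ', 'world, this ', 'is a stream.']
```

Note the final flush: without it, the tail of the stream would never be evaluated. A single piece longer than `unit_chars` is emitted as its own oversized buffer in this sketch.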

In [None]:

def check_severe_violations(violations):
    # When guardrail intervenes either the action on the request is BLOCKED or NONE
    # Here we check how many of the violations lead to blocking the request
    severe_violations = [violation['action']=='BLOCKED' for violation in violations]
    return sum(severe_violations)

def is_policy_assessement_blocked(assessments):
    # While creating the guardrail you could specify multiple types of policies.
    # At the time of assessment all the policies should be checked for potential violations
    # If there is even 1 violation that blocks the request, the entire request is blocked
    blocked = []
    for assessment in assessments:
        if 'topicPolicy' in assessment:
            blocked.append(check_severe_violations(assessment['topicPolicy']['topics']))
        if 'wordPolicy' in assessment:
            if 'customWords' in assessment['wordPolicy']:
                blocked.append(check_severe_violations(assessment['wordPolicy']['customWords']))
            if 'managedWordLists' in assessment['wordPolicy']:
                blocked.append(check_severe_violations(assessment['wordPolicy']['managedWordLists']))
        if 'sensitiveInformationPolicy' in assessment:
            if 'piiEntities' in assessment['sensitiveInformationPolicy']:
                blocked.append(check_severe_violations(assessment['sensitiveInformationPolicy']['piiEntities']))
            if 'regexes' in assessment['sensitiveInformationPolicy']:
                blocked.append(check_severe_violations(assessment['sensitiveInformationPolicy']['regexes']))
        if 'contentPolicy' in assessment:
            blocked.append(check_severe_violations(assessment['contentPolicy']['filters']))
    severe_violation_count = sum(blocked)
    print(f'\033[91m::Guardrail:: {severe_violation_count} severe violations detected\033[0m')
    return severe_violation_count>0

def apply_guardrail(text, text_source_type, guardrail_id, guardrail_version="DRAFT"):
    # Ceiling division: exactly 1000 characters is 1 text unit, 1001 characters is 2
    print(f'\n\n\033[91m::Guardrail:: Applying guardrail with {max(1, -(-len(text)//TEXT_UNIT))} text units\033[0m\n')
    response = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version, 
        source=text_source_type, # can be 'INPUT' or 'OUTPUT'
        content=[{"text": {"text": text}}]
    )
    if response['action'] == 'GUARDRAIL_INTERVENED':
        is_blocked = is_policy_assessement_blocked(response['assessments'])
        alternate_text = ' '.join([output['text'] for output in response['outputs']])
        return is_blocked, alternate_text, response
    else:
        # Return the default response in case of no guardrail intervention
        return False, text, response

def apply_guardrail_full_text(text, text_source_type, guardrail_id, guardrail_version="DRAFT"):
    text_length = len(text)
    filtered_text = ''
    if text_length <= LIMIT_TEXT_UNIT*TEXT_UNIT:
        return apply_guardrail(text, text_source_type, guardrail_id, guardrail_version)
    else:
        # If the text exceeds the default text unit limit, chunk it to avoid throttling.
        # textwrap.wrap breaks on whitespace and drops it at chunk boundaries,
        # so the chunks are re-joined with a single space below.
        filtered_chunks = []
        for i, chunk in enumerate(wrap(text, LIMIT_TEXT_UNIT*TEXT_UNIT)):
            print(f'::Guardrail::Applying guardrails at chunk {i+1}')
            is_blocked, alternate_text, response = apply_guardrail(chunk, text_source_type, guardrail_id, guardrail_version)
            if is_blocked:
                # One blocked chunk blocks the entire request; return the guardrail's blocked message
                return is_blocked, alternate_text, response
            # Guardrails may have intervened without blocking (e.g. anonymized PII),
            # so build the filtered text from the guardrail output.
            filtered_chunks.append(alternate_text)
        return is_blocked, ' '.join(filtered_chunks), response

In [None]:
from botocore.exceptions import ClientError

def stream_conversation(messages,
                        system_prompts,
                        inference_config,
                        additional_model_fields):
    
    response = bedrock_runtime.converse_stream(
        modelId=MODEL_ID,
        messages=messages,
        system=system_prompts,
        inferenceConfig=inference_config,
        additionalModelRequestFields=additional_model_fields
    )

    stream = response.get('stream')
    full_text = ""
    buffer_text = ""
    applied_guardrails = []
    if stream:
        for event in stream:
            if 'messageStart' in event:
                print(f"\nRole: {event['messageStart']['role']}")

            if 'contentBlockDelta' in event:
                new_text = event['contentBlockDelta']['delta']['text']

                if len(buffer_text + new_text) > TEXT_UNIT:
                    is_blocked, alt_text, guardrail_response = apply_guardrail(buffer_text, "OUTPUT", guardrail_id, guardrail_version)
                    if is_blocked:
                        event['messageStop'] = {
                            'stopReason': guardrail_response['action'], 
                            'output': alt_text,
                            'assessments': guardrail_response['assessments'],
                        }
                        full_text = alt_text
                    else:
                        full_text += alt_text
                    print(alt_text, end="")
                    applied_guardrails.append(guardrail_response)
                    buffer_text = new_text
                else: 
                    buffer_text += new_text

            if 'messageStop' in event:
                if event['messageStop']['stopReason'] == 'GUARDRAIL_INTERVENED':
                    print(f"\nStop reason: {event['messageStop']['stopReason']}")
                    break
                else:
                    print(f"\nStop reason: {event['messageStop']['stopReason']}")
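                    # Flush the remaining buffer through the guardrail so the tail of the response is not lost
                    if buffer_text:
                        is_blocked, alt_text, guardrail_response = apply_guardrail(buffer_text, "OUTPUT", guardrail_id, guardrail_version)
                        # A blocked tail replaces the whole output; otherwise append the (possibly anonymized) text
                        full_text = alt_text if is_blocked else full_text + alt_text
                        print(alt_text)
                        if 'metadata' not in event:
                            event['metadata'] = {}
                        event['metadata']['guardrails_usage'] = guardrail_response['usage']
                        applied_guardrails.append(guardrail_response)
                        buffer_text = ""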

            if 'metadata' in event:
                metadata = event['metadata']
                if 'usage' in metadata:
                    print("\nToken usage")
                    print(f"Input tokens: {metadata['usage']['inputTokens']}")
                    print(f"Output tokens: {metadata['usage']['outputTokens']}")
                    print(f"Total tokens: {metadata['usage']['totalTokens']}")
                    print(f"Total text units: {-(-len(full_text)//TEXT_UNIT)}")
                if 'metrics' in event['metadata']:
                    print(
                        f"Latency: {metadata['metrics']['latencyMs']} milliseconds")
                if 'guardrails_usage' in event['metadata']:
                    print(event['metadata']['guardrails_usage'])
    return full_text, applied_guardrails

def generate(input_message):

    system_prompt = """You are an assistant that helps with tasks from users. Be as elaborate as possible"""

    message = {
        "role": "user",
        "content": [{"text": input_message}]
    }
    messages = [message]
    
    # System prompts.
    system_prompts = [{"text" : system_prompt}]

    # inference parameters to use.
    temperature = 0.5

    # Base inference parameters.
    inference_config = {
        "temperature": temperature
    }
    # Additional model inference parameters.
    additional_model_fields = {}

    # Defaults so the function still returns cleanly if the request fails
    full_text, applied_guardrails = "", []
    try:
        full_text, applied_guardrails = stream_conversation(messages, system_prompts, inference_config, additional_model_fields)
    except ClientError as err:
        message = err.response['Error']['Message']
        print(f"A client error occurred: {message}")
    else:
        print(f"Finished streaming messages with model {MODEL_ID}.")
    return full_text, applied_guardrails

### OUTPUT - Streaming
First, let's test the case where the guardrail intervenes but text generation continues as normal.

In [None]:
sample_1 = "List 3 names of prominent CEOs and later tell me what is a bank and what are the benefits of opening a savings account?"
full_text, applied_guardrails = generate(sample_1)

We can clearly see that the guardrail intervened above and anonymized some names in the generated text. Let's examine which `assessments` the guardrail made.

In [None]:
for guardrail in applied_guardrails:
    if guardrail['action']!='NONE':
        print(f"Guardrail Assessment: {guardrail['assessments']}")

The above shows that `sensitiveInformationPolicy` was invoked, anonymizing the names generated by the model and cleaning the output.

Now we can test a different scenario, where the input contains a request for fiduciary advice, and observe the guardrail being applied.

In [None]:
sample_2 = "Tell me about why financial independence is important and only at the very end ask the question if you can help me to invest after retirement?"
full_text, applied_guardrails = generate(sample_2)

We can observe above that the guardrail intervened. Now let's examine which policies were violated. For that we can inspect `assessments`, which is part of the response from the `ApplyGuardrail` API.

In [None]:
for guardrail in applied_guardrails:
    if guardrail['action']!='NONE':
        print(f"Guardrail Assessment: {guardrail['assessments']}")

It can be observed that the text generation reached a point where the model tried to give fiduciary advice, which goes against the `topicPolicy` configured when the guardrail was created, so this output was blocked. When a policy is enforced, the corresponding `action` is either `BLOCKED` or `NONE`, depending on the severity, type, and configuration of the policy.

Read more about different types of components of a guardrail in Amazon Bedrock [here](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-components.html).
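To make the assessments easier to read than the raw printout, they can be flattened into one row per finding. The sketch below walks the same keys that `is_policy_assessement_blocked` checks earlier in this notebook. The function name `summarize_assessments` is hypothetical, and the sample data is hand-written for illustration (including its `action` values); it is not real API output.

```python
# (policy, list key) pairs, mirroring the keys checked earlier in this notebook
POLICY_LOCATIONS = [
    ("topicPolicy", "topics"),
    ("wordPolicy", "customWords"),
    ("wordPolicy", "managedWordLists"),
    ("sensitiveInformationPolicy", "piiEntities"),
    ("sensitiveInformationPolicy", "regexes"),
    ("contentPolicy", "filters"),
]

def summarize_assessments(assessments):
    """Flatten guardrail assessments into (policy, label, action) tuples."""
    findings = []
    for assessment in assessments:
        for policy, key in POLICY_LOCATIONS:
            for item in assessment.get(policy, {}).get(key, []):
                # Use whichever identifying field the finding carries
                label = item.get("name") or item.get("type") or item.get("match")
                findings.append((policy, label, item.get("action")))
    return findings

# Hand-written sample for illustration only
sample_assessments = [{
    "topicPolicy": {"topics": [{"name": "Fiduciary Advice", "type": "DENY", "action": "BLOCKED"}]},
    "sensitiveInformationPolicy": {"piiEntities": [{"type": "NAME", "match": "Jane Doe", "action": "ANONYMIZED"}]},
}]

for row in summarize_assessments(sample_assessments):
    print(row)
```

Such a summary could be applied to each entry in `applied_guardrails` to see at a glance which policy triggered and with what action.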

### INPUT - Small Document
We can now test how a guardrail can be applied to a small document. We will use the [Amazon Shareholders Letter 2023](https://www.aboutamazon.com/news/company-news/amazon-ceo-andy-jassy-2023-letter-to-shareholders). This document doesn't include any text that should cause the guardrail to intervene.

In [None]:
# Use a context manager so the file handle is closed after reading
with open('./data/shareholder_letter.txt', 'r') as f:
    letter = f.read()
print(f"Length of the document: {len(letter)} characters")

In [None]:
blocked, new_text, guardrail_response = apply_guardrail_full_text(letter, "INPUT", guardrail_id, guardrail_version)
print(f"\nBlocked by guardrail: {'Yes' if blocked else 'No'}")
if blocked:
    print(f'Guardrail Output: {new_text}')
elif guardrail_response['action'] == 'GUARDRAIL_INTERVENED':
    print(f'Filtered Text Snippet: {new_text[:5000]}')

print(f"Assessments: {guardrail_response['assessments']}")

### INPUT - Large Document
Now we can test with a different document, which contains a fictitious financial story generated with the help of an LLM. To increase the length of the document, we combine the shareholders letter and the financial story. This showcases the ability to chunk the document and then apply the guardrail to individual chunks.

In [None]:
# Use a context manager so the file handle is closed after reading
with open('./data/financial_story.txt', 'r') as f:
    financial_story = f.read()
large_text = letter + financial_story
print(f"Length of the document: {len(large_text)} characters")

In [None]:
blocked, new_text, guardrail_response = apply_guardrail_full_text(large_text, "INPUT", guardrail_id, guardrail_version)
print(f"\nBlocked by guardrail: {'Yes' if blocked else 'No'}")
if blocked:
    print(f'Guardrail Output: {new_text}')
elif guardrail_response['action'] == 'GUARDRAIL_INTERVENED':
    print(f'Filtered Text Snippet: {new_text[:5000]}')

print(f"Assessments: {guardrail_response['assessments']}")

### Clean-up
Before wrapping up, let's clean up by deleting the guardrail we created.

In [None]:
bedrock_client.delete_guardrail(guardrailIdentifier=guardrail_id)

### Wrap-up
In this sample you learned how to use the standalone `ApplyGuardrail` API in Amazon Bedrock.

Try yourself:
- Change the chunking strategy based on your content, for both the streaming and the full text scenarios
- Call the API with different text lengths to find the best fit for price-performance
- Try changing the prompts to see whether the guardrail intervenes during text generation
- Apply guardrail in your own application