# Automated Reasoning Guardrail Validation Playground

This notebook demonstrates how to create a guardrail with an automated reasoning policy attached and use it to validate an LLM response.
It includes the following:

1. Setting up the Bedrock client with custom API models
2. Creating a guardrail with an automated reasoning policy attached
3. Calling the ApplyGuardrail API to validate the LLM response
4. Printing the formatted automated reasoning findings

## Setup

First, let's import the necessary libraries and set up the Bedrock client with custom API models.

In [None]:

%pip install -r requirements.txt

In [None]:
import os
import json
import boto3
import uuid
import time
import pandas as pd
from IPython.display import display, HTML, JSON
import ipywidgets as widgets
from datetime import datetime
from findings_utils import extract_reasoning_findings
from policy_definition import get_policy_definition

In [None]:
# Create the Bedrock client
REGION_NAME="us-west-2" # Fill in the AWS Region
my_session = boto3.session.Session()
runtime_client = my_session.client('bedrock-runtime', region_name=REGION_NAME)
bedrock_client = my_session.client('bedrock', region_name=REGION_NAME)

## Create a Guardrail
In this section, we create a Bedrock guardrail with an automated reasoning policy attached.
This lets us validate LLM responses against predefined rules and constraints.


In [None]:
# ARN of the automated reasoning policy that will be attached to the guardrail
AR_POLICY_ARN="<AR_POLICY_ARN>"

# Unique identifier for the automated reasoning policy
AR_POLICY_ID="<AR_POLICY_ID>"

# Version of the automated reasoning policy to use
AR_POLICY_VERSION = "DRAFT"

# ID of the model used by Bedrock when generating LLM responses
MODEL_ID="<MODEL_ID>"

# Guardrail profile ID used when creating the guardrail.
# Guardrails with automated reasoning must use a cross-Region guardrail profile.
GUARDRAIL_PROFILE_ID = 'us.guardrail.v1:0'
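Several of the values above are `<...>` placeholders that must be filled in before any API call will succeed. As a minimal sketch (the helper name `find_unset_placeholders` and the `example_settings` dictionary are hypothetical, not part of this notebook's utilities), the following check lists settings that still look like unfilled placeholders:

```python
import re

# Matches values such as "<AR_POLICY_ARN>" that were never filled in.
PLACEHOLDER_PATTERN = re.compile(r"^<[^<>]+>$")


def find_unset_placeholders(settings):
    """Return the sorted names of settings whose value still looks like '<PLACEHOLDER>'."""
    return sorted(
        name
        for name, value in settings.items()
        if isinstance(value, str) and PLACEHOLDER_PATTERN.match(value.strip())
    )


# Illustrative values only; in the notebook these would be the variables defined above.
example_settings = {
    "AR_POLICY_ARN": "<AR_POLICY_ARN>",
    "MODEL_ID": "<MODEL_ID>",
    "AR_POLICY_VERSION": "DRAFT",
    "GUARDRAIL_PROFILE_ID": "us.guardrail.v1:0",
}

unset = find_unset_placeholders(example_settings)
print("Settings still to fill in:", unset)
```

Running a check like this before creating the guardrail may surface a forgotten placeholder earlier than the API error would.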

In [None]:
def create_guardrail(name, automated_reasoning_policy_config, cross_region_config, blocked_input_messaging, blocked_output_messaging):
    """
    Creates a new guardrail configuration in Amazon Bedrock.
    
    Args:
        name (str): The name of the guardrail
        automated_reasoning_policy_config (dict): Configuration for automated reasoning policies
        cross_region_config (dict): Cross-Region configuration containing the guardrail profile identifier
        blocked_input_messaging (str): Message returned when the guardrail blocks an input
        blocked_output_messaging (str): Message returned when the guardrail blocks an output
        
    Returns:
        dict: Response from the create_guardrail API call
        
    Raises:
        Exception: If there is an error creating the guardrail
    """
    try:
        return bedrock_client.create_guardrail(
            name=name,
            automatedReasoningPolicyConfig=automated_reasoning_policy_config,
            crossRegionConfig=cross_region_config,
            blockedInputMessaging=blocked_input_messaging,
            blockedOutputsMessaging=blocked_output_messaging,
            clientRequestToken=str(uuid.uuid4())
        )
    except Exception as e:
        print(f"Error creating guardrail: {str(e)}")
        raise

In [None]:
# Create a new guardrail with specified configuration
# - Sets the name, policy details and blocked message text
# - Uses automated reasoning policy with confidence threshold of 1.0
create_guardrail_response = create_guardrail(
    name="test_guardrail",
    automated_reasoning_policy_config={
        "policies": [AR_POLICY_ARN],
        "confidenceThreshold": 1.0
    },
    cross_region_config={ 'guardrailProfileIdentifier': GUARDRAIL_PROFILE_ID },
    blocked_input_messaging="Input is blocked", 
    blocked_output_messaging="Output is blocked")

# Extract the guardrail ID and version from the response
guardrail_id = create_guardrail_response["guardrailId"]
guardrail_version = create_guardrail_response["version"]
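A newly created guardrail may not be usable immediately. The sketch below polls the `get_guardrail` API until the reported `status` is `READY`. The helper name `wait_for_guardrail_ready`, the polling interval, and the attempt limit are assumptions chosen for illustration; the client is passed in as a parameter so the function can also be exercised with a stub.

```python
import time


def wait_for_guardrail_ready(client, guardrail_id, guardrail_version="DRAFT",
                             poll_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Poll get_guardrail until the guardrail reports READY (hypothetical helper).

    Raises RuntimeError if the guardrail reports FAILED, and TimeoutError
    if it is still not READY after max_attempts polls.
    """
    for _ in range(max_attempts):
        response = client.get_guardrail(
            guardrailIdentifier=guardrail_id,
            guardrailVersion=guardrail_version,
        )
        status = response["status"]
        if status == "READY":
            return status
        if status == "FAILED":
            raise RuntimeError(f"Guardrail {guardrail_id} failed to create")
        # Any other status (for example CREATING) means: wait and poll again.
        sleep(poll_seconds)
    raise TimeoutError(f"Guardrail {guardrail_id} not READY after {max_attempts} polls")
```

In this notebook it could be called as `wait_for_guardrail_ready(bedrock_client, guardrail_id, guardrail_version)` before applying the guardrail.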

## Call ApplyGuardrail API

Now, we call the guardrail that we created above to validate the LLM response.

In [None]:
def apply_guardrail(guardrail_id, guardrail_version, content, source):
    """
    Calls ApplyGuardrail API to perform content validation.
    
    Args:
        guardrail_id (str): The ID of the guardrail
        guardrail_version (str): The version of the guardrail to use for validation
        content (list): The content blocks to apply the guardrail to
        source (str): The source of the content, must be either 'INPUT' or 'OUTPUT'

    Returns:
        dict: Response from the apply_guardrail API call
    """
    try:
        return runtime_client.apply_guardrail(
            guardrailIdentifier=guardrail_id,
            guardrailVersion=guardrail_version,
            source=source,
            content=content
        )
    except Exception as e:
        print(f"Error applying guardrail: {str(e)}")
        raise
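The `content` argument is a list of text blocks, each tagged with a qualifier that tells the guardrail how to treat it. As a small sketch (the helper name `build_validation_content` and the example strings are hypothetical), the query and response pair used in this notebook could be assembled like this:

```python
def build_validation_content(user_query, llm_response):
    """Build the content list for ApplyGuardrail from a query/response pair.

    The user query is tagged with the 'query' qualifier and the LLM response
    with 'guard_content', matching the structure used in this notebook.
    """
    if not user_query or not llm_response:
        raise ValueError("Both user_query and llm_response must be non-empty strings")
    return [
        {"text": {"text": user_query, "qualifiers": ["query"]}},
        {"text": {"text": llm_response, "qualifiers": ["guard_content"]}},
    ]


# Illustrative strings only, not real data.
example_content = build_validation_content("Example question?", "Example answer.")
print(example_content)
```

Rejecting empty strings up front is a design choice for this sketch; it avoids sending a request that has nothing to validate.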


In [None]:
# The user's original query/prompt that was sent to the LLM
user_query = "<USER QUERY>"

# The response generated by the LLM that needs to be validated
llm_response = "<LLM RESPONSE>"

# Create a list of dictionaries containing the text content to validate
# Each dictionary has a nested text object with the actual text and qualifiers
content_to_validate = [
    {"text": {"text": user_query, "qualifiers": ["query"]}},
    {"text": {"text": llm_response, "qualifiers": ["guard_content"]}}
]

# Call ApplyGuardrail API to perform the content validation
apply_guardrail_response = apply_guardrail(
    guardrail_id=guardrail_id,
    guardrail_version=guardrail_version, 
    source="OUTPUT",
    content=content_to_validate)

# Pretty-print the guardrail response as JSON
# (default=str guards against any values that are not JSON-serializable)
print(json.dumps(apply_guardrail_response, indent=2, default=str))
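The full response can be long, so a quick summary helps before looking at the detailed findings. The sketch below reads only the top-level `action`, `outputs`, and `assessments` fields of an ApplyGuardrail response. The helper name `summarize_guardrail_response` is hypothetical, and `sample_response` is a made-up stand-in rather than real API output.

```python
def summarize_guardrail_response(response):
    """Summarize the top-level fields of an ApplyGuardrail response (hypothetical helper)."""
    action = response.get("action", "UNKNOWN")
    output_texts = [output.get("text", "") for output in response.get("outputs", [])]
    return {
        "intervened": action == "GUARDRAIL_INTERVENED",
        "action": action,
        "output_texts": output_texts,
        "assessment_count": len(response.get("assessments", [])),
    }


# Made-up stand-in for illustration; a real response carries more detail.
sample_response = {
    "action": "GUARDRAIL_INTERVENED",
    "outputs": [{"text": "Output is blocked"}],
    "assessments": [{}],
}

summary = summarize_guardrail_response(sample_response)
print(summary)
```

In the notebook, passing `apply_guardrail_response` instead of the stand-in would show whether the guardrail intervened on the validated content.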

In [None]:
# Get the policy definition using the Bedrock control-plane client created above
policy_definition = get_policy_definition(bedrock_client, AR_POLICY_ARN)

# Generate a user readable output from the automated reasoning findings
formatted_findings = extract_reasoning_findings(
    apply_guardrail_response, 
    policy_definition
)
print(formatted_findings)