## Guardrails in Amazon Bedrock
> Original notebook: [aws-bedrock-samples](https://github.com/aws-samples/amazon-bedrock-samples/blob/main/responsible_ai/bedrock-guardrails/Apply_Guardrail_with_Streaming_and_Long_Context.ipynb)

In [None]:
import boto3
import pprint
from loguru import logger
from textwrap import wrap

> Check the Regions where Guardrails is supported [here](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-supported.html)

Note:
The ApplyGuardrail API has a default limit of 25 text units (approximately 25,000 characters) per second. If the input exceeds this limit, it must be split into chunks and processed sequentially to avoid throttling.
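
As a rough illustration of the arithmetic in this note, the sketch below counts text units and checks whether a text fits within the default limit. It assumes one text unit covers up to 1,000 characters, which is the approximation quoted above; the helper names `count_text_units` and `fits_in_one_call` are hypothetical and not part of any AWS SDK.

```python
import math

TEXT_UNIT = 1000       # assumed characters per text unit (approximation from the note)
LIMIT_TEXT_UNIT = 25   # default text units per second (from the note)


def count_text_units(text: str) -> int:
    """Return how many text units are needed to cover `text`, rounding up."""
    return math.ceil(len(text) / TEXT_UNIT)


def fits_in_one_call(text: str) -> bool:
    """Return True when the text stays within the default per-second limit."""
    return count_text_units(text) <= LIMIT_TEXT_UNIT


if __name__ == "__main__":
    sample = "a" * 2500
    print(count_text_units(sample))   # 2,500 characters round up to 3 text units
    print(fits_in_one_call(sample))
```

Rounding up reflects the assumption that a partially filled text unit still counts as a whole one.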

More details in the [AWS blog post](https://aws.amazon.com/blogs/machine-learning/use-the-applyguardrail-api-with-long-context-inputs-and-streaming-outputs-in-amazon-bedrock/)

In [None]:
# Let's specify the parameters needed for execution later
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"
TEXT_UNIT = 1000 # characters
LIMIT_TEXT_UNIT = 25

In [None]:
# Make sure you have AWS credentials or AWS profile setup before running this cell
bedrock_client = boto3.client("bedrock")
bedrock_runtime = boto3.client("bedrock-runtime")

### Creating the Guardrail

In [None]:
topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'Fiduciary Advice',
                'definition': 'Providing personalized advice or recommendations on managing financial assets, investments, or trusts in a fiduciary capacity or assuming related obligations and liabilities.',
                'examples': [
                    'What stocks should I invest in for my retirement?',
                    'Is it a good idea to put my money in a mutual fund?',
                    'How should I allocate my 401(k) investments?',
                    'What type of trust fund should I set up for my children?',
                    'Should I hire a financial advisor to manage my investments?'
                ],
                'type': 'DENY'
            }
        ]
    }

contentPolicyConfig={
    'filtersConfig': [
        {
            'type': 'SEXUAL',
            'inputStrength': 'HIGH',
            'outputStrength': 'HIGH'
        },
        {
            'type': 'VIOLENCE',
            'inputStrength': 'HIGH',
            'outputStrength': 'HIGH'
        },
        {
            'type': 'HATE',
            'inputStrength': 'HIGH',
            'outputStrength': 'HIGH'
        },
        {
            'type': 'INSULTS',
            'inputStrength': 'HIGH',
            'outputStrength': 'HIGH'
        },
        {
            'type': 'MISCONDUCT',
            'inputStrength': 'HIGH',
            'outputStrength': 'HIGH'
        },
        {
            'type': 'PROMPT_ATTACK',
            'inputStrength': 'HIGH',
            'outputStrength': 'NONE'
        }
      ]
    }

wordPolicyConfig={
    'wordsConfig': [
        {
            'text': 'fiduciary advice'
        },
        {
            'text': 'investment recommendations'
        },
        {
            'text': 'stock picks'
        },
        {
            'text': 'financial planning guidance'
        },
        {
            'text': 'portfolio allocation advice'
        },
        {
            'text': 'retirement fund suggestions'
        },
        {
            'text': 'wealth management tips'
        },
        {
            'text': 'trust fund setup'
        },
        {
            'text': 'investment strategy'
        },
        {
            'text': 'financial advisor recommendations'
        }
    ],
    'managedWordListsConfig': [
        {
            'type': 'PROFANITY'
        }
    ]
}

sensitiveInformationPolicyConfig={
    'piiEntitiesConfig': [
        {
            'type': 'EMAIL',
            'action': 'ANONYMIZE'
        },
        {
            'type': 'PHONE',
            'action': 'ANONYMIZE'
        },
        {
            'type': 'NAME',
            'action': 'ANONYMIZE'
        },
        {
            'type': 'US_SOCIAL_SECURITY_NUMBER',
            'action': 'BLOCK'
        },
        {
            'type': 'US_BANK_ACCOUNT_NUMBER',
            'action': 'BLOCK'
        },
        {
            'type': 'CREDIT_DEBIT_CARD_NUMBER',
            'action': 'BLOCK'
        }
    ],
    'regexesConfig': [
        {
            'name': 'Account Number',
            'description': 'Matches account numbers in the format XXXXXX1234',
            'pattern': r'\b\d{6}\d{4}\b',
            'action': 'ANONYMIZE'
        }
    ]
}

In [None]:
blockedInputMessaging = """I apologize, but I am not able to provide fiduciary advice. 
Additionally, it seems that you may have included some sensitive personal or financial information in your request. 
For your privacy and security, please modify your input and try again without including any personal, 
financial, or restricted details.
"""

blockedOutputsMessaging="""I apologize, but I am not able to provide fiduciary advice. 
Additionally, it seems that you may have included some sensitive personal or financial information in your request. 
For your privacy and security, please modify your input and try again without including any personal, 
financial, or restricted details.
"""

In [None]:
from botocore.exceptions import ClientError

try:
    response = bedrock_client.create_guardrail(
        name='fiduciary-advice-advanced',
        description='Prevents our model from providing fiduciary advice.',
        topicPolicyConfig=topicPolicyConfig,
        contentPolicyConfig=contentPolicyConfig,
        wordPolicyConfig=wordPolicyConfig,
        sensitiveInformationPolicyConfig=sensitiveInformationPolicyConfig,
        blockedInputMessaging=blockedInputMessaging,
        blockedOutputsMessaging=blockedOutputsMessaging,
    )
except ClientError as e:
    if e.response['Error']['Code'] == 'ConflictException':
        # Guardrail already exists, retrieve the existing guardrail information
        list_response = bedrock_client.list_guardrails()
        for guardrail in list_response['guardrails']:
            if guardrail['name'] == 'fiduciary-advice-advanced':
                logger.warning('Guardrail already exists.')
                pprint.pprint(guardrail)
                
                # Get values for guardrail_id and guardrail_version
                guardrail_id = guardrail['id']
                guardrail_version = guardrail['version']
    else:
        # Handle other exceptions
        raise e
else:
    # Guardrail creation was successful
    print(f"Guardrail created: {response}")
    guardrail_id = response['guardrailId']
    guardrail_version = response['version'] 

    print(f"Guardrail ID: {response['guardrailId']}")
    print(f"Guardrail Version: {response['version'] }")

#### Usage
An ApplyGuardrail request lets the client pass in any content that should be checked against a defined guardrail. Set the `source` field to `INPUT` when the content to evaluate comes from a user, typically the prompt sent to the LLM. Set it to `OUTPUT` when the guardrail should be enforced on model output, typically an LLM response.
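
To make the request shape concrete, here is a minimal sketch that assembles the keyword arguments for an ApplyGuardrail call. The helper `build_apply_guardrail_request` and the identifier `"gr-example"` are hypothetical; the field names (`guardrailIdentifier`, `guardrailVersion`, `source`, `content`) are the ones this notebook passes to `apply_guardrail`.

```python
VALID_SOURCES = {"INPUT", "OUTPUT"}


def build_apply_guardrail_request(text, source, guardrail_id, guardrail_version="DRAFT"):
    """Assemble keyword arguments for an ApplyGuardrail call (hypothetical helper)."""
    if source not in VALID_SOURCES:
        raise ValueError(f"source must be one of {sorted(VALID_SOURCES)}, got {source!r}")
    return {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": guardrail_version,
        "source": source,  # 'INPUT' for user prompts, 'OUTPUT' for model responses
        "content": [{"text": {"text": text}}],
    }


if __name__ == "__main__":
    # "gr-example" is a placeholder, not a real guardrail identifier
    request = build_apply_guardrail_request("What is a savings account?", "INPUT", "gr-example")
    print(request)
```

The resulting dictionary could then be passed as `bedrock_runtime.apply_guardrail(**request)`, assuming a `bedrock-runtime` client like the one created earlier in this notebook.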

#### Strategy
We will show how to apply the guardrail in several scenarios:

1. Small input content (<25 text units)
2. Large input content (>25 text units)
3. Streaming LLM output

If the content exceeds the ApplyGuardrail API quota, we need to split the original content into smaller chunks so that we do not hit the throttling limit.
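
One possible way to do the splitting is sketched below, assuming a chunk size of 25,000 characters (25 text units of roughly 1,000 characters each). The function name `split_into_chunks` is hypothetical. Plain slicing keeps every character, so the pieces re-join to the original text; `textwrap.wrap`, by contrast, normalizes whitespace by default.

```python
from typing import List


def split_into_chunks(text: str, max_chars: int = 25_000) -> List[str]:
    """Split `text` into consecutive slices of at most `max_chars` characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


if __name__ == "__main__":
    document = "x" * 60_000
    chunks = split_into_chunks(document)
    print([len(chunk) for chunk in chunks])
    # Joining the slices gives back the original text unchanged
    print("".join(chunks) == document)
```

A trade-off of fixed-size slicing is that a word or a sensitive value can be cut in two at a chunk boundary, so a production version might prefer to break at whitespace or overlap adjacent chunks.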

With streaming, each chunk may carry only a few tokens. Applying the guardrail to every new chunk would be wasteful, and waiting for the entire output before applying it is not practical either. A reasonable middle ground is to apply the guardrail whenever enough text has accumulated, roughly one text unit. This keeps cost under control while giving the guardrail enough content in each call to detect potential violations.
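
The buffering idea can be sketched independently of Bedrock. The generator below, with the hypothetical name `batch_stream`, accumulates small streamed fragments and releases a batch once it reaches a minimum size, assumed here to be about one text unit (1,000 characters). Each released batch is what one would hand to the guardrail.

```python
from typing import Iterable, Iterator


def batch_stream(deltas: Iterable[str], min_chars: int = 1000) -> Iterator[str]:
    """Group streamed text fragments into batches of at least `min_chars` characters."""
    buffer = ""
    for delta in deltas:
        buffer += delta
        if len(buffer) >= min_chars:
            yield buffer
            buffer = ""
    # Flush the tail so the end of the stream is checked as well
    if buffer:
        yield buffer


if __name__ == "__main__":
    fragments = ["Hello", ", ", "world", "!"]
    for batch in batch_stream(fragments, min_chars=6):
        print(repr(batch))
```

Flushing the remainder at the end matters: without it, the last fragment of a response would never be evaluated.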

In [None]:
def check_severe_violations(violations):
    # When guardrail intervenes either the action on the request is BLOCKED or NONE
    # Here we check how many of the violations lead to blocking the request
    severe_violations = [violation['action']=='BLOCKED' for violation in violations]
    return sum(severe_violations)

def is_policy_assessement_blocked(assessments):
    # While creating the guardrail you could specify multiple types of policies.
    # At the time of assessment all the policies should be checked for potential violations
    # If there is even 1 violation that blocks the request, the entire request is blocked
    blocked = []
    for assessment in assessments:
        if 'topicPolicy' in assessment:
            blocked.append(check_severe_violations(assessment['topicPolicy']['topics']))
        if 'wordPolicy' in assessment:
            if 'customWords' in assessment['wordPolicy']:
                blocked.append(check_severe_violations(assessment['wordPolicy']['customWords']))
            if 'managedWordLists' in assessment['wordPolicy']:
                blocked.append(check_severe_violations(assessment['wordPolicy']['managedWordLists']))
        if 'sensitiveInformationPolicy' in assessment:
            if 'piiEntities' in assessment['sensitiveInformationPolicy']:
                blocked.append(check_severe_violations(assessment['sensitiveInformationPolicy']['piiEntities']))
            if 'regexes' in assessment['sensitiveInformationPolicy']:
                blocked.append(check_severe_violations(assessment['sensitiveInformationPolicy']['regexes']))
        if 'contentPolicy' in assessment:
            blocked.append(check_severe_violations(assessment['contentPolicy']['filters']))
    severe_violation_count = sum(blocked)
    logger.error(f"::Guardrail:: {severe_violation_count} severe violations detected")
    
    return severe_violation_count>0


def apply_guardrail(text, text_source_type, guardrail_id, guardrail_version="DRAFT"):
    logger.success(f"::Guardrail:: Applying guardrail with {(len(text)//TEXT_UNIT)+1} text units")
    response = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version, 
        source=text_source_type, # can be 'INPUT' or 'OUTPUT'
        content=[{"text": {"text": text}}]
    )
    if response['action'] == 'GUARDRAIL_INTERVENED':
        is_blocked = is_policy_assessement_blocked(response['assessments'])
        alternate_text = ' '.join([output['text'] for output in response['outputs']])
        return is_blocked, alternate_text, response
    else:
        # Return the default response in case of no guardrail intervention
        return False, text, response


def apply_guardrail_full_text(text, text_source_type, guardrail_id, guardrail_version="DRAFT"):
    text_length = len(text)
    filtered_text = ''
    if text_length <= LIMIT_TEXT_UNIT*TEXT_UNIT:
        return apply_guardrail(text, text_source_type, guardrail_id, guardrail_version)
    else:
        # If the text length is greater than the default text unit limits then it's better to chunk the text to avoid throttling.
        for i, chunk in enumerate(wrap(text, LIMIT_TEXT_UNIT*TEXT_UNIT)):
            print(f'::Guardrail::Applying guardrails at chunk {i+1}')
            is_blocked, alternate_text, response = apply_guardrail(chunk, text_source_type, guardrail_id, guardrail_version)
            if is_blocked:
                filtered_text = alternate_text
                break
            # It could be the case that guardrails intervened and anonymized PII in the input text,
            # we can then take the output from guardrails to create filtered text response.
            filtered_text += alternate_text
        return is_blocked, filtered_text, response

In [None]:
from botocore.exceptions import ClientError

def stream_conversation(messages,
                        system_prompts,
                        inference_config,
                        additional_model_fields):
    
    response = bedrock_runtime.converse_stream(
        modelId=MODEL_ID,
        messages=messages,
        system=system_prompts,
        inferenceConfig=inference_config,
        additionalModelRequestFields=additional_model_fields
    )

    stream = response.get('stream')
    full_text = ""
    buffer_text = ""
    applied_guardrails = []
    if stream:
        for event in stream:
            if 'messageStart' in event:
                logger.info(f"Role: {event['messageStart']['role']}")

            if 'contentBlockDelta' in event:
                new_text = event['contentBlockDelta']['delta']['text']

                if len(buffer_text + new_text) > TEXT_UNIT:
                    is_blocked, alt_text, guardrail_response = apply_guardrail(buffer_text, "OUTPUT", guardrail_id, guardrail_version)
                    if is_blocked:
                        event['messageStop'] = {
                            'stopReason': guardrail_response['action'], 
                            'output': alt_text,
                            'assessments': guardrail_response['assessments'],
                        }
                        full_text = alt_text
                    else:
                        full_text += alt_text
                    print(alt_text, end="")
                    applied_guardrails.append(guardrail_response)
                    buffer_text = new_text
                else: 
                    buffer_text += new_text

            if 'messageStop' in event:
                if event['messageStop']['stopReason'] == 'GUARDRAIL_INTERVENED':
                    logger.warning(f"Stop reason: {event['messageStop']['stopReason']}")
                    break
                else:
                    logger.warning(f"Stop reason: {event['messageStop']['stopReason']}")
                    # Flush whatever is still buffered so the tail of the response is checked too
                    if buffer_text:
                        is_blocked, alt_text, guardrail_response = apply_guardrail(buffer_text, "OUTPUT", guardrail_id, guardrail_version)
                        # A blocked tail replaces the whole response; otherwise append the (possibly anonymized) text
                        full_text = alt_text if is_blocked else full_text + alt_text
                        print(alt_text)
                        buffer_text = ""
                        if 'metadata' not in event:
                            event['metadata'] = {}
                        event['metadata']['guardrails_usage'] = guardrail_response['usage']
                        applied_guardrails.append(guardrail_response)

            if 'metadata' in event:
                metadata = event['metadata']
                if 'usage' in metadata:
                    logger.info("Token usage")
                    print(f"Input tokens: {metadata['usage']['inputTokens']}")
                    print(f"Output tokens: {metadata['usage']['outputTokens']}")
                    print(f"Total tokens: {metadata['usage']['totalTokens']}")
                    print(f"Total text units: {(len(full_text)//TEXT_UNIT)+1}")
                if 'metrics' in event['metadata']:
                    print(f"Latency: {metadata['metrics']['latencyMs']} milliseconds")
                if 'guardrails_usage' in event['metadata']:
                    print(event['metadata']['guardrails_usage'])
    return full_text, applied_guardrails


def generate(input_message):

    system_prompt = """You are an assistant that helps with tasks from users. Be as elaborate as possible"""

    message = {
        "role": "user",
        "content": [{"text": input_message}]
    }
    messages = [message]
    
    # System prompts.
    system_prompts = [{"text" : system_prompt}]

    # inference parameters to use.
    temperature = 0.5

    # Base inference parameters.
    inference_config = {
        "temperature": temperature
    }
    # Additional model inference parameters.
    additional_model_fields = {}

    try:
        full_text, applied_guardrails = stream_conversation(messages, system_prompts, inference_config, additional_model_fields)
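    except ClientError as err:
        error_message = err.response['Error']['Message']
        logger.error(f"A client error occurred: {error_message}")
        # Return empty results so the final return statement still has values
        full_text, applied_guardrails = "", []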

    else:
        logger.info(f"Finished streaming messages with model {MODEL_ID}.")
        
    return full_text, applied_guardrails

In [None]:
sample_1 = "List 3 names of prominent CEOs and later tell me what is a bank and what are the benefits of opening a savings account?"
full_text, applied_guardrails = generate(sample_1)

We can clearly see that the guardrail intervened above and anonymized some names in the generated text. Let's examine which assessments the guardrail performed.

In [None]:
# Pretty Print
pp = pprint.PrettyPrinter(indent=4)

#### Guardrail Assessment:

In [None]:
for guardrail in applied_guardrails:
    if guardrail['action']!='NONE':
        pp.pprint(guardrail['assessments'])

The output above shows that sensitiveInformationPolicy was invoked, anonymizing the names generated by the model and cleaning up the output.
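
For longer responses, reading the raw assessments can get tedious. The sketch below tallies how many findings each policy produced, grouped by action. The helper `summarize_actions` is hypothetical, the policy and list keys mirror the ones this notebook inspects in its own helper functions, and the sample assessment is hand-written for illustration rather than real API output.

```python
from collections import Counter

# (policy key, list key) pairs, mirroring the fields checked elsewhere in this notebook
POLICY_PATHS = [
    ("topicPolicy", "topics"),
    ("wordPolicy", "customWords"),
    ("wordPolicy", "managedWordLists"),
    ("sensitiveInformationPolicy", "piiEntities"),
    ("sensitiveInformationPolicy", "regexes"),
    ("contentPolicy", "filters"),
]


def summarize_actions(assessments):
    """Count findings per (policy, list, action) across all assessments."""
    counts = Counter()
    for assessment in assessments:
        for policy, field in POLICY_PATHS:
            for finding in assessment.get(policy, {}).get(field, []):
                counts[(policy, field, finding.get("action", "UNKNOWN"))] += 1
    return counts


if __name__ == "__main__":
    # Hand-written sample for illustration only, not real API output
    sample = [{"sensitiveInformationPolicy": {"piiEntities": [
        {"type": "NAME", "action": "ANONYMIZED"},
        {"type": "NAME", "action": "ANONYMIZED"},
    ]}}]
    for key, count in summarize_actions(sample).items():
        print(key, count)
```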

Now we can try a different scenario, where the input contains a request for fiduciary advice, and observe how the guardrail is applied.

In [None]:
sample_2 = "Tell me about why financial independence is important and only at the very end ask the question if you can help me to invest after retirement?"
full_text, applied_guardrails = generate(sample_2)

The output above shows that the guardrail intervened. Now let's examine which policies were violated by looking at the assessments, which are part of the ApplyGuardrail API response.

In [None]:
for guardrail in applied_guardrails:
    if guardrail['action']!='NONE':
        pp.pprint(guardrail['assessments'])

### INPUT - Small Document
Now we can test how the guardrail applies to a small document. We will use the 2023 Amazon shareholder letter; this document does not contain any text that should cause the guardrail to intervene.

In [None]:
# Use a context manager so the file handle is closed after reading
with open('./data/guardrails/shareholder_letter.txt', 'r') as f:
    letter = f.read()
print(f"Length of the document: {len(letter)} characters")

In [None]:
blocked, new_text, guardrail_response = apply_guardrail_full_text(letter, "INPUT", guardrail_id, guardrail_version)
print(f"\nBlocked by guardrail: {'Yes' if blocked else 'No'}")
if blocked:
    print(f'Guardrail Output: {new_text}')
elif guardrail_response['action']=='GUARDRAIL_INTERVENED' and not blocked:
    print(f'Filtered Text Snippet: {new_text[:5000]}')

pp.pprint(guardrail_response['assessments'])

### INPUT - Large Document
Next we test with a different document containing a fictional financial story generated with the help of an LLM. To increase the document length, we concatenate the shareholder letter and the financial story. This demonstrates splitting the document into chunks and applying the guardrail to each chunk individually.

In [None]:
# Use a context manager so the file handle is closed after reading
with open('./data/guardrails/financial_story.txt', 'r') as f:
    financial_story = f.read()
large_text = letter + financial_story
print(f"Length of the document: {len(large_text)} characters")

In [None]:
blocked, new_text, guardrail_response = apply_guardrail_full_text(large_text, "INPUT", guardrail_id, guardrail_version)
print(f"\nBlocked by guardrail: {'Yes' if blocked else 'No'}")
if blocked:
    print(f'Guardrail Output: {new_text}')
elif guardrail_response['action']=='GUARDRAIL_INTERVENED' and not blocked:
    print(f'Filtered Text Snippet: {new_text[:5000]}')

pp.pprint(guardrail_response['assessments'])
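
Processing chunks one after another does not by itself guarantee staying under a per-second quota, because several calls can still land within the same second. A minimal pacing sketch follows, assuming each chunk is close to the full per-second allowance so that one call per second is a safe spacing. The function `process_paced` and its parameters are hypothetical, and `handler` stands in for whatever function applies the guardrail to a chunk.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def process_paced(
    chunks: Iterable[str],
    handler: Callable[[str], T],
    min_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[T]:
    """Call `handler` on each chunk, keeping at least `min_interval` seconds between call starts."""
    results: List[T] = []
    last_start = None
    for chunk in chunks:
        if last_start is not None:
            remaining = min_interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(handler(chunk))
    return results


if __name__ == "__main__":
    # len() stands in for a real guardrail call; a short interval keeps the demo quick
    print(process_paced(["first chunk", "second chunk"], len, min_interval=0.1))
```

The `sleep` and `clock` parameters are injectable so the pacing logic can be exercised without real delays.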

---
# Don't forget to delete the Guardrails!

In [None]:
response = bedrock_client.delete_guardrail(
    guardrailIdentifier=guardrail_id
)
print(response['ResponseMetadata']['HTTPStatusCode'])