# Apply Guardrail API - Boto3 Python Code Walkthrough

----

Guardrails can be used to implement safeguards for your generative AI applications that are customized to your use cases and aligned with your responsible AI policies. Guardrails allows you to:

- Configure denied topics
- Filter harmful content
- Remove sensitive information

For more information on publicly available capabilities:

- [Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html)
- [Guardrail Policies](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-components.html)
- [Pricing](https://aws.amazon.com/bedrock/pricing/)
- [WebPage](https://aws.amazon.com/bedrock/guardrails/)

## The new `ApplyGuardrail` API allows customers to assess any text using their pre-configured Bedrock Guardrails, without invoking the foundation models.

### Key Features:

1. **Content Validation**: Send any text input or output to the ApplyGuardrail API to have it evaluated against your defined topic avoidance rules, content filters, PII detectors, and word blocklists. You can evaluate user inputs and FM-generated outputs independently.

2. **Flexible Deployment**: Integrate the ApplyGuardrail API anywhere in your application flow to validate data before processing it or serving results to users. For example, in a RAG application, you can now evaluate the user input before performing the retrieval instead of waiting until the final response generation.

3. **Decoupled from Foundation Models**: ApplyGuardrail is decoupled from foundation models, so you can use Guardrails without invoking one.

You can use the assessment results to design the experience in your generative AI application. Let's now walk through a code sample.
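
Before the walkthrough, here is a rough sketch of what acting on assessment results might look like. It assumes the response carries an `assessments` list whose entries are keyed by policy type (for example `topicPolicy`); the helper name `triggered_policies` and the `sample_response` dictionary are hypothetical and hand-written for illustration, so compare them against the real response printed later in this notebook.

```python
from typing import Any, Dict, List


def triggered_policies(response: Dict[str, Any]) -> List[str]:
    """Return the sorted policy sections that appear in the assessments."""
    names = set()
    for assessment in response.get("assessments", []):
        for key, value in assessment.items():
            # Assumption: policy sections are the keys ending in "Policy";
            # other keys (such as metrics) are ignored.
            if key.endswith("Policy") and value:
                names.add(key)
    return sorted(names)


# Hand-written stand-in for an ApplyGuardrail response (not real API output).
sample_response = {
    "action": "GUARDRAIL_INTERVENED",
    "outputs": [{"text": "Sorry, I cannot discuss investment advice."}],
    "assessments": [
        {
            "topicPolicy": {
                "topics": [
                    {"name": "Investment Advice", "type": "DENY", "action": "BLOCKED"}
                ]
            }
        }
    ],
}

print(triggered_policies(sample_response))  # ['topicPolicy']
```

An application could use a list like this to pick a different user-facing message per policy type, or simply to log which safeguard fired.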

In [None]:
#Start by installing the dependencies to ensure we have a recent version
!pip install --upgrade --force-reinstall boto3
import boto3
print(boto3.__version__)

### Important: Create a Guardrail First

Before running the code to apply a guardrail, you need to create a guardrail in Amazon Bedrock. If you haven't created a guardrail yet, please follow these steps:

1. Visit the following GitHub notebook for detailed instructions on creating and using guardrails:
   [Guardrails for Amazon Bedrock Samples](https://github.com/aws-samples/amazon-bedrock-samples/blob/main/responsible-ai/guardrails-for-amazon-bedrock-samples/guardrails-api.ipynb)

2. Follow the instructions in the notebook to create your guardrail.

3. Make note of the `guardrail_id` and `guardrail_version` that you create, as you'll need these values for the code in this notebook.

4. Once you have created your guardrail and have the necessary information, you can return to this notebook and run the code to apply the guardrail.

Remember: The `guardrail_id` and `guardrail_version` variables in the code must be set to the values of the guardrail you created before running the API call.
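
Because both variables start out as empty strings in the next cell, a small pre-flight check can catch a forgotten value before the API call fails. The helper below is a hypothetical sketch, not part of boto3, and the values `"abc123"` and `"1"` are placeholders rather than a real guardrail.

```python
def require_guardrail_config(guardrail_id: str, guardrail_version: str) -> None:
    """Raise ValueError if either guardrail setting is empty or whitespace."""
    settings = (
        ("guardrail_id", guardrail_id),
        ("guardrail_version", guardrail_version),
    )
    missing = [name for name, value in settings if not value.strip()]
    if missing:
        raise ValueError(f"Set {', '.join(missing)} before calling ApplyGuardrail")


# Placeholder values for illustration only.
require_guardrail_config("abc123", "1")  # passes silently

try:
    require_guardrail_config("", "1")
except ValueError as err:
    print(err)  # Set guardrail_id before calling ApplyGuardrail
```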

In [None]:
import boto3
import json
from botocore.exceptions import ClientError
from typing import Dict, Any


bedrock_runtime = boto3.client('bedrock-runtime')

# Specific guardrail ID and version
guardrail_id = ""  # Replace with your guardrail ID
guardrail_version = ""  # Replace with your guardrail version

In [None]:
# Example of Input Prompt being Analyzed
content = [
    {
        "text": {
            "text": "Is the AB503 Product a better investment than the S&P 500?"
        }
    }
]

# Here's an example of something that should pass

# content = [
#     {
#         "text": {
#             "text": "What is the rate you offer for the AB503 Product?"
#         }
#     }
# ]

# Call the ApplyGuardrail API
try:
    response = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source='INPUT',  # or 'OUTPUT' depending on your use case
        content=content
    )
    
    # Process the response
    print("API Response:")
    print(json.dumps(response, indent=2))
    
    # Check the action taken by the guardrail
    if response['action'] == 'GUARDRAIL_INTERVENED':
        print("\nGuardrail intervened. Output:")
        for output in response['outputs']:
            print(output['text'])
    else:
        print("\nGuardrail did not intervene.")
    
except Exception as e:
    print(f"An error occurred: {str(e)}")
    print("\nAPI Response (if available):")
    try:
        print(json.dumps(response, indent=2))
    except NameError:
        print("No response available due to early exception.")


In [None]:
# An example of analyzing an output response, this time using contextual grounding

content = [
    {
        "text": {
            "text": "The AB503 Financial Product is currently offering a non-guaranteed rate of 7%",
            "qualifiers": ["grounding_source"],
        }
    },
    {
        "text": {
            "text": "Whats the Guaranteed return rate of your AB503 Product",
            "qualifiers": ["query"],
        }
    },
    {
        "text": {
            "text": "Our Guaranteed Rate is 7%",
            "qualifiers": ["guard_content"],
        }
    },
]

# Call the ApplyGuardrail API
try:
    response = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source='OUTPUT',  # or 'INPUT' depending on your use case
        content=content
    )
    
    # Process the response
    print("API Response:")
    print(json.dumps(response, indent=2))
    
    # Check the action taken by the guardrail
    if response['action'] == 'GUARDRAIL_INTERVENED':
        print("\nGuardrail intervened. Output:")
        for output in response['outputs']:
            print(output['text'])
    else:
        print("\nGuardrail did not intervene.")

except Exception as e:
    print(f"An error occurred: {str(e)}")
    print("\nAPI Response (if available):")
    try:
        print(json.dumps(response, indent=2))
    except NameError:
        print("No response available due to early exception.")

## Using ApplyGuardrail API with a Third-Party or Self-Hosted Model

A common use case for the ApplyGuardrail API is pairing it with a language model from a provider other than Amazon Bedrock, or with a model that you self-host. This combination allows you to apply guardrails to the input or output of any request.

The general flow would be:
1. Receive an input for your Model
2. Apply the guardrail to this input using the ApplyGuardrail API
3. If the input passes the guardrail, send it to your Model for Inference
4. Receive the output from your Model
5. Apply the Guardrail to your output
6. Return the final (potentially modified) output
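
The steps above can be sketched independently of any particular model or SDK. In this minimal sketch, `guarded_inference`, `keyword_check`, and `echo_model` are hypothetical stand-ins: a real application would replace the checks with calls to the ApplyGuardrail API and the model function with a call to its own endpoint.

```python
from typing import Callable, Tuple

# A check returns (passed, message); message is shown to the user when blocked.
Check = Callable[[str], Tuple[bool, str]]


def guarded_inference(
    prompt: str,
    check_input: Check,
    call_model: Callable[[str], str],
    check_output: Check,
) -> str:
    """Run input check -> model -> output check, returning text for the user."""
    passed, message = check_input(prompt)
    if not passed:
        return message  # The model is never called for a blocked prompt

    output = call_model(prompt)

    passed, message = check_output(output)
    if not passed:
        return message
    return output


# Stand-in guardrail: blocks any text that mentions "investment".
def keyword_check(text: str) -> Tuple[bool, str]:
    if "investment" in text.lower():
        return False, "Sorry, I can't help with that."
    return True, ""


# Stand-in model: echoes the prompt back.
def echo_model(prompt: str) -> str:
    return f"Echo: {prompt}"


print(guarded_inference("How do I save for retirement?", keyword_check, echo_model, keyword_check))
print(guarded_inference("Is AB503 a better investment?", keyword_check, echo_model, keyword_check))
```

Passing the checks and the model in as callables keeps the control flow testable without network access.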

### Here's a diagram illustrating this process:

<div style="text-align: center;">
    <img src="images/applyguardrail.png" alt="ApplyGuardrail API Flow" style="max-width: 100%;">
</div>

Let's walk through this with a code example that demonstrates this process.

### For our examples today we will use a Self-Hosted SageMaker Model, but this could be any third-party model as well

We will use the `Meta-Llama-3-8B` model hosted on a SageMaker Endpoint. To deploy your own version of this model on Amazon SageMaker, please check out the guide here: [Meta Llama 3 models are now available in Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/meta-llama-3-models-are-now-available-in-amazon-sagemaker-jumpstart/)

In [None]:
# Configure our Endpoint to take Requests
from sagemaker.predictor import retrieve_default
endpoint_name = "" # Adjust this line with the name of your Endpoint
predictor = retrieve_default(endpoint_name)

In [None]:
payload = {
    "inputs": "How do I save for retirement?",
    "parameters": {
        "max_new_tokens": 256,
        "temperature": 0.0,
        "stop": "<|eot_id|>"
    }
}
response = predictor.predict(payload)
print(response)

### Incorporating the ApplyGuardrail API into our Self-Hosted Model

---
We've created a `TextGenerationWithGuardrails` class that integrates the ApplyGuardrail API with our SageMaker endpoint to ensure protected text generation. This class includes the following key methods:

1. `generate_text`: Calls our Language Model via a SageMaker endpoint to generate text based on the input.

2. `analyze_text`: A core method that applies our guardrail using the ApplyGuardrail API. It interprets the API response to determine if the guardrail passed or intervened.

3. `analyze_prompt` and `analyze_output`: These methods use `analyze_text` to apply our guardrail to the input prompt and generated output, respectively. They return a tuple indicating whether the guardrail passed and any associated message.

The class implements the flow shown in the diagram above. It works as follows:

1. It first checks the input prompt using `analyze_prompt`.
2. If the input passes the guardrail, it generates text using `generate_text`.
3. The generated text is then checked using `analyze_output`.
4. If both guardrails pass, the generated text is returned. Otherwise, an intervention message is provided.

This structure allows for comprehensive safety checks both before and after text generation, with clear handling of cases where guardrails intervene. It's designed to easily integrate with larger applications while providing flexibility for error handling and customization based on guardrail results.
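
One error-handling choice left to the caller is what to do when the guardrail call itself fails. The sketch below shows a "fail closed" option, where a failed check is treated as a block. The helper `check_fail_closed` and its fallback message are assumptions for illustration. It catches a broad `Exception` only to stay self-contained; with boto3 you would more likely catch `botocore.exceptions.ClientError`.

```python
from typing import Callable, Tuple


def check_fail_closed(
    check: Callable[[str], Tuple[bool, str]],
    text: str,
    fallback_message: str = "Unable to process this request right now.",
) -> Tuple[bool, str]:
    """Run a guardrail check; if the check raises, treat the text as blocked."""
    try:
        return check(text)
    except Exception:
        # Assumption: blocking is safer than letting unchecked text through.
        return False, fallback_message


# Stand-in check that simulates an unavailable guardrail service.
def unavailable_check(text: str) -> Tuple[bool, str]:
    raise RuntimeError("guardrail service unavailable")


print(check_fail_closed(unavailable_check, "How do I save for retirement?"))
```

Whether to fail closed or fail open depends on the application; the class below re-raises the error and lets the caller decide.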

In [None]:
import boto3
from botocore.exceptions import ClientError
from typing import Tuple, List, Dict, Any

class TextGenerationWithGuardrails:
    def __init__(self, endpoint_name: str, guardrail_id: str, guardrail_version: str):
        self.predictor = retrieve_default(endpoint_name)
        self.bedrock_runtime = boto3.client('bedrock-runtime')
        self.guardrail_id = guardrail_id
        self.guardrail_version = guardrail_version

    def generate_text(self, inputs: str, max_new_tokens: int = 256, temperature: float = 0.0) -> str:
        """Generate text using the specified SageMaker endpoint."""
        payload = {
            "inputs": inputs,
            "parameters": {
                "max_new_tokens": max_new_tokens,
                "temperature": temperature,
                "stop": "<|eot_id|>"
            }
        }
    
        response = self.predictor.predict(payload)
        return response.get('generated_text', '')

    def analyze_text(self, grounding_source: str, query: str, guard_content: str, source: str) -> Tuple[bool, str, Dict[str, Any]]:
        """
        Analyze text using the ApplyGuardrail API with contextual grounding.
        Returns a tuple (passed, message, details) where:
        - passed is a boolean indicating if the guardrail passed,
        - message is either the guardrail message or an empty string,
        - details is a dictionary containing the full API response for further analysis if needed.
        """
        try:
            content = [
                {
                    "text": {
                        "text": grounding_source,
                        "qualifiers": ["grounding_source"]
                    }
                },
                {
                    "text": {
                        "text": query,
                        "qualifiers": ["query"]
                    }
                },
                {
                    "text": {
                        "text": guard_content,
                        "qualifiers": ["guard_content"]
                    }
                }
            ]

            response = self.bedrock_runtime.apply_guardrail(
                guardrailIdentifier=self.guardrail_id,
                guardrailVersion=self.guardrail_version,
                source=source,
                content=content
            )
            
            action = response.get("action", "")
            if action == "NONE":
                return True, "", response
            elif action == "GUARDRAIL_INTERVENED":
                message = response.get("outputs", [{}])[0].get("text", "Guardrail intervened")
                return False, message, response
            else:
                return False, f"Unknown action: {action}", response
        except ClientError as e:
            print(f"Error applying guardrail: {e}")
            raise

    def analyze_prompt(self, grounding_source: str, query: str) -> Tuple[bool, str, Dict[str, Any]]:
        """Analyze the input prompt."""
        return self.analyze_text(grounding_source, query, query, "INPUT")

    def analyze_output(self, grounding_source: str, query: str, generated_text: str) -> Tuple[bool, str, Dict[str, Any]]:
        """Analyze the generated output."""
        return self.analyze_text(grounding_source, query, generated_text, "OUTPUT")

    def generate_and_analyze(self, grounding_source: str, query: str, max_new_tokens: int = 256, temperature: float = 0.0) -> Tuple[bool, str, str]:
        """
        Generate text and analyze it with guardrails.
        Returns a tuple (passed, message, generated_text) where:
        - passed is a boolean indicating if the guardrail passed,
        - message is either the guardrail message or an empty string,
        - generated_text is the text generated by the model (if guardrail passed) or an empty string.
        """
        # First, analyze the prompt
        prompt_passed, prompt_message, _ = self.analyze_prompt(grounding_source, query)
        if not prompt_passed:
            return False, prompt_message, ""

        # If prompt passes, generate text
        generated_text = self.generate_text(query, max_new_tokens, temperature)

        # Analyze the generated text
        output_passed, output_message, _ = self.analyze_output(grounding_source, query, generated_text)
        if not output_passed:
            return False, output_message, ""

        return True, "", generated_text

### Now let's see a Sample Usage in action 

In [None]:
def main():
    query = "What is the Guaranteed Rate of Return for AB503 Product?"
    grounding_source = "The AB503 Financial Product is currently offering a non-guaranteed rate of 7%"
    max_new_tokens = 512  # You can change this value as needed
    temperature = 0.0  # Default value, can be edited

    text_gen = TextGenerationWithGuardrails(
        endpoint_name=endpoint_name,
        guardrail_id=guardrail_id,
        guardrail_version=guardrail_version
    )

    # Bold text function
    def bold(text):
        return f"\033[1m{text}\033[0m"

    # Analyze input
    print(bold("\n=== Input Analysis ===\n"))
    input_passed, input_message, input_details = text_gen.analyze_prompt(grounding_source, query)
    if not input_passed:
        print(f"Input Guardrail Intervened. The response to the User is: {input_message}\n")
        print("Full API Response:")
        print(json.dumps(input_details, indent=2))
        print()
        return
    else:
        print("Input Prompt Passed The Guardrail Check - Moving to Generate the Response\n")

    # Generate text
    print(bold("\n=== Text Generation ===\n"))
    generated_text = text_gen.generate_text(query, max_new_tokens=max_new_tokens, temperature=temperature)
    print(f"Here is what the Model Responded with: {generated_text}\n")

    # Analyze output
    print(bold("\n=== Output Analysis ===\n"))
    print("Analyzing Model Response with the Response Guardrail\n")
    output_passed, output_message, output_details = text_gen.analyze_output(grounding_source, query, generated_text)
    if not output_passed:
        print(f"Output Guardrail Intervened. The response to the User is: {output_message}\n")
        print("Full API Response:")
        print(json.dumps(output_details, indent=2))
        print()
    else:
        print(f"Model Response Passed. The information presented to the user is: {generated_text}\n")

if __name__ == "__main__":
    main()

## Using ApplyGuardrail API within a Self-Managed RAG Pattern

Another common use case for the ApplyGuardrail API is pairing it with a language model from a provider other than Amazon Bedrock, or with a model that you self-host, within a Retrieval Augmented Generation (RAG) pattern.

The general flow would be:
1. Receive an input for your Model
2. Apply the guardrail to this input using the ApplyGuardrail API
3. If the input passes the guardrail, send it to your Embeddings Model for Query Embedding, and Query your Vector Embeddings
4. Receive the output from your Embeddings Model
5. Provide it as Context for your Language Model
6. Return the final (potentially modified) output

### Here's a diagram illustrating this process:

<div style="text-align: center;">
    <img src="images/managed_rag.png" alt="ApplyGuardrail API RAG Flow" style="max-width: 100%;">
</div>

Let's walk through this with a code example that demonstrates this process.

### For our examples today we will use a Self-Hosted SageMaker Model for our Large Language Model, but this could be any third-party model as well, and a third-party embeddings model hosted on VoyageAI

We will use the `Meta-Llama-3-8B` model hosted on a SageMaker Endpoint. To deploy your own version of this model on Amazon SageMaker, please check out the guide here: [Meta Llama 3 models are now available in Amazon SageMaker JumpStart](https://aws.amazon.com/blogs/machine-learning/meta-llama-3-models-are-now-available-in-amazon-sagemaker-jumpstart/). For embeddings, we'll use the `voyage-2` model, which is the model name passed to `vo.embed` in the code below. To learn more about VoyageAI embeddings models, check them out here: [Voyage AI](https://www.voyageai.com/)

In [None]:
# Let's Start by Embedding our Source Documents by creating a session with the VoyageAI SDK
import voyageai
vo = voyageai.Client()

In [None]:
# We have some sample documents here with descriptions of our products 
documents = [
    "The AG701 Global Growth Fund is currently projecting an annual return of 8.5%, focusing on emerging markets and technology sectors.",
    "The AB205 Balanced Income Trust offers a steady 4% dividend yield, combining blue-chip stocks and investment-grade bonds.",
    "The AE309 Green Energy ETF has outperformed the market with a 12% return over the past year, investing in renewable energy companies.",
    "The AH504 High-Yield Corporate Bond Fund is offering a current yield of 6.75%, targeting BB and B rated corporate debt.",
    "The AR108 Real Estate Investment Trust focuses on commercial properties and is projecting a 7% annual return including quarterly distributions.",
    "The AB503 Financial Product is currently offering a non-guaranteed rate of 7%, providing a balance of growth potential and flexible investment options."
]

In [None]:
# Embed the documents
documents_embeddings = vo.embed(documents, model="voyage-2", input_type="document").embeddings

In [None]:
# Let's ask a question
query = "What is the return rate on AB503?"

In [None]:
# Embed the query
query_embedding = vo.embed([query], model="voyage-2", input_type="query").embeddings[0]

In [None]:
# Sample KNN implementation to find the documents most relevant to the query.
# This is typically done at the vector database level.

from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

def k_nearest_neighbors(query_embedding, documents_embeddings, k=5):
    # Convert the inputs to NumPy arrays
    query_embedding = np.array(query_embedding)
    documents_embeddings = np.array(documents_embeddings)

    # Reshape the query embedding to shape (1, n) so it is compatible with cosine_similarity
    query_embedding = query_embedding.reshape(1, -1)

    # Calculate the similarity between the query and each document
    cosine_sim = cosine_similarity(query_embedding, documents_embeddings)

    # Sort the document indices by similarity in descending order
    sorted_indices = np.argsort(cosine_sim[0])[::-1]

    # Take the top k related embeddings
    top_k_related_indices = sorted_indices[:k]
    top_k_related_embeddings = documents_embeddings[top_k_related_indices]
    top_k_related_embeddings = [list(row[:]) for row in top_k_related_embeddings]  # convert to list

    return top_k_related_embeddings, top_k_related_indices

In [None]:
# Get the most relevant documents

retrieved_embd, retrieved_embd_index = k_nearest_neighbors(query_embedding, documents_embeddings, k=1)
retrieved_doc = [documents[index] for index in retrieved_embd_index]

print(retrieved_doc)

### Incorporating Embeddings, Document Retrieval, and the ApplyGuardrail API into our Self-Hosted Model
---
We've enhanced our `TextGenerationWithGuardrails` class to integrate embeddings, document retrieval, and the ApplyGuardrail API with our SageMaker endpoint. This ensures protected text generation with contextually relevant information. The class now includes the following key methods:

1. `generate_text`: Calls our Language Model via a SageMaker endpoint to generate text based on the input.
2. `analyze_text`: A core method that applies our guardrail using the ApplyGuardrail API. It interprets the API response to determine if the guardrail passed or intervened.
3. `analyze_prompt` and `analyze_output`: These methods use `analyze_text` to apply our guardrail to the input prompt and generated output, respectively. They return a tuple indicating whether the guardrail passed and any associated message.
4. `embed_text`: Embeds the given text using a specified embedding model.
5. `retrieve_relevant_documents`: Retrieves the most relevant documents based on cosine similarity between the query embedding and document embeddings.
6. `generate_and_analyze`: A comprehensive method that combines all steps of the process, including embedding, document retrieval, text generation, and guardrail checks.
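
To make the retrieval step concrete without any dependencies, here is a minimal pure-Python sketch of cosine-similarity ranking. The function names `cosine_sim` and `top_k_indices` and the tiny two-dimensional vectors are made up for illustration; the class in this notebook uses NumPy and scikit-learn instead.

```python
import math
from typing import List, Sequence


def cosine_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_indices(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]], k: int = 1
) -> List[int]:
    """Return the indices of the k documents most similar to the query."""
    scores = [cosine_sim(query_vec, doc) for doc in doc_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy embeddings: document 0 points almost the same way as the query.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query_vec = [0.9, 0.1]

print(top_k_indices(query_vec, doc_vecs, k=2))  # [0, 2]
```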

The enhanced class implements the following workflow:

1. It first embeds the query and retrieves the most relevant documents.
2. It checks the input prompt using `analyze_prompt`, with the retrieved documents as the grounding source.
3. If the input passes the guardrail, the retrieved documents are appended to the original query to create an enhanced query.
4. Text is generated using `generate_text` with the enhanced query.
5. The generated text is then checked using `analyze_output`, with the retrieved documents serving as the grounding source.
6. If both guardrails pass, the generated text is returned. Otherwise, an intervention message is provided.
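
The grounding check relies on labeling each piece of text with a qualifier, as in the contextual grounding cell earlier in this notebook. A small helper along these lines can build that payload; `build_grounding_content` is a hypothetical name, and the structure simply mirrors the `content` list used in that cell.

```python
from typing import Any, Dict, List


def build_grounding_content(
    grounding_source: str, query: str, guard_content: str
) -> List[Dict[str, Any]]:
    """Build the content list for a contextual grounding check."""

    def block(text: str, qualifier: str) -> Dict[str, Any]:
        return {"text": {"text": text, "qualifiers": [qualifier]}}

    return [
        block(grounding_source, "grounding_source"),  # retrieved reference text
        block(query, "query"),                        # the user's question
        block(guard_content, "guard_content"),        # the text being evaluated
    ]


content = build_grounding_content(
    "The AB503 Financial Product is currently offering a non-guaranteed rate of 7%",
    "What is the return rate on AB503?",
    "Our Guaranteed Rate is 7%",
)
print([item["text"]["qualifiers"][0] for item in content])
```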

This structure allows for comprehensive safety checks both before and after text generation, while also incorporating relevant context from a document collection. It's designed to:

- Ensure safety through multiple guardrail checks
- Enhance relevance by incorporating retrieved documents into the generation process
- Provide flexibility for error handling and customization based on guardrail results
- Easily integrate with larger applications

The class can be further customized to adjust the number of retrieved documents, modify the embedding process, or alter how retrieved documents are incorporated into the query. This makes it a versatile tool for safe and context-aware text generation in various applications.
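
As one possible customization, the way retrieved documents are folded into the prompt can be pulled into its own function. This is a sketch under assumptions: `build_enhanced_query`, the `max_docs` parameter, and the bulleted layout are illustrative choices, while the "Relevant information:" label follows the format used by the class below.

```python
from typing import List


def build_enhanced_query(query: str, retrieved_docs: List[str], max_docs: int = 3) -> str:
    """Append up to max_docs retrieved documents to the query as bulleted context."""
    context = "\n".join(f"- {doc}" for doc in retrieved_docs[:max_docs])
    if not context:
        return query  # Nothing retrieved, so send the query unchanged
    return f"{query}\n\nRelevant information:\n{context}"


print(
    build_enhanced_query(
        "What is the return rate on AB503?",
        ["The AB503 Financial Product is currently offering a non-guaranteed rate of 7%"],
    )
)
```

Capping the number of documents keeps the prompt within the model's context budget when retrieval returns more than one match.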

In [None]:
class TextGenerationWithGuardrails:
    def __init__(self, endpoint_name: str, guardrail_id: str, guardrail_version: str, embedding_model: str = "voyage-2"):
        self.predictor = retrieve_default(endpoint_name)
        self.bedrock_runtime = boto3.client('bedrock-runtime')
        self.guardrail_id = guardrail_id
        self.guardrail_version = guardrail_version
        self.embedding_model = embedding_model

    def generate_text(self, inputs: str, max_new_tokens: int = 256, temperature: float = 0.0) -> str:
        """Generate text using the specified SageMaker endpoint."""
        payload = {
            "inputs": inputs,
            "parameters": {
                "max_new_tokens": max_new_tokens,
                "temperature": temperature,
                "stop": "<|eot_id|>"
            }
        }
    
        response = self.predictor.predict(payload)
        return response.get('generated_text', '')

    def analyze_text(self, grounding_source: str, query: str, guard_content: str, source: str) -> Tuple[bool, str, Dict[str, Any]]:
        """
        Analyze text using the ApplyGuardrail API with contextual grounding.
        Returns a tuple (passed, message, details) where:
        - passed is a boolean indicating if the guardrail passed,
        - message is either the guardrail message or an empty string,
        - details is a dictionary containing the full API response for further analysis if needed.
        """
        try:
            content = [
                {
                    "text": {
                        "text": grounding_source,
                        "qualifiers": ["grounding_source"]
                    }
                },
                {
                    "text": {
                        "text": query,
                        "qualifiers": ["query"]
                    }
                },
                {
                    "text": {
                        "text": guard_content,
                        "qualifiers": ["guard_content"]
                    }
                }
            ]

            response = self.bedrock_runtime.apply_guardrail(
                guardrailIdentifier=self.guardrail_id,
                guardrailVersion=self.guardrail_version,
                source=source,
                content=content
            )
            
            action = response.get("action", "")
            if action == "NONE":
                return True, "", response
            elif action == "GUARDRAIL_INTERVENED":
                message = response.get("outputs", [{}])[0].get("text", "Guardrail intervened")
                return False, message, response
            else:
                return False, f"Unknown action: {action}", response
        except ClientError as e:
            print(f"Error applying guardrail: {e}")
            raise

    def analyze_prompt(self, grounding_source: str, query: str) -> Tuple[bool, str, Dict[str, Any]]:
        """Analyze the input prompt."""
        return self.analyze_text(grounding_source, query, query, "INPUT")

    def analyze_output(self, grounding_source: str, query: str, generated_text: str) -> Tuple[bool, str, Dict[str, Any]]:
        """Analyze the generated output."""
        return self.analyze_text(grounding_source, query, generated_text, "OUTPUT")

    def embed_text(self, text: str, input_type: str = "query") -> List[float]:
        """Embed the given text using the specified embedding model."""
        return vo.embed([text], model=self.embedding_model, input_type=input_type).embeddings[0]

    def retrieve_relevant_documents(self, query_embedding: List[float], documents_embeddings: List[List[float]], k: int = 1) -> Tuple[List[List[float]], List[int]]:
        """Retrieve the k most relevant documents based on cosine similarity."""
        query_embedding = np.array(query_embedding).reshape(1, -1)
        documents_embeddings = np.array(documents_embeddings)
        
        cosine_sim = cosine_similarity(query_embedding, documents_embeddings)
        sorted_indices = np.argsort(cosine_sim[0])[::-1]
        
        top_k_related_indices = sorted_indices[:k]
        top_k_related_embeddings = documents_embeddings[top_k_related_indices].tolist()
        
        return top_k_related_embeddings, top_k_related_indices.tolist()

    def generate_and_analyze(self, query: str, documents: List[str], max_new_tokens: int = 256, temperature: float = 0.0) -> Tuple[bool, str, str]:
        """
        Generate text and analyze it with guardrails, including embedding and document retrieval steps.
        Returns a tuple (passed, message, generated_text) where:
        - passed is a boolean indicating if the guardrail passed,
        - message is either the guardrail message or an empty string,
        - generated_text is the text generated by the model (if guardrail passed) or an empty string.
        """
        # Embed the query and retrieve relevant documents
        query_embedding = self.embed_text(query)
        documents_embeddings = [self.embed_text(doc, input_type="document") for doc in documents]
        _, retrieved_doc_indices = self.retrieve_relevant_documents(query_embedding, documents_embeddings)
        
        retrieved_docs = [documents[index] for index in retrieved_doc_indices]
        retrieved_grounding = "\n".join(retrieved_docs)

        # First, analyze the prompt using retrieved documents as grounding
        prompt_passed, prompt_message, _ = self.analyze_prompt(retrieved_grounding, query)
        if not prompt_passed:
            return False, prompt_message, ""

        # Append retrieved documents to the query
        enhanced_query = f"{query}\n\nRelevant information:\n{retrieved_grounding}"

        # Generate text with the enhanced query
        generated_text = self.generate_text(enhanced_query, max_new_tokens, temperature)

        # Analyze the generated text using the retrieved documents as grounding
        output_passed, output_message, _ = self.analyze_output(retrieved_grounding, query, generated_text)
        if not output_passed:
            return False, output_message, ""

        return True, "", generated_text

In [None]:
def main():
    query = "What is the Guaranteed Rate of Return for AB503 Product?"
    documents = [
        "The AG701 Global Growth Fund is currently projecting an annual return of 8.5%, focusing on emerging markets and technology sectors.",
        "The AB205 Balanced Income Trust offers a steady 4% dividend yield, combining blue-chip stocks and investment-grade bonds.",
        "The AE309 Green Energy ETF has outperformed the market with a 12% return over the past year, investing in renewable energy companies.",
        "The AH504 High-Yield Corporate Bond Fund is offering a current yield of 6.75%, targeting BB and B rated corporate debt.",
        "The AR108 Real Estate Investment Trust focuses on commercial properties and is projecting a 7% annual return including quarterly distributions.",
        "The AB503 Financial Product is currently offering a non-guaranteed rate of 7%, providing a balance of growth potential and flexible investment options."
    ]
    max_new_tokens = 512
    temperature = 0.0

    text_gen = TextGenerationWithGuardrails(
        endpoint_name=endpoint_name,
        guardrail_id=guardrail_id,
        guardrail_version=guardrail_version
    )

    # Bold text function
    def bold(text):
        return f"\033[1m{text}\033[0m"

    # Embedding the Query
    print(bold("\n=== Query Embedding ===\n"))
    query_embedding = text_gen.embed_text(query)
    print(f"Query: {query}")
    print(f"Query embedding (first 5 elements): {query_embedding[:5]}...")
    print()

    # Embedding the Documents
    print(bold("\n=== Document Embedding ===\n"))
    documents_embeddings = [text_gen.embed_text(doc, input_type="document") for doc in documents]
    for i, (doc, embedding) in enumerate(zip(documents, documents_embeddings)):
        print(f"Document {i+1}: {doc[:50]}...")
        print(f"Embedding (first 5 elements): {embedding[:5]}...")
        print()

    # Document Retrieval
    print(bold("\n=== Document Retrieval ===\n"))
    retrieved_emb, retrieved_emb_index = text_gen.retrieve_relevant_documents(query_embedding, documents_embeddings, k=1)
    retrieved_doc = [documents[index] for index in retrieved_emb_index]
    print("Retrieved Document:")
    print(json.dumps(retrieved_doc, indent=2))
    print()

    # Analyze input
    print(bold("\n=== Input Analysis ===\n"))
    input_passed, input_message, input_details = text_gen.analyze_prompt(retrieved_doc[0], query)
    if not input_passed:
        print(f"Input Guardrail Intervened. The response to the User is: {input_message}\n")
        print("Full API Response:")
        print(json.dumps(input_details, indent=2))
        print()
        return
    else:
        print("Input Prompt Passed The Guardrail Check - Moving to Generate the Response\n")

    # Generate text
    print(bold("\n=== Text Generation ===\n"))
    enhanced_query = f"{query}\n\nRelevant information:\n{retrieved_doc[0]}"
    generated_text = text_gen.generate_text(enhanced_query, max_new_tokens=max_new_tokens, temperature=temperature)
    print(f"Here is what the Model Responded with: {generated_text}\n")

    # Analyze output
    print(bold("\n=== Output Analysis ===\n"))
    print("Analyzing Model Response with the Response Guardrail\n")
    output_passed, output_message, output_details = text_gen.analyze_output(retrieved_doc[0], query, generated_text)
    if not output_passed:
        print(f"Output Guardrail Intervened. The response to the User is: {output_message}\n")
        print("Full API Response:")
        print(json.dumps(output_details, indent=2))
        print()
    else:
        print(f"Model Response Passed. The information presented to the user is: {generated_text}\n")

if __name__ == "__main__":
    main()