# Observing Agentic AI Workloads with Amazon CloudWatch
## Introduction
With Amazon CloudWatch, you can observe generative AI workloads, including [Amazon Bedrock AgentCore agents](https://aws.amazon.com/bedrock/agentcore/), and gain insights into AI performance, health, and accuracy. CloudWatch provides pre-configured views of latency, usage, and errors for your AI workloads, so you can detect issues faster in components such as models and agents. End-to-end prompt tracing helps you pinpoint issues in knowledge bases, tools, and models. CloudWatch's AI monitoring capabilities are compatible with popular generative AI orchestration frameworks such as [Strands Agents](https://strandsagents.com/latest/), LangChain, and LangGraph, so you can keep your framework of choice.
## Key Capabilities
CloudWatch generative AI observability enables you to:
### Gain Insights into AI Performance and Reliability
- Gain insights into end-user outcomes, AI performance, health, and accuracy while reducing human-in-the-loop (HITL) assessment burden
- Monitor model invocations, agents (managed, self-hosted, and third-party), knowledge bases, guardrails, and tools
### Streamline Troubleshooting and Debugging
- Identify the source of errors quickly using end-to-end prompt tracing, curated metrics, and logs
- Troubleshoot issues across your entire GenAI application and underlying infrastructure, leveraging existing CloudWatch observability tools such as [Application Signals](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Application-Monitoring-Sections.html), [Alarms](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html), [Dashboards](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Dashboards.html), [Sensitive data protection](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch-logs-data-protection-policies.html), and [Logs Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html)
### Integrate with Your Existing Observability Stack
- Access prompt traces while using Amazon Bedrock, and send structured traces from third-party models to CloudWatch using the AWS Distro for OpenTelemetry (ADOT) SDK
- For information about adding observability to your Amazon Bedrock AgentCore agent or tool, see [Amazon Bedrock AgentCore](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/what-is-genesis.html)

## Prerequisites

In [None]:
!pip install --force-reinstall -U -r requirements.txt --quiet

In [None]:
# Import required libraries
import os
import json
import boto3

session = boto3.Session()

sts = session.client('sts')
identity = sts.get_caller_identity()
account_id = identity['Account']
# Reuse the session created above; fall back to us-west-2 if no default region is configured
region = session.region_name or 'us-west-2'

print(f"Account ID: {account_id}")
print(f"Region: {region}")

Bedrock model invocation logs can be delivered to an S3 bucket, to CloudWatch Logs, or to both. An IAM role that Amazon Bedrock can assume is required to deliver the logs to these destinations.<br/>
For the IAM role requirements, see: https://docs.aws.amazon.com/bedrock/latest/userguide/model-invocation-logging.html
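
As a rough illustration of the role requirements above, the following sketch builds a trust policy and a permissions policy for a CloudWatch Logs destination. The function name `build_logging_role_policies` and the placeholder account ID, region, and log group are assumptions made for this example; check the linked user guide for the authoritative policy documents before using them.

```python
import json


def build_logging_role_policies(account_id, region, log_group_name):
    """Return (trust_policy, permissions_policy) dicts for a Bedrock logging role.

    Hypothetical helper: the statements follow the pattern described in the
    Bedrock model invocation logging guide, but verify them against the docs.
    """
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Lets the Bedrock service assume the role
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                # Restrict use of the role to this account and region
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:*"
                    },
                },
            }
        ],
    }
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": (
                    f"arn:aws:logs:{region}:{account_id}:log-group:{log_group_name}"
                    ":log-stream:aws/bedrock/modelinvocations"
                ),
            }
        ],
    }
    return trust_policy, permissions_policy


# Placeholder values for illustration only
trust, permissions = build_logging_role_policies(
    "111122223333", "us-west-2", "/example/model-invocation"
)
print(json.dumps(trust, indent=2))
```

Delivery to S3 is typically governed by the bucket policy rather than by this role, so the sketch covers only the CloudWatch Logs side.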

In [None]:
# Read variables
from utils import get_param_value
logging_role_arn = get_param_value("/app/workshop/iam/logging/arn")
log_s3_bucket = get_param_value("/app/workshop/s3bucket/observability")
model_invocation_loggroup = get_param_value("/app/workshop/cloudwatch/log-group/model-invocation")
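
The `utils` module is not shown in this notebook. Assuming `get_param_value` reads AWS Systems Manager Parameter Store, a minimal stand-in might look like the sketch below. The name `read_ssm_parameter` and the optional `ssm_client` argument are hypothetical and only illustrate the idea.

```python
def read_ssm_parameter(name, ssm_client=None):
    """Hypothetical stand-in for utils.get_param_value: read one SSM parameter."""
    if ssm_client is None:
        # Imported lazily so the function can also be exercised with a stub client
        import boto3

        ssm_client = boto3.client("ssm")
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


# Example (requires AWS credentials and an existing parameter):
# logging_role_arn = read_ssm_parameter("/app/workshop/iam/logging/arn")
```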

⚠️ **If you are running this notebook in your own account, please replace the variables with your own resource names & ARNs**

In [None]:
#logging_role_arn="<PLACEHOLDER>"
#log_s3_bucket="<PLACEHOLDER>"
#model_invocation_loggroup="<PLACEHOLDER>"

## Enabling Bedrock Model Invocation Logging

In [None]:
logging_config_dict={
    'cloudWatchConfig': {
        'logGroupName': model_invocation_loggroup,
        'roleArn': logging_role_arn,
        'largeDataDeliveryS3Config': {
            'bucketName': log_s3_bucket,
            'keyPrefix': 'cloudwatch/bedrock/model_invocation/large_delivery/'
        }
    },
    's3Config': {
        'bucketName': log_s3_bucket,
        'keyPrefix': 'cloudwatch/bedrock/model_invocation/log/'
    },
    'textDataDeliveryEnabled': True,
    'imageDataDeliveryEnabled': True,
    'embeddingDataDeliveryEnabled': True,
    'videoDataDeliveryEnabled': True
}
json.dumps(logging_config_dict)

In [None]:
response = session.client("bedrock").put_model_invocation_logging_configuration(
    loggingConfig=logging_config_dict
)
response

## Enabling Transaction Search

In [None]:
# Resource policy that allows X-Ray to write spans to CloudWatch Logs
transaction_search_policy_dict = {
    "Version":"2012-10-17",
    "Statement": [
        {
            "Sid": "TransactionSearchXRayAccess",
            "Effect": "Allow",
            "Principal": {
                "Service": "xray.amazonaws.com"
            },
            "Action": "logs:PutLogEvents",
            "Resource": [
                f"arn:aws:logs:{region}:{account_id}:log-group:aws/spans:*",
                f"arn:aws:logs:{region}:{account_id}:log-group:/aws/application-signals/data:*"
            ],
            "Condition": {
                "ArnLike": {
                    "aws:SourceArn": f"arn:aws:xray:{region}:{account_id}:*"
                },
                "StringEquals": {
                    "aws:SourceAccount": account_id
                }
            }
        }
    ]
}

json.dumps(transaction_search_policy_dict)

In [None]:
# Step 1: Put resource policy
logs = session.client("logs")
logs.put_resource_policy(
    policyName="xray_policy_transaction_search",
    policyDocument=json.dumps(transaction_search_policy_dict)
)
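
To confirm that the resource policy was stored, one option is to list the account's resource policies and look it up by name. The helper name `find_resource_policy` is an assumption for this sketch, and pagination is ignored for brevity.

```python
import json


def find_resource_policy(logs_client, policy_name):
    """Return the parsed policy document for policy_name, or None if it is absent."""
    response = logs_client.describe_resource_policies()
    for policy in response.get("resourcePolicies", []):
        if policy.get("policyName") == policy_name:
            # The API returns the document as a JSON string
            return json.loads(policy["policyDocument"])
    return None


# Example (requires AWS credentials):
# find_resource_policy(session.client("logs"), "xray_policy_transaction_search")
```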

In [None]:
# Step 2: Set the AWS X-Ray trace segment destination to CloudWatch Logs
!aws xray update-trace-segment-destination --destination CloudWatchLogs

# If you see the error "The destination is already set to CloudWatchLogs", you can ignore it

In [None]:
# Step 3: Update the Default indexing rule so that 1% of spans are indexed as trace summaries
!aws xray update-indexing-rule --name "Default" --rule '{"Probabilistic": {"DesiredSamplingPercentage": 1.0}}'
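
After steps 2 and 3, it can help to read the settings back. The sketch below assumes a recent boto3 version in which the X-Ray client exposes `get_trace_segment_destination` and `get_indexing_rules`, and assumes the response fields shown in the comments. The function name `transaction_search_state` is hypothetical.

```python
def transaction_search_state(xray_client):
    """Summarize the X-Ray span destination and the Default indexing rule."""
    # Assumed response shape: {"Destination": "CloudWatchLogs", "Status": "ACTIVE"}
    destination = xray_client.get_trace_segment_destination()
    # Assumed response shape: {"IndexingRules": [{"Name": "Default", "Rule": {...}}]}
    rules = xray_client.get_indexing_rules().get("IndexingRules", [])
    default_rule = next((r for r in rules if r.get("Name") == "Default"), {})
    probabilistic = default_rule.get("Rule", {}).get("Probabilistic", {})
    return {
        "destination": destination.get("Destination"),
        "status": destination.get("Status"),
        "desired_sampling_percentage": probabilistic.get("DesiredSamplingPercentage"),
    }


# Example (requires AWS credentials):
# transaction_search_state(session.client("xray"))
```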

In [None]:
from bedrock_agentcore_starter_toolkit import Runtime
from boto3.session import Session
boto_session = Session()
# Keep the region resolved earlier if the session has no default region configured
region = boto_session.region_name or region

agentcore_runtime = Runtime()
agent_name = "agentcore_ivr_observability"
response = agentcore_runtime.configure(
    entrypoint="strands_agent.py",
    auto_create_execution_role=True,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region=region,
    agent_name=agent_name
)
response

In [None]:
launch_result = agentcore_runtime.launch()
launch_result

In [None]:
import time
status_response = agentcore_runtime.status()
status = status_response.endpoint['status']
end_status = ['READY', 'CREATE_FAILED', 'DELETE_FAILED', 'UPDATE_FAILED']
while status not in end_status:
    time.sleep(10)
    status_response = agentcore_runtime.status()
    status = status_response.endpoint['status']
    print(status)
status
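
The polling loop above runs until a terminal status appears. A variant with a timeout, sketched below, avoids waiting indefinitely if the endpoint never settles. The helper name `wait_for_status` and its default interval and timeout are assumptions for illustration.

```python
import time


def wait_for_status(get_status, terminal_statuses, interval_seconds=10,
                    timeout_seconds=900, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    status = get_status()
    while status not in terminal_statuses:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Still in status {status!r} after {timeout_seconds} seconds"
            )
        sleep(interval_seconds)
        status = get_status()
    return status


# Example with the objects defined in this notebook:
# wait_for_status(lambda: agentcore_runtime.status().endpoint['status'], end_status)
```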

In [None]:
invoke_response = agentcore_runtime.invoke({"prompt": "What is 2 + 3?"})
invoke_response
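
Once the agent has been invoked and Transaction Search is enabled, spans should arrive in the `aws/spans` log group after a short delay. As a hedged sketch, the helper below runs a CloudWatch Logs Insights query over recent spans. The function name `query_recent_spans`, the query string, and the defaults are assumptions made for this example.

```python
import time


def query_recent_spans(logs_client, log_group="aws/spans", minutes=15, limit=20,
                       poll_seconds=1, sleep=time.sleep):
    """Run a Logs Insights query and return (final_status, rows as dicts)."""
    end_time = int(time.time())
    start_time = end_time - minutes * 60
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start_time,
        endTime=end_time,
        queryString=f"fields @timestamp, @message | sort @timestamp desc | limit {limit}",
    )["queryId"]
    # Poll until the query leaves the Scheduled/Running states
    while True:
        result = logs_client.get_query_results(queryId=query_id)
        if result["status"] not in ("Scheduled", "Running"):
            break
        sleep(poll_seconds)
    # Each result row is a list of {"field": ..., "value": ...} pairs
    rows = [{cell["field"]: cell["value"] for cell in row}
            for row in result.get("results", [])]
    return result["status"], rows


# Example (requires AWS credentials):
# status, rows = query_recent_spans(session.client("logs"))
```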

In [None]:
launch_result.ecr_uri, launch_result.agent_id, launch_result.ecr_uri.split('/')[1]

In [None]:
# Cleanup: delete the AgentCore runtime and its ECR repository
import boto3

agentcore_control_client = boto3.client(
    'bedrock-agentcore-control',
    region_name=region
)
ecr_client = boto3.client(
    'ecr',
    region_name=region
)

runtime_delete_response = agentcore_control_client.delete_agent_runtime(
    agentRuntimeId=launch_result.agent_id
)

response = ecr_client.delete_repository(
    repositoryName=launch_result.ecr_uri.split('/')[1],
    force=True
)
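
The cleanup above removes the agent runtime and its ECR repository but leaves the observability settings in place. If you also want to undo the model invocation logging configuration and the CloudWatch Logs resource policy created earlier, something like the sketch below could be used. The helper name `remove_observability_settings` is hypothetical. These settings apply to the whole account and region, so skip this step if other workloads rely on them.

```python
def remove_observability_settings(bedrock_client, logs_client,
                                  policy_name="xray_policy_transaction_search"):
    """Remove the Bedrock logging configuration and the named Logs resource policy."""
    # Turns off model invocation logging for the account in this region
    bedrock_client.delete_model_invocation_logging_configuration()
    # Removes the resource policy that allowed X-Ray to write spans
    logs_client.delete_resource_policy(policyName=policy_name)


# Example (requires AWS credentials):
# remove_observability_settings(session.client("bedrock"), session.client("logs"))
```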