## Overview

In this tutorial, we will showcase the CloudWatch GenAI Observability dashboard. We will run a few tests and then review the traces generated by our agent.

Amazon Bedrock AgentCore Observability helps you trace, debug, and monitor agent performance in production environments.

AgentCore Observability provides:

* Detailed visualizations of each step in the agent workflow
* Real-time visibility into operational performance through CloudWatch dashboards
* Telemetry for key metrics such as session count, latency, duration, token usage, and error rates
* Rich metadata tagging and filtering for issue investigation
* Standardized OpenTelemetry (OTEL)-compatible format for easy integration with existing monitoring stacks


<div style="text-align:left">
    <img src="images/00-Observability.png" width="75%"/>
</div>

### Observability Concepts in AgentCore
This section defines the concepts of sessions, traces and spans as they relate to monitoring and observability of agents.
        
##### Sessions
A session represents a complete **interaction context** between user and agent. Sessions encapsulate the entire conversation or interaction flow, maintaining state and context across multiple exchanges. Each session has a unique identifier and captures the full lifecycle of user engagement with the agent, from initialization to termination.
##### Traces
A trace represents a detailed record of a single request-response cycle that begins with an agent invocation and may include additional calls to other agents. Traces capture the **complete execution path** of a request, including all internal processing steps, external service calls, decision points, and resource utilization.
##### Spans
A span represents a discrete, measurable **unit of work** within an agent's execution flow. Spans capture fine-grained operations that occur during request processing, providing detailed visibility into the internal components and steps that make up a complete trace. Each span has a defined start and end time, creating a precise timeline of agent activities and their durations.


The relationship between these three observability components can be visualized as:
- Sessions (highest level) - Represent complete user conversations or interaction contexts
- Traces (middle level) - Represent individual request-response cycles within a session
- Spans (lowest level) - Represent specific operations or steps within a trace


### Step 1
#### Examining how OpenTelemetry is configured with AgentCore

To enable tracing of AI applications, you need to add the AWS Distro for OpenTelemetry (ADOT) SDK, `aws-opentelemetry-distro`, to your agent code. When you use the `bedrock_agentcore_starter_toolkit` to configure your agent, it takes care of the OpenTelemetry instrumentation for you.
More details are available [here](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-configure.html#observability-configure-custom).

1. Open the Code server. Look for the `deploy_agentcore/Dockerfile` that was created as part of the deployment process in the previous lab.

2. Observe that it installs OpenTelemetry for you with `RUN pip install aws-opentelemetry-distro>=0.10.1` and then launches the agent through it with `CMD ["opentelemetry-instrument", "python", "-m", "xxx"]`.
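
If you would rather check this programmatically than by eye, the sketch below inspects Dockerfile text and reports whether the final `CMD` starts with `opentelemetry-instrument`. The helper name `is_otel_instrumented` and the sample Dockerfile (including its `FROM` line) are assumptions made for illustration; the sketch handles only single-line `CMD` instructions.

```python
import json


def is_otel_instrumented(dockerfile_text: str) -> bool:
    """Return True if the last CMD starts with opentelemetry-instrument."""
    cmd_args = None
    for line in dockerfile_text.splitlines():
        tokens = line.strip().split(None, 1)
        if tokens and tokens[0].upper() == "CMD":
            # Only the last CMD in a Dockerfile takes effect.
            cmd_args = tokens[1] if len(tokens) > 1 else ""
    if not cmd_args:
        return False
    if cmd_args.startswith("["):
        # Exec form: CMD ["executable", "arg", ...]
        try:
            parts = json.loads(cmd_args)
        except json.JSONDecodeError:
            return False
    else:
        # Shell form: CMD executable arg ...
        parts = cmd_args.split()
    return bool(parts) and parts[0] == "opentelemetry-instrument"


# Hypothetical Dockerfile content; "xxx" is the placeholder used in this lab.
sample_dockerfile = """FROM python:3.12-slim
RUN pip install "aws-opentelemetry-distro>=0.10.1"
CMD ["opentelemetry-instrument", "python", "-m", "xxx"]
"""

print(is_otel_instrumented(sample_dockerfile))
```

To check the real file, read `deploy_agentcore/Dockerfile` with `open(...).read()` and pass the text to the helper.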

   
### Step 2 
::alert[You can skip this step if you have already completed the AgentCore memory lab. If you don't see the **"Enable"**]{type="info"}
#### Enable [CloudWatch Transaction Search](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Enable-TransactionSearch.html)
This is a one-time setup to turn on Amazon CloudWatch Transaction Search. Follow the steps below to enable it:

1. Open the [CloudWatch Console](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2)
2. Navigate to **Application Signals (APM)** → **Transaction search**
3. Choose **Enable Transaction Search**
4. Select the checkbox to **ingest spans as structured logs**
5. (Optional) Adjust **X-Ray trace indexing** percentage (default: 1%)
6. Choose **Save**

<div style="text-align:left">
    <img src="images/observability-ts.png" width="75%"/>
    <img src="images/observability-ts-save.png" width="75%"/>

</div>
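
The console steps above can also be approximated with the AWS SDK. The sketch below is an assumption-laden outline, not the official procedure: it relies on the X-Ray client operations `update_trace_segment_destination`, `update_indexing_rule`, and `get_trace_segment_destination`, and it assumes that any CloudWatch Logs resource policy that X-Ray needs in order to write spans is already in place. The helper name `enable_transaction_search` is hypothetical. The client is passed in so the function can be exercised without AWS credentials.

```python
def enable_transaction_search(xray_client, sampling_percentage: float = 1.0) -> dict:
    """Route spans to CloudWatch Logs and set the trace indexing percentage."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")

    # Console step 4: ingest spans as structured logs.
    xray_client.update_trace_segment_destination(Destination="CloudWatchLogs")

    # Console step 5: adjust the X-Ray trace indexing percentage.
    xray_client.update_indexing_rule(
        Name="Default",
        Rule={"Probabilistic": {"DesiredSamplingPercentage": sampling_percentage}},
    )

    # Return the current destination and status so the caller can confirm it.
    return xray_client.get_trace_segment_destination()


# Usage (requires AWS credentials and suitable permissions):
#   import boto3
#   status = enable_transaction_search(boto3.client("xray"))
#   print(status.get("Destination"), status.get("Status"))
```

The call may raise an error if the destination is already set or permissions are missing, so treat the console flow as the reference path for this lab.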

### Step 3
#### Run a few invocations
When using ADOT, propagate the session ID correctly by setting the `X-Amzn-Bedrock-AgentCore-Runtime-Session-Id` request header. ADOT then sets the `session_id` correctly in the downstream headers.

To propagate a trace ID, invoke the AgentCore runtime with the parameter `traceId=<traceId>` set.
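
As a sketch of how these two identifiers reach the runtime when calling through boto3, the helper below assembles the keyword arguments for `invoke_agent_runtime`, adding `traceId` only when one is supplied. The helper name `build_invoke_kwargs` is hypothetical, and the 33-character minimum for `runtimeSessionId` is our understanding of the API constraint; treat it as an assumption and confirm it against the API reference.

```python
import json
from typing import Optional

# Assumption: the API rejects runtimeSessionId values shorter than this.
MIN_SESSION_ID_LENGTH = 33


def build_invoke_kwargs(
    agent_arn: str,
    prompt: str,
    session_id: str,
    trace_id: Optional[str] = None,
) -> dict:
    """Build keyword arguments for invoke_agent_runtime."""
    if len(session_id) < MIN_SESSION_ID_LENGTH:
        raise ValueError(
            f"session_id should be at least {MIN_SESSION_ID_LENGTH} characters"
        )
    kwargs = {
        "agentRuntimeArn": agent_arn,
        "qualifier": "DEFAULT",
        "payload": json.dumps({"prompt": prompt}),
        "runtimeSessionId": session_id,
    }
    if trace_id is not None:
        # Only pass traceId when the caller wants to propagate one.
        kwargs["traceId"] = trace_id
    return kwargs


# Placeholders in angle brackets must be replaced with real values.
kwargs = build_invoke_kwargs(
    agent_arn="<sample-agentRuntimeArn>",
    prompt="I am customer 700001, what is the status of my application?",
    session_id="test_observability_session_700001",
    trace_id="<traceId>",
)
print(sorted(kwargs))
```

The resulting dictionary can be unpacked into the client call, for example `client.invoke_agent_runtime(**kwargs)`.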

We will invoke the agent using boto3 (the AWS SDK for Python).
First, let's retrieve the ARN of our agent.

In [None]:
# List the AgentCore runtimes and copy the agentRuntimeArn
!aws bedrock-agentcore-control list-agent-runtimes

Copy the `agentRuntimeArn` and replace `<sample-agentRuntimeArn>` below.

In [None]:
agent_arn = "<sample-agentRuntimeArn>"
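
As an alternative to copying the ARN by hand, the sketch below picks it out of a `list_agent_runtimes` response by runtime name. It assumes the response contains an `agentRuntimes` list whose entries carry `agentRuntimeName` and `agentRuntimeArn`; the helper name `find_agent_runtime_arn` and the runtime name `my_agent` are hypothetical.

```python
def find_agent_runtime_arn(list_response: dict, name: str) -> str:
    """Return the ARN of the runtime called `name` from a list response."""
    for runtime in list_response.get("agentRuntimes", []):
        if runtime.get("agentRuntimeName") == name:
            return runtime["agentRuntimeArn"]
    raise LookupError(f"No agent runtime named {name!r} was found")


# Usage (requires AWS credentials; "my_agent" is a placeholder name):
#   import boto3
#   control_client = boto3.client("bedrock-agentcore-control")
#   agent_arn = find_agent_runtime_arn(
#       control_client.list_agent_runtimes(), "my_agent"
#   )
```

This reads only the first page of results, which should be enough for a lab account with a handful of runtimes.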

In [None]:
from boto3.session import Session
boto_session = Session()
region = boto_session.region_name

# Function to print Agent response
def print_response(boto3_response):
    # Print the basic response details
    print("Status Code:", boto3_response['statusCode'])
    print("Content Type:", boto3_response['contentType'])
    # Handle the StreamingBody response
    if 'response' in boto3_response:
        try:
            # Read and decode the streaming response
            response_body = boto3_response['response'].read().decode('utf-8')
            print("\nResponse Body:")
            print(response_body)
        except Exception as e:
            print(f"Error reading response body: {str(e)}")

In [None]:
from IPython.display import Markdown, display
import json
import boto3
agentcore_client = boto3.client(
    'bedrock-agentcore',
    region_name=region
)

# Set a user ID for our test
user_id = "700001"

# Set a session ID for our test
session_id = f"test_observability_session_{user_id}"

prompt = f"I am customer {user_id}, what is the status of my application?"

payload = json.dumps({"prompt": prompt, "user_id": user_id})

boto3_response = agentcore_client.invoke_agent_runtime(
    agentRuntimeArn=agent_arn,
    qualifier="DEFAULT",
    payload=payload,
    runtimeSessionId=session_id
)

print_response(boto3_response)
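
To generate more than one trace for the same session, you can send several prompts with the same `runtimeSessionId`. The sketch below wraps the call shown above in a loop; the helper name `run_prompts` and the follow-up prompt in the usage comment are hypothetical. The client is passed in as an argument so the function can be tried with a stub.

```python
import json


def run_prompts(client, agent_arn, session_id, user_id, prompts):
    """Invoke the agent once per prompt in one session; return the bodies."""
    bodies = []
    for prompt in prompts:
        response = client.invoke_agent_runtime(
            agentRuntimeArn=agent_arn,
            qualifier="DEFAULT",
            payload=json.dumps({"prompt": prompt, "user_id": user_id}),
            runtimeSessionId=session_id,
        )
        # The body is a stream, so read and decode it before moving on.
        bodies.append(response["response"].read().decode("utf-8"))
    return bodies


# Usage, reusing the variables defined in the cells above:
#   for body in run_prompts(
#       agentcore_client, agent_arn, session_id, user_id,
#       [prompt, "Can you summarize what you just told me?"],
#   ):
#       print(body)
```

Each invocation should appear as a separate trace under the same session in the dashboard.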

### Step 4
#### GenAI Observability for visualization

[CloudWatch GenAI Observability](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/GenAI-observability.html) provides a Bedrock AgentCore dashboard with Agent, Sessions, and Traces views to help you understand the performance and execution flow of Runtime-hosted agents. You can access them by selecting **GenAI Observability (Preview)** in the CloudWatch console.

CloudWatch GenAI Observability provides two pre-built dashboards:
- Model Invocations – Detailed metrics on model usage, token consumption, and costs
- Amazon Bedrock AgentCore agents – Performance and decision metrics for the Amazon Bedrock agents

Click [here](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#/gen-ai-observability/agent-core?start=-3600000) to navigate to the Bedrock AgentCore dashboard. You can analyze your agents and their associated interactions under Agent view, Sessions view, and Traces view.

Click **Sessions view** to see the session ID that we used in the test above.


<div style="text-align:left">
    <img src="images/02-Sessions-view.jpg" width="50%"/>
</div>

Select session "test_observability_session_700001" to see a list of traces for this session. In our test there is only one trace.

The trace provides end-to-end visibility into agent execution paths including LLM calls and tool usage. Select "Trajectory" to review the interconnected relationship of the spans and subsequent calls from these spans. For illustration, the screenshot below highlights execution of one of the tools available to the agent and its execution time.

<div style="text-align:left">
    <img src="images/03-trace-view.jpg" width="70%"/>
</div>
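
To illustrate the parent-child relationship that the trajectory view displays, the sketch below renders a flat list of spans as an indented tree with durations. The span dictionaries, their field names (`span_id`, `parent_id`, `start_ms`, `end_ms`), and the span names are hypothetical sample data, not the format that CloudWatch exports.

```python
from collections import defaultdict


def render_span_tree(spans):
    """Return indented lines showing each span under its parent."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        # Visit siblings in start-time order, then recurse into their children.
        for span in sorted(children[parent_id], key=lambda s: s["start_ms"]):
            duration = span["end_ms"] - span["start_ms"]
            lines.append(f"{'  ' * depth}{span['name']} ({duration} ms)")
            walk(span["span_id"], depth + 1)

    walk(None, 0)  # Root spans have no parent.
    return lines


# Hypothetical spans for a single trace.
sample_spans = [
    {"span_id": "a", "parent_id": None, "name": "invoke_agent",
     "start_ms": 0, "end_ms": 2500},
    {"span_id": "c", "parent_id": "a", "name": "execute_tool",
     "start_ms": 1500, "end_ms": 2300},
    {"span_id": "b", "parent_id": "a", "name": "chat",
     "start_ms": 100, "end_ms": 1400},
]

for line in render_span_tree(sample_spans):
    print(line)
```

Reading the tree top to bottom gives the same kind of picture as the screenshot: which step called which, and how long each one took.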


### View Logs in CloudWatch

1. Open the [CloudWatch console](https://us-west-2.console.aws.amazon.com/cloudwatch/)
2. In the left navigation pane, expand **Logs** and select **Log groups**
3. Search for your agent's log group:
   - Standard logs (stdout/stderr): `/aws/bedrock-agentcore/runtimes/<agent_id>-<endpoint_name>/[runtime-logs] <UUID>`
   - OTEL structured logs: `/aws/bedrock-agentcore/runtimes/<agent_id>-<endpoint_name>/runtime-logs`
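
The same lookup can be sketched with boto3 by listing log groups that share the prefix shown above. The helper name `list_agent_log_groups` is hypothetical; it uses the CloudWatch Logs `describe_log_groups` call with `logGroupNamePrefix` and follows `nextToken` for pagination. The client is passed in so the function can be tried with a stub.

```python
LOG_GROUP_PREFIX = "/aws/bedrock-agentcore/runtimes/"


def list_agent_log_groups(logs_client, prefix: str = LOG_GROUP_PREFIX):
    """Return the names of all log groups that start with `prefix`."""
    names = []
    kwargs = {"logGroupNamePrefix": prefix}
    while True:
        page = logs_client.describe_log_groups(**kwargs)
        names.extend(group["logGroupName"] for group in page.get("logGroups", []))
        token = page.get("nextToken")
        if not token:
            return names
        # More results are available, so request the next page.
        kwargs["nextToken"] = token


# Usage (requires AWS credentials):
#   import boto3
#   for name in list_agent_log_groups(boto3.client("logs")):
#       print(name)
```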

### View Metrics

1. Open the [CloudWatch console](https://us-west-2.console.aws.amazon.com/cloudwatch/)
2. Select **Metrics** from the left navigation
3. Browse to the `bedrock-agentcore` namespace
4. Explore the available metrics
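
If you prefer to explore the namespace from code, the sketch below collects the distinct metric names with the CloudWatch `list_metrics` call. The helper name `list_agentcore_metric_names` is hypothetical, and the default namespace is taken from step 3 above. Metrics may take a few minutes to appear after the first invocation.

```python
def list_agentcore_metric_names(cloudwatch_client, namespace: str = "bedrock-agentcore"):
    """Return the sorted, de-duplicated metric names in `namespace`."""
    names = set()
    kwargs = {"Namespace": namespace}
    while True:
        page = cloudwatch_client.list_metrics(**kwargs)
        names.update(metric["MetricName"] for metric in page.get("Metrics", []))
        token = page.get("NextToken")
        if not token:
            return sorted(names)
        # More results are available, so request the next page.
        kwargs["NextToken"] = token


# Usage (requires AWS credentials):
#   import boto3
#   print(list_agentcore_metric_names(boto3.client("cloudwatch")))
```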

See [here](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-service-provided.html) for more details on available metrics.