## Overview

This tutorial demonstrates observability in Amazon CloudWatch using AWS OpenTelemetry instrumentation and AgentCore Observability.

## Prerequisites

* Access to Amazon CloudWatch
* [Transaction Search](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Enable-TransactionSearch.html) enabled on Amazon CloudWatch

### Setup via CloudWatch Console
1. Open the [CloudWatch Console](https://console.aws.amazon.com/cloudwatch)
2. Navigate to **Application Signals (APM)** → **Transaction search**
3. Choose **Enable Transaction Search**
4. Select the checkbox to **ingest spans as structured logs**
5. (Optional) Adjust **X-Ray trace indexing** percentage (default: 1%)
6. Choose **Save**
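
The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 X-Ray client operations `update_trace_segment_destination` and `update_indexing_rule` are available in your boto3 version. The helper name `enable_transaction_search` is hypothetical, and the sketch does not cover any IAM or CloudWatch Logs permissions described in the linked documentation.

```python
def enable_transaction_search(xray_client, sampling_percentage=1.0):
    """Send spans to CloudWatch Logs and set the trace indexing percentage.

    `xray_client` is assumed to be a boto3 X-Ray client (or a compatible stub).
    """
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")

    # Console step 4: ingest spans as structured logs.
    xray_client.update_trace_segment_destination(Destination="CloudWatchLogs")

    # Console step 5: percentage of spans indexed as X-Ray traces.
    xray_client.update_indexing_rule(
        Name="Default",
        Rule={"Probabilistic": {"DesiredSamplingPercentage": sampling_percentage}},
    )
    return sampling_percentage


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   enable_transaction_search(boto3.client("xray"), sampling_percentage=1.0)
```

Passing the client in as an argument keeps the helper easy to try against a stub before running it in a real account.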

Note that when you use the `bedrock_agentcore_starter_toolkit` to configure your agent, it takes care of the OpenTelemetry instrumentation.

When configuring a containerized environment (such as Docker), start the agent through the `opentelemetry-instrument` command. For example, in a Dockerfile:

`CMD ["opentelemetry-instrument", "python", "runtime_agent_main.py"]`

The [AWS Distro for OpenTelemetry](https://aws-otel.github.io/docs/introduction) package (`aws-opentelemetry-distro`) must be listed in `requirements.txt`.

You can view the Dockerfile created by `bedrock_agentcore_starter_toolkit` here:

## Observability Concepts

### Sessions
- **Definition**: Complete interaction context between user and agent
- **Scope**: Entire conversation lifecycle from initialization to termination
- **Provides**: Context persistence, state management, conversation history
- **Metrics**: Session count, duration, user engagement patterns

### Traces
- **Definition**: Detailed record of single request-response cycle
- **Scope**: Complete execution path from agent invocation to response
- **Provides**: Processing steps, tool invocations, resource utilization
- **Metrics**: Request latency, processing time, error rates

### Spans
- **Definition**: Discrete, measurable unit of work within execution flow
- **Scope**: Fine-grained operations with start/end timestamps
- **Provides**: Operation details, parent-child relationships, status information
- **Metrics**: Operation duration, success/failure rates, resource usage
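
To make the hierarchy concrete, here is a minimal sketch of how sessions, traces, and spans nest and how a few of the listed metrics could be derived. The `Span`, `Trace`, and `Session` classes are hypothetical illustrations, not the AgentCore or OpenTelemetry data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """One unit of work with start/end timestamps (in seconds)."""
    name: str
    start: float
    end: float
    ok: bool = True
    parent: Optional[str] = None  # name of the parent span, if any

    @property
    def duration(self) -> float:
        return self.end - self.start


@dataclass
class Trace:
    """One request-response cycle, made up of spans."""
    trace_id: str
    spans: List[Span] = field(default_factory=list)

    @property
    def latency(self) -> float:
        # Time from the earliest span start to the latest span end.
        return max(s.end for s in self.spans) - min(s.start for s in self.spans)

    @property
    def has_error(self) -> bool:
        return any(not s.ok for s in self.spans)


@dataclass
class Session:
    """A whole conversation, made up of traces."""
    session_id: str
    traces: List[Trace] = field(default_factory=list)

    @property
    def error_rate(self) -> float:
        if not self.traces:
            return 0.0
        return sum(t.has_error for t in self.traces) / len(self.traces)


session = Session("session-1", [
    Trace("trace-1", [
        Span("invoke_agent", 0.0, 2.0),
        Span("call_tool", 0.5, 1.5, parent="invoke_agent"),
    ]),
    Trace("trace-2", [Span("invoke_agent", 10.0, 11.0, ok=False)]),
])
print(len(session.traces), session.traces[0].latency, session.error_rate)
```

In this example the session holds two traces, the first trace contains a parent span and a child span, and one of the two traces records a failure.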

## Testing Observability

TODO:
* Load the agent
* Create a session
* Run a few invocations
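
Until those steps are filled in, the following sketch shows one way they might look. It assumes the boto3 `bedrock-agentcore` client and its `invoke_agent_runtime` operation; the helper name `run_test_invocations` and the `{"prompt": ...}` payload shape are hypothetical and depend on how your agent's entrypoint parses its input.

```python
import json
import uuid


def run_test_invocations(client, agent_runtime_arn, prompts):
    """Invoke the agent once per prompt, reusing a single session ID.

    `client` is assumed to be a boto3 "bedrock-agentcore" client (or a stub).
    """
    # One session ID shared by all invocations, so they group into one session.
    session_id = str(uuid.uuid4())
    responses = []
    for prompt in prompts:
        response = client.invoke_agent_runtime(
            agentRuntimeArn=agent_runtime_arn,
            runtimeSessionId=session_id,
            # Assumed payload shape; adjust to match your agent's entrypoint.
            payload=json.dumps({"prompt": prompt}).encode("utf-8"),
        )
        responses.append(response)
    return session_id, responses


# Usage (requires boto3, AWS credentials, and a deployed agent):
#   import boto3
#   client = boto3.client("bedrock-agentcore")
#   run_test_invocations(client, "<your-agent-runtime-arn>", ["Hello", "What is 2 + 2?"])
```

Each invocation should then appear as a separate trace under the same session in the dashboard.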

## CloudWatch GenAI Observability Dashboard
This section walks through the [CloudWatch GenAI Observability](https://console.aws.amazon.com/cloudwatch/home#gen-ai-observability) dashboard.

TODO: Show and tell
* logs
* data
* etc.

You can view all of your agents that have observability enabled and filter the data by time frame. Some examples are shown below:
<div style="text-align:left">
    <img src="images/genai-observability.png" width="50%"/>
</div>

In the main dashboard, you can view runtime metrics across all agents, as shown below:

<div style="text-align:left">
    <img src="images/runtime-all-agent-metrics.png" width="50%"/>
</div>

If you click the agent you just deployed, you are taken to a dashboard of runtime metrics specific to that agent. You can also filter the data by a custom time frame:

![runtime-metrics-per-agent.png](attachment:runtime-metrics-per-agent.png)

In the Sessions View tab, you can navigate to all the sessions associated with this agent: 

![Agent-sessions-view.png](attachment:Agent-sessions-view.png)

In the Trace View tab, you can inspect the trace and span information for this agent's runtime.

![Agentcore-trace.png](attachment:Agentcore-trace.png)


Click through the various features of the GenAI Observability dashboard to get more detailed information on traces.

<div style="text-align:left">
    <img src="images/08-agentcore-trace-view.png" width="50%"/>
</div>