# Productionize Agentic AI Applications Using Amazon Bedrock AgentCore

## Overview
Successfully deploying AI agents to production requires a comprehensive set of capabilities that go far beyond basic model inference. This lab demonstrates the **essential production-grade components** needed for enterprise agentic workloads:

### **Infrastructure & Deployment Foundation**
- **Containerized Deployment**: Package agents with all dependencies in Docker containers for consistent, portable deployments across environments
- **Managed Runtime Environment**: AgentCore Runtime provides auto-scaling, health monitoring, and resource management without infrastructure overhead
- **Infrastructure as Code (IaC) Readiness**: Configuration-driven deployments that integrate seamlessly with CI/CD pipelines

### **Enterprise Security & Identity Management**
- **Multi-Tenant Session Isolation**: Each user session runs in dedicated microVMs with isolated compute, memory, and filesystem resources
- **Industry-Standard Authentication**: Support for OAuth 2.0, JWT tokens, API keys, and AWS SigV4 for flexible identity integration
- **Credential Management**: Secure handling of access tokens and service credentials without exposing sensitive information
- **Access Control**: Fine-grained permissions through IAM roles and policies, ensuring agents operate with least-privilege principles

### **Production Observability & Monitoring**
- **Real-Time Metrics**: Track invocation counts, latency, error rates, throttling, and resource utilization across all agent operations
- **Hierarchical Tracing**: Four-tier observability model (Runtime → Sessions → Traces → Spans) providing complete visibility into agent execution
- **OpenTelemetry Compatibility**: Standards-based telemetry integration with existing monitoring stacks and enterprise observability tools
- **Visual Debugging**: CloudWatch dashboards with trace timelines, error breakdowns, and performance analytics for rapid troubleshooting

### **Scalability & Performance Optimization**
- **Automatic Scaling**: Dynamic resource allocation based on demand without manual intervention
- **Session-Based Resource Management**: Efficient resource utilization through intelligent session lifecycle management
- **Cross-Session Data Protection**: Memory sanitization and resource cleanup preventing data leakage between user sessions
- **Optimized Execution Paths**: Performance monitoring and optimization capabilities for production workload efficiency


## Core Objectives
In this lab, we'll transform the local multi-agent system from lab7 into a **production-ready application** deployed on [Amazon Bedrock AgentCore Runtime](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/agents-tools-runtime.html).

**Note:** This lab depends on the completion of the Strands Agent in lab7. Please complete lab7 before running this particular lab.

The workflow shown in the architecture diagram below is as follows:

1. A journalist submits facts to a front-end backed by an LLM (Interface Supervisor)
2. The Interface Supervisor agent sends the facts to a Research agent.
3. The Research agent is equipped with a Tool that does the following:
   1. Entity Extraction: These can be people, companies, products, etc.
   2. Gather background information: This uses the Bedrock Knowledge Base we created in the setup phase. Any entity with a low confidence score (that is, one not mentioned anywhere in the Knowledge Base) is discarded.
4. The Lambda then returns the research to the Research agent, which returns it to the Interface Supervisor agent.
5. Once additional context has been provided by the Research agent, the Interface Supervisor agent sends the research and the facts to the Article Generation agent. This agent is part of a reflection pattern we covered earlier (Lab 5):
   1. News Generation agent: This writes the main news article based on the information provided by the Research agent.
   2. Article Reviewer agent: This provides feedback to the News Generation agent and together, these agents iteratively improve the quality of the generated article.
6. The remainder of the architecture is shown for completeness, and won't be part of this lab. Feel free to implement that if you have time at the end.


### Agent Architecture
Let's review the architecture of the agent that we built in lab7. In the following sections, we will describe how to deploy this existing agent to AgentCore Runtime with all the production-grade capabilities outlined above.

<div style="text-align:left">
    <img src="../imgs/lab8-strands-local.png" width="80%"/>
</div>

In this lab, this architecture moves from a local development environment to a **cloud-native, enterprise-ready deployment** that demonstrates industry best practices for production AI agent systems.

### Turn On AgentCore Observability
In this lab, we'll use the AgentCore Observability feature to trace the agent orchestrations. Make sure you've turned on `Transaction Search` in CloudWatch before running the notebook in this lab. For more information, see the [observability configuration guide](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability-configure.html).



In [None]:
# Make sure you download the latest botocore and boto3 libraries.
import shutil
import subprocess
import sys

def ensure_uv_installed():
    if shutil.which("uv") is None:
        print("🔧 'uv' not found. Installing with pip...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", "uv"])
    else:
        print("✅ 'uv' is already installed.")

def uv_install(*packages):
    ensure_uv_installed()
    uv_path = shutil.which("uv")
    print(f"📦 Installing {', '.join(packages)} using uv...")
    subprocess.check_call([uv_path, "pip", "install", *packages])

In [None]:
%uv pip install -r requirements.txt -U

## Environment Setup

Restore variables from previous notebook sessions (particularly the `lab7_kb_id` from lab7):


In [None]:
%store -r

Import the required libraries for AWS services and JSON processing:


In [None]:
import boto3
import json
import time
import uuid

## Agent Configuration

Define agent names and file paths. We'll create both local and runtime versions of the agent:


In [None]:
agent_name = "news_story_generator_agent"
news_story_agent_local_template_script = f"{agent_name}_local_template.py"
news_story_agent_local_script = f"{agent_name}_local.py"
news_story_agent_template_script = f"{agent_name}_template.py"
news_story_agent_script = f"{agent_name}.py"
agent_name = f"{agent_name}_{str(uuid.uuid4())[:5]}"


Read the local agent template file, replace the placeholder `{{lab7_kb_id}}` with the actual knowledge base ID from lab7, and write the configured agent to a new file:


In [None]:
with open(news_story_agent_local_template_script, "r") as f:
    agent_local_code = f.read()

new_agent_local_code = agent_local_code.replace("{{lab7_kb_id}}", lab7_kb_id)
with open(news_story_agent_local_script, "w") as f:
    f.write(new_agent_local_code)
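
A plain `str.replace` succeeds silently even when the placeholder is missing from the template or the value is empty. As a minimal sketch (not part of the lab code), the substitution could be wrapped in a helper that fails loudly in those cases. The name `fill_placeholders` is hypothetical and introduced here only for illustration.

```python
def fill_placeholders(template_text: str, values: dict) -> str:
    """Replace each {{name}} token with its value, raising if anything looks off.

    Hypothetical helper: only the placeholders named in `values` are touched,
    so other double-brace text in the template (for example escaped braces in
    f-strings) is left alone.
    """
    rendered = template_text
    for name, value in values.items():
        token = "{{" + name + "}}"
        if token not in rendered:
            raise ValueError(f"Placeholder {token} not found in template")
        if not value:
            raise ValueError(f"Empty value supplied for {token}")
        rendered = rendered.replace(token, str(value))
    return rendered


# Example with an in-memory template instead of the lab's template file.
example = fill_placeholders('KB_ID = "{{lab7_kb_id}}"', {"lab7_kb_id": "EXAMPLEKB1"})
print(example)
```

In the notebook, the same helper could be called with `agent_local_code` and `{"lab7_kb_id": lab7_kb_id}` before writing the file.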

Display the generated agent code to verify the configuration:


In [None]:
%pycat news_story_generator_agent_local.py

Let's test the script locally to ensure it works as expected.

## Local Testing

Create sample news facts to test our agent locally:


In [None]:
news_facts = """NeuraHealth Solutions announced its new medical diagnostic platform called "MediScan" at their annual developer conference yesterday.
The system demonstrated 94% accuracy in early disease detection across a trial of 12,000 patients.
Dr. Eliza Chen, Chief Medical Officer at NeuraHealth, revealed the system was trained on 50 million anonymized patient records.
"""
payload = json.dumps({ "query" : news_facts})

Run the agent locally with the sample news facts and measure the execution time:


In [None]:
%%time
!python news_story_generator_agent_local.py '{payload}'

## Preparing your agent for deployment on AgentCore Runtime

Let's now deploy our agents to AgentCore Runtime. To do so we need to:
* Import the Runtime App with `from bedrock_agentcore.runtime import BedrockAgentCoreApp`
* Initialize the App in our code with `app = BedrockAgentCoreApp()`
* Decorate the invocation function with the `@app.entrypoint` decorator
* Let AgentCore Runtime control the running of the agent with `app.run()`

Prepare the deployment version of the agent by reading the template and configuring it with the knowledge base ID:


In [None]:
with open(news_story_agent_template_script, "r") as f:
    new_agent_code = f.read()

new_agent_code = new_agent_code.replace("{{lab7_kb_id}}", lab7_kb_id)
with open(news_story_agent_script, "w") as f:
    f.write(new_agent_code)

Optional: Uncomment the following line to test the agent locally before deployment:


In [None]:
# !python news_story_generator_agent.py # Uncomment this line if you want to launch the agent locally for testing purposes.

## What happens behind the scenes?

When you use `BedrockAgentCoreApp`, it automatically:

* Creates an HTTP server that listens on port 8080
* Implements the required `/invocations` endpoint for processing the agent's requests
* Implements the `/ping` endpoint for health checks (very important for asynchronous agents)
* Handles proper content types and response formats
* Manages error handling according to AWS standards
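
To make the contract above concrete, here is a minimal standard-library sketch of a server that exposes the same two routes. It is not how `BedrockAgentCoreApp` is implemented. The names `handle_invocation`, `AgentContractHandler`, and `make_server`, as well as the response bodies, are assumptions chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_invocation(payload: dict) -> dict:
    """Hypothetical stand-in for the function you would decorate with @app.entrypoint."""
    query = payload.get("query", "")
    return {"result": f"received query of {len(query)} characters"}


class AgentContractHandler(BaseHTTPRequestHandler):
    """Serves GET /ping and POST /invocations with JSON bodies."""

    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        # Health check route; the response body here is illustrative only.
        if self.path == "/ping":
            self._send_json(200, {"status": "healthy"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/invocations":
            self._send_json(404, {"error": "not found"})
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self._send_json(400, {"error": "request body must be valid JSON"})
            return
        self._send_json(200, handle_invocation(payload))

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) a server bound to the given port."""
    return HTTPServer(("0.0.0.0", port), AgentContractHandler)
```

Calling `make_server().serve_forever()` would start the sketch locally. In the lab you do not write any of this yourself, because `app.run()` takes care of it.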

## Deploying the agent to AgentCore Runtime

The `CreateAgentRuntime` operation supports comprehensive configuration options, letting you specify container images, environment variables and encryption settings. You can also configure protocol settings (HTTP, MCP) and authorization mechanisms to control how your clients communicate with the agent. 

**Note:** As an operational best practice, package your code as a container and push it to Amazon ECR using CI/CD pipelines and IaC.

In this tutorial, we will use the Amazon Bedrock AgentCore Python SDK to easily package your artifacts and deploy them to AgentCore Runtime.

### Creating runtime role

Before starting, let's create an IAM role for our AgentCore Runtime. We will do so using a utility function provided for you.

Set up the Python path to access utility functions and create the IAM role for AgentCore Runtime:


In [None]:
import sys
import os

# Get the current notebook's directory
current_dir = os.path.dirname(os.path.abspath('__file__' if '__file__' in globals() else '.'))

utils_dir = os.path.join(current_dir, '..')
utils_dir = os.path.join(utils_dir, '..')
utils_dir = os.path.abspath(utils_dir)

# Add to sys.path
sys.path.insert(0, utils_dir)
print("sys.path[0]:", sys.path[0])

from utils import create_agentcore_role

agentcore_iam_role = create_agentcore_role(agent_name=agent_name)

### Configure AgentCore Runtime deployment

Next, we will use the starter toolkit to configure the AgentCore Runtime deployment with an entrypoint, the execution role we just created, and a requirements file. We will also configure the starter toolkit to auto-create the Amazon ECR repository on launch.

During the configure step, your Dockerfile is generated based on your application code.

<div style="text-align:left">
    <img src="../imgs/agentcore-runtime-configure.png" width="70%"/>
</div>

Configure the AgentCore Runtime deployment using the starter toolkit with the agent script, execution role, and requirements:


In [None]:
from bedrock_agentcore_starter_toolkit import Runtime
from boto3.session import Session
boto_session = Session()
region = boto_session.region_name
region

agentcore_runtime = Runtime()

response = agentcore_runtime.configure(
    entrypoint=news_story_agent_script,
    execution_role=agentcore_iam_role['Role']['Arn'],
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region=region,
    agent_name=agent_name
)
response

### Launching agent to AgentCore Runtime

Now that we have a Dockerfile, let's launch the agent to AgentCore Runtime. This will create the Amazon ECR repository and the AgentCore Runtime.

<div style="text-align:left">
    <img src="../imgs/agentcore-runtime-launch.png" width="75%"/>
</div> 

### Deploy to AgentCore Runtime

Launch the agent to the AgentCore Runtime, which will build the Docker image, push it to ECR, and create the runtime:


In [None]:
launch_result = agentcore_runtime.launch()

Monitor the agent deployment status and wait for it to be ready:


In [None]:
status_response = agentcore_runtime.status()
status = status_response.endpoint['status']
end_status = ['READY', 'CREATE_FAILED', 'DELETE_FAILED', 'UPDATE_FAILED']
while status not in end_status:
    time.sleep(10)
    status_response = agentcore_runtime.status()
    status = status_response.endpoint['status']
    print(status)
status
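
The polling loop above runs until a terminal status appears, so a deployment that never settles would block the notebook indefinitely. The following is a hedged sketch of the same pattern with a timeout. `wait_for_terminal_status` and its parameters are hypothetical names, and the `sleep` and `clock` arguments exist only so the function can be exercised without real waiting.

```python
import time


def wait_for_terminal_status(
    get_status,
    terminal_statuses,
    timeout_seconds: float = 900.0,
    poll_interval_seconds: float = 10.0,
    sleep=time.sleep,
    clock=time.monotonic,
) -> str:
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    terminal = set(terminal_statuses)
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"Status was still {status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_interval_seconds)


# Example with a fake status source that becomes READY on the third poll.
fake_statuses = iter(["CREATING", "CREATING", "READY"])
final_status = wait_for_terminal_status(
    lambda: next(fake_statuses),
    ["READY", "CREATE_FAILED", "DELETE_FAILED", "UPDATE_FAILED"],
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(final_status)
```

In the notebook, the status source would be `lambda: agentcore_runtime.status().endpoint['status']` and the terminal list would be `end_status`.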

### Invoking AgentCore Runtime with Session Management and Context Handling

Finally, we can invoke our AgentCore Runtime with a payload

<div style="text-align:left">
    <img src="../imgs/agentcore-runtime-invoke.png" width="75%"/>
</div>

Notice we also pass a session ID to the invoke function. AgentCore Runtime lets you isolate each user session and safely reuse context across multiple invocations in a user session. Session isolation is critical for AI agent workloads due to their unique operational characteristics:

* Complete execution environment separation: Each user session in Runtime receives its own dedicated microVM with isolated compute, memory, and filesystem resources. This prevents one user's agent from accessing another user's data. After session completion, the entire microVM is terminated and memory is sanitized to remove all session data, eliminating cross-session contamination risks.

* Stateful reasoning processes: Unlike stateless functions, AI agents maintain complex contextual state throughout their execution cycle, beyond simple message history for multi-turn conversations. Runtime preserves this state securely within a session while ensuring complete isolation between different users, enabling personalized agent experiences without compromising data boundaries.

* Privileged tool operations: AI agents perform privileged operations on users' behalf through integrated tools accessing various resources. Runtime's isolation model ensures these tool operations maintain proper security contexts and prevents credential sharing or permission escalation between different user sessions.

* Deterministic security for non-deterministic processes: AI agent behavior can be non-deterministic due to the probabilistic nature of foundation models. Runtime provides consistent, deterministic isolation boundaries regardless of agent execution patterns, delivering the predictable security properties required for enterprise deployments.


## Trace the Agent Logs
All log output from the agents is written to CloudWatch by default. The output of `agentcore_runtime.launch()` in the earlier cell includes the command for tailing the CloudWatch logs. Open a terminal in your IDE and run that command to see the output from the agent.

An example command: `aws logs tail /aws/bedrock-agentcore/runtimes/news_story_generator_agent_b1135-7B4Adh5eDR-DEFAULT --follow`

Invoke the deployed agent with a unique session ID and the sample news facts payload:


In [None]:
session_id = uuid.uuid4()
invoke_response = agentcore_runtime.invoke(json.loads(payload), session_id=str(session_id))

Display the response returned by the deployed agent:


In [None]:
invoke_response

## Part 2 - Identity Integration
Amazon Bedrock AgentCore Identity is an identity and credential management service designed specifically for AI agents and automated workloads. It provides secure authentication, authorization, and credential management capabilities that enable agents and tools to access AWS resources and third-party services on behalf of users while helping to maintain strict security controls and audit trails. 

Agent identities are implemented as workload identities with specialized attributes that enable agent-specific capabilities while helping to maintain compatibility with industry-standard workload identity patterns. The service integrates natively with Amazon Bedrock AgentCore to provide identity and credential management for agent applications, including Amazon Bedrock AgentCore Runtime (for hosting agents or tools) and Amazon Bedrock AgentCore Gateway (for securely connecting tools and other resources).

Amazon Bedrock AgentCore Identity supports seamless integration with AWS and third-party services through SigV4, standardized OAuth 2.0 flows, and API keys.

### Provision a Cognito User Pool
Let's provision a Cognito user pool with an app client and one test user. Note down (1) the Cognito discovery URL and (2) the Cognito app client ID. We will use them to configure our agent for inbound authentication with Cognito.


In [None]:
!chmod +x setup_cognito_user_pool.sh
!bash setup_cognito_user_pool.sh

### Configure AgentCore Runtime deployment

As before, we will use the starter toolkit to configure the AgentCore Runtime deployment with an entrypoint, an execution role, and a requirements file, and let the toolkit auto-create the Amazon ECR repository on launch. The Dockerfile is generated from your application code during the configure step.

The setup script above writes the Cognito details to `client.json` and `auth.json`. We'll parse the values from these files to configure and test the new endpoint.

Set up a new boto session for the identity-enabled deployment:

In [None]:
from bedrock_agentcore_starter_toolkit import Runtime
from boto3.session import Session
boto_session = Session()
region = boto_session.region_name
region

### Create IAM Role for Identity-Enabled Agent

Create a new IAM role and agent name for the identity-enabled deployment:

In [None]:
agent_name = f"news_story_generator_agent_{str(uuid.uuid4())[:5]}"
agentcore_iam_role = create_agentcore_role(agent_name=agent_name)

Read the Cognito configuration details (user pool ID, app client ID, and access token) from the JSON files created by the setup script, then build the discovery URL:


In [None]:
with open("client.json", "r") as f:
    client_data = f.read()
    client_data_dict = json.loads(client_data)
    user_pool_id = client_data_dict["UserPoolClient"]["UserPoolId"]
    client_id = client_data_dict["UserPoolClient"]["ClientId"]

with open("auth.json", "r") as f:
    auth_data = f.read()
    auth_data_dict = json.loads(auth_data)
    bearer_access_token = auth_data_dict["AuthenticationResult"]["AccessToken"]

cognito_discovery_url=f"https://cognito-idp.{region}.amazonaws.com/{user_pool_id}/.well-known/openid-configuration"
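
When an authorized call is rejected, it can help to look inside the access token before retrying. The sketch below decodes the JWT payload without verifying the signature, so it is suitable for debugging only. It assumes the token carries `client_id` and `exp` claims, as Cognito access tokens normally do. `decode_jwt_claims` and `check_access_token` are hypothetical helper names.

```python
import base64
import json
import time


def decode_jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying its signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("Expected a JWT with three dot-separated parts")
    payload_b64 = parts[1]
    # JWTs strip base64 padding, so restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def check_access_token(token: str, expected_client_id: str, now=None) -> list:
    """Return a list of human-readable problems found in the token claims."""
    claims = decode_jwt_claims(token)
    current_time = time.time() if now is None else now
    problems = []
    if claims.get("client_id") != expected_client_id:
        problems.append("client_id claim does not match the allowed client")
    if claims.get("exp", 0) <= current_time:
        problems.append("token has expired")
    return problems
```

In the notebook, `check_access_token(bearer_access_token, client_id)` returning an empty list would suggest the token is at least well-formed, unexpired, and issued for the configured app client. The service still performs the real signature validation.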

Configure the AgentCore Runtime with Cognito JWT authorization:

In [None]:
agentcore_runtime = Runtime()
response = agentcore_runtime.configure(
    entrypoint=news_story_agent_script,
    execution_role=agentcore_iam_role['Role']['Arn'],
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region=region,
    agent_name=agent_name,
    authorizer_configuration={
        "customJWTAuthorizer": {
            "discoveryUrl": cognito_discovery_url,
            "allowedClients": [client_id]
        }
    }
)
response


Launch the identity-enabled agent to AgentCore Runtime:

In [None]:
launch_result = agentcore_runtime.launch()
launch_result

Monitor the identity-enabled agent deployment status:

In [None]:
status_response = agentcore_runtime.status()
status = status_response.endpoint['status']
end_status = ['READY', 'CREATE_FAILED', 'DELETE_FAILED', 'UPDATE_FAILED']
while status not in end_status:
    time.sleep(10)
    status_response = agentcore_runtime.status()
    status = status_response.endpoint['status']
    print(status)
status

Test the agent without any authentication. The following call is expected to fail, given no credential was provided to the AgentCore runtime invocation.

In [None]:
session_id = uuid.uuid4()
invoke_response = agentcore_runtime.invoke(json.loads(payload), session_id=str(session_id))

### Invoking AgentCore Runtime with authorization

Let's invoke the agent with the right authorization token type. In our case, it will be the Cognito access token.

In [None]:
# Pass the Cognito access token loaded from auth.json as the bearer token.
session_id = uuid.uuid4()
invoke_response = agentcore_runtime.invoke(json.loads(payload), session_id=str(session_id), bearer_token=bearer_access_token)
invoke_response

## Part 3 - Agent Observability
AgentCore Observability helps you trace, debug, and monitor agent performance in production environments. It offers detailed visualizations of each step in the agent workflow, enabling you to inspect an agent's execution path, audit intermediate outputs, and debug performance bottlenecks and failures.

AgentCore Observability gives you real-time visibility into agent operational performance through access to dashboards powered by Amazon CloudWatch and telemetry for key metrics such as session count, latency, duration, token usage, and error rates. Rich metadata tagging and filtering simplify issue investigation and quality maintenance at scale. AgentCore emits telemetry data in standardized OpenTelemetry (OTEL)-compatible format, enabling you to easily integrate it with your existing monitoring and observability stack.

By default, AgentCore outputs a set of key built-in metrics for agents, gateway resources, and memory resources. For memory resources, AgentCore also outputs spans and log data if you enable it. You can also instrument your agent code to provide additional span and trace data and custom metrics and logs. See "Add observability to your Amazon Bedrock AgentCore resources" in the AgentCore documentation to learn more.

All of the metrics, spans, and logs output by AgentCore are stored in Amazon CloudWatch, and can be viewed in the CloudWatch console or downloaded from CloudWatch using the AWS CLI or one of the AWS SDKs.

In addition to the raw data stored in CloudWatch Logs, for agent runtime data only, the CloudWatch console provides an observability dashboard containing trace visualizations, graphs for custom span metrics, error breakdowns, and more. To learn more about viewing your agents' observability data, see "View observability data for your Amazon Bedrock AgentCore agents" in the AgentCore documentation.

Agent observability can be categorized into 4 main components: `Runtime`, `Sessions`, `Traces` and `Spans`.

`Runtime` metrics provide visibility into agent execution activity levels, processing latency, resource utilization, and error rates. AgentCore also provides aggregated metrics for total invocations and sessions.


### The relationship between the observability components

![agentcore-observability-hierarchy](../imgs/agentcore-observability-hierarchy.png)


`Sessions`, `traces`, and `spans` form a three-tiered hierarchical relationship in the observability framework for agents. A session contains multiple traces, with each trace representing a discrete interaction within the broader context of the session. Each trace, in turn, contains multiple spans that capture the fine-grained operations and steps within that interaction. This hierarchical structure allows you to analyze agent behavior at different levels of granularity, from high-level session patterns to mid-level interaction flows to detailed execution paths for specific operations.

- Sessions (highest level) - Represent complete user conversations or interaction contexts
- Traces (middle level) - Represent individual request-response cycles within a session
- Spans (lowest level) - Represent specific operations or steps within a trace

This multi-tiered relationship enables several important observability capabilities:

- Contextual analysis of individual interactions within their broader conversation flow
- Correlation of related requests across a user's interaction journey
- Progressive troubleshooting from session-level anomalies to trace-level patterns to span-level root causes
- Comprehensive performance profiling across different temporal and functional dimensions
- Holistic understanding of agent behavior patterns and evolution throughout a conversation
- Precise identification of performance bottlenecks at the operation level through span analysis

While traces provide visibility into complete request-response cycles, spans offer deeper insights into the internal workings of those cycles. Spans reveal exactly which operations consume the most time, where errors originate, and how different components interact within a single trace. This granularity is particularly valuable when troubleshooting complex issues or optimizing performance in sophisticated agent implementations.
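
As a rough illustration of the session, trace, and span hierarchy described above, the sketch below models the three levels with plain dataclasses and finds the slowest spans in a session. The class and field names are assumptions made for this example and do not mirror the actual telemetry schema emitted by AgentCore.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    start_ms: float
    end_ms: float

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    trace_id: str
    spans: list = field(default_factory=list)


@dataclass
class Session:
    session_id: str
    traces: list = field(default_factory=list)


def slowest_spans(session: Session, top_n: int = 3) -> list:
    """Return (trace_id, span) pairs for the longest-running spans in a session."""
    pairs = [(trace.trace_id, span) for trace in session.traces for span in trace.spans]
    return sorted(pairs, key=lambda pair: pair[1].duration_ms, reverse=True)[:top_n]


# Made-up data: one session, two traces, a handful of spans.
demo_session = Session(
    session_id="session-1",
    traces=[
        Trace("trace-a", [Span("retrieve context", 0, 120), Span("generate response", 120, 900)]),
        Trace("trace-b", [Span("parse input", 0, 15), Span("generate response", 15, 400)]),
    ],
)
for trace_id, span in slowest_spans(demo_session, top_n=2):
    print(trace_id, span.name, span.duration_ms)
```

The same drill-down, from session to trace to the span that consumes the most time, is what the CloudWatch views let you do interactively.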

By leveraging session, trace, and span data in your observability strategy, you can gain comprehensive insights into your agent's behavior, performance, and effectiveness at multiple levels of detail. This multi-layered approach to observability supports continuous improvement, robust troubleshooting, and informed optimization of your agent implementations, from high-level conversation patterns down to individual operation performance.

In the following sections, we'll look at each of these components in detail.

Navigate to the AWS console for [AgentCore Runtime](https://us-east-1.console.aws.amazon.com/bedrock-agentcore/agents), click on the agent runtime, then click on `Agent Details`, and click on the `Observability` link as shown in the following diagram:

![agentcore-runtime-to-observability](../imgs/agentcore-runtime-to-observability.png)

## Agent Runtime Metrics
The runtime metrics provided by AgentCore give you visibility into your agent execution activity levels, processing latency, resource utilization, and error rates. AgentCore also provides aggregated metrics for total invocations and sessions.

The following list describes the runtime metrics provided by AgentCore. Runtime metrics are batched at one-minute intervals. To learn more about viewing runtime metrics, see "View observability data for your Amazon Bedrock AgentCore agents" in the AgentCore documentation.

### Invocations
Shows the total number of requests made to the Data Plane API. Each API call counts as one invocation, regardless of the request payload size or response status.

### Invocations (aggregated)
Shows the total number of invocations across all resources.

### Throttles
Displays the number of requests throttled by the service due to exceeding allowed TPS (Transactions Per Second) or quota limits. These requests return ThrottlingException with HTTP status code 429. Monitor this metric to determine if you need to review your service quotas or optimize request patterns.
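
One common way to optimize request patterns on the client side is to retry throttled calls with exponential backoff and jitter. The sketch below is generic and does not call any AWS API. `call_with_backoff`, `ThrottledError`, and the `is_throttled` predicate are hypothetical names used only for illustration.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error such as an HTTP 429 response."""


def call_with_backoff(
    operation,
    is_throttled,
    max_attempts: int = 5,
    base_delay_seconds: float = 0.5,
    max_delay_seconds: float = 20.0,
    sleep=time.sleep,
):
    """Call operation(), retrying with exponential backoff while it is throttled."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Re-raise anything that is not throttling, or the final failure.
            if not is_throttled(error) or attempt == max_attempts:
                raise
            delay = min(max_delay_seconds, base_delay_seconds * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))  # "full jitter" spreads out retries


# Demo: an operation that is throttled twice before succeeding.
attempts = {"count": 0}


def flaky_operation():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise ThrottledError("429 Too Many Requests")
    return "ok"


result = call_with_backoff(
    flaky_operation,
    is_throttled=lambda error: isinstance(error, ThrottledError),
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(result, attempts["count"])
```

With boto3 clients, the predicate would typically inspect the error code on the raised `ClientError`. How the starter toolkit surfaces throttling is not covered in this lab, so treat that part as an assumption to verify.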

### System Errors
Shows the number of server-side errors encountered by AgentCore during request processing. High levels of server-side errors can indicate potential infrastructure or service issues that require investigation. See Error types for a list of possible error codes.

### User Errors
Represents the number of client-side errors resulting from invalid requests. These require user action to resolve. High levels of client-side errors can indicate issues with request formatting or permissions that need to be addressed. See Error types for a list of possible error codes.

### Latency
The total time elapsed between receiving the request and sending the final response token. Represents complete end-to-end processing time of the request.

### Total Errors
The total number of system and user errors. In the Amazon Bedrock AgentCore console, this metric displays the number of errors as a percentage of the total number of invocations.
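
The percentage shown in the console can be reproduced from the raw counts. This is a small sketch of that arithmetic. `error_rate_percent` is a hypothetical helper, and the numbers are made up.

```python
def error_rate_percent(system_errors: int, user_errors: int, invocations: int) -> float:
    """Total errors (system + user) as a percentage of invocations."""
    if invocations == 0:
        return 0.0  # no traffic means there is no error rate to report
    return 100.0 * (system_errors + user_errors) / invocations


# 3 system errors and 7 user errors across 200 invocations.
print(error_rate_percent(3, 7, 200))  # → 5.0
```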

### Session Count
Shows the total number of agent sessions. Useful for monitoring overall platform usage, capacity planning, and understanding user engagement patterns.

### Sessions (aggregated)
Shows the total number of sessions across all resources.

The observability landing page shows these metrics at the aggregate level. Under the `Overview` tab, you'll see visualizations of agent metrics including the number of sessions, token usage, errors, latency, and throttle counts.

![agentcore-observability-landing](../imgs/agentcore-observability-landing.png)

AgentCore Observability also includes separate `Session`, `Trace`, and `Span` level metrics. We'll dive into the specifics in the sections below.


## Session
A session represents a complete interaction context between a user and an agent. Sessions encapsulate the entire conversation or interaction flow, maintaining state and context across multiple exchanges. Each session has a unique identifier and captures the full lifecycle of user engagement with the agent, from initialization to termination.

Sessions provide the following capabilities for agents:

- Context persistence across multiple interactions within the same conversation
- State management for maintaining user-specific information
- Conversation history tracking for contextual understanding
- Resource allocation and management for the duration of the interaction
- Isolation between different user interactions with the same agent

From an observability perspective, sessions provide a high-level view of user engagement patterns, allowing you to monitor agent performance across metrics, traces, and spans and to understand how users interact with your agents over time and across different use cases.

By default, AgentCore provides a set of observability metrics at the session level for agents that are running in AgentCore Runtime. The CloudWatch generative AI observability page offers a variety of graphs and visualizations to help you interpret your agents' data. AgentCore also outputs a default set of metrics for memory resources, gateway resources, and built-in tools. All of these metrics can be viewed in CloudWatch. In addition to the provided metrics, logs and spans are provided by default for memory resources. By instrumenting your agent code, you can also capture custom metrics, logs, and spans for your agent, which can be viewed on the same page. To learn more, see the following sections and "View observability data for your Amazon Bedrock AgentCore agents" in the AgentCore documentation.

![agentcore-observability-session-page](../imgs/agentcore-observability-session-page.png)



## Traces and Spans

### Trace
A trace represents a detailed record of a single `request-response` cycle that begins with an agent invocation and may include additional calls to other agents. Traces capture the complete execution path of a request, including all internal processing steps, external service calls, decision points, and resource utilization. Each trace is associated with a specific session and provides granular visibility into the agent's behavior for a particular interaction.

Traces include the following components for agents:

- Request details including timestamps, input parameters, and context
- Processing steps showing the sequence of operations performed
- Tool invocations with input/output parameters and execution times
- Resource utilization metrics such as processing time
- Error information including exception details and recovery attempts
- Response generation details and final output

From an observability perspective, traces provide deep insights into the internal workings of your agents, allowing you to troubleshoot issues, optimize performance, and understand behavior patterns. By analyzing trace data, you can identify bottlenecks, detect anomalies, and verify that your agent is functioning as expected across different scenarios and inputs.

With AgentCore Runtime, agent tracing is turned on by default at deployment time, so you don't have to do anything to obtain the metrics.

Here's a screenshot of the AgentCore trace summary, which shows several key metrics at this level:

![agentcore-observability-trace-summary](../imgs/agentcore-observability-trace-summary.png)

Here is a screenshot of agent metrics at the `Trace` and `Span` level in CloudWatch:

![agentcore-observability-traces-top](../imgs/agentcore-observability-traces-top.png)

### Span

A span represents a discrete, measurable unit of work within an agent's execution flow. Spans capture fine-grained operations that occur during request processing, providing detailed visibility into the internal components and steps that make up a complete trace. Each span has a defined start and end time, creating a precise timeline of agent activities and their durations.

Spans include the following essential attributes for agent observability:

- Operation name identifying the specific function or process being executed
- Timestamps marking the exact start and end times of the operation
- Parent-child relationships showing how operations nest within larger processes
- Tags and attributes providing contextual metadata about the operation
- Events marking significant occurrences within the span's lifetime
- Status information indicating success, failure, or other completion states
- Resource utilization metrics specific to the operation

Spans form a hierarchical structure within traces, with parent spans encompassing child spans that represent more granular operations. For example, a high-level "process user query" span might contain child spans for "parse input," "retrieve context," "generate response," and "format output." This hierarchical organization creates a detailed execution tree that reveals the complete flow of operations within the agent.
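
The parent-child structure described above can be sketched by rebuilding an execution tree from a flat list of spans. The field names (`span_id`, `parent_id`, `start_ms`, `end_ms`) and the timings are assumptions chosen for the example, not the exact attributes emitted by AgentCore.

```python
from collections import defaultdict


def render_span_tree(spans):
    """Return indented lines showing how child spans nest under parents."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children[parent_id]:
            duration = span["end_ms"] - span["start_ms"]
            lines.append(f"{'  ' * depth}{span['name']} ({duration} ms)")
            walk(span["span_id"], depth + 1)

    walk(None, 0)  # root spans have no parent
    return lines


# Hypothetical spans mirroring the example in the text
spans = [
    {"span_id": "1", "parent_id": None, "name": "process user query", "start_ms": 0, "end_ms": 900},
    {"span_id": "2", "parent_id": "1", "name": "parse input", "start_ms": 0, "end_ms": 20},
    {"span_id": "3", "parent_id": "1", "name": "retrieve context", "start_ms": 20, "end_ms": 300},
    {"span_id": "4", "parent_id": "1", "name": "generate response", "start_ms": 300, "end_ms": 870},
    {"span_id": "5", "parent_id": "1", "name": "format output", "start_ms": 870, "end_ms": 900},
]

print("\n".join(render_span_tree(spans)))
# process user query (900 ms)
#   parse input (20 ms)
#   retrieve context (280 ms)
#   generate response (570 ms)
#   format output (30 ms)
```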

Span visualization is organized into `Timeline` and `Trajectory` views. Timeline shows how long each span took to complete. Trajectory shows the lineage of the spans as a hierarchical structure.

Here's a screenshot that shows a `Timeline` view of the spans within a trace:

![agentcore-observability-span-timeline](../imgs/agentcore-observability-span-timeline.png)

Here's a screenshot of a span `Trajectory`:

![agentcore-observability-spans-trajectory](../imgs/agentcore-observability-span-trajectory.png)

To drill down into the details of a span, you can use the span data to access the resource-level data and the corresponding event details. Here's a screenshot of span data:

![agentcore-observability-span-data](../imgs/agentcore-observability-spans-data.png)

You can drill down into a specific event to access information such as the model prompts, system prompts, and the response from an LLM invocation.

Here's a screenshot of an event detail for a particular span:

![agentcore-observability-span-event](../imgs/agentcore-observability-span-event.png)

# Cleanup

In [None]:
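import os

import boto3

# Control-plane client used to manage (and delete) AgentCore runtimes
agentcore_control_client = boto3.client(
    'bedrock-agentcore-control',
    region_name=region
)
ecr_client = boto3.client(
    'ecr',
    region_name=region
)

iam_client = boto3.client('iam')

# Delete the deployed agent runtime
runtime_delete_response = agentcore_control_client.delete_agent_runtime(
    agentRuntimeId=launch_result.agent_id
)

# Delete the ECR repository and all images in it (force=True).
# The repository name is everything after the registry host in the URI.
response = ecr_client.delete_repository(
    repositoryName=launch_result.ecr_uri.split('/', 1)[1],
    force=True
)

# Inline policies must be removed before the IAM role can be deleted
role_name = agentcore_iam_role['Role']['RoleName']
policies = iam_client.list_role_policies(
    RoleName=role_name,
    MaxItems=100
)

for policy_name in policies['PolicyNames']:
    iam_client.delete_role_policy(
        RoleName=role_name,
        PolicyName=policy_name
    )
iam_response = iam_client.delete_role(
    RoleName=role_name
)

# Remove the local JSON files created earlier in the lab, if they exist
for filename in ("client.json", "auth.json", "pool.json"):
    if os.path.exists(filename):
        os.remove(filename)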
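
Deleting the ECR repository requires the repository name rather than the full URI. The helper below is a hedged sketch of a more defensive way to derive it. `repository_name_from_uri` is a hypothetical function, and the URIs use a placeholder account ID. It handles namespaced repositories and an optional tag or digest suffix.

```python
def repository_name_from_uri(ecr_uri):
    """Extract the repository name from an ECR repository or image URI."""
    _registry, _, path = ecr_uri.partition("/")
    if not path:
        raise ValueError(f"not an ECR repository URI: {ecr_uri!r}")
    # Drop an optional digest ("@sha256:...") or tag (":latest")
    path = path.split("@", 1)[0]
    return path.split(":", 1)[0]


# Example URIs with a placeholder account ID
print(repository_name_from_uri("123456789012.dkr.ecr.us-west-2.amazonaws.com/bedrock-agentcore-demo"))
# bedrock-agentcore-demo
print(repository_name_from_uri("123456789012.dkr.ecr.us-west-2.amazonaws.com/team/agent:latest"))
# team/agent
```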

## Summary
### **AgentCore Runtime Deployment Pipeline**
- **Learned**: End-to-end deployment workflow using the Bedrock AgentCore Python SDK
- **Accomplished**: 
  - Configured agent templates with dynamic knowledge base IDs
  - Created IAM roles with appropriate permissions
  - Built and pushed Docker containers to Amazon ECR
  - Deployed agents with custom configuration options
- **Key Insight**: The AgentCore starter toolkit simplifies the containerization and deployment process significantly

### **Session Management & Context Isolation**
- **Learned**: How AgentCore Runtime provides secure session isolation for multi-user environments
- **Accomplished**: Implemented session-based agent invocations with unique session IDs
- **Key Features Explored**:
  - **MicroVM Isolation**: Each session gets dedicated compute, memory, and filesystem resources
  - **State Persistence**: Maintains conversational context across multiple interactions
  - **Security Boundaries**: Prevents cross-session data contamination
  - **Memory Sanitization**: Automatic cleanup after session completion
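
One way to keep sessions separate on the client side is to give each user a stable, unique session ID and reuse it across that user's calls. The sketch below is illustrative: the `SessionRegistry` class is hypothetical, and the 33-character minimum is an assumption about the `runtimeSessionId` length constraint that should be checked against the current API reference.

```python
import uuid

MIN_SESSION_ID_LENGTH = 33  # assumed minimum length for runtimeSessionId


class SessionRegistry:
    """Hand out one stable session ID per user (hypothetical helper)."""

    def __init__(self):
        self._sessions = {}

    def session_id_for(self, user):
        if user not in self._sessions:
            session_id = str(uuid.uuid4())  # 36 characters, random
            assert len(session_id) >= MIN_SESSION_ID_LENGTH
            self._sessions[user] = session_id
        return self._sessions[user]


registry = SessionRegistry()
alice_first = registry.session_id_for("alice")
alice_again = registry.session_id_for("alice")  # same ID, same session context
bob = registry.session_id_for("bob")            # different ID, isolated session
```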

### **Identity & Access Management Integration**
- **Learned**: How to secure agent endpoints with industry-standard authentication
- **Accomplished**: 
  - Provisioned Amazon Cognito User Pool with test users
  - Configured JWT-based authorization for agent access
  - Implemented bearer token authentication patterns
  - Tested both authenticated and unauthenticated access scenarios
- **Key Insight**: AgentCore Identity supports OAuth 2.0, SigV4, and API key authentication methods
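
The bearer token pattern can be sketched as assembling an HTTPS request by hand. Everything specific here is an assumption to verify against the AgentCore documentation: the endpoint format, the session header name, and the `{"prompt": ...}` payload shape. `build_invocation_request` is a hypothetical helper that only builds the request and does not send it.

```python
import json
import urllib.parse


def build_invocation_request(agent_arn, region, bearer_token, session_id, prompt):
    """Assemble the URL, headers, and body for a JWT-authorized invocation."""
    # The ARN contains ':' and '/', so it must be URL-encoded in the path
    escaped_arn = urllib.parse.quote(agent_arn, safe="")
    url = (  # assumed endpoint format
        f"https://bedrock-agentcore.{region}.amazonaws.com"
        f"/runtimes/{escaped_arn}/invocations?qualifier=DEFAULT"
    )
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
        # Assumed header name for passing the session ID
        "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id": session_id,
    }
    body = json.dumps({"prompt": prompt}).encode("utf-8")  # assumed payload shape
    return url, headers, body


# Placeholder values for illustration only
url, headers, body = build_invocation_request(
    agent_arn="arn:aws:bedrock-agentcore:us-west-2:123456789012:runtime/my-agent-abc123",
    region="us-west-2",
    bearer_token="test-token",
    session_id="a" * 36,
    prompt="Hello",
)
```

A request without the `Authorization` header, or with an expired token, should be rejected by a runtime configured for JWT authorization, which is the unauthenticated scenario tested in this lab.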

### **Production-Grade Observability**
- **Learned**: Comprehensive monitoring and debugging capabilities for AI agents
- **Key Components Mastered**:
  - **Runtime Metrics**: Invocation counts, latency, error rates, throttling, session analytics
  - **Session-Level Monitoring**: Complete user interaction lifecycle tracking
  - **Trace Analysis**: Detailed request-response cycle visibility with execution paths
  - **Span Granularity**: Fine-grained operation timing and hierarchical relationships
- **Tools Explored**: CloudWatch integration, OpenTelemetry compatibility, visual trace timelines


## **Key Takeaways**

1. **Seamless Migration**: Moving from local development to cloud production is streamlined with AgentCore Runtime
2. **Enterprise Features**: Session isolation, identity integration, and observability are built-in capabilities
3. **Developer Experience**: The starter toolkit abstracts complex containerization and deployment tasks
4. **Production Operations**: Comprehensive monitoring and debugging tools support enterprise requirements
5. **Security First**: Multiple authentication methods and session isolation ensure secure multi-tenant operations

## **Next Steps**
- Learn how to convert your Lambda functions into an MCP server using AgentCore Gateway in [02-mcp-integration-agentcore.ipynb](02-mcp-integration-agentocore.ipynb)
