# Lab 4: Deploy your Agent to Production with AgentCore Runtime

## Overview

In Lab 3 we centralized tools through AgentCore Gateway. Now we'll deploy our agent to production using [Amazon Bedrock AgentCore Runtime](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/agents-tools-runtime.html) - a secure, fully managed runtime that provides enterprise-grade reliability, automatic scaling, and comprehensive monitoring.

**Workshop Journey:**
- **Lab 1 (Done):** Create Agent Prototype
- **Lab 2 (Done):** Enhance with Memory
- **Lab 3 (Done):** Scale with Gateway & Identity
- **Lab 4 (Current):** Deploy to Production
- **Lab 5:** Build User Interface

### Why AgentCore Runtime Matters

**Current State (Labs 1-3):** The agent runs locally in a single session, has no monitoring, and cannot handle multiple concurrent users.

**After this lab:** Production-ready infrastructure with serverless auto-scaling, comprehensive observability (traces, metrics, logs), enterprise reliability, and secure deployment.

### Comprehensive Observability

[AgentCore Observability](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html) automatically captures traces, metrics, and logs from agent interactions, tool usage, and memory access. CloudWatch GenAI Observability provides dashboards for analyzing patterns, identifying bottlenecks, and real-time troubleshooting.

## Architecture for Lab 4

<div style="text-align:left">
    <img src="images/Intelligent Customer Service Assistant Lab4 - Runtime.png" width="100%"/>
</div>

### Step 1: Import Required Libraries

In [None]:
# Import required libraries
import os
import json
import boto3
from strands import Agent
from strands.models import BedrockModel
from lab_helpers.lab2_memory import create_or_get_memory_resource

from IPython.display import Markdown, display
def printmd(string):
    display(Markdown(string))

create_or_get_memory_resource()  # Just in case the memory lab wasn't executed

print("✅ Required libraries imported.")

### Step 2: Preparing Your Agent for AgentCore Runtime

#### Creating the Runtime-Ready Agent

Let's first add the necessary AgentCore Runtime components to our previous local agent implementation using the Python SDK.

Look for the `#### AGENTCORE RUNTIME - LINE i ####` comments below to see where the relevant deployment code was added. You'll find 4 such lines that prepare the runtime-ready agent:

1. Import the Runtime App with `from bedrock_agentcore.runtime import BedrockAgentCoreApp`
2. Initialize the App with `app = BedrockAgentCoreApp()`
3. Decorate our invocation function with `@app.entrypoint`
4. Let AgentCore Runtime control the execution with `app.run()`


In [None]:
%%writefile ./lab_helpers/lab4_runtime.py
from bedrock_agentcore.runtime import (
    BedrockAgentCoreApp,
)  #### AGENTCORE RUNTIME - LINE 1 ####
from strands import Agent
from strands.models import BedrockModel
from strands.tools.mcp import MCPClient
from mcp.client.streamable_http import streamablehttp_client
import requests
import boto3
from scripts.utils import get_ssm_parameter, get_cognito_client_secret
from lab_helpers.lab1_strands_agent import (
    get_return_policy,
    get_product_info,
    MODEL_ID,
)

from lab_helpers.lab2_memory import (
    CustomerSupportMemoryHooks,
    memory_client,
    ACTOR_ID,
    SESSION_ID,
)

# Lab1 import: Create the Bedrock model
model = BedrockModel(model_id=MODEL_ID)

# Lab2 import : Initialize memory via hooks
memory_id = get_ssm_parameter("/app/customersupport/agentcore/memory_id")
memory_hooks = CustomerSupportMemoryHooks(
    memory_id, memory_client, ACTOR_ID, SESSION_ID
)

# Lab3 import: Set up gateway client for MCP tools
def get_token(client_id: str, client_secret: str, scope_string: str, url: str) -> dict:
    """Get OAuth token for gateway authentication"""
    try:
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        data = {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope_string,
        }
        response = requests.post(url, headers=headers, data=data)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as err:
        return {"error": str(err)}

# Get gateway access token
gateway_access_token = get_token(
    get_ssm_parameter("/app/customersupport/agentcore/machine_client_id"),
    get_cognito_client_secret(),
    get_ssm_parameter("/app/customersupport/agentcore/cognito_auth_scope"),
    get_ssm_parameter("/app/customersupport/agentcore/cognito_token_url")
)

# Get gateway URL from SSM
gateway_id = get_ssm_parameter("/app/customersupport/agentcore/gateway_id")
gateway_client = boto3.client("bedrock-agentcore-control", region_name=boto3.session.Session().region_name)
gateway_response = gateway_client.get_gateway(gatewayIdentifier=gateway_id)
gateway_url = gateway_response["gatewayUrl"]

# Set up MCP client for gateway tools
mcp_client = MCPClient(
    lambda: streamablehttp_client(
        gateway_url,
        headers={"Authorization": f"Bearer {gateway_access_token['access_token']}"},
    )
)

# Initialize MCP client
try:
    mcp_client.start()
except Exception as e:
    print(f"Error initializing MCP client: {str(e)}")

# Combine local tools with gateway tools
tools = [get_return_policy, get_product_info] + mcp_client.list_tools_sync()

# Customer Service Assistant system prompt for runtime
CUSTOMER_SERVICE_SYSTEM_PROMPT = """You are a helpful and professional Customer Service Assistant for an electronics e-commerce company.
Your role is to:
- Help customers with return policy questions using accurate information
- Provide detailed product information and specifications
- Assist with order tracking and status inquiries
- Be friendly, patient, and understanding with customers
- Always offer additional help after answering questions
- If you can't help with something, direct customers to the appropriate contact

You have access to the following tools:
1. get_return_policy() - For return policy and warranty questions
2. get_product_info() - To get detailed product specifications and information
3. Gateway tools - For order tracking and customer profile information

Always use the appropriate tool to get accurate, up-to-date information rather than making assumptions about products, policies, or order status."""

# Create the Customer Service Assistant agent with all tools (local + gateway)
agent = Agent(
    model=model,
    tools=tools,  # Includes both local tools and gateway tools from MCP client
    system_prompt=CUSTOMER_SERVICE_SYSTEM_PROMPT,
    hooks=[memory_hooks],
)

# Initialize the AgentCore Runtime App
app = BedrockAgentCoreApp()  #### AGENTCORE RUNTIME - LINE 2 ####


@app.entrypoint  #### AGENTCORE RUNTIME - LINE 3 ####
def invoke(payload):
    """AgentCore Runtime entrypoint function"""
    user_input = payload.get("prompt", "")

    # Invoke the agent
    response = agent(user_input)
    return response.message["content"][0]["text"]


if __name__ == "__main__":
    app.run()  #### AGENTCORE RUNTIME - LINE 4 ####



#### What happens behind the scenes?

When you use `BedrockAgentCoreApp`, it automatically:

- Creates an HTTP server that listens on port 8080
- Implements the required `/invocations` endpoint for processing requests
- Implements the `/ping` endpoint for health checks
- Handles proper content types and response formats
- Manages error handling according to AWS standards
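
To make the contract listed above concrete, here is a minimal sketch of a server that exposes the same two endpoints using only the Python standard library. It is not how `BedrockAgentCoreApp` is implemented. The handler names, the echo logic, the `{"result": ...}` response shape, and the `/ping` response body are all assumptions made for illustration, and the sketch binds to a free local port rather than 8080 so it can run anywhere.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def handle_invocation(payload: dict) -> str:
    """Hypothetical stand-in for the agent entrypoint: echoes the prompt."""
    return f"echo: {payload.get('prompt', '')}"


class RuntimeContractHandler(BaseHTTPRequestHandler):
    """Illustrative handler exposing /ping (GET) and /invocations (POST)."""

    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        if self.path == "/ping":
            # Assumed health-check body; the real SDK response may differ.
            self._send_json(200, {"status": "healthy"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/invocations":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        self._send_json(200, {"result": handle_invocation(payload)})

    def log_message(self, *args):
        # Silence per-request logging to keep notebook output clean.
        pass


def start_server(port: int = 0) -> HTTPServer:
    """Start the sketch server in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), RuntimeContractHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(server: HTTPServer, path: str, payload=None) -> dict:
    """Send a GET (no payload) or a JSON POST (with payload) to the sketch server."""
    url = f"http://127.0.0.1:{server.server_port}{path}"
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Bypass any configured proxy, since this is a loopback call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())
```

Calling `call(server, "/invocations", {"prompt": "hi"})` against a server returned by `start_server()` exercises the same request path that the `invoke(payload)` entrypoint serves once the real app is running.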


### Step 3: Deploying to AgentCore Runtime

Now let's deploy our agent to AgentCore Runtime using the [AgentCore Starter Toolkit](https://github.com/aws/bedrock-agentcore-starter-toolkit).

#### Configure the Secure Runtime Deployment (AgentCore Runtime + AgentCore Identity)

First we will use our starter toolkit to configure the AgentCore Runtime deployment with an entrypoint, the execution role we will create and a requirements file. We will also configure the identity authorization using an Amazon Cognito user pool and we will configure the starter kit to auto create the Amazon ECR repository on launch.

During the configure step, a Dockerfile will be generated based on your application code.

<div style="text-align:left"> 
    <img src="images/configure.png" width="75%"/> 
</div>

**Note**: The Cognito access_token is valid for 2 hours only. If the access_token expires you can vend another access_token by using the `reauthenticate_user` method.
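
Rather than waiting for an invocation to fail, you can check how long a token has left before reusing it. The sketch below assumes the access token is a standard JWT carrying an `exp` claim and reads that claim without verifying the signature, so it is only suitable for deciding when to refresh, never for authorization decisions. The function names and the 5-minute margin are hypothetical choices for illustration.

```python
import base64
import json
import time
from typing import Optional


def token_seconds_remaining(access_token: str, now: Optional[float] = None) -> float:
    """Return seconds until the JWT `exp` claim (signature is NOT verified)."""
    payload_b64 = access_token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return claims["exp"] - current


def needs_refresh(
    access_token: str, margin_seconds: float = 300, now: Optional[float] = None
) -> bool:
    """True when the token expires within `margin_seconds` (assumed margin)."""
    return token_seconds_remaining(access_token, now) < margin_seconds
```

In this notebook, a check such as `if needs_refresh(bearer_token): bearer_token = reauthenticate_user(...)` before each invocation would keep long-running sessions from hitting the 2-hour limit.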


In [None]:
from lab_helpers.utils import setup_cognito_user_pool, reauthenticate_user

print("Setting up Amazon Cognito user pool...")
cognito_config = (
    setup_cognito_user_pool()
)  # You'll get your bearer token from this output cell.
print("✅ Cognito setup completed")

In [None]:
%%capture --no-stdout 
from bedrock_agentcore_starter_toolkit import Runtime
from lab_helpers.utils import create_agentcore_runtime_execution_role

# Initialize the runtime toolkit
boto_session = boto3.session.Session()
region = boto_session.region_name

execution_role_arn = create_agentcore_runtime_execution_role()

agentcore_runtime = Runtime()

# Configure the deployment
response = agentcore_runtime.configure(
    entrypoint="lab_helpers/lab4_runtime.py",
    execution_role=execution_role_arn,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region=region,
    agent_name="customer_service_assistant",
    authorizer_configuration={
        "customJWTAuthorizer": {
            "allowedClients": [cognito_config.get("client_id")],
            "discoveryUrl": cognito_config.get("discovery_url"),
        }
    },
)

print("✅ Configuration completed:", response)

#### Launch the Agent

Now let's launch our agent to AgentCore Runtime. This will create an AWS CodeBuild pipeline, the Amazon ECR repository and the AgentCore Runtime components.

<div style="text-align:left"> 
    <img src="images/launch.png" width="100%"/> 
</div>

In [None]:
# Launch the agent (this will build and deploy the container)
from lab_helpers.utils import put_ssm_parameter

printmd("**--------------------------------------------------------------------------------------------------------**")
printmd("**Please wait until the agent is launched. THIS WILL TAKE 2-3 MINUTES...**")
printmd("**--------------------------------------------------------------------------------------------------------**")

launch_result = agentcore_runtime.launch()
print("Launch completed:", launch_result.agent_arn)

agent_arn = put_ssm_parameter(
    "/app/customersupport/agentcore/runtime_arn", launch_result.agent_arn
)

print("✅ Agent Launched!!")

#### Check Deployment Status

Let's wait for the deployment to complete:


In [None]:
import time
from scripts.utils import get_ssm_parameter

# Wait for memory creation
client = boto3.client('bedrock-agentcore-control')
memory_id = get_ssm_parameter("/app/customersupport/agentcore/memory_id")
memory_status = "UNKNOWN"
# "CREATING" is deliberately not an end status: the loop must keep polling
# until the memory becomes ACTIVE or can no longer become ready.
memory_end_status = ["ACTIVE", "FAILED", "DELETING"]
while memory_status not in memory_end_status:
    print(f"Waiting for memory to be ready... Current status: {memory_status}")
    response = client.get_memory(memoryId=memory_id)
    memory_status = response['memory']['status']
    time.sleep(10)

print(f"✅ Final memory status: {memory_status}")

In [None]:
import time

# Wait for the agent to be ready
status_response = agentcore_runtime.status()
status = status_response.endpoint["status"]

end_status = ["READY", "CREATE_FAILED", "DELETE_FAILED", "UPDATE_FAILED"]
while status not in end_status:
    print(f"Waiting for deployment... Current status: {status}")
    time.sleep(10)
    status_response = agentcore_runtime.status()
    status = status_response.endpoint["status"]

print(f"✅ Final deployment status: {status}")
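
Both polling loops in this step run until an end status appears, so they would loop indefinitely if the resource never reached one. A minimal sketch of the same pattern with a timeout is shown here. The helper name `wait_for_status` and its default timeout and interval are hypothetical, not part of the starter toolkit.

```python
import time
from typing import Callable, Collection


def wait_for_status(
    get_status: Callable[[], str],
    terminal_statuses: Collection[str],
    timeout_seconds: float = 600,
    poll_interval_seconds: float = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `get_status` until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    status = get_status()
    while status not in terminal_statuses:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still in status {status!r} after {timeout_seconds}s")
        sleep(poll_interval_seconds)
        status = get_status()
    return status
```

With the objects defined in this lab, the deployment check could be written as `wait_for_status(lambda: agentcore_runtime.status().endpoint["status"], end_status)`.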

### Step 4: Invoking Your Deployed Agent

Now that our agent is deployed and ready, let's test it with some queries. We must invoke the agent with the right type of authorization token. In our case, that is a Cognito access token. Copy the access token from the cell above.

<div style="text-align:left"> 
    <img src="images/invoke.png" width="100%"/> 
</div>

#### Using the AgentCore Starter Toolkit

We can validate that the agent works by invoking it through the AgentCore Starter Toolkit. The starter toolkit can automatically create a session ID for us when we query our agent. Alternatively, you can pass the session ID as a parameter during invocation. For demonstration purposes, we will create our own session ID.

In [None]:
import uuid

# Create a session ID for demonstrating session continuity
session_id = uuid.uuid4()

# Test different customer service scenarios
user_query = "What's the return policy for smartphones if I'm not satisfied with my purchase?"

bearer_token = reauthenticate_user(
    cognito_config.get("client_id"), 
    cognito_config.get("client_secret")
)

response = agentcore_runtime.invoke(
    {"prompt": user_query}, 
    bearer_token=bearer_token,
    session_id=str(session_id)
)
response

#### Invoking the agent with session continuity

Since we are using AgentCore Runtime, we can easily continue our conversation with the same session id.

In [None]:
user_query = "I bought it 2 weeks ago, do I still have time to return it?"
response = agentcore_runtime.invoke(
    {"prompt": user_query}, 
    bearer_token=bearer_token,
    session_id=str(session_id)
)
response

#### Invoking the agent with a new user
In the example above, we did not mention the smartphone in the second query, but our agent still had the context for it. This is due to AgentCore Runtime session continuity. With a new session ID, as used for the new user below, the agent will not have that context.

In [None]:
# Creating a new session ID for demonstrating new customer
session_id2 = uuid.uuid4()

user_query = "Can you help me track my order? I need to know when it will arrive."
response = agentcore_runtime.invoke(
    {"prompt": user_query}, 
    bearer_token=bearer_token,
    session_id=str(session_id2)
)
response

#### Additional Customer Service Test Scenarios

Let's test more Customer Service Assistant capabilities with different types of queries:

In [None]:
# Test product information query
session_id3 = uuid.uuid4()

user_query = "Can you tell me about your laptop specifications? I'm looking for something with good performance."
response = agentcore_runtime.invoke(
    {"prompt": user_query}, 
    bearer_token=bearer_token,
    session_id=str(session_id3)
)
response

In [None]:
# Test order tracking with specific order ID (this will use gateway tools)
session_id4 = uuid.uuid4()

user_query = "What's the status of my order ORD-2025-001234? When will it be delivered?"
response = agentcore_runtime.invoke(
    {"prompt": user_query}, 
    bearer_token=bearer_token,
    session_id=str(session_id4)
)
response

In [None]:
# Test specific return policy question
session_id5 = uuid.uuid4()

user_query = "I bought headphones last week but they don't fit well. What's your return policy for accessories?"
response = agentcore_runtime.invoke(
    {"prompt": user_query}, 
    bearer_token=bearer_token,
    session_id=str(session_id5)
)
response

Because each of these queries uses a new session ID, the agent no longer has the context of earlier conversations and may ask for more information.

And that is all it takes to have a secure and scalable endpoint for our Customer Service Assistant with no need to manage all the underlying infrastructure!
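
Clients that do not use the starter toolkit can call the same endpoint over HTTPS with the bearer token. The sketch below only builds the request, so it can be inspected without network access. The URL pattern, the `qualifier` query parameter, and the session header name are assumptions based on the AgentCore Runtime documentation at the time of writing, and the function name is hypothetical. Verify all of them against the current documentation before relying on this.

```python
import json
import urllib.parse
import urllib.request


def build_invocation_request(
    agent_arn: str,
    region: str,
    bearer_token: str,
    session_id: str,
    prompt: str,
    qualifier: str = "DEFAULT",
) -> urllib.request.Request:
    """Build (but do not send) an HTTPS request to a deployed agent runtime."""
    # The ARN contains ':' and '/', so it must be fully URL-encoded in the path.
    escaped_arn = urllib.parse.quote(agent_arn, safe="")
    # Assumed endpoint format; check the AgentCore Runtime docs.
    url = (
        f"https://bedrock-agentcore.{region}.amazonaws.com"
        f"/runtimes/{escaped_arn}/invocations?qualifier={qualifier}"
    )
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        "Content-Type": "application/json",
        # Assumed header name for session continuity.
        "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id": session_id,
    }
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send the call. Reusing the same `session_id` across requests corresponds to the session continuity demonstrated above.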

### Step 5: AgentCore Observability

[AgentCore Observability](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html) provides monitoring and tracing capabilities for AI agents using Amazon OpenTelemetry Python Instrumentation and Amazon CloudWatch GenAI Observability.

**Note: If the sessions and traces are not visible yet, please continue to Lab 5, then return and check again. It may take up to 15 minutes for the logs to appear.**

#### Agents

The default AgentCore Runtime configuration logs our agent's traces to CloudWatch through **AgentCore Observability**. These traces can be viewed on the Amazon CloudWatch GenAI Observability dashboard.

Navigate to [Amazon CloudWatch](https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2) &rarr; GenAI Observability &rarr; Bedrock AgentCore.

![Agents Overview on CloudWatch](images/observability_agents.png)

#### Sessions

The Sessions view shows the list of all the sessions associated with all agents in your account.

![sessions](images/sessions_lab5_observability.png)

#### Traces

Trace view lists all traces from your agents in this account. To work with traces:

- Choose Filter traces to search for specific traces.
- Sort by column name to organize results.
- Under Actions, select Logs Insights to refine your search by querying across your log and span data or select Export selected traces to export.

![traces](images/traces_lab4_observability.png)


### Congratulations! 🎉

You have successfully completed **Lab 4: Deploy Customer Service Assistant to Production - Use AgentCore Runtime with Observability!**

Here is what you accomplished:

##### Production-Ready Deployment:

- Prepared your agent for production with minimal code changes (only 4 lines added)
- Validated proper session isolation between different customers
- Confirmed session continuity + memory persistence and context awareness per session

##### Enterprise-Grade Security & Identity:

- Implemented secure authentication using Cognito integration with JWT tokens
- Configured proper IAM roles and execution permissions for production workloads
- Established identity-based access control for secure agent invocation

##### Comprehensive Observability:

- Enabled AgentCore Observability for full request tracing across all customer sessions
- Configured CloudWatch GenAI Observability dashboard monitoring

##### Current Limitations (We'll fix these next!):

- **Developer-Focused Interaction** - Agent accessible via SDK/API calls but no user-friendly web interface
- **Manual Session Management** - Requires programmatic session creation rather than intuitive user experience

##### Next Up [Lab 5: Build User Interface →](lab-05-frontend.ipynb)
In Lab 5, you'll complete the customer experience by building a user-friendly interface. Let's go!
