# Lab 4: Deploy to Production with AgentCore Runtime

In this lab, you'll deploy the customer support agent to **AgentCore Runtime** as a production Flask server.

## What you'll learn
- How to package the Claude Agent SDK agent as a Flask server
- How to deploy to AgentCore Runtime using the starter toolkit
- How to invoke the deployed agent with JWT authentication
- How to use sessions for conversation continuity

## Architecture

<div style="text-align:left">
    <img src="images/architecture_lab4_runtime.png" width="75%"/>
</div>

## Key Pattern: Flask Server on Port 8080

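Unlike the Strands runtime, which wraps the agent in `BedrockAgentCoreApp` and `@app.entrypoint`,
the Claude Agent SDK agent runs behind a standard Flask server that listens on port 8080 and exposes `/ping` and `/invocations`: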

```python
from flask import Flask

app = Flask(__name__)


@app.route('/ping')
def ping():
    # Health check endpoint
    return {'status': 'healthy'}


@app.route('/invocations', methods=['POST'])
def invocations():
    # Use claude_agent_sdk.query()
    ...


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
```

## Step 1: Setup

In [None]:
import os
import json
import time

# Route Claude Agent SDK model calls through Amazon Bedrock
os.environ["CLAUDE_CODE_USE_BEDROCK"] = "1"
os.environ.pop("CLAUDECODE", None)

import boto3
from boto3.session import Session

boto_session = Session()
REGION = boto_session.region_name
account_id = boto3.client("sts").get_caller_identity()["Account"]

from utils.aws_helpers import (
    get_ssm_parameter,
    put_ssm_parameter,
    create_agentcore_runtime_execution_role,
    get_or_create_cognito_pool,
)

print(f"Region: {REGION}")
print(f"Account: {account_id}")

## Step 2: Review the Runtime Server

The Flask server at `runtime/app.py` handles:
- `/ping` - Health check endpoint
- `/invocations` - Main agent endpoint
- Memory context retrieval before calling `query()`
- Saving each interaction after the response is returned
- Gateway tool integration via MCP

Key differences from the Strands runtime:

| Strands Runtime | Claude Agent SDK Runtime |
|----------------|------------------------|
| `BedrockAgentCoreApp()` | `Flask(__name__)` |
| `@app.entrypoint` | `@app.route('/invocations')` |
| `app.run()` | `app.run(host='0.0.0.0', port=8080)` |
| Automatic session handling | Manual session ID from headers |
| Hook-based memory | Explicit retrieve/save |
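
Because the Flask server reads the session ID itself, a small helper along these lines may be useful. It is a sketch only: the header name below is an assumption about what the runtime forwards, and `resolve_session_id` is a hypothetical helper, so check `runtime/app.py` for the name actually used.

```python
import uuid
from typing import Mapping

# Assumed header name; confirm it against runtime/app.py.
SESSION_HEADER = "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id"


def resolve_session_id(headers: Mapping[str, str]) -> str:
    """Return the caller's session ID, or a fresh UUID if none was sent."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    session_id = normalized.get(SESSION_HEADER.lower(), "").strip()
    return session_id or str(uuid.uuid4())
```

Inside a Flask route this could be called as `resolve_session_id(request.headers)`. Falling back to a generated ID keeps single-shot requests working without requiring every client to manage sessions.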

In [None]:
# Review the runtime server code
with open("runtime/app.py", "r") as f:
    print(f.read())

## Step 3: Create Execution Role

In [None]:
# Create or get the IAM execution role for AgentCore Runtime
execution_role_arn = create_agentcore_runtime_execution_role()
print(f"Execution Role ARN: {execution_role_arn}")

## Step 4: Configure Runtime Deployment

Use the starter toolkit configuration step to generate deployment assets and runtime settings.

<div style="text-align:left">
    <img src="images/configure.png" width="75%"/>
</div>


In [None]:
from bedrock_agentcore_starter_toolkit import Runtime

# Get Cognito config for authorization
cognito_config = get_or_create_cognito_pool(refresh_token=True)
client_id = cognito_config["client_id"]
discovery_url = cognito_config["discovery_url"]

# Initialize runtime
agentcore_runtime = Runtime()

# Configure deployment
response = agentcore_runtime.configure(
    entrypoint="runtime/app.py",
    execution_role=execution_role_arn,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region=REGION,
    agent_name="customer_support_agent",
    authorizer_configuration={
        "customJWTAuthorizer": {
            "allowedClients": [client_id],
            "discoveryUrl": discovery_url,
        }
    },
    request_header_configuration={
        "requestHeaderAllowlist": [
            "Authorization",
        ]
    },
)

print("Runtime configured successfully!")
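
When debugging authorization failures, it can help to see which client a token was issued for. The sketch below decodes a JWT payload locally, assuming a Cognito access token that carries a `client_id` claim. It does not verify the signature, so treat it as a debugging aid only; the runtime's JWT authorizer performs the real validation. Both function names are hypothetical.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def is_allowed_client(token: str, allowed_clients: list) -> bool:
    """Check the token's client_id claim against an allow list."""
    return decode_jwt_claims(token).get("client_id") in allowed_clients
```

For example, `is_allowed_client(access_token, [client_id])` should return `True` for a token issued to the app client that was passed in `allowedClients`.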

## Step 5: Launch Deployment

This step builds the Docker container image and deploys it to AgentCore Runtime. It may take several minutes.

<div style="text-align:left">
    <img src="images/launch.png" width="100%"/>
</div>


In [None]:
# Get memory ID for environment variable
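try:
    memory_id = get_ssm_parameter("/app/customersupport/agentcore/memory_id")
except Exception as exc:
    # Fall back to an empty MEMORY_ID, but make the fallback visible
    print(f"Warning: could not read the memory ID from SSM ({exc}); MEMORY_ID will be empty.")
    memory_id = ""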

# Launch deployment (auto_update_on_conflict handles existing runtimes)
launch_result = agentcore_runtime.launch(
    env_vars={
        "MEMORY_ID": memory_id,
        "CLAUDE_CODE_USE_BEDROCK": "1",
    },
    auto_update_on_conflict=True,
)

print("Deployment launched!")
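
Since the launch cell passes an empty `MEMORY_ID` when the SSM lookup fails, the server side needs to handle that case. One possible approach, shown here as a sketch with a hypothetical helper name, is to treat an empty or missing value as "memory disabled":

```python
import os
from typing import Mapping, Optional


def read_memory_id(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return MEMORY_ID, or None when it is unset or empty."""
    # Accept an explicit mapping so the function is easy to test.
    env = os.environ if env is None else env
    memory_id = env.get("MEMORY_ID", "").strip()
    return memory_id or None
```

Returning `None` lets the caller skip memory retrieval and saving with a single `if memory_id is not None:` check instead of comparing against empty strings.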

## Step 6: Monitor Deployment Status

In [None]:
# Check deployment status
while True:
    status_response = agentcore_runtime.status()
    status = status_response.endpoint["status"]
    print(f"Status: {status}")
    
    if status == "READY":
        runtime_arn = status_response.endpoint.get("agentRuntimeArn", "")
        put_ssm_parameter("/app/customersupport/agentcore/runtime_arn", runtime_arn)
        print(f"\nRuntime ARN: {runtime_arn}")
        break
    elif status in ["CREATE_FAILED", "UPDATE_FAILED", "FAILED"]:
        print(f"Deployment failed: {status_response}")
        break
    
    time.sleep(30)
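
The loop above polls until a terminal status appears, so it can run indefinitely if the deployment stalls. A bounded variant could look like the sketch below. The helper name, the default timeout, and the list of failure states are assumptions; `get_status` is any callable that returns the current status string.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    get_status: Callable[[], str],
    success: str = "READY",
    failures: Iterable[str] = ("CREATE_FAILED", "UPDATE_FAILED", "FAILED"),
    timeout_s: float = 1800.0,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it returns the success state.

    Raises RuntimeError on a failure state and TimeoutError at the deadline.
    """
    failure_states = set(failures)
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == success:
            return status
        if status in failure_states:
            raise RuntimeError(f"deployment ended in state {status}")
        if clock() >= deadline:
            raise TimeoutError(f"still {status} after {timeout_s} seconds")
        sleep(interval_s)
```

In this notebook it might be called as `wait_for_status(lambda: agentcore_runtime.status().endpoint["status"])`. Raising on failure, rather than printing and breaking, stops later cells from running against a runtime that never became ready.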

## Step 7: Invoke Deployed Agent

Use the runtime endpoint with a valid Cognito access token to test the deployed agent.

<div style="text-align:left">
    <img src="images/invoke.png" width="100%"/>
</div>


In [None]:
import uuid

# Get a fresh Cognito access token
cognito_config = get_or_create_cognito_pool(refresh_token=True)
access_token = cognito_config["bearer_token"]

session_id = str(uuid.uuid4())

# Invoke the deployed agent
response = agentcore_runtime.invoke(
    {"prompt": "What is your return policy for smartphones?", "actor_id": "customer_001"},
    bearer_token=access_token,
    session_id=session_id,
)

print("=" * 60)
print("Response from deployed agent:")
print("=" * 60)
print(response)

In [None]:
# Test session continuity (same session_id)
response2 = agentcore_runtime.invoke(
    {"prompt": "And what about for laptops?", "actor_id": "customer_001"},
    bearer_token=access_token,
    session_id=session_id,
)

print("=" * 60)
print("Follow-up response (same session):")
print("=" * 60)
print(response2)

## Summary

In this lab, you deployed the agent to production using:

1. **Flask server** on port 8080 with `/ping` and `/invocations`
2. **IAM execution role** with Bedrock, Memory, Gateway, and SSM permissions
3. **Docker deployment** via `bedrock-agentcore-starter-toolkit`
4. **JWT authentication** with Cognito
5. **Session management** for conversation continuity

In **Lab 5**, you'll set up online evaluations to monitor agent quality.