## 5: AgentCore Runtime

This notebook deploys your agent to AgentCore Runtime for production use with automatic scaling and enterprise reliability. You'll prepare agent code, configure deployment, and test the deployed agent.

**Prerequisites:** Completed notebooks 1, 2, and 3 (this notebook loads the Cognito configuration saved in 2); Docker, Finch, or Podman installed

### Import Required Libraries

In [None]:
import os
import json
import time
import boto3
from bedrock_agentcore_starter_toolkit import Runtime

# Get AWS session information
session = boto3.Session()
region = session.region_name or 'us-west-2'

sts = session.client('sts')
identity = sts.get_caller_identity()
account_id = identity['Account']

print(f"Account ID: {account_id}")
print(f"Region: {region}")

### Load Configurations

In [None]:
# Load Cognito configuration from 2
with open('cognito_config.json', 'r') as f:
    cognito_config = json.load(f)

print("✅ Loaded Cognito configuration")
print(f"Client ID: {cognito_config.get('client_id')}")
print(f"Discovery URL: {cognito_config.get('discovery_url')}")

# Load memory and knowledge base configurations from 1
with open('memory_config.json', 'r') as f:
    memory_config = json.load(f)
memory_id = memory_config['memory_id']

with open('kb_config.json', 'r') as f:
    kb_config = json.load(f)
kb_id = kb_config['kb_id']

print(f"\nMemory ID: {memory_id}")
print(f"Knowledge Base ID: {kb_id}")

### Step 1: Preparing Your Agent for AgentCore Runtime

To make your agent runtime-ready, add just 4 lines of code:
1. Import `BedrockAgentCoreApp`
2. Initialize the app
3. Decorate your function with `@app.entrypoint`
4. Call `app.run()`

The agent is configured with:
- **Knowledge Base**: Policy questions via `retrieve` tool
- **Gateway Tools**: Refund management (create, list, approve)
- **Three Memory Strategies**:
  - SEMANTIC: Retrieves factual details from past conversations
  - USER_PREFERENCE: Recalls customer preferences and behavior patterns
  - SUMMARY: Provides conversation context and summaries
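
Each strategy is read from its own memory namespace. As a minimal sketch, the namespace strings used by the retrieval configuration in `agent_runtime.py` can be derived from the actor and session IDs as shown here. The helper name `build_retrieval_namespaces` is hypothetical and not part of any AgentCore library; the path patterns are copied from the agent code in this notebook.

```python
def build_retrieval_namespaces(actor_id: str, session_id: str) -> dict[str, str]:
    """Map each memory strategy to the namespace the agent reads from.

    Hypothetical helper: the patterns mirror the retrieval_config keys
    used in agent_runtime.py.
    """
    if not actor_id or not session_id:
        raise ValueError("actor_id and session_id must be non-empty")
    base = f"returns/customer/{actor_id}"
    return {
        "SEMANTIC": f"{base}/semantic",
        "USER_PREFERENCE": f"{base}/preferences",
        # Summaries are scoped to a single session, so the session ID
        # is part of the path.
        "SUMMARY": f"{base}/{session_id}/summary",
    }
```

Note that only the summary namespace includes the session ID, so facts and preferences carry across sessions while summaries stay tied to one conversation.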

In [None]:
%%writefile ./agent_runtime.py
import os
import json
import requests
from bedrock_agentcore.runtime import BedrockAgentCoreApp

from strands import Agent, tool
from strands.models import BedrockModel
from strands_tools import retrieve, current_time
from bedrock_agentcore.memory.integrations.strands.config import AgentCoreMemoryConfig, RetrievalConfig
from bedrock_agentcore.memory.integrations.strands.session_manager import AgentCoreMemorySessionManager
from utils.agent_memory import REGION, SESSION_ID, ACTOR_ID

MODEL_ID = "us.anthropic.claude-3-5-haiku-20241022-v1:0"

bedrock_model = BedrockModel(model_id=MODEL_ID, temperature=0.3)
app = BedrockAgentCoreApp()

kb_id = os.environ.get("KNOWLEDGE_BASE_ID", "NOT AVAILABLE")
memory_id = os.environ.get("MEMORY_ID")
if not memory_id:
    raise Exception("Environment variable MEMORY_ID is required")

# Gateway and Cognito settings are read from environment variables when this module is imported
gateway_url = os.environ.get("GATEWAY_URL")
cognito_client_id = os.environ.get("COGNITO_CLIENT_ID")
cognito_client_secret = os.environ.get("COGNITO_CLIENT_SECRET")
cognito_discovery_url = os.environ.get("COGNITO_DISCOVERY_URL")

# Cache for gateway token (will be fetched on first use)
_gateway_token_cache = None

def get_gateway_token():
    """Get OAuth token for gateway access"""
    import logging
    import base64
    logger = logging.getLogger(__name__)
    
    try:
        if not all([gateway_url, cognito_client_id, cognito_client_secret, cognito_discovery_url]):
            missing = []
            if not gateway_url: missing.append("GATEWAY_URL")
            if not cognito_client_id: missing.append("COGNITO_CLIENT_ID")
            if not cognito_client_secret: missing.append("COGNITO_CLIENT_SECRET")
            if not cognito_discovery_url: missing.append("COGNITO_DISCOVERY_URL")
            raise Exception(f"Missing required environment variables: {', '.join(missing)}")
        
        logger.info(f"Fetching token endpoint from: {cognito_discovery_url}")
        discovery_response = requests.get(cognito_discovery_url, timeout=10)
        discovery_response.raise_for_status()
        token_endpoint = discovery_response.json()['token_endpoint']
        logger.info(f"Token endpoint: {token_endpoint}")
        
        # Create Basic auth header
        credentials = f"{cognito_client_id}:{cognito_client_secret}"
        encoded_credentials = base64.b64encode(credentials.encode()).decode()
        
        logger.info("Requesting OAuth token...")
        response = requests.post(
            token_endpoint,
            headers={
                'Authorization': f'Basic {encoded_credentials}',
                'Content-Type': 'application/x-www-form-urlencoded'
            },
            data={
                'grant_type': 'client_credentials',
                'scope': 'workshop-api/read workshop-api/write'
            },
            timeout=10
        )
        response.raise_for_status()
        token = response.json()['access_token']
        logger.info("Successfully obtained OAuth token")
        return token
    except Exception as e:
        logger.error(f"Failed to get gateway token: {str(e)}")
        raise

def get_cached_gateway_token():
    """Get cached gateway token or fetch a new one"""
    global _gateway_token_cache
    if _gateway_token_cache is None:
        _gateway_token_cache = get_gateway_token()
    return _gateway_token_cache

# Gateway tool definitions
@tool
def create_refund_request(order_id: str, amount: str, reason: str, user_id: str = "user456") -> str:
    """Create a new refund request.
    
    Args:
        order_id: Order ID
        amount: Refund amount in dollars (as string, e.g., '49.99')
        reason: Reason for refund
        user_id: User ID (default: user456)
    """
    response = requests.post(
        gateway_url,
        headers={"Authorization": f"Bearer {get_cached_gateway_token()}", "Content-Type": "application/json"},
        json={
            "jsonrpc": "2.0",
            "id": 1,
            "method": "tools/call",
            "params": {
                "name": "CreateRefundRequest___create_refund_request",
                "arguments": {"order_id": order_id, "amount": float(amount), "reason": reason, "user_id": user_id}
            }
        },
        timeout=30  # avoid hanging indefinitely if the gateway does not respond
    )
    response.raise_for_status()
    result = response.json()
    if 'error' in result:
        raise Exception(f"Gateway error: {result['error']}")
    return json.dumps(result.get('result', {}), indent=2)

@tool
def list_refund_requests(user_id: str = "user456") -> str:
    """List all refund requests for a user.
    
    Args:
        user_id: User ID (default: user456)
    """
    response = requests.post(
        gateway_url,
        headers={"Authorization": f"Bearer {get_cached_gateway_token()}", "Content-Type": "application/json"},
        json={
            "jsonrpc": "2.0",
            "id": 1,
            "method": "tools/call",
            "params": {
                "name": "ListReturnRequest___list_refund_requests",
                "arguments": {"user_id": user_id}
            }
        },
        timeout=30  # avoid hanging indefinitely if the gateway does not respond
    )
    response.raise_for_status()
    result = response.json()
    if 'error' in result:
        raise Exception(f"Gateway error: {result['error']}")
    return json.dumps(result.get('result', {}), indent=2)

@tool
def approve_refund_request(refund_request_id: str, user_id: str = "user456", status: str = "approved", approver_notes: str = "") -> str:
    """Approve or reject a refund request.
    
    Args:
        refund_request_id: Refund request ID
        user_id: User ID (default: user456)
        status: Status - 'approved' or 'rejected' (default: approved)
        approver_notes: Optional notes from approver
    """
    response = requests.post(
        gateway_url,
        headers={"Authorization": f"Bearer {get_cached_gateway_token()}", "Content-Type": "application/json"},
        json={
            "jsonrpc": "2.0",
            "id": 1,
            "method": "tools/call",
            "params": {
                "name": "ApproveReturnRequest___approve_refund_request",
                "arguments": {
                    "refund_request_id": refund_request_id,
                    "user_id": user_id,
                    "status": status,
                    "approver_notes": approver_notes
                }
            }
        },
        timeout=30  # avoid hanging indefinitely if the gateway does not respond
    )
    response.raise_for_status()
    result = response.json()
    if 'error' in result:
        raise Exception(f"Gateway error: {result['error']}")
    return json.dumps(result.get('result', {}), indent=2)

system_prompt = f"""You are an Amazon Returns & Refunds assistant with access to:
- Knowledge Base (retrieve tool with knowledgeBaseId="{kb_id}") for policy questions
- Gateway tools for refund management: create_refund_request, list_refund_requests, approve_refund_request
- Customer conversation history and preferences through memory

Use conversation history to provide personalized assistance. Reference past interactions when relevant.
For refund operations, use user_id 'user456' as default if not specified."""

@app.entrypoint
def invoke(payload, context=None):
    session_id = context.session_id if context else SESSION_ID
    actor_id = payload.get("actor_id", ACTOR_ID)
    
    agentcore_memory_config = AgentCoreMemoryConfig(
        memory_id=memory_id,
        session_id=session_id,
        actor_id=actor_id,
        retrieval_config={
            f"returns/customer/{actor_id}/semantic": RetrievalConfig(top_k=3, relevance_score=0.2),
            f"returns/customer/{actor_id}/preferences": RetrievalConfig(top_k=3, relevance_score=0.2),
            f"returns/customer/{actor_id}/{session_id}/summary": RetrievalConfig(top_k=2, relevance_score=0.2)
        }
    )
    
    session_manager = AgentCoreMemorySessionManager(
        agentcore_memory_config=agentcore_memory_config,
        region_name=REGION
    )
    
    agent = Agent(
        model=bedrock_model,
        tools=[retrieve, current_time, create_refund_request, list_refund_requests, approve_refund_request],
        system_prompt=system_prompt,
        session_manager=session_manager
    )
    
    user_input = payload.get("prompt", "")
    response = agent(user_input)
    return response.message["content"][0]["text"]

if __name__ == "__main__":
    app.run()
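
`get_cached_gateway_token` above keeps the first token for the lifetime of the process. OAuth 2.0 token responses commonly include an `expires_in` field, so a long-running runtime may eventually send an expired token. The following is a minimal sketch of an expiry-aware cache, not the workshop's implementation. The class name `ExpiringTokenCache` is hypothetical, and it assumes the fetch function returns a `(token, lifetime_in_seconds)` pair, for example built from `access_token` and `expires_in`.

```python
import time
from typing import Callable, Optional, Tuple


class ExpiringTokenCache:
    """Cache a token and refetch it shortly before it expires (sketch)."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
        refresh_margin: float = 60.0,
    ) -> None:
        self._fetch = fetch  # assumed to return (token, lifetime_in_seconds)
        self._clock = clock  # injectable so the cache can be tested
        self._refresh_margin = refresh_margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refetch when nothing is cached or the token is inside the margin.
        if self._token is None or now >= self._expires_at - self._refresh_margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

The refresh margin trades a slightly earlier refetch for a lower chance of a request failing because the token expired in flight.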

### What Happens Behind the Scenes?

`BedrockAgentCoreApp` automatically starts an HTTP server on port 8080, implements the `/invocations` and `/ping` endpoints, and handles content types and errors for you.
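
To make that concrete, here is a standard-library sketch of a server exposing the same two paths. It is an illustration only and not how `BedrockAgentCoreApp` is implemented: the function names `route` and `make_server`, the `{"status": "healthy"}` health payload, and the `{"result": ...}` response shape are all assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(method, path, body, entrypoint):
    """Return (status_code, json_body) for one request."""
    if method == "GET" and path == "/ping":
        return 200, {"status": "healthy"}  # placeholder health payload
    if method == "POST" and path == "/invocations":
        try:
            payload = json.loads(body or b"{}")
            return 200, {"result": entrypoint(payload)}
        except Exception as exc:
            # In this sketch, malformed JSON and entrypoint failures
            # are both reported as HTTP 500.
            return 500, {"error": str(exc)}
    return 404, {"error": "not found"}


def make_server(entrypoint, port=8080):
    """Create (but do not start) an HTTP server wired to `route`."""

    class Handler(BaseHTTPRequestHandler):
        def _respond(self, method):
            length = int(self.headers.get("Content-Length") or 0)
            body = self.rfile.read(length) if length else b""
            status, result = route(method, self.path, body, entrypoint)
            data = json.dumps(result).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def do_GET(self):
            self._respond("GET")

        def do_POST(self):
            self._respond("POST")

        def log_message(self, *args):  # silence per-request logging
            pass

    return HTTPServer(("127.0.0.1", port), Handler)
```

Calling `make_server(my_function).serve_forever()` would run the sketch locally; keeping the routing in a plain function makes it easy to check without opening a socket.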

### Step 2: Configure the Runtime Deployment

**⚠️ IMPORTANT: If you get a `ResourceNotFoundException`, run the cell below first!**

Configure deployment settings including entrypoint, execution role, and Cognito authentication.

In [None]:
import os

# Remove old configuration if it exists
config_file = '.bedrock_agentcore.yaml'
if os.path.exists(config_file):
    print("🧹 Found existing configuration file...")
    os.remove(config_file)
    print("✅ Removed old configuration. Will create fresh deployment.")
else:
    print("✅ No old configuration found. Ready for fresh deployment.")

In [None]:
# Initialize the AgentCore runtime toolkit
agentcore_runtime = Runtime()

# Configure the AgentCore agent deployment
response = agentcore_runtime.configure(
    entrypoint="agent_runtime.py",
    auto_create_ecr=True,
    execution_role=cognito_config.get("execution_role"),
    auto_create_execution_role=False,
    memory_mode="NO_MEMORY",
    requirements_file="requirements.txt",
    region=region,
    agent_name="returns_refunds_agent",
    authorizer_configuration={
        "customJWTAuthorizer": {
            "allowedClients": [cognito_config.get("client_id")],
            "discoveryUrl": cognito_config.get("discovery_url"),
        }
    },
)

print("Configuration completed:", response)
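
Because `cognito_config.get(...)` returns `None` for a missing key, the authorizer block above could be built with empty values without any error at configure time. The following is a minimal sketch of a pre-flight check. The helper `build_jwt_authorizer` is hypothetical, and the two URL checks are assumptions: that the discovery URL uses HTTPS and follows the OpenID Connect discovery path convention.

```python
OIDC_SUFFIX = "/.well-known/openid-configuration"


def build_jwt_authorizer(client_ids, discovery_url):
    """Return a customJWTAuthorizer block after basic sanity checks (sketch)."""
    clients = [c for c in client_ids if c]  # drop None and empty strings
    if not clients:
        raise ValueError("at least one allowed client ID is required")
    if not discovery_url or not discovery_url.startswith("https://"):
        raise ValueError("discovery URL must be an https:// URL")
    if not discovery_url.endswith(OIDC_SUFFIX):
        raise ValueError(f"discovery URL should end with {OIDC_SUFFIX}")
    return {
        "customJWTAuthorizer": {
            "allowedClients": clients,
            "discoveryUrl": discovery_url,
        }
    }
```

The returned dictionary has the same shape as the `authorizer_configuration` argument passed to `configure` in the cell above.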

### View Generated Configuration

In [None]:
!cat .bedrock_agentcore.yaml

### Step 3: Launch the Agent

Deploy to AgentCore Runtime. This creates a CodeBuild pipeline, ECR repository, and runtime components.

**Note:** If you previously deleted the agent or are re-running this lab, the launch will create a new agent instance.
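
The launch passes several values from earlier configuration files as environment variables, and a missing one only surfaces later as a runtime error inside the agent. As a minimal sketch, the values can be checked before launching; `collect_env_vars` is a hypothetical helper, not part of the starter toolkit.

```python
def collect_env_vars(sources, required):
    """Build an env-var dict and fail fast on missing or empty values.

    sources:  mapping of env-var name -> value (values may be None)
    required: iterable of env-var names that must be non-empty
    """
    missing = sorted(name for name in required if not sources.get(name))
    if missing:
        raise KeyError(f"Missing values for: {', '.join(missing)}")
    # Environment variables are strings, so normalise every value to str
    # and drop optional entries that were never set.
    return {name: str(value) for name, value in sources.items() if value is not None}
```

For example, `collect_env_vars({"MEMORY_ID": memory_id, "KNOWLEDGE_BASE_ID": kb_id}, ["MEMORY_ID"])` would raise before any build starts if `memory_id` were empty.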

In [None]:
# Load gateway configuration
with open('gateway_config.json', 'r') as f:
    gateway_config = json.load(f)

print("🚀 Launching agent to AgentCore Runtime...")
print("   This will:")
print("   1. Build a Docker container with your agent code")
print("   2. Push the container to Amazon ECR")
print("   3. Deploy to AgentCore Runtime")
print("   4. Configure authentication and environment variables")
print("\n⏳ This process typically takes 5-10 minutes...\n")

try:
    launch_result = agentcore_runtime.launch(
        env_vars={
            "MEMORY_ID": memory_id,
            "KNOWLEDGE_BASE_ID": kb_id,
            "GATEWAY_URL": gateway_config['gateway_url'],
            "COGNITO_CLIENT_ID": cognito_config['client_id'],
            "COGNITO_CLIENT_SECRET": cognito_config['client_secret'],
            "COGNITO_DISCOVERY_URL": cognito_config['discovery_url']
        },
        auto_update_on_conflict=True
    )
    
    print("✅ Launch initiated successfully!")
    print(f"   Agent ARN: {launch_result.agent_arn}")
    
    # Save runtime configuration to JSON file
    runtime_config = {
        "agent_arn": launch_result.agent_arn
    }
    with open('runtime_config.json', 'w') as f:
        json.dump(runtime_config, f, indent=2)
    
    print("\n✅ Runtime configuration saved to runtime_config.json")
    print("\n💡 Proceed to the next cell to monitor deployment status.")
    
except Exception as e:
    print(f"❌ Launch failed: {e}")
    print("\nCommon issues:")
    print("1. Docker/Finch/Podman not running - start your container runtime")
    print("2. Insufficient IAM permissions - check your AWS credentials")
    print("3. ECR repository issues - verify ECR access")
    print("4. Configuration file missing - ensure gateway_config.json exists")
    raise

### Step 4: Check Deployment Status

Monitor the deployment progress. The agent goes through several stages: building the container image, pushing to ECR, and deploying to AgentCore Runtime.
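
The polling loop in the next cell runs until a terminal status appears, with no upper bound. As a minimal sketch, a generic poller with a deadline could look like the following. `wait_for_status` is a hypothetical helper, and the 15-minute default timeout is an assumption chosen to sit above the typical 5-10 minute deployment time.

```python
import time


def wait_for_status(get_status, terminal, timeout=900.0, interval=10.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a terminal value or time runs out."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout:.0f}s")
        sleep(interval)
```

With the objects from this notebook, it might be called as `wait_for_status(lambda: agentcore_runtime.status().endpoint["status"], end_status)`.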

In [None]:
# Wait for the agent to be ready
print("🔍 Checking agent deployment status...\n")

try:
    status_response = agentcore_runtime.status()
    status = status_response.endpoint["status"]
    
    end_status = ["READY", "CREATE_FAILED", "DELETE_FAILED", "UPDATE_FAILED"]
    
    while status not in end_status:
        print(f"⏳ Current status: {status}")
        time.sleep(10)
        status_response = agentcore_runtime.status()
        status = status_response.endpoint["status"]
    
    if status == "READY":
        print(f"\n✅ Agent is {status} and ready to use!")
    else:
        print(f"\n⚠️ Agent deployment ended with status: {status}")
        print("Check CloudWatch logs for details.")
        
except Exception as e:
    print(f"⚠️ Error checking status: {e}")
    print("\nThis might happen if:")
    print("1. The agent is still being created (wait a few minutes)")
    print("2. The agent was deleted and needs to be re-launched")
    print("3. There's a configuration mismatch")
    print("\n💡 If the agent was previously deleted, re-run the launch cell above.")

### Step 5: Test Your Deployed Agent

Test the agent with Knowledge Base queries, Gateway tool calls, and memory-aware responses.

**Note:** If you get a 424 error, the agent may still be initializing. Wait a minute and try again.
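
The wait-and-retry advice above can be automated. The following is a minimal sketch of a retry wrapper with doubling delays. `call_with_retry` is a hypothetical helper, and how a 424 surfaces from the toolkit is not specified here, so the caller supplies its own `is_retryable` check (for instance, looking for "424" in the error message, which is a guess).

```python
import time


def call_with_retry(func, is_retryable, attempts=4, base_delay=15.0, sleep=time.sleep):
    """Call func(); on a retryable error, wait and retry with doubling delays."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            # Give up on the final attempt or on errors we should not retry.
            if last_attempt or not is_retryable(exc):
                raise
            sleep(base_delay * (2 ** attempt))
```

With the defaults, the waits are 15, 30, and 60 seconds before the fourth and final attempt.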

### Test 1: Knowledge Base Integration

Test the agent's ability to query the knowledge base for return policy information.

In [None]:
from utils.identity_ssm_utils import reauthenticate_user

# Get fresh bearer token
bearer_token = reauthenticate_user(
    cognito_config.get("client_id"),
    cognito_config.get("client_secret")
)

print("\n" + "="*80)
print("TEST 1: Knowledge Base Query")
print("="*80)

# Test 1: Knowledge Base query
query1 = "What's the return policy for electronics in the US?"
print(f"\nQuery: {query1}\n")

try:
    response1 = agentcore_runtime.invoke(
        {"prompt": query1},
        bearer_token=bearer_token
    )
    print(f"Response: {response1}\n")
except Exception as e:
    print(f"❌ Error: {e}")
    print("\nTroubleshooting tips:")
    print("1. Check if agent status is READY (run the status cell above)")
    print("2. Wait 1-2 minutes for the agent to fully initialize")
    print("3. Verify the gateway_config.json and cognito_config.json files exist")
    print("4. Check CloudWatch logs for the agent runtime")

print("="*80)

### Test 2: Gateway Tool - Create Refund Request

Test the agent's ability to create a refund request using the gateway tool.
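
For reference, each gateway tool in `agent_runtime.py` wraps its arguments in a JSON-RPC 2.0 `tools/call` request and checks the reply for an `error` key. The following is a minimal sketch of that envelope factored into two helpers. `build_tool_call` and `extract_result` are hypothetical names, and the incrementing request ID is a design choice of this sketch (the agent code uses a fixed `id` of 1).

```python
from itertools import count

_request_ids = count(1)  # gives each request in this process a distinct id


def build_tool_call(tool_name, arguments):
    """Return the JSON-RPC 2.0 body for a tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def extract_result(response_body):
    """Return the result payload, or raise if the gateway reported an error."""
    if "error" in response_body:
        raise RuntimeError(f"Gateway error: {response_body['error']}")
    return response_body.get("result", {})
```

Sharing one builder across the three tools would keep the envelope consistent and leave each tool responsible only for its name and arguments.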

In [None]:
from utils.identity_ssm_utils import reauthenticate_user

# Get fresh bearer token
bearer_token = reauthenticate_user(
    cognito_config.get("client_id"),
    cognito_config.get("client_secret")
)

print("\n" + "="*80)
print("TEST 2: Create Refund Request (Gateway Tool)")
print("="*80)

# Test 2: Create refund request
query2 = "Create a refund request for order ORD-12345 with amount $49.99 because the product arrived damaged. Use user_id user456"
print(f"\nQuery: {query2}\n")

try:
    response2 = agentcore_runtime.invoke(
        {"prompt": query2},
        bearer_token=bearer_token
    )
    print(f"Response: {response2}\n")
except Exception as e:
    print(f"❌ Error: {e}\n")

print("="*80)

### Test 3: Gateway Tool - List Refund Requests

Test the agent's ability to list refund requests using the gateway tool.

In [None]:
from utils.identity_ssm_utils import reauthenticate_user

# Get fresh bearer token
bearer_token = reauthenticate_user(
    cognito_config.get("client_id"),
    cognito_config.get("client_secret")
)

print("\n" + "="*80)
print("TEST 3: List Refund Requests (Gateway Tool)")
print("="*80)

# Test 3: List refund requests
query3 = "List all refund requests for user_id user456"
print(f"\nQuery: {query3}\n")

try:
    response3 = agentcore_runtime.invoke(
        {"prompt": query3},
        bearer_token=bearer_token
    )
    print(f"Response: {response3}\n")
except Exception as e:
    print(f"❌ Error: {e}\n")

print("="*80)

### Summary

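You've deployed your agent to AgentCore Runtime by adding just four lines of code to it. The agent now runs in a managed environment that scales automatically, authenticates callers through Cognito, and uses the knowledge base, gateway tools, and memory configured in the earlier notebooks.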

### Next Steps

- **6: AgentCore Observability** - Monitor and trace your agent