# Deploying Strands Agents to Amazon Bedrock AgentCore

This notebook demonstrates how to deploy a Strands Agents agent to Amazon Bedrock AgentCore using direct boto3 API calls. It also shows how to visualize the agent's decision-making process in the GenAI Observability dashboard in Amazon CloudWatch.

If you're not familiar with the Strands Agents SDK, check out the [official documentation](https://strandsagents.com).

For the Amazon Bedrock AgentCore Developer Guide, check out the [AWS documentation](https://docs.aws.amazon.com/bedrock-agentcore/).

## Prerequisites

- Python 3.12 or later
- boto3 1.39 or later
- AWS CLI installed and configured with appropriate permissions

In addition, have the following available:
- An Amazon S3 bucket for packaging code
- An AWS IAM role for automation access
- Fully synced knowledge bases
- External functions for enterprise systems

## Step 0: Install required packages

Make sure the latest versions of the required packages are installed.

In [None]:
# Upgrade boto3
!pip install boto3 botocore --upgrade
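
After the upgrade, it can help to confirm that the installed boto3 meets the 1.39 minimum listed in the prerequisites. The following is a minimal sketch of such a check; the helper names `parse_version` and `meets_minimum` are hypothetical and not part of boto3, and the version strings are illustrative.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.39.4' into (1, 39, 4), dropping non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# Illustrative version strings; in the notebook the first argument
# could be boto3.__version__ once boto3 has been imported.
print(meets_minimum("1.39.4", "1.39"))
print(meets_minimum("1.38.46", "1.39"))
```

If the check returns `False`, restarting the kernel after the upgrade is usually the first thing to try, since an already-imported older boto3 stays loaded.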

## Step 1: Import required libraries

Import all necessary Python libraries for AWS interactions and file handling.

In [None]:
import os
import re
import sys
import json
import time
import uuid
import string
import random
import zipfile
import tempfile
import collections
import boto3
import botocore
from pathlib import Path

## Step 2: Set required parameters

Update the parameters below with your environment-specific details. These parameters are used throughout the notebook to create and configure the agent.

In [None]:
Knowledge_Base_1_Id = '<provide 1st Knowledge Base Id>'
Knowledge_Base_2_Id = '<provide 2nd Knowledge Base Id>'
System_Function_1_Name = '<provide 1st System Function Name>'
System_Function_2_Name = '<provide 2nd System Function Name>'
Agent_Directory_Name = '<provide agent directory name based on event use case>'
CodeBucketForAutomationName = '<provide CodeBucketForAutomationName - not full ARN>'
SolutionAccessRoleArn = '<provide SolutionAccessRoleArn>'
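
Because every parameter starts out as a `<provide ...>` placeholder, a forgotten value only surfaces later as an AWS error. One possible guard, sketched below, lists the parameters that are empty or still look like placeholders. The function name `find_unset_parameters` and the sample values are hypothetical.

```python
def find_unset_parameters(params: dict[str, str]) -> list[str]:
    """Return names whose value is empty or still an angle-bracket placeholder."""
    unset = []
    for name, value in params.items():
        text = value.strip()
        if not text or (text.startswith("<") and text.endswith(">")):
            unset.append(name)
    return unset


# Illustrative input: one placeholder left in place, one value filled in.
example = {
    "Knowledge_Base_1_Id": "<provide 1st Knowledge Base Id>",
    "Agent_Directory_Name": "my_agent",  # hypothetical value
}
print(find_unset_parameters(example))
```

In the notebook, the dictionary would be built from the seven variables above, and a non-empty result would be a reason to stop before Step 3.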

## Step 3: Verify agent code requirements

We'll check that necessary files for the agent exist in the expected locations before proceeding with the deployment.

In [None]:
# Verify AgentCore entrypoint exists
agentcore_entrypoint_file = Path(f"{Agent_Directory_Name}/agentcore_entrypoint.py")
if agentcore_entrypoint_file.exists():
    print(f"AgentCore entrypoint found at {agentcore_entrypoint_file}")
else:
    print(f"AgentCore entrypoint not found at {agentcore_entrypoint_file}")

# Verify requirements file exists
requirements_file = Path(f"{Agent_Directory_Name}/requirements.txt")
if requirements_file.exists():
    print(f"Requirements file found at {requirements_file}")
else:
    print(f"Requirements file not found at {requirements_file}")
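
The checks above only print a message, so a missing file does not stop the notebook. A stricter variant, sketched here under the assumption that both files are mandatory, collects the missing names so the caller can raise an error. `missing_agent_files` is a hypothetical helper, and the demonstration runs against a temporary directory rather than a real agent directory.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("agentcore_entrypoint.py", "requirements.txt")


def missing_agent_files(agent_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from agent_dir."""
    base = Path(agent_dir)
    return [name for name in required if not (base / name).is_file()]


# Demonstration against a throwaway directory that holds only one of the two files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "requirements.txt").write_text("boto3\n")
    missing = missing_agent_files(tmp)

print(missing)
```

In the notebook this could be followed by `if missing: raise FileNotFoundError(missing)` so that the deployment stops before the image build.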

## Step 4: Create Dockerfile for Agent

We need to create a Dockerfile to package the agent for deployment to AgentCore Runtime. This Dockerfile will include all necessary dependencies and configuration.

In [None]:
def create_dockerfile():
    '''Create a Dockerfile for AgentCore Runtime'''
    dockerfile_content = f'''
FROM --platform=linux/arm64 public.ecr.aws/docker/library/python:3.12-slim-bookworm

WORKDIR /app

# Copy requirements and install dependencies
COPY {Agent_Directory_Name}/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install bedrock-agentcore
RUN pip install aws-opentelemetry-distro

# Copy agent code and tools
COPY {Agent_Directory_Name}/*.py ./

# Set default AWS region
ENV AWS_DEFAULT_REGION=${{AWS_DEFAULT_REGION}}

# Set agent resources variables
ENV KNOWLEDGE_BASE_1_ID=${{KNOWLEDGE_BASE_1_ID}}
ENV KNOWLEDGE_BASE_2_ID=${{KNOWLEDGE_BASE_2_ID}}
ENV SYSTEM_FUNCTION_1_NAME=${{SYSTEM_FUNCTION_1_NAME}}
ENV SYSTEM_FUNCTION_2_NAME=${{SYSTEM_FUNCTION_2_NAME}}

# OpenTelemetry Configuration for AWS CloudWatch GenAI Observability
ENV OTEL_PYTHON_DISTRO=aws_distro
ENV OTEL_PYTHON_CONFIGURATOR=aws_configurator
ENV OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
ENV OTEL_TRACES_EXPORTER=otlp
ENV OTEL_EXPORTER_OTLP_LOGS_HEADERS=x-aws-log-group=agents/strands-agent-logs,x-aws-log-stream=default,x-aws-metric-namespace=agents
ENV OTEL_RESOURCE_ATTRIBUTES=service.name=strands-agent
ENV AGENT_OBSERVABILITY_ENABLED=true

# Expose the port that AgentCore Runtime expects
EXPOSE 8080

# Run the agent
CMD ["opentelemetry-instrument", "python", "agentcore_entrypoint.py"]
'''
    
    # Write the Dockerfile
    dockerfile_path = Path("Dockerfile")
    with open(dockerfile_path, 'w') as f:
        f.write(dockerfile_content)
    
    return dockerfile_path

# Create the Dockerfile
dockerfile_path = create_dockerfile()
print(f"Dockerfile created at {dockerfile_path}")
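
Because the Dockerfile text is built with an f-string, single braces interpolate Python values and doubled braces emit literal braces. The short sketch below, which uses a hypothetical directory name `my_agent`, shows what ends up in the file for one `COPY` line and one `ENV` line.

```python
agent_dir = "my_agent"  # hypothetical directory name

# {agent_dir} is replaced by Python; ${{...}} becomes a literal ${...}.
snippet = f"""COPY {agent_dir}/requirements.txt .
ENV KNOWLEDGE_BASE_1_ID=${{KNOWLEDGE_BASE_1_ID}}
"""
print(snippet)
```

The generated Dockerfile therefore contains `${KNOWLEDGE_BASE_1_ID}` as a Dockerfile variable reference, not the value of the notebook's Python variable. The actual IDs are passed to the runtime through `environmentVariables` in Step 6.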

## Step 5: Build Docker image using SageMaker Docker Build CLI

The helper functions below follow the approach of the SageMaker Docker Build CLI: they zip the working directory, upload it to Amazon S3, build the arm64 Docker image in the background with AWS CodeBuild, and push the image to Amazon ECR.

In [None]:
# Define required functions for building arm64 container image

Position = collections.namedtuple("Position", ["timestamp", "skip"])

def _log_stream(client, log_group, stream_name, position):
    start_time, skip = position
    next_token = None
    event_count = 1
    while event_count > 0:
        token_arg = {"nextToken": next_token} if next_token else {}
        response = client.get_log_events(
            logGroupName=log_group, logStreamName=stream_name,
            startTime=start_time, startFromHead=True, **token_arg
        )
        next_token = response["nextForwardToken"]
        events = response["events"]
        event_count = len(events)
        if event_count > skip:
            events = events[skip:]
            skip = 0
        else:
            skip = skip - event_count
            events = []
        for ev in events:
            ts, count = position
            if ev["timestamp"] == ts:
                position = Position(timestamp=ts, skip=count + 1)
            else:
                position = Position(timestamp=ev["timestamp"], skip=1)
            yield ev, position

def _logs_for_build(build_id, session, wait=False, poll=10):
    codebuild = session.client("codebuild")
    description = codebuild.batch_get_builds(ids=[build_id])["builds"][0]
    status = description["buildStatus"]
    log_group = description["logs"].get("groupName")
    stream_name = description["logs"].get("streamName")
    position = Position(timestamp=0, skip=0)
    config = botocore.config.Config(retries={"max_attempts": 15})
    client = session.client("logs", config=config)
    
    while log_group is None and status == "IN_PROGRESS":
        time.sleep(poll)
        description = codebuild.batch_get_builds(ids=[build_id])["builds"][0]
        log_group = description["logs"].get("groupName")
        stream_name = description["logs"].get("streamName")
        status = description["buildStatus"]
    
    last_describe_job_call = time.time()
    dot_printed = False
    dot = True
    
    while True:
        for event, position in _log_stream(client, log_group, stream_name, position):
            print(event["message"].rstrip())
            if dot:
                dot = False
                if dot_printed:
                    print()
        
        if not wait or status != "IN_PROGRESS":
            break
            
        time.sleep(poll)
        if dot:
            print(".", end="")
            sys.stdout.flush()
            dot_printed = True
            
        if time.time() - last_describe_job_call >= 30:
            description = codebuild.batch_get_builds(ids=[build_id])["builds"][0]
            status = description["buildStatus"]
            last_describe_job_call = time.time()
            if status != "IN_PROGRESS":
                print()
                break

def build_arm64_image(role, bucket, repository_name, verbose=False):
    session = boto3.Session()
    account_id = session.client("sts").get_caller_identity()["Account"]
    region = session.region_name
    
    # Upload source code
    random_suffix = "".join(random.choices(string.ascii_letters, k=16))
    key = f"codebuild-{random_suffix}.zip"
    
    with tempfile.TemporaryFile() as tmp:
        with zipfile.ZipFile(tmp, "w") as zip:
            for dirname, _, filelist in os.walk("."):
                for file in filelist:
                    zip.write(f"{dirname}/{file}")
            # Add buildspec for ARM64
            buildspec = """version: 0.2
phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin 763104351884.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin 217643126080.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin 727897471807.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin 626614931356.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin 683313688378.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin 520713654638.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin 462105765813.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG .
      - docker tag $IMAGE_REPO_NAME:$IMAGE_TAG $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker image...
      - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/$IMAGE_REPO_NAME:$IMAGE_TAG"""
            zip.writestr("buildspec.yml", buildspec)
        tmp.seek(0)
        session.client("s3").upload_fileobj(tmp, bucket, key)
    
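    # Create the ECR repository if it does not already exist
    ecr = session.client("ecr")
    try:
        ecr.create_repository(repositoryName=repository_name)
        print(f"Created ECR repository {repository_name}")
    except ecr.exceptions.RepositoryAlreadyExistsException:
        print(f"Using existing ECR repository {repository_name}")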
    
    # Create and run CodeBuild project
    project_name = f"build-{random_suffix}"
    codebuild = session.client("codebuild")
    
    codebuild.create_project(
        name=project_name,
        source={"type": "S3", "location": f"{bucket}/{key}"},
        artifacts={"type": "NO_ARTIFACTS"},
        environment={
            "type": "ARM_CONTAINER",
            "image": "aws/codebuild/amazonlinux2-aarch64-standard:3.0",
            "computeType": "BUILD_GENERAL1_SMALL",
            "environmentVariables": [
                {"name": "AWS_DEFAULT_REGION", "value": region},
                {"name": "AWS_ACCOUNT_ID", "value": account_id},
                {"name": "IMAGE_REPO_NAME", "value": repository_name},
                {"name": "IMAGE_TAG", "value": "latest"},
            ],
            "privilegedMode": True,
        },
        serviceRole=f"arn:aws:iam::{account_id}:role/{role}",
    )
    
    build_id = codebuild.start_build(projectName=project_name)["build"]["id"]
    print(f"Starting build {build_id} with verbose={verbose}")
    
    if verbose:
        # Stream logs using library implementation
        _logs_for_build(build_id, session, wait=True)
    else:
        # Just wait for completion without streaming logs
        while True:
            build_info = codebuild.batch_get_builds(ids=[build_id])["builds"][0]
            status = build_info["buildStatus"]
            if status != "IN_PROGRESS":
                break
            print(".", end="", flush=True)
            time.sleep(30)
        print()

    # Get final status
    build_info = codebuild.batch_get_builds(ids=[build_id])["builds"][0]
    status = build_info["buildStatus"]
    print(f"Build complete, status = {status}")
    
    # Cleanup
    codebuild.delete_project(name=project_name)
    session.client("s3").delete_object(Bucket=bucket, Key=key)
    
    if status == "SUCCEEDED":
        ecr_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository_name}:latest"
        print(f"Image URI: {ecr_uri}")
        return ecr_uri
    else:
        raise Exception(f"Build failed with status: {status}")
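
`build_arm64_image` zips everything under the current directory, which can include version-control metadata and notebook checkpoints. The sketch below shows one way to prune such directories while walking the tree. The exclusion set and the helper name `zip_directory` are assumptions, not part of the original function.

```python
import io
import os
import tempfile
import zipfile
from pathlib import Path

# Assumed set of directories that are not needed for the image build.
EXCLUDED_DIRS = {".git", ".ipynb_checkpoints", "__pycache__"}


def zip_directory(root, excluded_dirs=EXCLUDED_DIRS) -> bytes:
    """Zip the files under root, skipping excluded directories, and return the bytes."""
    root = Path(root)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirname, dirnames, filenames in os.walk(root):
            # Prune in place so os.walk does not descend into excluded directories.
            dirnames[:] = [d for d in dirnames if d not in excluded_dirs]
            for filename in filenames:
                path = Path(dirname) / filename
                archive.write(path, arcname=path.relative_to(root).as_posix())
    return buffer.getvalue()


# Demonstration on a temporary tree with one real file and a .git directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Dockerfile").write_text("FROM scratch\n")
    (Path(tmp) / ".git").mkdir()
    (Path(tmp) / ".git" / "config").write_text("[core]\n")
    data = zip_directory(tmp)

names = zipfile.ZipFile(io.BytesIO(data)).namelist()
print(names)
```

A smaller archive uploads faster and keeps unrelated files out of the CodeBuild source bundle.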

In [None]:
# Kick start building arm64 container image
role_name = SolutionAccessRoleArn.split('/')[-1]
repository_name = "strands-agent-repo"
ecr_uri = build_arm64_image(role_name, CodeBucketForAutomationName, repository_name)

# Verify the image exists in ECR
try:
    ecr_client = boto3.client('ecr')
    response = ecr_client.describe_images(
        repositoryName=repository_name,
        imageIds=[{'imageTag': 'latest'}]
    )
    print(f"Image verified in repository: {repository_name}")
except Exception as e:
    print(f"Error verifying image: {str(e)}")
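
The cell above derives the role name by splitting the ARN on `/`, which silently accepts a value that is still a placeholder. A slightly more defensive sketch is shown below. `role_name_from_arn` is a hypothetical helper, and the account ID and role name are made up.

```python
def role_name_from_arn(role_arn: str) -> str:
    """Return the role name (last path segment) from an IAM role ARN."""
    prefix, sep, resource = role_arn.partition(":role/")
    if not sep or not prefix.startswith("arn:") or not resource:
        raise ValueError(f"Not an IAM role ARN: {role_arn!r}")
    return resource.split("/")[-1]


# Made-up ARN for illustration.
print(role_name_from_arn("arn:aws:iam::123456789012:role/SolutionAccessRole"))
```

Note that `build_arm64_image` rebuilds the ARN as `role/{role}`, so a role created under an IAM path would lose that path. Passing the full ARN through instead may be safer in that situation.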

## Step 6: Deploy AgentCore Runtime

Create the AgentCore Runtime using boto3 APIs with the Docker image we built in the previous steps.

In [None]:
# Create or Update AgentCore Runtime
existing_runtime = None
region = boto3.Session().region_name
agent_runtime_name = "StrandsAgentCoreRuntime"
agentcore_control_client = boto3.client('bedrock-agentcore-control')

# Try to get existing agent runtime first
list_response = agentcore_control_client.list_agent_runtimes()
for runtime in list_response.get('agentRuntimes', []):
    if runtime['agentRuntimeName'] == agent_runtime_name:
        existing_runtime = runtime
        agent_runtime_id = existing_runtime['agentRuntimeId']
        agent_runtime_arn = existing_runtime['agentRuntimeArn']
        print(f"Found existing AgentCore Runtime ID: {agent_runtime_id}")

if existing_runtime:  # Update the existing runtime
    update_response = agentcore_control_client.update_agent_runtime(
        agentRuntimeId=agent_runtime_id,
        roleArn=SolutionAccessRoleArn,
        agentRuntimeArtifact={
            "containerConfiguration": {
                "containerUri": ecr_uri
            }
        },            
        networkConfiguration={
            "networkMode": "PUBLIC"
        },
        environmentVariables={
            "AWS_DEFAULT_REGION": region,
            "KNOWLEDGE_BASE_1_ID": Knowledge_Base_1_Id,
            "KNOWLEDGE_BASE_2_ID": Knowledge_Base_2_Id,
            "SYSTEM_FUNCTION_1_NAME": System_Function_1_Name,
            "SYSTEM_FUNCTION_2_NAME": System_Function_2_Name
        }
    )
    print(f"Updated existing AgentCore Runtime ID: {agent_runtime_id}")
else: # Create new runtime
    create_response = agentcore_control_client.create_agent_runtime(
        agentRuntimeName=agent_runtime_name,
        roleArn=SolutionAccessRoleArn,
        agentRuntimeArtifact={
            "containerConfiguration": {
                "containerUri": ecr_uri
            }
        },
        networkConfiguration={
            "networkMode": "PUBLIC"
        },
        environmentVariables={
            "AWS_DEFAULT_REGION": region,
            "KNOWLEDGE_BASE_1_ID": Knowledge_Base_1_Id,
            "KNOWLEDGE_BASE_2_ID": Knowledge_Base_2_Id,
            "SYSTEM_FUNCTION_1_NAME": System_Function_1_Name,
            "SYSTEM_FUNCTION_2_NAME": System_Function_2_Name
        }
    )
    agent_runtime_id = create_response['agentRuntimeId']
    agent_runtime_arn = create_response['agentRuntimeArn']
    print(f"Created new AgentCore Runtime ID: {agent_runtime_id}")
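
The update and create branches pass the same role, container, network, and environment settings. One way to keep them in sync, sketched below, is to build the shared keyword arguments once and unpack them into whichever call applies. `build_runtime_config` is a hypothetical helper, and the ARN, image URI, and Region are illustrative values.

```python
def build_runtime_config(role_arn, container_uri, environment):
    """Return the keyword arguments shared by the create and update calls."""
    return {
        "roleArn": role_arn,
        "agentRuntimeArtifact": {
            "containerConfiguration": {"containerUri": container_uri}
        },
        "networkConfiguration": {"networkMode": "PUBLIC"},
        "environmentVariables": dict(environment),
    }


# Illustrative values only.
config = build_runtime_config(
    "arn:aws:iam::123456789012:role/ExampleRole",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/strands-agent-repo:latest",
    {"AWS_DEFAULT_REGION": "us-east-1"},
)
print(sorted(config))
```

In the notebook the result would be used as `update_agent_runtime(agentRuntimeId=agent_runtime_id, **config)` or `create_agent_runtime(agentRuntimeName=agent_runtime_name, **config)`, matching the parameters already shown above.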

## Step 7: Check AgentCore Runtime Status

Poll the status of the AgentCore Runtime until it is ready for use or reaches a failed state.

In [None]:
def check_runtime_status(agent_runtime_id):
    """Check the status of the AgentCore Runtime"""
    response = agentcore_control_client.get_agent_runtime(
        agentRuntimeId=agent_runtime_id
    )
    return response['status']

# Wait for the runtime to be ready
print("Waiting for AgentCore Runtime to be ready...")
runtime_status = check_runtime_status(agent_runtime_id)
while runtime_status not in ['READY', 'CREATE_FAILED', 'DELETE_FAILED', 'UPDATE_FAILED']:
    print(f"Runtime status: {runtime_status}")
    time.sleep(10)
    runtime_status = check_runtime_status(agent_runtime_id)
print(f"Runtime status: {runtime_status}")
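
The loop above waits indefinitely if the runtime never leaves a transitional state. A bounded variant could look like the sketch below, which takes the status lookup as a callable so it can be tried without AWS access. The function name, the default timeout, and the simulated status sequence are assumptions.

```python
import time

TERMINAL_STATUSES = {"READY", "CREATE_FAILED", "DELETE_FAILED", "UPDATE_FAILED"}


def wait_for_terminal_status(get_status, timeout=600.0, interval=10.0,
                             sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until a terminal status appears or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Runtime still {status} after {timeout} seconds")
        sleep(interval)


# Simulated status sequence; sleeping is disabled so the demo finishes immediately.
statuses = iter(["CREATING", "CREATING", "READY"])
result = wait_for_terminal_status(lambda: next(statuses), sleep=lambda _: None)
print(result)
```

In the notebook, `get_status` would be `lambda: check_runtime_status(agent_runtime_id)`, and any returned status other than `READY` would signal a failed deployment.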

## Step 8: Test Agent Runtime Deployment

Send a test prompt to the AgentCore Runtime to verify that the agent is live.

In [None]:
# Create a client for the AgentCore data plane
agentcore_client = boto3.client('bedrock-agentcore')

# Test the AgentCore Runtime with a sample query
try:
    invoke_response = agentcore_client.invoke_agent_runtime(
        agentRuntimeArn=agent_runtime_arn,
        qualifier="DEFAULT",
        traceId=str(uuid.uuid4()),
        contentType="application/json",
        payload=json.dumps({"prompt": "A new user is asking about the price of Doggy Delights"})
    )
    
    # Process the response
    if "text/event-stream" in invoke_response.get("contentType", ""):
        content = []
        for line in invoke_response["response"].iter_lines(chunk_size=1):
            if line:
                line = line.decode("utf-8")
                if line.startswith("data: "):
                    line = line[6:]
                    content.append(line)
        response_text = "\n".join(content)
    else:
        events = []
        for event in invoke_response.get("response", []):
            events.append(event)
        
        # Combine all events to fix truncation
        combined_content = ""
        for event in events:
            combined_content += event.decode("utf-8")
        
        response_text = json.loads(combined_content)
    print("Agent Response:")
    print(response_text)
except Exception as e:
    print(f"Error invoking AgentCore Runtime: {str(e)}")
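
The streaming branch above keeps only lines that start with `data: ` and joins their payloads. The same logic is isolated below as a small function that can be exercised on sample bytes without calling the service. `parse_sse_lines` is a hypothetical name and the sample lines are made up.

```python
def parse_sse_lines(lines):
    """Join the payloads of 'data: ' lines from an iterable of raw byte lines."""
    chunks = []
    for raw in lines:
        if not raw:
            continue  # Blank lines separate events and carry no payload.
        line = raw.decode("utf-8")
        if line.startswith("data: "):
            chunks.append(line[len("data: "):])
    return "\n".join(chunks)


# Made-up event stream with a blank separator and a comment line.
sample = [b"data: Hello", b"", b": keep-alive", b"data: world"]
print(parse_sse_lines(sample))
```

With the real response, the input would be `invoke_response["response"].iter_lines()`, as in the cell above.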

💡 Pro Tip: This notebook enables monitoring and tracing capabilities using AWS OpenTelemetry Python Distro. To visualize the agent's decision-making process and gain insights into its performance, go to the GenAI Observability dashboard in Amazon CloudWatch. Click through the various features of the GenAI observability dashboard to get more detailed information on traces.

## Step 9: Cleanup Resources (Optional)

To avoid incurring unnecessary charges, clean up the AWS resources created in this notebook by uncommenting the last line of the cell below.

In [None]:
def cleanup_resources():
    '''Clean up AWS resources created in this notebook.'''  
    # Delete the AgentCore Runtime
    try:
        agentcore_control_client.delete_agent_runtime(
            agentRuntimeId=agent_runtime_id
        )
        print(f"Initiated deletion of AgentCore Runtime: {agent_runtime_id}")
    except Exception as e:
        print(f"Error deleting AgentCore Runtime {agent_runtime_id}: {e}")
    
    # Delete the ECR repository
    try:
        ecr_client = boto3.client('ecr')
        ecr_client.delete_repository(
            repositoryName=repository_name,
            force=True  # Force deletion even if it contains images
        )
        print(f"Deleted ECR repository: {repository_name}")
    except Exception as e:
        print(f"Error deleting ECR repository {repository_name}: {e}")
    
    # Remove the Dockerfile
    try:
        if os.path.exists('Dockerfile'):
            os.remove('Dockerfile')
            print("Deleted Dockerfile")
    except Exception as e:
        print(f"Error deleting Dockerfile: {e}")

# Uncomment the line below to clean up resources
# cleanup_resources()

## Summary

In this notebook, we demonstrated how to:

1. Package the agent code into a Docker container
2. Deploy the agent to AgentCore Runtime using direct boto3 API calls
3. Test the deployed agent through the AgentCore Runtime endpoint