# Prerequisites: Creating Sample Agents

## Overview

Before running any evaluations, we need agents to evaluate. This tutorial creates two sample agents, each built with a different framework:
- [Strands Agents SDK](https://strandsagents.com/)
- [LangGraph](https://www.langchain.com/langgraph)

Both agents use Anthropic Claude Haiku 4.5 on Amazon Bedrock as the LLM, although you can substitute any model you prefer. They have identical capabilities:
- **Math Tool**: Tool to perform basic math operations
- **Weather Tool**: Dummy implementation for weather tool
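
To make the two capabilities concrete, here is a minimal sketch of what such tools might look like as plain Python functions. The names `calculate` and `get_weather`, their signatures, and the canned weather reply are assumptions for illustration only; the actual tool definitions live in the agent entrypoint files and may differ.

```python
import operator

# Hypothetical tool implementations; the real agents may define these differently.
_OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}


def calculate(operation: str, a: float, b: float) -> float:
    """Perform a basic math operation on two numbers."""
    if operation not in _OPERATIONS:
        raise ValueError(f"Unsupported operation: {operation}")
    if operation == "divide" and b == 0:
        raise ValueError("Cannot divide by zero")
    return _OPERATIONS[operation](a, b)


def get_weather() -> str:
    """Dummy weather tool: always returns the same canned answer."""
    return "sunny"
```

Because the weather tool is a stub, its answer never changes, which makes it convenient for checking whether an agent called the tool at all.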


The architecture looks as follows:

![Architecture](../images/agent_architecture.png)

## Prerequisites
- Python 3.10+
- AWS credentials
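
Both prerequisites can be checked up front. The sketch below is one possible way to do so; the helper names `meets_minimum_python` and `has_aws_credentials` are hypothetical, and the credentials check assumes `boto3` is already installed.

```python
import sys


def meets_minimum_python(version=None, minimum=(3, 10)) -> bool:
    """Return True if the given (or current) Python version is at least `minimum`."""
    current = tuple(version or sys.version_info)[:2]
    return current >= minimum


def has_aws_credentials() -> bool:
    """Return True if boto3 can resolve credentials from its default chain."""
    # Imported lazily so the version check works even before boto3 is installed.
    from boto3.session import Session

    return Session().get_credentials() is not None
```

Calling `meets_minimum_python()` with no arguments checks the interpreter running the notebook.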

In [1]:
!pip install -r ../requirements.txt




## Setup

Import required packages and configure AWS session:

In [2]:
from bedrock_agentcore_starter_toolkit import Runtime
from boto3.session import Session
import uuid
import time

boto_session = Session()
region = boto_session.region_name
print(f"Using region: {region}")

Using region: us-west-2
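
Note that `Session().region_name` is `None` when no default region is configured, and the deployment steps need a real region. A small guard such as the following sketch fails early with a readable message; the helper name `require_region` and its `fallback` parameter are assumptions, not part of the toolkit.

```python
from typing import Optional


def require_region(session_region: Optional[str], fallback: Optional[str] = None) -> str:
    """Return a usable region name or fail early with a clear message."""
    region = session_region or fallback
    if not region:
        raise RuntimeError(
            "No AWS region configured. Set AWS_DEFAULT_REGION or run `aws configure`."
        )
    return region
```

In this notebook it could be used as `region = require_region(boto_session.region_name)`.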


## Deploy Strands Agent
Let's deploy the Strands agent to AgentCore Runtime.

In [3]:
agentcore_runtime = Runtime()

agent_name = "ac_eval_strands2"
response = agentcore_runtime.configure(
    entrypoint="eval_agent_strands.py",
    auto_create_execution_role=True,
    auto_create_ecr=True,
    requirements_file="requirements_strands.txt",
    region=region,
    agent_name=agent_name,
    idle_timeout=120
)
launch_result_strands = agentcore_runtime.launch()
print(f"Strands agent deployment started: {launch_result_strands}")

Entrypoint parsed: file=/home/mom/Desktop/02072026eval/amazon-bedrock-agentcore-samples/01-tutorials/07-AgentCore-evaluations/00-prereqs/eval_agent_strands.py, bedrock_agentcore_name=eval_agent_strands
Memory disabled - agent will be stateless
Configuring BedrockAgentCore agent: ac_eval_strands2
Memory disabled
Network mode: PUBLIC


Generated Dockerfile: Dockerfile
Generated .dockerignore: /home/mom/Desktop/02072026eval/amazon-bedrock-agentcore-samples/01-tutorials/07-AgentCore-evaluations/00-prereqs/.dockerignore
Setting 'ac_eval_strands2' as default agent
Bedrock AgentCore configured: /home/mom/Desktop/02072026eval/amazon-bedrock-agentcore-samples/01-tutorials/07-AgentCore-evaluations/00-prereqs/.bedrock_agentcore.yaml
🚀 Launching Bedrock AgentCore (cloud mode - RECOMMENDED)...
   • Deploy Python code directly to runtime
   • No Docker required (DEFAULT behavior)
   • Production-ready deployment

💡 Deployment options:
   • runtime.launch()                → Cloud (current)
   • runtime.launch(local=True)      → Local development
Memory disabled - skipping memory creation
Starting CodeBuild ARM64 deployment for agent 'ac_eval_strands2' to account 339712707840 (us-west-2)
Setting up AWS resources (ECR repository, execution roles)...
Getting or creating ECR repository for agent: ac_eval_strands2


✅ Reusing existing ECR repository: 339712707840.dkr.ecr.us-west-2.amazonaws.com/bedrock-agentcore-ac_eval_strands2


✅ Reusing existing execution role: arn:aws:iam::339712707840:role/AmazonBedrockAgentCoreSDKRuntime-us-west-2-a67917b625
Execution role available: arn:aws:iam::339712707840:role/AmazonBedrockAgentCoreSDKRuntime-us-west-2-a67917b625
Preparing CodeBuild project and uploading source...
Getting or creating CodeBuild execution role for agent: ac_eval_strands2
Role name: AmazonBedrockAgentCoreSDKCodeBuild-us-west-2-a67917b625
Reusing existing CodeBuild execution role: arn:aws:iam::339712707840:role/AmazonBedrockAgentCoreSDKCodeBuild-us-west-2-a67917b625
Using dockerignore.template with 46 patterns for zip filtering
Uploaded source to S3: ac_eval_strands2/source.zip
Updated CodeBuild project: bedrock-agentcore-ac_eval_strands2-builder
Starting CodeBuild build (this may take several minutes)...
Starting CodeBuild monitoring...
🔄 QUEUED started (total: 0s)
✅ QUEUED completed in 1.1s
🔄 PROVISIONING started (total: 1s)
✅ PROVISIONING completed in 7.9s
🔄 DOWNLOAD_SOURCE started (tota

Strands agent deployment started: mode='codebuild' tag='bedrock_agentcore-ac_eval_strands2:latest' env_vars=None port=None runtime=None ecr_uri='339712707840.dkr.ecr.us-west-2.amazonaws.com/bedrock-agentcore-ac_eval_strands2' agent_id='ac_eval_strands2-4U1R8Y4JF0' agent_arn='arn:aws:bedrock-agentcore:us-west-2:339712707840:runtime/ac_eval_strands2-4U1R8Y4JF0' codebuild_id='bedrock-agentcore-ac_eval_strands2-builder:d4d52566-eaff-48e5-bf29-cdc27415cf4c' build_output=None


### Check status for deployment on AgentCore Runtime
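Wait for the deployment to reach the READY status. The helper below polls every 10 seconds and also stops if the deployment ends in one of the failure statuses.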

In [4]:
def wait_for_deployment(runtime, name):
    """Poll the runtime every 10 seconds until it reaches a terminal status."""
    # READY means success; the *_FAILED statuses also end the wait.
    end_statuses = ['READY', 'CREATE_FAILED', 'DELETE_FAILED', 'UPDATE_FAILED']
    while True:
        status_response = runtime.status()
        status = status_response.endpoint['status']
        print(f"{name} status: {status}")
        if status in end_statuses:
            print(f"{name} deployment completed with status: {status}")
            return status
        # Not terminal yet: wait before polling again.
        time.sleep(10)

strands_status = wait_for_deployment(agentcore_runtime, "Strands")

Retrieved Bedrock AgentCore status for: ac_eval_strands2


Strands status: READY
Strands deployment completed with status: READY
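
The polling loop above runs until a terminal status appears, so a deployment stuck in a non-terminal state would block the notebook indefinitely. The following is a sketch of a variant with a deadline. The function name, the `timeout_s` and `poll_s` defaults, and the injectable `sleep` and `clock` parameters are assumptions added for illustration; it relies only on the same `runtime.status().endpoint['status']` access used above.

```python
import time


def wait_for_deployment_with_timeout(runtime, name, timeout_s=1800, poll_s=10,
                                     sleep=time.sleep, clock=time.monotonic):
    """Poll runtime.status() until a terminal status or until timeout_s elapses."""
    end_statuses = {"READY", "CREATE_FAILED", "DELETE_FAILED", "UPDATE_FAILED"}
    deadline = clock() + timeout_s
    while True:
        status = runtime.status().endpoint["status"]
        print(f"{name} status: {status}")
        if status in end_statuses:
            return status
        # Give up once the deadline has passed instead of looping forever.
        if clock() >= deadline:
            raise TimeoutError(f"{name} did not reach a terminal status in {timeout_s}s")
        sleep(poll_s)
```

Making `sleep` and `clock` parameters keeps the helper easy to exercise with a fake runtime, without waiting in real time.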


### Invoke the Strands Agent on Runtime
Let's test the Strands agent by invoking the AgentCore Runtime endpoint with a payload.

In [5]:
session_id_strands = str(uuid.uuid4())
print(f"Session ID: {session_id_strands}")

Session ID: f8da8ce8-b155-42a7-a967-2d91151ca9eb


In [6]:
invoke_response = agentcore_runtime.invoke(
    payload={"prompt": "How much is 2+2?"},
    session_id=session_id_strands
)
invoke_response

{'ResponseMetadata': {'RequestId': '249c7457-6465-4b18-a702-7ee64984bd05',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sat, 07 Feb 2026 18:32:06 GMT',
   'content-type': 'application/json',
   'transfer-encoding': 'chunked',
   'connection': 'keep-alive',
   'x-amzn-requestid': '249c7457-6465-4b18-a702-7ee64984bd05',
   'baggage': 'Self=1-69878523-64688c4409d0c9ad4982dbc1,session.id=f8da8ce8-b155-42a7-a967-2d91151ca9eb',
   'x-amzn-bedrock-agentcore-runtime-session-id': 'f8da8ce8-b155-42a7-a967-2d91151ca9eb',
   'x-amzn-trace-id': 'Root=1-69878523-57dadb192eead12847ce2db9;Parent=79073c66ba6d2d8e;Sampled=1;Self=1-69878523-64688c4409d0c9ad4982dbc1'},
  'RetryAttempts': 0},
 'runtimeSessionId': 'f8da8ce8-b155-42a7-a967-2d91151ca9eb',
 'traceId': 'Root=1-69878523-57dadb192eead12847ce2db9;Parent=79073c66ba6d2d8e;Sampled=1;Self=1-69878523-64688c4409d0c9ad4982dbc1',
 'baggage': 'Self=1-69878523-64688c4409d0c9ad4982dbc1,session.id=f8da8ce8-b155-42a7-a967-2d91151ca9eb',
 'contentType': 

In [7]:
invoke_response = agentcore_runtime.invoke(
    payload={"prompt": "How is the weather now?"},
    session_id=session_id_strands
)
invoke_response

{'ResponseMetadata': {'RequestId': '6771b1a5-6282-46cc-990e-0571f3370209',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sat, 07 Feb 2026 18:32:13 GMT',
   'content-type': 'application/json',
   'transfer-encoding': 'chunked',
   'connection': 'keep-alive',
   'x-amzn-requestid': '6771b1a5-6282-46cc-990e-0571f3370209',
   'baggage': 'Self=1-6987852b-5bfc4fd131556f4a0bdbc52d,session.id=f8da8ce8-b155-42a7-a967-2d91151ca9eb',
   'x-amzn-bedrock-agentcore-runtime-session-id': 'f8da8ce8-b155-42a7-a967-2d91151ca9eb',
   'x-amzn-trace-id': 'Root=1-6987852b-5b3f237707d669240f59bdc6;Parent=f386ec261c0d5780;Sampled=1;Self=1-6987852b-5bfc4fd131556f4a0bdbc52d'},
  'RetryAttempts': 0},
 'runtimeSessionId': 'f8da8ce8-b155-42a7-a967-2d91151ca9eb',
 'traceId': 'Root=1-6987852b-5b3f237707d669240f59bdc6;Parent=f386ec261c0d5780;Sampled=1;Self=1-6987852b-5bfc4fd131556f4a0bdbc52d',
 'baggage': 'Self=1-6987852b-5bfc4fd131556f4a0bdbc52d,session.id=f8da8ce8-b155-42a7-a967-2d91151ca9eb',
 'contentType': 

In [8]:
invoke_response = agentcore_runtime.invoke(
    payload={"prompt": "Can you tell me the capital of the US?"},
    session_id=session_id_strands
)
invoke_response

{'ResponseMetadata': {'RequestId': '2e59fead-ea54-45a6-84bf-91ead694c042',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sat, 07 Feb 2026 18:32:20 GMT',
   'content-type': 'application/json',
   'transfer-encoding': 'chunked',
   'connection': 'keep-alive',
   'x-amzn-requestid': '2e59fead-ea54-45a6-84bf-91ead694c042',
   'baggage': 'Self=1-69878532-115b58fa3aecd2ba043aa65f,session.id=f8da8ce8-b155-42a7-a967-2d91151ca9eb',
   'x-amzn-bedrock-agentcore-runtime-session-id': 'f8da8ce8-b155-42a7-a967-2d91151ca9eb',
   'x-amzn-trace-id': 'Root=1-69878532-7f9685bf469e1dd91e8da733;Parent=0ac1f63d97bdaaaa;Sampled=1;Self=1-69878532-115b58fa3aecd2ba043aa65f'},
  'RetryAttempts': 0},
 'runtimeSessionId': 'f8da8ce8-b155-42a7-a967-2d91151ca9eb',
 'traceId': 'Root=1-69878532-7f9685bf469e1dd91e8da733;Parent=0ac1f63d97bdaaaa;Sampled=1;Self=1-69878532-115b58fa3aecd2ba043aa65f',
 'baggage': 'Self=1-69878532-115b58fa3aecd2ba043aa65f,session.id=f8da8ce8-b155-42a7-a967-2d91151ca9eb',
 'contentType': 
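
The raw responses are verbose, and most of each one is HTTP metadata. As a sketch, a small helper can pull out the fields that matter for later tracing. It uses only keys visible in the output above (`ResponseMetadata`, `runtimeSessionId`, `traceId`); the helper name is hypothetical, and it deliberately ignores the part of the response that is cut off in the output shown here.

```python
def summarize_invoke_response(response: dict) -> dict:
    """Pick out the status code, request ID, session ID, and trace ID."""
    metadata = response.get("ResponseMetadata", {})
    return {
        "http_status": metadata.get("HTTPStatusCode"),
        "request_id": metadata.get("RequestId"),
        "session_id": response.get("runtimeSessionId"),
        "trace_id": response.get("traceId"),
    }
```

Calling `summarize_invoke_response(invoke_response)` after any of the cells above gives a compact record of which session and trace a given prompt produced.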

## Deploy LangGraph Agent to AgentCore Runtime

Let's also deploy our LangGraph agent to AgentCore Runtime.

In [9]:
langgraph_agentcore_runtime = Runtime()
agent_name = "ac_eval_langgraph2"
response = langgraph_agentcore_runtime.configure(
    entrypoint="eval_agent_langgraph.py",
    auto_create_execution_role=True,
    auto_create_ecr=True,
    requirements_file="requirements_langgraph.txt",
    region=region,
    agent_name=agent_name,
    idle_timeout=120
)
launch_result_langgraph = langgraph_agentcore_runtime.launch()
print(f"LangGraph agent deployment started: {launch_result_langgraph}")

Entrypoint parsed: file=/home/mom/Desktop/02072026eval/amazon-bedrock-agentcore-samples/01-tutorials/07-AgentCore-evaluations/00-prereqs/eval_agent_langgraph.py, bedrock_agentcore_name=eval_agent_langgraph
Memory disabled - agent will be stateless
Configuring BedrockAgentCore agent: ac_eval_langgraph2
Memory disabled
Network mode: PUBLIC


Generated Dockerfile: Dockerfile
Generated .dockerignore: /home/mom/Desktop/02072026eval/amazon-bedrock-agentcore-samples/01-tutorials/07-AgentCore-evaluations/00-prereqs/.dockerignore
Changing default agent from 'ac_eval_strands2' to 'ac_eval_langgraph2'
Bedrock AgentCore configured: /home/mom/Desktop/02072026eval/amazon-bedrock-agentcore-samples/01-tutorials/07-AgentCore-evaluations/00-prereqs/.bedrock_agentcore.yaml
🚀 Launching Bedrock AgentCore (cloud mode - RECOMMENDED)...
   • Deploy Python code directly to runtime
   • No Docker required (DEFAULT behavior)
   • Production-ready deployment

💡 Deployment options:
   • runtime.launch()                → Cloud (current)
   • runtime.launch(local=True)      → Local development
Memory disabled - skipping memory creation
Starting CodeBuild ARM64 deployment for agent 'ac_eval_langgraph2' to account 339712707840 (us-west-2)
Setting up AWS resources (ECR repository, execution roles)...
Getting or creating ECR repository

✅ Reusing existing ECR repository: 339712707840.dkr.ecr.us-west-2.amazonaws.com/bedrock-agentcore-ac_eval_langgraph2


✅ Reusing existing execution role: arn:aws:iam::339712707840:role/AmazonBedrockAgentCoreSDKRuntime-us-west-2-a18a56d054
Execution role available: arn:aws:iam::339712707840:role/AmazonBedrockAgentCoreSDKRuntime-us-west-2-a18a56d054
Preparing CodeBuild project and uploading source...
Getting or creating CodeBuild execution role for agent: ac_eval_langgraph2
Role name: AmazonBedrockAgentCoreSDKCodeBuild-us-west-2-a18a56d054
Reusing existing CodeBuild execution role: arn:aws:iam::339712707840:role/AmazonBedrockAgentCoreSDKCodeBuild-us-west-2-a18a56d054
Using dockerignore.template with 46 patterns for zip filtering
Uploaded source to S3: ac_eval_langgraph2/source.zip
Updated CodeBuild project: bedrock-agentcore-ac_eval_langgraph2-builder
Starting CodeBuild build (this may take several minutes)...
Starting CodeBuild monitoring...
🔄 QUEUED started (total: 0s)
✅ QUEUED completed in 1.1s
🔄 PROVISIONING started (total: 1s)
✅ PROVISIONING completed in 7.9s
🔄 DOWNLOAD_SOURCE started

LangGraph agent deployment started: mode='codebuild' tag='bedrock_agentcore-ac_eval_langgraph2:latest' env_vars=None port=None runtime=None ecr_uri='339712707840.dkr.ecr.us-west-2.amazonaws.com/bedrock-agentcore-ac_eval_langgraph2' agent_id='ac_eval_langgraph2-mZw4hcGHHr' agent_arn='arn:aws:bedrock-agentcore:us-west-2:339712707840:runtime/ac_eval_langgraph2-mZw4hcGHHr' codebuild_id='bedrock-agentcore-ac_eval_langgraph2-builder:08c0a806-b4b7-41da-8963-728c17d7ef10' build_output=None
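
The two `configure()` calls in this notebook differ only in the framework name embedded in the entrypoint, requirements file, and agent name. A sketch of a helper that builds those arguments from one place is shown below; the function name `build_configure_kwargs` is hypothetical, and the naming pattern is simply read off the two cells above.

```python
def build_configure_kwargs(framework: str, region: str, idle_timeout: int = 120) -> dict:
    """Build the keyword arguments passed to Runtime.configure() for one agent."""
    return {
        "entrypoint": f"eval_agent_{framework}.py",
        "auto_create_execution_role": True,
        "auto_create_ecr": True,
        "requirements_file": f"requirements_{framework}.txt",
        "region": region,
        "agent_name": f"ac_eval_{framework}2",
        "idle_timeout": idle_timeout,
    }
```

With this in place, each deployment reduces to `runtime.configure(**build_configure_kwargs("strands", region))` or the `"langgraph"` equivalent, which keeps the two agents configured identically.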


### Check the status of LangGraph agent
Now that we've deployed the LangGraph agent to AgentCore Runtime, let's check its deployment status.

In [10]:
langgraph_status = wait_for_deployment(langgraph_agentcore_runtime, "LangGraph")

Retrieved Bedrock AgentCore status for: ac_eval_langgraph2


LangGraph status: READY
LangGraph deployment completed with status: READY


### Invoke the LangGraph Agent on Runtime
Test the LangGraph agent endpoint on AgentCore Runtime with a payload:

In [11]:
session_id_langgraph = str(uuid.uuid4())
print(f"Session ID: {session_id_langgraph}")

Session ID: 57d90dac-4c01-4dce-a4c1-7dd167841dd5


In [12]:
response = langgraph_agentcore_runtime.invoke(
    payload={"prompt": "What is 2+2?"},
    session_id=session_id_langgraph
)
print(response)

{'ResponseMetadata': {'RequestId': '7bcb6d44-0afb-481a-9e67-12e1c0e30c9c', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sat, 07 Feb 2026 18:33:54 GMT', 'content-type': 'application/json', 'transfer-encoding': 'chunked', 'connection': 'keep-alive', 'x-amzn-requestid': '7bcb6d44-0afb-481a-9e67-12e1c0e30c9c', 'baggage': 'Self=1-6987858f-4494846843fd133a2a2ff84d,session.id=57d90dac-4c01-4dce-a4c1-7dd167841dd5', 'x-amzn-bedrock-agentcore-runtime-session-id': '57d90dac-4c01-4dce-a4c1-7dd167841dd5', 'x-amzn-trace-id': 'Root=1-6987858f-7cefddfb6dafca1550ea63c4;Parent=cc9215be0eda953d;Sampled=1;Self=1-6987858f-4494846843fd133a2a2ff84d'}, 'RetryAttempts': 0}, 'runtimeSessionId': '57d90dac-4c01-4dce-a4c1-7dd167841dd5', 'traceId': 'Root=1-6987858f-7cefddfb6dafca1550ea63c4;Parent=cc9215be0eda953d;Sampled=1;Self=1-6987858f-4494846843fd133a2a2ff84d', 'baggage': 'Self=1-6987858f-4494846843fd133a2a2ff84d,session.id=57d90dac-4c01-4dce-a4c1-7dd167841dd5', 'contentType': 'application/json', 'statusCode

In [13]:
invoke_response = langgraph_agentcore_runtime.invoke(
    payload={"prompt": "What is the weather now?"},
    session_id=session_id_langgraph
)
invoke_response

{'ResponseMetadata': {'RequestId': 'c60d8703-84bb-436d-aa39-1b6ba6b8a96a',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sat, 07 Feb 2026 18:34:01 GMT',
   'content-type': 'application/json',
   'transfer-encoding': 'chunked',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'c60d8703-84bb-436d-aa39-1b6ba6b8a96a',
   'baggage': 'Self=1-69878597-3706f08f1d0a0aa10f0bfc61,session.id=57d90dac-4c01-4dce-a4c1-7dd167841dd5',
   'x-amzn-bedrock-agentcore-runtime-session-id': '57d90dac-4c01-4dce-a4c1-7dd167841dd5',
   'x-amzn-trace-id': 'Root=1-69878597-411a465105c849cd05b8aa3a;Parent=c7bb57a373d8eac0;Sampled=1;Self=1-69878597-3706f08f1d0a0aa10f0bfc61'},
  'RetryAttempts': 0},
 'runtimeSessionId': '57d90dac-4c01-4dce-a4c1-7dd167841dd5',
 'traceId': 'Root=1-69878597-411a465105c849cd05b8aa3a;Parent=c7bb57a373d8eac0;Sampled=1;Self=1-69878597-3706f08f1d0a0aa10f0bfc61',
 'baggage': 'Self=1-69878597-3706f08f1d0a0aa10f0bfc61,session.id=57d90dac-4c01-4dce-a4c1-7dd167841dd5',
 'contentType': 

In [14]:
invoke_response = langgraph_agentcore_runtime.invoke(
    payload={"prompt": "Can you tell me the capital of the US?"},
    session_id=session_id_langgraph
)
invoke_response

{'ResponseMetadata': {'RequestId': 'd9c7be99-8634-4dfe-89c4-960f6a703b7d',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sat, 07 Feb 2026 18:34:09 GMT',
   'content-type': 'application/json',
   'transfer-encoding': 'chunked',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'd9c7be99-8634-4dfe-89c4-960f6a703b7d',
   'baggage': 'Self=1-698785a0-59361f236d55f498006e7d79,session.id=57d90dac-4c01-4dce-a4c1-7dd167841dd5',
   'x-amzn-bedrock-agentcore-runtime-session-id': '57d90dac-4c01-4dce-a4c1-7dd167841dd5',
   'x-amzn-trace-id': 'Root=1-698785a0-773d946a456a9d0838489e43;Parent=a693ac65d68ee722;Sampled=1;Self=1-698785a0-59361f236d55f498006e7d79'},
  'RetryAttempts': 0},
 'runtimeSessionId': '57d90dac-4c01-4dce-a4c1-7dd167841dd5',
 'traceId': 'Root=1-698785a0-773d946a456a9d0838489e43;Parent=a693ac65d68ee722;Sampled=1;Self=1-698785a0-59361f236d55f498006e7d79',
 'baggage': 'Self=1-698785a0-59361f236d55f498006e7d79,session.id=57d90dac-4c01-4dce-a4c1-7dd167841dd5',
 'contentType': 
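
The three invocations for each agent differ only in the prompt, so they can be expressed as a loop. The sketch below assumes only the `invoke(payload=..., session_id=...)` call already used in this notebook; the helper name `run_prompts` is hypothetical.

```python
def run_prompts(runtime, prompts, session_id):
    """Invoke the runtime once per prompt, reusing one session, and collect the responses."""
    results = []
    for prompt in prompts:
        response = runtime.invoke(payload={"prompt": prompt}, session_id=session_id)
        results.append((prompt, response))
    return results
```

Passing the runtime object and session ID together in one call makes it harder to pair one agent's runtime with the other agent's session ID by accident.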

In [15]:
print(f"Strands: {launch_result_strands}, Session: {session_id_strands}")
print(f"LangGraph: {launch_result_langgraph}, Session: {session_id_langgraph}")

Strands: mode='codebuild' tag='bedrock_agentcore-ac_eval_strands2:latest' env_vars=None port=None runtime=None ecr_uri='339712707840.dkr.ecr.us-west-2.amazonaws.com/bedrock-agentcore-ac_eval_strands2' agent_id='ac_eval_strands2-4U1R8Y4JF0' agent_arn='arn:aws:bedrock-agentcore:us-west-2:339712707840:runtime/ac_eval_strands2-4U1R8Y4JF0' codebuild_id='bedrock-agentcore-ac_eval_strands2-builder:d4d52566-eaff-48e5-bf29-cdc27415cf4c' build_output=None, Session: f8da8ce8-b155-42a7-a967-2d91151ca9eb
LangGraph: mode='codebuild' tag='bedrock_agentcore-ac_eval_langgraph2:latest' env_vars=None port=None runtime=None ecr_uri='339712707840.dkr.ecr.us-west-2.amazonaws.com/bedrock-agentcore-ac_eval_langgraph2' agent_id='ac_eval_langgraph2-mZw4hcGHHr' agent_arn='arn:aws:bedrock-agentcore:us-west-2:339712707840:runtime/ac_eval_langgraph2-mZw4hcGHHr' codebuild_id='bedrock-agentcore-ac_eval_langgraph2-builder:08c0a806-b4b7-41da-8963-728c17d7ef10' build_output=None, Session: 57d90dac-4c01-4dce-a4c1-7dd1678

In [16]:
%store launch_result_strands
%store session_id_strands
%store launch_result_langgraph
%store session_id_langgraph

Stored 'launch_result_strands' (LaunchResult)
Stored 'session_id_strands' (str)
Stored 'launch_result_langgraph' (LaunchResult)
Stored 'session_id_langgraph' (str)
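
In a later notebook, `%store -r launch_result_strands` restores the stored object. The launch results printed above carry an `agent_arn` field, and downstream steps often need the agent ID or region on its own. As a sketch, the ARN can be split as follows; the helper name `parse_agent_arn` is hypothetical, and the expected ARN shape is inferred from the output shown in this notebook.

```python
def parse_agent_arn(agent_arn: str) -> dict:
    """Split an AgentCore runtime ARN into region, account ID, and agent ID."""
    # Expected shape: arn:aws:bedrock-agentcore:<region>:<account>:runtime/<agent_id>
    parts = agent_arn.split(":", 5)
    if len(parts) != 6 or not parts[5].startswith("runtime/"):
        raise ValueError(f"Unexpected ARN format: {agent_arn}")
    return {
        "region": parts[3],
        "account_id": parts[4],
        "agent_id": parts[5].split("/", 1)[1],
    }
```

For example, `parse_agent_arn(launch_result_strands.agent_arn)["agent_id"]` would yield the same value as the `agent_id` field printed above.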


## Next Steps

Now that you have all the required prerequisites, continue with the individual evaluation tutorials:
- [01-creating-custom-evaluators](../01-creating-custom-evaluators/): Create custom evaluators
- [02-running-evaluations](../02-running-evaluations/): Run on-demand and online evaluations
- [03-advanced](../03-advanced/): Advanced techniques and dashboards