# Deploy Intelligent RAG Agent to AgentCore Runtime

This notebook demonstrates how to deploy the Strands-based intelligent RAG agent to Amazon Bedrock AgentCore Runtime.

### Why Deploy to AgentCore Runtime?

Previously, our intelligent RAG agent ran locally in the development environment. While this works for prototyping, production applications need:

- **Internet Accessibility**: Make your agent available as a service that can be invoked from anywhere
- **Automatic Scaling**: Handle varying loads without manual infrastructure management
- **Enterprise Observability**: Built-in CloudWatch monitoring, traces, and metrics
- **Memory Capabilities**: Maintain conversation context and learn user preferences over time
- **Serverless Infrastructure**: No servers to manage, pay only for what you use

AgentCore Runtime transforms your local agent prototype into a production-ready service by handling all infrastructure complexity, security, and operational concerns.

In this lab, the Strands agent is defined in separate Python files.  
You can invoke the agent locally by executing `intelligent_rag_agent_runtime.py`.

The agent has two tools for looking up data:
* Structured data is handled by `structured_data_assistant()`
* Unstructured data is handled by `unstructured_data_assistant()`

This lab walks you through deploying the agent to AgentCore Runtime while leveraging AgentCore Memory and Observability capabilities.

![agent_core](../images/agentcore.png)



## Prerequisites
- Complete previous labs to set up the knowledge bases
- Knowledge base IDs stored in variables
- AgentCore runtime permissions configured

Install the AgentCore SDK and starter toolkit to enable deployment and management of agents on the AgentCore platform:

In [None]:
# Install required packages
!uv pip install bedrock-agentcore bedrock-agentcore-starter-toolkit --quiet

Import required libraries and initialize the AgentCore Runtime client to interact with the deployment service:

In [None]:
import json
import os
import random
import string
import time

import boto3
from bedrock_agentcore_starter_toolkit import Runtime

boto_session = boto3.Session()
region = boto_session.region_name

print(f"current region: {region}")
account_id = boto_session.client("sts").get_caller_identity()["Account"]
print(f"current account: {account_id}")

agentcore_runtime = Runtime()

Get the current notebook's directory path to locate the agent code and policy files for deployment:

In [None]:
import os
from pathlib import Path

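try:
    # ipynbname is optional; it resolves the path of the running notebook
    import ipynbname

    notebook_dir = Path(ipynbname.path()).parent
except Exception:
    # Fall back to the current working directory when the notebook path
    # cannot be determined (package missing or kernel not discoverable)
    notebook_dir = Path.cwd()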

print(f"notebook_dir: {notebook_dir}")

# The AgentCore starter toolkit expects the agent files in the current directory,
# so switch the working directory to the notebook's directory.
os.chdir(notebook_dir)
print(f"changed working directory to: {notebook_dir}")

### Create IAM Execution Role

The AgentCore runtime requires an IAM role with specific permissions to:
- Access Bedrock models for AI inference
- Retrieve from knowledge bases (KB)
- Write to CloudWatch logs
- Read the SSM parameter that stores the KB ID
- Pull/push Docker images to ECR

We'll create this role using our predefined policy templates that include all necessary permissions.

In [None]:
# Load policy from external file
with open('policy.json', 'r') as f:
    policy_template = f.read()

# load trust policy from external file
with open('trust-policy.json', 'r') as f:
    trust_policy_template = f.read()

policy = policy_template.replace('REGION', region).replace('ACCOUNT_ID', account_id).replace('REPO_ARN', '*')
print(f"Policy loaded and updated with region: {region}, account: {account_id}")

trust_policy = trust_policy_template.replace('REGION', region).replace('ACCOUNT_ID', account_id)

# Create the IAM role with a random suffix to avoid name collisions
suffix = ''.join(random.choices(string.ascii_lowercase + string.digits, k=8))
iam_client = boto3.client('iam')
role_name = f"bedrock-runtime-execution-role-{suffix}"

role = iam_client.create_role(
    RoleName=role_name,
    AssumeRolePolicyDocument=trust_policy
)

iam_client.put_role_policy(
    RoleName=role_name,
    PolicyName='bedrock-runtime-execution-policy',
    PolicyDocument=policy
)
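The placeholder substitution above is plain string replacement, so a typo in a template only surfaces when IAM rejects the document. As a minimal sketch, the substitution can be wrapped in a helper that also checks the result is valid JSON with a `Statement` element. The name `render_policy` and the sample template are hypothetical and not part of the lab files.

```python
import json


def render_policy(template: str, substitutions: dict[str, str]) -> str:
    """Fill placeholder tokens in a policy template and sanity-check the result."""
    rendered = template
    for placeholder, value in substitutions.items():
        rendered = rendered.replace(placeholder, value)

    # json.loads raises ValueError if the substitution produced invalid JSON
    document = json.loads(rendered)

    # Every IAM policy document needs a Statement element
    if "Statement" not in document:
        raise ValueError("Policy document has no 'Statement' element")
    return rendered


# Hypothetical template, for illustration only
sample_template = (
    '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", '
    '"Action": "logs:PutLogEvents", '
    '"Resource": "arn:aws:logs:REGION:ACCOUNT_ID:*"}]}'
)
rendered = render_policy(
    sample_template, {"REGION": "us-east-1", "ACCOUNT_ID": "123456789012"}
)
print(rendered)
```

The returned string can be passed to `create_role` or `put_role_policy` in place of the manually substituted `policy` and `trust_policy` values.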

### Agent Configuration Process

The `configure()` method prepares your agent for deployment by:

1. **Creating Infrastructure**: ECR repository for Docker images, IAM roles if needed
2. **Generating Dockerfile**: Configures Python environment, installs dependencies, sets up OpenTelemetry
3. **Creating .dockerignore**: Excludes unnecessary files from the container
4. **Generating .bedrock_agentcore.yaml**: Configuration file that defines:
   - Entrypoint script (your agent code)
   - Execution role ARN
   - Region and agent name
   - Runtime requirements

This configuration is saved locally and used during the launch process.

**Agent Code Location:**

The agent implementation is in `intelligent_rag_agent_runtime.py` in this directory. Take a moment to review the code to understand:
- How the agent routes queries between structured and unstructured knowledge bases
- The tool definitions for querying each knowledge base type
- The agent's decision-making logic for selecting the appropriate data source

**What happens during configuration:**
- Creates ECR repository and IAM role (if needed)
- Generates Dockerfile and .dockerignore for containerization
- Creates `.bedrock_agentcore.yaml` configuration file
- Packages your agent code (`intelligent_rag_agent_runtime.py`) for deployment
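Because configuration reads the entrypoint and requirements file from the working directory, a quick pre-flight check can catch a wrong directory before any AWS resources are created. The helper below is a hypothetical sketch (the name `missing_deploy_files` is an assumption), using the two file names this lab passes to `configure()`.

```python
from pathlib import Path


def missing_deploy_files(
    directory,
    required=("intelligent_rag_agent_runtime.py", "requirements.txt"),
):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]


# Check the current working directory before calling configure()
missing = missing_deploy_files(Path.cwd())
if missing:
    print(f"Missing files: {missing}")
else:
    print("All deployment files found")
```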

In [None]:
agent_name = "intelligent_rag_agent"
response = agentcore_runtime.configure(
    entrypoint="intelligent_rag_agent_runtime.py",
    execution_role=role['Role']['Arn'],
    # auto_create_execution_role=True,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region=region,
    agent_name=agent_name
)
response

### CodeBuild Deployment Process

When you call `launch()`, AgentCore triggers a CodeBuild job that:

**Pre-build Phase:**
- Authenticates with ECR registry
- Prepares build environment

**Build Phase:**
- Creates ARM64 Docker image (optimized for AWS Graviton processors)
- Installs Python dependencies from requirements.txt
- Copies your agent code into the container
- Configures OpenTelemetry instrumentation

**Post-build Phase:**
- Pushes Docker image to ECR
- Creates AgentCore runtime endpoint
- Configures auto-scaling and monitoring

The entire process takes 3-5 minutes. You can monitor progress through the status checks.

In [None]:
launch_result = agentcore_runtime.launch(auto_update_on_conflict=True)

Monitor the deployment status to ensure the agent is ready before attempting to invoke it:

In [None]:
import time

# Poll every 10 seconds until the endpoint reaches a terminal state
status_response = agentcore_runtime.status()
status = status_response.endpoint['status']
print(status)
end_status = ['READY', 'CREATE_FAILED', 'DELETE_FAILED', 'UPDATE_FAILED']
while status not in end_status:
    time.sleep(10)
    status_response = agentcore_runtime.status()
    status = status_response.endpoint['status']
    print(status)

# Store the agent ARN for use in other notebooks
if status == 'READY':
    try:
        endpoint = status_response.endpoint
        # Get the agent ARN from the status response
        if 'agentRuntimeArn' in endpoint:
            agent_arn = endpoint['agentRuntimeArn']
        elif 'endpointName' in endpoint:
            # Construct ARN from endpoint name if not directly available
            endpoint_name = endpoint['endpointName']
            agent_arn = f"arn:aws:bedrock:{region}:{account_id}:agent-runtime/{endpoint_name}"
        else:
            raise ValueError("Could not extract agent ARN from status response")

        %store agent_arn
        print(f"Agent ARN stored for use in other notebooks: {agent_arn}")
    except Exception as e:
        print(f"Could not store agent ARN: {e}")
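The polling loop above has no upper bound, so a deployment that never reaches a terminal state would block the notebook indefinitely. A minimal sketch of the same loop with a timeout is shown below. The helper `wait_for_status` and its parameters are hypothetical; it takes any zero-argument callable that returns the current status string.

```python
import time


def wait_for_status(get_status, terminal_states, interval=10.0, timeout=900.0,
                    sleep=time.sleep):
    """Poll `get_status()` until it returns a terminal state or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Status still {status!r} after {timeout} seconds")
        sleep(interval)


# Demonstration with a fake status source instead of a real deployment
fake_states = iter(["CREATING", "CREATING", "READY"])
final = wait_for_status(
    lambda: next(fake_states),
    {"READY", "CREATE_FAILED", "DELETE_FAILED", "UPDATE_FAILED"},
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(final)
```

With the names used in this notebook, the call might look like `wait_for_status(lambda: agentcore_runtime.status().endpoint['status'], end_status)`.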

### Testing Your Deployed Agent

Now that the agent is deployed to AgentCore Runtime, you can invoke it just like you did locally, but with key differences:

- **Invocation Method**: Uses `agentcore_runtime.invoke()` instead of direct Python execution
- **Scalability**: Automatically scales based on request volume
- **Observability**: Every invocation generates traces and metrics in CloudWatch
- **Availability**: Accessible via API endpoint, not just local execution

The agent maintains the same intelligent routing behavior between structured and unstructured knowledge bases, but now runs in a production-ready environment.
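Outside this notebook, the deployed agent can also be called without the starter toolkit. The sketch below assumes the boto3 `bedrock-agentcore` client and its `invoke_agent_runtime` operation, and assumes the agent accepts the same `{"prompt": ...}` payload used in this lab. The helper names are hypothetical, and the session ID is a random UUID because, to the best of our knowledge, the service expects a session ID of at least 33 characters.

```python
import json
import uuid


def build_invoke_request(agent_arn, prompt, session_id=None):
    """Build keyword arguments for invoke_agent_runtime (assumed parameter names)."""
    return {
        "agentRuntimeArn": agent_arn,
        # A UUID string is 36 characters long
        "runtimeSessionId": session_id or str(uuid.uuid4()),
        "payload": json.dumps({"prompt": prompt}).encode("utf-8"),
    }


def invoke_with_boto3(agent_arn, prompt, region_name=None):
    """Invoke the deployed agent and return the raw response body as text."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agentcore", region_name=region_name)
    response = client.invoke_agent_runtime(**build_invoke_request(agent_arn, prompt))
    return response["response"].read().decode("utf-8")


# Placeholder ARN; substitute the agent_arn value stored earlier
request = build_invoke_request("<agent-runtime-arn>", "What products have the best reviews?")
print(request["runtimeSessionId"])
```

Reusing the same `session_id` across calls keeps related requests in one session, while a fresh UUID starts a new one.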

**Helper Function for Response Formatting**

The `print_response_text()` function extracts and displays the text content from API responses in a clean, readable format:

In [None]:
import json


def print_response_text(invoke_response):
    response = invoke_response['response']
    
    # If it's a list, join all parts first
    if isinstance(response, list):
        response = ''.join(response)
    
    # Parse JSON
    response = json.loads(response)
    
    # Extract the text content
    text = response['result']['content'][0]['text']
    print(text)
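`print_response_text()` assumes the response is always JSON shaped as `result -> content -> [0] -> text`, and raises if the agent returns plain text or a different structure. As a hedged sketch, a more tolerant variant could fall back to the raw payload instead. The name `extract_text` is hypothetical, and the handling of byte chunks is an assumption about what the runtime may return.

```python
import json


def extract_text(invoke_response):
    """Return the agent's text, falling back to the raw payload when the shape differs."""
    raw = invoke_response["response"]

    # Join chunked responses, decoding any byte chunks
    if isinstance(raw, (list, tuple)):
        raw = "".join(
            part.decode("utf-8") if isinstance(part, bytes) else str(part)
            for part in raw
        )
    elif isinstance(raw, bytes):
        raw = raw.decode("utf-8")

    parsed = raw
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except ValueError:
            return raw  # not JSON: return the text unchanged

    try:
        return parsed["result"]["content"][0]["text"]
    except (KeyError, IndexError, TypeError):
        # Unexpected structure: show everything rather than fail
        return json.dumps(parsed, indent=2, default=str)


print(extract_text({"response": ['{"result": {"content": ', '[{"text": "hello"}]}}']}))
```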

**Convenience Function for Agent Invocation**

Define a convenience function to simplify agent invocations and format responses consistently:

In [None]:
def ask(question):
    print(f"Query: {question}")
    print("-" * 100)
    response = agentcore_runtime.invoke({"prompt": question})
    print_response_text(response)
    print("\n")

#### Queries

Test the deployed agent with sample queries to verify it can route between structured and unstructured knowledge bases:

In [None]:
ask("How many customers reviewed product_890, are those reviews positive or negative?")

In [None]:
ask("What are customer complaints about the lowest rated product?")

In [None]:
ask("What products have the best reviews?")

## Try making up your own queries!
Now it's your turn! Try creating some queries to test the system:

**Examples you could try:**
- 'What are the main features of product X?'
- 'How does service Y compare to competitors?'
- 'What are the pricing options for Z?'
- 'Can you explain the benefits of using this solution?'

Feel free to experiment with different types of questions!


In [None]:
# Try your own queries here using the ask() function
# Example: ask("Your question here")

## Next Steps

Your agent is now deployed to AgentCore Runtime and ready for production use!

**Ready to continue?** Proceed to [**Lab 3.2**](3.2-agentcore-memory.ipynb) to add AgentCore Memory capabilities for conversation context and user preferences.
