# Chapter 5: Deploying Strands Agents to Production

## Introduction to Deployment

After developing your Strands Agents locally, the next step is deploying them to production. In this chapter, we'll focus on AWS Lambda, which provides an excellent serverless platform for running AI agents.

We'll cover:
- Advantages of AWS Lambda for Strands Agents
- Preparing your agent, IAM permissions, and deployment package
- Deploying and testing the Lambda function
- Advanced options such as API Gateway, plus monitoring and scaling considerations

Throughout this chapter, we'll use the Nova Lite model (`us.amazon.nova-lite-v1:0`) as specified for our course.

## Why AWS Lambda for Strands Agents?

AWS Lambda offers several advantages for deploying Strands Agents:

1. **Serverless Architecture**: No server management required
2. **Pay-per-use Pricing**: Pay only for the compute time your agent actually uses
3. **Automatic Scaling**: Scales with request volume without manual capacity planning
4. **AWS Integration**: Seamless integration with other AWS services
5. **Python Support**: Native support for Python-based applications

These benefits make Lambda ideal for deploying AI agents with variable usage patterns.

## Preparing Your Agent for Lambda

Let's create a simple weather agent that we'll prepare for deployment:

In [None]:
from strands import Agent, tool

@tool
def weather_info(location: str) -> str:
    """
    Get weather information for a location.
    
    Args:
        location (str): City or location name
        
    Returns:
        str: Weather information for the location
    """
    # Mock data for demonstration
    weather_data = {
        "new york": "72°F, Partly Cloudy",
        "london": "64°F, Rainy",
        "tokyo": "79°F, Sunny"
    }
    
    return weather_data.get(location.lower(), "Weather information not available")

# Create the agent
agent = Agent(
    model="us.amazon.nova-lite-v1:0",
    tools=[weather_info],
    system_prompt="You are a helpful weather assistant."
)

# Test locally
response = agent("What's the weather like in Tokyo?")

Now let's structure our code for Lambda deployment by creating a Lambda handler function:

In [None]:
%%writefile lambda_function.py
import json
from strands import Agent, tool

@tool
def weather_info(location: str) -> str:
    """
    Get weather information for a location.
    
    Args:
        location (str): City or location name
        
    Returns:
        str: Weather information for the location
    """
    # Mock data for demonstration
    weather_data = {
        "new york": "72°F, Partly Cloudy",
        "london": "64°F, Rainy",
        "tokyo": "79°F, Sunny"
    }
    
    return weather_data.get(location.lower(), "Weather information not available")

# Create the agent outside the handler to benefit from container reuse
agent = Agent(
    model="us.amazon.nova-lite-v1:0",
    tools=[weather_info],
    system_prompt="You are a helpful weather assistant."
)

def lambda_handler(event, context):
    """
    AWS Lambda handler function
    """
    try:
        # Extract the user message; fall back to '{}' when the body is missing or null
        body = json.loads(event.get('body') or '{}')
        user_message = body.get('message', '')
        
        if not user_message:
            return {
                'statusCode': 400,
                'body': json.dumps({'error': 'No message provided'})
            }
        
        # Process with our agent
        response = agent(user_message)
        
        return {
            'statusCode': 200,
            'body': json.dumps({'message': response.message})
        }
        
    except Exception as e:
        # Log the error
        print(f"Error: {str(e)}")
        
        return {
            'statusCode': 500,
            'body': json.dumps({'error': 'Internal server error'})
        }
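
Requests that arrive through API Gateway can carry a `null` body, malformed JSON, or a base64-encoded body (flagged by `isBase64Encoded`). The following is a minimal sketch of a more defensive parser, assuming the same `{"message": ...}` request shape used by the handler above. `parse_message` is a hypothetical helper name, not part of Strands or the Lambda runtime.

```python
import base64
import json


def parse_message(event):
    """Return the 'message' string from a proxy-style event, or None if it is absent or malformed."""
    raw = event.get("body")
    if raw is None:
        return None
    # API Gateway sets isBase64Encoded when it has base64-encoded the body
    if event.get("isBase64Encoded"):
        try:
            raw = base64.b64decode(raw).decode("utf-8")
        except ValueError:  # covers binascii.Error and UnicodeDecodeError
            return None
    try:
        # Direct invocations may already pass a dict instead of a JSON string
        body = json.loads(raw) if isinstance(raw, str) else raw
    except json.JSONDecodeError:
        return None
    if not isinstance(body, dict):
        return None
    message = body.get("message")
    return message if isinstance(message, str) and message.strip() else None
```

A handler could call `parse_message(event)` and return a 400 response whenever the result is `None`, so that bad input is not reported as a server error.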

## IAM Permissions for Bedrock

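To use Amazon Bedrock models in Lambda, the function's execution role needs appropriate IAM permissions. This takes two documents: a trust policy that lets the Lambda service assume the role, and a permissions policy that allows invoking the model and writing logs to CloudWatch. The cells below create both and then attach them to a new role: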

In [None]:
%%writefile trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

In [None]:
%%writefile bedrock-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
            "Resource": [
                "arn:aws:bedrock:*:*:foundation-model/amazon.nova-lite-v1:0",
                "arn:aws:bedrock:us-east-1:*:inference-profile/us.amazon.nova-lite-v1:0"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        }
    ]
}
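
If you would rather pin the inference-profile ARN to one specific account and Region than edit the JSON by hand, the policy document can be generated. This is a sketch under assumptions: `build_bedrock_policy` is a hypothetical helper, and `123456789012` is a placeholder account ID that you would replace with the output of `aws sts get-caller-identity`.

```python
import json


def build_bedrock_policy(account_id, region="us-east-1",
                         model_id="amazon.nova-lite-v1:0",
                         profile_id="us.amazon.nova-lite-v1:0"):
    """Return the permissions policy as a dict, scoped to one account and Region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
                "Resource": [
                    # The cross-Region profile may route to models in several Regions
                    f"arn:aws:bedrock:*:*:foundation-model/{model_id}",
                    f"arn:aws:bedrock:{region}:{account_id}:inference-profile/{profile_id}",
                ],
            },
            {
                "Effect": "Allow",
                "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": "arn:aws:logs:*:*:*",
            },
        ],
    }


# Placeholder account ID for illustration only
print(json.dumps(build_bedrock_policy("123456789012"), indent=2))
```

Writing the result to `bedrock-policy.json` with `json.dump` would let the CLI commands in this section use it unchanged.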

In [None]:
%%writefile create_iam_role.sh
#!/bin/bash

# Get the account ID
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

# Create the IAM role
aws iam create-role \
  --role-name strands-agents-lambda-bedrock-role \
  --assume-role-policy-document file://trust-policy.json

# Create the IAM policy
aws iam create-policy \
  --policy-name strands-agents-lambda-bedrock-policy \
  --policy-document file://bedrock-policy.json

# Attach the IAM policy to the role
aws iam attach-role-policy \
  --role-name strands-agents-lambda-bedrock-role \
  --policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/strands-agents-lambda-bedrock-policy

In [None]:
!sh create_iam_role.sh

## Creating a Deployment Package

For Lambda, you need to package your code and dependencies into a ZIP file. First make sure the `zip` utility is installed, then write and run a script that builds the deployment package:

In [None]:
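# Install zip only if it is missing; a given system has yum or apt, not both
!command -v zip >/dev/null || sudo yum install -y zip || sudo apt-get install -y zip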

In [None]:
%%writefile create_package.sh
#!/bin/bash

# Create a temporary directory
mkdir -p package
cp lambda_function.py package/

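# Create a virtual environment and install dependencies into package/
# Note: if you build on macOS or Windows, pip may fetch wheels that do not run on Lambda.
# In that case, consider adding: --platform manylinux2014_x86_64 --only-binary=:all: --python-version 3.12
python -m venv venv
. venv/bin/activate
pip install strands-agents -t package/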

# Create the ZIP file
cd package
zip -r ../deployment.zip .
cd ..

# Clean up
rm -rf package venv
echo "Deployment package created: deployment.zip"

In [None]:
!sh create_package.sh
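
Before uploading, it can help to confirm that the archive fits Lambda's .zip package quotas: 50 MB zipped for direct upload and 250 MB unzipped. The following is a minimal sketch using only the standard library. `check_package` is a hypothetical helper, and the limits are hard-coded here on the assumption that the published quotas have not changed.

```python
import os
import zipfile

# Lambda quotas for .zip deployment packages (verify against the current Lambda quotas page)
MAX_ZIPPED_DIRECT_UPLOAD = 50 * 1024 * 1024   # 50 MB when uploaded directly
MAX_UNZIPPED = 250 * 1024 * 1024              # 250 MB unzipped, including layers


def check_package(zip_path):
    """Report the zipped and unzipped size of a package and whether each fits its quota."""
    zipped = os.path.getsize(zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        unzipped = sum(info.file_size for info in zf.infolist())
    return {
        "zipped_bytes": zipped,
        "unzipped_bytes": unzipped,
        "fits_direct_upload": zipped <= MAX_ZIPPED_DIRECT_UPLOAD,
        "fits_unzipped_limit": unzipped <= MAX_UNZIPPED,
    }


if os.path.exists("deployment.zip"):
    print(check_package("deployment.zip"))
```

If the zipped size exceeds the direct-upload quota, uploading through Amazon S3 or moving dependencies into a Lambda Layer are the usual alternatives.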

## Deploying to AWS Lambda

Here's how to deploy your function using the AWS CLI:

In [None]:
%%writefile deploy.sh
#!/bin/bash

# Configuration
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
FUNCTION_NAME="strands_agents_lambda"
ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/strands-agents-lambda-bedrock-role"  # Role created in the IAM step above
REGION="us-east-1"

# Create the Lambda function
aws lambda create-function \
    --function-name $FUNCTION_NAME \
    --runtime python3.12 \
    --handler lambda_function.lambda_handler \
    --timeout 30 \
    --memory-size 512 \
    --role $ROLE_ARN \
    --zip-file fileb://deployment.zip \
    --region $REGION

In [None]:
!sh deploy.sh

## Testing the Deployed Agent

Once deployed, you can test your agent by sending requests:

In [None]:
import boto3
import json

def invoke_weather_agent(location_query):
    """
    Invokes the strands_agents_lambda with a weather query
    
    Args:
        location_query (str): The weather query (e.g. "What's the weather in Tokyo?")
    
    Returns:
        dict: The response from the Lambda function
    """
    # Create Lambda client
    lambda_client = boto3.client('lambda', region_name='us-east-1')

    try:
        # Invoke the Lambda function
        response = lambda_client.invoke(
            FunctionName='strands_agents_lambda',
            InvocationType='RequestResponse',  # Synchronous call
            Payload=json.dumps({
                'body': json.dumps({
                    'message': location_query
                })
            })
        )
        
        # Parse the response
        payload = json.loads(response['Payload'].read().decode('utf-8'))
        return {
            'status_code': response['StatusCode'],
            'response': json.loads(payload.get('body', '{}'))
        }
        
    except Exception as e:
        print(f"Error invoking Lambda: {str(e)}")
        return {
            'status_code': 500,
            'error': str(e)
        }

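# Example usage
if __name__ == "__main__":
    result = invoke_weather_agent("What's the weather like in Tokyo?")
    # Only index into the message when the call succeeded
    if 'response' in result and 'message' in result['response']:
        print("Lambda Response:\n\n", result['response']['message']['content'][0]['text'])
    else:
        print("Request failed:", result)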
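
The decoded payload can take several shapes: a successful handler response, a handler-generated 4xx/5xx response, or a runtime error object containing `errorMessage` when the function itself fails. The sketch below covers all three. `extract_agent_text` is a hypothetical helper, and the message layout (a `content` list of blocks with a `text` key) is assumed from the indexing used in the example above.

```python
import json


def extract_agent_text(payload):
    """Return the agent's reply text from a decoded Lambda payload, or a readable error string."""
    # Unhandled exceptions in the function produce a payload with 'errorMessage'
    if "errorMessage" in payload:
        return f"Lambda error: {payload['errorMessage']}"
    status = payload.get("statusCode")
    body = json.loads(payload.get("body") or "{}")
    if status != 200:
        return f"Agent returned HTTP {status}: {body.get('error', 'unknown error')}"
    # Assumed shape: {"message": {"role": ..., "content": [{"text": ...}, ...]}}
    content = body.get("message", {}).get("content", [])
    texts = [block["text"] for block in content if isinstance(block, dict) and "text" in block]
    return "\n".join(texts)
```

Joining every text block, rather than reading only the first, avoids an `IndexError` when the content list is empty.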

## Advanced Deployment Options

For production deployments, consider these advanced options:

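### 1. API Gateway

Put Amazon API Gateway in front of the function to expose your agent over HTTP. The handler in this chapter already reads the request from `event['body']` and returns `statusCode` and `body`, which is the shape a Lambda proxy integration expects.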

### 2. AWS CDK

The [AWS Cloud Development Kit (CDK)](https://aws.amazon.com/cdk/) allows you to define infrastructure using Python code.

### 3. Serverless Framework

The [Serverless Framework](https://www.serverless.com/) simplifies deploying Lambda functions and related resources.

### 4. Lambda Layers

For large dependencies, you can use [Lambda Layers](https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html) to separate dependencies from your function code.

## Monitoring and Scaling

### CloudWatch Metrics and Logs

AWS Lambda automatically sends metrics to CloudWatch, including invocation count, duration, and errors. Set up CloudWatch Alarms to alert you of potential issues.

### Lambda Concurrency

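Lambda automatically scales based on the number of incoming requests. For more predictable behavior, two settings are available:
- **Reserved Concurrency**: Sets aside part of your account's concurrency for the function and caps the function at that number of concurrent executions
- **Provisioned Concurrency**: Keeps a set number of execution environments initialized so they can respond immediately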

### Cold Starts

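The first invocation of a Lambda function, and any invocation that lands on a newly created execution environment after scaling out, may experience a "cold start" delay while the runtime loads your code and builds the agent. To mitigate this: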
- Use Provisioned Concurrency
- Keep function size small
- Use warm-up pings for critical functions
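
A warm-up ping only helps if the handler returns before it calls the model. Otherwise each ping incurs Bedrock charges. One way to sketch this, assuming a scheduled rule that sends the hypothetical payload `{"warmup": true}` (the key name and the `with_warmup` wrapper are illustrative, not a Lambda or Strands convention):

```python
def is_warmup(event):
    """Return True for the hypothetical warm-up payload {"warmup": true}."""
    return isinstance(event, dict) and event.get("warmup") is True


def with_warmup(handler):
    """Wrap a Lambda handler so warm-up pings return before any model call."""
    def wrapper(event, context):
        if is_warmup(event):
            # Short-circuit: the environment is now warm, no agent work needed
            return {"statusCode": 200, "body": '{"status": "warm"}'}
        return handler(event, context)
    return wrapper
```

Applying `lambda_handler = with_warmup(lambda_handler)` at the bottom of the module would leave normal requests untouched.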

### Cost Optimization

Lambda charges based on:
- Number of requests
- Duration × memory allocation

Optimize by:
- Tuning memory allocation
- Minimizing function duration
- Using CloudWatch Logs Insights to identify inefficient functions
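
To see how requests, duration, and memory combine, here is a rough estimator. The two prices are assumptions based on commonly published x86 on-demand rates and should be checked against the current Lambda pricing page for your Region. The sketch ignores the free tier and Bedrock model charges, which are billed separately.

```python
# Assumed prices; verify against the Lambda pricing page for your Region
PRICE_PER_MILLION_REQUESTS = 0.20      # USD per one million requests
PRICE_PER_GB_SECOND = 0.0000166667     # USD per GB-second of compute


def estimate_monthly_cost(requests, avg_duration_s, memory_mb):
    """Estimate Lambda cost in USD as request charges plus duration x memory charges."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    request_cost = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    return round(request_cost + compute_cost, 2)


# 100,000 requests a month, 3 seconds each, at the 512 MB used in this chapter
print(estimate_monthly_cost(100_000, 3, 512))
```

Because agent invocations spend most of their time waiting on the model, duration usually dominates this estimate, which is why trimming response time tends to matter more than trimming memory.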

## Summary

In this chapter, we've covered:

1. Why AWS Lambda is well-suited for deploying Strands Agents
2. How to structure your agent code for Lambda deployment
3. Creating a deployment package with all dependencies
4. Advanced deployment options like Serverless Framework and AWS CDK
5. Monitoring and scaling considerations for production deployments

With these tools and techniques, you can deploy Strands Agents to a production environment that is scalable, cost-effective, and easy to maintain.

In the next chapter, we'll explore advanced patterns for multi-agent systems that can work together to solve complex problems.

## Exercises

1. Modify the example agent to use a different Strands tool and deploy it to Lambda
2. Implement a basic authentication mechanism for your API Gateway endpoint
3. Set up CloudWatch Alarms to monitor your deployed agent
4. Create a Lambda Layer for the Strands dependencies
5. Implement a WebSocket-based solution for streaming agent responses