# Machine Learning and Deep Learning Model Deployment with Serverless

This notebook covers deploying ML models using AWS Lambda and Docker, including:
- Scikit-Learn model deployment
- Deep Learning models with ONNX (TensorFlow/Keras and PyTorch)

**Video**: https://www.youtube.com/watch?v=sHQaeVm5hT8

## Prerequisites

- AWS Account
- AWS CLI installed and configured
- Docker installed


## Part 1: Training a Scikit-Learn Model

First, we'll train a simple churn prediction model using Scikit-Learn. This model will be deployed to AWS Lambda in the following sections.


In [None]:
import pickle

import pandas as pd
import sklearn

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


print(f'pandas=={pd.__version__}')
print(f'sklearn=={sklearn.__version__}')


In [None]:
# def load_data():
#     """
#     Loads and preprocesses the Telco customer churn dataset.
    
#     Returns:
#         pd.DataFrame: Preprocessed dataframe with cleaned column names and data types.
#     """
#     data_url = 'https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/chapter-03-churn-prediction/WA_Fn-UseC_-Telco-Customer-Churn.csv'

#     df = pd.read_csv(data_url)

#     # Normalize column names
#     df.columns = df.columns.str.lower().str.replace(' ', '_')

#     # Normalize categorical values
#     categorical_columns = list(df.dtypes[df.dtypes == 'object'].index)

#     for column in categorical_columns:
#         df[column] = df[column].str.lower().str.replace(' ', '_')

#     # Handle numeric conversion and missing values
#     df.totalcharges = pd.to_numeric(df.totalcharges, errors='coerce')
#     df.totalcharges = df.totalcharges.fillna(0)

#     # Convert churn to binary
#     df.churn = (df.churn == 'yes').astype(int)
    
#     return df


In [None]:
# def train_model(df):
#     """
#     Trains a logistic regression model for churn prediction.
    
#     Args:
#         df (pd.DataFrame): Preprocessed dataframe with features and target.
    
#     Returns:
#         sklearn.pipeline.Pipeline: Trained pipeline with DictVectorizer and LogisticRegression.
#     """
#     numerical_features = ['tenure', 'monthlycharges', 'totalcharges']

#     categorical_features = [
#         'gender',
#         'seniorcitizen',
#         'partner',
#         'dependents',
#         'phoneservice',
#         'multiplelines',
#         'internetservice',
#         'onlinesecurity',
#         'onlinebackup',
#         'deviceprotection',
#         'techsupport',
#         'streamingtv',
#         'streamingmovies',
#         'contract',
#         'paperlessbilling',
#         'paymentmethod',
#     ]

#     y_train = df.churn
#     train_dict = df[categorical_features + numerical_features].to_dict(orient='records')

#     pipeline = make_pipeline(
#         DictVectorizer(),
#         LogisticRegression(solver='liblinear')
#     )

#     pipeline.fit(train_dict, y_train)

#     return pipeline


In [None]:
# def save_model(pipeline, output_file):
#     """
#     Saves a trained pipeline to disk using pickle.
    
#     Args:
#         pipeline: Trained sklearn pipeline to save.
#         output_file (str): Path where the model will be saved.
#     """
#     with open(output_file, 'wb') as f_out:
#         pickle.dump(pipeline, f_out)


In [None]:
# # Load data and train model
# df = load_data()
# pipeline = train_model(df)
# save_model(pipeline, 'model.bin')

# print('Model saved to model.bin')


## Part 2: AWS Lambda Basics

AWS Lambda is a serverless compute service that runs code in response to events. Let's start with a simple Lambda function that returns a mock prediction.
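
A mock handler of that kind might look like the sketch below. It is not the workshop's actual function: the hard-coded value `0.56` and the log message are assumptions chosen only for illustration. It returns the same response shape that the real model-backed handler uses later in this notebook.

```python
def lambda_handler(event, context):
    """Return a hard-coded (mock) churn prediction for any incoming event."""
    # 'event' is the JSON payload sent by the caller, already parsed into a dict.
    customer = event.get('customer', {})
    print(f'Received customer with {len(customer)} fields')

    # Mock value (assumption): no model is loaded at this stage.
    prediction = 0.56
    return {
        'churn_probability': prediction,
        'churn': prediction >= 0.5,
    }


if __name__ == '__main__':
    # Local smoke test; in AWS, Lambda supplies 'event' and 'context'.
    print(lambda_handler({'customer': {'tenure': 1}}, None))
```

Starting with a mock lets you verify deployment and invocation before a model and its dependencies are involved.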


### Invoking the Lambda Function

Once deployed, you can invoke the Lambda function in several ways:

#### Method 1: AWS CLI

First, create a JSON file with the customer data (e.g., `customer.json`):

```json
{
  "customer": {
    "gender": "female",
    "seniorcitizen": 0,
    "partner": "yes",
    "dependents": "no",
    "phoneservice": "no",
    "multiplelines": "no_phone_service",
    "internetservice": "dsl",
    "onlinesecurity": "no",
    "onlinebackup": "yes",
    "deviceprotection": "no",
    "techsupport": "no",
    "streamingtv": "no",
    "streamingmovies": "no",
    "contract": "month-to-month",
    "paperlessbilling": "yes",
    "paymentmethod": "electronic_check",
    "tenure": 1,
    "monthlycharges": 29.85,
    "totalcharges": 29.85
  }
}
```

Then invoke the function:

```bash
aws lambda invoke --function-name churn_prediction --cli-binary-format raw-in-base64-out --payload file://customer.json --region us-west-2 output.json && cat output.json
```

**Note:** Make sure to specify the `--region` parameter matching the region where your Lambda function is deployed. If you haven't configured AWS CLI defaults, you can also set the region using `aws configure` or by setting the `AWS_DEFAULT_REGION` environment variable.

The response will be saved to `output.json`.


#### Method 2: Using boto3 (Python)

You can also invoke the Lambda function programmatically using boto3.

#### Using `aws login` credentials

The code below uses `aws login` credentials. For a simpler alternative, use `aws configure` instead - then boto3 will work automatically without credential loading code. See [troubleshooting guide](aws-docs/troubleshooting.md) for details.

In [None]:
# import boto3
# import json
# import os
# from pathlib import Path


# def load_aws_login_credentials():
#     """
#     Loads credentials from aws login cache.
    
#     Returns:
#         dict: Credentials dict with access_key, secret_key, token, or None if not found.
#     """
#     login_cache_dir = Path.home() / '.aws' / 'login' / 'cache'
#     credential_files = list(login_cache_dir.glob('*.json'))
    
#     if not credential_files:
#         return None
    
#     try:
#         with open(credential_files[0]) as f:
#             creds_data = json.load(f)
#             access_token = creds_data.get('accessToken', {})
#             return {
#                 'aws_access_key_id': access_token.get('accessKeyId'),
#                 'aws_secret_access_key': access_token.get('secretAccessKey'),
#                 'aws_session_token': access_token.get('sessionToken')
#             }
#     except Exception:
#         return None


# # Load credentials once at module level
# _aws_creds = load_aws_login_credentials()
# if _aws_creds:
#     os.environ.update(_aws_creds)
#     print("✓ AWS credentials loaded from aws login cache")
# else:
#     print("⚠ Warning: No credentials found in aws login cache. Make sure you've run 'aws login'")

In [None]:
# def invoke_lambda_function(function_name, payload, region_name='us-west-2'):
#     """
#     Invokes an AWS Lambda function.
    
#     Args:
#         function_name (str): Lambda function name.
#         payload (dict): Request payload.
#         region_name (str): AWS region. Defaults to 'us-west-2'.
    
#     Returns:
#         dict: Lambda function response.
#     """
#     if _aws_creds:
#         session = boto3.Session(
#             aws_access_key_id=_aws_creds['aws_access_key_id'],
#             aws_secret_access_key=_aws_creds['aws_secret_access_key'],
#             aws_session_token=_aws_creds['aws_session_token'],
#             region_name=region_name
#         )
#     else:
#         session = boto3.Session(region_name=region_name)
    
#     lambda_client = session.client('lambda', region_name=region_name)
#     response = lambda_client.invoke(
#         FunctionName=function_name,
#         InvocationType='RequestResponse',
#         Payload=json.dumps(payload)
#     )
#     return json.loads(response['Payload'].read())

#### Alternative: Using `aws configure` (Simpler)

If you use `aws configure` instead of `aws login`, boto3 works automatically without credential loading code:

**Setup:** Run `aws configure` in terminal, then boto3 automatically uses credentials from `~/.aws/credentials`. See [troubleshooting guide](aws-docs/troubleshooting.md) for details.


In [None]:
import boto3
import json

def invoke_lambda_function(function_name, payload, region_name='us-east-1'):
    """
    Invokes an AWS Lambda function.
    
    Args:
        function_name (str): Lambda function name.
        payload (dict): Request payload.
        region_name (str): AWS region. Defaults to 'us-east-1'.
    
    Returns:
        dict: Lambda function response.
    """
    lambda_client = boto3.client('lambda', region_name=region_name)
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType='RequestResponse',
        Payload=json.dumps(payload)
    )
    return json.loads(response['Payload'].read())

In [None]:
customer_data = {
    "customer": {
        "gender": "female",
        "seniorcitizen": 0,
        "partner": "yes",
        "dependents": "no",
        "phoneservice": "no",
        "multiplelines": "no_phone_service",
        "internetservice": "dsl",
        "onlinesecurity": "no",
        "onlinebackup": "yes",
        "deviceprotection": "no",
        "techsupport": "no",
        "streamingtv": "no",
        "streamingmovies": "no",
        "contract": "month-to-month",
        "paperlessbilling": "yes",
        "paymentmethod": "electronic_check",
        "tenure": 1,
        "monthlycharges": 29.85,
        "totalcharges": 29.85
    }
}

In [None]:
result = invoke_lambda_function('churn_prediction', customer_data, region_name='us-east-1')
print(json.dumps(result, indent=2))
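
When the handler raises an exception, `invoke` still returns normally. The response then carries a `FunctionError` field and the payload describes the error rather than a prediction. The sketch below shows one way to surface that. The helper name `parse_invoke_response` is hypothetical, and `io.BytesIO` stands in for the streaming body that boto3 returns, so the example runs without AWS access.

```python
import io
import json


def parse_invoke_response(response):
    """Decode a Lambda invoke response, raising if the function reported an error."""
    payload = json.loads(response['Payload'].read())

    # Lambda sets 'FunctionError' when the handler failed;
    # the payload then holds error details instead of a prediction.
    if response.get('FunctionError'):
        message = payload.get('errorMessage') if isinstance(payload, dict) else payload
        raise RuntimeError(f'Lambda function failed: {message}')

    return payload


# Simulated successful response (io.BytesIO replaces boto3's streaming body)
ok_response = {
    'StatusCode': 200,
    'Payload': io.BytesIO(b'{"churn_probability": 0.56, "churn": true}'),
}
print(parse_invoke_response(ok_response))
```

With a real client, pass the return value of `lambda_client.invoke(...)` to the helper instead of the simulated dict.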

**Note:** You can also expose the Lambda function as a web service using API Gateway. See [unit 9.7 about API Gateway](https://github.com/DataTalksClub/machine-learning-zoomcamp/blob/master/09-serverless/07-api-gateway.md) for more details.


## Part 3: AWS Lambda with Docker: Running Locally

Scikit-Learn and its dependencies can exceed Lambda's 250 MB limit for unzipped `.zip` deployment packages. Container images, which can be up to 10 GB, solve this problem.

We'll set up the Lambda function using UV for dependency management, which provides a modern and efficient way to handle Python packages.

### Loading the Model and Lambda Function Script

In [None]:
import os
os.makedirs('lambda-sklearn', exist_ok=True)  # Create directory if it doesn't exist

In [None]:
%%writefile lambda-sklearn/lambda_function.py
import pickle

with open('model.bin', 'rb') as f_in:
    pipeline = pickle.load(f_in)

def predict_single(customer):
    result = pipeline.predict_proba(customer)[0, 1]
    return float(result)

def lambda_handler(event, context):
    customer = event['customer']
    prediction = predict_single(customer)
    
    return {
        'churn_probability': prediction,
        'churn': prediction >= 0.5
    }
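
The handler above assumes that `event['customer']` is present and complete. `DictVectorizer` treats absent features as zeros, so an incomplete request still yields a prediction without any warning. A minimal validation sketch follows. The function name `validate_event` and the choice of required fields are assumptions for illustration, not part of the workshop code.

```python
# Illustrative subset of fields (assumption); extend to match your model's features.
REQUIRED_FIELDS = ('tenure', 'monthlycharges', 'totalcharges', 'contract')


def validate_event(event):
    """Return the customer dict, or raise ValueError describing the problem."""
    if not isinstance(event, dict) or 'customer' not in event:
        raise ValueError("event must be a JSON object with a 'customer' key")

    customer = event['customer']
    if not isinstance(customer, dict):
        raise ValueError("'customer' must be a JSON object")

    # Collect every missing field so the caller sees the full list at once.
    missing = [name for name in REQUIRED_FIELDS if name not in customer]
    if missing:
        raise ValueError(f"missing customer fields: {', '.join(missing)}")

    return customer
```

Inside `lambda_handler`, `customer = validate_event(event)` could replace the direct `event['customer']` lookup.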


### Setting Up Dependencies with UV

We'll use UV (a fast Python package manager) to manage dependencies systematically:

1. Create `requirements.in` - centralized dependency list
2. Initialize UV project - creates `pyproject.toml`
3. Add dependencies to `pyproject.toml` from `requirements.in`
4. Generate lock file - creates `uv.lock` for reproducible builds

In [None]:
%%writefile lambda-sklearn/requirements.in
pandas
scikit-learn

In [None]:
# Initialize UV project (creates pyproject.toml)
!cd lambda-sklearn && uv init --name churn-prediction-lambda --no-readme


In [None]:
# Add all dependencies from requirements.in to pyproject.toml
!cd lambda-sklearn && uv add $(grep -v '^#' requirements.in | xargs)

In [None]:
# Generate lock file
!cd lambda-sklearn && uv lock

### Dockerfile

The Dockerfile uses the AWS Lambda Python base image and UV to install dependencies:

- Base image: `public.ecr.aws/lambda/python:3.13`
- UV package manager: Copied from official UV Docker image
- Dependencies: Installed from `pyproject.toml` and `uv.lock` using UV
- Application files: `lambda_function.py` and `model.bin`

In [None]:
%%writefile lambda-sklearn/Dockerfile

# Use the official AWS Lambda base image for Python 3.13
FROM public.ecr.aws/lambda/python:3.13

# Instead of installing 'uv' via curl/pip, we copy the binary directly from its official Docker image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/

# Copy dependency definition files into the container
COPY pyproject.toml uv.lock ./

# Install dependencies system-wide inside the Lambda image
# AWS Lambda does NOT use virtual environments, so packages must go into the system site-packages
# The single command below does two things:
# a. 'uv export' converts 'uv.lock' into requirements.txt format without writing a file.
# b. 'uv pip install' installs those requirements, like pip but faster.
# The '--system' flag tells uv to install libraries globally (into /var/lang/lib/python3.13)
RUN uv pip install --system -r <(uv export --format requirements-txt)

# Copy the application code into the container
COPY lambda_function.py model.bin ./

# Set the entry point to the lambda_handler function
CMD ["lambda_function.lambda_handler"]


**Note:** The diagram below summarizes how the Docker image is assembled:

```text
[ EXTERNAL REGISTRIES ]                        [ LOCAL PROJECT ]
                 |                                     |
 1. BASE IMAGE   |                                     |
 +----------------------------------+                  |
 |  public.ecr.aws/lambda/python    |                  |
 |  (OS + Python 3.13 Runtime)      |                  |
 +---------------+------------------+                  |
                 |                                     |
 2. INJECT TOOLS | (Multi-stage)                       |
 +---------------v------------------+                  |
 | COPY /uv binary from ghcr.io...  |                  |
 | (Fast Python package manager)    |                  |
 +---------------+------------------+                  |
                 |                                     |
 3. DEPENDENCIES |                                     |
 +---------------v------------------+      +-----------v-----------+
 | COPY pyproject.toml & uv.lock    |<-----|  Dependency Files     |
 | RUN uv pip install --system ...  |      +-----------+-----------+
 | (Installs libs directly to OS)   |                  |
 +---------------+------------------+                  |
                 |                                     |
 4. APP CODE     |                                     |
 +---------------v------------------+      +-----------v-----------+
 | COPY lambda_function.py          |<-----|  Source Code & Model  |
 | COPY model.bin                   |      +-----------------------+
 +---------------+------------------+
                 |
 5. ENTRY POINT  |
 +---------------v------------------+
 | CMD [lambda_handler]             |
 | (Waits for AWS Invoke events)    |
 +----------------------------------+
                 |
        [ FINAL DOCKER IMAGE ]
```

In [None]:
# Build Docker container locally
!cd lambda-sklearn && docker build -t churn-prediction-lambda .

### Local Testing

In [None]:
# To run the Docker container locally, execute this command in your terminal:
# docker run -it --rm -p 8080:8080 churn-prediction-lambda
#
# Note: This is an interactive command that runs in the foreground.
# Run it in a separate terminal/bash session, not in this notebook.

In [None]:
import requests

url = 'http://localhost:8080/2015-03-31/functions/function/invocations'

result = requests.post(url, json=customer_data).json()
print(result)
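
If `requests` is not installed, the same call can be made with the standard library. This is a sketch under that assumption. The helper names `build_invocation_request` and `invoke_local` are hypothetical, and the endpoint path is the one used in the cell above.

```python
import json
import urllib.request


def build_invocation_request(payload, host='localhost', port=8080):
    """Build a POST request for the local Lambda invocation endpoint."""
    url = f'http://{host}:{port}/2015-03-31/functions/function/invocations'
    body = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        url,
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


def invoke_local(payload, timeout=30):
    """Send the payload to the running container and return the decoded JSON."""
    request = build_invocation_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With the container running in another terminal, `invoke_local(customer_data)` should return the same dictionary as the `requests` version.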

## Part 4: AWS Lambda Deployment

### Creating ECR Repository

**Why:** ECR stores Docker images that Lambda pulls to run your function.

Create an Elastic Container Registry (ECR) repository:

```bash
aws ecr create-repository \
  --repository-name "churn-prediction-lambda" \
  --region "us-east-1"
```

### Building and Pushing Docker Image

**Why:** Lambda needs the container image in ECR to deploy the function.

You can build and push the Docker image manually using these commands:

```bash
# Set your ECR URL (from the repository creation response)
ECR_URL="YOUR_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com"

# Login to ECR
aws ecr get-login-password \
  --region "us-east-1" \
  | docker login \
  --username AWS \
  --password-stdin ${ECR_URL}

# Build, tag, and push
REMOTE_IMAGE_TAG="${ECR_URL}/churn-prediction-lambda:v1"

docker build -t churn-prediction-lambda .
docker tag churn-prediction-lambda ${REMOTE_IMAGE_TAG}
docker push ${REMOTE_IMAGE_TAG}
```

**However, this approach has limitations:**
- Requires manually setting the ECR URL each time
- Hard-coded values (region, repository name) that can be forgotten
- Repetitive commands that are error-prone
- No version management for image tags

**Better approach: Automated Script**

The `publish.sh` script (created in the cell below) automates this process and follows these principles:

- **Avoid Hard-Coded Values**: Uses named constants for configuration
- **Reusability**: Can be run with different versions: `./publish.sh 1`, `./publish.sh 2`, etc.
- **Error Handling**: Includes `set -e` to exit on errors
- **Auto-detection**: Automatically gets AWS account ID

This makes the deployment process more reliable, maintainable, and less error-prone.
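
To make the naming scheme concrete, the sketch below composes an image URI from its parts in Python, mirroring how the script combines account ID, region, repository, and version. The function name `build_image_uri` and the account ID `123456789012` are placeholders for illustration only.

```python
def build_image_uri(account_id, region, repository, version, tag_prefix='v'):
    """Compose the full ECR image URI from its parts."""
    registry = f'{account_id}.dkr.ecr.{region}.amazonaws.com'
    return f'{registry}/{repository}:{tag_prefix}{version}'


# Placeholder account ID (assumption); the script looks up the real one via STS.
print(build_image_uri('123456789012', 'us-east-1', 'churn-prediction-lambda', 2))
# prints 123456789012.dkr.ecr.us-east-1.amazonaws.com/churn-prediction-lambda:v2
```

Keeping the parts separate means a change of region or repository touches one value rather than every command.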

In [None]:
%%writefile lambda-sklearn/publish.sh
#!/bin/bash
# Publish Docker image to AWS ECR
# 
# This script builds, tags, and pushes the Docker image to ECR.
# It follows clean code principles with meaningful variable names
# and avoids hard-coded values.

set -e  # Exit on error

# Configuration constants
readonly DEFAULT_REGION="us-east-1"
readonly REPOSITORY_NAME="churn-prediction-lambda"
readonly IMAGE_TAG_PREFIX="v"
readonly LOCAL_IMAGE_NAME="churn-prediction-lambda"

# Get AWS account ID and construct ECR URL
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ECR_BASE_URL="${AWS_ACCOUNT_ID}.dkr.ecr.${DEFAULT_REGION}.amazonaws.com"
ECR_REPOSITORY_URI="${ECR_BASE_URL}/${REPOSITORY_NAME}"

# Generate image tag with version (defaults to v1 if not provided)
IMAGE_VERSION="${1:-1}"
REMOTE_IMAGE_TAG="${ECR_REPOSITORY_URI}:${IMAGE_TAG_PREFIX}${IMAGE_VERSION}"

echo "Building Docker image: ${LOCAL_IMAGE_NAME}"
docker build -t "${LOCAL_IMAGE_NAME}" .

echo "Tagging image: ${REMOTE_IMAGE_TAG}"
docker tag "${LOCAL_IMAGE_NAME}" "${REMOTE_IMAGE_TAG}"

echo "Authenticating Docker to ECR..."
aws ecr get-login-password --region "${DEFAULT_REGION}" \
  | docker login --username AWS --password-stdin "${ECR_BASE_URL}"

echo "Pushing image to ECR: ${REMOTE_IMAGE_TAG}"
docker push "${REMOTE_IMAGE_TAG}"

echo "✓ Successfully published ${REMOTE_IMAGE_TAG}"
echo "Image URI: ${REMOTE_IMAGE_TAG}"

In [None]:
# Make the script executable
!chmod +x lambda-sklearn/publish.sh

**Usage:**

```bash
cd lambda-sklearn
./publish.sh        # Publishes as v1
./publish.sh 2      # Publishes as v2
```

### Creating Lambda Function

#### Using AWS Console

1. Go to AWS Console → Lambda
2. Create function → Container image
3. Name: "churn-prediction-docker"
4. Select your container image
5. Create function
6. Increase timeout to 30 seconds (Configuration → General Configuration → Edit)

#### Using AWS CLI


##### Create IAM Role (if needed)

Lambda needs an IAM role to access AWS services (e.g., CloudWatch for logging).

**Requires IAM permissions.** If the role doesn't exist:

```bash
# Create trust policy - allows Lambda service to assume this role
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create role - defines who can use it
aws iam create-role \
  --role-name lambda-execution-role \
  --assume-role-policy-document file://trust-policy.json

# Attach execution policy - grants CloudWatch logging permissions
aws iam attach-role-policy \
  --role-name lambda-execution-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
```


##### Get IAM Role ARN

Lambda function creation requires the role ARN to assign permissions.

Get the ARN of an existing Lambda execution role:

```bash
ROLE_ARN=$(aws iam get-role --role-name lambda-execution-role --query 'Role.Arn' --output text)
echo "Role ARN: $ROLE_ARN"
```


##### Get ECR Image URI

Lambda needs the full ECR image URI to pull the container image.

Construct the ECR image URI after publishing:

```bash
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ECR_URI="${AWS_ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/churn-prediction-lambda:v1"
echo "ECR URI: $ECR_URI"
```


##### Create Lambda Function

Create the Lambda function with the container image:

```bash
aws lambda create-function \
  --function-name churn-prediction-docker \
  --package-type Image \
  --code ImageUri="${ECR_URI}" \
  --role "${ROLE_ARN}" \
  --timeout 30 \
  --memory-size 512 \
  --region us-east-1
```


### Invoking the Lambda Function

After creating the Lambda function, you can test it and integrate it into applications by invoking it programmatically with Python and boto3.


The `invoke.py` script demonstrates:

1. **Initialize boto3 Lambda client** - Creates a client to interact with AWS Lambda
2. **Prepare customer data** - Formats input in the expected structure
3. **Invoke the function** - Calls Lambda with customer data
4. **Process the response** - Parses and displays prediction results

**Key parameters:**
- `FunctionName`: Specifies which Lambda function to invoke
- `InvocationType='RequestResponse'`: Synchronous invocation (waits for response)
- `Payload`: JSON-encoded input data


In [None]:
%%writefile lambda-sklearn/invoke.py
import boto3
import json

lambda_client = boto3.client('lambda')

customer_data = {
    "customer": {
        "gender": "female",
        "seniorcitizen": 0,
        "partner": "yes",
        "dependents": "no",
        "phoneservice": "no",
        "multiplelines": "no_phone_service",
        "internetservice": "dsl",
        "onlinesecurity": "no",
        "onlinebackup": "yes",
        "deviceprotection": "no",
        "techsupport": "no",
        "streamingtv": "no",
        "streamingmovies": "no",
        "contract": "month-to-month",
        "paperlessbilling": "yes",
        "paymentmethod": "electronic_check",
        "tenure": 1,
        "monthlycharges": 29.85,
        "totalcharges": 29.85
    }
}

response = lambda_client.invoke(
    FunctionName='churn-prediction-docker',
    InvocationType='RequestResponse',
    Payload=json.dumps(customer_data)
)

result = json.loads(response['Payload'].read())
print(json.dumps(result, indent=2))


### Updating Lambda Function

Deploy new code changes without recreating the function.

```bash
# Build and push new image version
REMOTE_IMAGE_TAG="${ECR_URL}/churn-prediction-lambda:v2"

docker build -t churn-prediction-lambda .
docker tag churn-prediction-lambda ${REMOTE_IMAGE_TAG}
docker push ${REMOTE_IMAGE_TAG}

# Update function to use new image
aws lambda update-function-code \
  --function-name churn-prediction-docker \
  --image-uri ${REMOTE_IMAGE_TAG} \
  --region us-east-1
```


## Part 5: TensorFlow/Keras Models with ONNX

Previously we used TF-lite for AWS Lambda. In this workshop, we'll use an alternative - ONNX (Open Neural Network Exchange).

We'll use the same Keras model as before. It was retrained for the newest TF version. You can see the [training process here](https://colab.research.google.com/drive/1GTkGkq1QKOtAL0wiMjYr-LOCpVgsAwyk?usp=sharing).

### Downloading the Model

Create the directory and download the Keras model:


In [None]:
import os

# Create lambda-keras directory
os.makedirs('lambda-keras', exist_ok=True)

# Download the Keras model
!cd lambda-keras && wget https://github.com/DataTalksClub/machine-learning-zoomcamp/releases/download/dl-models/clothing-model-new.keras


### Converting to ONNX

The conversion happens in two steps:

1. **Convert Keras model to TensorFlow SavedModel format**
2. **Convert SavedModel to ONNX format**

#### STEP 1: Convert Keras model to TensorFlow SavedModel format

In [None]:
import sys
print("Python executable:", sys.executable)
print("Python path:", sys.path[0])

In [None]:
# Install TensorFlow (required for model conversion)
# Note: After installation, restart the kernel for the import to work
%pip install tensorflow


In [None]:
# Step 1: Convert Keras model to SavedModel format
# Note: Requires TensorFlow installed and kernel restarted after installation
# Alternative: Skip this step and download the pre-converted ONNX model (see Step 2)

from tensorflow import keras

model = keras.models.load_model('lambda-keras/clothing-model-new.keras')
model.export("lambda-keras/clothing-model-new_savedmodel")

print("Model converted to SavedModel format")


#### STEP 2: Convert SavedModel to ONNX

**Recommended:** Download the pre-converted ONNX model (avoids TensorFlow installation issues):

```bash
cd lambda-keras
wget https://github.com/DataTalksClub/machine-learning-zoomcamp/releases/download/dl-models/clothing-model-new.onnx
```

**Alternative:** Convert manually. First install tf2onnx (install from GitHub as the latest release doesn't support numpy 2):

```bash
pip install git+https://github.com/onnx/tensorflow-onnx.git
```

Then convert:

```bash
python -m tf2onnx.convert \
    --saved-model lambda-keras/clothing-model-new_savedmodel \
    --opset 13 \
    --output lambda-keras/clothing-model-new.onnx
```

**Note:** To avoid version conflicts between TensorFlow and ONNX, consider using Docker for conversion (see README.md for Docker-based conversion steps).

##### Manual Conversion: Install tf2onnx

Install tf2onnx from GitHub (latest release doesn't support numpy 2):


In [None]:
# Install tf2onnx from GitHub
%pip install git+https://github.com/onnx/tensorflow-onnx.git


##### Manual Conversion: Convert SavedModel to ONNX

**⚠️ Version Conflict Warning:** tf2onnx may install an incompatible protobuf version. If you encounter import errors, use Docker for conversion (see README.md) or download the pre-converted model.

Convert the SavedModel format to ONNX format:


In [None]:
# Convert SavedModel to ONNX
!python -m tf2onnx.convert \
    --saved-model lambda-keras/clothing-model-new_savedmodel \
    --opset 13 \
    --output lambda-keras/clothing-model-new.onnx


##### Download Pre-converted Model (Recommended)

If you prefer to skip the conversion, download the pre-converted ONNX model:


In [None]:
# Download pre-converted ONNX model (recommended - avoids TensorFlow installation)
!cd lambda-keras && wget https://github.com/DataTalksClub/machine-learning-zoomcamp/releases/download/dl-models/clothing-model-new.onnx


### Using ONNX Runtime

Now our model is saved in the ONNX format. As with TF-lite, we only need ONNX Runtime to run it.

As in the earlier module, we can't use TensorFlow for preprocessing, so we rely on `keras-image-helper` instead.


In [None]:
%pip install onnxruntime
%pip install keras_image_helper

In [None]:
# Example: Using ONNX Runtime for inference
import onnxruntime as ort
from keras_image_helper import create_preprocessor

# Load ONNX model
onnx_model_path = "lambda-keras/clothing-model-new.onnx"
session = ort.InferenceSession(onnx_model_path, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

# Get the image
preprocessor = create_preprocessor('xception', target_size=(299, 299))
url = 'http://bit.ly/mlbookcamp-pants'
X = preprocessor.from_url(url)

# Make predictions
result = session.run([output_name], {input_name: X})
predictions = result[0][0].tolist()

classes = ['dress', 'hat', 'longsleeve', 'outwear', 'pants', 'shirt', 'shoes', 'shorts', 'skirt', 't-shirt']
dict(zip(classes, predictions))
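
The dictionary above maps each class to a score. If the model outputs raw scores (logits) rather than probabilities, which is an assumption here since this notebook does not state it, a softmax turns them into probabilities and the highest one gives the predicted label. The helper names and the example scores below are made up for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def top_class(classes, scores):
    """Return the most likely class and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best], probabilities[best]


# Made-up scores for illustration, not real model output
example_classes = ['dress', 'pants', 'shirt']
example_scores = [-1.0, 3.0, 0.5]
print(top_class(example_classes, example_scores))
```

If the model already ends with a softmax layer, skip the conversion and take the maximum directly.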


### Lambda Function for ONNX Model

This Lambda function uses ONNX Runtime to serve the converted Keras model. It loads the ONNX model, preprocesses images using `keras-image-helper`, and returns predictions for clothing categories.

In [None]:
%%writefile lambda-keras/lambda_function.py
import onnxruntime as ort
from keras_image_helper import create_preprocessor

preprocessor = create_preprocessor("xception", target_size=(299, 299))

session = ort.InferenceSession(
    "clothing-model-new.onnx", providers=["CPUExecutionProvider"]
)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

classes = [
    "dress",
    "hat",
    "longsleeve",
    "outwear",
    "pants",
    "shirt",
    "shoes",
    "shorts",
    "skirt",
    "t-shirt",
]


def predict(url):
    X = preprocessor.from_url(url)
    result = session.run([output_name], {input_name: X})
    float_predictions = result[0][0].tolist()
    return dict(zip(classes, float_predictions))


def lambda_handler(event, context):
    url = event["url"]
    result = predict(url)
    return result


### Dockerizing the Lambda Function

In [None]:
%%writefile lambda-keras/Dockerfile
FROM public.ecr.aws/lambda/python:3.13

RUN pip install onnxruntime keras-image-helper

COPY clothing-model-new.onnx clothing-model-new.onnx
COPY lambda_function.py ./

CMD ["lambda_function.lambda_handler"]


### Building and Running the Docker Container

Build the Docker image:

```bash
docker build -t clothing-lambda-keras lambda-keras
```

Run it locally:

```bash
docker run -it --rm -p 8080:8080 clothing-lambda-keras
```


### Testing the Lambda Function

Test the function locally:


In [None]:
import requests

url = 'http://localhost:8080/2015-03-31/functions/function/invocations'

request = {
    "url": "http://bit.ly/mlbookcamp-pants"
}

result = requests.post(url, json=request).json()
print(result)


## Part 6: PyTorch Models with ONNX

With PyTorch, we can do the same:
- Convert a model to ONNX
- Serve it with the same code as before

Here's a [model we trained with PyTorch](https://colab.research.google.com/drive/1_kvvbi_msBuTFkkdLxMEpB3mj-Jhh-Bc?usp=sharing).

After training a model in PyTorch, we can export it directly to ONNX:

```python
torch.onnx.export(
    model,
    input,
    onnx_path,
    ...
)
```

### Downloading Pre-converted PyTorch ONNX Model

In [None]:
# Download PyTorch ONNX model
import os
os.makedirs('lambda-pytorch', exist_ok=True)

!cd lambda-pytorch && wget https://github.com/DataTalksClub/machine-learning-zoomcamp/releases/download/dl-models/clothing_classifier_mobilenet_v2_latest.onnx


### Lambda Function for PyTorch ONNX Model

This Lambda function uses ONNX Runtime to serve the PyTorch model. It uses a custom preprocessor for PyTorch's image format.


In [None]:
%%writefile lambda-pytorch/lambda_function.py
import onnxruntime as ort
from keras_image_helper import create_preprocessor
import numpy as np

def preprocess_pytorch(X):
    X = X / 255.0
    mean = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)
    X = X.transpose(0, 3, 1, 2)
    X = (X - mean) / std
    return X.astype(np.float32)

preprocessor = create_preprocessor(preprocess_pytorch, target_size=(224, 224))

session = ort.InferenceSession(
    "clothing_classifier_mobilenet_v2_latest.onnx", 
    providers=["CPUExecutionProvider"]
)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

classes = [
    "dress", "hat", "longsleeve", "outwear", "pants",
    "shirt", "shoes", "shorts", "skirt", "t-shirt"
]

def predict(url):
    X = preprocessor.from_url(url)
    result = session.run([output_name], {input_name: X})
    float_predictions = result[0][0].tolist()
    return dict(zip(classes, float_predictions))

def lambda_handler(event, context):
    url = event["url"]
    result = predict(url)
    return result
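
To see what `preprocess_pytorch` does to the data, the sketch below runs it on a synthetic all-white image batch instead of a downloaded picture (the synthetic input is an assumption made so the example needs no network access). The function body is copied from the cell above. The input is in height-width-channel order, and the output is in the channel-first order used here for the PyTorch model.

```python
import numpy as np


def preprocess_pytorch(X):
    # Same steps as in lambda_function.py above
    X = X / 255.0                                   # scale pixels to [0, 1]
    mean = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)
    X = X.transpose(0, 3, 1, 2)                     # NHWC -> NCHW
    X = (X - mean) / std                            # per-channel normalization
    return X.astype(np.float32)


# Synthetic batch: one 224x224 RGB image with every pixel set to 255
batch = np.full((1, 224, 224, 3), 255, dtype=np.uint8)
out = preprocess_pytorch(batch)

print(out.shape)        # (1, 3, 224, 224)
print(out.dtype)        # float32
print(out[0, :, 0, 0])  # one normalized value per channel
```

The transpose has to happen before the normalization because `mean` and `std` are shaped to broadcast along the channel axis at position 1.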


### Dockerizing the Lambda Function


In [None]:
%%writefile lambda-pytorch/Dockerfile
FROM public.ecr.aws/lambda/python:3.13

RUN pip install onnxruntime keras-image-helper

COPY clothing_classifier_mobilenet_v2_latest.onnx clothing_classifier_mobilenet_v2_latest.onnx
COPY lambda_function.py ./

CMD ["lambda_function.lambda_handler"]


### Building and Running the Docker Container

Build the Docker image:

```bash
docker build -t clothing-lambda-onnx lambda-pytorch
```

Run it locally:

```bash
docker run -it --rm -p 8080:8080 clothing-lambda-onnx
```


### Testing the Lambda Function

Test the function locally:


In [None]:
import requests

url = 'http://localhost:8080/2015-03-31/functions/function/invocations'

request = {
    "url": "http://bit.ly/mlbookcamp-pants"
}

result = requests.post(url, json=request).json()
print(result)
