# Machine Learning and Deep Learning Model Deployment with Serverless

This notebook covers deploying ML models using AWS Lambda and Docker, including:
- Scikit-Learn model deployment
- Deep Learning models with ONNX (TensorFlow/Keras and PyTorch)

**Video**: https://www.youtube.com/watch?v=sHQaeVm5hT8

## Prerequisites

- AWS Account
- AWS CLI installed and configured
- Docker installed


## Part 1: Training a Scikit-Learn Model

First, we'll train a simple churn prediction model using Scikit-Learn. This model will be deployed to AWS Lambda in the following sections.


In [1]:
import pickle

import pandas as pd
import sklearn

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


print(f'pandas=={pd.__version__}')
print(f'sklearn=={sklearn.__version__}')


pandas==2.3.3
sklearn==1.6.1


In [2]:
# def load_data():
#     """
#     Loads and preprocesses the Telco customer churn dataset.
    
#     Returns:
#         pd.DataFrame: Preprocessed dataframe with cleaned column names and data types.
#     """
#     data_url = 'https://raw.githubusercontent.com/alexeygrigorev/mlbookcamp-code/master/chapter-03-churn-prediction/WA_Fn-UseC_-Telco-Customer-Churn.csv'

#     df = pd.read_csv(data_url)

#     # Normalize column names
#     df.columns = df.columns.str.lower().str.replace(' ', '_')

#     # Normalize categorical values
#     categorical_columns = list(df.dtypes[df.dtypes == 'object'].index)

#     for column in categorical_columns:
#         df[column] = df[column].str.lower().str.replace(' ', '_')

#     # Handle numeric conversion and missing values
#     df.totalcharges = pd.to_numeric(df.totalcharges, errors='coerce')
#     df.totalcharges = df.totalcharges.fillna(0)

#     # Convert churn to binary
#     df.churn = (df.churn == 'yes').astype(int)
    
#     return df


In [3]:
# def train_model(df):
#     """
#     Trains a logistic regression model for churn prediction.
    
#     Args:
#         df (pd.DataFrame): Preprocessed dataframe with features and target.
    
#     Returns:
#         sklearn.pipeline.Pipeline: Trained pipeline with DictVectorizer and LogisticRegression.
#     """
#     numerical_features = ['tenure', 'monthlycharges', 'totalcharges']

#     categorical_features = [
#         'gender',
#         'seniorcitizen',
#         'partner',
#         'dependents',
#         'phoneservice',
#         'multiplelines',
#         'internetservice',
#         'onlinesecurity',
#         'onlinebackup',
#         'deviceprotection',
#         'techsupport',
#         'streamingtv',
#         'streamingmovies',
#         'contract',
#         'paperlessbilling',
#         'paymentmethod',
#     ]

#     y_train = df.churn
#     train_dict = df[categorical_features + numerical_features].to_dict(orient='records')

#     pipeline = make_pipeline(
#         DictVectorizer(),
#         LogisticRegression(solver='liblinear')
#     )

#     pipeline.fit(train_dict, y_train)

#     return pipeline


In [4]:
# def save_model(pipeline, output_file):
#     """
#     Saves a trained pipeline to disk using pickle.
    
#     Args:
#         pipeline: Trained sklearn pipeline to save.
#         output_file (str): Path where the model will be saved.
#     """
#     with open(output_file, 'wb') as f_out:
#         pickle.dump(pipeline, f_out)


In [5]:
# # Load data and train model
# df = load_data()
# pipeline = train_model(df)
# save_model(pipeline, 'model.bin')

# print('Model saved to model.bin')


## Part 2: AWS Lambda Basics

AWS Lambda is a serverless compute service that runs code in response to events. Let's start with a simple Lambda function that returns a mock prediction.
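
A mock handler along the following lines is enough to get started. This is a minimal sketch rather than the exact code from the video: the fixed value `0.56` is an assumption, chosen to match the sample response shown later in this notebook.

```python
def lambda_handler(event, context):
    # 'event' is the invocation payload, already parsed from JSON into a dict.
    # Reading the key mirrors the real handler and fails fast on a bad payload.
    customer = event['customer']

    # Assumption: a hard-coded placeholder instead of a real model prediction
    prediction = 0.56

    return {
        'churn_probability': prediction,
        'churn': prediction >= 0.5,
    }


# Local smoke test: Lambda also passes a context object, which the mock ignores
print(lambda_handler({'customer': {'tenure': 1}}, None))
```

Because the handler is a plain Python function, it can be called directly like this before anything is deployed.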


### Invoking the Lambda Function

Once deployed, you can invoke the Lambda function in several ways:

#### Method 1: AWS CLI

First, create a JSON file with the customer data (e.g., `customer.json`):

```json
{
  "customer": {
    "gender": "female",
    "seniorcitizen": 0,
    "partner": "yes",
    "dependents": "no",
    "phoneservice": "no",
    "multiplelines": "no_phone_service",
    "internetservice": "dsl",
    "onlinesecurity": "no",
    "onlinebackup": "yes",
    "deviceprotection": "no",
    "techsupport": "no",
    "streamingtv": "no",
    "streamingmovies": "no",
    "contract": "month-to-month",
    "paperlessbilling": "yes",
    "paymentmethod": "electronic_check",
    "tenure": 1,
    "monthlycharges": 29.85,
    "totalcharges": 29.85
  }
}
```

Then invoke the function:

```bash
aws lambda invoke --function-name churn_prediction --cli-binary-format raw-in-base64-out --payload file://customer.json --region us-west-2 output.json && cat output.json
```

**Note:** Make sure to specify the `--region` parameter matching the region where your Lambda function is deployed. If you haven't configured AWS CLI defaults, you can also set the region using `aws configure` or by setting the `AWS_DEFAULT_REGION` environment variable.

The response will be saved to `output.json`.
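
The region note above also matters for Python code that calls AWS. The sketch below shows one possible way to pick a region in a script. `resolve_region` is a hypothetical helper, not part of boto3 or the AWS CLI, and it only looks at an explicit argument and the `AWS_DEFAULT_REGION` variable (it does not read `~/.aws/config`).

```python
import os


def resolve_region(explicit=None, fallback='us-west-2'):
    """Hypothetical helper: pick a region for CLI or boto3 calls."""
    if explicit:
        return explicit
    # AWS_DEFAULT_REGION is the environment variable mentioned in the note above
    return os.environ.get('AWS_DEFAULT_REGION') or fallback


print(resolve_region())
```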


#### Method 2: Using boto3 (Python)

You can also invoke the Lambda function programmatically using boto3.

#### Using `aws login` credentials

The code below uses `aws login` credentials. For a simpler alternative, use `aws configure` instead - then boto3 will work automatically without credential loading code. See [troubleshooting guide](aws-docs/troubleshooting.md) for details.

In [None]:
# import boto3
# import json
# import os
# from pathlib import Path


# def load_aws_login_credentials():
#     """
#     Loads credentials from aws login cache.
    
#     Returns:
#         dict: Credentials dict with access_key, secret_key, token, or None if not found.
#     """
#     login_cache_dir = Path.home() / '.aws' / 'login' / 'cache'
#     credential_files = list(login_cache_dir.glob('*.json'))
    
#     if not credential_files:
#         return None
    
#     try:
#         with open(credential_files[0]) as f:
#             creds_data = json.load(f)
#             access_token = creds_data.get('accessToken', {})
#             return {
#                 'aws_access_key_id': access_token.get('accessKeyId'),
#                 'aws_secret_access_key': access_token.get('secretAccessKey'),
#                 'aws_session_token': access_token.get('sessionToken')
#             }
#     except Exception:
#         return None


# # Load credentials once at module level
# _aws_creds = load_aws_login_credentials()
# if _aws_creds:
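#     # boto3 reads upper-case names such as AWS_ACCESS_KEY_ID from the environment
#     os.environ.update({key.upper(): value for key, value in _aws_creds.items() if value})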
#     print("✓ AWS credentials loaded from aws login cache")
# else:
#     print("⚠ Warning: No credentials found in aws login cache. Make sure you've run 'aws login'")

✓ AWS credentials loaded from aws login cache


In [None]:
# def invoke_lambda_function(function_name, payload, region_name='us-west-2'):
#     """
#     Invokes an AWS Lambda function.
    
#     Args:
#         function_name (str): Lambda function name.
#         payload (dict): Request payload.
#         region_name (str): AWS region. Defaults to 'us-west-2'.
    
#     Returns:
#         dict: Lambda function response.
#     """
#     if _aws_creds:
#         session = boto3.Session(
#             aws_access_key_id=_aws_creds['aws_access_key_id'],
#             aws_secret_access_key=_aws_creds['aws_secret_access_key'],
#             aws_session_token=_aws_creds['aws_session_token'],
#             region_name=region_name
#         )
#     else:
#         session = boto3.Session(region_name=region_name)
    
#     lambda_client = session.client('lambda', region_name=region_name)
#     response = lambda_client.invoke(
#         FunctionName=function_name,
#         InvocationType='RequestResponse',
#         Payload=json.dumps(payload)
#     )
#     return json.loads(response['Payload'].read())

#### Alternative: Using `aws configure` (Simpler)

If you run `aws configure` in a terminal instead of `aws login`, boto3 picks up the credentials from `~/.aws/credentials` automatically, so no credential-loading code is needed. See the [troubleshooting guide](aws-docs/troubleshooting.md) for details.


In [None]:
import boto3
import json

def invoke_lambda_function(function_name, payload, region_name='us-east-1'):
    """
    Invokes an AWS Lambda function.
    
    Args:
        function_name (str): Lambda function name.
        payload (dict): Request payload.
        region_name (str): AWS region. Defaults to 'us-east-1'.
    
    Returns:
        dict: Lambda function response.
    """
    lambda_client = boto3.client('lambda', region_name=region_name)
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType='RequestResponse',
        Payload=json.dumps(payload)
    )
    return json.loads(response['Payload'].read())
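
When the function itself raises an exception, the `invoke` call still succeeds and the error details arrive in the payload, with a `FunctionError` field set on the response. The sketch below shows one way to surface that as a Python exception. `parse_invoke_response` is a hypothetical helper, and the response dict is a stand-in built with `io.BytesIO` so the example runs without AWS access.

```python
import io
import json


def parse_invoke_response(response):
    """Hypothetical helper: decode a Lambda invoke response or raise on failure."""
    body = json.loads(response['Payload'].read())
    # boto3 sets 'FunctionError' in the response when the function itself raised
    if 'FunctionError' in response:
        message = body.get('errorMessage', body) if isinstance(body, dict) else body
        raise RuntimeError(f'Lambda error: {message}')
    return body


# Stand-in for a boto3 response (made-up values)
fake_response = {
    'StatusCode': 200,
    'Payload': io.BytesIO(b'{"churn_probability": 0.56, "churn": true}'),
}
print(parse_invoke_response(fake_response))
```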

In [4]:
customer_data = {
    "customer": {
        "gender": "female",
        "seniorcitizen": 0,
        "partner": "yes",
        "dependents": "no",
        "phoneservice": "no",
        "multiplelines": "no_phone_service",
        "internetservice": "dsl",
        "onlinesecurity": "no",
        "onlinebackup": "yes",
        "deviceprotection": "no",
        "techsupport": "no",
        "streamingtv": "no",
        "streamingmovies": "no",
        "contract": "month-to-month",
        "paperlessbilling": "yes",
        "paymentmethod": "electronic_check",
        "tenure": 1,
        "monthlycharges": 29.85,
        "totalcharges": 29.85
    }
}

result = invoke_lambda_function('churn_prediction', customer_data, region_name='us-east-1')
print(json.dumps(result, indent=2))

{
  "churn_probability": 0.56,
  "churn": true
}


**Note:** You can also expose the Lambda function as a web service using API Gateway. See [unit 9.7 about API Gateway](https://github.com/DataTalksClub/machine-learning-zoomcamp/blob/master/09-serverless/07-api-gateway.md) for more details.


## Part 3: AWS Lambda with Docker

Scikit-Learn and its dependencies can exceed Lambda's limits for ZIP deployment packages (50 MB zipped for direct upload, 250 MB unzipped). Container images, which can be up to 10 GB, solve this problem.

We'll set up the Lambda function using UV for dependency management, which provides a modern and efficient way to handle Python packages.
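
To check whether a set of dependencies would fit into a ZIP package at all, the installed size can be measured. The following is a rough sketch under stated assumptions: `directory_size_mb` is a hypothetical helper, and the demo runs on a throwaway directory rather than a real install folder.

```python
import os
import tempfile

UNZIPPED_LIMIT_MB = 250  # Lambda limit for an unzipped ZIP deployment package


def directory_size_mb(path):
    """Sum the size of all regular files under `path`, in megabytes."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total_bytes += os.path.getsize(file_path)
    return total_bytes / (1024 * 1024)


# Demo on a temporary directory; in practice point it at a folder created
# with something like `pip install --target build scikit-learn pandas`
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, 'blob.bin'), 'wb') as f_out:
        f_out.write(b'\0' * (2 * 1024 * 1024))
    size = directory_size_mb(demo_dir)
    print(f'{size:.1f} MB, fits in ZIP package: {size <= UNZIPPED_LIMIT_MB}')
```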

### Loading the Model and Lambda Function Script

In [None]:
import os
os.makedirs('lambda-sklearn', exist_ok=True)  # Create directory if it doesn't exist

In [3]:
%%writefile lambda-sklearn/lambda_function.py
import pickle

with open('model.bin', 'rb') as f_in:
    pipeline = pickle.load(f_in)

def predict_single(customer):
    result = pipeline.predict_proba(customer)[0, 1]
    return float(result)

def lambda_handler(event, context):
    customer = event['customer']
    prediction = predict_single(customer)
    
    return {
        'churn_probability': prediction,
        'churn': prediction >= 0.5
    }


Writing lambda-sklearn/lambda_function.py


### Setting Up Dependencies with UV

We'll use UV (a fast Python package manager) to manage dependencies systematically:

1. Create `requirements.in` - centralized dependency list
2. Initialize UV project - creates `pyproject.toml`
3. Add dependencies to `pyproject.toml` from `requirements.in`
4. Generate lock file - creates `uv.lock` for reproducible builds

In [5]:
%%writefile lambda-sklearn/requirements.in
pandas
scikit-learn

Writing lambda-sklearn/requirements.in


In [8]:
# Initialize UV project (creates pyproject.toml)
!cd lambda-sklearn && uv init --name churn-prediction-lambda --no-readme


Initialized project `churn-prediction-lambda`


In [11]:
# Add all dependencies from requirements.in to pyproject.toml
!cd lambda-sklearn && uv add $(grep -v '^#' requirements.in | xargs)

Using CPython 3.13.9
Creating virtual environment at: .venv
Resolved 11 packages in 1.03s

In [12]:
# Generate lock file
!cd lambda-sklearn && uv lock

Resolved 11 packages in 0.72ms


### Dockerfile

The Dockerfile uses the AWS Lambda Python base image and UV to install dependencies:

- Base image: `public.ecr.aws/lambda/python:3.13`
- UV package manager: Copied from official UV Docker image
- Dependencies: Installed from `pyproject.toml` and `uv.lock` using UV
- Application files: `lambda_function.py` and `model.bin`

In [13]:
%%writefile lambda-sklearn/Dockerfile

# Use the official AWS Lambda base image for Python 3.13
FROM public.ecr.aws/lambda/python:3.13

# Instead of installing 'uv' via curl/pip, we copy the binary directly from its official Docker image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/

# Copy dependency definition files into the container
COPY pyproject.toml uv.lock ./

# Install dependencies system-wide inside the Lambda image
# AWS Lambda does NOT use virtual environments, so packages must go into the system site-packages
RUN uv pip install --system -r <(uv export --format requirements-txt)

# Copy the application code into the container
COPY lambda_function.py model.bin ./

# Set the entry point to the lambda_handler function
CMD ["lambda_function.lambda_handler"]


Writing lambda-sklearn/Dockerfile


**Note:** The diagram below shows how the image is assembled, layer by layer:

```text
[ EXTERNAL REGISTRIES ]                        [ LOCAL PROJECT ]
                 |                                     |
 1. BASE IMAGE   |                                     |
 +----------------------------------+                  |
 |  public.ecr.aws/lambda/python    |                  |
 |  (OS + Python 3.13 Runtime)      |                  |
 +---------------+------------------+                  |
                 |                                     |
 2. INJECT TOOLS | (Multi-stage)                       |
 +---------------v------------------+                  |
 | COPY /uv binary from ghcr.io...  |                  |
 | (Fast Python package manager)    |                  |
 +---------------+------------------+                  |
                 |                                     |
 3. DEPENDENCIES |                                     |
 +---------------v------------------+      +-----------v-----------+
 | COPY pyproject.toml & uv.lock    |<-----|  Dependency Files     |
 | RUN uv pip install --system ...  |      +-----------+-----------+
 | (Installs libs directly to OS)   |                  |
 +---------------+------------------+                  |
                 |                                     |
 4. APP CODE     |                                     |
 +---------------v------------------+      +-----------v-----------+
 | COPY lambda_function.py          |<-----|  Source Code & Model  |
 | COPY model.bin                   |      +-----------------------+
 +---------------+------------------+
                 |
 5. ENTRY POINT  |
 +---------------v------------------+
 | CMD [lambda_handler]             |
 | (Waits for AWS Invoke events)    |
 +----------------------------------+
                 |
        [ FINAL DOCKER IMAGE ]
```

In [15]:
# Build Docker container locally
!cd lambda-sklearn && docker build -t churn-prediction-lambda .

[+] Building 0.3s (2/4)                                          docker:default
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 783B                                       0.0s
 => [internal] load metadata for public.ecr.aws/lambda/python:3.13         0.2s
 => [internal] load metadata for ghcr.io/astral-sh/uv:latest               0.2s
 => [auth] astral-sh/uv:pull token for ghcr.io                             0.0s

### Local Testing

In [22]:
# To run the Docker container locally, execute this command in your terminal:
# docker run -it --rm -p 8080:8080 churn-prediction-lambda
#
# Note: This is an interactive command that runs in the foreground.
# Run it in a separate terminal/bash session, not in this notebook.

In [23]:
import requests

url = 'http://localhost:8080/2015-03-31/functions/function/invocations'

# The churn handler reads event['customer'], so send the customer payload
# (customer_data is the dict defined in Part 2)
result = requests.post(url, json=customer_data).json()
print(result)

**Note:** A payload without the `customer` key, such as the `{"url": ...}` payload used for the image models in Part 5, makes the handler fail with `KeyError: 'customer'`.
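
The handler indexes `event['customer']` directly, so a malformed payload surfaces as a raw `KeyError`. A more defensive handler could validate the event first. The sketch below is one possible shape, not the notebook's actual handler: `extract_customer` is a hypothetical helper, and the model call is replaced by a placeholder so the example runs without `model.bin`.

```python
def extract_customer(event):
    """Hypothetical helper: return the customer dict or raise a descriptive error."""
    customer = event.get('customer') if isinstance(event, dict) else None
    if not isinstance(customer, dict):
        raise ValueError("payload must be a JSON object with a 'customer' object")
    return customer


def lambda_handler(event, context):
    try:
        customer = extract_customer(event)
    except ValueError as error:
        # Return a structured error instead of an unhandled KeyError
        return {'error': str(error)}

    # Placeholder: the real handler would call predict_single(customer) here
    return {'received_features': sorted(customer)}


print(lambda_handler({'url': 'http://bit.ly/mlbookcamp-pants'}, None))
```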


## Part 4: AWS Lambda Deployment

### Creating ECR Repository

First, create an Elastic Container Registry (ECR) repository:

```bash
aws ecr create-repository \
  --repository-name "churn-prediction-lambda" \
  --region "eu-west-1"
```

### Building and Pushing Docker Image

```bash
# Set your ECR URL (from the repository creation response)
ECR_URL="YOUR_ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com"

# Login to ECR
aws ecr get-login-password \
  --region "eu-west-1" \
  | docker login \
  --username AWS \
  --password-stdin ${ECR_URL}

# Build, tag, and push (run from the lambda-sklearn directory)
REMOTE_IMAGE_TAG="${ECR_URL}/churn-prediction-lambda:v1"

docker build -t churn-prediction-lambda .
docker tag churn-prediction-lambda ${REMOTE_IMAGE_TAG}
docker push ${REMOTE_IMAGE_TAG}
```
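
The image reference pushed above follows a fixed pattern: registry host, repository name, and tag. As a small illustration, the sketch below composes it in Python. `ecr_image_uri` is a hypothetical helper and the account ID is a made-up placeholder.

```python
def ecr_image_uri(account_id, region, repository, tag):
    """Hypothetical helper: compose the image URI used for docker tag/push."""
    registry = f'{account_id}.dkr.ecr.{region}.amazonaws.com'
    return f'{registry}/{repository}:{tag}'


# The account ID below is a made-up placeholder
print(ecr_image_uri('123456789012', 'eu-west-1', 'churn-prediction-lambda', 'v1'))
```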

### Creating Lambda Function

1. Go to AWS Console → Lambda
2. Create function → Container image
3. Name: "churn-prediction-docker"
4. Select your container image
5. Create function
6. Increase timeout to 30 seconds (Configuration → General Configuration → Edit)
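
The same function can in principle be created from code instead of the console. The sketch below is a hedged outline using boto3's `create_function`. The helper names, the account ID, and the role ARN are placeholders, and unlike the console flow it assumes an IAM execution role already exists. Only the argument-building part runs here; nothing is sent to AWS.

```python
def build_create_function_kwargs(image_uri, role_arn,
                                 function_name='churn-prediction-docker',
                                 timeout_seconds=30):
    """Hypothetical helper: arguments for boto3's Lambda create_function call."""
    return {
        'FunctionName': function_name,
        'PackageType': 'Image',          # container image instead of a ZIP archive
        'Code': {'ImageUri': image_uri},
        'Role': role_arn,                # execution role the function assumes
        'Timeout': timeout_seconds,      # same value as step 6 above
    }


def create_function(image_uri, role_arn, region_name='eu-west-1'):
    import boto3  # imported lazily so the kwargs helper works without boto3

    client = boto3.client('lambda', region_name=region_name)
    return client.create_function(**build_create_function_kwargs(image_uri, role_arn))


# Placeholder values for illustration only
kwargs = build_create_function_kwargs(
    '123456789012.dkr.ecr.eu-west-1.amazonaws.com/churn-prediction-lambda:v1',
    'arn:aws:iam::123456789012:role/lambda-execution-role',
)
print(kwargs['PackageType'], kwargs['Timeout'])
```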

### Updating Lambda Function

```bash
REMOTE_IMAGE_TAG="${ECR_URL}/churn-prediction-lambda:v2"

docker build -t churn-prediction-lambda .
docker tag churn-prediction-lambda ${REMOTE_IMAGE_TAG}
docker push ${REMOTE_IMAGE_TAG}

aws lambda update-function-code \
  --function-name churn-prediction-docker \
  --image-uri ${REMOTE_IMAGE_TAG} \
  --region eu-west-1
```


## Part 5: TensorFlow/Keras Models with ONNX

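Instead of TensorFlow Lite (TF Lite), we'll use ONNX (Open Neural Network Exchange) to deploy deep learning models. ONNX is framework-agnostic: the same ONNX Runtime serves models exported from both TensorFlow/Keras and PyTorch, so the Lambda image does not need the full training framework.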

### Downloading the Model

```bash
wget https://github.com/DataTalksClub/machine-learning-zoomcamp/releases/download/dl-models/clothing-model-new.keras
```

### Converting to ONNX

The conversion happens in two steps:

1. **Convert Keras model to TensorFlow SavedModel format**
2. **Convert SavedModel to ONNX format**


In [None]:
# Step 1: Convert Keras model to SavedModel format
from tensorflow import keras

model = keras.models.load_model('clothing-model-new.keras')
model.export("clothing-model-new_savedmodel")

print("Model converted to SavedModel format")


**Step 2: Convert SavedModel to ONNX**

This step should be done in a Docker container to avoid version conflicts. The command is:

```bash
python -m tf2onnx.convert \
    --saved-model clothing-model-new_savedmodel \
    --opset 13 \
    --output clothing-model-new.onnx
```

Or download the pre-converted model:

```bash
wget https://github.com/DataTalksClub/machine-learning-zoomcamp/releases/download/dl-models/clothing-model-new.onnx
```


### Using ONNX Runtime

Once we have the ONNX model, we can use ONNX Runtime for inference:


In [None]:
import onnxruntime as ort

def load_onnx_model(model_path):
    """
    Loads an ONNX model and returns the inference session and input/output names.
    
    Args:
        model_path (str): Path to the ONNX model file.
    
    Returns:
        tuple: (session, input_name, output_name)
    """
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    
    inputs = session.get_inputs()
    outputs = session.get_outputs()
    
    input_name = inputs[0].name
    output_name = outputs[0].name
    
    return session, input_name, output_name


# Load the model (if available)
# onnx_model_path = "clothing-model-new.onnx"
# session, input_name, output_name = load_onnx_model(onnx_model_path)


In [None]:
from keras_image_helper import create_preprocessor

def create_image_preprocessor():
    """
    Creates a preprocessor for Xception model input format.
    
    Returns:
        Preprocessor object for image preprocessing.
    """
    preprocessor = create_preprocessor('xception', target_size=(299, 299))
    return preprocessor


# Example usage (if model is available):
# preprocessor = create_image_preprocessor()
# url = 'http://bit.ly/mlbookcamp-pants'
# X = preprocessor.from_url(url)


In [None]:
def predict_clothing_class(session, input_name, output_name, preprocessor, image_url):
    """
    Makes a clothing classification prediction using ONNX model.
    
    Args:
        session: ONNX Runtime inference session.
        input_name (str): Name of the input tensor.
        output_name (str): Name of the output tensor.
        preprocessor: Image preprocessor object.
        image_url (str): URL of the image to classify.
    
    Returns:
        dict: Dictionary mapping class names to prediction probabilities.
    """
    classes = [
        'dress',
        'hat',
        'longsleeve',
        'outwear',
        'pants',
        'shirt',
        'shoes',
        'shorts',
        'skirt',
        't-shirt'
    ]
    
    X = preprocessor.from_url(image_url)
    result = session.run([output_name], {input_name: X})
    predictions = result[0][0].tolist()
    
    return dict(zip(classes, predictions))


# Example usage (if model is available):
# result = predict_clothing_class(session, input_name, output_name, preprocessor, url)
# print(result)
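
If the model's last layer has no softmax activation, the returned values are raw scores rather than probabilities. Assuming that is the case, the sketch below shows how to pick the most likely class and turn the scores into probabilities. The scores are made up for illustration, and `softmax` and `top_class` are hypothetical helpers that use only the standard library.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    # Subtracting the maximum keeps exp() numerically stable
    exps = {name: math.exp(value - highest) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def top_class(scores):
    """Return the class name with the largest score."""
    return max(scores, key=scores.get)


# Made-up scores for illustration, not real model output
example_scores = {'pants': 4.0, 'shorts': 1.0, 'skirt': -2.0}
probabilities = softmax(example_scores)
print(top_class(example_scores), round(probabilities['pants'], 3))
```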


### Lambda Function for ONNX Model

Here's the complete Lambda function for serving the ONNX model:

```python
import onnxruntime as ort
from keras_image_helper import create_preprocessor

preprocessor = create_preprocessor("xception", target_size=(299, 299))

session = ort.InferenceSession(
    "clothing-model-new.onnx", providers=["CPUExecutionProvider"]
)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

classes = [
    "dress",
    "hat",
    "longsleeve",
    "outwear",
    "pants",
    "shirt",
    "shoes",
    "shorts",
    "skirt",
    "t-shirt",
]


def predict(url):
    X = preprocessor.from_url(url)
    result = session.run([output_name], {input_name: X})
    float_predictions = result[0][0].tolist()
    return dict(zip(classes, float_predictions))


def lambda_handler(event, context):
    url = event["url"]
    result = predict(url)
    return result
```

**Dockerfile:**
```dockerfile
FROM public.ecr.aws/lambda/python:3.13

RUN pip install onnxruntime keras-image-helper

COPY clothing-model-new.onnx clothing-model-new.onnx
COPY lambda_function.py ./

CMD ["lambda_function.lambda_handler"]
```


## Part 6: PyTorch Models with ONNX

PyTorch models can also be converted to ONNX and served the same way. The main difference is in the preprocessing format.

### Downloading PyTorch ONNX Model

```bash
wget https://github.com/DataTalksClub/machine-learning-zoomcamp/releases/download/dl-models/clothing_classifier_mobilenet_v2_latest.onnx
```

### PyTorch Preprocessing

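PyTorch models expect channels-first input (NCHW: batch, channels, height, width), whereas Keras uses channels-last (NHWC). The preprocessing below also applies ImageNet mean and standard deviation normalization: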


In [None]:
import numpy as np
from keras_image_helper import create_preprocessor


def preprocess_pytorch(X):
    """
    Preprocesses images for PyTorch models (NCHW format).
    
    Args:
        X: Input image array with shape (1, 224, 224, 3), dtype=float32, values in [0, 255]
    
    Returns:
        np.ndarray: Preprocessed image in NCHW format (batch, channels, height, width)
    """
    # Normalize to [0, 1]
    X = X / 255.0

    # ImageNet normalization constants
    mean = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)

    # Convert NHWC → NCHW
    # from (batch, height, width, channels) → (batch, channels, height, width)
    X = X.transpose(0, 3, 1, 2)

    # Normalize
    X = (X - mean) / std

    return X.astype(np.float32)


# Create preprocessor with PyTorch preprocessing
preprocessor_pytorch = create_preprocessor(preprocess_pytorch, target_size=(224, 224))
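
A quick way to confirm the layout change is to run the preprocessing on a dummy batch and inspect the result. The sketch below repeats the same steps as `preprocess_pytorch` so that it is self-contained. The all-black input is an assumption chosen because its expected output is easy to reason about.

```python
import numpy as np


def preprocess_pytorch(X):
    # Same steps as the function above, repeated so the check is self-contained
    X = X / 255.0
    mean = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)
    X = X.transpose(0, 3, 1, 2)
    return ((X - mean) / std).astype(np.float32)


# Dummy all-black image batch in NHWC layout
dummy = np.zeros((1, 224, 224, 3), dtype=np.float32)
out = preprocess_pytorch(dummy)

# The channel axis should have moved from last to second position
print(out.shape, out.dtype)
# A black pixel maps to -mean/std per channel, e.g. -0.485 / 0.229 for red
print(np.isclose(out[0, 0, 0, 0], -0.485 / 0.229))
```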


### Lambda Function for PyTorch ONNX Model

The Lambda function structure is similar, but uses the PyTorch preprocessor:

```python
import onnxruntime as ort
from keras_image_helper import create_preprocessor
import numpy as np

def preprocess_pytorch(X):
    X = X / 255.0
    mean = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)
    X = X.transpose(0, 3, 1, 2)
    X = (X - mean) / std
    return X.astype(np.float32)

preprocessor = create_preprocessor(preprocess_pytorch, target_size=(224, 224))

session = ort.InferenceSession(
    "clothing_classifier_mobilenet_v2_latest.onnx", 
    providers=["CPUExecutionProvider"]
)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

classes = [
    "dress", "hat", "longsleeve", "outwear", "pants",
    "shirt", "shoes", "shorts", "skirt", "t-shirt"
]

def predict(url):
    X = preprocessor.from_url(url)
    result = session.run([output_name], {input_name: X})
    float_predictions = result[0][0].tolist()
    return dict(zip(classes, float_predictions))

def lambda_handler(event, context):
    url = event["url"]
    result = predict(url)
    return result
```


## Summary

This notebook covered:

1. **Scikit-Learn Model Deployment**
   - Training a churn prediction model
   - Basic AWS Lambda function creation
   - Docker containerization to overcome size limitations
   - Pushing the image to ECR and deploying it as a Lambda container image

2. **Deep Learning Model Deployment with ONNX**
   - Converting TensorFlow/Keras models to ONNX format
   - Serving PyTorch models exported to ONNX format
   - Using ONNX Runtime for efficient inference
   - Docker-based deployment for deep learning models

### Key Takeaways

- **Docker solves size limitations**: Scikit-Learn and its dependencies can exceed Lambda's ZIP package limits (50 MB zipped, 250 MB unzipped)
- **ONNX is a better alternative to TF Lite**: One runtime serves models from both TensorFlow/Keras and PyTorch
- **Local testing is crucial**: Test Docker containers locally before deploying
- **ONNX Runtime is lightweight**: Inference needs only ONNX Runtime, not the full TensorFlow or PyTorch package

### Next Steps

- Explore API Gateway integration for web service exposure
- Implement monitoring and logging with CloudWatch
- Consider using AWS Step Functions for complex ML workflows
- Explore other serverless services like AWS Batch for training
