## Homework

In this homework, we'll deploy the Straight vs Curly Hair Type model we trained in the previous homework.

Download the model files from here:

* [https://github.com/alexeygrigorev/large-datasets/releases/download/hairstyle/hair_classifier_v1.onnx.data](https://github.com/alexeygrigorev/large-datasets/releases/download/hairstyle/hair_classifier_v1.onnx.data)
* [https://github.com/alexeygrigorev/large-datasets/releases/download/hairstyle/hair_classifier_v1.onnx](https://github.com/alexeygrigorev/large-datasets/releases/download/hairstyle/hair_classifier_v1.onnx)

With wget:
```bash
PREFIX="https://github.com/alexeygrigorev/large-datasets/releases/download/hairstyle"
DATA_URL="${PREFIX}/hair_classifier_v1.onnx.data"
MODEL_URL="${PREFIX}/hair_classifier_v1.onnx"
wget ${DATA_URL}
wget ${MODEL_URL}
```
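
If `wget` is not available (for example on Windows), the same two files can be fetched with Python's standard library. This is a minimal sketch, not part of the original homework; the helper names `model_urls` and `download_all` are made up for illustration.

```python
from pathlib import Path
from urllib import request

PREFIX = "https://github.com/alexeygrigorev/large-datasets/releases/download/hairstyle"
FILES = ["hair_classifier_v1.onnx.data", "hair_classifier_v1.onnx"]


def model_urls(prefix=PREFIX, files=FILES):
    """Return (url, local_filename) pairs for the model files."""
    return [(f"{prefix}/{name}", name) for name in files]


def download_all(target_dir="."):
    """Download every model file into target_dir, skipping files that already exist."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for url, name in model_urls():
        path = target / name
        if not path.exists():
            request.urlretrieve(url, str(path))


# download_all()  # uncomment to actually fetch the files
```

Both files should end up in the same directory, since the `.onnx` file refers to its `.onnx.data` companion.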

### Question 1

To be able to use this model, we need to know the names of the input and output nodes.

What's the name of the output?

* `output`  <---
* `sigmoid`
* `softmax`
* `prediction`


In [1]:
!pip install onnxruntime

Collecting onnxruntime
  Downloading onnxruntime-1.23.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (5.1 kB)
Collecting coloredlogs (from onnxruntime)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting flatbuffers (from onnxruntime)
  Downloading flatbuffers-25.9.23-py2.py3-none-any.whl.metadata (875 bytes)
Collecting protobuf (from onnxruntime)
  Downloading protobuf-6.33.2-cp39-abi3-manylinux2014_x86_64.whl.metadata (593 bytes)
Collecting sympy (from onnxruntime)
  Downloading sympy-1.14.0-py3-none-any.whl.metadata (12 kB)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime)
  Downloading humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy->onnxruntime)
  Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB)
Downloading onnxruntime-1.23.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (17.4 MB)

In [2]:
import onnxruntime as ort
import numpy as np

# 1. Load the ONNX model
model_path = 'hair_classifier_v1.onnx'
session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])

# 2. Get Input and Output names
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

print("--- Model Information ---")
print(f"Input name: {input_name}")
print(f"Output name: {output_name}")  
print("-------------------------")

# 3. Make a prediction
# Prepare dummy input data matching the shape (1, 3, 200, 200)
# Note: ONNX Runtime expects numpy arrays, usually float32
dummy_input = np.random.randn(1, 3, 200, 200).astype(np.float32)

# Run inference
result = session.run([output_name], {input_name: dummy_input})

print(f"Prediction result shape: {result[0].shape}")
print(f"Prediction raw output: {result[0]}")

--- Model Information ---
Input name: input
Output name: output
-------------------------
Prediction result shape: (1, 1)
Prediction raw output: [[-9.432108]]


### Preparing the image

You'll need some code for downloading and resizing images. You can use this code:

In [3]:
from io import BytesIO
from urllib import request

from PIL import Image

def download_image(url):
    with request.urlopen(url) as resp:
        buffer = resp.read()
    stream = BytesIO(buffer)
    img = Image.open(stream)
    return img


def prepare_image(img, target_size):
    if img.mode != 'RGB':
        img = img.convert('RGB')
    img = img.resize(target_size, Image.NEAREST)
    return img

In [4]:
!pip install pillow



### Question 2: Target size

Let's download and resize this image:

[https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg](https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg)

Based on the previous homework, what should be the target size for the image?

* 64x64
* 128x128
* 200x200  <--
* 256x256

In [5]:
# 1. Define variables
url = 'https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg'
TARGET_SIZE = (200, 200)  # <--- The answer

# 2. Run functions
img = download_image(url)
img_prepared = prepare_image(img, TARGET_SIZE)

print(f"Final image size: {img_prepared.size}")

Final image size: (200, 200)


Based on Homework 8, the model was designed with an input shape of `(3, 200, 200)`.

Therefore, the target size is `200x200`.

**Explanation**

In the "Model" section of the previous homework, the instructions specified:

* "The shape for input should be `(3, 200, 200)`"
* "Next, create a convolutional layer... `kernel_size=(3,3)`"
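
The link between the `200x200` target size and the `(3, 200, 200)` input shape can be sketched with a synthetic array. This is only an illustration, assuming NumPy is installed; the all-zero array stands in for a resized RGB image.

```python
import numpy as np

TARGET_SIZE = (200, 200)  # (width, height), as passed to img.resize

# A resized RGB image converts to an array of shape (height, width, channels).
width, height = TARGET_SIZE
hwc = np.zeros((height, width, 3), dtype=np.float32)

# The model expects channels first, plus a leading batch dimension.
chw = hwc.transpose((2, 0, 1))
batch = np.expand_dims(chw, axis=0)

print(hwc.shape)    # (200, 200, 3)
print(chw.shape)    # (3, 200, 200)
print(batch.shape)  # (1, 3, 200, 200)
```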

### Question 3

Now we need to turn the image into a NumPy array and pre-process it.

> Tip: Check the previous homework. What was the pre-processing we did there?

After the pre-processing, what's the value in the first pixel, the R channel?

* `-10.73`
* `-1.073`  <--
* `1.073`
* `10.73`


In [6]:
# 1. Define the URL and preprocessing variables
url = 'https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg'
target_size = (200, 200)

# 2. Download and resize (using the functions defined above)
img = download_image(url)
img = prepare_image(img, target_size)

# 3. Convert to Numpy and Pre-process
x = np.array(img, dtype='float32')
x = x / 255.0  # Scale to [0, 1]

# 4. Normalize (R channel specific)
# The image array x is currently shaped (200, 200, 3).
# The Red channel is at index 0 in the last dimension.
R_pixel_value = x[0, 0, 0] 

# values from Homework 8
mean = 0.485
std = 0.229

R_normalized = (R_pixel_value - mean) / std

print(f"Original R value (0-1): {R_pixel_value}")
print(f"Normalized R value: {R_normalized}")

Original R value (0-1): 0.239215686917305
Normalized R value: -1.0732940435409546
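
The printed value can be checked by hand. The scaled value `0.2392...` corresponds to an 8-bit red intensity of 61 (an inference from the output above, since 61 / 255 ≈ 0.2392), and the normalization itself is plain arithmetic:

```python
raw_red = 61  # inferred from the printed value above: 61 / 255 ≈ 0.2392
mean_r = 0.485  # R-channel mean used in the cell above
std_r = 0.229   # R-channel standard deviation used in the cell above

scaled = raw_red / 255.0
normalized = (scaled - mean_r) / std_r

print(round(scaled, 4))      # 0.2392
print(round(normalized, 3))  # -1.073
```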


### Question 4

Now let's apply this model to this image. What's the output of the model?

* `0.09`  <---
* `0.49`
* `0.69`
* `0.89`


In [8]:
def preprocess(img):
    # Convert to numpy array and scale to [0, 1]
    x = np.array(img, dtype='float32') / 255.0
    
    # Normalize with ImageNet stats (from Homework 8)
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    x = (x - mean) / std
    
    # Transpose to (Channels, Height, Width) -> (3, 200, 200)
    x = x.transpose((2, 0, 1))
    
    # Add batch dimension -> (1, 3, 200, 200)
    x = np.expand_dims(x, axis=0)
    
    return x.astype(np.float32)

# 2. Preparation
url = 'https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg'
img = download_image(url)
img_prepared = prepare_image(img, (200, 200))
input_data = preprocess(img_prepared)

# 3. Load Model and Run Inference
model_path = 'hair_classifier_v1.onnx'
session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])

input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

outputs = session.run([output_name], {input_name: input_data})
raw_output = outputs[0][0][0]  # Get the scalar value

# 4. Apply Sigmoid to get Probability
probability = 1 / (1 + np.exp(-raw_output))

print(f"Raw Output (Logit): {raw_output}")
print(f"Probability: {probability}")

Raw Output (Logit): 0.09156627207994461
Probability: 0.5228756070137024
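
The two printed numbers are related by the sigmoid function: the raw model output is a logit, and the sigmoid maps it to a probability. A quick standard-library check, using the logit printed above:

```python
import math


def sigmoid(z):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


logit = 0.09156627207994461  # raw output printed above
print(round(sigmoid(logit), 4))  # 0.5229
print(sigmoid(0.0))              # 0.5
```

The marked answer, `0.09`, corresponds to the raw output rather than to the probability.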


## Prepare the lambda code

Now you need to copy all the code into a separate Python file. You will need this file for the next two questions.

Tip: you can test this file locally with `ipython` or Jupyter Notebook by importing the file and invoking the function from this file.

### Docker

For the next two questions, we'll use a Docker image that we already prepared. This is the Dockerfile that we used for creating the image:
```Dockerfile
FROM public.ecr.aws/lambda/python:3.13

COPY hair_classifier_empty.onnx.data .
COPY hair_classifier_empty.onnx .
```
Note that it uses Python 3.13.
The Docker image is published to [agrigorev/model-2025-hairstyle:v1](https://hub.docker.com/r/agrigorev/model-2025-hairstyle/tags).

A few notes:

* The image already contains a model and it's not the same model as the one we used for questions 1-4.

### Question 5

Download the base image `agrigorev/model-2025-hairstyle:v1`. You can do it with `docker pull`.

So what's the size of this base image?

* 88 Mb
* 208 Mb
* 608 Mb   <-- closest option (actual size: 782 Mb)
* 1208 Mb

You can get this information when running `docker images` - it'll be in the "SIZE" column.



### Question 6

Now let's extend this docker image, install all the required libraries and add the code for lambda.

You don't need to include the model in the image; it's already there. The model file is named `hair_classifier_empty.onnx` and it's in the current workdir of the image (see the Dockerfile above for reference). The provided model requires the same image preprocessing, in terms of target size and rescaling of the value range, as the one used in homework 8.

Now run the container locally.

Score this image: [https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg](https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg)

What's the output from the model?

* -1.0
* -0.10
* 0.10  <--
* 1.0

#### Step 6.1: Create lambda_function.py

In [None]:
import numpy as np
import onnxruntime as ort
from urllib import request
from PIL import Image
from io import BytesIO

# 1. Initialize the model once (Global scope for "Warm Start")
# The instructions say the model is in the current workdir
model_path = 'hair_classifier_empty.onnx'
session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

def download_image(url):
    with request.urlopen(url) as resp:
        buffer = resp.read()
    stream = BytesIO(buffer)
    img = Image.open(stream)
    return img

def prepare_image(img, target_size):
    if img.mode != 'RGB':
        img = img.convert('RGB')
    img = img.resize(target_size, Image.NEAREST)
    return img

def preprocess(img):
    x = np.array(img, dtype='float32') / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype='float32')
    std = np.array([0.229, 0.224, 0.225], dtype='float32')
    x = (x - mean) / std
    x = x.transpose((2, 0, 1))
    return np.expand_dims(x, axis=0).astype(np.float32)

def predict(url):
    img = download_image(url)
    img_prepared = prepare_image(img, (200, 200))
    input_data = preprocess(img_prepared)
    
    outputs = session.run([output_name], {input_name: input_data})
    return float(outputs[0][0][0])

# 2. The Lambda Handler
def lambda_handler(event, context):
    url = event['url']
    result = predict(url)
    return result

#### Step 6.2: Create a Dockerfile

```Dockerfile
# Use the image provided in the homework as the base
FROM agrigorev/model-2025-hairstyle:v1

# Install the libraries that lambda_function.py imports
RUN pip install pillow onnxruntime numpy

# Copy your script into the container
COPY lambda_function.py .

# Set the default command to your handler
CMD [ "lambda_function.lambda_handler" ]
```


#### Step 6.3: Build and Run
Open your terminal and run these commands to build your custom image and start it.

```bash
# 1. Build the image (don't miss the dot at the end)
docker build -t hair-model .

# 2. Run the container
docker run -it --rm -p 8080:8080 hair-model
```
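
Once the container is running, it can be scored by sending a JSON event to it. The sketch below assumes the image exposes the standard AWS Lambda Runtime Interface Emulator endpoint on port 8080, as the AWS Lambda base images do; the helper names `build_request` and `invoke` are illustrative and not part of the homework.

```python
import json
from urllib import request

# Invocation path used by the Lambda Runtime Interface Emulator (assumed here).
ENDPOINT = "http://localhost:8080/2015-03-31/functions/function/invocations"
IMAGE_URL = "https://habrastorage.org/webt/yf/_d/ok/yf_dokzqy3vcritme8ggnzqlvwa.jpeg"


def build_request(image_url, endpoint=ENDPOINT):
    """Build a POST request whose JSON body matches what lambda_handler reads."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return request.Request(endpoint, data=body, headers=headers, method="POST")


def invoke(image_url):
    """Send the event to the running container and return the decoded response."""
    with request.urlopen(build_request(image_url)) as resp:
        return json.loads(resp.read())


# With the container from the previous step running:
# print(invoke(IMAGE_URL))
```

The event carries a single `url` key because that is the only field `lambda_handler` reads.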