## serve

In this notebook we serve a version of the trained model using a Spell model server.

Judging by the [hyperparameter search results page](https://web.spell.ml/spell-org/hyper-searches/24) and the per-model metrics for the best models from that search, the model from [run 952](https://web.spell.ml/spell-org/runs/952) performed best. We'll grab this model and turn it into a server.

In [1]:
!mkdir -p ../server/

In [1]:
%%writefile ../server/serve.py
import torch
from torch import nn
import torchvision

import numpy as np
import base64
from PIL import Image
import io

from spell.serving import BasePredictor


# Inlining the model definition in the server code for simplicity. In a production setting, we
# recommend creating a model module and importing that instead.
class CIFAR10Model(nn.Module):
    def __init__(
        self,
        conv1_filters=32, conv1_dropout=0.25,
        conv2_filters=64, conv2_dropout=0.25,
        dense_layer=512, dense_dropout=0.5
    ):
        super().__init__()
        self.cnn_block_1 = nn.Sequential(*[
            nn.Conv2d(3, conv1_filters, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(conv1_filters, conv2_filters, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Dropout(conv1_dropout)
        ])
        self.cnn_block_2 = nn.Sequential(*[
            nn.Conv2d(conv2_filters, conv2_filters, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(conv2_filters, conv2_filters, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Dropout(conv2_dropout)
        ])
        self.flatten = lambda inp: torch.flatten(inp, 1)
        self.head = nn.Sequential(*[
            nn.Linear(conv2_filters * 8 * 8, dense_layer),
            nn.ReLU(),
            nn.Dropout(dense_dropout),
            nn.Linear(dense_layer, 10)
        ])

    def forward(self, X):
        X = self.cnn_block_1(X)
        X = self.cnn_block_2(X)
        X = self.flatten(X)
        X = self.head(X)
        return X


class Predictor(BasePredictor):
    def __init__(self):
        self.clf = CIFAR10Model()
        # self.clf.load_state_dict(torch.load("/model/checkpoints/epoch_20.pth"))
        # TODO: use GPU instead of CPU
        self.clf.load_state_dict(torch.load("/model/checkpoints/model_final.pth", map_location="cpu"))
        self.clf.eval()

        self.transform_test = torchvision.transforms.Compose([
            torchvision.transforms.Resize((32, 32)),
            torchvision.transforms.ToTensor(),
            torchvision.transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
        ])

        self.labels = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

    def predict(self, payload):
        img = base64.b64decode(payload['image'])
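        img = Image.open(io.BytesIO(img), formats=[payload['format']])
        # Force 3 channels so grayscale or RGBA inputs match the 3-channel Normalize.
        img = img.convert('RGB')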
        img_tensor = self.transform_test(img)
        # batch_size=1
        img_tensor_batch = img_tensor[np.newaxis]

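        # Inference only, so skip gradient tracking.
        with torch.no_grad():
            scores = self.clf(img_tensor_batch)
        # Convert the 0-dim index tensor to a plain int before indexing the label list.
        class_match_idx = scores.argmax().item()
        class_match = self.labels[class_match_idx]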

        return {'class': class_match}


Overwriting ../server/serve.py
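
The `conv2_filters * 8 * 8` input size of the first `nn.Linear` layer follows from the layer shapes: each 3×3 convolution with `padding=1` preserves height and width, and each `MaxPool2d(kernel_size=2)` halves them, so a 32×32 input shrinks to 8×8 after the two blocks. A minimal pure-Python sketch of that arithmetic (the helper names here are hypothetical and not part of the server code):

```python
def conv2d_out(size, kernel_size=3, padding=1, stride=1):
    """Spatial output size of a convolution (standard formula)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def maxpool2d_out(size, kernel_size=2):
    """Spatial output size of max pooling; stride defaults to kernel_size."""
    return (size - kernel_size) // kernel_size + 1


def flattened_features(input_size=32, conv2_filters=64):
    """Number of features entering the dense head for a square input."""
    size = input_size
    for _ in range(2):  # two CNN blocks
        size = conv2d_out(conv2d_out(size))  # two 3x3 convs with padding=1
        size = maxpool2d_out(size)  # one 2x2 max pool per block
    return conv2_filters * size * size


print(flattened_features())
```

This is also why the server resizes every incoming image to 32×32 before inference: a different input size would change the flattened feature count and no longer match the trained `nn.Linear` weights.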


In [1]:
!spell server serve \
    --serving-group t4-node-group-prod \
    --min-pods 1 --max-pods 1 \
    --pip Pillow==8.0.0 \
    --validate \
    -- cifar10:v1 ../server/serve.py

✨ Preparing uncommitted changes…
Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 12 threads
Compressing objects: 100% (6/6), done.
Writing objects: 100% (7/7), 4.39 KiB | 4.39 MiB/s, done.
Total 7 (delta 2), reused 0 (delta 0)
To git.spell.ml:spell-org/85c7c7fd7d33215a68be4489dec82505ae6908ad.git
 * [new branch]      HEAD -> br_9baf7939e2ce3047895cda7c847e43e0694749a9
💫 Starting server cifar10…
[0m[0m

Encode an example image as a `base64` string and save it to disk as a JSON payload (change the image path to use your own test image):

In [3]:
import base64
from PIL import Image
import json
import io

dog = Image.open("/Users/alekseybilogur/Desktop/dog.jpg")
# dog = dog.resize((32, 32))
img_byte_arr = io.BytesIO()
dog.save(img_byte_arr, format='JPEG')
base64_dog = base64.b64encode(img_byte_arr.getvalue())

with open("../test.json", "w") as fp:
    fp.write(json.dumps({"image": base64_dog.decode('utf8'), "format": "JPEG"}))
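
Before sending the payload, it can help to confirm that it decodes the way the server's `predict` method expects. The following is a rough sketch using only the standard library; `check_payload` is a hypothetical helper, and the JPEG check relies on the fact that JPEG files begin with the bytes `FF D8 FF`:

```python
import base64
import json


def check_payload(payload_json):
    """Return the decoded image bytes, or raise ValueError if the payload is malformed."""
    payload = json.loads(payload_json)
    for key in ("image", "format"):
        if key not in payload:
            raise ValueError(f"missing key: {key}")
    # validate=True rejects characters outside the base64 alphabet.
    raw = base64.b64decode(payload["image"], validate=True)
    # JPEG files begin with the bytes FF D8 FF.
    if payload["format"] == "JPEG" and not raw.startswith(b"\xff\xd8\xff"):
        raise ValueError("image bytes do not look like a JPEG")
    return raw


# Example usage against the file written above (uncomment to run):
# with open("../test.json") as fp:
#     print(len(check_payload(fp.read())), "image bytes")
```

The check is intentionally shallow: it does not open the image, so a payload that passes may still fail inside Pillow on the server.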

In [2]:
# !echo test.json >> ../.gitignore

Try the model server out:

In [4]:
!curl -X POST -H "Content-Type: application/json" \
   --data @../test.json \
   https://spell-org.spell-org.spell.services/spell-org/cifar10/predict

{"class":"Dog"}
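
The same request can be made from Python instead of `curl`. Below is a minimal sketch using only the standard library; `build_request` and `predict` are hypothetical helper names, the URL is copied from the `curl` command above, and the 30-second timeout is an arbitrary assumption:

```python
import json
import urllib.request

# URL copied from the curl command above.
URL = "https://spell-org.spell-org.spell.services/spell-org/cifar10/predict"


def build_request(payload_json, url=URL):
    """Build (but do not send) a POST request carrying the JSON payload."""
    return urllib.request.Request(
        url,
        data=payload_json.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload_json, url=URL, timeout=30):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_request(payload_json, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (requires the server to be running; uncomment to run):
# with open("../test.json") as fp:
#     print(predict(fp.read()))
```

Separating `build_request` from `predict` makes it possible to inspect the method, headers, and body without touching the network.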