## serve

In this notebook we serve a version of the trained model using a Spell model server.

The [hyperparameter search results page](https://web.spell.ml/spell-org/hyper-searches/24) lists per-model metrics for the best models to come out of that search. Of these, the model from [run 952](https://web.spell.ml/spell-org/runs/952) performed best, so we'll grab it and turn it into a server.

In [1]:
!mkdir -p ../server/

In [2]:
%%writefile ../server/serve.py
import torch
from torch import nn

# Inlining the model definition in the server code for simplicity. In a production setting, we
# recommend creating a model module and importing that instead.
class CIFAR10Model(nn.Module):
    def __init__(
        self,
        conv1_filters=32, conv1_dropout=0.25,
        conv2_filters=64, conv2_dropout=0.25,
        dense_layer=512, dense_dropout=0.5
    ):
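        super().__init__()
        # Store the hyperparameters as attributes; the layer definitions below read them.
        self.conv1_filters = conv1_filters
        self.conv1_dropout = conv1_dropout
        self.conv2_filters = conv2_filters
        self.conv2_dropout = conv2_dropout
        self.dense_layer = dense_layer
        self.dense_dropout = dense_dropout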
        self.cnn_block_1 = nn.Sequential(*[
            nn.Conv2d(3, self.conv1_filters, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(self.conv1_filters, self.conv2_filters, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Dropout(self.conv1_dropout)
        ])
        self.cnn_block_2 = nn.Sequential(*[
            nn.Conv2d(self.conv2_filters, self.conv2_filters, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(self.conv2_filters, self.conv2_filters, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Dropout(self.conv2_dropout)
        ])
        self.flatten = nn.Flatten()  # flattens every dimension except the batch dimension
        self.head = nn.Sequential(*[
            # Two 2x2 max-pools reduce a 32x32 CIFAR-10 image to 8x8 feature maps.
            nn.Linear(self.conv2_filters * 8 * 8, self.dense_layer),
            nn.ReLU(),
            nn.Dropout(self.dense_dropout),
            nn.Linear(self.dense_layer, 10)
        ])
    
    def forward(self, X):
        X = self.cnn_block_1(X)
        X = self.cnn_block_2(X)
        X = self.flatten(X)
        X = self.head(X)
        return X


from spell.serving import BasePredictor
class Predictor(BasePredictor):
    def __init__(self):
        self.clf = CIFAR10Model()
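        # map_location="cpu" lets the checkpoint load even if it was saved from a GPU.
        self.clf.load_state_dict(
            torch.load("/model/checkpoints/epoch_20.pth", map_location="cpu")
        )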
        self.clf.eval()

    def predict(self, payload):
        # Placeholder response; the loaded model is not yet used for inference.
        return "Hello World!"

Writing ../server/serve.py
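
The `conv2_filters * 8 * 8` input size of the first `nn.Linear` layer follows from the layer shapes. The sketch below reproduces that arithmetic in plain Python, assuming 32x32 CIFAR-10 inputs and PyTorch's default `MaxPool2d` stride (equal to the kernel size). The helper `conv2d_out` is a hypothetical name introduced for illustration, not part of the server code.

```python
def conv2d_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32  # assumption: CIFAR-10 images are 32x32 pixels

for _block in range(2):  # cnn_block_1, then cnn_block_2
    size = conv2d_out(size, kernel=3, padding=1)  # first 3x3 conv keeps the size
    size = conv2d_out(size, kernel=3, padding=1)  # second 3x3 conv keeps the size
    size = conv2d_out(size, kernel=2, stride=2)   # MaxPool2d(kernel_size=2) halves it

conv2_filters = 64  # default value in CIFAR10Model.__init__
flat_features = conv2_filters * size * size

print(size, flat_features)
```

If the model were trained on a different input resolution, the `8 * 8` factor in the head would have to change accordingly.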

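The `predict` method above is only a stub. Since the model's final layer emits 10 logits, one plausible next step is turning those logits into a labeled prediction. The following is a minimal sketch of that post-processing in plain Python. It assumes the model was trained with the standard CIFAR-10 label order; `CIFAR10_CLASSES`, `softmax`, and `logits_to_prediction` are hypothetical names, and the response format is an illustration rather than anything Spell requires.

```python
import math

# Assumption: standard CIFAR-10 label order (index -> class name).
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_prediction(logits):
    """Map a list of 10 logits to the most likely class and its probability."""
    if len(logits) != len(CIFAR10_CLASSES):
        raise ValueError(f"expected {len(CIFAR10_CLASSES)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"class": CIFAR10_CLASSES[best], "confidence": probs[best]}


# Made-up logits for illustration; index 3 ("cat") has the largest value.
example = logits_to_prediction([0.1, 0.2, 0.0, 3.5, 0.3, 1.0, 0.2, 0.1, 0.0, 0.4])
print(example["class"])
```

Inside a predictor, the logits for a single image could be obtained from the model output with something like `self.clf(batch)[0].tolist()`, where `batch` is a hypothetical preprocessed input tensor.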

In [4]:
!spell model create cifar10 runs/952

In [13]:
!spell server serve \
    --serving-group t4-node-group-prod \
    --github-ref server \
    --min-pods 1 --max-pods 1 \
    --validate \
    -- cifar10:v1 ../server/serve.py

✨ Preparing uncommitted changes…
Enumerating objects: 17, done.
Counting objects: 100% (17/17), done.
Delta compression using up to 12 threads
Compressing objects: 100% (11/11), done.
Writing objects: 100% (14/14), 3.39 KiB | 3.39 MiB/s, done.
Total 14 (delta 5), reused 0 (delta 0)
To git.spell.ml:spell-org/85c7c7fd7d33215a68be4489dec82505ae6908ad.git
 * [new branch]      HEAD -> br_5a81d8b8a4e7c7963c84ed7701b2dc7796313974
💫 Starting server cifar10…

In [8]:
!spell owner

  aleksey
  spellrun
  external-gcp
➔ spell-org
  external-aws