# BentoML TensorFlow2 MNIST Tutorial

Link to source code: https://github.com/bentoml/BentoML/tree/main/examples/tensorflow2_keras/

The code is based on the TensorFlow2 example code here: https://www.tensorflow.org/tutorials/quickstart/advanced

Install required dependencies:

In [None]:
%pip install -r requirements.txt

If you are running macOS, use the following pip command instead:

In [None]:
%pip install -r requirements-macos.txt

## Define the model

First, let's load the MNIST dataset and define the model that we will train.

In [None]:
import tensorflow as tf

from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model

import bentoml

print("TensorFlow version:", tf.__version__)

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(60000, 28, 28, 1).astype("float32") / 255
x_test = x_test.reshape(10000, 28, 28, 1).astype("float32") / 255

# Reserve 10,000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]


class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation="relu")
        self.flatten = Flatten()
        self.d1 = Dense(128, activation="relu")
        self.d2 = Dense(10)

    @tf.function(input_signature=[tf.TensorSpec([None, 28, 28, 1], tf.float32)])
    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)


# Create an instance of the model
model = MyModel()
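
The reshaping, scaling, and hold-out split in the cell above can be checked on synthetic data without downloading MNIST. This is a minimal sketch: the array sizes and the `demo_` names are assumptions chosen for illustration, not part of the tutorial code.

```python
import numpy as np

# Synthetic stand-in for MNIST: 100 grayscale 28x28 images with uint8 pixels.
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
demo_labels = rng.integers(0, 10, size=100)

# Add a trailing channel axis and scale pixel values into [0, 1].
demo_x = demo_images.reshape(-1, 28, 28, 1).astype("float32") / 255

# Hold out the last 20 samples for validation (the tutorial holds out 10,000).
n_val = 20
demo_x_val, demo_y_val = demo_x[-n_val:], demo_labels[-n_val:]
demo_x_train, demo_y_train = demo_x[:-n_val], demo_labels[:-n_val]

print(demo_x_train.shape, demo_x_val.shape, demo_x.dtype)
```

Slicing from the end keeps the training and validation sets disjoint, which is what makes the validation metrics meaningful.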

In [None]:
model(x_test[0:1])
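
Because the last layer is `Dense(10)` with no activation, the call above returns raw logits of shape `(1, 10)` rather than probabilities. A minimal NumPy sketch of turning such logits into probabilities and a predicted digit follows; the `softmax` helper and the example logits are hypothetical and only illustrate the shape of the output.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical logits for one image; real values come from `model(x_test[0:1])`.
example_logits = np.array(
    [[0.1, -1.2, 0.3, 2.5, 0.0, -0.7, 0.4, 1.1, -0.2, 0.6]], dtype="float32"
)

probs = softmax(example_logits)
predicted_digit = int(probs.argmax(axis=-1)[0])
print(probs.shape, predicted_digit)
```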

In [None]:
model.compile(
    optimizer=tf.keras.optimizers.Adam(),  # Optimizer
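    # Loss function to minimize. The final Dense(10) layer has no softmax,
    # so the model outputs logits and the loss must be told so.
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),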
    # List of metrics to monitor
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
)
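
Sparse categorical cross-entropy takes integer class labels rather than one-hot vectors. When the model outputs logits, the per-example loss is the negative log-softmax at the true class index. The sketch below reimplements that in NumPy for illustration only; the function name is hypothetical and this is not the Keras implementation.

```python
import numpy as np


def sparse_categorical_crossentropy_from_logits(logits, labels):
    """Mean negative log-probability of the true class, computed from logits."""
    logits = np.asarray(logits, dtype="float64")
    labels = np.asarray(labels)
    # Log-softmax, shifted by the row maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class for every example.
    per_example = -log_probs[np.arange(len(labels)), labels]
    return per_example.mean()


# Uniform logits over 10 classes give a loss of ln(10) regardless of the labels.
uniform_loss = sparse_categorical_crossentropy_from_logits(np.zeros((2, 10)), [3, 7])

# A confident, correct prediction gives a much smaller loss.
confident_logits = np.zeros((1, 10))
confident_logits[0, 0] = 10.0
confident_loss = sparse_categorical_crossentropy_from_logits(confident_logits, [0])

print(uniform_loss, confident_loss)
```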

## Training and saving the model

Next, we train the model with the built-in Keras `fit` method and then save it with BentoML.

### Training the model

We train for two epochs with `model.fit`, which also evaluates the model on the validation split at the end of each epoch.

In [None]:
history = model.fit(
    x_train,
    y_train,
    batch_size=64,
    epochs=2,
    # Pass validation data to monitor the
    # validation loss and metrics
    # at the end of each epoch
    validation_data=(x_val, y_val),
)

### Saving the model

Finally, a single call to `bentoml.tensorflow.save_model` saves the trained TensorFlow model to the local BentoML model store, where the prediction service we create next can load it.

In [None]:
bentoml.tensorflow.save_model(
    "tensorflow_mnist",
    model,
    signatures={"__call__": {"batchable": True, "batch_dim": 0}},
)
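
The signature above marks `__call__` as batchable along dimension 0, which means inputs from separate requests may be combined along the first axis before the model runs. The sketch below is a conceptual illustration of that idea in NumPy, not BentoML's implementation; `run_batched` and `fake_model` are hypothetical names.

```python
import numpy as np


def run_batched(model_fn, inputs, batch_dim=0):
    """Concatenate inputs along batch_dim, run once, and split the outputs back."""
    sizes = [arr.shape[batch_dim] for arr in inputs]
    batch = np.concatenate(inputs, axis=batch_dim)
    outputs = model_fn(batch)
    # Split points are the cumulative sizes, excluding the final total.
    split_points = np.cumsum(sizes)[:-1]
    return np.split(outputs, split_points, axis=batch_dim)


def fake_model(batch):
    """Stand-in for the real model: 10 values per example, each the mean pixel."""
    means = batch.mean(axis=(1, 2, 3))
    return np.tile(means[:, None], (1, 10))


# Two pending inputs of different sizes: one image of zeros, three images of ones.
pending_inputs = [
    np.zeros((1, 28, 28, 1), dtype="float32"),
    np.ones((3, 28, 28, 1), dtype="float32"),
]
results = run_batched(fake_model, pending_inputs)
print([r.shape for r in results])
```

Each caller gets back only the rows that correspond to its own input, which is why the model's input and output must share the same batch dimension.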

## Create a BentoML Service for serving the model

Note: we use `%%writefile` here because the `bentoml.Service` instance must be defined in a separate `.py` file.

Even though we have only one model, we can create as many API endpoints as we want. Here we define a single endpoint, `predict_image`.

In [None]:
%%writefile service.py

import bentoml
import numpy as np
from bentoml.io import Image, NumpyNdarray
from PIL.Image import Image as PILImage

mnist_runner = bentoml.tensorflow.get("tensorflow_mnist:latest").to_runner()

svc = bentoml.Service(
    name="tensorflow_mnist_demo",
    runners=[mnist_runner],
)

@svc.api(input=Image(), output=NumpyNdarray(dtype="float32"))
async def predict_image(f: PILImage) -> "np.ndarray":
    assert isinstance(f, PILImage)
    arr = np.array(f)/255.0
    assert arr.shape == (28, 28)

    # We are using a greyscale image, and our TensorFlow model expects
    # extra batch and channel dimensions
    arr = np.expand_dims(arr, (0, 3)).astype("float32") # reshape to [1, 28, 28, 1]
    return await mnist_runner.async_run(arr)
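
The preprocessing inside `predict_image` can be exercised on its own, without a running service. The following is a minimal sketch of that step; `to_model_input` is a hypothetical helper name, and it raises an error instead of using `assert` so the check survives optimized Python runs.

```python
import numpy as np


def to_model_input(pixels: np.ndarray) -> np.ndarray:
    """Turn a (28, 28) greyscale array into a (1, 28, 28, 1) float32 batch."""
    if pixels.shape != (28, 28):
        raise ValueError(f"expected a (28, 28) image, got {pixels.shape}")
    scaled = pixels / 255.0
    # Add a batch axis at position 0 and a channel axis at position 3.
    return np.expand_dims(scaled, (0, 3)).astype("float32")


blank = np.zeros((28, 28), dtype=np.uint8)
batch = to_model_input(blank)
print(batch.shape, batch.dtype)
```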

Start a development server to test the service defined above:

In [None]:
!bentoml serve service.py:svc --reload

With the server running, you can use a command such as:

`curl -H "Content-Type: multipart/form-data" -F'fileobj=@samples/0.png;type=image/png' http://127.0.0.1:3000/predict_image`
    
to send an image to the digit recognition service.
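
The same request can be built from Python using only the standard library. This sketch assumes that the `Image` input descriptor also accepts a raw image body with an `image/png` content type; if it does not in your BentoML version, fall back to the multipart form used by the `curl` command. `build_predict_request` is a hypothetical helper.

```python
import urllib.request


def build_predict_request(
    image_bytes: bytes, host: str = "http://127.0.0.1:3000"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the predict_image endpoint."""
    return urllib.request.Request(
        url=f"{host}/predict_image",
        data=image_bytes,
        headers={"Content-Type": "image/png"},  # assumption: raw body is accepted
        method="POST",
    )


# Sending the request requires the dev server to be running:
# with open("samples/0.png", "rb") as f:
#     req = build_predict_request(f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```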

We can also do a simple local benchmark if [locust](https://locust.io/) is installed:

In [None]:
!locust --headless -u 500 -r 10 --run-time 10m --host http://127.0.0.1:3000

## Build a Bento for distribution and deployment

A `bentofile.yaml` already exists in this directory for building a Bento for the service:

```yaml
service: "service:svc"
description: "file: ./README.md"
labels:
  owner: bentoml-team
  stage: demo
include:
- "*.py"
exclude:
- "tests/"
python:
  lock_packages: False
  packages:
    - tensorflow
    - Pillow
```

Note that the `exclude` field keeps `tests/` out of the Bento.

Run `bentoml build` from the current directory to build a Bento with the latest version of the `tensorflow_mnist` model. The first run may take a while because BentoML has to resolve all dependency versions:

In [None]:
!bentoml build

Start a server from the built Bento:

In [None]:
!bentoml serve tensorflow_mnist_demo:latest