# BentoML TensorFlow2 MNIST Tutorial

Link to source code: https://github.com/bentoml/gallery/tree/main/tensorflow2_mnist/

The code is based on the TensorFlow2 example code here: https://www.tensorflow.org/tutorials/quickstart/advanced

Install required dependencies:

In [1]:
# !pip install -r requirements.txt
!pip install -r https://raw.githubusercontent.com/bentoml/gallery/main/tensorflow2/requirements.txt --user





If you are running macOS, use the following pip command instead:

In [None]:
!pip install -r requirements-macos.txt

## Define the model

First, let's load and normalize the MNIST dataset, then define the `Model` subclass that we will train.

In [2]:
import tensorflow as tf

from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model

import bentoml

print("TensorFlow version:", tf.__version__)

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channels dimension
x_train = x_train[..., tf.newaxis].astype("float32")
x_test = x_test[..., tf.newaxis].astype("float32")

train_ds = (
    tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)
)

test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)


class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation="relu")
        self.flatten = Flatten()
        self.d1 = Dense(128, activation="relu")
        self.d2 = Dense(10)

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)


# Create an instance of the model
model = MyModel()

TensorFlow version: 2.9.1
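To get a feel for the size of `MyModel`, the layer shapes and parameter counts can be worked out by hand. The sketch below is plain arithmetic rather than a TensorFlow call. It assumes a 28x28x1 input and the Keras defaults for `Conv2D` (`padding="valid"`, stride 1); the helper names `conv2d_valid` and `dense` are made up for this illustration.

```python
# Shape and parameter arithmetic for MyModel, assuming a 28x28x1 input and
# Keras defaults for Conv2D (padding="valid", strides=1).
# conv2d_valid and dense are hypothetical helpers, not TensorFlow functions.


def conv2d_valid(height, width, in_channels, filters, kernel):
    """Return (output shape, parameter count) for a stride-1 'valid' convolution."""
    out_h = height - kernel + 1
    out_w = width - kernel + 1
    params = kernel * kernel * in_channels * filters + filters  # weights + biases
    return (out_h, out_w, filters), params


def dense(in_units, out_units):
    """Return the parameter count of a fully connected layer."""
    return in_units * out_units + out_units  # weights + biases


conv_shape, conv_params = conv2d_valid(28, 28, 1, filters=32, kernel=3)
flat_units = conv_shape[0] * conv_shape[1] * conv_shape[2]
d1_params = dense(flat_units, 128)
d2_params = dense(128, 10)
total_params = conv_params + d1_params + d2_params

print("conv1 output shape:", conv_shape)
print("flattened units:", flat_units)
print("total parameters:", total_params)
```

Almost all of the parameters sit in the first `Dense` layer, because `Flatten` turns the convolution output into one long vector.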


## Training and Saving the model

Next, we create the loss function, the optimizer, and the metrics that track loss and accuracy during training and testing.

In [3]:
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

optimizer = tf.keras.optimizers.Adam()

train_loss = tf.keras.metrics.Mean(name="train_loss")
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name="train_accuracy")

test_loss = tf.keras.metrics.Mean(name="test_loss")
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name="test_accuracy")
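Because the last layer of the model is `Dense(10)` with no activation, the model outputs raw logits, which is why the loss is created with `from_logits=True`. As a rough illustration of what that loss computes, here is a pure-Python sketch of the per-example formula `-log(softmax(logits)[label])`, averaged over a batch. It is not the Keras implementation, and the function names are made up for this example.

```python
import math

# A pure-Python sketch of sparse categorical cross-entropy on logits.
# These helpers are hypothetical and only mirror the formula; they are
# not the Keras implementation.


def sparse_categorical_crossentropy_from_logits(label, logits):
    """Loss for one example: -log(softmax(logits)[label])."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]


def mean_loss(labels, batch_logits):
    """Average the per-example losses over a batch."""
    losses = [
        sparse_categorical_crossentropy_from_logits(label, logits)
        for label, logits in zip(labels, batch_logits)
    ]
    return sum(losses) / len(losses)


# With identical logits for all ten digits the model is guessing uniformly,
# so the loss equals log(10).
uniform_logits = [0.0] * 10
print(sparse_categorical_crossentropy_from_logits(3, uniform_logits))
```

The label is an integer class index (hence "sparse"), not a one-hot vector, which matches the integer labels returned by `mnist.load_data()`.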

### Training and Testing TF Steps

Now we assemble our TensorFlow2 training and testing steps. The `@tf.function` decorator compiles each Python function into a TensorFlow graph; in TensorFlow2 this replaces the explicit sessions used in TensorFlow 1.


In [4]:
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        # training=True is only needed if there are layers with different
        # behavior during training versus inference (e.g. Dropout).
        predictions = model(images, training=True)
        loss = loss_object(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss(loss)
    train_accuracy(labels, predictions)


@tf.function
def test_step(images, labels):
    # training=False is only needed if there are layers with different
    # behavior during training versus inference (e.g. Dropout).
    predictions = model(images, training=False)
    t_loss = loss_object(labels, predictions)

    test_loss(t_loss)
    test_accuracy(labels, predictions)

### Training the model

Following the TensorFlow quickstart, we train the model for five epochs and evaluate it on the test set after each one.

In [5]:
EPOCHS = 5

for epoch in range(EPOCHS):
    # Reset the metrics at the start of the next epoch
    train_loss.reset_states()
    train_accuracy.reset_states()
    test_loss.reset_states()
    test_accuracy.reset_states()

    for images, labels in train_ds:
        train_step(images, labels)

    for test_images, test_labels in test_ds:
        test_step(test_images, test_labels)

    print(
        f"Epoch {epoch + 1}, "
        f"Loss: {train_loss.result()}, "
        f"Accuracy: {train_accuracy.result() * 100}, "
        f"Test Loss: {test_loss.result()}, "
        f"Test Accuracy: {test_accuracy.result() * 100}"
    )

Epoch 1, Loss: 0.14150120317935944, Accuracy: 95.79332733154297, Test Loss: 0.0640590563416481, Test Accuracy: 97.93999481201172
Epoch 2, Loss: 0.042139071971178055, Accuracy: 98.67832946777344, Test Loss: 0.0599837489426136, Test Accuracy: 98.13999938964844
Epoch 3, Loss: 0.022173399105668068, Accuracy: 99.30833435058594, Test Loss: 0.05138981714844704, Test Accuracy: 98.5199966430664
Epoch 4, Loss: 0.013054100796580315, Accuracy: 99.56999969482422, Test Loss: 0.05493428558111191, Test Accuracy: 98.36000061035156
Epoch 5, Loss: 0.008808165788650513, Accuracy: 99.7066650390625, Test Loss: 0.05580809339880943, Test Accuracy: 98.3499984741211
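The training loop resets every metric at the start of each epoch because the metric objects accumulate across calls: each `train_loss(loss)` call adds one more batch to a running average. The sketch below imitates that behavior with a small pure-Python class. `RunningMean` is a hypothetical stand-in for `tf.keras.metrics.Mean`, and the batch losses are made-up numbers.

```python
# A hypothetical stand-in for tf.keras.metrics.Mean, to show why the
# training loop resets its metrics at the start of every epoch.


class RunningMean:
    def __init__(self):
        self.reset_states()

    def reset_states(self):
        """Forget everything accumulated so far."""
        self.total = 0.0
        self.count = 0

    def update(self, value):
        """Add one batch-level value to the running average."""
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0


epoch_1_losses = [0.5, 0.25]     # made-up batch losses
epoch_2_losses = [0.125, 0.125]  # made-up batch losses

# Without a reset, the second epoch's result still includes epoch 1.
no_reset = RunningMean()
for loss in epoch_1_losses + epoch_2_losses:
    no_reset.update(loss)

# With a reset, the result describes only the second epoch.
with_reset = RunningMean()
for loss in epoch_1_losses:
    with_reset.update(loss)
with_reset.reset_states()
for loss in epoch_2_losses:
    with_reset.update(loss)

print("epoch 2 without reset:", no_reset.result())
print("epoch 2 with reset:", with_reset.result())
```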


### Saving the model

Finally, a single call to the BentoML library saves the trained TensorFlow model to the local model store, where the prediction service we create next can load it.

In [7]:
# "bentoml.tensorflow.save" is being deprecated; use "bentoml.tensorflow.save_model" instead.
# bentoml.tensorflow.save("tensorflow_mnist", model)
bentoml.tensorflow.save_model("tensorflow_mnist", model)



INFO:tensorflow:Assets written to: C:\Users\hwang\AppData\Local\Temp\tmp129_j3v0bentoml_model_tensorflow_mnist\assets




Model(tag="tensorflow_mnist:qmnbuwh7kgrdx3c4", path="C:\Users\hwang\bentoml\models\tensorflow_mnist\qmnbuwh7kgrdx3c4\")

## Create a BentoML Service for serving the model

Note: we use `%%writefile` here because the `bentoml.Service` instance must be created in a separate `.py` file.

Even though we have only one model, we can create as many API endpoints as we want. Here we create two endpoints, `predict_ndarray` and `predict_image`.

In [21]:
%%writefile service.py

import bentoml
import numpy as np
from bentoml.io import Image, NumpyNdarray
from PIL.Image import Image as PILImage

mnist_runner = bentoml.tensorflow.load_runner(
    "tensorflow_mnist:latest"
)

svc = bentoml.Service(
    name="tensorflow_mnist_demo",
    runners=[
        mnist_runner,
    ],
)


@svc.api(
    input=NumpyNdarray(dtype="float32", enforce_dtype=True),
    output=NumpyNdarray(dtype="float32"),
)
async def predict_ndarray(inp: "np.ndarray") -> "np.ndarray":
    assert inp.shape == (28, 28)
    # The input is a greyscale image, and our TensorFlow model expects
    # one extra channel dimension
    inp = np.expand_dims(inp, 2)
    return await mnist_runner.async_run(inp)


@svc.api(input=Image(), output=NumpyNdarray(dtype="float32"))
async def predict_image(f: PILImage) -> "np.ndarray":
    assert isinstance(f, PILImage)
    arr = np.array(f)/255.0
    assert arr.shape == (28, 28)

    # The input is a greyscale image, and our TensorFlow model expects
    # one extra channel dimension
    arr = np.expand_dims(arr, 2).astype("float32")
    return await mnist_runner.async_run(arr)


Overwriting service.py
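Both endpoints return the model's raw output. Since the final layer is `Dense(10)` without a softmax, those values are logits rather than probabilities, so a client has to convert them itself. The sketch below shows one way to do that in plain Python. It assumes the response is a flat list of ten numbers (if it arrives nested, take the first row), and the logits shown are made up for illustration.

```python
import math

# Turning the service's raw output into a digit and class probabilities.
# Assumes a flat list of ten logits; the values below are made up.


def softmax(logits):
    """Convert logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predicted_digit(logits):
    """Return the index of the largest logit, i.e. the predicted digit."""
    return max(range(len(logits)), key=lambda i: logits[i])


example_logits = [-3.1, -8.0, 0.4, 1.2, -5.5, 9.7, -2.0, -6.3, 0.9, -1.4]

probabilities = softmax(example_logits)
digit = predicted_digit(example_logits)
print("predicted digit:", digit)
print("confidence:", round(probabilities[digit], 4))
```

Taking the argmax of the logits gives the same digit as taking the argmax of the probabilities, so the softmax is only needed when a confidence value is wanted.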


In [10]:
import imageio

im = imageio.imread('./samples/0.png')
print(im.shape)
b = im.tolist()
print(b)

(28, 28)
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 64, 185, 254, 134, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 90, 241, 241, 213, 91, 0, 13, 247, 203, 146, 242, 249, 206, 28, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 214, 253, 179, 179, 249, 84, 80, 89, 14, 0, 23, 168, 253, 135, 30, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 133, 249, 185, 9, 37, 228, 93, 0, 0, 0, 0, 0, 13, 166, 253, 192, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 241, 253, 67, 0, 0, 37, 91, 0, 0, 0, 0, 0, 0, 14, 201, 224, 25, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 241, 184, 11, 0, 0, 0, 0, 0, 

In [9]:
im = imageio.imread('./samples/5.png')
print(im.shape)
b = im.tolist()
print(b)

(28, 28)
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 23, 59, 9, 0, 0, 0, 23, 50, 89, 156, 156, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 162, 229, 254, 220, 214, 214, 214, 230, 247, 253, 253, 253, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 175, 253, 253, 254, 253, 253, 253, 253, 254, 253, 253, 210, 137, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 124, 247, 253, 237, 214, 213, 213, 71, 131, 177, 168, 87, 9, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 61, 229, 253, 237, 58, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 23, 255, 254

Start a development server to test the service defined above:

In [None]:
# http://127.0.0.1:3000
#!bentoml serve service.py:svc --reload
!bentoml serve service.py:svc

Now you can use something like:

`curl -H "Content-Type: multipart/form-data" -F'fileobj=@samples/0.png;type=image/png' http://127.0.0.1:3000/predict_image`
    
to send an image to the digit recognition service.
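The `predict_ndarray` endpoint can be called in a similar way with a 28x28 array instead of an image file. The sketch below builds such a request with the standard library only. It assumes that the `NumpyNdarray` input accepts a JSON-encoded nested list with the `application/json` content type; `build_ndarray_request` is a hypothetical helper. Note that `predict_ndarray` does not divide by 255 the way `predict_image` does, so the sketch scales the pixels on the client side to match the training data.

```python
import json
import urllib.request

# A sketch of a client for the predict_ndarray endpoint.
# Assumption: the endpoint accepts a JSON nested list as its request body.
# build_ndarray_request is a hypothetical helper, not part of BentoML.


def build_ndarray_request(pixels, host="http://127.0.0.1:3000"):
    """Build a POST request carrying a 28x28 image scaled to [0, 1]."""
    scaled = [[value / 255.0 for value in row] for row in pixels]
    body = json.dumps(scaled).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/predict_ndarray",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A blank 28x28 image stands in for the pixel lists printed earlier.
blank_image = [[0] * 28 for _ in range(28)]
request = build_ndarray_request(blank_image)
print(request.get_method(), request.full_url)

# With the dev server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it makes the payload easy to inspect before a server is available.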

We can also do a simple local benchmark if [locust](https://locust.io/) is installed:

In [None]:
!locust --headless -u 100 -r 1000 --run-time 10m --host http://127.0.0.1:3000

## Build a Bento for distribution and deployment

A `bentofile.yaml` already exists in this directory for building a Bento from the service:

```yaml
service: "service:svc"
description: "file: ./README.md"
labels:
  owner: bentoml-team
  stage: demo
include:
- "*.py"
exclude:
- "tests/"
python:
  lock_packages: False
  packages:
    - tensorflow
    - Pillow
```

Note that the `exclude` field keeps `tests/` out of the Bento.

Run `bentoml build` from the current directory to build a Bento with the latest version of the `tensorflow_mnist` model. The first run may take a while because BentoML has to resolve all dependency versions:

In [None]:
!bentoml build

Start a server from the built Bento:

In [None]:
!bentoml serve tensorflow_mnist_demo:latest