In [None]:
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import sklearn
import sys
import tensorflow as tf
from tensorflow import keras
import time

In [None]:
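assert sys.version_info >= (3, 5) # Python ≥3.5 required
# Compare numerically: a plain string comparison would misorder e.g. "10.0" and "2.0".
assert tuple(int(p) for p in tf.__version__.split(".")[:2]) >= (2, 0)  # TensorFlow ≥2.0 required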

In [None]:
tf.__version__

![](https://pbs.twimg.com/media/C4vf8SQUcAALCyl.jpg)

# Download fashion-MNIST data

Then prepare the training, validation, and test sets.

In [None]:
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_train_full = X_train_full / 255.
X_test = X_test / 255.
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]

In [None]:
X_train = X_train.reshape(-1,28,28,1)
X_valid = X_valid.reshape(-1,28,28,1)
X_test = X_test.reshape(-1,28,28,1)

# Define and train the convolutional neural network for image classification

The model is deliberately small: I don't have a GPU on my laptop, and test accuracy is not the point of this notebook.

In [None]:
model = keras.models.Sequential([
    keras.layers.Conv2D(8, kernel_size=3, activation='relu', padding='same', input_shape=(28,28,1)),
    keras.layers.MaxPool2D(pool_size=(2, 2)),
    keras.layers.Conv2D(16, kernel_size=3, activation='relu', padding='same'),
    keras.layers.MaxPool2D(pool_size=(2, 2)),
    keras.layers.Conv2D(16, kernel_size=3, activation='relu', padding='same'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax')
])


model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd",
              metrics=["accuracy"])

In [None]:
model.summary()

In [None]:
model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid))

In [None]:
model.evaluate(X_test, y_test)

# Save the model

We have trained our model and now want to serve it with TensorFlow Serving. Before starting the server, however, we have to save the model.

Because we may use multiple model architectures and train the same architecture several times, each saved model needs a unique version number. Newer models should get larger version numbers, because by default TensorFlow Serving serves the model with the highest version.

In [None]:
all_models_path = 'models'
MODEL_NAME = "fashion_mnist_conv"

You can use the current timestamp as the model version. That way the newest model always gets the highest version number.

In [None]:
model_version = ###
model_path = os.path.join(all_models_path, MODEL_NAME, str(model_version))
os.makedirs(model_path)

In [None]:
model_version
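
A minimal sketch of the timestamp idea, assuming whole-second resolution is fine enough to keep versions distinct. TensorFlow Serving expects each version directory to be named with an integer, so the timestamp is truncated to an `int`. The helper name `make_version_dir` is hypothetical, and the demonstration writes to a temporary directory so that running it does not touch the real `models` folder.

```python
import os
import tempfile
import time


def make_version_dir(base_dir, model_name, version=None):
    """Create <base_dir>/<model_name>/<version> and return (version, path)."""
    if version is None:
        # Seconds since the epoch: later runs get larger numbers.
        version = int(time.time())
    path = os.path.join(base_dir, model_name, str(version))
    os.makedirs(path, exist_ok=True)  # do not fail when the cell is re-run
    return version, path


# Demonstration in a throwaway directory (an assumption of this sketch;
# the notebook itself uses all_models_path = 'models').
demo_base = tempfile.mkdtemp()
demo_version, demo_path = make_version_dir(demo_base, "fashion_mnist_conv")
print(demo_version, demo_path)
```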

TensorFlow 2.0 provides an easy way to [save](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/saved_model/save) a `tf.keras` model.

In [None]:
###
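
One possible way to fill in the cell above, sketched under the assumption that TensorFlow 2.x with `tf.keras` is installed: `tf.saved_model.save` takes the object to export and the target directory. The wrapper name `export_for_serving` is hypothetical, and TensorFlow is imported inside it only so that the sketch can be defined without loading the library.

```python
def export_for_serving(model, export_dir):
    """Write `model` to `export_dir` in the SavedModel format."""
    # Imported here (an assumption of this sketch) to keep the block light;
    # in the notebook `tf` is already imported at the top.
    import tensorflow as tf

    tf.saved_model.save(model, export_dir)
    return export_dir


# In the notebook this would be called as:
# export_for_serving(model, model_path)
```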

# CLI to inspect and execute SavedModel

You can use the [SavedModel Command Line Interface (CLI)](https://www.tensorflow.org/guide/saved_model#cli_to_inspect_and_execute_savedmodel) to inspect and execute a SavedModel. For example, you can use the CLI to inspect the model's SignatureDefs. The CLI enables you to quickly confirm that the input Tensor dtype and shape match the model. Moreover, if you want to test your model, you can use the CLI to do a sanity check by passing in sample inputs in various formats (for example, Python expressions) and then fetching the output.

## Overview of commands

The SavedModel CLI supports the following two commands on a MetaGraphDef in a SavedModel:

 - show, which shows the computations available on a MetaGraphDef in a SavedModel.
 - run, which runs a computation on a MetaGraphDef.


### show command

A SavedModel contains one or more MetaGraphDefs, identified by their tag-sets. To serve a model, you may want to know which SignatureDefs each model contains and what their inputs and outputs are. The show command lets you examine the contents of the SavedModel in hierarchical order. Here's the syntax:

```bash
saved_model_cli show [-h] --dir DIR [--all] [--tag_set TAG_SET] [--signature_def SIGNATURE_DEF_KEY]
```
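
The syntax above can be exercised from Python by assembling the argument list first. This is a sketch: the helper `show_command` and the model path are hypothetical, and the values `serve` and `serving_default` are the tag-set and signature key that TensorFlow assigns by default when a Keras model is saved, which may differ for models exported another way.

```python
import shlex


def show_command(model_dir, tag_set=None, signature_def=None, show_all=False):
    """Build the argument list for `saved_model_cli show`."""
    cmd = ["saved_model_cli", "show", "--dir", model_dir]
    if show_all:
        cmd.append("--all")
    if tag_set is not None:
        cmd += ["--tag_set", tag_set]
    if signature_def is not None:
        cmd += ["--signature_def", signature_def]
    return cmd


model_dir = "models/fashion_mnist_conv/1"  # placeholder path for illustration
examples = (
    show_command(model_dir),                    # list the tag-sets
    show_command(model_dir, tag_set="serve"),   # list the signature keys
    show_command(model_dir, tag_set="serve",
                 signature_def="serving_default"),  # inputs and outputs
    show_command(model_dir, show_all=True),     # everything at once
)
for cmd in examples:
    print(" ".join(shlex.quote(part) for part in cmd))
```

Each printed line can be pasted after `!` in a notebook cell, or the list can be passed to `subprocess.run(cmd, check=True)`.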

**Try different saved_model_cli invocations**

In [None]:
###

In [None]:
###

In [None]:
###

In [None]:
###

### run command

Invoke the run command to run a graph computation, passing inputs and then displaying (and optionally saving) the outputs. Here's the syntax:

```bash
saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
                           SIGNATURE_DEF_KEY [--inputs INPUTS]
                           [--input_exprs INPUT_EXPRS]
                           [--input_examples INPUT_EXAMPLES] [--outdir OUTDIR]
                           [--overwrite] [--tf_debug]
```

The run command provides the following three ways to pass inputs to the model:

 - *inputs* option enables you to pass numpy ndarrays stored in files.
 - *input_exprs* option enables you to pass Python expressions.
 - *input_examples* option enables you to pass tf.train.Example.

Here we will use the *inputs* option.

To pass input data in files, specify the --inputs option, which takes the following general format:

```bash
--inputs <input_key>=<filename>
```

**Input layer name**

In order to pass the testing data to our trained model, we have to know the name of its input layer and pass it to *saved_model_cli* as *input_key*.

In [None]:
input_name = ###
input_name

**Prepare small testing dataset**

We want to test our model. Take 3 images from the testing dataset and [save them](https://docs.scipy.org/doc/numpy/reference/generated/numpy.save.html) to a file, because *saved_model_cli* takes a *filename* as its argument.

In [None]:
X_query = ###
y_query = ###
###
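
A sketch of the save step, using random stand-in data so that it runs on its own; in the notebook the arrays would be slices such as `X_test[:3]` and `y_test[:3]`. The file name `query_images.npy` is hypothetical. The cast to `float32` is an assumption: the images were scaled with floating-point division and are therefore `float64`, while a Keras model typically declares `float32` inputs, so casting avoids relying on an implicit conversion.

```python
import os
import tempfile

import numpy as np

# Stand-in for X_test[:3]: three 28x28 single-channel images in [0, 1).
rng = np.random.RandomState(0)
X_demo = rng.rand(3, 28, 28, 1).astype(np.float32)

query_path = os.path.join(tempfile.mkdtemp(), "query_images.npy")
np.save(query_path, X_demo)  # writes one array in NumPy's .npy format

# Reload to confirm that shape and dtype survive the round trip.
X_loaded = np.load(query_path)
print(X_loaded.shape, X_loaded.dtype)
```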

**saved_model_cli run**

Specify arguments and run testing data.

In [None]:
###
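
A `run` invocation can be assembled from the pieces described in this section. In this sketch the helper `run_command`, the model path, the input key `conv2d_input`, and the file name are all placeholders; the real key is whatever the `show` command reports for the signature.

```python
import shlex


def run_command(model_dir, input_key, npy_file,
                tag_set="serve", signature_def="serving_default"):
    """Build the argument list for `saved_model_cli run` with a .npy input."""
    return [
        "saved_model_cli", "run",
        "--dir", model_dir,
        "--tag_set", tag_set,
        "--signature_def", signature_def,
        "--inputs", "{}={}".format(input_key, npy_file),
    ]


cmd = run_command("models/fashion_mnist_conv/1",  # placeholder version dir
                  "conv2d_input",                 # placeholder input key
                  "query_images.npy")             # placeholder file name
print(" ".join(shlex.quote(part) for part in cmd))
```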

# Prepare docker server with our trained model

One of the easiest ways to serve machine learning models is to use TensorFlow Serving with Docker. Docker is a tool that packages software into units called containers, which include everything needed to run the software.

In the following subsection we will prepare a Docker container that serves our model and request classifications for the testing data.

First, we have to start a Docker container from the proper image. We can do it in two steps.


1. Download the docker image
```bash
sudo docker pull tensorflow/serving
```

2. Run the image
```bash
sudo docker run -it --rm -p 8501:8501 \
   -v "`pwd`/models/fashion_mnist_conv:/models/fashion_mnist_conv" \
   -e MODEL_NAME=fashion_mnist_conv \
   tensorflow/serving
```

### REST API

TensorFlow ModelServer also supports [RESTful APIs](https://www.tensorflow.org/tfx/serving/api_rest).

The request and the response are each a JSON object. The composition of this object depends on the request type or verb.

Below we show how to use the REST API with TensorFlow Serving, and then build an example client that sends test images to the Docker container and receives the classification answer.

In [None]:
import json
import requests

#### [Model status API](https://www.tensorflow.org/tfx/serving/api_rest#model_status_api)

This API returns the status of a model in the ModelServer.


```bash
GET http://host:port/v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]
```

*/versions/${MODEL_VERSION}* is optional. If omitted, the status for **all** versions is returned in the response.

In [None]:
SERVER_URL = ###

response = requests.get(SERVER_URL)
response.raise_for_status()
response = response.json()

response
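
The status response can be reduced to the list of versions that are ready to serve. The sketch below assumes the response layout given in the REST API documentation, a `model_version_status` list whose entries carry `version` and `state` fields; the sample dictionary and the helper `available_versions` are made up for illustration and are not real server output.

```python
def available_versions(status_json):
    """Return the version numbers whose state is AVAILABLE, sorted ascending."""
    entries = status_json.get("model_version_status", [])
    return sorted(
        int(entry["version"])
        for entry in entries
        if entry.get("state") == "AVAILABLE"
    )


# Hand-written sample in the documented layout (not captured from a server).
sample_status = {
    "model_version_status": [
        {"version": "2", "state": "AVAILABLE",
         "status": {"error_code": "OK", "error_message": ""}},
        {"version": "1", "state": "END",
         "status": {"error_code": "OK", "error_message": ""}},
    ]
}
print(available_versions(sample_status))
```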

#### [Model Metadata API](https://www.tensorflow.org/tfx/serving/api_rest#model_metadata_api)

This API returns the metadata of a model in the ModelServer.

```bash
GET http://host:port/v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]/metadata
```

*/versions/${MODEL_VERSION}* is optional. If omitted, the model metadata for the **latest** version is returned in the response.

In [None]:
SERVER_URL = ###

response = requests.get(SERVER_URL)
response.raise_for_status()
response = response.json()

response

#### [Predict API](https://www.tensorflow.org/tfx/serving/api_rest#predict_api)

This API closely follows the PredictionService.Predict gRPC API.

```bash
POST http://host:port/v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]:predict
```

*/versions/${MODEL_VERSION}* is optional. If omitted, the **latest** version is used.
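
The status, metadata, and predict endpoints share one URL pattern, which a small helper can assemble. This is a sketch: `rest_url` is a hypothetical name, and the defaults `localhost` and `8501` are assumptions that match the `docker run` command used earlier.

```python
def rest_url(model_name, suffix="", version=None,
             host="localhost", port=8501):
    """Build a TensorFlow Serving REST URL.

    suffix: "" for status, "/metadata" for metadata, ":predict" for predict.
    """
    url = "http://{}:{}/v1/models/{}".format(host, port, model_name)
    if version is not None:
        url += "/versions/{}".format(version)
    return url + suffix


print(rest_url("fashion_mnist_conv"))
print(rest_url("fashion_mnist_conv", "/metadata"))
print(rest_url("fashion_mnist_conv", ":predict", version=1))
```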


**Request format**

The request body for the predict API must be a JSON object formatted as follows:

```python
{
  // (Optional) Serving signature to use.
  // If unspecified, the default serving signature is used.
  "signature_name": <string>,

  // Input Tensors in row ("instances") or columnar ("inputs") format.
  // A request can have either of them but NOT both.
  "instances": <value>|<(nested)list>|<list-of-objects>
  "inputs": <value>|<(nested)list>|<object>
}
```

**Examples**

1. Row representation

```python
{
 "instances": [
   {
     "tag": "foo",
     "signal": [1, 2, 3, 4, 5],
     "sensor": [[1, 2], [3, 4]]
   },
   {
     "tag": "bar",
     "signal": [3, 4, 1, 2, 5],
     "sensor": [[4, 5], [6, 8]]
   }
 ]
}
```

2. Columnar representation

```python
{
 "inputs": {
   "tag": ["foo", "bar"],
   "signal": [[1, 2, 3, 4, 5], [3, 4, 1, 2, 5]],
   "sensor": [[[1, 2], [3, 4]], [[4, 5], [6, 8]]]
 }
}
```

**Prepare the json with input data**

We already created a small array with 3 test images. Serialize them to JSON (in whichever representation you prefer) and post the JSON to the server.

In [None]:
input_data_json = ###
print(input_data_json[:200] + "..." + input_data_json[-200:])
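
A sketch of building the request body in the row (`instances`) format. NumPy arrays are not JSON-serializable, so they are converted with `tolist()` first. Random stand-in data is used here; in the notebook the array would be the three query images.

```python
import json

import numpy as np

rng = np.random.RandomState(0)
X_demo = rng.rand(3, 28, 28, 1)  # stand-in for the three query images

request_body = json.dumps({
    "signature_name": "serving_default",  # optional; this is the default key
    "instances": X_demo.tolist(),         # one nested list per image
})

# Decode again to check that the structure survived serialization.
decoded = json.loads(request_body)
print(len(decoded["instances"]), np.array(decoded["instances"]).shape)
```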

In [None]:
SERVER_URL = ###
            
response = requests.post(SERVER_URL, data=input_data_json)
response.raise_for_status()
response = response.json()

response

In [None]:
y_proba = np.array(response["predictions"])
y_proba.round(2)

In [None]:
np.argmax(y_proba, axis=-1), y_query

#### Prepare the function that queries the server for the whole testing dataset and returns the network accuracy

Compare the result with the test accuracy that we computed earlier.

In [None]:
def query_for_answers(X_test, SERVER_URL, batch_size=16):
    ###

In [None]:
query_for_answers(X_test, SERVER_URL, batch_size=128)
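
One way to approach this exercise is sketched below. It departs from the stub above in two labeled ways: the function is called `served_accuracy` and takes the labels as an extra argument (accuracy cannot be computed without them), and the HTTP call is passed in as `send` so that the batching logic can be checked without a running server. Both names, and the helper `post_json`, are hypothetical.

```python
import json

import numpy as np


def post_json(url, payload):
    """POST a JSON string with `requests` and return the decoded reply."""
    import requests  # imported lazily; only needed when a server is queried

    response = requests.post(url, data=payload)
    response.raise_for_status()
    return response.json()


def served_accuracy(X, y, server_url, batch_size=16, send=post_json):
    """Query the server batch by batch; return the fraction of correct labels."""
    if len(X) == 0:
        raise ValueError("X must contain at least one example")
    predicted = []
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]
        payload = json.dumps({"instances": batch.tolist()})
        reply = send(server_url, payload)
        # Each row of "predictions" holds the class probabilities of one image.
        predicted.append(np.argmax(np.array(reply["predictions"]), axis=-1))
    predicted = np.concatenate(predicted)
    return float(np.mean(predicted == np.asarray(y)))
```

With the container running, a call such as `served_accuracy(X_test, y_test, SERVER_URL, batch_size=128)` would be expected to return a value close to the accuracy reported by `model.evaluate`.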

# Image sources

Images and code fragments used in this notebook come from the following web pages:

1. https://github.com/ageron/tf2_course/blob/master/04_deploy_and_distribute_tf2.ipynb
2. https://twitter.com/tensorflow/status/832008382408126464