# Deploy and Distribute TensorFlow

In this notebook you will learn how to deploy TensorFlow models to TensorFlow Serving (TFS), using the REST API or the gRPC API, and how to train a model across multiple devices.

## Imports

In [1]:
%matplotlib inline

In [2]:
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import sklearn
import sys
import tensorflow as tf
from tensorflow import keras
import time

In [3]:
print("python", sys.version)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)

python 3.7.2 (default, Dec 29 2018, 06:19:36) 
[GCC 7.3.0]
matplotlib 3.1.1
numpy 1.17.4
pandas 0.25.3
sklearn 0.21.3
tensorflow 2.0.0
tensorflow_core.keras 2.2.4-tf


In [4]:
assert sys.version_info >= (3, 5) # Python ≥3.5 required
assert tuple(int(v) for v in tf.__version__.split(".")[:2]) >= (2, 0)  # TensorFlow ≥2.0 required (numeric comparison, so "10.0" is not treated as < "2.0")

![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)

## Exercise 1 – Deploying a Model to TensorFlow Serving

## Save/Load a `SavedModel`

In [5]:
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_train_full = X_train_full / 255.
X_test = X_test / 255.
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]

In [6]:
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd",
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid))

Train on 55000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f5f8857e780>

In [7]:
MODEL_NAME = "my_fashion_mnist"
!rm -rf {MODEL_NAME}

In [8]:
import time

model_version = int(time.time())
model_path = os.path.join(MODEL_NAME, str(model_version))
os.makedirs(model_path)
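
TensorFlow Serving expects a model directory to contain numbered version subdirectories, and by default it serves the highest version it finds, which is why the cell above uses a Unix timestamp as the version number. The sketch below mimics that selection rule on a throwaway directory. `latest_version_dir` is a hypothetical helper written for this illustration, not part of TensorFlow.

```python
import os
import tempfile

def latest_version_dir(model_base_dir):
    """Return the path of the highest-numbered version subdirectory, or None."""
    candidates = [
        name for name in os.listdir(model_base_dir)
        if name.isdigit() and os.path.isdir(os.path.join(model_base_dir, name))
    ]
    if not candidates:
        return None
    # Compare numerically so that "10" ranks above "9"
    return os.path.join(model_base_dir, max(candidates, key=int))

# Demo: two timestamp-like versions plus a directory that is not a version
with tempfile.TemporaryDirectory() as base:
    for name in ("1585547208", "1583614439", "not_a_version"):
        os.makedirs(os.path.join(base, name))
    print(os.path.basename(latest_version_dir(base)))  # prints 1585547208
```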

In [9]:
tf.saved_model.save(model, model_path)

Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: my_fashion_mnist/1585547208/assets


In [10]:
for root, dirs, files in os.walk(MODEL_NAME):
    indent = '    ' * root.count(os.sep)
    print('{}{}/'.format(indent, os.path.basename(root)))
    for filename in files:
        print('{}{}'.format(indent + '    ', filename))

my_fashion_mnist/
    1585547208/
        saved_model.pb
        assets/
        variables/
            variables.index
            variables.data-00000-of-00001


In [11]:
!saved_model_cli show --dir {model_path}

The given SavedModel contains the following tag-sets:
serve


In [12]:
!saved_model_cli show --dir {model_path} --tag_set serve

The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
SignatureDef key: "__saved_model_init_op"
SignatureDef key: "serving_default"


In [13]:
!saved_model_cli show --dir {model_path} --tag_set serve \
                      --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['flatten_input'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 28, 28)
      name: serving_default_flatten_input:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['dense_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 10)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict


In [14]:
!saved_model_cli show --dir {model_path} --all


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['flatten_input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28)
        name: serving_default_flatten_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['dense_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict


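**Warning**: in earlier TensorFlow versions, the method name of the `serving_default` signature could be empty because of [a bug](https://github.com/tensorflow/tensorflow/issues/25235), and the workaround was to use `keras.experimental.export()` instead of `tf.saved_model.save()`. In the output above, `serving_default` reports `tensorflow/serving/predict` (only `__saved_model_init_op` has an empty method name), so the workaround is not needed here. As the error below shows, `keras.experimental.export()` is not available in this TensorFlow version: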

In [16]:
#!rm -rf {MODEL_NAME}
model_path = keras.experimental.export(model, MODEL_NAME).decode("utf-8")
!saved_model_cli show --dir {model_path} --all

AttributeError: module 'tensorflow_core.keras.experimental' has no attribute 'export'

Let's write a few test instances to an `.npy` file so we can easily pass them to our model:

In [29]:
X_new = X_test[:3]
np.save("my_fashion_mnist_tests.npy", X_new, allow_pickle=False)

In [30]:
input_name = model.input_names[0]
input_name

'flatten_input'

And now let's use `saved_model_cli` to make predictions for the instances we just saved:

In [31]:
!saved_model_cli run --dir {model_path} --tag_set serve \
                     --signature_def serving_default    \
                     --inputs {input_name}=my_fashion_mnist_tests.npy

2020-03-07 12:34:39.409555: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-07 12:34:39.437947: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2496000000 Hz
2020-03-07 12:34:39.438558: I tensorflow/compiler/xla/service/service.cc:162] XLA service 0x55919a8ded60 executing computations on platform Host. Devices:
2020-03-07 12:34:39.438579: I tensorflow/compiler/xla/service/service.cc:169]   StreamExecutor device (0): <undefined>, <undefined>
W0307 12:34:39.439529 139625108653888 deprecation.py:323] From /home/thomas/projects/tf2_course/tf2/lib/python3.7/site-packages/tensorflow/python/tools/saved_model_cli.py:339: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.loa

In [32]:
!ls -a

.					 .gitignore
..					 images
01_neural_nets_with_keras.ipynb		 .ipynb_checkpoints
02_low_level_tensorflow_api.ipynb	 LICENSE
03_loading_and_preprocessing_data.ipynb  my_fashion_mnist
04_deploy_and_distribute_tf2.ipynb	 my_fashion_mnist_tests.npy
05_cnns.ipynb				 README.md
06_rnns.ipynb				 requirements.txt
datasets				 tf2
.git


## TensorFlow Serving

Install [Docker](https://docs.docker.com/install/) if you don't have it already. Then run:

```bash
docker pull tensorflow/serving

docker run -it --rm -p 8501:8501 \
   -v "`pwd`/my_fashion_mnist:/models/my_fashion_mnist" \
   -e MODEL_NAME=my_fashion_mnist \
   tensorflow/serving
```

Once you are finished using it, press Ctrl-C to shut down the server.
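
Before sending predictions, it can help to check that the server has finished loading the model. TensorFlow Serving exposes a model status endpoint at `GET /v1/models/<model_name>`. The sketch below parses its response. The helper names (`available_versions`, `fetch_status`) and the sample response values are assumptions made for illustration, and `fetch_status` only works while the container is running.

```python
import json
import urllib.request

STATUS_URL = "http://localhost:8501/v1/models/my_fashion_mnist"

def available_versions(status):
    """Return the version strings whose state is AVAILABLE."""
    return [
        entry["version"]
        for entry in status.get("model_version_status", [])
        if entry.get("state") == "AVAILABLE"
    ]

def fetch_status(url=STATUS_URL, timeout=5.0):
    """Query the model status endpoint (requires the server to be running)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)

# Offline demo with a response shaped like the one the server returns
sample = {"model_version_status": [
    {"version": "1585547208", "state": "AVAILABLE",
     "status": {"error_code": "OK", "error_message": ""}}
]}
print(available_versions(sample))  # prints ['1585547208']
```

With the server up, `available_versions(fetch_status())` should list the version directory created earlier.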

In [34]:
import json

input_data_json = json.dumps({
    "signature_name": "serving_default",
    "instances": X_new.tolist(),
})
print(input_data_json[:200] + "..." + input_data_json[-200:])

{"signature_name": "serving_default", "instances": [[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0,... 0.0, 0.3843137254901961, 0.6235294117647059, 0.2784313725490196, 0.0, 0.0, 0.26666666666666666, 0.6901960784313725, 0.6431372549019608, 0.22745098039215686, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]]}
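
The request above uses the REST API's row format, where `"instances"` holds one entry per example. The API also accepts a columnar format, where `"inputs"` maps each input name to a full batch. The server replies under `"predictions"` for the row format and under `"outputs"` for the columnar format. A minimal sketch of both payloads follows, using tiny 2×2 stand-ins for the 28×28 images (the values are made up) and the `flatten_input` name reported by `saved_model_cli`.

```python
import json

# Two tiny 2x2 stand-ins for the 28x28 images (illustrative values only)
images = [[[0.0, 0.5], [1.0, 0.0]],
          [[0.25, 0.0], [0.0, 0.75]]]

# Row format: one entry per instance; the server answers with "predictions"
row_payload = json.dumps({
    "signature_name": "serving_default",
    "instances": images,
})

# Columnar format: one entry per named input; the server answers with "outputs"
col_payload = json.dumps({
    "signature_name": "serving_default",
    "inputs": {"flatten_input": images},
})

print(row_payload)
print(col_payload)
```

For a model with a single input, the row format is usually the more convenient of the two.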


Now let's use TensorFlow Serving's REST API to make predictions:

In [45]:
import requests

SERVER_URL = 'http://localhost:8501/v1/models/my_fashion_mnist:predict'
            
response = requests.post(SERVER_URL, data=input_data_json)
response.raise_for_status()
response = response.json()
print(response)

{'predictions': [[8.05725267e-06, 5.12280576e-06, 1.4628461e-05, 4.79938217e-06, 7.32980607e-06, 0.157669753, 2.99060212e-05, 0.170503423, 0.00410152553, 0.667655528], [0.000101461497, 1.1052872e-06, 0.930430233, 5.66832e-06, 0.0157977, 5.26175659e-10, 0.0535956919, 2.95285909e-12, 6.81881502e-05, 1.67117764e-10], [1.42899917e-05, 0.999940276, 8.01393344e-06, 1.62423603e-05, 1.92198258e-05, 1.22311022e-10, 8.4890722e-08, 5.79693676e-07, 1.30570345e-06, 4.19321777e-09]]}
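
The URL used above targets whichever version the server is currently serving. The REST API also lets a request pin a specific version by adding `/versions/<n>` before `:predict`. The helper below is a hypothetical convenience function for building either form, not something TensorFlow Serving provides.

```python
def predict_url(model_name, version=None, host="localhost", port=8501):
    """Build the REST predict URL, optionally pinned to one model version."""
    path = "/v1/models/{}".format(model_name)
    if version is not None:
        path += "/versions/{}".format(version)
    return "http://{}:{}{}:predict".format(host, port, path)

print(predict_url("my_fashion_mnist"))
# prints http://localhost:8501/v1/models/my_fashion_mnist:predict
print(predict_url("my_fashion_mnist", version=1585547208))
# prints http://localhost:8501/v1/models/my_fashion_mnist/versions/1585547208:predict
```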


In [36]:
response.keys()

dict_keys(['predictions'])

In [37]:
y_proba = np.array(response["predictions"])
y_proba.round(2)

array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.16, 0.  , 0.17, 0.  , 0.67],
       [0.  , 0.  , 0.93, 0.  , 0.02, 0.  , 0.05, 0.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]])
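
Each row of probabilities can be turned into a class prediction by taking the index of its largest value. The sketch below does this in plain Python on the rounded values printed above. The class names are the standard Fashion MNIST labels, which the notebook itself does not define, so treat `CLASS_NAMES` and `top_class` as additions made for illustration.

```python
# Standard Fashion MNIST label names, indexed by class id
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Rounded probabilities copied from the output above
y_proba_rounded = [
    [0., 0., 0., 0., 0., 0.16, 0., 0.17, 0., 0.67],
    [0., 0., 0.93, 0., 0.02, 0., 0.05, 0., 0., 0.],
    [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
]

def top_class(probabilities):
    """Return (class index, class name) for the most probable class."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index, CLASS_NAMES[index]

for row in y_proba_rounded:
    print(top_class(row))
# prints (9, 'Ankle boot'), then (2, 'Pullover'), then (1, 'Trouser')
```

With NumPy, `y_proba.argmax(axis=1)` gives the same indices in one call.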

### Using Serialized Examples

In [38]:
serialized = []
for image in X_new:
    image_data = tf.train.FloatList(value=image.ravel())
    features = tf.train.Features(
        feature={
            "image": tf.train.Feature(float_list=image_data),
        }
    )
    example = tf.train.Example(features=features)
    serialized.append(example.SerializeToString())

In [39]:
[data[:100]+b'...' for data in serialized]

[b'\n\xd3\x18\n\xd0\x18\n\x05image\x12\xc6\x18\x12\xc3\x18\n\xc0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...',
 b'\n\xd3\x18\n\xd0\x18\n\x05image\x12\xc6\x18\x12\xc3\x18\n\xc0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xd1\xd0P=\x87\x86\x86>\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc9\xc8H>\x99\x98\x18>\x00\x00\x00\x00\x00\x00...',
 b'\n\xd3\x18\n\xd0\x18\n\x05image\x12\xc6\x18\x12\xc3\x18\n\xc0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x81\x80\x80;\x00\x00\x00\x00\x87\x86\x86>\xb2\xb1

In [40]:
def parse_images(serialized):
    expected_features = {
        "image": tf.io.FixedLenFeature([28 * 28], dtype=tf.float32)
    }
    examples = tf.io.parse_example(serialized, expected_features)
    return tf.reshape(examples["image"], (-1, 28, 28))

In [41]:
parse_images(serialized)

<tf.Tensor: id=57824, shape=(3, 28, 28), dtype=float32, numpy=
array([[[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]],

       [[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]],

       [[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]]], dtype=float32)>

In [42]:
serialized_inputs = keras.layers.Input(shape=[], dtype=tf.string)
images = keras.layers.Lambda(parse_images)(serialized_inputs)  # parse_images can be passed directly
y_proba = model(images)
ser_model = keras.models.Model(inputs=[serialized_inputs], outputs=[y_proba])

In [47]:
SER_MODEL_NAME = "my_ser_fashion_mnist"
!rm -rf {SER_MODEL_NAME}
#ser_model_path = keras.experimental.export(ser_model, SER_MODEL_NAME).decode("utf-8")
model_version = int(time.time())
model_path = os.path.join(SER_MODEL_NAME, str(model_version))
os.makedirs(model_path)
tf.saved_model.save(ser_model, model_path)
!saved_model_cli show --dir {model_path} --all

INFO:tensorflow:Assets written to: my_ser_fashion_mnist/1583614439/assets

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_1'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: serving_default_input_1:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['sequential'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict


```bash
docker run -it --rm -p 8500:8500 -p 8501:8501 \
   -v "`pwd`/my_ser_fashion_mnist:/models/my_ser_fashion_mnist" \
   -e MODEL_NAME=my_ser_fashion_mnist \
   tensorflow/serving
```

In [48]:
import base64
import json

ser_input_data_json = json.dumps({
    "signature_name": "serving_default",
    "instances": [{"b64": base64.b64encode(data).decode("utf-8")}
                  for data in serialized],
})
print(ser_input_data_json[:200] + "..." + ser_input_data_json[-200:])

{"signature_name": "serving_default", "instances": [{"b64": "CtMYCtAYCgVpbWFnZRLGGBLDGArAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA...7j4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAxcTEPqCfHz+Pjo4+AAAAAAAAAACJiIg+sbAwP6WkJD/p6Gg+AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="}]}
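
JSON cannot carry raw bytes, so the REST API expects each binary value to be wrapped as `{"b64": "<base64 string>"}`, as in the cell above. The sketch below shows that the encoding is lossless, using placeholder bytes in place of real serialized `tf.train.Example` protos.

```python
import base64
import json

# Placeholder bytes standing in for serialized tf.train.Example protos
serialized_examples = [b"\n\x05image\x00\x01\xff", b"\x00\x00\x80?"]

payload = json.dumps({
    "signature_name": "serving_default",
    "instances": [{"b64": base64.b64encode(data).decode("utf-8")}
                  for data in serialized_examples],
})

# Decoding the "b64" fields recovers the exact original bytes
decoded = [base64.b64decode(item["b64"])
           for item in json.loads(payload)["instances"]]
print(decoded == serialized_examples)  # prints True
```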


In [49]:
import requests

SER_SERVER_URL = 'http://localhost:8501/v1/models/my_ser_fashion_mnist:predict'
            
response = requests.post(SER_SERVER_URL, data=ser_input_data_json)
response.raise_for_status()
response = response.json()

In [50]:
response.keys()

dict_keys(['predictions'])

In [51]:
y_proba = np.array(response["predictions"])
y_proba.round(2)

array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.16, 0.  , 0.17, 0.  , 0.67],
       [0.  , 0.  , 0.93, 0.  , 0.02, 0.  , 0.05, 0.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]])

In [52]:
!python3 -m pip install --no-deps tensorflow-serving-api

You are using pip version 19.0.3, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [53]:
import grpc
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

channel = grpc.insecure_channel('localhost:8500')
predict_service = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = SER_MODEL_NAME
request.model_spec.signature_name = "serving_default"
input_name = ser_model.input_names[0]
request.inputs[input_name].CopyFrom(tf.compat.v1.make_tensor_proto(serialized))

result = predict_service.Predict(request, timeout=10.0)  # fail if no reply within 10 seconds

ModuleNotFoundError: No module named 'tensorflow_serving'

In [36]:
result

NameError: name 'result' is not defined

In [37]:
output_name = ser_model.output_names[0]
output_name

'sequential'

In [38]:
shape = [dim.size for dim in result.outputs[output_name].tensor_shape.dim]
shape

NameError: name 'result' is not defined

In [39]:
y_proba = np.array(result.outputs[output_name].float_val).reshape(shape)
y_proba.round(2)

NameError: name 'result' is not defined
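
The last two cells rely on how a gRPC response stores an output tensor: the values come back as a flat list (`float_val` in the code above) next to a `tensor_shape` giving the dimensions, and the caller reshapes them. The sketch below illustrates that step in plain Python with made-up values. `reshape_flat` is a hypothetical helper that plays the role of `np.array(...).reshape(shape)` for the 2-D case.

```python
def reshape_flat(values, shape):
    """Rebuild a list of rows from a flat list and a (rows, cols) shape."""
    rows, cols = shape
    if rows * cols != len(values):
        raise ValueError(
            "shape {} does not match {} values".format(shape, len(values)))
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]

# Made-up flat values for 3 instances and 2 classes
flat = [0.1, 0.9, 0.8, 0.2, 0.5, 0.5]
print(reshape_flat(flat, (3, 2)))
# prints [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
```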

![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)

## Exercise 2 – Distributed Training

In [40]:
keras.backend.clear_session()

In [41]:
distribution = tf.distribute.MirroredStrategy()

with distribution.scope():
    model = keras.models.Sequential([
        keras.layers.Flatten(input_shape=[28, 28]),
        keras.layers.Dense(100, activation="relu"),
        keras.layers.Dense(10, activation="softmax")
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd",
                  metrics=["accuracy"])



In [42]:
model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid), batch_size=25)

Train on 55000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fab251a8390>
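
With `MirroredStrategy`, the `batch_size` passed to `fit()` is the global batch size: each training step splits that batch across the replicas. The arithmetic below sketches what this means for the 55,000 training samples and `batch_size=25` used above. The replica counts are hypothetical, since the number of devices on the training machine is not shown, and `batch_plan` is a helper written for this illustration.

```python
import math

def batch_plan(num_samples, global_batch_size, num_replicas):
    """Return (steps per epoch, examples per replica per step)."""
    steps_per_epoch = math.ceil(num_samples / global_batch_size)
    per_replica = global_batch_size / num_replicas
    return steps_per_epoch, per_replica

for replicas in (1, 2, 5):
    steps, per_replica = batch_plan(55000, 25, replicas)
    print(replicas, steps, per_replica)
# prints "1 2200 25.0", then "2 2200 12.5", then "5 2200 5.0"
```

The fractional result for two replicas suggests choosing a global batch size that is a multiple of the replica count, so that every replica gets the same number of examples.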