# Deploy and Distribute TensorFlow

In this notebook you will learn how to deploy TensorFlow models to TensorFlow Serving (TFS), using the REST API or the gRPC API, and how to train a model across multiple devices.

## Imports

In [46]:
%matplotlib inline

In [47]:
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import sklearn
import sys
import tensorflow as tf
from tensorflow import keras
import time

In [48]:
print("python", sys.version)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)

python 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
matplotlib 3.0.2
numpy 1.15.4
pandas 0.24.1
sklearn 0.20.1
tensorflow 2.0.0-dev20190126
tensorflow.python.keras.api._v2.keras 2.2.4-tf


In [49]:
assert sys.version_info >= (3, 5) # Python ≥3.5 required
assert int(tf.__version__.split(".")[0]) >= 2  # TensorFlow ≥2.0 required (compare numerically, not as strings)

![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)

## Exercise 1 – Deploying a Model to TensorFlow Serving

## Save/Load a `SavedModel`

In [76]:
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_train_full = X_train_full / 255.
X_test = X_test / 255.
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]

In [77]:
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd",
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid))

Train on 55000 samples, validate on 5000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x15f4f6b70>

In [None]:
MODEL_NAME = "my_fashion_mnist"
!rm -rf {MODEL_NAME}

In [None]:
import time

model_version = int(time.time())
model_path = os.path.join(MODEL_NAME, str(model_version))
os.makedirs(model_path)

In [None]:
tf.saved_model.save(model, model_path)

In [None]:
for root, dirs, files in os.walk(MODEL_NAME):
    indent = '    ' * root.count(os.sep)
    print('{}{}/'.format(indent, os.path.basename(root)))
    for filename in files:
        print('{}{}'.format(indent + '    ', filename))
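
TensorFlow Serving expects a model directory to contain one numbered subdirectory per version and, by default, serves the highest version number, which is why the model is saved under a timestamp-based directory above. The following is a minimal sketch of that layout using only the standard library; the helper `latest_version` and the temporary directory are illustrative assumptions, not part of TensorFlow.

```python
import os
import tempfile

def latest_version(model_dir):
    """Return the highest numeric version subdirectory of model_dir, or None."""
    versions = [int(name) for name in os.listdir(model_dir)
                if name.isdigit() and os.path.isdir(os.path.join(model_dir, name))]
    return max(versions) if versions else None

# Hypothetical layout: <tmp>/my_fashion_mnist/1 and <tmp>/my_fashion_mnist/2
base_dir = tempfile.mkdtemp()
demo_model_dir = os.path.join(base_dir, "my_fashion_mnist")
for version in (1, 2):
    os.makedirs(os.path.join(demo_model_dir, str(version)))

# The version a default server configuration would pick
print(latest_version(demo_model_dir))
```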

In [None]:
!saved_model_cli show --dir {model_path}

In [None]:
!saved_model_cli show --dir {model_path} --tag_set serve

In [None]:
!saved_model_cli show --dir {model_path} --tag_set serve \
                      --signature_def serving_default

In [None]:
!saved_model_cli show --dir {model_path} --all

**Warning**: as you can see, the method name is empty. This is [a bug](https://github.com/tensorflow/tensorflow/issues/25235) that will hopefully be fixed shortly. In the meantime, use `keras.experimental.export()` instead of `tf.saved_model.save()`:

In [None]:
!rm -rf {MODEL_NAME}
model_path = keras.experimental.export(model, MODEL_NAME).decode("utf-8")
!saved_model_cli show --dir {model_path} --all

Let's write a few test instances to a `npy` file so we can pass them easily to our model:

In [52]:
X_new = X_test[:3]
np.save("my_fashion_mnist_tests.npy", X_new, allow_pickle=False)

In [53]:
input_name = model.input_names[0]
input_name

'flatten_1_input'

And now let's use `saved_model_cli` to make predictions for the instances we just saved:

In [54]:
!saved_model_cli run --dir {model_path} --tag_set serve \
                     --signature_def serving_default    \
                     --inputs {input_name}=my_fashion_mnist_tests.npy

Traceback (most recent call last):
  File "/Users/a.boyko/anaconda3/envs/ml.crash-course/bin/saved_model_cli", line 11, in <module>
    sys.exit(main())
  File "/Users/a.boyko/anaconda3/envs/ml.crash-course/lib/python3.6/site-packages/tensorflow/python/tools/saved_model_cli.py", line 909, in main
    args.func(args)
  File "/Users/a.boyko/anaconda3/envs/ml.crash-course/lib/python3.6/site-packages/tensorflow/python/tools/saved_model_cli.py", line 643, in run
    init_tpu=args.init_tpu, tf_debug=args.tf_debug)
  File "/Users/a.boyko/anaconda3/envs/ml.crash-course/lib/python3.6/site-packages/tensorflow/python/tools/saved_model_cli.py", line 316, in run_saved_model_with_feed_dict
    (input_key_name, '"' + '", "'.join(inputs_tensor_info.keys()) + '"'))
ValueError: "flatten_1_input" is not a valid input key. Please choose from "flatten_input", or use --show option.

**Note**: the command above fails because `model.input_names[0]` returns the input name of the model currently in memory (`flatten_1_input`), whereas the exported `SavedModel` expects the key `flatten_input`, as the error message says. The in-memory model was presumably rebuilt after the export. Passing `--inputs flatten_input=my_fashion_mnist_tests.npy`, or re-exporting the current model first, should resolve it.


## TensorFlow Serving

Install [Docker](https://docs.docker.com/install/) if you don't have it already. Then run:

```bash
docker pull tensorflow/serving

docker run -it --rm -p 8501:8501 \
   -v "`pwd`/my_fashion_mnist:/models/my_fashion_mnist" \
   -e MODEL_NAME=my_fashion_mnist \
   tensorflow/serving
```

Once you are finished using it, press Ctrl-C to shut down the server.
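
The container publishes port 8501, which is TensorFlow Serving's REST port. Prediction requests go to a URL of the form `/v1/models/<name>:predict`, optionally with `/versions/<n>` before the `:predict` suffix to target a specific version. As a small sketch, the helper below (its name and defaults are illustrative, not part of any library) builds such URLs:

```python
def predict_url(model_name, host="localhost", port=8501, version=None):
    """Build a TensorFlow Serving REST predict URL (illustrative helper)."""
    path = "/v1/models/{}".format(model_name)
    if version is not None:
        # Pin the request to one model version instead of the default one
        path += "/versions/{}".format(version)
    return "http://{}:{}{}:predict".format(host, port, path)

url = predict_url("my_fashion_mnist")
print(url)
print(predict_url("my_fashion_mnist", version=1))
```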

In [55]:
import json

input_data_json = json.dumps({
    "signature_name": "serving_default",
    "instances": X_new.tolist(),
})
print(input_data_json[:200] + "..." + input_data_json[-200:])

{"signature_name": "serving_default", "instances": [[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0,... 0.0, 0.3843137254901961, 0.6235294117647059, 0.2784313725490196, 0.0, 0.0, 0.26666666666666666, 0.6901960784313725, 0.6431372549019608, 0.22745098039215686, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]]}


Now let's use TensorFlow Serving's REST API to make predictions:

In [56]:
import requests

SERVER_URL = 'http://localhost:8501/v1/models/my_fashion_mnist:predict'
            
response = requests.post(SERVER_URL, data=input_data_json)
response.raise_for_status()
response = response.json()

In [57]:
response.keys()

dict_keys(['predictions'])

In [58]:
y_proba = np.array(response["predictions"])
y_proba.round(2)

array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.14, 0.  , 0.22, 0.02, 0.62],
       [0.  , 0.  , 0.93, 0.  , 0.02, 0.  , 0.05, 0.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]])
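
Each row holds one probability per class, so the predicted class is the index of the largest value in the row. A minimal sketch in plain Python, using the rounded probabilities printed above and the standard Fashion MNIST class names:

```python
# Probabilities copied from the rounded output above
y_proba_rounded = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.14, 0.0, 0.22, 0.02, 0.62],
    [0.0, 0.0, 0.93, 0.0, 0.02, 0.0, 0.05, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]

# Standard Fashion MNIST class names, indexed by label
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Index of the largest probability in each row
y_pred = [max(range(len(row)), key=row.__getitem__) for row in y_proba_rounded]
pred_names = [class_names[i] for i in y_pred]
print(y_pred)      # [9, 2, 1]
print(pred_names)  # ['Ankle boot', 'Pullover', 'Trouser']
```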

### Using Serialized Examples

In [59]:
serialized = []
for image in X_new:
    image_data = tf.train.FloatList(value=image.ravel())
    features = tf.train.Features(
        feature={
            "image": tf.train.Feature(float_list=image_data),
        }
    )
    example = tf.train.Example(features=features)
    serialized.append(example.SerializeToString())

In [60]:
[data[:100]+b'...' for data in serialized]

[b'\n\xd3\x18\n\xd0\x18\n\x05image\x12\xc6\x18\x12\xc3\x18\n\xc0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00...',
 b'\n\xd3\x18\n\xd0\x18\n\x05image\x12\xc6\x18\x12\xc3\x18\n\xc0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xd1\xd0P=\x87\x86\x86>\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc9\xc8H>\x99\x98\x18>\x00\x00\x00\x00\x00\x00...',
 b'\n\xd3\x18\n\xd0\x18\n\x05image\x12\xc6\x18\x12\xc3\x18\n\xc0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x81\x80\x80;\x00\x00\x00\x00\x87\x86\x86>\xb2\xb1
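
The byte strings above are protocol buffer messages. As a rough illustration (not a full protobuf parser), the sketch below decodes two fragments copied from that output: the length prefix `\xc0\x18` that precedes the packed float data, and the four bytes `\xd1\xd0P=`, which hold one pixel as a little-endian float32. The helper `decode_varint` is written here for illustration only.

```python
import struct

def decode_varint(data):
    """Decode a base-128 varint (least significant 7-bit group first)."""
    value = 0
    for shift, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * shift)
        if not byte & 0x80:  # a clear high bit marks the last byte
            break
    return value

# Length of the packed float payload in bytes: 28 * 28 floats of 4 bytes each
payload_length = decode_varint(b"\xc0\x18")
print(payload_length, 28 * 28 * 4)

# One pixel from the second image, scaled back to the 0-255 range
(pixel,) = struct.unpack("<f", b"\xd1\xd0P=")
pixel_level = round(pixel * 255)
print(pixel_level)
```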

In [61]:
def parse_images(serialized):
    expected_features = {
        "image": tf.io.FixedLenFeature([28 * 28], dtype=tf.float32)
    }
    examples = tf.io.parse_example(serialized, expected_features)
    return tf.reshape(examples["image"], (-1, 28, 28))

In [62]:
parse_images(serialized)

<tf.Tensor: id=150876, shape=(3, 28, 28), dtype=float32, numpy=
array([[[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]],

       [[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]],

       [[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]]], dtype=float32)>

In [63]:
serialized_inputs = keras.layers.Input(shape=[], dtype=tf.string)
images = keras.layers.Lambda(lambda serialized: parse_images(serialized))(serialized_inputs)
y_proba = model(images)
ser_model = keras.models.Model(inputs=[serialized_inputs], outputs=[y_proba])

In [64]:
SER_MODEL_NAME = "my_ser_fashion_mnist"
!rm -rf {SER_MODEL_NAME}
ser_model_path = keras.experimental.export(ser_model, SER_MODEL_NAME).decode("utf-8")
!saved_model_cli show --dir {ser_model_path} --all


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: init_1
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_1'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_1:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['sequential_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: sequential_1/dense_3/Softmax:0
  Method name is: tensorflow/serving/predict


```bash
docker run -it --rm -p 8500:8500 -p 8501:8501 \
   -v "`pwd`/my_ser_fashion_mnist:/models/my_ser_fashion_mnist" \
   -e MODEL_NAME=my_ser_fashion_mnist \
   tensorflow/serving
```

In [65]:
import base64
import json

ser_input_data_json = json.dumps({
    "signature_name": "serving_default",
    "instances": [{"b64": base64.b64encode(data).decode("utf-8")}
                  for data in serialized],
})
print(ser_input_data_json[:200] + "..." + ser_input_data_json[-200:])

{"signature_name": "serving_default", "instances": [{"b64": "CtMYCtAYCgVpbWFnZRLGGBLDGArAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA...7j4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAxcTEPqCfHz+Pjo4+AAAAAAAAAACJiIg+sbAwP6WkJD/p6Gg+AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="}]}
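
JSON cannot carry raw bytes, so the REST API accepts binary values wrapped as `{"b64": "<base64 string>"}`. The round trip can be sketched with the standard library alone; the input is the first 12 bytes of a serialized example copied from the output shown earlier, and its encoding matches the start of the `b64` string above.

```python
import base64
import json

# First 12 bytes of a serialized Example, copied from the earlier output
raw = b"\n\xd3\x18\n\xd0\x18\n\x05imag"

encoded = base64.b64encode(raw).decode("utf-8")
print(encoded)  # CtMYCtAYCgVpbWFn

# Wrap it the way the request does, pass it through JSON, then undo both steps
instance = json.loads(json.dumps({"b64": encoded}))
decoded = base64.b64decode(instance["b64"])
print(decoded == raw)
```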


In [66]:
import requests

SER_SERVER_URL = 'http://localhost:8501/v1/models/my_ser_fashion_mnist:predict'
            
response = requests.post(SER_SERVER_URL, data=ser_input_data_json)
response.raise_for_status()
response = response.json()

In [67]:
response.keys()

dict_keys(['predictions'])

In [68]:
y_proba = np.array(response["predictions"])
y_proba.round(2)

array([[0.06, 0.1 , 0.11, 0.13, 0.09, 0.1 , 0.11, 0.1 , 0.14, 0.05],
       [0.03, 0.1 , 0.23, 0.1 , 0.03, 0.04, 0.12, 0.2 , 0.06, 0.09],
       [0.06, 0.22, 0.21, 0.06, 0.1 , 0.07, 0.07, 0.12, 0.03, 0.06]])

In [69]:
!python3 -m pip install --no-deps tensorflow-serving-api



In [70]:
import grpc
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

channel = grpc.insecure_channel('localhost:8500')
predict_service = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = SER_MODEL_NAME
request.model_spec.signature_name = "serving_default"
input_name = ser_model.input_names[0]
request.inputs[input_name].CopyFrom(tf.compat.v1.make_tensor_proto(serialized))

result = predict_service.Predict(request, timeout=10.0)  # timeout in seconds

In [71]:
result

outputs {
  key: "sequential_1"
  value {
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: 3
      }
      dim {
        size: 10
      }
    }
    float_val: 0.06312903016805649
    float_val: 0.09762337058782578
    float_val: 0.10542511194944382
    float_val: 0.13347412645816803
    float_val: 0.0905236005783081
    float_val: 0.10180860012769699
    float_val: 0.11381368339061737
    float_val: 0.10417938977479935
    float_val: 0.1358037292957306
    float_val: 0.054219383746385574
    float_val: 0.034878116101026535
    float_val: 0.0964287593960762
    float_val: 0.22820086777210236
    float_val: 0.10249563306570053
    float_val: 0.029482640326023102
    float_val: 0.043995846062898636
    float_val: 0.11502762883901596
    float_val: 0.20161381363868713
    float_val: 0.055391065776348114
    float_val: 0.09248568117618561
    float_val: 0.05966225266456604
    float_val: 0.220796599984169
    float_val: 0.2118290215730667
    float_val: 0.05706249922513962
 

In [72]:
output_name = ser_model.output_names[0]
output_name

'sequential_1'

In [73]:
shape = [dim.size for dim in result.outputs[output_name].tensor_shape.dim]
shape

[3, 10]

In [74]:
y_proba = np.array(result.outputs[output_name].float_val).reshape(shape)
y_proba.round(2)

array([[0.06, 0.1 , 0.11, 0.13, 0.09, 0.1 , 0.11, 0.1 , 0.14, 0.05],
       [0.03, 0.1 , 0.23, 0.1 , 0.03, 0.04, 0.12, 0.2 , 0.06, 0.09],
       [0.06, 0.22, 0.21, 0.06, 0.1 , 0.07, 0.07, 0.12, 0.03, 0.06]])
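
The gRPC response stores the output values as a flat, row-major list next to the tensor shape, which is why the cells above rebuild the shape and then reshape. The same step can be sketched without NumPy; `reshape_2d` and the dummy values are illustrative stand-ins, not part of the TensorFlow Serving API.

```python
def reshape_2d(flat_values, shape):
    """Split a flat, row-major list into rows according to a 2-D shape."""
    n_rows, n_cols = shape
    if len(flat_values) != n_rows * n_cols:
        raise ValueError("shape does not match the number of values")
    return [flat_values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]

# Hypothetical stand-in for the float_val field: 30 dummy values
flat = [float(i) for i in range(30)]
rows = reshape_2d(flat, [3, 10])
print(len(rows), len(rows[0]), rows[1][0])
```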

![Exercise](https://c1.staticflickr.com/9/8101/8553474140_c50cf08708_b.jpg)

## Exercise 2 – Distributed Training

In [75]:
keras.backend.clear_session()

In [78]:
distribution = tf.distribute.MirroredStrategy()

with distribution.scope():
    model = keras.models.Sequential([
        keras.layers.Flatten(input_shape=[28, 28]),
        keras.layers.Dense(100, activation="relu"),
        keras.layers.Dense(10, activation="softmax")
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="sgd",
                  metrics=["accuracy"])

W0303 20:38:10.316191 140735694128000 cross_device_ops.py:979] Not all devices in `tf.distribute.Strategy` are visible to TensorFlow.


In [88]:
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_train_full = X_train_full / 255.
X_test = X_test / 255.
X_valid, X_train = X_train_full[:32*32*4], X_train_full[32*32*4:]
y_valid, y_train = y_train_full[:32*32*4], y_train_full[32*32*4:]

I believe this is broken.

In [95]:
# model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid), batch_size=1)  # working
model.fit(X_train, y_train, epochs=10, validation_data=(X_valid[:32*32*4], y_valid[:32*32*4]),
          steps_per_epoch=32, batch_size=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1659575c0>
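
The arithmetic behind the `fit()` call above can be sketched as follows. It assumes that each training step consumes exactly one global batch, and that the global batch is split evenly across replicas, as `MirroredStrategy` does in current TensorFlow 2 releases; the behavior of the development build used here may differ. The replica count and the batch size of 64 are hypothetical values.

```python
import math

n_train = 60000 - 32 * 32 * 4   # training samples left after the validation split
batch_size = 1                  # batch size passed to fit()
steps_per_epoch = 32            # steps passed to fit()

# Samples consumed per epoch versus the steps needed for a full pass
samples_per_epoch = batch_size * steps_per_epoch
full_pass_steps = math.ceil(n_train / batch_size)
print(samples_per_epoch, full_pass_steps)

# Hypothetical: with 2 replicas and a global batch of 64, each replica gets 32
n_replicas = 2
global_batch = 64
per_replica_batch = global_batch // n_replicas
print(per_replica_batch)
```

Under these assumptions, each epoch of the call above touches only 32 of the 55,904 training samples.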