<a href="https://colab.research.google.com/github/LxYuan0420/Awesome-Graph-Neural-Networks/blob/master/notebooks/6_6_Model_Deploying_Using_tensorflow_serving.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**6-6 Model Deploying Using tensorflow-serving**

There are several ways to deploy and run a trained model that has been saved in the native TensorFlow format.

For example:

We can load and run the model in a web browser with JavaScript through tensorflow-js.

We can load and run the model on mobile and embedded devices through tensorflow-lite.

We can use tensorflow-serving to load the model and expose it as a network API service, so that clients written in any programming language can obtain predictions by sending network requests.

We can run predictions with the model in Java or spark (scala) through the TensorFlow for Java port.

This section introduces model deployment with tensorflow serving and the use of spark (scala) to run TensorFlow models.

**0. Introduction to model deploying by tensorflow serving**

Deploying a model with tensorflow serving takes the following steps:

(1) Prepare the protobuf model file.

(2) Install tensorflow serving.

(3) Start the tensorflow serving service.

(4) Send requests to the API service to obtain predictions.
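
Step (4) targets the REST endpoint that tensorflow serving exposes. As a minimal sketch, the helper below assembles the predict URL following the `/v1/models/<name>[/versions/<n>]:predict` pattern. The function name `predict_url` and its defaults are hypothetical, chosen for illustration; the model name `linear_model` and port 8501 are the values used in the request examples of this section.

```python
def predict_url(model_name, host="localhost", port=8501, version=None):
    """Build the REST predict URL for a model served by tensorflow serving."""
    url = f"http://{host}:{port}/v1/models/{model_name}"
    if version is not None:
        # Pin a specific model version instead of the one served by default.
        url += f"/versions/{version}"
    return url + ":predict"


print(predict_url("linear_model"))
print(predict_url("linear_model", version=1))
```

Omitting `version` leaves the choice of version to the server.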

You may use the following link for testing (tf_serving, in Chinese) https://colab.research.google.com/drive/1vS5LAYJTEn-H0GDb1irzIuyRB8E3eWc8

In [1]:
import tensorflow as tf
from tensorflow.keras import models  # only `models` is used below

**1. Prepare the protobuf Model File**

Here we train a simple linear regression model with `tf.keras` and save it as a protobuf file.

In [2]:
n = 800

X = tf.random.uniform([n,2], minval=-10, maxval=10)
w0 = tf.constant([[2.0], [1.0]])
b0 = tf.constant(3.0)

Y = X@w0 + b0 + tf.random.normal([n,1], mean=0.0, stddev=2.0)

In [4]:
inputs = tf.keras.Input(shape=(2,), name='input')
outputs = tf.keras.layers.Dense(1, name="outputs")(inputs)
my_model = models.Model(inputs=inputs, outputs=outputs)

my_model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input (InputLayer)           [(None, 2)]               0         
_________________________________________________________________
outputs (Dense)              (None, 1)                 3         
Total params: 3
Trainable params: 3
Non-trainable params: 0
_________________________________________________________________


In [5]:
my_model.compile(loss="mse", optimizer="rmsprop", metrics=["mae"])
my_model.fit(X, Y, batch_size=8, epochs=100)

tf.print("w = ", my_model.layers[1].kernel)
tf.print("b = ", my_model.layers[1].bias)

Epoch 1/100
...
(remaining per-epoch training log omitted; the captured output is truncated)

In [6]:
# save the model as pb format
export_path = "../my_model/"
version = "1"
my_model.save(export_path+version, save_format='tf')

INFO:tensorflow:Assets written to: ../my_model/1/assets


In [7]:
# check the saved model file
!ls {export_path+version}

assets	saved_model.pb	variables
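
tensorflow serving looks for numeric version subdirectories under the model's base path, each holding a `saved_model.pb`. The sketch below is a plain-Python check of that layout. The helper name `servable_versions` is hypothetical, and the demo builds a throwaway directory rather than touching `../my_model/`.

```python
import tempfile
from pathlib import Path


def servable_versions(base_path):
    """Return the numeric version directories under base_path that hold a saved_model.pb."""
    versions = []
    for child in Path(base_path).iterdir():
        # Only integer-named directories are treated as model versions.
        if child.is_dir() and child.name.isdigit() and (child / "saved_model.pb").is_file():
            versions.append(int(child.name))
    return sorted(versions)


# Demo on a temporary directory that mimics the layout produced above.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "1").mkdir()
    (Path(tmp) / "1" / "saved_model.pb").touch()
    (Path(tmp) / "notes").mkdir()  # ignored: not a numeric version directory
    print(servable_versions(tmp))
```

Calling `servable_versions(export_path)` on the real export directory would list the versions available for serving.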


In [8]:
# Check the info of the model file
!saved_model_cli show --dir {export_path+str(version)} --all


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 2)
        name: serving_default_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['outputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
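
The `serving_default` signature above accepts a float tensor of shape (-1, 2), so every instance in a request must be a list of exactly two numbers. As a sketch, the request body can be validated before it is sent. The helper `make_payload` and its parameter names are hypothetical, not part of any TensorFlow API.

```python
import json


def make_payload(instances, n_features=2, signature_name="serving_default"):
    """Serialize rows for a row-format predict request, checking each row's length."""
    for i, row in enumerate(instances):
        if len(row) != n_features:
            raise ValueError(f"instance {i} has {len(row)} values, expected {n_features}")
    return json.dumps({
        "signature_name": signature_name,
        # Cast to float so integers are sent as JSON decimals (DT_FLOAT input).
        "instances": [[float(v) for v in row] for row in instances],
    })


print(make_payload([[1, 2], [5, 7]]))
```

A flat list such as `[1.0, 2.0, 5.0]` fails this check, which catches a shape mismatch before the server does.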

**2. Installing tensorflow serving**

There are two ways to install tensorflow serving: using a Docker image, or using apt.

Using the Docker image is the simplest method, and we recommend it.

Docker is a container platform that provides isolated environments for running programs.

In companies that use TensorFlow, tensorflow serving is usually installed with Docker by operations engineers, so algorithm engineers do not have to worry about the installation.

Installation guides for Docker on different operating systems are listed below (in Chinese).

Windows: https://www.runoob.com/docker/windows-docker-install.html

macOS: https://www.runoob.com/docker/macos-docker-install.html

CentOS: https://www.runoob.com/docker/centos-docker-install.html

After Docker is installed, run the following command to pull the tensorflow/serving image.

docker pull tensorflow/serving

**3. Starting tensorflow serving Service**

In [9]:
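# Mount the exported model at /models/linear_model inside the container
# (docker needs an absolute host path, hence realpath) and serve it under the
# name linear_model, which is the name used in the request URLs below.
!docker run -t --rm -p 8501:8501 \
    -v "$(realpath ../my_model):/models/linear_model" \
    -e MODEL_NAME=linear_model \
    tensorflow/serving >server.log 2>&1 &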

/bin/bash: docker: command not found
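
Once the container starts, the model takes a moment to load before it can answer requests. Below is a minimal sketch of a readiness check, assuming the server is reachable on localhost:8501 and serves the model under the name `linear_model` used in the requests below. It polls the model status endpoint (`GET /v1/models/<name>`) until a version reports the state `AVAILABLE`. The helper names `fetch_status` and `wait_until_available` are hypothetical.

```python
import json
import time
import urllib.request


def fetch_status(url):
    """GET the model status endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=2) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_available(url, fetch=fetch_status, retries=10, delay=1.0):
    """Poll the status URL until some model version reports state AVAILABLE."""
    for _ in range(retries):
        try:
            status = fetch(url)
            states = [v.get("state") for v in status.get("model_version_status", [])]
            if "AVAILABLE" in states:
                return True
        except OSError:
            # Server not reachable yet (urllib.error.URLError subclasses OSError).
            pass
        time.sleep(delay)
    return False


# Usage, once the container is running:
# wait_until_available("http://localhost:8501/v1/models/linear_model")
```

The `fetch` parameter is injectable so the polling logic can be exercised without a running server.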


**4. Sending request to the API service**

Requests can be sent over HTTP from any programming language. Here we demonstrate sending requests with the curl command in Linux and with the requests library in Python.

In [None]:
# Each instance must hold two features to match the model's input shape (-1, 2)
!curl -d '{"instances": [[1.0, 2.0], [5.0, 7.0]]}' \
    -X POST http://localhost:8501/v1/models/linear_model:predict

In [None]:
import json
import requests

# Two instances, each with the two features the model expects
data = json.dumps({"signature_name": "serving_default", "instances": [[1.0, 2.0], [5.0, 7.0]]})
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/linear_model:predict',
        data=data, headers=headers)
# The response body is a JSON object whose "predictions" field has one entry per instance
predictions = json.loads(json_response.text)["predictions"]
print(predictions)
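
To sanity-check the response, recall that the training data was generated with `w0 = [[2.0], [1.0]]` and `b0 = 3.0`. Assuming training converged close to those values (the fitted weights are not shown in this section, so treat this as an approximation), the served predictions should be near the values computed by the sketch below. The helper `linear_predict` is hypothetical and uses the ground-truth parameters, not the trained ones.

```python
def linear_predict(instances, w=(2.0, 1.0), b=3.0):
    """Evaluate y = x . w + b for each row, using the ground-truth parameters."""
    return [[sum(x_i * w_i for x_i, w_i in zip(row, w)) + b] for row in instances]


print(linear_predict([[1.0, 2.0], [5.0, 7.0]]))  # [[7.0], [20.0]]
```

A served prediction far from these values would point to a problem with training or with the request payload.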