### 19. Training and Deploying TensorFlow Models at Scale

Time to put models where they belong: in production! This can be as simple as running the model on a batch of data, perhaps from a script that runs every night. Often, though, it is more involved.

Usually we want a model that we can deploy on live data, update, and scale as the load grows.

### Serving a TensorFlow Model

As the infrastructure grows, it is often preferable to wrap the model in a small service whose sole role is to make predictions, and have the rest of the infrastructure query it (e.g., through a REST API). This decouples the model from the rest of the infrastructure, making it easy to switch model versions or scale the service up as needed.

#### Using TensorFlow Serving

Suppose we have trained an MNIST model using `tf.keras` and want to deploy it to TF Serving. The first thing we have to do is export this model to TensorFlow’s `SavedModel` format.

In [1]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.mnist.load_data()
X_train_full = X_train_full[..., np.newaxis].astype(np.float32) / 255.
X_test = X_test[..., np.newaxis].astype(np.float32) / 255.
X_valid, X_train = X_train_full[:5000], X_train_full[5000:]
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_new = X_test[:3]

In [2]:
np.random.seed(42)
tf.random.set_seed(42)

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28, 1]),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=keras.optimizers.SGD(learning_rate=1e-2),
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=10, validation_data=(X_valid, y_valid))


In [3]:
import os

model_version = "0001"
model_name = "my_mnist_model"
model_path = os.path.join(model_name, model_version)
tf.saved_model.save(model, model_path)

INFO:tensorflow:Assets written to: my_mnist_model\0001\assets


**Note**: Since a SavedModel saves the computation graph, it can only be used with models that are based exclusively on TensorFlow operations.
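
Once TF Serving is running with this SavedModel, the rest of the infrastructure can query it over REST. The sketch below only builds the request and does not send it. It assumes the server listens on `localhost:8501` (a common default for the REST port, but this depends on how TF Serving was started) and that the model was exported with the default `serving_default` signature. The helper name `build_predict_request` and the blank test image are hypothetical, used here for illustration.

```python
import json
import urllib.request


def build_predict_request(instances, model_name="my_mnist_model",
                          host="localhost", port=8501):
    """Build (but do not send) a POST request for TF Serving's REST predict endpoint."""
    # Host and port are assumptions; adjust them to the actual deployment.
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    payload = {"signature_name": "serving_default", "instances": instances}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body,
        headers={"Content-Type": "application/json"}, method="POST")


# Hypothetical input: one all-zero 28x28x1 image as nested lists.
# With the arrays from the cells above, X_new.tolist() has the same structure.
blank_image = [[[0.0] for _ in range(28)] for _ in range(28)]
request = build_predict_request([blank_image])

# Sending requires a running TF Serving instance, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```

Because the client only needs the URL and a JSON payload, it does not have to import TensorFlow at all, which is part of what makes the service easy to decouple from the rest of the infrastructure.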

### Deploying a Model to a Mobile or Embedded Device

For mobile and embedded devices, use TFLite. To reduce the model size, TFLite’s model converter can take a SavedModel and compress it to a much lighter format based on FlatBuffers.
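
A minimal sketch of that conversion, assuming the SavedModel exported earlier is still on disk. The helper name `convert_to_tflite` and the output filename are hypothetical. The converter calls are from `tf.lite.TFLiteConverter`, and TensorFlow is imported inside the function so that the helper can be defined without it.

```python
def convert_to_tflite(saved_model_dir, output_path):
    """Convert a SavedModel directory to a TFLite file; return its size in bytes."""
    import tensorflow as tf  # lazy import: only needed when the helper is called

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()  # serialized FlatBuffer, as bytes
    with open(output_path, "wb") as f:
        f.write(tflite_model)
    return len(tflite_model)


# Example call (assumes my_mnist_model/0001 exists from the export step):
# import os
# convert_to_tflite(os.path.join("my_mnist_model", "0001"), "my_mnist_model.tflite")
```

Returning the byte count makes it easy to compare the converted file against the size of the original SavedModel directory.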