# Create a Servable ML Model with Tensorflow
In this notebook we will create, save, load, and employ models with Tensorflow. We will work through how to structure code to create models that can be saved and used for inference in the cloud or at the edge with applications such as dashboards, games, anomaly detection, and much more.

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
import pandas as pd
from sklearn.model_selection import train_test_split
import os
print("Tensorflow version:", tf.version.VERSION)
print("Numpy version:", np.version.version)

Tensorflow version: 2.0.0
Numpy version: 1.16.5


In [2]:
MODEL_DIR = 'models' # We will save models to this directory
DATA_PATH = os.path.join('data', 'data.csv') 

Tensorflow 2 includes a [SavedModel](https://www.tensorflow.org/guide/saved_model "Tensorflow SavedModel docs") format that can be utilised for transfer learning and/or inference in applications. In order for a model to be saveable, it has to be of the type `tf.Module`. Keras models satisfy this criterion and are thus relatively simple to save. Before jumping into a specific example of a Keras model, we will, however, address the components of a Tensorflow SavedModel with a general example using pure Tensorflow.

# General Example in Pure Tensorflow

## An Example Model
A Tensorflow SavedModel can be created from a `tf.Module`, so let us create a very simple example model using this object structure. The saved model will include any `tf.Module`s, any methods with the `@tf.function` decorator, and any `tf.Variable`s, but will not include any Python code or functionality.

In [3]:
class LinearScaler(tf.Module):
    '''
    LinearScaler is a very simple linear function that takes in a variable, multiplies it
     by a weight and adds a bias before returning the result.
    '''
    def __init__(self):
        super(LinearScaler, self).__init__()
        self.bias = tf.Variable(1.)
        self.weight = tf.Variable(2.)
    
    # Uncomment to set input signature
    @tf.function#(input_signature=[tf.TensorSpec([], tf.float32)])
    def __call__(self, x):
        '''
        Linearly rescale y = x * weight + bias
        :param x: The variable to be linearly scaled
        :type x: tf.float32
        :output dict: "y" = x * weight + bias
        '''
        return {"y" : x * self.weight + self.bias}

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32), tf.TensorSpec([], tf.float32)])
    def calibrate(self, weight, bias):
        '''
        Set the parameters of the linear function.
        :param weight: Weight of the linear function
        :type weight: tf.float32
        :param bias: Bias of the linear function
        :type bias: tf.float32
        '''
        self.weight.assign(weight)
        self.bias.assign(bias)

model = LinearScaler()

Let us make sure that it works by evaluating it. Note that by evaluating the model, it is compiled into a graph, a step that is needed before we save it.

In [4]:
assert model(tf.constant(4.))["y"].numpy() == 9 #  2 * 4 + 1
model.calibrate(weight=5, bias=2)
assert model(tf.constant(4.))["y"].numpy() == 22 # 5 * 4 + 2

Fantastic! Now we have a working model. 

## Save and Load
Let us go right ahead and save our model using the `tf.saved_model.save` method:

In [6]:
no_signatures_path = os.path.join(MODEL_DIR, 'no_signatures')
#model = LinearScaler() # Get a fresh model
#model(tf.constant(4.))
tf.saved_model.save(model, no_signatures_path) # Save the function to a dir

INFO:tensorflow:Assets written to: models\no_signatures\assets


Loading the model also only takes a single line with the `tf.saved_model.load` method:

In [7]:
loaded_model = tf.saved_model.load(no_signatures_path)
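
It may help to see what a loaded object does and does not carry with it. The following is a minimal sketch, assuming a hypothetical module `Rescaler` with one `@tf.function` method (`apply`) and one plain Python method (`describe`); neither name comes from the example above. Only the variable and the decorated method should survive the round trip.

```python
import tempfile

import tensorflow as tf


class Rescaler(tf.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.scale = tf.Variable(3.)

    # Decorated with an input signature, so it is stored in the SavedModel
    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def apply(self, x):
        return x * self.scale

    # Plain Python method: not part of the SavedModel
    def describe(self):
        return "plain Python"


export_dir = tempfile.mkdtemp()  # temporary directory, an assumption of this sketch
tf.saved_model.save(Rescaler(), export_dir)
restored = tf.saved_model.load(export_dir)

has_apply = hasattr(restored, "apply")        # the tf.function was restored
has_describe = hasattr(restored, "describe")  # the Python method was not
result = float(restored.apply(tf.constant(2.)).numpy())
print(has_apply, has_describe, result)
```

The restored object is a generic Tensorflow object rather than an instance of the original Python class, which is why the undecorated method is missing.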

What we have done so far will work perfectly well in many cases but, in some cases, the loaded model might behave differently from our expectations. Let us look at some characteristics of the model as it is now.<br><br>
Firstly, the parameters of the loaded model are the same as the original at the time we saved it:

In [8]:
assert model.weight.numpy() == loaded_model.weight.numpy()
print("Weight in loaded model:", loaded_model.weight.numpy())
print("Bias in loaded model:", loaded_model.bias.numpy())

Weight in loaded model: 5.0
Bias in loaded model: 2.0


Specifically, the weight and bias parameters do not attain the values of a newly initialised model but the values that we set just before saving.<br><br>
The loaded model will also evaluate inputs like the original model:

In [10]:
assert model(tf.constant(4.))["y"].numpy() == loaded_model(tf.constant(4.))["y"].numpy()
assert model(tf.constant(22.))["y"].numpy() == loaded_model(tf.constant(22.))["y"].numpy()
print(loaded_model(tf.constant(22.)))

{'y': <tf.Tensor: id=282, shape=(), dtype=float32, numpy=112.0>}


However, this is where the similarity ends. Our original model will happily evaluate a different kind of input, for instance a Tensor of several floats:

In [11]:
print(model(tf.constant([1., 2., 3.]))) # Passing a Tensor of floats

{'y': <tf.Tensor: id=294, shape=(3,), dtype=float32, numpy=array([ 7., 12., 17.], dtype=float32)>}


But when we provide the same input to our loaded model, we are presented with a ValueError:

In [12]:
try:
    print(loaded_model(tf.constant([1., 2., 3.]))) # Passing a Tensor of floats
except ValueError as e:
    print("We got a ValueError:\n", e)

We got a ValueError:
 Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (1 total):
    * Tensor("x:0", shape=(3,), dtype=float32)
  Keyword arguments: {}

Expected these arguments to match one of the following 1 option(s):

Option 1:
  Positional arguments (1 total):
    * TensorSpec(shape=(), dtype=tf.float32, name='x')
  Keyword arguments: {}


It is not that we did anything wrong. It is just that we did not pay much attention to the *input* of our model. When we evaluated the model before saving it we caused the class we defined to be compiled into a `tf.Graph` object. The Graph object must assume a fixed structure of the input, i.e. an input signature. When we pass a new input to our original model, the graph can get recompiled to fit that input, but our saved and loaded model represents a fixed Graph that can only accept the expected input and will throw a ValueError otherwise. In other words, we should specify the input signature that our model should expect. We did specify the input signature implicitly when evaluating the model the first time, but we might want to do it explicitly.
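
The recompilation described above can be observed directly. The following is a small sketch, assuming a hypothetical function `scale` and a Python list `traced_shapes` that records a shape every time Tensorflow builds a new graph. Python side effects inside a `@tf.function` only run while the graph is being built, so the list grows once per distinct input signature.

```python
import tensorflow as tf

traced_shapes = []  # hypothetical log of the input shapes that triggered a new graph


@tf.function
def scale(x):
    # This Python side effect only runs while a new graph is being built
    traced_shapes.append(tuple(x.shape.as_list()))
    return x * 2.0 + 1.0


scale(tf.constant(4.))        # first call: builds a graph for scalar float32 input
scale(tf.constant(5.))        # same signature: the existing graph is reused
scale(tf.constant([1., 2.]))  # new shape: a second graph is built

print(traced_shapes)
```

A loaded SavedModel only contains the graphs that existed at the time of saving, which is why it cannot adapt to a new input in the same way.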

## Save with an Input Signature
There are two ways to specify an input signature for a Tensorflow method. One way is to use the `@tf.function` decorator. The `@tf.function` decorator takes an `input_signature` as a kwarg, and we can pass it a list of `tf.TensorSpec`s to template the input. We actually already did this in the example model for the `.calibrate` method to show that we are expecting two constant tensors as input, but for the `__call__` method, the signature was commented out.<br>
The second way is to explicitly compile the `tf.Graph` and, in the process, pass an input signature. We can do this by invoking the `.get_concrete_function` method on the `__call__` method:

In [13]:
model_with_signature = LinearScaler() # Get a fresh instance of the function
input_signature_array = tf.TensorSpec([None], tf.float32) # Note! Specifies a Tensor array of floats
call = model_with_signature.__call__.get_concrete_function(input_signature_array) # Compile Graph

When we save the model, we pass this compiled graph through the `signatures` argument:

In [14]:
with_signature_path = os.path.join(MODEL_DIR, 'with_signature')
tf.saved_model.save(model_with_signature, with_signature_path, signatures=call) # Save the function

INFO:tensorflow:Assets written to: models\with_signature\assets


Let us load up the model and check whether it can handle the input we specified:

In [15]:
loaded_model_with_signature = tf.saved_model.load(with_signature_path)
print(loaded_model_with_signature(tf.constant([3., 4., 5.])))

{'y': <tf.Tensor: id=536, shape=(3,), dtype=float32, numpy=array([ 7.,  9., 11.], dtype=float32)>}


That is one challenge solved; we now know how to explicitly define the input signature. The model works with the specified input, but any other type of input will cause a ValueError:

In [16]:
try:
    print(loaded_model_with_signature(tf.constant(3.)))
except ValueError as e:
    print("We got a ValueError:", e)

We got a ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (1 total):
    * Tensor("x:0", shape=(), dtype=float32)
  Keyword arguments: {}

Expected these arguments to match one of the following 1 option(s):

Option 1:
  Positional arguments (1 total):
    * TensorSpec(shape=(None,), dtype=tf.float32, name='x')
  Keyword arguments: {}
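
For comparison, the first approach mentioned in this section, the decorator, can be sketched as follows. This is a minimal illustration rather than a replacement for the code above: the class name `DecoratedScaler` and the temporary export directory are assumptions. With the signature fixed in the decorator, the model can be saved without evaluating it first, and the `None` dimension lets it accept arrays of any length.

```python
import tempfile

import tensorflow as tf


class DecoratedScaler(tf.Module):  # hypothetical variant of the LinearScaler idea
    def __init__(self):
        super().__init__()
        self.weight = tf.Variable(2.)
        self.bias = tf.Variable(1.)

    # The signature is fixed here: a 1-D float32 Tensor of any length
    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return {"y": x * self.weight + self.bias}


export_dir = tempfile.mkdtemp()  # temporary directory, an assumption of this sketch
tf.saved_model.save(DecoratedScaler(), export_dir)
restored = tf.saved_model.load(export_dir)

y_three = restored(tf.constant([3., 4., 5.]))["y"].numpy()
y_four = restored(tf.constant([1., 2., 3., 4.]))["y"].numpy()
print(y_three, y_four)
```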


## Save with Multiple Input Signatures
We might want to save and serve our model with multiple kinds of inputs. For instance, in our example model, we would like to serve single inputs as well as lists of inputs. Everything else, like the weights and biases, should be the same, but the input should be flexible. One way to do this might be to save two or more separate models, but that would create multiple duplicates of the same weights and, as a result, extra operational overhead. Fortunately, there is a better way.<br>
A loaded model actually has a dictionary, `.signatures`, mapping the input signatures that can be evaluated in the model. We only specified a single input signature, so there should be only one entry in the dictionary:

In [17]:
print(list(loaded_model_with_signature.signatures.keys()))

['serving_default']


There is indeed only a single entry, which is the default `"serving_default"` entry. When we call the loaded model and do not specify a key to this dictionary, it will look up `"serving_default"` and decide whether the input signature matches the input we used to call the model.<br>
We could also have supplied the key and created an object of our model expecting the specified input:

In [18]:
inference_object_array = loaded_model_with_signature.signatures["serving_default"]
print(inference_object_array(tf.constant(6.)))

{'y': <tf.Tensor: id=542, shape=(), dtype=float32, numpy=13.0>}
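
Each entry in `.signatures` can also be inspected before it is called. The sketch below assumes a hypothetical module `Scaler` saved to a temporary directory, and uses the `structured_input_signature` and `structured_outputs` attributes of the signature function to read the expected inputs and the output keys. It calls the signature with a keyword argument named after the Python parameter (`x`), which is how signature functions are usually invoked.

```python
import tempfile

import tensorflow as tf


class Scaler(tf.Module):  # hypothetical stand-in for the example model
    def __init__(self):
        super().__init__()
        self.weight = tf.Variable(2.)
        self.bias = tf.Variable(1.)

    @tf.function
    def __call__(self, x):
        return {"y": x * self.weight + self.bias}


module = Scaler()
call_array = module.__call__.get_concrete_function(tf.TensorSpec([None], tf.float32))

export_dir = tempfile.mkdtemp()  # temporary directory, an assumption of this sketch
tf.saved_model.save(module, export_dir, signatures={"serving_default": call_array})

restored = tf.saved_model.load(export_dir)
infer = restored.signatures["serving_default"]

# The input signature is a pair: (positional specs, keyword specs)
_, keyword_specs = infer.structured_input_signature
input_names = list(keyword_specs)
input_shape = keyword_specs["x"].shape.as_list()
output_keys = sorted(infer.structured_outputs.keys())

result = infer(x=tf.constant([3., 4., 5.]))["y"].numpy()
print(input_names, input_shape, output_keys, result)
```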


We can use this functionality to create one saved model that serves multiple kinds of inputs. We do this by supplying a dictionary that maps keys to compiled Graphs to the `tf.saved_model.save` method:

In [19]:
model_multiple_signatures = LinearScaler() # Get a fresh instance of the function

# Input signatures
input_signature_single = tf.TensorSpec([], tf.float32) # Specifies a single (scalar) Tensor float
input_signature_array = tf.TensorSpec([None], tf.float32) # Specifies a Tensor array of floats

# Compiled Graphs
call_single = model_multiple_signatures.__call__.get_concrete_function(input_signature_single)
call_array = model_multiple_signatures.__call__.get_concrete_function(input_signature_array)

# Input signature dictionary
signatures = {"serving_default": call_single,
              "array_input": call_array}

# Save the model
multiple_signatures_path = os.path.join(MODEL_DIR, "multiple_signatures")
tf.saved_model.save(model_multiple_signatures, multiple_signatures_path, signatures=signatures)

INFO:tensorflow:Assets written to: models\multiple_signatures\assets


We can load up the model and create two different inference objects:

In [20]:
loaded_model_multiple_signatures = tf.saved_model.load(multiple_signatures_path)
inference_object_single = loaded_model_multiple_signatures.signatures["serving_default"]
inference_object_array = loaded_model_multiple_signatures.signatures["array_input"]

The inference objects still point to the same model, so if we change the parameters of the model, the change applies to both:

In [21]:
# Evaluate two types of inputs
print(inference_object_array(tf.constant([3., 4., 5.])))
print(inference_object_single(tf.constant(5.)))

# Change parameters of the model
loaded_model_multiple_signatures.calibrate(tf.constant(3.), tf.constant(4.))

# Evaluate the same input again - note the difference in both!
print(inference_object_array(tf.constant([3., 4., 5.])))
print(inference_object_single(tf.constant(5.)))

{'y': <tf.Tensor: id=840, shape=(3,), dtype=float32, numpy=array([ 7.,  9., 11.], dtype=float32)>}
{'y': <tf.Tensor: id=842, shape=(), dtype=float32, numpy=11.0>}
{'y': <tf.Tensor: id=855, shape=(3,), dtype=float32, numpy=array([13., 16., 19.], dtype=float32)>}
{'y': <tf.Tensor: id=857, shape=(), dtype=float32, numpy=19.0>}


Note that this also means that our model is mutable even during inference. In fact the SavedModel format can be used for inference as well as retraining or transfer learning cases.<br>
Now we know almost everything there is to know about the SavedModel format. Let's just take a quick look inside one of the directories created each time we save a model.
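
Before we do, here is a brief sketch of the retraining point above. It assumes a hypothetical module `Scaler`, a made-up target function y = 3x + 1, and an arbitrary learning rate of 0.05; none of these come from the example model. The variables of a loaded model are ordinary `tf.Variable`s, so a gradient step can be taken with `tf.GradientTape`.

```python
import tempfile

import tensorflow as tf


class Scaler(tf.Module):  # hypothetical stand-in for the example model
    def __init__(self):
        super().__init__()
        self.weight = tf.Variable(2.)
        self.bias = tf.Variable(1.)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x * self.weight + self.bias


export_dir = tempfile.mkdtemp()  # temporary directory, an assumption of this sketch
tf.saved_model.save(Scaler(), export_dir)
restored = tf.saved_model.load(export_dir)

x = tf.constant([1., 2., 3.])
target = tf.constant([4., 7., 10.])  # made-up targets following y = 3x + 1


def mean_squared_error():
    return tf.reduce_mean(tf.square(restored(x) - target))


# One plain gradient descent step on the restored variables
with tf.GradientTape() as tape:
    loss_before = mean_squared_error()
grads = tape.gradient(loss_before, [restored.weight, restored.bias])
restored.weight.assign_sub(0.05 * grads[0])
restored.bias.assign_sub(0.05 * grads[1])

loss_after = mean_squared_error()
print(float(loss_before.numpy()), float(loss_after.numpy()))
```

The loss should drop after the step, which indicates that the loaded model can be updated and is not frozen.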

## Inside a Saved Model
Let us have a look at what is in one of the directories we created.

In [22]:
os.listdir(multiple_signatures_path)

['assets', 'saved_model.pb', 'variables']

The saved model consists of several elements.<br>
- `saved_model.pb` contains the model architecture that is used to rebuild the function.
- The directory `variables` contains one or more data files holding the values of the parameters in the model at the time it was saved. For large models with billions of parameters, these data files can grow large. A `variables.index` file maps the stored parameters to their right spot in the function.
- The directory `assets` holds additional artefacts needed to recreate the function, but should be empty in our case.
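
The layout described in the list above can be checked programmatically. This is a small sketch that saves a throwaway `tf.Module` with a single variable to a temporary directory (both are assumptions of the sketch) and lists what was written. The exact set of files can vary between Tensorflow versions, so the sketch only looks for the elements named above.

```python
import os
import tempfile

import tensorflow as tf

# A throwaway module with a single variable, for illustration only
module = tf.Module()
module.v = tf.Variable(1.)

export_dir = tempfile.mkdtemp()  # temporary directory, an assumption of this sketch
tf.saved_model.save(module, export_dir)

top_level = sorted(os.listdir(export_dir))
variables_files = sorted(os.listdir(os.path.join(export_dir, "variables")))

print("Top level:", top_level)
print("Inside variables:", variables_files)
```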

# A Keras Example
Now that we are clear on the basics, let us take a look at a more realistic workflow: defining, saving, and serving a model built with Keras.<br>
Our model starts with a bit of data and something to be modelled. I have prepared a small dataset consisting of weather observations from a station that observes temperature, relative humidity, air pressure, and whether it is raining or not. Our task is to build a model that predicts whether it rains or not given the temperature, relative humidity, and air pressure. Our target is not to build an awesome or precise model. It is to train, save, and then serve the model.<br>
## Example Data
First order of business, let us have a look at the example data.

In [23]:
df = pd.read_csv(DATA_PATH)
feature_cols = ['pressure', 'temperature', 'humidity']
label_col = ['rain']
print("Example observations")
print(df.sample(10))
print()
print("Statistics")
print(df[feature_cols].describe())

Example observations
      pressure  rain  temperature  humidity
80       999.2     0          2.5      48.0
1141     977.8     0          7.7      74.0
1753    1015.5     0         -2.8      64.0
243     1015.9     0          1.3      82.0
1248     989.6     0          6.6      90.0
1367     996.9     0         -4.1      69.0
1198    1003.8     0         -1.1      88.0
1858    1003.6     0          6.7      71.0
587     1008.1     0          4.3      63.0
2993    1014.1     0         11.1      29.0

Statistics
          pressure  temperature     humidity
count  3151.000000  3151.000000  3151.000000
mean   1005.235989     4.737417    71.054268
std      15.113994     3.998234    20.464912
min     951.100000    -6.400000    22.000000
25%     996.150000     2.200000    55.000000
50%    1005.700000     5.000000    75.000000
75%    1016.500000     6.700000    89.000000
max    1036.400000    20.100000    97.000000


The data is of relatively high quality. All we need to do is to reduce the numerical differences between the features and align their variances. Specifically, we will standardise (z norm) the features.
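
As a reference point, standardisation itself is a one-line computation: subtract the mean of each feature and divide by its standard deviation. The sketch below uses plain NumPy on synthetic data. The array `raw` and its three columns (with scales loosely resembling pressure, temperature, and humidity) are made up for illustration and are not the dataset used in this notebook.

```python
import numpy as np

# Synthetic stand-in data: 1000 observations of three features on different scales
rng = np.random.default_rng(0)
raw = rng.normal(loc=[1005., 5., 70.], scale=[15., 4., 20.], size=(1000, 3))

means = raw.mean(axis=0)    # one mean per feature
std_devs = raw.std(axis=0)  # one (population) standard deviation per feature

# z norm: x' = (x - mean) / std_dev, applied column-wise through broadcasting
standardised = (raw - means) / std_devs

print(standardised.mean(axis=0))
print(standardised.std(axis=0))
```

After the transformation every feature has a mean of approximately zero and a standard deviation of approximately one, regardless of its original scale.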

In [79]:
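class ZNorm(tf.Module):
    '''
    ZNorm standardises features: it subtracts a stored mean from each feature
     and divides the result by a stored standard deviation.
    '''
    def __init__(self, num_features):
        super(ZNorm, self).__init__()
        
        # Identity transform (mean 0, standard deviation 1) until calibrated
        self.std_devs = tf.Variable(tf.ones([num_features]), dtype=tf.float32)
        self.means = tf.Variable(tf.zeros([num_features]), dtype=tf.float32)
    
    
    @tf.function
    def __call__(self, x):
        '''
        Compute the Z norm of input features
        :param x: Tensor of length num_features
        :type x: Tensor of floats
        :output dict: "x_prime" = (x - mean) / std_dev
        '''
        return {"x_prime" : tf.divide(tf.subtract(x, self.means), self.std_devs)}

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32), tf.TensorSpec([None], tf.float32)])
    def calibrate(self, means, standard_deviations):
        '''
        Set the parameters of the standardisation.
        :param means: Mean of each feature
        :type means: Tensor of floats of length num_features
        :param standard_deviations: Standard deviation of each feature
        :type standard_deviations: Tensor of floats of length num_features
        '''
        self.std_devs.assign(standard_deviations)
        self.means.assign(means)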

In [80]:
standardiser = ZNorm(3)

In [81]:
standardiser(tf.constant([1.2, 2., -1.], dtype=tf.float32))["x_prime"]

<tf.Tensor: id=54997, shape=(3,), dtype=float32, numpy=array([ 1.2,  2. , -1. ], dtype=float32)>

In [82]:
means = [np.mean(df[s].values) for s in feature_cols]
std_devs = [np.std(df[s].values) for s in feature_cols]

In [83]:
standardiser.calibrate(means=means, standard_deviations=std_devs)

In [84]:
standardiser(tf.constant([1.2, 2., -1.]))["x_prime"]

<tf.Tensor: id=55008, shape=(3,), dtype=float32, numpy=array([-66.44142   ,  -0.68476504,  -3.5214276 ], dtype=float32)>

In [91]:
def standardise(r):
    x = tf.constant(r.values, dtype=tf.float32)
    xp = standardiser(x)["x_prime"].numpy()
    return pd.Series(xp.tolist())

In [93]:
features = df[feature_cols].apply(standardise, axis=1)

In [94]:
features.describe()

                0              1              2
count      3151.0         3151.0         3151.0
mean        2e-06  -1.480776e-08   2.985508e-08
std      1.000159       1.000159       1.000159
min     -3.582413      -2.786026      -2.397374
25%     -0.601258      -0.634735     -0.7846023
50%      0.030708     0.06568522      0.1928353
75%       0.74539      0.4909403      0.8770417
max      2.062262       3.842952       1.268017
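
The standard deviation in the table is 1.000159 rather than exactly 1. A likely explanation is that `np.std` computes the population standard deviation (`ddof=0`) by default, while `describe` in pandas reports the sample standard deviation (`ddof=1`). The sketch below illustrates the effect with synthetic values; the random data is an assumption, and only the number of observations (3151) is taken from the dataset.

```python
import numpy as np

n = 3151  # number of observations in the dataset
rng = np.random.default_rng(0)
values = rng.normal(size=n)  # synthetic stand-in for one feature

# Standardise with the population standard deviation, as np.std does by default
z = (values - values.mean()) / values.std()

population_std = z.std(ddof=0)    # 1 by construction
sample_std = z.std(ddof=1)        # what pandas describe() reports
expected = np.sqrt(n / (n - 1))   # ratio between the two estimators, about 1.000159

print(population_std, sample_std, expected)
```

The difference is negligible at this sample size, so no correction is needed before training.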


In [96]:
X = features.values
y = df[label_col].values

In [133]:
def mymodel(X, y):
    '''
    Train and evaluate a small binary classifier that predicts rain.
    :param X: Standardised features, shape (num_samples, 3)
    :param y: Binary labels, shape (num_samples, 1)
    :output: The trained Keras model
    '''
    tf.random.set_seed(seed=0)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42, shuffle=True)
    
    # Create Keras model
    model = keras.Sequential(name="mymodel", layers=[
        keras.layers.InputLayer(input_shape=(3,), name="input"),  # (3,) is a tuple, (3) is just an int
        keras.layers.Dense(6, activation="sigmoid", name="dense"),
        keras.layers.Dense(1, activation="sigmoid", name="output")
    ])

    # Print model architecture
    model.summary()

    # Compile model with optimizer
    model.compile(optimizer=keras.optimizers.Adam(0.05),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])

    # Train model
    model.fit(x=X_train, y=y_train, batch_size=50, epochs=5)

    # Test model
    test_loss, test_acc = model.evaluate(x=X_test, y=y_test, verbose=2)
    
    print("Test accuracy: ")
    print(test_acc)
    return model

In [134]:
keras_model = mymodel(X, y)

Model: "mymodel"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 6)                 24        
_________________________________________________________________
output (Dense)               (None, 1)                 7         
Total params: 31
Trainable params: 31
Non-trainable params: 0
_________________________________________________________________
Train on 2111 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
1040/1 - 0s - loss: 0.3760 - accuracy: 0.9029
Test accuracy: 
0.9028846


In [None]:
keras_model_path = os.path.join(MODEL_DIR, 'keras_model')
tf.saved_model.save(keras_model, keras_model_path) # Save the trained Keras model, not the LinearScaler in `model`