# Deploying a TensorFlow Estimator with Flask

This notebook builds on the great article by Sarah Robinson, [Comparing regression and classification on US elections data with TensorFlow Estimators](https://cloud.google.com/blog/big-data/2018/03/comparing-regression-and-classification-on-us-elections-data-with-tensorflow-estimators), published on the Google Cloud Big Data and Machine Learning blog.

This notebook walks through three steps that show the basic mechanics:
1. Training the model
2. Exporting the trained model
3. Loading the exported model

The final part looks more closely at the TensorFlow API of the `SavedModelPredictor` and gives some recommendations for implementing a more generic mechanism.

The exported model is later used in a standalone Python web service based on [Flask](http://flask.pocoo.org/), a very lightweight framework for building APIs in Python.

In [1]:
import os
import tensorflow as tf
import numpy as np


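# Check that a recent enough TensorFlow version is installed.
# Compare numerically: as strings, "1.10" would sort before "1.5".
tf_version = tf.__version__
print("TensorFlow version: {}".format(tf_version))
major, minor = (int(part) for part in tf_version.split(".")[:2])
assert (major, minor) >= (1, 5), "TensorFlow r1.5 or later is needed"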

TensorFlow version: 1.6.0


In [2]:
tf.logging.set_verbosity(tf.logging.INFO)

train_file = "./data/regression-train.csv"
test_file = "./data/regression-test.csv"

## Step 1 - Train a Basic Linear Model

In [3]:
numerical_feature_names = [
    'PctUnder18',
    'PctOver65',
    'PctFemale',
    'PctWhite',
    'PctBachelors',
    'PctDem',
    'PctGop'
]

feature_columns = [tf.feature_column.numeric_column(k) for k in numerical_feature_names]

def my_input_fn(file_path, repeat_count=200):
    def decode_csv(line):
        parsed_line = tf.decode_csv(line, [[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.]])
        label = parsed_line[-1]  # Last element is the label
        features = parsed_line[:-1]  # All other elements are the features
        return dict(zip(numerical_feature_names, features)), label

    dataset = (tf.data.TextLineDataset(file_path)  # Read text file
               .map(decode_csv))  # Transform each elem by applying decode_csv fn
    dataset = dataset.shuffle(buffer_size=256)
    dataset = dataset.repeat(repeat_count)  # Repeats dataset this # times
    dataset = dataset.batch(8)  # Batch size to use
    return dataset
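
To make the parsing step concrete, the sketch below mirrors `decode_csv` in plain Python, without TensorFlow. It assumes, as the eight record defaults passed to `tf.decode_csv` suggest, that each row holds the seven features in the order of `numerical_feature_names` followed by the label. The sample row is made up for illustration and is not taken from the data files.

```python
numerical_feature_names = [
    'PctUnder18', 'PctOver65', 'PctFemale', 'PctWhite',
    'PctBachelors', 'PctDem', 'PctGop',
]

def parse_row(line):
    """Split one CSV line into a (features, label) pair, like decode_csv.

    Unlike tf.decode_csv, empty fields are not replaced by defaults here.
    """
    values = [float(v) for v in line.split(',')]
    expected = len(numerical_feature_names) + 1
    if len(values) != expected:
        raise ValueError('expected %d columns, got %d' % (expected, len(values)))
    features = dict(zip(numerical_feature_names, values[:-1]))
    label = values[-1]
    return features, label

# Hypothetical row (assumption): seven feature values followed by the label.
sample_row = '23.9,17.6,50.0,0.965,12.7,0.32,0.65,0.23'
features, label = parse_row(sample_row)
print(features['PctWhite'], label)
```

Each element the dataset yields has this same structure: a dictionary keyed by feature name plus a separate label, which is the format the Estimator's `input_fn` contract expects.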

In [4]:
feature_columns

[_NumericColumn(key='PctUnder18', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='PctOver65', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='PctFemale', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='PctWhite', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='PctBachelors', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='PctDem', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='PctGop', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]

In [5]:
classifier = tf.estimator.LinearRegressor(feature_columns=feature_columns, model_dir='./tmp-training-checkpoints')

# Run training for 7 epochs (7 times through our entire dataset)
# You can experiment with this value for your own dataset
classifier.train(
    input_fn=lambda: my_input_fn(train_file, 7))

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': './tmp-training-checkpoints', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x10fe54780>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 1 into ./tmp-training-checkpoints/model.ckpt.
INFO:tensorflow:loss = 1.3773992, step

<tensorflow.python.estimator.canned.linear.LinearRegressor at 0x10fe544a8>

In [6]:
results = classifier.evaluate(input_fn=lambda: my_input_fn(test_file, 1))
for key in sorted(results):
  print('%s: %s' % (key, results[key]))

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-03-20-15:10:09
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from ./tmp-training-checkpoints/model.ckpt-2188
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2018-03-20-15:10:09
INFO:tensorflow:Saving dict for global step 2188: average_loss = 0.0013813323, global_step = 2188, loss = 0.010978901
average_loss: 0.0013813323
global_step: 2188
loss: 0.010978901
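
The `global_step` of 2188 follows from the input pipeline: `my_input_fn` repeats the dataset 7 times and then batches by 8, so each training step consumes one batch of up to 8 rows. The back-of-the-envelope check below assumes the last partial batch is kept, which is the default behaviour of `Dataset.batch`. The row count it arrives at is inferred from the step count and is not stated anywhere in the notebook.

```python
import math

def training_steps(num_rows, epochs, batch_size):
    """Number of batches when the dataset is repeated first and batched afterwards."""
    return math.ceil(num_rows * epochs / batch_size)

# Row counts that would yield exactly 2188 steps with 7 epochs and batch size 8.
candidates = [n for n in range(1, 5000) if training_steps(n, 7, 8) == 2188]
print(candidates)  # → [2500]
```

Under these assumptions the step count is consistent with a training file of 2500 rows.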


In [7]:
# Generate predictions on 3 counties
prediction_input = {
    'PctUnder18': [23.9, 25.7, 10.6],
    'PctOver65': [17.6,24.7,15.8],
    'PctFemale': [50.0,48.5,53.5],
    'PctWhite':[0.965, 0.97, 0.75],
    'PctBachelors':[12.7, 17.0, 49.8],
    'PctDem': [0.3227832512315271, 0.09475032010243278, 0.6346801346801347],
    'PctGop': [0.6545566502463054, 0.8911651728553138, 0.3468013468013468]
}

def test_input_fn():
    dataset = tf.data.Dataset.from_tensors(prediction_input)
    return dataset

# Predict all our prediction_input
pred_results = classifier.predict(input_fn=test_input_fn)

In [8]:
# Actual values for the raw prediction data:
# 1) 23% Clinton
# 2) 5% Clinton
# 3) 69% Clinton

for pred in enumerate(pred_results):
    print(pred)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from ./tmp-training-checkpoints/model.ckpt-2188
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
(0, {'predictions': array([0.22835457], dtype=float32)})
(1, {'predictions': array([0.02758891], dtype=float32)})
(2, {'predictions': array([0.6796286], dtype=float32)})
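
To put the three predictions next to the actual vote shares listed in the comment of the cell above, the following snippet computes the absolute error per county. The numbers are copied from the cell output and from the comment (which gives rounded percentages); nothing is recomputed by the model here.

```python
# Values copied from the prediction output above.
predicted = [0.22835457, 0.02758891, 0.6796286]
# Actual shares from the comment: 23%, 5% and 69% Clinton (rounded).
actual = [0.23, 0.05, 0.69]

abs_errors = [abs(p - a) for p, a in zip(predicted, actual)]
mean_abs_error = sum(abs_errors) / len(abs_errors)

for i, (p, a, err) in enumerate(zip(predicted, actual, abs_errors)):
    print('county %d: predicted %.3f, actual %.2f, abs. error %.3f' % (i, p, a, err))
print('mean absolute error: %.3f' % mean_abs_error)
```

All three predictions land within about 2.5 percentage points of the rounded actual values, with the second county showing the largest gap.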


## Step 2 - Export the Trained Model

Persist the model as a TensorFlow SavedModel.

In [9]:
def create_input_spec(columns):
    # One float32 placeholder per feature column, keyed by the column name.
    # build_raw_serving_input_receiver_fn replaces the first dimension with
    # the batch dimension, so the exported inputs have shape (?,).
    return {
        column.key: tf.placeholder(dtype=tf.float32, shape=[1,])
        for column in columns
    }

In [10]:
%%bash
rm -rf ../saved_model

In [11]:
saved_model_dir = '../saved_model/'
serving_input_receiver = tf.estimator.export.build_raw_serving_input_receiver_fn(create_input_spec(feature_columns))
export_dir = classifier.export_savedmodel(export_dir_base=saved_model_dir, serving_input_receiver_fn=serving_input_receiver, as_text=True)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predict']
INFO:tensorflow:Signatures EXCLUDED from export because they cannot be be served via TensorFlow Serving APIs:
INFO:tensorflow:'serving_default' : Regression input must be a single string Tensor; got {'PctUnder18': <tf.Tensor 'Placeholder:0' shape=(?,) dtype=float32>, 'PctOver65': <tf.Tensor 'Placeholder_1:0' shape=(?,) dtype=float32>, 'PctFemale': <tf.Tensor 'Placeholder_2:0' shape=(?,) dtype=float32>, 'PctWhite': <tf.Tensor 'Placeholder_3:0' shape=(?,) dtype=float32>, 'PctBachelors': <tf.Tensor 'Placeholder_4:0' shape=(?,) dtype=float32>, 'PctDem': <tf.Tensor 'Placeholder_5:0' shape=(?,) dtype=float32>, 'PctGop': <tf.Tensor 'Placeholder_6:0' shape=(?,) dtype=float32>}
INFO:tensorflow:'regression' : Regression input m

In [12]:
export_dir = export_dir.decode("utf-8") 
print("Model exported to: %s" % export_dir)
print("Model version: %s" % export_dir.split('/')[-1])

Model exported to: ../saved_model/1521558615
Model version: 1521558615
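
`export_savedmodel` writes each export into a subdirectory named after the Unix timestamp of the export, which is why the version above is `1521558615`. A serving process such as the Flask service will usually want the most recent export. The helpers below are a sketch of one way to find it; the names `latest_export_dir` and `export_time` are assumptions for illustration and are not part of TensorFlow.

```python
import os
from datetime import datetime, timezone

def latest_export_dir(export_dir_base):
    """Return the path of the numerically largest timestamped subdirectory."""
    versions = [
        name for name in os.listdir(export_dir_base)
        # Skip anything that is not a purely numeric directory name.
        if name.isdigit() and os.path.isdir(os.path.join(export_dir_base, name))
    ]
    if not versions:
        raise FileNotFoundError('no exported model found in %s' % export_dir_base)
    return os.path.join(export_dir_base, max(versions, key=int))

def export_time(export_dir):
    """Interpret the version directory name as a UTC timestamp."""
    version = os.path.basename(os.path.normpath(export_dir))
    return datetime.fromtimestamp(int(version), tz=timezone.utc)

print(export_time('../saved_model/1521558615'))  # → 2018-03-20 15:10:15+00:00
```

Comparing the names as integers rather than as strings keeps the ordering correct even if the directory names ever differ in length.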


## Step 3 - Load the Trained Model

Load the saved model from the given directory.

In [13]:
predict_fn = tf.contrib.predictor.from_saved_model(export_dir, signature_def_key='predict')

INFO:tensorflow:Restoring parameters from b'../saved_model/1521558615/variables/variables'


In [14]:
predict_fn

SavedModelPredictor with feed tensors {'PctBachelors': <tf.Tensor 'Placeholder_4:0' shape=(?,) dtype=float32>, 'PctWhite': <tf.Tensor 'Placeholder_3:0' shape=(?,) dtype=float32>, 'PctUnder18': <tf.Tensor 'Placeholder:0' shape=(?,) dtype=float32>, 'PctGop': <tf.Tensor 'Placeholder_6:0' shape=(?,) dtype=float32>, 'PctDem': <tf.Tensor 'Placeholder_5:0' shape=(?,) dtype=float32>, 'PctFemale': <tf.Tensor 'Placeholder_2:0' shape=(?,) dtype=float32>, 'PctOver65': <tf.Tensor 'Placeholder_1:0' shape=(?,) dtype=float32>} and fetch_tensors {'predictions': <tf.Tensor 'linear/linear_model/weighted_sum:0' shape=(?, 1) dtype=float32>}

In [15]:
prediction_input_single = { 
        'PctUnder18': [23.9],
        'PctOver65': [17.6],
        'PctFemale': [50.0],
        'PctWhite': [0.965],
        'PctBachelors': [12.7],
        'PctDem': [0.3227832512315271],
        'PctGop': [0.6545566502463054]
}

pred_results = predict_fn(prediction_input_single)

In [16]:
pred_results

{'predictions': array([[0.22835457]], dtype=float32)}

In [17]:
for pred in enumerate(pred_results['predictions']):
    print(pred)

(0, array([0.22835457], dtype=float32))


In [18]:
prediction_input = {
    'PctUnder18': [23.9, 25.7, 10.6],
    'PctOver65': [17.6,24.7,15.8],
    'PctFemale': [50.0,48.5,53.5],
    'PctWhite':[0.965, 0.97, 0.75],
    'PctBachelors':[12.7, 17.0, 49.8],
    'PctDem': [0.3227832512315271, 0.09475032010243278, 0.6346801346801347],
    'PctGop': [0.6545566502463054, 0.8911651728553138, 0.3468013468013468]
}
pred_results = predict_fn(prediction_input)

In [19]:
pred_results

{'predictions': array([[0.22835457],
        [0.02758891],
        [0.6796286 ]], dtype=float32)}

In [20]:
for pred in enumerate(pred_results['predictions']):
    print(pred)

(0, array([0.22835457], dtype=float32))
(1, array([0.02758891], dtype=float32))
(2, array([0.6796286], dtype=float32))
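
`predict_fn` expects a column-oriented dictionary (one list per feature), whereas a JSON client of a web service would more naturally send one object per county. The following sketch converts from the row layout to the column layout; the helper name `records_to_columns` and the record format are assumptions for illustration, not something defined by the notebook or by TensorFlow.

```python
def records_to_columns(records, feature_names):
    """Turn a list of per-row dicts into a dict of per-feature lists."""
    return {name: [float(row[name]) for row in records] for name in feature_names}

feature_names = ['PctUnder18', 'PctOver65', 'PctFemale', 'PctWhite',
                 'PctBachelors', 'PctDem', 'PctGop']

# Hypothetical request body: the first two counties from prediction_input,
# written as one record per county.
records = [
    {'PctUnder18': 23.9, 'PctOver65': 17.6, 'PctFemale': 50.0, 'PctWhite': 0.965,
     'PctBachelors': 12.7, 'PctDem': 0.3227832512315271, 'PctGop': 0.6545566502463054},
    {'PctUnder18': 25.7, 'PctOver65': 24.7, 'PctFemale': 48.5, 'PctWhite': 0.97,
     'PctBachelors': 17.0, 'PctDem': 0.09475032010243278, 'PctGop': 0.8911651728553138},
]

columns = records_to_columns(records, feature_names)
print(columns['PctUnder18'])  # → [23.9, 25.7]
```

The resulting `columns` dictionary has the same layout as `prediction_input` and could be passed to `predict_fn` as is. In the other direction, `pred_results['predictions']` is a NumPy array of shape `(n, 1)`, so `.ravel().tolist()` turns it into a flat list that can be serialized to JSON.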


## Final Thoughts

In [21]:
predict_fn.fetch_tensors

{'predictions': <tf.Tensor 'linear/linear_model/weighted_sum:0' shape=(?, 1) dtype=float32>}

In [22]:
predict_fn.feed_tensors

{'PctBachelors': <tf.Tensor 'Placeholder_4:0' shape=(?,) dtype=float32>,
 'PctDem': <tf.Tensor 'Placeholder_5:0' shape=(?,) dtype=float32>,
 'PctFemale': <tf.Tensor 'Placeholder_2:0' shape=(?,) dtype=float32>,
 'PctGop': <tf.Tensor 'Placeholder_6:0' shape=(?,) dtype=float32>,
 'PctOver65': <tf.Tensor 'Placeholder_1:0' shape=(?,) dtype=float32>,
 'PctUnder18': <tf.Tensor 'Placeholder:0' shape=(?,) dtype=float32>,
 'PctWhite': <tf.Tensor 'Placeholder_3:0' shape=(?,) dtype=float32>}
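
Since `feed_tensors` exposes the input names the exported model expects, a serving layer does not need to hard-code the feature list. A minimal sketch of such a generic check is shown below. In a real service `expected_keys` would come from `predict_fn.feed_tensors.keys()`; here it is written out by hand so that the snippet runs without TensorFlow. The function name `validate_feed` is hypothetical.

```python
def validate_feed(feed, expected_keys):
    """Return a list of problems found in a column-oriented feed dict."""
    expected = set(expected_keys)
    problems = []
    missing = sorted(expected - set(feed))
    unexpected = sorted(set(feed) - expected)
    if missing:
        problems.append('missing features: %s' % ', '.join(missing))
    if unexpected:
        problems.append('unexpected features: %s' % ', '.join(unexpected))
    # All feature lists must describe the same number of rows.
    lengths = {len(values) for values in feed.values()}
    if len(lengths) > 1:
        problems.append('features have different lengths: %s' % sorted(lengths))
    return problems

# Stand-in for predict_fn.feed_tensors.keys() (assumption for this sketch).
expected_keys = ['PctBachelors', 'PctDem', 'PctFemale', 'PctGop',
                 'PctOver65', 'PctUnder18', 'PctWhite']

incomplete_feed = {'PctUnder18': [23.9], 'PctOver65': [17.6], 'PctFemale': [50.0]}
print(validate_feed(incomplete_feed, expected_keys))
# → ['missing features: PctBachelors, PctDem, PctGop, PctWhite']
```

Returning a list of messages instead of raising on the first problem lets a web service report everything that is wrong with a request in a single response.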