# Model Deployment using TensorFlow

## 1. Introduction
In this workbook, we train a simple TensorFlow model and deploy it for inference.
We start from TensorFlow's [premade estimator iris data example](https://www.tensorflow.org/tutorials/estimator/premade) and add MLflow tracking.
The example trains a `tf.estimator.DNNClassifier` on the [iris dataset](https://archive.ics.uci.edu/ml/datasets/iris) and predicts on a validation set.
We then show how to load the saved model back as a generic `mlflow.pyfunc` model so that it can make predictions.


## 2. Imports and Dependencies
The packages needed are loaded next. `tensorflow` and `mlflow` do most of the work in this tutorial. The `requests` package is used later to query the deployed model, and `json` handles the request and response payloads.

In [1]:
import os
import sys
import mlflow
import numpy as np
import mlflow.tensorflow
from mlflow import pyfunc
import tensorflow as tf
import pandas as pd
import tempfile
import shutil

# Suppress warnings
import warnings
warnings.filterwarnings("ignore")

2021-11-14 20:53:26.210552: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-11-14 20:53:26.210591: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


## MLflow for experiment tracking and model deployment

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It tackles four primary functions:

- Tracking experiments to record and compare parameters and results (MLflow Tracking).
- Packaging ML code in a reusable, reproducible form so it can be shared or moved to production (MLflow Projects).
- Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models).
- Providing a central model store to collaboratively manage the full lifecycle of an MLflow Model, including model versioning, stage transitions, and annotations (MLflow Model Registry).

More information is available in the [MLflow documentation](https://www.mlflow.org/docs/latest/index.html#).



![image.png](https://www.mlflow.org/docs/latest/_images/scenario_4.png)

- In the diagram, localhost is the server on which this notebook is running.

- The tracking server is the server whose address is stored in the environment variable `TRACKING_URL`; print it with `os.environ.get("TRACKING_URL")`.

- An MLflow client is then created to communicate with the tracking server.
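
The `TRACKING_URL` lookup above returns `None` when the variable is unset, which only surfaces later as a confusing connection error. A minimal sketch of a stricter lookup follows; the helper name `resolve_tracking_uri` and its error message are illustrative and not part of MLflow, and the address in the example is a placeholder.

```python
import os


def resolve_tracking_uri(env=os.environ, var_name="TRACKING_URL"):
    """Return the tracking server address stored in `var_name`.

    Hypothetical helper: raises instead of returning None when unset.
    """
    uri = env.get(var_name)
    if not uri:
        raise RuntimeError(f"{var_name} is not set; cannot reach the tracking server")
    # Drop a trailing slash so the address has one canonical form.
    return uri.rstrip("/")


# Example with an explicit mapping instead of the real environment.
print(resolve_tracking_uri({"TRACKING_URL": "http://127.0.0.1:5000/"}))
```

The returned string could then be passed to `MlflowClient(tracking_uri=...)` and `mlflow.set_tracking_uri(...)` in place of the raw `os.environ.get` result.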

In [2]:
from mlflow import pyfunc
from mlflow.tracking import MlflowClient

# Point MLflow at the tracking server named by the TRACKING_URL environment variable
tracking_uri = os.environ.get("TRACKING_URL")
client = MlflowClient(tracking_uri=tracking_uri)
mlflow.set_tracking_uri(tracking_uri)

## Create an experiment in the MLflow database using the MLflow client

- Get the list of all experiments (click the **Experiments** tab on the sidebar to see the list)
- Create a new experiment named *tf_deployment* if it doesn't exist
- Set *tf_deployment* as the active experiment under which different **runs** are tracked

## MLflow Entity Hierarchy

- Experiment 1
    - Run 1
        - Parameters
        - Metrics
        - Artifacts
            - Folder 1
                - File 1
                - File 2
            - Folder 2 
    - Run 2
    - Run 3

- Experiment 2
- Experiment 3        
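
The hierarchy above can be pictured as nested containers. The sketch below is purely illustrative: the `Run` and `Experiment` classes are made up for this example and are not MLflow's own entity classes.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    name: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    # Maps an artifact folder name to the list of files stored in it.
    artifacts: dict = field(default_factory=dict)


@dataclass
class Experiment:
    name: str
    runs: list = field(default_factory=list)


exp = Experiment("Experiment 1")
exp.runs.append(
    Run(
        "Run 1",
        params={"batch_size": 100},
        artifacts={"Folder 1": ["File 1", "File 2"], "Folder 2": []},
    )
)
exp.runs.append(Run("Run 2"))

# Each run carries its own parameters, metrics, and artifact folders.
for run in exp.runs:
    print(run.name, sorted(run.artifacts))
```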

In [3]:
# Create the experiment on first use, then make it the active one.
# Note: newer MLflow releases replace list_experiments() with search_experiments().
experiments = client.list_experiments()
experiment_names = [exp.name for exp in experiments]
experiment_name = "tf_deployment"
if experiment_name not in experiment_names:
    mlflow.create_experiment(experiment_name)
mlflow.set_experiment(experiment_name)


## Python Class for inference

- `ModelWrapper` is derived from `mlflow.pyfunc.PythonModel` ([more info](https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html))
- The `load_context()` member function loads the model; here it loads the exported TensorFlow estimator together with its weights
- The `predict()` member function takes an input and returns the classification output
- An object of this class will be saved as a pickle file in blob storage

In [4]:
# Model wrapper that exposes the exported TensorFlow SavedModel through the generic pyfunc interface
class ModelWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        import numpy as np
        import tensorflow as tf
        self.model = tf.saved_model.load(context.artifacts['model_path'])
        print("Model initialized")
    
    def predict(self, context, dfeval):
        predictions = self.model.signatures["predict"](dfeval)
        return predictions


## Register a model using MLflow

- Log user-defined parameters in a remote database through a remote server
- Create a `model_wrapper` object using the `ModelWrapper` class defined in the cell above
- Create a default conda environment that needs to be installed on the Docker container that serves the REST API
- Save the model object as a pickle file and the conda environment as artifacts (files) in S3 or Blob Storage

## 3. Some utility functions

In [5]:
# Some utility functions to load the data and create training functions

def load_data(y_name="Species"):
    """Returns the iris dataset as (train_x, train_y), (test_x, test_y)."""
    train_path = tf.keras.utils.get_file(TRAIN_URL.split("/")[-1], TRAIN_URL)
    test_path = tf.keras.utils.get_file(TEST_URL.split("/")[-1], TEST_URL)

    train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
    train_x, train_y = train, train.pop(y_name)

    test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
    test_x, test_y = test, test.pop(y_name)

    return (train_x, train_y), (test_x, test_y)


def train_input_fn(features, labels, batch_size):
    """An input function for training"""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)

    # Return the dataset.
    return dataset


def eval_input_fn(features, labels, batch_size):
    """An input function for evaluation or prediction"""
    features = dict(features)
    if labels is None:
        # No labels, use only features.
        inputs = features
    else:
        inputs = (features, labels)

    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices(inputs)

    # Batch the examples
    assert batch_size is not None, "batch_size must not be None"
    dataset = dataset.batch(batch_size)

    # Return the dataset.
    return dataset
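
To make the `shuffle(1000).repeat().batch(batch_size)` chain in `train_input_fn` more concrete, here is a plain-Python sketch of the same idea. It is an approximation for illustration only, not how `tf.data` is implemented: it assumes the shuffle buffer is at least as large as the dataset (true here, since the training file is far smaller than 1000 rows), and the function name `shuffle_repeat_batch` is made up.

```python
import itertools
import random


def shuffle_repeat_batch(examples, batch_size, seed=0):
    """Yield endless shuffled batches, re-shuffling on every pass over the data."""
    rng = random.Random(seed)

    def endless_examples():
        while True:  # mirrors .repeat(): start over when the data runs out
            epoch = list(examples)
            rng.shuffle(epoch)  # mirrors .shuffle() with a buffer >= dataset size
            yield from epoch

    stream = endless_examples()
    while True:
        # mirrors .batch(): group consecutive examples, even across epoch boundaries
        yield list(itertools.islice(stream, batch_size))


batches = shuffle_repeat_batch(range(5), batch_size=4)
first_three = list(itertools.islice(batches, 3))
print([len(batch) for batch in first_three])  # [4, 4, 4]
```

Because the stream never ends, training has to be bounded explicitly, which is why `classifier.train` is later called with `steps=train_steps`.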


In [6]:
batch_size = 100
train_steps = 1000
TRAIN_URL = "http://download.tensorflow.org/data/iris_training.csv"
TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"

CSV_COLUMN_NAMES = ["SepalLength", "SepalWidth", "PetalLength", "PetalWidth", "Species"]
SPECIES = ["Setosa", "Versicolor", "Virginica"]

# Fetch the data
(train_x, train_y), (test_x, test_y) = load_data()

# Feature columns describe how to use the input.
my_feature_columns = []
for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))

In [7]:
# Two hidden layers of 10 nodes each.
hidden_units = [10, 10]

# Build 2 hidden layer DNN with 10, 10 units respectively.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=hidden_units,
    # The model must choose between 3 classes.
    n_classes=3,
)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpjrockdiq', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


2021-11-14 20:53:28.242913: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-11-14 20:53:28.242948: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-11-14 20:53:28.242968: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (knrphwgg-6889997f5d-w24t4): /proc/driver/nvidia/version does not exist
2021-11-14 20:53:28.243217: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
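
As a rough size check for the network built above, the trainable weights can be counted by hand. This sketch assumes plain fully connected layers with one bias per unit, four numeric input features, and three output logits; it ignores optimizer state, and the helper `count_dense_params` is made up for illustration.

```python
def count_dense_params(n_inputs, hidden_units, n_classes):
    """Count weights and biases of a stack of fully connected layers."""
    sizes = [n_inputs, *hidden_units, n_classes]
    total = 0
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


print(count_dense_params(4, [10, 10], 3))  # 50 + 110 + 33 = 193
```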


In [8]:
# Train the Model.
estimator = classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size),
    steps=train_steps,
)

Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpjrockdiq/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 2.0530477, step = 0
INFO:tensorflow:global_step/sec: 737.546
INFO:tensorflow:loss = 1.6492424, step = 100 (0.136 sec)
INFO:tensorflow:global_step/sec: 1106.08
INFO:tensorflow:loss = 1.3105888, step = 200 (0.090 sec)
INFO:tensorflow:global_step/sec

In [9]:
# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda: eval_input_fn(test_x, test_y, batch_size)
)

print("\nTest set accuracy: {accuracy:0.3f}\n".format(**eval_result))

# Generate predictions from the model
expected = ["Setosa", "Versicolor", "Virginica"]
predict_x = {
    "SepalLength": [5.1, 5.9, 6.9],
    "SepalWidth": [3.3, 3.0, 3.1],
    "PetalLength": [1.7, 4.2, 5.4],
    "PetalWidth": [0.5, 1.5, 2.1],
}

predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(predict_x, labels=None, batch_size=batch_size)
)

old_predictions = []
template = '\nPrediction is "{}" ({:.1f}%), expected "{}"'

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-11-14T20:53:30
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpjrockdiq/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 0.17968s
INFO:tensorflow:Finished evaluation at 2021-11-14-20:53:30
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.53333336, average_loss = 1.2849653, global_step = 1000, loss = 1.2849653
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmpjrockdiq/model.ckpt-1000

Test set accuracy: 0.533



In [10]:
for pred_dict, expec in zip(predictions, expected):
    class_id = pred_dict["class_ids"][0]
    probability = pred_dict["probabilities"][class_id]

    print(template.format(SPECIES[class_id], 100 * probability, expec))

    old_predictions.append(SPECIES[class_id])

# Placeholder tf.Variables, one per feature, that describe the inputs the exported model will accept.
feat_specifications = {
    "SepalLength": tf.Variable([], dtype=tf.float64, name="SepalLength"),
    "SepalWidth": tf.Variable([], dtype=tf.float64, name="SepalWidth"),
    "PetalLength": tf.Variable([], dtype=tf.float64, name="PetalLength"),
    "PetalWidth": tf.Variable([], dtype=tf.float64, name="PetalWidth"),
}

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpjrockdiq/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.

Prediction is "Setosa" (45.9%), expected "Setosa"

Prediction is "Virginica" (55.6%), expected "Versicolor"

Prediction is "Virginica" (65.1%), expected "Virginica"
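
Each printed line pairs the most likely class with its probability. The relationship between the `probabilities` and `class_ids` entries can be sketched in plain Python, assuming the classifier applies a softmax over three class scores; the logits below are made-up numbers, not values from the trained model.

```python
import math

SPECIES = ["Setosa", "Versicolor", "Virginica"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.2, 0.1, 0.9]  # hypothetical scores for one flower
probabilities = softmax(logits)
# The predicted class is the index of the largest probability.
class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
print(f'Prediction is "{SPECIES[class_id]}" ({100 * probabilities[class_id]:.1f}%)')
```

A top probability close to one third, as in the output above, means the three classes are barely separated, which is consistent with the low test accuracy reported earlier.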


In [11]:
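# Export the trained estimator as a SavedModel and log it to MLflow
receiver_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(feat_specifications)
artifact_path = './tf-model/'
# Remove any previous export; ignore_errors avoids a failure when the folder does not exist yet
shutil.rmtree(artifact_path, ignore_errors=True)
# export_saved_model returns the path of the new export directory as bytes
saved_estimator_path = classifier.export_saved_model('/tmp', receiver_fn).decode("utf-8")
shutil.move(saved_estimator_path, artifact_path)
model_artifacts = {"model_path" : artifact_path}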
env = mlflow.tensorflow.get_default_conda_env()
model_wrapper = ModelWrapper()
with mlflow.start_run():
    mlflow.log_param('batch_size', batch_size)
    mlflow.log_param('train_steps', train_steps)
    mlflow.log_param('csv_column_names', CSV_COLUMN_NAMES)
    mlflow.log_param('species', SPECIES)
    mlflow.pyfunc.log_model("tf_model", python_model=model_wrapper, artifacts=model_artifacts, conda_env=env)


INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predict']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:Signatures EXCLUDED from export because they cannot be be served via TensorFlow Serving APIs:
INFO:tensorflow:'serving_default' : Classification input must be a single string Tensor; got {'SepalLength': <tf.Tensor 'Placeholder:0' shape=(None,) dtype=float64>, 'SepalWidth': <tf.Tensor 'Placeholder_1:0' shape=(None,) dtype=float64>, 'PetalLength': <tf.Tensor 'Placeholder_2:0' shape=(None

In [20]:
artifact_path

'./tf-model/'

## 4. Deploying the model
The code above logs a model that appears under the Experiments tab. For more information, see the [RocketML user guide](https://rocketml.gitbook.io/rocketml-user-guide/experiments). After the model is deployed, its URL can be used to send queries, as shown below.

## 5. Query from the server

There are two ways to query the deployed model: the Python `requests` library or the `curl` shell command. The cell below uses `requests`.
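
Both methods send the same JSON body. As a sketch of what goes over the wire, the snippet below builds a pandas-split payload by hand and prints an equivalent `curl` command. The URL is the local placeholder endpoint used in this notebook and should be replaced with the deployed model's URL; the content type matches the one used here, and other MLflow versions may expect a different body layout.

```python
import json
import shlex

url = "http://127.0.0.1:5000/invocations"  # placeholder endpoint
columns = ["SepalLength", "SepalWidth", "PetalLength", "PetalWidth"]
rows = [[5.1, 3.3, 1.7, 0.5], [5.9, 3.0, 4.2, 1.5]]

# "split" orientation: column names and row values are stored separately.
payload = json.dumps({"columns": columns, "data": rows})

# shlex.quote protects the header and the JSON body from shell interpretation.
curl_command = " ".join(
    [
        "curl",
        "-X", "POST",
        "-H", shlex.quote("Content-Type: application/json; format=pandas-split"),
        "-d", shlex.quote(payload),
        shlex.quote(url),
    ]
)
print(curl_command)
```

The printed command can be pasted into a shell once the model endpoint is reachable.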

In [22]:
import requests
import json

url = "http://127.0.0.1:5000/invocations"
headers = {"Content-Type":"application/json; format=pandas-split"}

# Run inference on three sample flowers
predict_data = [[5.1, 3.3, 1.7, 0.5], [5.9, 3.0, 4.2, 1.5], [6.9, 3.1, 5.4, 2.1]]
df = pd.DataFrame(
    data=predict_data,
    columns=["SepalLength", "SepalWidth", "PetalLength", "PetalWidth"],
)

print(df)
response = requests.post(url, data=df.to_json(orient="split", index=False), headers=headers)
if response.status_code == 200:
    output = response.json()
    print(output)
else:
    # Any non-200 status ends up here, including request errors such as 400
    print(response.status_code)
    print("REST API deployment is in progress -- please try again in a few minutes!")

   SepalLength  SepalWidth  PetalLength  PetalWidth
0          5.1         3.3          1.7         0.5
1          5.9         3.0          4.2         1.5
2          6.9         3.1          5.4         2.1
400
REST API deployment is in progress -- please try again in a few minutes!


In [23]:
print(response.json()['stack_trace'])

Traceback (most recent call last):
  File "/home/ubuntu/.conda/envs/mlflow-fd702d538505a3f80cc3fa48a53c9859f694d90f/lib/python3.7/site-packages/mlflow/pyfunc/scoring_server/__init__.py", line 303, in transformation
    raw_predictions = model.predict(data)
  File "/home/ubuntu/.conda/envs/mlflow-fd702d538505a3f80cc3fa48a53c9859f694d90f/lib/python3.7/site-packages/mlflow/pyfunc/__init__.py", line 608, in predict
    return self._model_impl.predict(data)
  File "/home/ubuntu/.conda/envs/mlflow-fd702d538505a3f80cc3fa48a53c9859f694d90f/lib/python3.7/site-packages/mlflow/pyfunc/model.py", line 296, in predict
    return self.python_model.predict(self.context, model_input)
  File "/tmp/ipykernel_2159/3779823312.py", line 10, in predict
  File "/home/ubuntu/.conda/envs/mlflow-fd702d538505a3f80cc3fa48a53c9859f694d90f/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1707, in __call__
    return self._call_impl(args, kwargs)
  File "/home/ubuntu/.conda/envs/mlflow-fd702d538

In [30]:
pyfunc_model = pyfunc.load_model('')


ResourceNotFoundError: The specified blob does not exist.
RequestId:5d7e63d1-901e-0041-1d95-d97cd0000000
Time:2021-11-14T20:23:35.9277894Z
ErrorCode:BlobNotFound