# TensorFlow Serving

* Context: clothing prediction
* TensorFlow Serving (TF Serving) is built specifically for serving TensorFlow models
* It is a library written in C++
* It only does inference
* It expects already preprocessed data (the image)
* So we need to create a "gateway" that does the preprocessing

![workflow](Screenshot_01.png)

* For the gateway we will use Flask
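
As a rough illustration of what "preprocessing" means for the gateway, here is a minimal sketch of the pixel scaling step. It assumes the model was trained with the standard Keras Xception preprocessing, which maps 8-bit channel values to the range [-1, 1]; the function names are made up for this example, and resizing the image to 299x299 is not shown. Plain Python lists are used to keep the sketch dependency-free; a real gateway would more likely work with NumPy arrays.

```python
def scale_pixel(value):
    """Map one 8-bit channel value (0..255) to the range [-1.0, 1.0]."""
    return value / 127.5 - 1.0


def preprocess(image):
    """Scale an image and add a batch axis.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    The result has the layout (1, height, width, 3) that the model expects.
    """
    scaled = [[[scale_pixel(c) for c in pixel] for pixel in row] for row in image]
    return [scaled]  # batch of one image


# A 1x2 "image": one black pixel and one white pixel
batch = preprocess([[(0, 0, 0), (255, 255, 255)]])
print(batch)
```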

## TensorFlow Serving

* Convert our trained model to the format TF Serving expects (the SavedModel format)
* Run TF Serving locally using Docker
* Invoke the model from Jupyter

Use a model from week8:

In [3]:
import os
import tensorflow as tf
from tensorflow import keras

In [5]:
path = "../week8"
model = keras.models.load_model(os.path.join(path, "xception_v4_05_0.850.h5"))
model

<tensorflow.python.keras.engine.functional.Functional at 0x7f892413b400>

In [6]:
tf.saved_model.save(model, 'clothing_model') # model, folder name

INFO:tensorflow:Assets written to: clothing_model/assets


In [14]:
!ls -lrh clothing_model

total 3,5M
drwxr-xr-x 2 frauke frauke 4,0K Mai  4 14:13 variables
-rw-rw-r-- 1 frauke frauke 3,5M Mai  4 14:13 saved_model.pb
drwxr-xr-x 2 frauke frauke 4,0K Mai  4 14:13 assets


In [15]:
!tree clothing_model

clothing_model
├── assets
├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00001
    └── variables.index

2 directories, 3 files


In [16]:
!ls -lRh clothing_model

clothing_model:
total 3,5M
drwxr-xr-x 2 frauke frauke 4,0K Mai  4 14:13 assets
-rw-rw-r-- 1 frauke frauke 3,5M Mai  4 14:13 saved_model.pb
drwxr-xr-x 2 frauke frauke 4,0K Mai  4 14:13 variables

clothing_model/assets:
total 0

clothing_model/variables:
total 83M
-rw-rw-r-- 1 frauke frauke 83M Mai  4 14:13 variables.data-00000-of-00001
-rw-rw-r-- 1 frauke frauke 15K Mai  4 14:13 variables.index


In [18]:
!saved_model_cli show --dir clothing_model --all


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_13'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 299, 299, 3)
        name: serving_default_input_13:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['dense_7'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict

Defined Functions:
  Function Name: '__call__'
    Option #1
      Callable with:
        Argument #1
          inputs: Tensor

* We are interested in ```signature_def['serving_default']```: save the names of its input and output in ```model-description.txt```
* Now use Docker to run TF Serving locally with this model:
    * We use the official image from TensorFlow
    * ```
       docker run -it --rm \
       -p 8500:8500 \
       -v "$(pwd)/clothing_model:/models/clothing-model/1" \
       -e MODEL_NAME="clothing-model" \
       tensorflow/serving:2.7.0
       ```
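    * ```-p 8500:8500``` maps port 8500 of the container (the gRPC port of TF Serving) to port 8500 on the host
    * ```-v``` mounts a "volume": the local model folder is mounted into the container under ```/models/clothing-model/1```, where ```1``` is the model version
    * the name of the folder under ```/models``` has to be the same as the ```MODEL_NAME```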
    * ```tensorflow/serving:2.7.0``` is the image name
* Now send something to this model: code in notebook ```tf-serving-connect.ipynb```
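
For orientation, a request to the running container could look roughly like the sketch below. This is not necessarily what ```tf-serving-connect.ipynb``` contains; it assumes the ```grpcio``` and ```tensorflow-serving-api``` packages are installed and uses the input and output names reported by ```saved_model_cli``` above. The function name ```predict``` and the timeout value are choices made for this example.

```python
HOST = "localhost:8500"        # gRPC port published by the docker command
MODEL_NAME = "clothing-model"  # must match MODEL_NAME of the container
SIGNATURE = "serving_default"
INPUT_NAME = "input_13"        # input name from saved_model_cli
OUTPUT_NAME = "dense_7"        # output name from saved_model_cli


def predict(X, host=HOST, timeout=20.0):
    """Send a float32 array of shape (n, 299, 299, 3) to TF Serving.

    Returns the raw scores as one flat list of n * 10 floats.
    """
    # Imported lazily so the constants above can be read without
    # the client libraries being installed.
    import grpc
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

    channel = grpc.insecure_channel(host)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = MODEL_NAME
    request.model_spec.signature_name = SIGNATURE
    request.inputs[INPUT_NAME].CopyFrom(tf.make_tensor_proto(X, shape=X.shape))

    response = stub.Predict(request, timeout=timeout)
    return list(response.outputs[OUTPUT_NAME].float_val)
```

Calling ```predict(X)``` only works while the container is running and ```X``` has already been preprocessed.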

## Creating a Preprocessing Service

* Convert the notebook to a Python script
* Wrap the script in a Flask app
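
A minimal sketch of how such a Flask wrapper might be structured is shown below. Everything specific here is an assumption for illustration: the ```/predict``` route, the JSON body with a ```url``` field, the port 9696, and the ```predict_fn``` and ```labels``` arguments, which stand for the converted notebook code and the class names of the model.

```python
def prepare_response(scores, labels):
    """Pair each class label with its score so the result is JSON-friendly."""
    return {label: float(score) for label, score in zip(labels, scores)}


def create_app(predict_fn, labels):
    """Build the gateway app.

    predict_fn (hypothetical): takes an image URL, downloads and preprocesses
    the image, calls TF Serving and returns one score per class.
    labels (hypothetical): class names in the order of the model outputs.
    """
    from flask import Flask, jsonify, request  # requires Flask to be installed

    app = Flask("gateway")

    @app.route("/predict", methods=["POST"])
    def predict_endpoint():
        payload = request.get_json()
        scores = predict_fn(payload["url"])
        return jsonify(prepare_response(scores, labels))

    return app


# Usage (assumed port):
#   app = create_app(my_predict_fn, my_labels)
#   app.run(host="0.0.0.0", port=9696)
```

Keeping ```prepare_response``` separate from the route makes the formatting logic easy to check without starting a server.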

## Run everything locally with Docker-Compose

* Prepare the images
* Install docker-compose (to run two services on one machine)
* Run the service
* Test the service
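
Testing the running services can be as simple as posting one image URL to the gateway. The sketch below uses only the standard library; the gateway address ```http://localhost:9696/predict``` and the ```{"url": ...}``` request body are assumptions for illustration, not something fixed by these notes.

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:9696/predict"  # hypothetical gateway endpoint


def build_request(image_url, gateway_url=GATEWAY_URL):
    """Create a POST request with a JSON body for the gateway."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        gateway_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_gateway(image_url, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(image_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

```call_gateway``` needs both services to be up; ```build_request``` can be inspected on its own to check what would be sent.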

## Introduction to Kubernetes

* The anatomy of a Kubernetes cluster

## Deploy a simple service to Kubernetes

* Install kubectl
* Set up a local Kubernetes cluster with Kind
* Create a Deployment
* Create a service

## Deploy to EKS

* Create an EKS cluster on AWS
* Publish the image to ECR
* Configure kubectl

## Summary
