# Serving Predictions

There are several ways to serve predictions:
1. Using TensorFlow Serving natively
2. Using the GCP API
3. Using Docker
4. Using Flask

This notebook demonstrates Method 1.

# Method 1: Using TensorFlow Serving natively

In [None]:
!pip install -q requests

import requests
import time
import numpy as np
import json
from tensorflow import keras
import tensorflow as tf

In [None]:
from google.colab import drive

drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
!ls

drive  mle-test-20200904-mnj86y-master.zip  sample_data


In [None]:
%cd drive/My Drive/project_folder

/content/drive/My Drive/project_folder


In [None]:
!unzip /content/mle-test-20200904-mnj86y-master.zip -d /content/cookpadtest

Archive:  /content/mle-test-20200904-mnj86y-master.zip
54381689d40f484743728f0a55fecd9c853aea46
   creating: /content/cookpadtest/mle-test-20200904-mnj86y-master/
  inflating: /content/cookpadtest/mle-test-20200904-mnj86y-master/.gitignore  
  inflating: /content/cookpadtest/mle-test-20200904-mnj86y-master/IMPROVEMENTS.md  
  inflating: /content/cookpadtest/mle-test-20200904-mnj86y-master/README.md  
 extracting: /content/cookpadtest/mle-test-20200904-mnj86y-master/RUN_INSTRUCTIONS.md  
 extracting: /content/cookpadtest/mle-test-20200904-mnj86y-master/__init__.py  
  inflating: /content/cookpadtest/mle-test-20200904-mnj86y-master/inference.py  
   creating: /content/cookpadtest/mle-test-20200904-mnj86y-master/model/
  inflating: /content/cookpadtest/mle-test-20200904-mnj86y-master/model/tomato_model.h5  
  inflating: /content/cookpadtest/mle-test-20200904-mnj86y-master/train.py  


### Add TensorFlow Serving distribution URI as a package source

In [None]:
!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt update

deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2943  100  2943    0     0  39240      0 --:--:-- --:--:-- --:--:-- 39770
OK
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3,012 B]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]
Ign:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
Ign:4 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
Hit:5 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Release
Get:6 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:7 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease [21.3 kB]
Get:8 https://deve

In [None]:
!apt-get install tensorflow-model-server

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  tensorflow-model-server
0 upgraded, 1 newly installed, 0 to remove and 77 not upgraded.
Need to get 210 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 2.3.0 [210 MB]
Fetched 210 MB in 3s (68.1 MB/s)
Selecting previously unselected package tensorflow-model-server.
(Reading database ... 144579 files and directories currently installed.)
Preparing to unpack .../tensorflow-model-server_2.3.0_all.deb ...
Unpacking tensorflow-model-server (2.3.0) ...
Setting up tensorflow-model-server (2.3.0) ...


### Load the given model and save it in SavedModel (.pb) format to create a servable

In [None]:

model = keras.models.load_model("/content/cookpadtest/mle-test-20200904-mnj86y-master/model/0/tomato_model.h5",compile=False)

In [None]:

tf.saved_model.save(model, '/content/cookpadtest/mle-test-20200904-mnj86y-master/model/1/')

Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
INFO:tensorflow:Assets written to: /content/cookpadtest/mle-test-20200904-mnj86y-master/model/1/assets
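
TensorFlow Serving expects the model base path to contain one numbered subdirectory per model version, which is why the export above goes to `model/1/`. By default it serves the highest version number it finds. The lookup can be sketched roughly as follows; `latest_version_dir` is a hypothetical helper written for illustration (not a TensorFlow API), and the temporary directory stands in for the real `model/` folder.

```python
import os
import tempfile


def latest_version_dir(base_path):
    """Return the path of the highest numbered subdirectory, or None.

    Hypothetical helper for illustration only; it imitates the default
    "serve the latest version" policy and is not part of TensorFlow.
    """
    versions = [
        name for name in os.listdir(base_path)
        if name.isdigit() and os.path.isdir(os.path.join(base_path, name))
    ]
    if not versions:
        return None
    # Compare numerically so that "10" ranks above "9".
    return os.path.join(base_path, max(versions, key=int))


# Throwaway directory tree standing in for .../model/ (assumed layout).
demo_base = tempfile.mkdtemp()
for name in ("0", "1", "notes"):
    os.mkdir(os.path.join(demo_base, name))

print(os.path.basename(latest_version_dir(demo_base)))
```

Non-numeric folders such as `notes` are skipped, so only version directories are considered.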


### Inspect the SavedModel with saved_model_cli

The command-line utility `saved_model_cli` lists the MetaGraphDefs (the models) and SignatureDefs (the methods you can call) in our SavedModel.

- It shows the input/output types and shapes, along with other metadata.
- Here the input tensor must have shape (-1, 4) and the output has shape (-1, 3): 4 features in and 3 class scores out, for any batch size.
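
A shape of (-1, 4) means any number of rows with exactly 4 values each. A request could be checked against that before it is sent, as in the minimal sketch below. `validate_instances` is a hypothetical helper, not part of TensorFlow or this project, and the default of 4 features is taken from the signature described above.

```python
def validate_instances(instances, n_features=4):
    """Check that every row holds exactly n_features numeric values.

    Returns the (batch_size, n_features) shape on success and raises
    ValueError otherwise. Hypothetical helper for illustration.
    """
    if not instances:
        raise ValueError("at least one instance is required")
    for index, row in enumerate(instances):
        if len(row) != n_features:
            raise ValueError(
                f"row {index} has {len(row)} values, expected {n_features}"
            )
        if not all(isinstance(value, (int, float)) for value in row):
            raise ValueError(f"row {index} contains a non-numeric value")
    return len(instances), n_features


print(validate_instances([[5.1, 3.3, 1.7, 0.5], [5.9, 3.0, 4.2, 1.5]]))
```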

In [None]:
export_path = "/content/cookpadtest/mle-test-20200904-mnj86y-master/model/1/"
!saved_model_cli show --dir {export_path} --all


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['dense_input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 4)
        name: serving_default_dense_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['dense_2'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 3)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
W0913 17:18:32.995490 140529772189568 deprecation.py:506] From /usr/local/lib/python2.7/dist-packages/tensorflow_core/python/ops/res

### Start running TensorFlow Serving

In [None]:
%%bash --bg
# Start TensorFlow Serving in the background; the REST API listens on port 8517.
nohup tensorflow_model_server \
  --rest_api_port=8517 \
  --model_name=saved_model.pb \
  --model_base_path="/content/cookpadtest/mle-test-20200904-mnj86y-master/model/" >server.log 2>&1

Starting job # 0 in a separate thread.


In [None]:
!tail server.log

To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-13 15:38:41.011477: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:199] Restoring SavedModel bundle.
2020-09-13 15:38:41.026201: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:183] Running initialization op on SavedModel bundle at path: /content/cookpadtest/mle-test-20200904-mnj86y-master/model/1
2020-09-13 15:38:41.029593: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:303] SavedModel load for tags { serve }; Status: success: OK. Took 37455 microseconds.
2020-09-13 15:38:41.030362: I tensorflow_serving/servables/tensorflow/saved_model_warmup_util.cc:59] No warmup data file found at /content/cookpadtest/mle-test-20200904-mnj86y-master/model/1/assets.extra/tf_serving_warmup_requests
2020-09-13 15:38:41.030674: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: saved_model.pb version: 1}
2020-09-13 15:38:
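
Besides reading the log, the TensorFlow Serving REST API exposes a model status endpoint at `/v1/models/<model_name>` that reports the state of each loaded version. The sketch below is one way to query it and interpret the reply. It uses the standard library's `urllib` so that it is self-contained; `fetch_model_status` and `is_model_available` are hypothetical helper names, and `sample_status` is an illustrative payload, not output captured from this server.

```python
import json
import urllib.request


def fetch_model_status(host, port, model_name):
    """GET the model status document from the TensorFlow Serving REST API."""
    url = f"http://{host}:{port}/v1/models/{model_name}"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def is_model_available(status):
    """Return True if any version in the status document is AVAILABLE."""
    return any(
        entry.get("state") == "AVAILABLE"
        for entry in status.get("model_version_status", [])
    )


# Illustrative payload in the shape the status endpoint returns;
# the values are made up for this example.
sample_status = {
    "model_version_status": [
        {
            "version": "1",
            "state": "AVAILABLE",
            "status": {"error_code": "OK", "error_message": ""},
        }
    ]
}
print(is_model_available(sample_status))
```

With the server started as above, the call would be `is_model_available(fetch_model_status("localhost", 8517, "saved_model.pb"))`.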

### Check predictions without the REST API, for cross-verification

The `predict_dataset` below is taken from inference.py.

In [None]:
predict_dataset = [
    [5.1, 3.3, 1.7, 0.5],
    [5.9, 3.0, 4.2, 1.5],
    [6.9, 3.1, 5.4, 2.1],
]

In [None]:
class_names = ['Plum', 'Cherry', 'BeefSteak']

In [None]:
predictions = model.predict(predict_dataset)

In [None]:
for i in range(len(predictions)):
  print(class_names[np.argmax(predictions[i])], np.argmax(predictions[i]))

Plum 0
Cherry 1
BeefSteak 2


### Now, we post requests to the REST API

In [None]:
# JSON payload for the REST API call (row format: one entry per example)
data = json.dumps({"signature_name": "serving_default", "instances": predict_dataset})
print('Data: {} ... {}'.format(data[:50], data[-52:]))

Data: {"signature_name": "serving_default", "instances": ... , 0.5], [5.9, 3.0, 4.2, 1.5], [6.9, 3.1, 5.4, 2.1]]}
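
The payload above uses the row format, where `instances` holds one entry per example. The predict API also accepts a columnar format, where `inputs` maps each named input tensor to the whole batch; the reply then arrives under `outputs` instead of `predictions`. The sketch below builds both payloads for comparison. The input name `dense_input` comes from the `saved_model_cli` output earlier, and the variable names are illustrative.

```python
import json

rows = [
    [5.1, 3.3, 1.7, 0.5],
    [5.9, 3.0, 4.2, 1.5],
    [6.9, 3.1, 5.4, 2.1],
]

# Row format: one list entry per example (what this notebook sends).
row_payload = json.dumps(
    {"signature_name": "serving_default", "instances": rows}
)

# Columnar format: one entry per named input tensor, holding the full batch.
column_payload = json.dumps(
    {"signature_name": "serving_default", "inputs": {"dense_input": rows}}
)

print(sorted(json.loads(row_payload)))
print(sorted(json.loads(column_payload)))
```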


In [None]:
headers = {"content-type": "application/json"}

In [None]:
json_response = None

# Retry until the server accepts the connection; it may still be starting up.
while json_response is None:
    try:
        json_response = requests.post('http://localhost:8517/v1/models/saved_model.pb:predict', data=data, headers=headers)
    except requests.exceptions.ConnectionError:
        print("Connection refused by the server.")
        print("Sleeping for 5 seconds before retrying...")
        time.sleep(5)


In [None]:
apipredictions = json.loads(json_response.text)['predictions']


In [None]:
for i in range(len(apipredictions)):
  print(class_names[np.argmax(apipredictions[i])], np.argmax(apipredictions[i]))
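
To complete the cross-verification, the labels from the local model and from the REST API can be compared directly. This is a minimal sketch in plain Python; `argmax` and `labels_for` are hypothetical helpers, and the score rows are made-up values for illustration, not output from the real model.

```python
def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def labels_for(predictions, class_names):
    """Map each row of class scores to its most likely class name."""
    return [class_names[argmax(row)] for row in predictions]


class_names = ["Plum", "Cherry", "BeefSteak"]

# Hypothetical score rows standing in for `predictions` and `apipredictions`.
local_scores = [[0.90, 0.08, 0.02], [0.10, 0.80, 0.10], [0.05, 0.15, 0.80]]
api_scores = [[0.90, 0.08, 0.02], [0.10, 0.80, 0.10], [0.05, 0.15, 0.80]]

local_labels = labels_for(local_scores, class_names)
api_labels = labels_for(api_scores, class_names)
print(local_labels == api_labels)
```

In this notebook the same comparison would take `predictions.tolist()` and `apipredictions` as the two inputs.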

# Method 3: Using GCP (not completed, because it could incur billing and Method 2 already worked)

In [None]:
# Install TensorFlow 2.x (pinned to 2.1.0)
!pip install tensorflow==2.1.0

Collecting tensorflow==2.1.0
  Downloading https://files.pythonhosted.org/packages/85/d4/c0cd1057b331bc38b65478302114194bd8e1b9c2bbc06e300935c0e93d90/tensorflow-2.1.0-cp36-cp36m-manylinux2010_x86_64.whl (421.8MB)
     |████████████████████████████████| 421.8MB 31kB/s 
Collecting tensorflow-estimator<2.2.0,>=2.1.0rc0
  Downloading https://files.pythonhosted.org/packages/18/90/b77c328a1304437ab1310b463e533fa7689f4bfc41549593056d812fab8e/tensorflow_estimator-2.1.0-py2.py3-none-any.whl (448kB)
     |████████████████████████████████| 450kB 41.4MB/s 
Collecting gast==0.2.2
  Downloading https://files.pythonhosted.org/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz
Collecting keras-applications>=1.0.8
  Downloading https://files.pythonhosted.org/packages/71/e3/19762fdfc62877ae9102edf6342d71b28fbfd9dea3d2f96a882ce099b03f/Keras_Applications-1.0.8-py3-none-any.whl (50kB)
     |████████████████████████████████| 51kB 5.9MB/s 

In [None]:
# Load tensorboard 
%load_ext tensorboard

In [None]:
from google.colab import auth
auth.authenticate_user()

In [None]:
# GCP project name
CLOUD_PROJECT = 'gcpessentials-rz'
BUCKET = 'gs://' + CLOUD_PROJECT + '-tf2-models'

In [None]:
!gcloud config set project $CLOUD_PROJECT

Updated property [core/project].


To take a quick anonymous survey, run:
  $ gcloud survey



In [None]:
!gsutil mb $BUCKET
print(BUCKET)

Creating gs://gcpessentials-rz-tf2-models/...
AccessDeniedException: 403 The project to be billed is associated with an absent billing account.
gs://gcpessentials-rz-tf2-models
