## How to create a numeric calculation web service that uses autoscaling GPUs

This notebook demonstrates how to create a web service that carries out numerical computations on a GPU.
It uses the following capabilities:
* Keras/TensorFlow models automatically run on a GPU if one is available
* TensorFlow provides an easy wrapper for most NumPy functions
* It's possible to specify a custom serving function for Keras models
* Cloud AI Platform provides an easy way to deploy models as a web service

It accompanies this blog post:
https://medium.com/@lakshmanok/how-to-create-a-numeric-calculation-web-service-that-uses-autoscaling-gpus-c43b865d867d

## Original function
As an example, let's use a function that calculates the times of sunrise and sunset given the latitude and longitude of a point, the day of the year, and the UTC offset.
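
For reference, the calculation in the code cell that follows can be written out as below. This is a transcription of the code as written (no atmospheric correction), not an independently verified derivation. The symbol names are chosen here for illustration: $\phi$ is the latitude and $\lambda$ the longitude, both in degrees, $d$ is the day number (0 is Jan. 1), and $\Delta_{\mathrm{UTC}}$ is the UTC offset in hours. The corrections $C$ and $E$ are in minutes, and the resulting times are in hours.

```latex
\begin{aligned}
C_{\mathrm{long}}  &= 4\,(\lambda - 15\,\Delta_{\mathrm{UTC}}) \\
B                  &= \frac{2\pi\,(d - 81)}{365} \\
E                  &= 9.87\,\sin 2B - 7.53\,\cos B - 1.5\,\sin B \\
C_{\mathrm{solar}} &= C_{\mathrm{long}} - E \\
\delta             &= \arcsin\!\bigl(\sin(23.45^{\circ})\,\sin B\bigr) \\
H                  &= \frac{1}{15}\cdot\frac{180}{\pi}\,\arccos\!\bigl(-\tan\phi\,\tan\delta\bigr) \\
t_{\mathrm{sunrise}} &= 12 - H - \frac{C_{\mathrm{solar}}}{60}, \qquad
t_{\mathrm{sunset}}   = 12 + H - \frac{C_{\mathrm{solar}}}{60}
\end{aligned}
```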

Note how I'm aliasing sin, cos, etc. to the corresponding TensorFlow functions.

In [1]:
import numpy as np
import math
import tensorflow as tf

tf.debugging.set_log_device_placement(True)

def calc_sunrise_sunset(lat: float, lng: float, dayNo: int, utcOffset: int):
    """
    Specify the location of the point, the day you are interested in (0 is Jan. 1)
    https://math.stackexchange.com/questions/2186683/how-to-calculate-sunrise-and-sunset-times/2199903#2199903
    """
    # aliases
    pi = tf.constant(math.pi)
    sin, cos, asin, acos, tan, floor = tf.math.sin, tf.math.cos, tf.math.asin, tf.math.acos, tf.math.tan, tf.math.floor
    
    # actual calc, without any atmospheric correction
    longCorr = 4*(lng - 15*utcOffset)
    B = 2*pi*(dayNo - 81)/365
    EoTCorr = 9.87*sin(2*B) - 7.53*cos(B) - 1.5*sin(B)
    solarCorr = longCorr - EoTCorr
    delta = asin(sin(23.45 * pi/180)*sin(2*pi*(dayNo - 81)/365))
    sunrise = 12 - (180/pi)*acos(-tan(lat * pi/180)*tan(delta))/15 - solarCorr/60
    sunset  = 12 + (180/pi)*acos(-tan(lat * pi/180)*tan(delta))/15 - solarCorr/60
    
    sunrise_hr = floor(sunrise)
    sunrise_min = 60 * (sunrise - sunrise_hr)
    sunset_hr = floor(sunset)
    sunset_min = 60 * (sunset - sunset_hr)
    
    return {
        'dayNo': dayNo,
        'sunrise_hr': sunrise_hr,
        'sunrise_min': sunrise_min,
        'sunset_hr': sunset_hr,
        'sunset_min': sunset_min
    }
    
print(calc_sunrise_sunset(39.833, -98.583, 15, -6))

Executing op Mul in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op RealDiv in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Sin in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Cos in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Sub in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Asin in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Tan in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Neg in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Acos in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op AddV2 in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Floor in device /job:localhost/replica:0/task:0/device:CPU:0
{'dayNo': 15.0, 'sunrise_hr': <tf.Tensor: shape=(), dtype=float32, numpy=7.0>, 'sunrise_min': <tf.Tensor: shape=(), dtype=float32, numpy=40.322144>, 'sunset_hr': <tf.Tensor: shape=(), dtype=float32,
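
As a sanity check on the values above, the same formula can be mirrored with the standard `math` module, without TensorFlow. This is a minimal sketch for scalar inputs only; the function name `calc_sunrise_sunset_py` is made up for this example and is not part of the original notebook.

```python
import math

def calc_sunrise_sunset_py(lat, lng, day_no, utc_offset):
    """Plain-Python mirror of calc_sunrise_sunset (scalar inputs, no atmospheric correction)."""
    long_corr = 4 * (lng - 15 * utc_offset)
    b = 2 * math.pi * (day_no - 81) / 365
    eot_corr = 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)
    solar_corr = long_corr - eot_corr
    delta = math.asin(math.sin(math.radians(23.45)) * math.sin(b))
    # Half the day length in hours (the sun moves 15 degrees per hour)
    half_day = math.degrees(math.acos(-math.tan(math.radians(lat)) * math.tan(delta))) / 15
    sunrise = 12 - half_day - solar_corr / 60
    sunset = 12 + half_day - solar_corr / 60
    sunrise_hr = math.floor(sunrise)
    sunset_hr = math.floor(sunset)
    return {
        'dayNo': day_no,
        'sunrise_hr': sunrise_hr,
        'sunrise_min': 60 * (sunrise - sunrise_hr),
        'sunset_hr': sunset_hr,
        'sunset_min': 60 * (sunset - sunset_hr),
    }

result = calc_sunrise_sunset_py(39.833, -98.583, 15, -6)
print(result)
```

The result should agree with the TensorFlow output to within float32 rounding: the hours match exactly and the minutes differ only in the later decimal places.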

## Keras Model

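Create a placeholder Keras model and export it with the function above as its serving signature. The model's `Dense` layer is never called by that signature; it exists only so that there is a Keras model to save.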

In [2]:
import os, datetime, shutil

@tf.function(input_signature=[
    tf.TensorSpec([None], dtype=tf.float32),
    tf.TensorSpec([None], dtype=tf.float32),
    tf.TensorSpec([None], dtype=tf.float32),
    tf.TensorSpec([None], dtype=tf.float32),
])
def calc_on_gpu(lat, lng, dayno, utc_offset):
    return calc_sunrise_sunset(lat, lng, dayno, utc_offset)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=[1])
  ])

shutil.rmtree('export', ignore_errors=True)
export_path = os.path.join('export', 'sunrise_{}'.format(datetime.datetime.now().strftime("%Y%m%d_%H%M%S")))
model.save(export_path, signatures={'serving_default': calc_on_gpu})

Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Add in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op VarIsInitializedOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op LogicalNot in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Assert in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op Reshape in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:CPU:0
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Executing op StringJoin in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op ShardedFilename in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op ReadVariableOp in device /j

In [3]:
!saved_model_cli show --dir {export_path} --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['dayno'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: serving_default_dayno:0
  inputs['lat'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: serving_default_lat:0
  inputs['lng'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: serving_default_lng:0
  inputs['utc_offset'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: serving_default_utc_offset:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['dayNo'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: PartitionedCall:0
  outputs['sunrise_hr'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: PartitionedCall:1
  outputs['sunrise_min'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: PartitionedCall:2
  outputs['sunset_hr'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: PartitionedCall:3
  outputs['s

In [4]:
restored = tf.keras.models.load_model(export_path)
infer = restored.signatures['serving_default']
# pass inputs by keyword; the names match the signature inputs listed above
outputs = infer(lat=tf.constant(39.833), lng=tf.constant(-98.583), dayno=tf.constant(15.0), utc_offset=tf.constant(-6.0))
print(outputs)

Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op RestoreV2 in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op RestoreV2 in device /job:localhost/replica:0/task:0/device:CPU:0
Executing op __inference_signature_wrapper_554 in device /job:localhost/replica:0/task:0/device:CPU:0
{'sunrise_min': <tf.Tensor: shape=(), dtype=float32, numpy=40.322113>, 'sunset_hr': <tf.Tensor: shape=(), dtype=float32, numpy=17.0>, 'sunset_min': <tf.Tensor: shape=(), dtype=float32, numpy=9.641991>, 'sunrise_hr': <tf.Tensor: shape=(), dtype=float32, numpy=7.0>, 'dayNo': <tf.Tensor: shape=(), dtype=float32, numpy=15.0>}


## Deploy to Cloud AI Platform Predictions

We can deploy the model to AI Platform Predictions, which will take care of scaling.
The key line here is:
```
       --machine-type n1-standard-2 --accelerator count=1,type=nvidia-tesla-k80
```
This will deploy to a machine with 2 vCPUs and 1 NVIDIA Tesla K80 GPU.

For more details, see: https://cloud.google.com/ai-platform/prediction/docs/machine-types-online-prediction
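
If the deployment is driven from Python rather than a `%%bash` cell, the same command line can be assembled as an argument list so that the machine type and accelerator are easy to vary. This is only a sketch: the helper name `version_create_command` and its parameters are invented for this example, it uses only the flags that appear in this notebook, and it builds the command without running it.

```python
def version_create_command(model_name, version, origin, bucket, runtime_version,
                           machine_type="n1-standard-2",
                           accelerator_type="nvidia-tesla-k80",
                           accelerator_count=1):
    """Build (but do not run) the `gcloud ai-platform versions create` command."""
    return [
        "gcloud", "ai-platform", "versions", "create", version,
        "--model", model_name,
        "--origin", origin,
        "--staging-bucket", "gs://{}".format(bucket),
        "--runtime-version", runtime_version,
        "--machine-type", machine_type,
        # The accelerator is given as a single key=value,key=value argument
        "--accelerator", "count={},type={}".format(accelerator_count, accelerator_type),
    ]

cmd = version_create_command(
    model_name="sunrise",
    version="v1",
    origin="export/sunrise_20200927_004422",
    bucket="ai-analytics-solutions-kfpdemo",
    runtime_version="2.1",
)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a machine where `gcloud` is installed and authenticated.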

In [5]:
!find export/ | head -2 | tail -1

export/sunrise_20200927_004422


In [16]:
%%bash

MODEL_LOCATION=$(find export | head -2 | tail -1)
MODEL_NAME=sunrise
MODEL_VERSION=v1

TFVERSION=2.1
REGION=us-central1
BUCKET=ai-analytics-solutions-kfpdemo

# create the model if it doesn't already exist
modelname=$(gcloud ai-platform models list | grep -w "$MODEL_NAME")
echo $modelname
if [ -z "$modelname" ]; then
   echo "Creating model $MODEL_NAME"
   gcloud ai-platform models create ${MODEL_NAME} --regions $REGION
else
   echo "Model $MODEL_NAME already exists"
fi

# delete the model version if it already exists
modelver=$(gcloud ai-platform versions list --model "$MODEL_NAME" | grep -w "$MODEL_VERSION")
echo $modelver
if [ "$modelver" ]; then
   echo "Deleting version $MODEL_VERSION"
   yes | gcloud ai-platform versions delete ${MODEL_VERSION} --model ${MODEL_NAME}
   sleep 10
fi


echo "Creating version $MODEL_VERSION from $MODEL_LOCATION"
gcloud ai-platform versions create ${MODEL_VERSION} \
       --model ${MODEL_NAME} --origin ${MODEL_LOCATION} --staging-bucket gs://${BUCKET} \
       --runtime-version $TFVERSION \
       --machine-type n1-standard-2 --accelerator count=1,type=nvidia-tesla-k80

sunrise
Model sunrise already exists

Creating version v1 from export/sunrise_20200927_004422


Using endpoint [https://ml.googleapis.com/]
Using endpoint [https://ml.googleapis.com/]
Listed 0 items.
Using endpoint [https://ml.googleapis.com/]
Creating version (this might take a few minutes)......

In [7]:
%%writefile input.json
{"lat": 39.833, "lng": -98.583, "dayno": 15, "utc_offset": -6}
{"lat": 39.833, "lng": -98.583, "dayno": 45, "utc_offset": -6}
{"lat": 39.833, "lng": -98.583, "dayno": 72, "utc_offset": -6}
{"lat": 39.833, "lng": -98.583, "dayno": 102, "utc_offset": -6}

Writing input.json
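
The `dayno` values above are zero-based days of the year (0 is Jan. 1). When starting from calendar dates, the newline-delimited JSON can be generated instead of typed by hand. The sketch below assumes that convention; the helper names `day_number` and `make_instances` are hypothetical.

```python
import datetime
import json

def day_number(date):
    """Zero-based day of the year, so Jan. 1 maps to 0."""
    return date.timetuple().tm_yday - 1

def make_instances(lat, lng, utc_offset, dates):
    """One instance per date, using the input names of the serving signature."""
    return [
        {"lat": lat, "lng": lng, "dayno": day_number(d), "utc_offset": utc_offset}
        for d in dates
    ]

dates = [datetime.date(2020, 1, 16), datetime.date(2020, 2, 15)]
instances = make_instances(39.833, -98.583, -6, dates)

# gcloud's --json-instances expects one JSON object per line
json_lines = "\n".join(json.dumps(instance) for instance in instances)
print(json_lines)
```

Writing `json_lines` to `input.json` gives a file in the same format as the one above, and wrapping the same list as `{"instances": instances}` gives the request body used by the REST call.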


In [17]:
!gcloud ai-platform predict --model sunrise --json-instances input.json --version v1

Using endpoint [https://ml.googleapis.com/]
DAY_NO  SUNRISE_HR  SUNRISE_MIN  SUNSET_HR  SUNSET_MIN
15.0    7.0         40.3221      17.0       9.64199
45.0    7.0         5.45803      17.0       34.0227
72.0    6.0         35.8807      18.0       12.3474
102.0   6.0         6.04737      19.0       0.529518


In [10]:
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
import json

credentials = GoogleCredentials.get_application_default()
api = discovery.build("ml", "v1", credentials = credentials,
            discoveryServiceUrl = "https://storage.googleapis.com/cloud-ml/discovery/ml_v1_discovery.json")

request_data = {"instances":
  [
    {"lat": 39.833, "lng": -98.583, "dayno": 15, "utc_offset": -6},
    {"lat": 39.833, "lng": -98.583, "dayno": 45, "utc_offset": -6},
    {"lat": 39.833, "lng": -98.583, "dayno": 72, "utc_offset": -6},
    {"lat": 39.833, "lng": -98.583, "dayno": 102, "utc_offset": -6}
  ]
}

parent = "projects/{}/models/sunrise".format("ai-analytics-solutions")  # no version in the name, so the default version is used

response = api.projects().predict(body = request_data, name = parent).execute()
print("response = {0}".format(response))

response = {'predictions': [{'dayNo': 15.0, 'sunrise_min': 40.322113037109375, 'sunset_min': 9.641990661621094, 'sunrise_hr': 7.0, 'sunset_hr': 17.0}, {'dayNo': 45.0, 'sunrise_min': 5.458030700683594, 'sunset_min': 34.02271270751953, 'sunrise_hr': 7.0, 'sunset_hr': 17.0}, {'dayNo': 72.0, 'sunrise_min': 35.88074493408203, 'sunset_min': 12.347373962402344, 'sunrise_hr': 6.0, 'sunset_hr': 18.0}, {'dayNo': 102.0, 'sunrise_min': 6.047401428222656, 'sunset_min': 0.5295181274414062, 'sunrise_hr': 6.0, 'sunset_hr': 19.0}]}
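
The request above names only the model, so the model's default version handles it. To target a specific version, the resource name takes a `/versions/<version>` suffix. A small helper makes the two cases explicit; the function name `prediction_resource_name` is made up for this sketch.

```python
def prediction_resource_name(project, model, version=None):
    """Resource name for the predict call; omit version to use the model's default."""
    name = "projects/{}/models/{}".format(project, model)
    if version is not None:
        name += "/versions/{}".format(version)
    return name

default_name = prediction_resource_name("ai-analytics-solutions", "sunrise")
versioned_name = prediction_resource_name("ai-analytics-solutions", "sunrise", "v1")
print(default_name)
print(versioned_name)
```

Either string can be passed as the `name` argument of `api.projects().predict(...)`.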


In [13]:
print(response['predictions'][0]['sunrise_min'])

40.322113037109375
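
The service returns hours and fractional minutes as separate floats. For display, they can be combined into a clock string. The sketch below rounds to the nearest minute on the total, so that a value such as 59.7 minutes rolls over into the next hour; `to_clock_time` is a hypothetical helper, and the sample dictionary is the first prediction from the response above.

```python
def to_clock_time(hours, minutes):
    """Format an (hour, fractional minute) pair as HH:MM, rounded to the nearest minute."""
    total_minutes = int(round(hours * 60 + minutes))
    hh, mm = divmod(total_minutes, 60)
    return "{:02d}:{:02d}".format(hh % 24, mm)

prediction = {'dayNo': 15.0, 'sunrise_min': 40.322113037109375,
              'sunset_min': 9.641990661621094,
              'sunrise_hr': 7.0, 'sunset_hr': 17.0}

sunrise = to_clock_time(prediction['sunrise_hr'], prediction['sunrise_min'])
sunset = to_clock_time(prediction['sunset_hr'], prediction['sunset_min'])
print(sunrise, sunset)  # 07:40 17:10
```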


## Delete the web service

The REST endpoint scales down to one node when it isn't receiving any traffic.
To avoid paying for that node, let's delete the model version.

In [15]:
%%bash

MODEL_NAME=sunrise
MODEL_VERSION=v1

# delete the model version if it exists
modelver=$(gcloud ai-platform versions list --model "$MODEL_NAME" | grep -w "$MODEL_VERSION")
echo $modelver
if [ "$modelver" ]; then
   echo "Deleting version $MODEL_VERSION"
   yes | gcloud ai-platform versions delete ${MODEL_VERSION} --model ${MODEL_NAME}
   sleep 10
fi

v1 gs://ai-analytics-solutions-kfpdemo/207d95f93d36fe5ba780b9885c903f6a2d1ce78ab57bbd54e3cd32b59db022dc/ READY
Deleting version v1


Using endpoint [https://ml.googleapis.com/]
Using endpoint [https://ml.googleapis.com/]
This will delete version [v1]...

Do you want to continue (Y/n)?  
Deleting version [v1]......
................................................................................done.


Copyright 2020 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.