<a href="https://colab.research.google.com/github/aishwarya1301/AutoScale-in-AIaaS/blob/main/Deploy_an_AS_inference_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook, we train a DNN model using Keras and deploy it to serve predictions. The deployed model uses the autoscaling feature of Google AI Platform.

We use the United States Census Income Dataset.

**1. Preliminary steps:**
1. Create a GCP project
2. Enable AI Platform and Compute Engine APIs
3. Get the PROJECT ID from the GCP console.
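
Step 2 can also be done from the command line rather than the console. The sketch below only assembles and prints the `gcloud services enable` command. It assumes the service names `ml.googleapis.com` (AI Platform Training and Prediction) and `compute.googleapis.com` (Compute Engine); the helper `enable_apis_command` and the project ID `my-project-id` are illustrative placeholders, not part of the original notebook.

```python
import shlex

# Assumed service names for the two APIs listed in step 2.
REQUIRED_SERVICES = ["ml.googleapis.com", "compute.googleapis.com"]


def enable_apis_command(project_id, services=REQUIRED_SERVICES):
    """Return the gcloud argument list that enables `services` in `project_id`."""
    return ["gcloud", "services", "enable"] + list(services) + ["--project", project_id]


# Placeholder project ID; replace with the value from the GCP console.
cmd = enable_apis_command("my-project-id")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute it, provided `gcloud` is installed and authenticated.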


In [None]:
PROJECT_ID = "ancient-blade-305921" #@param {type:"string"}
! gcloud config set project $PROJECT_ID

Updated property [core/project].





**2. Authentication:**

In [None]:
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

if 'google.colab' in sys.modules:
  from google.colab import auth as google_auth
  google_auth.authenticate_user()

# If you are running this notebook locally, replace the string below with the
# path to your service account key and run this cell to authenticate your GCP
# account.
else:
  %env GOOGLE_APPLICATION_CREDENTIALS ''


**3. Cloud Storage bucket**

In [None]:
BUCKET_NAME = "cml-bucket" #@param {type:"string"}
REGION = "us-central1" #@param {type:"string"}
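
The cells that follow assume the bucket already exists. If it does not, it could be created with `gsutil mb`. The sketch below only builds that command; the helper name `make_bucket_command` is hypothetical, and in the notebook the equivalent one-liner would be `! gsutil mb -l $REGION gs://$BUCKET_NAME`.

```python
def make_bucket_command(bucket_name, region):
    """Return the gsutil argument list that creates gs://<bucket_name> in `region`."""
    # Accept either a bare name or a full gs:// URL.
    prefix = "gs://"
    if bucket_name.startswith(prefix):
        bucket_name = bucket_name[len(prefix):]
    return ["gsutil", "mb", "-l", region, prefix + bucket_name]


# Values taken from the cell above.
print(" ".join(make_bucket_command("cml-bucket", "us-central1")))
```

Bucket names are globally unique, so creating one fails if another project already owns the name.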

**4. Training a model with AI Platform** \
Local training:

In [None]:
# Clone the repository of AI Platform samples
! git clone --depth 1 https://github.com/GoogleCloudPlatform/cloudml-samples

# Set the working directory to the sample code directory
%cd cloudml-samples/census/tf-keras

Cloning into 'cloudml-samples'...
remote: Enumerating objects: 596, done.
remote: Counting objects: 100% (596/596), done.
remote: Compressing objects: 100% (461/461), done.
remote: Total 596 (delta 163), reused 326 (delta 87), pack-reused 0
Receiving objects: 100% (596/596), 23.26 MiB | 20.10 MiB/s, done.
Resolving deltas: 100% (163/163), done.
/content/cloudml-samples/census/tf-keras


In [None]:
! pip install -r requirements.txt

Collecting tensorflow<2,>=1.15
  Downloading https://files.pythonhosted.org/packages/9a/51/99abd43185d94adaaaddf8f44a80c418a91977924a7bc39b8dacd0c495b0/tensorflow-1.15.5-cp37-cp37m-manylinux2010_x86_64.whl (110.5MB)
     |████████████████████████████████| 110.5MB 79kB/s
Collecting tensorboard<1.16.0,>=1.15.0
  Downloading https://files.pythonhosted.org/packages/1e/e9/d3d747a97f7188f48aa5eda486907f3b345cd409f0a0850468ba867db246/tensorboard-1.15.0-py3-none-any.whl (3.8MB)
     |████████████████████████████████| 3.8MB 45.4MB/s
Collecting tensorflow-estimator==1.15.1
  Downloading https://files.pythonhosted.org/packages/de/62/2ee9cd74c9fa2fa450877847ba560b260f5d0fb70ee0595203082dafcc9d/tensorflow_estimator-1.15.1-py2.py3-none-any.whl (503kB)
     |████████████████████████████████| 512kB 52.8MB/s
Collecting gast==0.2.2
  Downloading https://files.pythonhosted.org/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz

In [None]:
# Use Python 3 
! gcloud config set ml_engine/local_python $(which python3)

# This is similar to `python -m trainer.task --job-dir local-training-output`
# but it better replicates the AI Platform environment, especially for
# distributed training (not applicable here).
! gcloud ai-platform local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir local-training-output

Updated property [ml_engine/local_python].
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2021-05-06 20:57:59.915286: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-05-06 20:57:59.918825: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1999995000 Hz
2021-05-06 20:57:59.919051: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55db1280a840 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-05-06 20:57:59.919090: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Train on 254 steps, validate on 1 steps

Epoc

Cloud training:

In [None]:
JOB_NAME = 'training_keras_job'
JOB_DIR = 'gs://' + BUCKET_NAME + '/keras-job-dir'
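
Because a job name can be used only once per project, resubmitting with the fixed name above fails on a second run. A minimal sketch of one way around this is to append a timestamp. It assumes job names must start with a letter and contain only letters, digits, and underscores; the helper `unique_job_name` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed naming rule: starts with a letter; letters, digits, underscores only.
_JOB_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def unique_job_name(prefix="training_keras_job", now=None):
    """Return `prefix` followed by a UTC timestamp (YYYYmmdd_HHMMSS)."""
    now = now or datetime.now(timezone.utc)
    name = "{}_{}".format(prefix, now.strftime("%Y%m%d_%H%M%S"))
    if not _JOB_NAME_RE.fullmatch(name):
        raise ValueError("invalid job name: " + name)
    return name


JOB_NAME = unique_job_name()
print(JOB_NAME)
```

Two submissions within the same second would still collide; a random suffix could be added if that matters.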

In [None]:
! gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.7 \
  --runtime-version 1.15 \
  --job-dir $JOB_DIR \
  --stream-logs

In [None]:
MODEL_NAME = "the_model"

# ! gcloud ai-platform models create $MODEL_NAME \
  # --regions $REGION

In [None]:
MODEL_VERSION = "v1"

# Get a list of directories in the `keras_export` parent directory
KERAS_EXPORT_DIRS = ! gsutil ls $JOB_DIR/keras_export/

# Pick the last listed directory (the latest timestamp), in case training ran
# more than once
SAVED_MODEL_PATH = KERAS_EXPORT_DIRS[-1]

# Create a GPU-backed model version that autoscales between 1 and 3 nodes,
# targeting 50% CPU usage and 60% GPU duty cycle.
# !gcloud beta ai-platform versions create v2 \
# --model $MODEL_NAME  \
# --region $REGION \
# --accelerator=count=1,type=nvidia-tesla-t4  \
# --metric-targets cpu-usage=50  \
# --metric-targets gpu-duty-cycle=60 \
# --machine-type n1-standard-4 \
# --min-nodes 1 --max-nodes 3 \
# --runtime-version 2.3 \
# --framework tensorflow \
# --origin gs://cml-bucket/keras-job-dir/keras_export/


**5. Prediction** \
Prepare inputs

In [None]:
from trainer import util
import pandas as pd
import json

# Load the preprocessed evaluation split and sample 20 rows to send for prediction
_, _, eval_x, eval_y = util.load_data()

prediction_input = eval_x.sample(20)
prediction_targets = eval_y[prediction_input.index]

# Load the raw (unprocessed) evaluation data for reference
_, eval_file_path = util.download(util.DATA_DIR)
raw_eval_data = pd.read_csv(eval_file_path,
                            names=util._CSV_COLUMNS,
                            na_values='?')

# Write one JSON array per line, as expected by --json-instances
with open('prediction_input.json', 'w') as json_file:
  for row in prediction_input.values.tolist():
    json.dump(row, json_file)
    json_file.write('\n')
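
The file written above is newline-delimited JSON: each line is one instance, encoded as a JSON array of feature values. The sketch below round-trips a small file in that layout to check it parses line by line. The rows are made-up numbers, not census data, and the helpers `write_json_lines` and `read_json_lines` are illustrative.

```python
import json
import os
import tempfile


def write_json_lines(rows, path):
    """Write each row as one JSON array per line."""
    with open(path, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_json_lines(path):
    """Parse a newline-delimited JSON file back into a list of rows."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


# Made-up feature rows, for illustration only.
rows = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.25]]
path = os.path.join(tempfile.mkdtemp(), "prediction_input.json")
write_json_lines(rows, path)
print(read_json_lines(path) == rows)
```

Running `read_json_lines('prediction_input.json')` on the real file should return 20 rows, one per sampled instance.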



Request predictions from AI Platform:

In [None]:
! gcloud ai-platform predict --model $MODEL_NAME --version $MODEL_VERSION --region $REGION --json-instances prediction_input.json

Using endpoint [https://us-central1-ml.googleapis.com/]
[[0.1261338], [0.000512808561], [0.97866714], [0.00181245804], [0.0391239524], [0.0914238095], [0.0540127456], [0.594557345], [0.477067649], [0.00469523668], [0.0567878187], [0.224291414], [0.518628716], [0.57409054], [0.039419353], [0.0377126932], [0.651322186], [0.165879667], [0.000587552786], [0.296940744]]
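
Each prediction comes back as a one-element list. The sketch below converts such output into 0/1 class labels. It assumes the value is the sigmoid probability of the positive (higher-income) class and uses a 0.5 decision threshold; both are assumptions rather than something the output itself states, and the helper `to_labels` is hypothetical.

```python
import json


def to_labels(predictions, threshold=0.5):
    """Map [[p], [p], ...] to a flat list of 0/1 labels using `threshold`."""
    return [int(row[0] >= threshold) for row in predictions]


# First eight predictions copied from the output above.
raw_output = ("[[0.1261338], [0.000512808561], [0.97866714], [0.00181245804], "
              "[0.0391239524], [0.0914238095], [0.0540127456], [0.594557345]]")
labels = to_labels(json.loads(raw_output))
print(labels)  # [0, 0, 1, 0, 0, 0, 0, 1]
```

These labels can be compared with `prediction_targets` from the input-preparation cell to spot-check the deployed model.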


In [None]:
! for i in {1..5000000}; do gcloud ai-platform predict --model the_model --version v1 --region us-central1 --json-instances prediction_input.json & done

Streaming output truncated to the last 5000 lines.
[[0.0865182877], [0.0724981427], [0.0034326911], [0.678609312], [0.264642], [0.000200629234], [0.00110989809], [0.01131019], [0.986185193], [0.0479368865], [0.00128006935], [0.228020817], [0.0100353658], [0.00589317083], [0.687803268], [0.00968718529], [0.00801476836], [0.0328396559], [0.304086983], [0.609046]]
Using endpoint [https://us-central1-ml.googleapis.com/]
[[0.0865182877], [0.0724981427], [0.0034326911], [0.678609312], [0.264642], [0.000200629234], [0.00110989809], [0.01131019], [0.986185193], [0.0479368865], [0.00128006935], [0.228020817], [0.0100353658], [0.00589317083], [0.687803268], [0.00968718529], [0.00801476836], [0.0328396559], [0.304086983], [0.609046]]
[[0.0865182877], [0.0724981427], [0.0034326911], [0.678609312], [0.264642], [0.000200629234], [0.00110989809], [0.01131019], [0.986185193], [0.0479368865], [0.00128006935], [0.228020817], [0.0100353658], [0.00589317083], [0.687803268], [0.00968718529], 