# AI Platform - External Model Serving

This notebook uses an AI Platform Notebook to train a TensorFlow model locally on the data in the BigQuery table `<PROJECT_ID>.digits.digits_prepped`. The model is then saved to Cloud Storage, and AI Platform clients are used to upload it and deploy it to an endpoint for online predictions.

**Prerequisites**
- `00 - Initial Setup`
- `01 - BigQuery - Data`

**Overview**

<img src="architectures/statmike-mlops-04.png">

---
## Setup

Import TensorFlow and the TensorFlow I/O BigQuery reader:

In [1]:
from tensorflow_io.bigquery import BigQueryClient
from tensorflow_io.bigquery import BigQueryReadSession
import tensorflow as tf

Set up parameters:

In [2]:
PROJECT_ID='statmike-mlops'
REGION='us-central1'

BQDATASET_ID='digits'
BQTABLE_ID='digits_prepped'

MODEL_DIR='gs://{}/digits/keras'.format(PROJECT_ID)
PARENT = "projects/" + PROJECT_ID + "/locations/" + REGION

BATCH_SIZE = 30

MODEL_NAME='MODEL_KERAS-DIGITS'
ENDPOINT_NAME='ENDPOINT_KERAS-DIGITS'
params = {"MODEL_DIR":MODEL_DIR}
DEPLOY_IMAGE='us-docker.pkg.dev/cloud-aiplatform/prediction/tf2-cpu.2-2:latest'
DEPLOY_COMPUTE='n1-standard-4'

Set up the AI Platform Python clients:
- https://googleapis.dev/python/aiplatform/latest/index.html

In [3]:
from google.cloud import aiplatform

API_ENDPOINT = "{}-aiplatform.googleapis.com".format(REGION)
client_options = {"api_endpoint": API_ENDPOINT}
clients = {}

---
## Prepare Data Connection

Query the BigQuery `INFORMATION_SCHEMA` for the table's schema and download the result as a pandas DataFrame (the linked guide covers using the Storage API with pandas):
- https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas

In [4]:
from google.cloud import bigquery
bqclient = bigquery.Client()
bqjob = bqclient.query(
    f"""
    SELECT * FROM `{BQDATASET_ID}.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`
    WHERE TABLE_NAME = '{BQTABLE_ID}'
    """
)
schema = bqjob.result().to_dataframe()
schema

| | table_catalog | table_schema | table_name | column_name | field_path | data_type | description |
|---|---|---|---|---|---|---|---|
| 0 | statmike-mlops | digits | digits_prepped | p0 | p0 | FLOAT64 | |
| 1 | statmike-mlops | digits | digits_prepped | p1 | p1 | FLOAT64 | |
| 2 | statmike-mlops | digits | digits_prepped | p2 | p2 | FLOAT64 | |
| 3 | statmike-mlops | digits | digits_prepped | p3 | p3 | FLOAT64 | |
| 4 | statmike-mlops | digits | digits_prepped | p4 | p4 | FLOAT64 | |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 62 | statmike-mlops | digits | digits_prepped | p62 | p62 | FLOAT64 | |
| 63 | statmike-mlops | digits | digits_prepped | p63 | p63 | FLOAT64 | |
| 64 | statmike-mlops | digits | digits_prepped | target | target | INT64 | |
| 65 | statmike-mlops | digits | digits_prepped | target_OE | target_OE | STRING | |


Use the table schema to prepare the TensorFlow model:
- Omit unused columns
- Create `feature_columns` for the model
- Define the `dtypes` for TensorFlow

In [5]:
OMIT = ['target_OE','SPLITS']

selected_fields = schema[~schema.column_name.isin(OMIT)].column_name.tolist()

feature_columns = []
feature_layer_inputs = {}
for header in selected_fields:
    if header != 'target':
        feature_columns.append(tf.feature_column.numeric_column(header))
        feature_layer_inputs[header] = tf.keras.Input(shape=(1,),name=header)

# Map BigQuery types to TensorFlow dtypes: FLOAT64 -> tf.float64, anything else -> tf.int64
output_types = schema[~schema.column_name.isin(OMIT)].data_type.tolist()
output_types = [tf.float64 if x=='FLOAT64' else tf.int64 for x in output_types]
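
To make the filtering step concrete, here is a minimal sketch that applies the same logic to a hand-written stand-in for the schema. It uses plain Python tuples instead of a DataFrame and dtype names as strings instead of TensorFlow dtypes. The `schema_rows` sample and the `select_fields` helper are illustrative assumptions, not part of the notebook.

```python
# Stand-in for a few rows of the schema DataFrame: (column_name, data_type).
schema_rows = [
    ("p0", "FLOAT64"),
    ("p1", "FLOAT64"),
    ("target", "INT64"),
    ("target_OE", "STRING"),
    ("SPLITS", "STRING"),
]
OMIT = ["target_OE", "SPLITS"]


def select_fields(rows, omit):
    """Return (selected_fields, output_types, feature_names) for the reader."""
    kept = [(name, dtype) for name, dtype in rows if name not in omit]
    selected_fields = [name for name, _ in kept]
    # Same rule as the notebook: FLOAT64 stays float64, anything else becomes int64.
    output_types = ["float64" if dtype == "FLOAT64" else "int64" for _, dtype in kept]
    # Every selected column except the label is a model feature.
    feature_names = [name for name in selected_fields if name != "target"]
    return selected_fields, output_types, feature_names


selected_fields, output_types, feature_names = select_fields(schema_rows, OMIT)
print(selected_fields)
print(output_types)
print(feature_names)
```

Note that `target` stays in `selected_fields` because the reader still has to fetch it as the label; it is only excluded from the feature list.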

Define a function that splits each input row into features and target, and one-hot encodes the `target`:

In [6]:
def transTable(row_dict):
    # Separate the label from the features, then one-hot encode it across the 10 digit classes
    target = row_dict.pop('target')
    target = tf.one_hot(tf.cast(target, tf.int64), 10)
    target = tf.cast(target, tf.float32)
    return row_dict, target
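
As a rough illustration of what this mapping does to one row, the sketch below reproduces the split and the one-hot encoding in plain Python. The helpers `one_hot` and `split_row` are hypothetical stand-ins written for this example; unlike `transTable`, `split_row` does not modify its input.

```python
def one_hot(index, depth):
    """Return a list of `depth` floats with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if position == index else 0.0 for position in range(depth)]


def split_row(row, label_key="target", depth=10):
    """Separate the label from the features without mutating the input row."""
    features = {key: value for key, value in row.items() if key != label_key}
    return features, one_hot(int(row[label_key]), depth)


# Toy row with two of the 64 pixel features and a label of 3.
features, target = split_row({"p0": 0.0, "p1": 5.0, "target": 3})
print(features)
print(target)
```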

Set up the TensorFlow I/O reader: create a client, open a read session, read the table, and map `transTable` over the rows:
- https://www.tensorflow.org/io/api_docs/python/tfio/bigquery/BigQueryClient

In [7]:
client = BigQueryClient()
session = client.read_session(
    "projects/"+PROJECT_ID, PROJECT_ID, BQTABLE_ID, BQDATASET_ID,
    selected_fields, output_types,
    row_restriction="SPLITS='TRAIN'", requested_streams=3)
table = session.parallel_read_rows()
table = table.map(transTable)
train = table.shuffle(100000).batch(BATCH_SIZE)

In [8]:
client = BigQueryClient()
session = client.read_session(
    "projects/"+PROJECT_ID, PROJECT_ID, BQTABLE_ID, BQDATASET_ID,
    selected_fields, output_types,
    row_restriction="SPLITS='TEST'", requested_streams=3)
table = session.parallel_read_rows()
table = table.map(transTable)
test = table.batch(BATCH_SIZE)
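
For a sense of what `batch(BATCH_SIZE)` produces, the sketch below computes the batch sizes for a given row count when the final partial batch is kept, which is the default behavior of `tf.data.Dataset.batch`. The row count of 344 is taken from the download progress shown later in this notebook for the TEST split; the assumption is that the read session returns the same rows. `batch_sizes` is a hypothetical helper.

```python
def batch_sizes(num_rows, batch_size):
    """Sizes of the batches produced when the final partial batch is kept."""
    full, remainder = divmod(num_rows, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


sizes = batch_sizes(344, 30)
print(len(sizes))   # number of batches
print(sizes[-1])    # size of the last batch
```

Under these assumptions the test set yields 12 batches, the last holding 14 rows.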

Review a single batch of the train data:

In [9]:
for a, b in train.take(1):
    columns=list(a.keys())
    print('columns: ',columns)
    print('target: ',b)

columns:  ['p0', 'p1', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p2', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p3', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p4', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p5', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p6', 'p60', 'p61', 'p62', 'p63', 'p7', 'p8', 'p9']
target:  tf.Tensor(
[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0
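
The `columns` list printed above is in string order (`p0`, `p1`, `p10`, ...) rather than numeric order. The short sketch below shows how the two orderings differ. Because the notebook passes features to the model by name, this ordering should be harmless here, but it matters if the values are ever consumed by position.

```python
names = [f"p{i}" for i in range(64)]

# Sorting as strings compares character by character, so "p10" comes before "p2".
as_strings = sorted(names)

# Sorting on the integer after the "p" prefix restores pixel order.
as_numbers = sorted(names, key=lambda name: int(name[1:]))

print(as_strings[:5])
print(as_numbers[:5])
```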

---
## Train the Model

Define the Model:

In [10]:
feature_layer = tf.keras.layers.DenseFeatures(feature_columns)
feature_layer_outputs = feature_layer(feature_layer_inputs)
# A single dense layer with softmax activation yields one probability per digit class
outputs = tf.keras.layers.Dense(10, activation=tf.nn.softmax)(feature_layer_outputs)
model = tf.keras.Model(inputs=[v for v in feature_layer_inputs.values()], outputs=outputs)
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
#tf.keras.utils.plot_model(model,show_shapes=True, show_dtype=True)
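
The model above is a single dense layer followed by softmax, which is multinomial logistic regression. As a minimal sketch of the arithmetic, the plain-Python version below computes the same kind of output for a toy layer with 2 inputs and 3 classes. The weights are made-up numbers, and `softmax` and `dense_softmax` are hypothetical helpers, not Keras internals.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def dense_softmax(inputs, weights, biases):
    """One dense layer: logits[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    logits = [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]
    return softmax(logits)


# Toy example: 2 inputs and 3 classes (the notebook's layer has 64 inputs and 10 classes).
probs = dense_softmax([1.0, 2.0], [[0.1, 0.2, 0.3], [0.0, -0.1, 0.1]], [0.0, 0.0, 0.0])
print(probs)

# Trainable parameters in the notebook's layer: one weight per (input, class) pair plus one bias per class.
n_params = 64 * 10 + 10
print(n_params)
```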

In [11]:
#model.summary()

Fit the Model:

In [12]:
history = model.fit(train,epochs=25)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


Evaluate the model with the test data:

In [13]:
loss, accuracy = model.evaluate(test)
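
`model.evaluate` reports the two quantities chosen in `model.compile`: categorical cross-entropy loss and accuracy. The sketch below computes both by hand for two made-up predictions over three classes. It is a simplified illustration (Keras additionally clips probabilities to avoid `log(0)`), and the helper names are hypothetical.

```python
import math


def categorical_crossentropy(target, probs):
    """Loss for one example: -sum(target[k] * log(probs[k]))."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def is_correct(target, probs):
    """True when the most probable class matches the one-hot target."""
    return target.index(max(target)) == probs.index(max(probs))


# Made-up one-hot targets and predicted probabilities.
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
predictions = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]

losses = [categorical_crossentropy(t, p) for t, p in zip(targets, predictions)]
mean_loss = sum(losses) / len(losses)
accuracy = sum(is_correct(t, p) for t, p in zip(targets, predictions)) / len(targets)
print(mean_loss, accuracy)
```

The second example is misclassified, so accuracy is 0.5, and its low probability on the true class dominates the mean loss.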



Create predictions from a batch of the test data:

In [14]:
model.predict(test.take(1))

array([[9.86854672e-01, 6.29118176e-06, 1.46573058e-07, 6.00077765e-09,
        6.27308680e-07, 4.94029884e-08, 1.30995326e-02, 6.52086496e-10,
        2.20472157e-06, 3.64428270e-05],
       [9.99084234e-01, 3.62124950e-07, 6.63985304e-07, 2.47430321e-09,
        1.87037557e-07, 5.31468231e-06, 2.68311822e-04, 1.68074332e-09,
        8.83675602e-06, 6.32079318e-04],
       [8.71308565e-01, 5.14373596e-07, 2.91707367e-03, 2.42554734e-07,
        8.54366462e-08, 1.26378685e-08, 1.25752628e-01, 1.39964973e-09,
        4.06263609e-07, 2.03779018e-05],
       [9.99833584e-01, 8.90558072e-10, 2.83411623e-08, 1.36561695e-09,
        3.69588227e-09, 3.99954706e-06, 3.65481560e-06, 4.16426182e-09,
        1.06705456e-05, 1.47957297e-04],
       [9.49158013e-01, 5.08312043e-03, 1.62665256e-05, 7.86311502e-11,
        4.53970730e-02, 5.14867850e-07, 2.00283175e-04, 1.30508968e-04,
        1.32847481e-05, 1.00414945e-06],
       [5.59010997e-12, 9.99807894e-01, 1.22580104e-12, 6.38657070e-07,
   

---
## Save the Model

In [15]:
model.save(MODEL_DIR)

Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.
INFO:tensorflow:Assets written to: gs://statmike-mlops/digits/keras/assets


---
## Upload the Model to AI Platform

Create a client to the Model Service, define the Model, and upload the model:

In [16]:
clients['model'] = aiplatform.gapic.ModelServiceClient(client_options=client_options)

MODEL = {
    "display_name": MODEL_NAME,
    "metadata_schema_uri": "",
    "artifact_uri": MODEL_DIR,
    "container_spec": {
        "image_uri": DEPLOY_IMAGE,
        "command": [],
        "args": [],
        "env": [],
        "ports": [{"container_port": 8080}],
        "predict_route": "",
        "health_route": ""
    }
}

uploaded_model = clients['model'].upload_model(parent=PARENT, model=MODEL)

Retrieve the model information and view the name and display name:

In [17]:
model_info = clients['model'].get_model(name=uploaded_model.result(timeout=180).model)
model_info.display_name, model_info.name

('MODEL_KERAS-DIGITS',
 'projects/691911073727/locations/us-central1/models/931312737005338624')

---
## Create the AI Platform Endpoint

Create a client to the Endpoint Service and use it to create the endpoint:

In [18]:
clients['endpoint'] = aiplatform.gapic.EndpointServiceClient(client_options=client_options)

endpoint = clients['endpoint'].create_endpoint(parent=PARENT, endpoint={"display_name": ENDPOINT_NAME})

Retrieve the endpoint information and view the name and display name:

In [19]:
endpoint_info = clients['endpoint'].get_endpoint(name=endpoint.result(timeout=180).name)
endpoint_info.display_name, endpoint_info.name

('ENDPOINT_KERAS-DIGITS',
 'projects/691911073727/locations/us-central1/endpoints/4482295490070708224')

---
## Deploy the Model to the AI Platform Endpoint

In [20]:
DMODEL = {
    "model": model_info.name,
    "display_name": 'DEPLOYED_'+MODEL_NAME,
    "dedicated_resources": {
        "min_replica_count": 1,
        "max_replica_count": 1,
        "machine_spec": {
            "machine_type": DEPLOY_COMPUTE,
            "accelerator_count": 0,
        }
    }
}

TRAFFIC = {
    '0' : 100
}

dmodel = clients['endpoint'].deploy_model(endpoint=endpoint_info.name, deployed_model=DMODEL, traffic_split=TRAFFIC)
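
In the traffic split, the key `'0'` refers to the model being deployed by this request, and the values are percentages that must sum to 100. The sketch below is a hypothetical local check (not part of the AI Platform client) that can catch a malformed split before the request is sent. The second call shows an assumed later rollout in which an already deployed model keeps most of the traffic; the ID used there is only an example.

```python
def validate_traffic_split(traffic_split):
    """Raise ValueError unless the percentages are integers in 0..100 that sum to 100."""
    for model_id, percent in traffic_split.items():
        if not isinstance(percent, int) or not 0 <= percent <= 100:
            raise ValueError(f"invalid percentage for {model_id!r}: {percent!r}")
    total = sum(traffic_split.values())
    if total != 100:
        raise ValueError(f"traffic split sums to {total}, expected 100")
    return traffic_split


# The split used in this notebook: all traffic goes to the newly deployed model.
validate_traffic_split({"0": 100})

# Hypothetical rollout: 80% stays on an existing deployed model, 20% goes to the new one.
validate_traffic_split({"4408689783660871680": 80, "0": 20})
```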

Retrieve the deployed model information from the endpoint:

In [34]:
clients['endpoint'].get_endpoint(name=endpoint_info.name).deployed_models

[id: "4408689783660871680"
model: "projects/691911073727/locations/us-central1/models/931312737005338624"
display_name: "DEPLOYED_MODEL_KERAS-DIGITS"
create_time {
  seconds: 1619120005
  nanos: 744988000
}
dedicated_resources {
  machine_spec {
    machine_type: "n1-standard-4"
  }
  min_replica_count: 1
  max_replica_count: 1
}
]

---
## Predictions

Create a client to the prediction service:

In [35]:
clients['prediction'] = aiplatform.gapic.PredictionServiceClient(client_options=client_options)

Set up an observation for prediction:

In [26]:
%%bigquery pred
SELECT *
FROM `digits.digits_prepped`
WHERE splits='TEST'

Query complete after 0.00s: 100%|██████████| 1/1 [00:00<00:00, 715.02query/s] 
Downloading: 100%|██████████| 344/344 [00:01<00:00, 314.25rows/s]


In [27]:
pred.head(1)

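| | p0 | p1 | p2 | p3 | p4 | p5 | p6 | p7 | p8 | p9 | ... | p57 | p58 | p59 | p60 | p61 | p62 | p63 | target | target_OE | SPLITS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 0.0 | 0.0 | 10.0 | 11.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 9.0 | 15.0 | 14.0 | 5.0 | 0.0 | 0 | Even | TEST |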


In [28]:
newob = pred.loc[:0,'p0':'p63'].to_dict(orient='records')[0]
#newob

### With Python Client

Request prediction from the prediction service:

In [36]:
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

response = clients['prediction'].predict(endpoint=endpoint_info.name, instances=[json_format.ParseDict(newob, Value())], parameters=json_format.ParseDict({}, Value()))

In [37]:
response.predictions

[[0.986854672, 6.29117e-06, 1.46573058e-07, 6.0007781e-09, 6.27308e-07, 4.94029919e-08, 0.0130995335, 6.52086496e-10, 2.20471952e-06, 3.64427906e-05]]

In [38]:
import numpy as np
np.argmax(response.predictions[0])

0
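
The same lookup can be done without NumPy. The sketch below copies the probabilities from the response shown above and picks the most likely digit with a small hypothetical `argmax` helper, also printing the probability assigned to that digit.

```python
# Probabilities copied from the response shown above (one row per instance).
predictions = [[0.986854672, 6.29117e-06, 1.46573058e-07, 6.0007781e-09, 6.27308e-07,
                4.94029919e-08, 0.0130995335, 6.52086496e-10, 2.20471952e-06, 3.64427906e-05]]


def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


for probs in predictions:
    digit = argmax(probs)
    print(digit, probs[digit])
```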

### With REST

In [33]:
import json
with open('request.json','w') as file:
    file.write(json.dumps({"instances": [newob]}))
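
The request file holds a JSON object with a single `instances` list, where each instance is a dictionary keyed by feature name. The sketch below builds the same structure for a shortened, made-up observation (three of the 64 pixel features) so the shape of the body is easy to see.

```python
import json

# Hypothetical observation with only three of the 64 pixel features, for brevity.
example_ob = {"p0": 0.0, "p1": 0.0, "p2": 0.0}

body = json.dumps({"instances": [example_ob]})
print(body)

# Parsing the body back confirms it is valid JSON with one instance keyed by feature name.
parsed = json.loads(body)
print(len(parsed["instances"]), sorted(parsed["instances"][0]))
```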

In [77]:
!curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
https://{API_ENDPOINT}/v1/{endpoint_info.name}:predict

{
  "predictions": [
    [
      0.986854672,
      6.29117e-06,
      1.46573058e-07,
      6.0007781e-09,
      6.27308e-07,
      4.94029919e-08,
      0.0130995335,
      6.52086496e-10,
      2.20471952e-06,
      3.64427906e-05
    ]
  ],
  "deployedModelId": "4408689783660871680"
}
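
The URL in the `curl` call is assembled from the regional API endpoint and the endpoint's full resource name. The sketch below rebuilds it from the values printed earlier in this notebook; `predict_url` is a hypothetical helper written for this example.

```python
REGION = "us-central1"
API_ENDPOINT = "{}-aiplatform.googleapis.com".format(REGION)

# Endpoint resource name as printed when the endpoint was created.
endpoint_name = "projects/691911073727/locations/us-central1/endpoints/4482295490070708224"


def predict_url(api_endpoint, endpoint_name, version="v1"):
    """Build the REST URL for the :predict method of an endpoint."""
    return "https://{}/{}/{}:predict".format(api_endpoint, version, endpoint_name)


url = predict_url(API_ENDPOINT, endpoint_name)
print(url)
```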


### With gcloud

In [78]:
!gcloud beta ai endpoints predict {endpoint_info.name.rsplit('/',1)[-1]} --region={REGION} --json-request=request.json

Using endpoint [https://us-central1-prediction-aiplatform.googleapis.com/]
[[0.986854672, 6.29117e-06, 1.46573058e-07, 6.0007781e-09, 6.27308e-07, 4.94029919e-08, 0.0130995335, 6.52086496e-10, 2.20471952e-06, 3.64427906e-05]]
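
The `gcloud` command takes the bare endpoint ID, which the cell extracts with `rsplit('/',1)[-1]`. A slightly more defensive alternative, sketched below, splits the full resource name into its parts and checks the expected layout first. `parse_resource_name` is a hypothetical helper, and the layout it assumes is the one visible in the names printed in this notebook.

```python
def parse_resource_name(name):
    """Split 'projects/<p>/locations/<l>/<collection>/<id>' into a dict."""
    parts = name.split("/")
    if len(parts) != 6 or parts[0] != "projects" or parts[2] != "locations":
        raise ValueError(f"unexpected resource name: {name!r}")
    return {
        "project": parts[1],
        "location": parts[3],
        "collection": parts[4],
        "id": parts[5],
    }


info = parse_resource_name(
    "projects/691911073727/locations/us-central1/endpoints/4482295490070708224"
)
print(info["collection"], info["id"])
```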


---
## Remove Resources

Clean up in this order:
- Undeploy the model from the endpoint
- Delete the endpoint
- Delete the model
- Delete the model files from Cloud Storage

Undeploy Model:

In [79]:
dmodel = clients['endpoint'].get_endpoint(name=endpoint_info.name).deployed_models[0].id
clients['endpoint'].undeploy_model(endpoint=endpoint_info.name, deployed_model_id=dmodel)

<google.api_core.operation.Operation at 0x7fa2d43df390>

Delete Endpoint:

In [80]:
clients['endpoint'].delete_endpoint(name=endpoint_info.name)

<google.api_core.operation.Operation at 0x7fa2d43df490>

Remove Model:

In [81]:
clients['model'].delete_model(name=model_info.name)

<google.api_core.operation.Operation at 0x7fa2d43a8510>
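
Each of the three cleanup calls above returns a long-running operation without waiting for it to finish. If a later step depends on an earlier one having completed (for example, deleting an endpoint that may still have a model deployed), it can help to block on each operation's `result(timeout=...)`, the same method used earlier in this notebook after upload and endpoint creation. The sketch below shows that pattern with a stand-in `FakeOperation` class so it runs without any cloud access. Both `FakeOperation` and `run_in_order` are hypothetical, and whether the service actually rejects out-of-order deletes is an assumption.

```python
class FakeOperation:
    """Stand-in for a long-running operation; it only mimics .result(timeout=...)."""

    def __init__(self, label, log):
        self.label = label
        self.log = log

    def result(self, timeout=None):
        # A real operation would block here until the service reports completion.
        self.log.append(f"{self.label} finished")
        return None


def run_in_order(steps, timeout=180):
    """Start each step and block on its result before starting the next one."""
    for start in steps:
        operation = start()
        operation.result(timeout=timeout)


log = []
run_in_order([
    lambda: FakeOperation("undeploy_model", log),
    lambda: FakeOperation("delete_endpoint", log),
    lambda: FakeOperation("delete_model", log),
])
print(log)
```

With the real clients, each lambda would wrap the corresponding `undeploy_model`, `delete_endpoint`, or `delete_model` call.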

Delete Model Files:

In [82]:
from google.cloud import storage
gcs = storage.Client()

path = gcs.bucket(PROJECT_ID)
blobs = path.list_blobs(prefix='digits/keras')
for blob in blobs:
    blob.delete()