# 05Tools: Prediction - Local
## BROKEN - Currently Being Worked On - Dependency in tensorflow/serving container?

Predictions from models created in the 05 series of notebooks.

This notebook is part of a collection of examples that showcase many ways to serve models:
- Online:
    - Vertex AI Endpoints: Python, REST, CLI (gcloud): [05Tools - Prediction - Online.ipynb](./05Tools%20-%20Prediction%20-%20Online.ipynb)
    - (**THIS NOTEBOOK**) Local with TensorFlow ModelServer: [05Tools - Prediction - Local.ipynb](./05Tools%20-%20Prediction%20-%20Local.ipynb)
    - Remote with Cloud Run with TensorFlow ModelServer: [05Tools - Prediction - Cloud Run.ipynb](./05Tools%20-%20Prediction%20-%20Cloud%20Run.ipynb)
- Batch: [05Tools - Prediction - Batch.ipynb](./05Tools%20-%20Prediction%20-%20Batch.ipynb)
    - BigQuery ML Model Import
    - Vertex AI Batch Prediction Jobs

### Prerequisites:
-  At least one of the notebooks in this series [05, 05a-05i]

### Conceptual Flow & Workflow
<p align="center">
  <img alt="Conceptual Flow" src="../architectures/slides/05tools_pred_arch.png" width="45%">
&nbsp; &nbsp; &nbsp; &nbsp;
  <img alt="Workflow" src="../architectures/slides/05tools_pred_console.png" width="45%">
</p>

---
## Setup

inputs:

In [1]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [2]:
REGION = 'us-central1'
EXPERIMENT = '05_predictions'
SERIES = '05'

# source data
BQ_PROJECT = PROJECT_ID
BQ_DATASET = 'fraud'
BQ_TABLE = 'fraud_prepped'

# Resources
DEPLOY_COMPUTE = 'n1-standard-4'
DEPLOY_IMAGE='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-7:latest'

# Model Training
VAR_TARGET = 'Class'
VAR_OMIT = 'transaction_id' # add more variables to the string with space delimiters

packages:

In [8]:
from google.cloud import aiplatform
from google.cloud import bigquery

import tensorflow as tf

from datetime import datetime
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value
import json
import numpy as np

import asyncio
import time
import multiprocessing

clients:

In [9]:
aiplatform.init(project=PROJECT_ID, location=REGION)
bq = bigquery.Client()

parameters:

In [10]:
BUCKET = PROJECT_ID
DIR = f"temp/{EXPERIMENT}"

environment:

In [11]:
!rm -rf {DIR}
!mkdir -p {DIR}

---
## Get Endpoint

[Endpoint Properties and Methods](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.Endpoint):

```python
endpoint
endpoint.display_name
endpoint.resource_name
endpoint.traffic_split
endpoint.list_models()
```

In [12]:
endpoints = aiplatform.Endpoint.list(filter = f"labels.series={SERIES}")
endpoint = endpoints[0]

In [13]:
print(f'Review the Endpoint in the Console:\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/endpoints/{endpoint.name}?project={PROJECT_ID}')

Review the Endpoint in the Console:
https://console.cloud.google.com/vertex-ai/locations/us-central1/endpoints/1961322035766362112?project=statmike-mlops-349915


### Model Information
Using the model on the endpoint for the current series:

In [14]:
endpoint

<google.cloud.aiplatform.models.Endpoint object at 0x7f8276fd6650> 
resource name: projects/1026793852137/locations/us-central1/endpoints/1961322035766362112

In [15]:
#endpoint.list_models()[0]

In [16]:
# Look up the first deployed model once, then reference its exact version
deployed_model = endpoint.list_models()[0]
model = aiplatform.Model(
    model_name = f'{deployed_model.model}@{deployed_model.model_version_id}'
)

In [17]:
model.display_name

'05_05h'

In [18]:
model.resource_name

'projects/1026793852137/locations/us-central1/models/model_05_05h'

In [19]:
model.version_id

'1'

In [20]:
model.version_description

'run-20220927230247-6'

In [21]:
model.versioned_resource_name

'projects/1026793852137/locations/us-central1/models/model_05_05h@1'

In [22]:
model.supported_input_storage_formats

['jsonl', 'bigquery', 'csv', 'tf-record', 'tf-record-gzip', 'file-list']

In [23]:
model.name

'model_05_05h'

In [24]:
model.uri

'gs://statmike-mlops-349915/05/05h/models/20220927230247/6/model'

In [25]:
print(f'Review the model in the Vertex AI Model Registry:\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/models/{model.name}/versions/{model.version_id}/properties?project={PROJECT_ID}')

Review the model in the Vertex AI Model Registry:
https://console.cloud.google.com/vertex-ai/locations/us-central1/models/model_05_05h/versions/1/properties?project=statmike-mlops-349915


---
## Retrieve Records For Prediction

In [31]:
n = 1000
pred = bq.query(query = f"SELECT * FROM {BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE} WHERE splits='TEST' LIMIT {n}").to_dataframe()

In [32]:
pred.head(4)

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V23,V24,V25,V26,V27,V28,Amount,Class,transaction_id,splits
0,35337,1.092844,-0.01323,1.359829,2.731537,-0.707357,0.873837,-0.79613,0.437707,0.39677,...,-0.167647,0.027557,0.592115,0.219695,0.03697,0.010984,0.0,0,a1b10547-d270-48c0-b902-7a0f735dadc7,TEST
1,60481,1.238973,0.035226,0.063003,0.641406,-0.260893,-0.580097,0.049938,-0.034733,0.405932,...,-0.057718,0.104983,0.537987,0.589563,-0.046207,-0.006212,0.0,0,814c62c8-ade4-47d5-bf83-313b0aafdee5,TEST
2,139587,1.870539,0.211079,0.224457,3.889486,-0.380177,0.249799,-0.577133,0.179189,-0.120462,...,0.180776,-0.060226,-0.228979,0.080827,0.009868,-0.036997,0.0,0,d08a1bfa-85c5-4f1b-9537-1c5a93e6afd0,TEST
3,162908,-3.368339,-1.980442,0.153645,-0.159795,3.847169,-3.516873,-1.209398,-0.292122,0.760543,...,-1.171627,0.214333,-0.159652,-0.060883,1.294977,0.120503,0.0,0,802f3307-8e5a-4475-b795-5d5d8d7d0120,TEST


Remove columns not included as features in the model:

In [33]:
newobs = pred[pred.columns[~pred.columns.isin(VAR_OMIT.split()+[VAR_TARGET, 'splits'])]].to_dict(orient='records')
#newobs[0]

In [34]:
len(newobs)

1000

---
## Notebook Predictions: Load Keras Model

Note: The TensorFlow version used by the training job that created the model may differ from the version running in this notebook. A mismatch can cause `tf.keras.models.load_model` to fail, so make sure the two versions match.
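A quick way to make that check explicit is to compare version strings before loading the model. The sketch below is only illustrative: `same_major_minor` and `TRAINING_TF_VERSION` are hypothetical names, the training version shown is a placeholder rather than a value recorded by this notebook, and treating a matching major.minor as "the same version" is an assumption that may be too loose or too strict for a given model.

```python
def same_major_minor(version_a: str, version_b: str) -> bool:
    """Return True when two version strings share the same major.minor prefix."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]

# Hypothetical placeholder: the TensorFlow version used by the training job.
# The notebook does not record it, so substitute the real value.
TRAINING_TF_VERSION = "2.7.0"

# In the notebook, the second argument would be tf.__version__.
print(same_major_minor(TRAINING_TF_VERSION, "2.7.4"))   # True
print(same_major_minor(TRAINING_TF_VERSION, "2.10.0"))  # False
```

Comparing the split components rather than the raw strings avoids treating `2.1` and `2.10` as a match.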

In [35]:
test_uri = 'gs://statmike-mlops-349915/05/05/models/20220927110007/model'

In [36]:
keras_model = tf.keras.models.load_model(test_uri) #(model.uri)

In [39]:
predictions = keras_model.predict(
    {
        key: tf.constant([value], dtype=tf.float32, name = key) for key, value in newobs[0].items()
    }
)
predictions

array([[9.9991906e-01, 8.0941936e-05]], dtype=float32)

In [38]:
np.argmax(predictions[0])

0

---
## Local Predictions: With TensorFlow ModelServer
Locally run [TensorFlow Serving with Docker](https://www.tensorflow.org/tfx/serving/docker#serving_example)

Review the local directory for this notebook (created above):

In [40]:
DIR

'temp/05_predictions'

In [65]:
!ls {DIR}

model


Copy the model files to the local directory for this notebook:

In [42]:
!gsutil cp -R {model.uri} {DIR}

Copying gs://statmike-mlops-349915/05/05h/models/20220927230247/6/model/keras_metadata.pb...
Copying gs://statmike-mlops-349915/05/05h/models/20220927230247/6/model/saved_model.pb...
/ [2 files][513.3 KiB/513.3 KiB]                                                
==> NOTE: You are performing a sequence of gsutil operations that may
run significantly faster if you instead use gsutil -m cp ... Please
see the -m section under "gsutil help options" for further information
about when gsutil -m can be advantageous.

Copying gs://statmike-mlops-349915/05/05h/models/20220927230247/6/model/variables/variables.data-00000-of-00001...
Copying gs://statmike-mlops-349915/05/05h/models/20220927230247/6/model/variables/variables.index...
/ [4 files][558.2 KiB/558.2 KiB]                                                
Operation completed over 4 objects/558.2 KiB.                                    


In [43]:
!ls {DIR}

model


In [44]:
!ls {DIR}/model

keras_metadata.pb  saved_model.pb  variables


### Load the Model and Review

In [45]:
reloaded_model = tf.saved_model.load(f'{DIR}/model')

In [46]:
reloaded_model.signatures

_SignatureMap({'serving_default': <ConcreteFunction signature_wrapper(Amount, Time, V1, V10, V11, V12, V13, V14, V15, V16, V17, V18, V19, V2, V20, V21, V22, V23, V24, V25, V26, V27, V28, V3, V4, V5, V6, V7, V8, V9) at 0x7F8270174690>})

In [47]:
reloaded_model.signatures['serving_default']

<ConcreteFunction signature_wrapper(Amount, Time, V1, V10, V11, V12, V13, V14, V15, V16, V17, V18, V19, V2, V20, V21, V22, V23, V24, V25, V26, V27, V28, V3, V4, V5, V6, V7, V8, V9) at 0x7F8270174690>

In [48]:
reloaded_model.signatures['serving_default'].structured_input_signature

((),
 {'V8': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V8'),
  'V3': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V3'),
  'V25': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V25'),
  'V24': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V24'),
  'V7': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V7'),
  'V17': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V17'),
  'V2': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V2'),
  'V23': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V23'),
  'Amount': TensorSpec(shape=(None, 1), dtype=tf.float32, name='Amount'),
  'V14': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V14'),
  'V10': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V10'),
  'V22': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V22'),
  'V26': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V26'),
  'V16': TensorSpec(shape=(None, 1), dtype=tf.float32, name='V16'),
  'V11': TensorSpec(shape=(None, 1), dtype=tf

In [49]:
#!saved_model_cli show --dir {DIR}/model --all

### Download Docker Image and Start Serving Container

In [83]:
!docker pull tensorflow/serving

Using default tag: latest
latest: Pulling from tensorflow/serving
Digest: sha256:6c3c199683df6165f5ae28266131722063e9fa012c15065fc4e245ac7d1db980
Status: Image is up to date for tensorflow/serving:latest
docker.io/tensorflow/serving:latest


In [113]:
command = f'''docker run -t -p 8501:8501 \
-v "/$(pwd)/{DIR}/model/:/models/{SERIES}/1" \
-e MODEL_NAME={SERIES} \
tensorflow/serving'''
print(command)

docker run -t -p 8501:8501 -v "/$(pwd)/temp/05_predictions/model/:/models/05/1" -e MODEL_NAME=05 tensorflow/serving


**Run the command above in a subprocess, from the local folder of this notebook, using `multiprocessing.Process()`:**

In [114]:
!pwd

/home/jupyter/vertex-ai-mlops/05 - TensorFlow


In [115]:
def docker_runner():
    # Run the docker command built above; this blocks until the container exits
    !{command}

def main():
    # Start the container from a separate process so the notebook stays responsive
    p = multiprocessing.Process(target=docker_runner)
    p.start()
    return p

p = main()

2022-09-28 23:40:25.896126: I tensorflow_serving/model_servers/server.cc:89] Building single TensorFlow model file config:  model_name: 05 model_base_path: /models/05
2022-09-28 23:40:25.896586: I tensorflow_serving/model_servers/server_core.cc:465] Adding/updating models.
2022-09-28 23:40:25.896623: I tensorflow_serving/model_servers/server_core.cc:594]  (Re-)adding model: 05
2022-09-28 23:40:26.065968: I tensorflow_serving/core/basic_manager.cc:740] Successfully reserved resources to load servable {name: 05 version: 1}
2022-09-28 23:40:26.066044: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: 05 version: 1}
2022-09-28 23:40:26.066064: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: 05 version: 1}
2022-09-28 23:40:26.066139: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /models/05/1
2022-09-28 23:40:26.078722: I external/org_tensorflow/tensorflow/cc/saved_model/read
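As an alternative to wrapping an IPython `!` command in `multiprocessing.Process`, the container could be started in detached mode with the standard `subprocess` module. This is a minimal sketch, not the notebook's method: the helper names and the fixed container name `tf-serving-local` are hypothetical, and it does not address why the container itself exits.

```python
import os
import subprocess

def build_serving_command(model_dir, model_name, port=8501,
                          container_name="tf-serving-local"):
    """Build a `docker run` argument list for the tensorflow/serving image."""
    host_path = os.path.abspath(model_dir)
    return [
        "docker", "run",
        "-d",                      # detached: return immediately with the container ID
        "--name", container_name,  # hypothetical fixed name, to make cleanup easier
        "-p", f"{port}:8501",      # map a host port to the REST port in the container
        "-v", f"{host_path}:/models/{model_name}/1",
        "-e", f"MODEL_NAME={model_name}",
        "tensorflow/serving",
    ]

def start_serving(model_dir, model_name):
    """Start the container and return its ID (requires Docker on the host)."""
    result = subprocess.run(
        build_serving_command(model_dir, model_name),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Print the command that would be run for this notebook's directory layout.
print(" ".join(build_serving_command("temp/05_predictions/model", "05")))
```

Passing the arguments as a list avoids shell quoting issues, and a known container name means the container can later be stopped with `docker stop tf-serving-local` without parsing `docker ps` output.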

### Get Predictions on Exposed Port

In [116]:
import requests

In [118]:
headers = {"content-type": "application/json"}
json_response = requests.post(f'http://localhost:8501/v1/models/{SERIES}:predict', data=json.dumps({"instances": [newobs[0]]}), headers=headers)

ConnectionError: HTTPConnectionPool(host='localhost', port=8501): Max retries exceeded with url: /v1/models/05:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f820f5dbd10>: Failed to establish a new connection: [Errno 111] Connection refused'))

In [56]:
print(json_response.text)

NameError: name 'json_response' is not defined

In [57]:
predictions = json.loads(json_response.text)['predictions']
predictions

NameError: name 'json_response' is not defined

In [58]:
np.argmax(predictions[0])

0
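The `Connection refused` error in this section means nothing was listening on port 8501 when the request was sent, either because the server had not finished starting or because the container had already exited. One way to tell these apart is to wait for the port before posting. The helper below is a hypothetical sketch using only the standard library; the timeout and polling interval are arbitrary choices.

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In this notebook the check would be `wait_for_port('localhost', 8501)` before calling `requests.post(...)`. If it returns `False`, inspecting the container's status and logs is likely more useful than retrying the request.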

### Shutdown TensorFlow Serving Container
Two things are running: a subprocess, `p`, and the Docker container that the subprocess started. Stopping `p` alone does not stop the container. Stopping the container first may be enough, because the subprocess then terminates when its command completes. The cells below terminate `p`, then stop and remove the container.

In [119]:
p.terminate()

In [120]:
p.is_alive()

False

In [121]:
docker = !docker ps -a
docker

['CONTAINER ID   IMAGE                          COMMAND                  CREATED         STATUS                       PORTS     NAMES',
 'f63a6740d226   tensorflow/serving             "/usr/bin/tf_serving…"   2 minutes ago   Exited (134) 2 minutes ago             sharp_hertz',
 'cfc6fa1ae606   gcr.io/inverting-proxy/agent   "/bin/sh -c \'/opt/bi…"   6 weeks ago     Up 2 days                              proxy-agent']

In [122]:
for d in docker:
    if 'tensorflow/serving' in d:
        print(d.split()[-1])
        !docker stop {d.split()[-1]}
        !docker rm {d.split()[0]}

sharp_hertz
sharp_hertz
f63a6740d226
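The loop above selects rows with a substring test on the whole line, which would also match a container whose command or name happens to contain `tensorflow/serving`. A slightly stricter sketch compares only the IMAGE column. `containers_for_image` is a hypothetical helper, and it assumes the default `docker ps -a` column layout shown above, with the container ID first, the image second, and the name last.

```python
def containers_for_image(ps_lines, image):
    """Return (container_id, name) pairs for rows whose IMAGE column equals image."""
    matches = []
    for line in ps_lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[1] == image:
            matches.append((fields[0], fields[-1]))
    return matches

# Sample rows copied from the `docker ps -a` output shown earlier.
sample = [
    'CONTAINER ID   IMAGE                          COMMAND                  CREATED         STATUS                       PORTS     NAMES',
    'f63a6740d226   tensorflow/serving             "/usr/bin/tf_serving…"   2 minutes ago   Exited (134) 2 minutes ago             sharp_hertz',
    'cfc6fa1ae606   gcr.io/inverting-proxy/agent   "/bin/sh -c \'/opt/bi…"   6 weeks ago     Up 2 days                              proxy-agent',
]
print(containers_for_image(sample, "tensorflow/serving"))
# [('f63a6740d226', 'sharp_hertz')]
```

Each returned pair can then be passed to `docker stop` and `docker rm` as in the loop above.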


In [123]:
!docker ps -a

CONTAINER ID   IMAGE                          COMMAND                  CREATED       STATUS      PORTS     NAMES
cfc6fa1ae606   gcr.io/inverting-proxy/agent   "/bin/sh -c '/opt/bi…"   6 weeks ago   Up 2 days             proxy-agent
