## Using Vertex AI

In [None]:
import os
from pathlib import Path
import numpy as np
import json
import tensorflow as tf

In [2]:
os.environ['GOOGLE_APPLICATION_CREDENTIALS']='./tf101.json'
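
Before pointing `GOOGLE_APPLICATION_CREDENTIALS` at a key file, it can help to check that the file exists and looks like a service-account key. The sketch below is only an illustration: the helper name `use_service_account_key` is hypothetical, and the field check assumes the usual layout of a downloaded service-account JSON key.

```python
import json
import os
from pathlib import Path


def use_service_account_key(key_path):
    """Validate a service-account key file, export it, and return its project id.

    Hypothetical helper: it only inspects the JSON structure and never
    contacts Google Cloud.
    """
    key_path = Path(key_path)
    if not key_path.is_file():
        raise FileNotFoundError(f"Credentials file not found: {key_path}")
    with open(key_path) as f:
        key = json.load(f)
    # Fields assumed to be present in a service-account key file.
    required = {"type", "project_id", "private_key", "client_email"}
    missing = required - set(key)
    if missing:
        raise ValueError(f"Key file is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"Unexpected key type: {key['type']!r}")
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(key_path)
    return key["project_id"]


# Usage in this notebook would be:
# use_service_account_key("./tf101.json")
```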

In [3]:
!gcloud config list

[core]
account = iconresearch@protonmail.com
disable_usage_reporting = True
project = just-aloe-414315

Your active configuration is: [default]


## Initializing the SDK (done in the shell)

In [3]:
!gcloud init

Welcome! This command will take you through the configuration of gcloud.

Your current configuration has been set to: [default]

You can skip diagnostics next time by using the following flag:
  gcloud init --skip-diagnostics

Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.                                            
Reachability Check passed.
Network diagnostic passed (1/1 checks passed).

You must log in to continue. Would you like to log in (Y/n)?  ^C


Command killed by keyboard interrupt



In [4]:
from google.cloud import storage

## Configuring GCS and creating bucket

### Role *Storage Object Admin* added in IAM

In [5]:
project_id = 'just-aloe-414315'
bucket_name = 'tf101_bucket'
location = 'us-central1'

In [6]:
storage_client = storage.Client(project=project_id)

In [None]:
# Bucket creation, done only once
#bucket = storage_client.create_bucket(bucket_name, location=location)

In [15]:
bucket = storage_client.get_bucket(bucket_name)

In [16]:
bucket

<Bucket: tf101_bucket>

### Files in GCS are called *blobs* and aren't organised in directories; slashes in blob names only emulate a folder hierarchy.
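
A small, purely local illustration of that flat namespace: "listing a directory" amounts to filtering blob names by prefix. The blob names are the ones used in this notebook; the two helper functions are hypothetical and do not call GCS.

```python
from pathlib import PurePosixPath

# Blob names as they would appear in a flat bucket listing.
blob_names = [
    "my_mnist_model/0001/saved_model.pb",
    "my_mnist_model/0001/variables/variables.index",
    "my_mnist_batch/my_mnist_batch.jsonl",
]


def list_by_prefix(names, prefix):
    """Emulate a directory listing by filtering flat names on a prefix."""
    return [name for name in names if name.startswith(prefix)]


def top_level_prefixes(names):
    """Derive the apparent top-level 'folders' from the flat names."""
    return sorted({PurePosixPath(name).parts[0] + "/" for name in names})


print(list_by_prefix(blob_names, "my_mnist_model/"))
print(top_level_prefixes(blob_names))
```

This is the same idea as the `prefix=` argument of `bucket.list_blobs()` used later when emptying the bucket.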

## GCS uploader, for model upload

### For many files, this single-threaded uploader would be slow, but it can be sped up with multithreading

In [41]:
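def upload_directory(bucket, dirpath):
    """Upload every file under dirpath, keeping paths relative to its parent."""
    dirpath = Path(dirpath)
    for filepath in dirpath.glob("**/*"):
        if filepath.is_file():
            # Blob name, e.g. "my_mnist_model/0001/saved_model.pb"
            blob = bucket.blob(filepath.relative_to(dirpath.parent).as_posix())
            blob.upload_from_filename(filepath)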
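
One possible multithreaded variant of `upload_directory()` is sketched below with a thread pool from the standard library. The function name `upload_directory_threaded` and the `max_workers` default are assumptions, and the sketch assumes the bucket object can be shared between threads. It has not been run against GCS in this notebook.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def upload_directory_threaded(bucket, dirpath, max_workers=8):
    """Upload every file under dirpath using a pool of worker threads.

    Returns the list of blob names that were uploaded.
    """
    dirpath = Path(dirpath)
    filepaths = [p for p in dirpath.glob("**/*") if p.is_file()]

    def upload_one(filepath):
        # Same blob naming as upload_directory(): path relative to the parent.
        blob_name = filepath.relative_to(dirpath.parent).as_posix()
        bucket.blob(blob_name).upload_from_filename(str(filepath))
        return blob_name

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # list() waits for all uploads and re-raises the first error, if any.
        return list(executor.map(upload_one, filepaths))


# Usage in this notebook would be:
# upload_directory_threaded(bucket, "my_mnist_model")
```

Uploads are I/O-bound, so threads are a reasonable fit here even with the GIL.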

## Alternatively, for large file sets one can use the CLI:

In [13]:
!gsutil -m cp -r my_mnist_model gs://{bucket_name}/

Copying file://my_mnist_model/0001/fingerprint.pb [Content-Type=application/octet-stream]...
Copying file://my_mnist_model/0001/keras_metadata.pb [Content-Type=application/octet-stream]...
Copying file://my_mnist_model/0001/saved_model.pb [Content-Type=application/octet-stream]...
Copying file://my_mnist_model/0001/variables/variables.data-00000-of-00001 [Content-Type=application/octet-stream]...
Copying file://my_mnist_model/0001/variables/variables.index [Content-Type=application/octet-stream]...
| [5/5 files][  2.1 MiB/  2.1 MiB] 100% Done                                    
Operation completed over 5 objects/2.1 MiB.                                      


## Communicating with Vertex AI

### https://github.com/GoogleCloudPlatform/vertex-ai-samples/tree/main
### https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/migration/sdk-custom-xgboost-prebuilt-container.ipynb

In [17]:
from google.cloud import aiplatform

## Container image to use for inference

## The parent directory contains various other containers, including ones for XGBoost and Scikit-Learn

In [25]:
server_image = "gcr.io/cloud-aiplatform/prediction/tf2-gpu.2-8:latest"

In [26]:
project_id

'just-aloe-414315'

In [27]:
aiplatform.init(project=project_id, location=location)

In [22]:
! gcloud services enable artifactregistry.googleapis.com

Operation "operations/acat.p2-372043913167-f2257f3f-9567-4abe-8cca-295a9aa0c8d2" finished successfully.


In [28]:
mnist_model = aiplatform.Model.upload(
    display_name="fashion_mnist",
    artifact_uri=f"gs://{bucket_name}/my_mnist_model/0001",
    serving_container_image_uri=server_image,
)

Creating Model
Create Model backing LRO: projects/372043913167/locations/us-central1/models/6398275865330843648/operations/385180844723011584
Model created. Resource name: projects/372043913167/locations/us-central1/models/6398275865330843648@1
To use this Model in another session:
model = aiplatform.Model('projects/372043913167/locations/us-central1/models/6398275865330843648@1')


## Model deployment

### Creating serving endpoint

In [29]:
endpoint = aiplatform.Endpoint.create(display_name='fashion_mnist-endpoint')

Creating Endpoint
Create Endpoint backing LRO: projects/372043913167/locations/us-central1/endpoints/138021694634721280/operations/2296958886541787136
Endpoint created. Resource name: projects/372043913167/locations/us-central1/endpoints/138021694634721280
To use this Endpoint in another session:
endpoint = aiplatform.Endpoint('projects/372043913167/locations/us-central1/endpoints/138021694634721280')


### Note: quotas can limit deployment possibilities

### Quotas are managed in the console under *IAM & Admin* -> *Quotas*

## Creating VM

In [31]:
endpoint.deploy(
    mnist_model,
    min_replica_count=1,
    # Capped at 1 because of the quota.
    # With a higher cap, a high QPS load would spawn more replicas.
    max_replica_count=1,
    machine_type="n1-standard-4",
    accelerator_type="NVIDIA_TESLA_K80",
    accelerator_count=1
)
    

Deploying Model projects/372043913167/locations/us-central1/models/6398275865330843648 to Endpoint : projects/372043913167/locations/us-central1/endpoints/138021694634721280
Deploy Endpoint model backing LRO: projects/372043913167/locations/us-central1/endpoints/138021694634721280/operations/5034021560076206080
Endpoint model deployed. Resource name: projects/372043913167/locations/us-central1/endpoints/138021694634721280


## Inference

### Quick sample data preparation

In [34]:
fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]
# Data normalization: scale pixel values from [0, 255] to [0, 1]
X_train, X_valid, X_test = X_train / 255., X_valid / 255., X_test / 255.
X_new = X_test[:3]

In [35]:
response = endpoint.predict(instances=X_new.tolist())

In [36]:
np.round(response.predictions, 2)

array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.01, 0.  , 0.02, 0.  , 0.98],
       [0.  , 0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]])
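
To read these probabilities as labels, each row can be reduced to the index of its largest value and mapped to a class name. The sketch below uses the standard Fashion MNIST label order and the rounded values printed above. The helper `predicted_labels` is hypothetical and uses plain Python rather than NumPy so that it runs anywhere.

```python
# Fashion MNIST label order (index -> class name).
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Rounded probabilities copied from the output above.
predictions = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.01, 0.0, 0.02, 0.0, 0.98],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]


def predicted_labels(probas, names):
    """Map each row of class probabilities to the name of its most likely class."""
    labels = []
    for row in probas:
        best = max(range(len(row)), key=row.__getitem__)  # argmax of the row
        labels.append(names[best])
    return labels


print(predicted_labels(predictions, class_names))
# ['Ankle boot', 'Pullover', 'Trouser']
```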

## Removing endpoints to stop incurring costs

In [38]:
endpoint.undeploy_all()
endpoint.delete()

Undeploying Endpoint model: projects/372043913167/locations/us-central1/endpoints/138021694634721280
Undeploy Endpoint model backing LRO: projects/372043913167/locations/us-central1/endpoints/138021694634721280/operations/1046084090039631872
Endpoint model undeployed. Resource name: projects/372043913167/locations/us-central1/endpoints/138021694634721280
Deleting Endpoint : projects/372043913167/locations/us-central1/endpoints/138021694634721280
Delete Endpoint  backing LRO: projects/372043913167/locations/us-central1/operations/255702355436109824
Endpoint deleted. . Resource name: projects/372043913167/locations/us-central1/endpoints/138021694634721280


## Batch prediction

### Batch prediction is done with a job and does not need an endpoint. Data is stored in GCS and used remotely.

In [39]:
batch_path = Path("my_mnist_batch")
batch_path.mkdir(exist_ok=True)

In [42]:
with open(batch_path / "my_mnist_batch.jsonl", "w") as jsonl_file:
    for image in X_test[:100].tolist():
        jsonl_file.write(json.dumps(image))
        jsonl_file.write("\n")
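
Since the batch job reads this file remotely, a quick local round trip can catch a malformed line before uploading. This is a minimal sketch: the helper `read_jsonl_instances` is hypothetical, and it only checks that every non-empty line parses as JSON.

```python
import json
from pathlib import Path


def read_jsonl_instances(path):
    """Read a JSON Lines file and return one parsed instance per non-empty line."""
    instances = []
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                instances.append(json.loads(line))
            except json.JSONDecodeError as err:
                raise ValueError(
                    f"{path}: line {line_number} is not valid JSON"
                ) from err
    return instances


# Usage in this notebook would be:
# instances = read_jsonl_instances(batch_path / "my_mnist_batch.jsonl")
# len(instances) should be 100, each instance a 28x28 nested list
```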

In [44]:
upload_directory(bucket, batch_path)

## Predictions are stored in a specified bucket in GCS

### For large data instances, *instances_format* can select another input format, e.g. *tf-record*, *csv* or *file-list*
### In the latter case, gcs_source must point to a text file with one file path per line
### In such cases the model must have a preprocessing layer that calls tf.io.decode_base64(), because Vertex AI will read the files' contents and encode them in Base64.
### Images require additional parsing, e.g. with tf.io.decode_image() or tf.io.decode_png()
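
For the list-of-files input described above, the text file could be generated as in the sketch below. The helper `write_file_list` and the image blob names are hypothetical, and the sketch assumes each line should be a full `gs://` URI. The resulting file would still have to be uploaded to the bucket and referenced in `gcs_source`.

```python
from pathlib import Path


def write_file_list(bucket_name, blob_names, out_path):
    """Write one gs:// URI per line and return the list of URIs written."""
    out_path = Path(out_path)
    uris = [f"gs://{bucket_name}/{name}" for name in blob_names]
    out_path.write_text("\n".join(uris) + "\n")
    return uris


# Usage with hypothetical image blobs:
# write_file_list("tf101_bucket",
#                 ["my_images/0001.png", "my_images/0002.png"],
#                 "my_file_list.txt")
```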


In [45]:
batch_prediction_job = mnist_model.batch_predict(
    job_display_name="my_batch_prediction_job",
    machine_type="n1-standard-4",
    starting_replica_count=1,
    max_replica_count=1,
    accelerator_type="NVIDIA_TESLA_K80",
    accelerator_count=1,
    gcs_source=[f"gs://{bucket_name}/{batch_path.name}/my_mnist_batch.jsonl"],
    # Predictions storage point
    gcs_destination_prefix=f"gs://{bucket_name}/my_mnist_predictions/",
    sync=True # set to False if you don't want to wait for completion
)

Creating BatchPredictionJob
BatchPredictionJob created. Resource name: projects/372043913167/locations/us-central1/batchPredictionJobs/7541529845230993408
To use this BatchPredictionJob in another session:
bpj = aiplatform.BatchPredictionJob('projects/372043913167/locations/us-central1/batchPredictionJobs/7541529845230993408')
View Batch Prediction Job:
https://console.cloud.google.com/ai/platform/locations/us-central1/batch-predictions/7541529845230993408?project=372043913167
BatchPredictionJob projects/372043913167/locations/us-central1/batchPredictionJobs/7541529845230993408 current state:
JobState.JOB_STATE_PENDING
BatchPredictionJob projects/372043913167/locations/us-central1/batchPredictionJobs/7541529845230993408 current state:
JobState.JOB_STATE_PENDING
BatchPredictionJob projects/372043913167/locations/us-central1/batchPredictionJobs/7541529845230993408 current state:
JobState.JOB_STATE_PENDING
BatchPredictionJob projects/372043913167/locations/us-central1/batchPredictionJobs/

## Fetching predictions

In [46]:
y_probas = []

for blob in batch_prediction_job.iter_outputs():
    if "prediction.results" in blob.name:
        for line in blob.download_as_text().splitlines():
            y_proba = json.loads(line)["prediction"]
            y_probas.append(y_proba)
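
The loop above relies on each line of a results file being a JSON object with a `"prediction"` key. The same parsing can be tried locally, as sketched below. The two sample lines are made up, and the `"instance"` key in them is an assumption about the output layout. Only `"prediction"` is actually read.

```python
import json


def parse_prediction_lines(text):
    """Extract the 'prediction' field from each non-empty JSON line."""
    return [
        json.loads(line)["prediction"]
        for line in text.splitlines()
        if line.strip()
    ]


# Hypothetical two-line results file with 3-class probabilities.
sample = (
    '{"instance": [[0.0, 1.0]], "prediction": [0.1, 0.7, 0.2]}\n'
    '{"instance": [[1.0, 0.0]], "prediction": [0.8, 0.1, 0.1]}\n'
)

print(parse_prediction_lines(sample))
# [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]
```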

In [47]:
y_pred = np.argmax(y_probas, axis=1)

In [48]:
accuracy = np.sum(y_pred == y_test[:100]) / 100

In [49]:
accuracy

0.86
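
The accuracy computation divides by a hard-coded 100, which silently assumes that every instance produced a prediction. A slightly more defensive version is sketched below in plain Python. The helper name `accuracy_from_probas` is hypothetical, and it assumes the predictions come back in the same order as the labels.

```python
def accuracy_from_probas(y_probas, y_true):
    """Fraction of rows whose most probable class equals the true label."""
    if len(y_probas) != len(y_true):
        raise ValueError(
            f"Got {len(y_probas)} predictions for {len(y_true)} labels"
        )
    if not y_true:
        raise ValueError("Cannot compute accuracy on an empty set")
    correct = 0
    for row, label in zip(y_probas, y_true):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax
        correct += int(predicted == label)
    return correct / len(y_true)


# Usage in this notebook would be:
# accuracy_from_probas(y_probas, y_test[:100].tolist())
```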

## Emptying GCS bucket

In [50]:
for prefix in ["my_mnist_model/", "my_mnist_batch/", "my_mnist_predictions/"]:
    blobs = bucket.list_blobs(prefix=prefix)
    for blob in blobs:
        blob.delete()

## Deleting bucket and job

In [51]:
bucket.delete() # if the bucket is empty
batch_prediction_job.delete()

Deleting BatchPredictionJob : projects/372043913167/locations/us-central1/batchPredictionJobs/7541529845230993408
Delete BatchPredictionJob  backing LRO: projects/372043913167/locations/us-central1/operations/3988623496572829696
BatchPredictionJob deleted. . Resource name: projects/372043913167/locations/us-central1/batchPredictionJobs/7541529845230993408
