# 05i - Vertex AI > Training > Hyperparameter Tuning Jobs - With Custom Container

### 05 Series Overview
Where a model gets trained is where it consumes computing resources.  With Vertex AI, you have choices for configuring the computing resources available at training.  This notebook is an example of an execution environment.  When it was set up there were choices for machine type and accelerators (GPUs).  

In the `05` notebook, the model training happened directly in the notebook.  The models were then imported to Vertex AI and deployed to an endpoint for online predictions. 

In this `05a-05i` series of demonstrations, the same model is trained using managed computing resources in Vertex AI as custom training jobs.  These jobs will be demonstrated as:

-  Custom Job from a python script (`05a`), python source distribution (`05b`), and custom container (`05c`)
-  Training Pipeline that trains and saves models from a python script (`05d`), python source distribution (`05e`), and custom container (`05f`)
-  Hyperparameter Tuning Jobs from a python script (`05g`), python source distribution (`05h`), and custom container (`05i`)

### This Notebook (`05i`): An extension of `05c` with Hyperparameter Tuning - And Tensorboard HParams
This notebook trains the same Tensorflow Keras model from `05` by first modifying and saving the training code as a Python module on a custom container (same as `05c`).  While this example fits nicely in a single script, larger examples will benefit from the flexibility offered by source distributions or module storage and this notebook gives an example of making the shift. 

The training code is stored directly on the custom container as part of the Docker build process.  This build process uses a pre-built container as the base image and adds both packages and the training code as a Python module.  This container is specified in the setup of a custom training job and also assigned compute resources for executing the training in a managed service.  This is done with the [Vertex AI Python SDK](https://googleapis.dev/python/aiplatform/latest/aiplatform.html#) using the class [`aiplatform.CustomJob()`](https://googleapis.dev/python/aiplatform/latest/aiplatform.html#google.cloud.aiplatform.CustomJob).

The Custom Job is then used as the input for a Vertex AI > Training > Hyperparameter Tuning Job.  The tuning job runs trials in parallel up to a maximum trial count, collects the metric(s) reported by each trial, and uses the selected search algorithm to choose the parameter values for the next trials.  This is done with the [Vertex AI Python SDK](https://googleapis.dev/python/aiplatform/latest/aiplatform.html#) using the class [`aiplatform.HyperparameterTuningJob()`](https://googleapis.dev/python/aiplatform/latest/aiplatform.html#google.cloud.aiplatform.HyperparameterTuningJob).

The training can be reviewed with Vertex AI's managed Tensorboard under Experiments > Experiments, or by clicking on the `05i...` job under Training > Hyperparameter Tuning Jobs and then clicking the 'Open Tensorboard' link.  **Click on the HParams tab in Tensorboard to review the hyperparameters and metrics.**

<img src="architectures/overview/Training.png">

### Prerequisites:
-  01 - BigQuery - Table Data Source
-  Understanding:
    -  05 - Vertex AI > Notebooks - Models Built in Notebooks with Tensorflow
        -  Contains a more granular review of the Tensorflow model training

### Resources:
- [Vertex AI Custom Container For Training](https://cloud.google.com/vertex-ai/docs/training/containers-overview)

---
## Vertex AI - Conceptual Flow

<img src="architectures/slides/05i_arch.png">

---
## Vertex AI - Workflow

<img src="architectures/slides/05i_console.png">

---
## Setup

inputs:

In [1]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [2]:
REGION = 'us-central1'
EXPERIMENT = '05i'
SERIES = '05'

# source data
BQ_PROJECT = PROJECT_ID
BQ_DATASET = 'fraud'
BQ_TABLE = 'fraud_prepped'

# Resources
BASE_IMAGE = 'gcr.io/deeplearning-platform-release/tf-cpu.2-3'
DEPLOY_IMAGE ='us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-7:latest'
TRAIN_COMPUTE = 'n1-standard-4'
DEPLOY_COMPUTE = 'n1-standard-4'

# Model Training
VAR_TARGET = 'Class'
VAR_OMIT = 'transaction_id' # add more variables to the string with space delimiters
EPOCHS = 10
BATCH_SIZE = 100

packages:

In [3]:
from google.cloud import aiplatform
from datetime import datetime

from google.cloud import bigquery
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value
import json
import numpy as np
import pandas as pd

clients:

In [4]:
aiplatform.init(project=PROJECT_ID, location=REGION)
bigquery = bigquery.Client()

parameters:

In [5]:
TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
BUCKET = PROJECT_ID
URI = f"gs://{BUCKET}/{BQ_DATASET}/models/{SERIES}/{EXPERIMENT}"
DIR = f"temp/{EXPERIMENT}"

In [6]:
# Give service account roles/storage.objectAdmin permissions
# Console > IAM > Select Account <projectnumber>-compute@developer.gserviceaccount.com > edit - give role
SERVICE_ACCOUNT = !gcloud config list --format='value(core.account)' 
SERVICE_ACCOUNT = SERVICE_ACCOUNT[0]
SERVICE_ACCOUNT

'1026793852137-compute@developer.gserviceaccount.com'

environment:

In [7]:
!rm -rf {DIR}
!mkdir -p {DIR}

Experiment Tracking:

In [8]:
FRAMEWORK = 'tf'
TASK = 'classification'
MODEL_TYPE = 'dnn'
EXPERIMENT_NAME = f'experiment-{SERIES}-{EXPERIMENT}-{FRAMEWORK}-{TASK}-{MODEL_TYPE}'
RUN_NAME = f'run-{TIMESTAMP}'

---
## Get Vertex AI Experiments Tensorboard Instance Name
[Vertex AI Experiments](https://cloud.google.com/vertex-ai/docs/experiments/tensorboard-overview) has managed [Tensorboard](https://www.tensorflow.org/tensorboard) instances that you can use to track Tensorboard Experiments (a training run or hyperparameter tuning sweep).

The training job will show up as an experiment for the Tensorboard instance and have the same name as the training job ID.

This code checks to see if a Tensorboard Instance has been created in the project, retrieves it if so, creates it otherwise:

In [9]:
tb = aiplatform.Tensorboard.list(filter=f"labels.series={SERIES}")
if tb:
    tb = tb[0]
else: 
    tb = aiplatform.Tensorboard.create(display_name = SERIES, labels = {'series' : f'{SERIES}'})

In [10]:
tb.resource_name

'projects/1026793852137/locations/us-central1/tensorboards/7179142426307592192'

---
## Setup Vertex AI Experiments

The code in this section initializes the experiment and starts a run that represents this notebook.  Throughout the notebook, the sections for model training and evaluation log information to the experiment using:
- [.log_params](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform#google_cloud_aiplatform_log_params)
- [.log_metrics](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform#google_cloud_aiplatform_log_metrics)
- [.log_time_series_metrics](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform#google_cloud_aiplatform_log_time_series_metrics)

In [11]:
aiplatform.init(experiment = EXPERIMENT_NAME, experiment_tensorboard = tb.resource_name)

---
## Training

### Assemble Python File for Training

This is the training code from the notebook-based training in `05`, restructured as a Python script that accepts parameter inputs and creates a Vertex AI Experiment run.

The main Python trainer file is provided pre-built as `05_train_hp.py`:

#### Review Pre-Built `05_train_hp.py`

In [12]:
from IPython.display import Markdown as md

with open('05_train_hp.py', 'r') as file:
    data = file.read()
# newlines keep the fence markers on their own lines so the Markdown renders as a code block
md("```python\n" + data + "\n```")

```python
# package import
from tensorflow.python.framework import dtypes
from tensorflow_io.bigquery import BigQueryClient
import tensorflow as tf
from google.cloud import bigquery
from google.cloud import aiplatform
import argparse
import os
import sys
import hypertune
from tensorboard.plugins.hparams import api as hp

# import argument to local variables
parser = argparse.ArgumentParser()
# the passed flag, dest: attribute name on args, default: value used when the flag is absent, type: type to convert to, help: description of argument
parser.add_argument('--epochs', dest = 'epochs', default = 10, type = int, help = 'Number of Epochs')
parser.add_argument('--batch_size', dest = 'batch_size', default = 32, type = int, help = 'Batch Size')
parser.add_argument('--var_target', dest = 'var_target', type=str)
parser.add_argument('--var_omit', dest = 'var_omit', type=str, nargs='*')
parser.add_argument('--project_id', dest = 'project_id', type=str)
parser.add_argument('--bq_project', dest = 'bq_project', type=str)
parser.add_argument('--bq_dataset', dest = 'bq_dataset', type=str)
parser.add_argument('--bq_table', dest = 'bq_table', type=str)
parser.add_argument('--region', dest = 'region', type=str)
parser.add_argument('--experiment', dest = 'experiment', type=str)
parser.add_argument('--series', dest = 'series', type=str)
parser.add_argument('--experiment_name', dest = 'experiment_name', type=str)
parser.add_argument('--run_name', dest = 'run_name', type=str)
# hyperparameters
parser.add_argument('--lr', dest='learning_rate', required=True, type=float, help='Learning Rate')
parser.add_argument('--m', dest='momentum', required=True, type=float, help='Momentum')
args = parser.parse_args()

# setup tensorboard hparams
HP_LEARNING_RATE = hp.HParam('learning_rate', hp.RealInterval(0.0, 1.0))
HP_MOMENTUM = hp.HParam('momentum', hp.RealInterval(0.0,1.0))
hparams = {
    HP_LEARNING_RATE: args.learning_rate,
    HP_MOMENTUM: args.momentum
}

# clients
bigquery = bigquery.Client(project = args.project_id)
aiplatform.init(project = args.project_id, location = args.region)
hpt = hypertune.HyperTune()
args.run_name = f'{args.run_name}-{hpt.trial_id}'

# Vertex AI Experiment
expRun = aiplatform.ExperimentRun.create(run_name = args.run_name, experiment = args.experiment_name)
expRun.log_params({'experiment': args.experiment, 'series': args.series, 'project_id': args.project_id})
expRun.log_params({'hyperparameter.learning_rate': args.learning_rate, 'hyperparameter.momentum': args.momentum})

# get schema from bigquery source
query = f"SELECT * FROM {args.bq_project}.{args.bq_dataset}.INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{args.bq_table}'"
schema = bigquery.query(query).to_dataframe()

# get number of classes from bigquery source
nclasses = bigquery.query(query = f'SELECT DISTINCT {args.var_target} FROM {args.bq_project}.{args.bq_dataset}.{args.bq_table} WHERE {args.var_target} is not null').to_dataframe()
nclasses = nclasses.shape[0]
expRun.log_params({'data_source': f'bq://{args.bq_project}.{args.bq_dataset}.{args.bq_table}', 'nclasses': nclasses, 'var_split': 'splits', 'var_target': args.var_target})

# Make a list of columns to omit
OMIT = args.var_omit + ['splits']

# use schema to prepare a list of columns to read from BigQuery
selected_fields = schema[~schema.column_name.isin(OMIT)].column_name.tolist()

# all the columns in this data source are either float64 or int64
output_types = [dtypes.float64 if x=='FLOAT64' else dtypes.int64 for x in schema[~schema.column_name.isin(OMIT)].data_type.tolist()]

# remap input data to Tensorflow inputs of features and target
def transTable(row_dict):
    target = row_dict.pop(args.var_target)
    target = tf.one_hot(tf.cast(target, tf.int64), nclasses)
    target = tf.cast(target, tf.float32)
    return(row_dict, target)

# function to setup a bigquery reader with Tensorflow I/O
def bq_reader(split):
    reader = BigQueryClient()

    training = reader.read_session(
        parent = f"projects/{args.project_id}",
        project_id = args.bq_project,
        table_id = args.bq_table,
        dataset_id = args.bq_dataset,
        selected_fields = selected_fields,
        output_types = output_types,
        row_restriction = f"splits='{split}'",
        requested_streams = 3
    )
    
    return training

# setup feed for train, validate and test
train = bq_reader('TRAIN').parallel_read_rows().prefetch(1).map(transTable).shuffle(args.batch_size*10).batch(args.batch_size)
validate = bq_reader('VALIDATE').parallel_read_rows().prefetch(1).map(transTable).batch(args.batch_size)
test = bq_reader('TEST').parallel_read_rows().prefetch(1).map(transTable).batch(args.batch_size)
expRun.log_params({'training.batch_size': args.batch_size, 'training.shuffle': 10*args.batch_size, 'training.prefetch': 1})

# Deep Neural Network (two hidden layers)

# model input definitions
feature_columns = {header: tf.feature_column.numeric_column(header) for header in selected_fields if header != args.var_target}
feature_layer_inputs = {header: tf.keras.layers.Input(shape = (1,), name = header) for header in selected_fields if header != args.var_target}

# feature columns to a Dense Feature Layer
feature_layer_outputs = tf.keras.layers.DenseFeatures(feature_columns.values(), name = 'feature_layer')(feature_layer_inputs)

# batch normalization then Dense with softmax activation to nclasses
layers = tf.keras.layers.BatchNormalization(name = 'batch_normalization_layer')(feature_layer_outputs)
layers = tf.keras.layers.Dense(64, activation = 'relu', name = 'hidden_layer')(layers)
layers = tf.keras.layers.Dense(32, activation = 'relu', name = 'embedding_layer')(layers)
layers = tf.keras.layers.Dense(nclasses, activation = tf.nn.softmax, name = 'prediction_layer')(layers)

# the model
model = tf.keras.Model(
    inputs = feature_layer_inputs,
    outputs = layers,
    name = args.experiment
)
opt = tf.keras.optimizers.SGD(learning_rate = args.learning_rate, momentum = args.momentum) #SGD or Adam
loss = tf.keras.losses.CategoricalCrossentropy()
model.compile(
    optimizer = opt,
    loss = loss,
    metrics = ['accuracy', tf.keras.metrics.AUC(curve='PR', name = 'auprc')]
)

# setup tensorboard logs and train
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=os.environ['AIP_TENSORBOARD_LOG_DIR'], histogram_freq=1)
hparams_callback = hp.KerasCallback(os.environ['AIP_TENSORBOARD_LOG_DIR'] + 'train/', hparams, trial_id = args.run_name)
history = model.fit(train, epochs = args.epochs, callbacks = [tensorboard_callback, hparams_callback], validation_data = validate)
expRun.log_params({'epochs': history.params['epochs']})
for e in range(0, history.params['epochs']):
    expRun.log_time_series_metrics(
        {
            'train_loss': history.history['loss'][e],
            'train_accuracy': history.history['accuracy'][e],
            'train_auprc': history.history['auprc'][e],
            'val_loss': history.history['val_loss'][e],
            'val_accuracy': history.history['val_accuracy'][e],
            'val_auprc': history.history['val_auprc'][e]
        }
    )

# evaluations:
loss, accuracy, auprc = model.evaluate(test)
expRun.log_metrics({'test_loss': loss, 'test_accuracy': accuracy, 'test_auprc': auprc})
loss, accuracy, auprc = model.evaluate(validate)
expRun.log_metrics({'val_loss': loss, 'val_accuracy': accuracy, 'val_auprc': auprc})
loss, accuracy, auprc = model.evaluate(train)
expRun.log_metrics({'train_loss': loss, 'train_accuracy': accuracy, 'train_auprc': auprc})

# output the model save files
model.save(os.getenv("AIP_MODEL_DIR"))
expRun.log_params({'model.save': os.getenv("AIP_MODEL_DIR")})
expRun.end_run()

# report hypertune info back to Vertex AI Training > Hyperparameter Tuning Job
hpt.report_hyperparameter_tuning_metric(
    hyperparameter_metric_tag = 'auprc',
    metric_value = history.history['auprc'][-1])
```
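The tuning service passes each trial's values to the script as command-line flags named after the parameter ids (`--lr`, `--m`), and `dest` maps them onto the attribute names used in the code. A minimal sketch of that mapping, using a reduced parser and made-up trial values (the `trial_args` list is an assumption for illustration, not output from a real job):

```python
import argparse

# Reduced version of the parser in the training script: only the two tuned flags.
parser = argparse.ArgumentParser()
parser.add_argument('--lr', dest='learning_rate', required=True, type=float, help='Learning Rate')
parser.add_argument('--m', dest='momentum', required=True, type=float, help='Momentum')

# Hypothetical flags for a single trial.
trial_args = ['--lr=0.01', '--m=0.5']
args = parser.parse_args(trial_args)

# The values are read through the dest names, not the flag names.
print(args.learning_rate, args.momentum)
```

Because both flags are `required=True`, running the container without them (for example as a plain Custom Job) fails at argument parsing.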

#### Copy Script to This Experiment

Copy the pre-built script into the `trainer` module as `train.py`:

In [13]:
!mkdir -p {DIR}/source/trainer
!cp 05_train_hp.py {DIR}/source/trainer/train.py

### Create Requirements.txt File for Python

In [14]:
requirements = f"""tensorflow_io
google-cloud-aiplatform>={aiplatform.__version__}
cloudml-hypertune
"""
with open(f'{DIR}/source/requirements.txt', 'w') as f:
    f.write(requirements)

### Create Custom Container
- https://cloud.google.com/vertex-ai/docs/training/create-custom-container
- https://cloud.google.com/vertex-ai/docs/training/pre-built-containers
- https://cloud.google.com/vertex-ai/docs/general/deep-learning
    - https://cloud.google.com/deep-learning-containers/docs/choosing-container

#### Choose a Base Image

In [15]:
BASE_IMAGE # Defined above in Setup

'gcr.io/deeplearning-platform-release/tf-cpu.2-3'

#### Create the Dockerfile
A basic Dockerfile that takes the base image, copies in the code, and defines an entrypoint: the Python module to run when the container starts.  Add RUN entries to pip install additional packages.

In this case, hyperparameter tuning [reports metrics to Vertex AI](https://cloud.google.com/vertex-ai/docs/training/using-hyperparameter-tuning#report-metrics) using the [cloudml-hypertune Python package](https://github.com/GoogleCloudPlatform/cloudml-hypertune), which is missing from the base image and is therefore installed from `requirements.txt`.

In [16]:
dockerfile = f"""
FROM {BASE_IMAGE}
WORKDIR /
# copy requirements and install them
COPY requirements.txt ./
RUN pip install --no-cache-dir --upgrade pip \
  && pip install --no-cache-dir -r requirements.txt
## Copies the trainer code to the docker image
COPY trainer /trainer
## Sets up the entry point to invoke the trainer
ENTRYPOINT ["python", "-m", "trainer.train"]
"""
with open(f'{DIR}/source/Dockerfile', 'w') as f:
    f.write(dockerfile)
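One detail of the cell above: the Dockerfile is written from a regular (non-raw) Python string, so the trailing backslash on the `RUN` line is consumed by Python as a line continuation and never reaches the file. The written Dockerfile therefore has the whole `RUN` instruction on one line, which still builds. A small sketch of the behavior, with a shortened command used only for illustration:

```python
# Non-raw triple-quoted string: a backslash directly before the newline
# is consumed by Python together with the newline.
snippet = """RUN pip install --upgrade pip \
  && pip install -r requirements.txt
"""

# Doubling the backslash keeps a literal backslash and the newline in the text.
escaped = """RUN pip install --upgrade pip \\
  && pip install -r requirements.txt
"""

print(repr(snippet))
print(repr(escaped))
```

Either form produces a valid Dockerfile; the escaped form only matters if the file should keep the multi-line layout.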

#### Setup Artifact Registry

The container will need to be stored in Artifact Registry, Container Registry or Docker Hub in order to be used by Vertex AI Training jobs.  This notebook sets up Artifact Registry and pushes a container built locally (in this notebook) to it.

https://cloud.google.com/artifact-registry/docs/docker/store-docker-container-images#gcloud

##### Enable Artifact Registry API:
Check to see if the api is enabled, if not then enable it:

In [17]:
services = !gcloud services list --format="json" --available --filter=name:artifactregistry.googleapis.com
services = json.loads("".join(services))

if (services[0]['config']['name'] == 'artifactregistry.googleapis.com') & (services[0]['state'] == 'ENABLED'):
    print(f"Artifact Registry is Enabled for This Project: {PROJECT_ID}")
else:
    print(f"Enabling Artifact Registry for this Project: {PROJECT_ID}")
    !gcloud services enable artifactregistry.googleapis.com

Artifact Registry is Enabled for This Project: statmike-mlops-349915
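The check above indexes `services[0]` directly, which raises an `IndexError` if the command returns an empty list. A slightly more defensive sketch of the same check is shown below; the helper name `is_service_enabled` and the sample payload are assumptions for illustration, shaped like the fields the cell reads (`config.name` and `state`):

```python
import json

def is_service_enabled(services, name):
    """Return True if `name` is present in the parsed list with state ENABLED."""
    return any(
        s.get('config', {}).get('name') == name and s.get('state') == 'ENABLED'
        for s in services
    )

# Hypothetical payload in the shape the cell above indexes into.
sample = json.loads(
    '[{"config": {"name": "artifactregistry.googleapis.com"}, "state": "ENABLED"}]'
)

enabled = is_service_enabled(sample, 'artifactregistry.googleapis.com')
missing = is_service_enabled([], 'artifactregistry.googleapis.com')
print(enabled, missing)
```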


##### Create A Repository
Check to see if the repository already exists; if not, create it:

In [18]:
check_for_repo = !gcloud artifacts repositories describe {PROJECT_ID} --location={REGION}

if check_for_repo[0].startswith('ERROR'):
    print(f'Creating a repository named {PROJECT_ID}')
    !gcloud  artifacts repositories create {PROJECT_ID} --repository-format=docker --location={REGION} --description="Vertex AI Training Custom Containers"
else:
    print(f'There is already a repository named {PROJECT_ID}')

There is already a repository named statmike-mlops-349915


##### Configure Local Docker to Use GCLOUD CLI

In [19]:
!gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet


{
  "credHelpers": {
    "gcr.io": "gcloud",
    "us.gcr.io": "gcloud",
    "eu.gcr.io": "gcloud",
    "asia.gcr.io": "gcloud",
    "staging-k8s.gcr.io": "gcloud",
    "marketplace.gcr.io": "gcloud",
    "us-central1-docker.pkg.dev": "gcloud"
  }
}
Adding credentials for: us-central1-docker.pkg.dev
gcloud credential helpers already registered correctly.


#### Build The Custom Container (local to notebook)

In [20]:
IMAGE_URI=f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{PROJECT_ID}/{EXPERIMENT}_{BQ_DATASET}:latest"
IMAGE_URI

'us-central1-docker.pkg.dev/statmike-mlops-349915/statmike-mlops-349915/05i_fraud:latest'
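The image URI follows the Artifact Registry layout of location host, project, repository, then image and tag. In this notebook the repository is named after the project, which is why the project id appears twice. A small sketch that makes the parts explicit (the helper name `artifact_registry_image_uri` is an assumption for illustration):

```python
def artifact_registry_image_uri(region, project, repository, image, tag='latest'):
    """Compose a Docker image URI for a regional Artifact Registry repository."""
    return f"{region}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"

# Values taken from this notebook's setup cells.
uri = artifact_registry_image_uri(
    region='us-central1',
    project='statmike-mlops-349915',
    repository='statmike-mlops-349915',  # the repository shares the project's name
    image='05i_fraud',
)
print(uri)
```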

In [21]:
!docker build {DIR}/source/. -t $IMAGE_URI

Sending build context to Docker daemon  12.29kB
Step 1/6 : FROM gcr.io/deeplearning-platform-release/tf-cpu.2-3
 ---> 7c0738e47d7d
Step 2/6 : WORKDIR /
 ---> Using cache
 ---> d7460d021e89
Step 3/6 : COPY requirements.txt ./
 ---> Using cache
 ---> 30e431326c08
Step 4/6 : RUN pip install --no-cache-dir --upgrade pip   && pip install --no-cache-dir -r requirements.txt
 ---> Using cache
 ---> efae6da9c9a7
Step 5/6 : COPY trainer /trainer
 ---> 689e230a1214
Step 6/6 : ENTRYPOINT ["python", "-m", "trainer.train"]
 ---> Running in 08bc2c283989
Removing intermediate container 08bc2c283989
 ---> 53467944ab78
Successfully built 53467944ab78
Successfully tagged us-central1-docker.pkg.dev/statmike-mlops-349915/statmike-mlops-349915/05i_fraud:latest


#### Push The Custom Container To Artifact Registry

In [22]:
!docker push $IMAGE_URI

The push refers to repository [us-central1-docker.pkg.dev/statmike-mlops-349915/statmike-mlops-349915/05i_fraud]
d92504ae: Layer already exists
latest: digest: sha256:b87a493b194ca765d00161b03639db4ceb16c641db1bde257d1162a1dc775e26 size: 5338


### Setup Training Job

In [23]:
CMDARGS = [
    "--epochs=" + str(EPOCHS),
    "--batch_size=" + str(BATCH_SIZE),
    "--var_target=" + VAR_TARGET,
    "--var_omit=" + VAR_OMIT,
    "--project_id=" + PROJECT_ID,
    "--bq_project=" + BQ_PROJECT,
    "--bq_dataset=" + BQ_DATASET,
    "--bq_table=" + BQ_TABLE,
    "--region=" + REGION,
    "--experiment=" + EXPERIMENT,
    "--series=" + SERIES,
    "--experiment_name=" + EXPERIMENT_NAME,
    "--run_name=" + RUN_NAME
]

MACHINE_SPEC = {
    "machine_type": TRAIN_COMPUTE,
    "accelerator_count": 0
}

WORKER_POOL_SPEC = [
    {
        "replica_count": 1,
        "machine_spec": MACHINE_SPEC,
        "container_spec": {
            "image_uri": IMAGE_URI,
            "command": [],
            "args": CMDARGS
        }
    }
]
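A caution about `--var_omit`: the training script declares it with `nargs='*'`, but the job passes it as a single `--var_omit=<string>` token. With one column this works, but a space-delimited string arrives as one list element rather than several. The sketch below shows the difference with a reduced parser; the second column name is hypothetical:

```python
import argparse

# Reduced version of the training script's parser: only the --var_omit flag.
parser = argparse.ArgumentParser()
parser.add_argument('--var_omit', dest='var_omit', type=str, nargs='*')

VAR_OMIT = 'transaction_id other_column'  # 'other_column' is a hypothetical name

# Passed as one token: the whole string becomes a single list element.
joined = parser.parse_args(['--var_omit=' + VAR_OMIT]).var_omit

# Passed as the flag followed by separate tokens: one element per column.
split = parser.parse_args(['--var_omit'] + VAR_OMIT.split()).var_omit

print(joined)
print(split)
```

If more than one column needs to be omitted, building the argument list as `['--var_omit'] + VAR_OMIT.split()` is one way to keep the script's `nargs='*'` behavior intact.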

In [24]:
customJob = aiplatform.CustomJob(
    display_name = f'{EXPERIMENT}_{BQ_DATASET}_{TIMESTAMP}',
    worker_pool_specs = WORKER_POOL_SPEC,
    base_output_dir = f"{URI}/{TIMESTAMP}",
    staging_bucket = f"{URI}/{TIMESTAMP}",
    labels = {'series' : f'{SERIES}', 'experiment' : f'{EXPERIMENT}', 'experiment_name' : f'{EXPERIMENT_NAME}', 'run_name' : f'{RUN_NAME}'}
)

### Setup Hyperparameter Tuning Job

In [25]:
METRIC_SPEC = {
    "auprc": "maximize"
}

PARAMETER_SPEC = {
    "lr": aiplatform.hyperparameter_tuning.DoubleParameterSpec(min=0.001, max=0.1, scale="log"),
    "m": aiplatform.hyperparameter_tuning.DoubleParameterSpec(min=1e-7, max=0.9, scale="linear")
}
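The `scale` argument controls how the range between `min` and `max` is treated during the search. The sketch below only illustrates the difference between the two scalings by mapping a point in the unit interval onto each range; it is not the search algorithm Vertex AI uses, and the helper name `scale_unit_to_range` is an assumption for illustration:

```python
import math

def scale_unit_to_range(u, lo, hi, scale):
    """Map u in [0, 1] onto [lo, hi] using a linear or log scale."""
    if scale == 'linear':
        return lo + u * (hi - lo)
    if scale == 'log':
        return math.exp(math.log(lo) + u * (math.log(hi) - math.log(lo)))
    raise ValueError(f'unknown scale: {scale}')

# Midpoint of each range used in PARAMETER_SPEC.
lr_mid = scale_unit_to_range(0.5, 0.001, 0.1, 'log')
m_mid = scale_unit_to_range(0.5, 1e-7, 0.9, 'linear')
print(lr_mid, m_mid)
```

On the log scale the midpoint of the learning rate range is near 0.01 rather than 0.05, so small learning rates get as much attention as large ones.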

In [26]:
tuningJob = aiplatform.HyperparameterTuningJob(
    display_name = f'{EXPERIMENT}_{BQ_DATASET}_{TIMESTAMP}',
    custom_job = customJob,
    metric_spec = METRIC_SPEC,
    parameter_spec = PARAMETER_SPEC,
    max_trial_count = 18,
    parallel_trial_count = 3,
    search_algorithm = None,
    labels = {'series' : f'{SERIES}', 'experiment' : f'{EXPERIMENT}', 'experiment_name' : f'{EXPERIMENT_NAME}', 'run_name' : f'{RUN_NAME}'}
)
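With `max_trial_count = 18` and `parallel_trial_count = 3`, the job runs at most three trials at a time. A simplified view of that budget is sketched below, assuming trials run in strict rounds and are numbered from 1. The service may schedule a new trial as soon as one finishes, so the actual grouping can differ:

```python
import math

max_trial_count = 18       # same value as in the tuning job above
parallel_trial_count = 3   # same value as in the tuning job above

# Number of sequential rounds if every round is filled completely.
rounds = math.ceil(max_trial_count / parallel_trial_count)

# Simplified grouping of trial ids into rounds.
trial_ids = list(range(1, max_trial_count + 1))
batches = [
    trial_ids[i:i + parallel_trial_count]
    for i in range(0, max_trial_count, parallel_trial_count)
]
print(rounds, batches[0], batches[-1])
```

More parallel trials shorten the wall-clock time but give the search algorithm fewer completed results to learn from before it picks the next parameter values.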

### Run Training Job

In [27]:
tuningJob.run(
    service_account = SERVICE_ACCOUNT,
    tensorboard = tb.resource_name
)

Creating HyperparameterTuningJob
HyperparameterTuningJob created. Resource name: projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/7346570834305089536
To use this HyperparameterTuningJob in another session:
hpt_job = aiplatform.HyperparameterTuningJob.get('projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/7346570834305089536')
View HyperparameterTuningJob:
https://console.cloud.google.com/ai/platform/locations/us-central1/training/7346570834305089536?project=1026793852137
View Tensorboard:
https://us-central1.tensorboard.googleusercontent.com/experiment/projects+1026793852137+locations+us-central1+tensorboards+7179142426307592192+experiments+7346570834305089536
HyperparameterTuningJob projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/7346570834305089536 current state:
JobState.JOB_STATE_RUNNING
HyperparameterTuningJob projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/7346570834305089536 current state:
JobSt

In [28]:
tuningJob.resource_name, tuningJob.display_name

('projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/7346570834305089536',
 '05i_fraud_20220826194138')

Create hyperlinks to job and tensorboard here:

In [29]:
job_link = f"https://console.cloud.google.com/ai/platform/locations/{REGION}/training/{tuningJob.resource_name.split('/')[-1]}?project={PROJECT_ID}"
board_link = f"https://{REGION}.tensorboard.googleusercontent.com/experiment/{tb.resource_name.replace('/', '+')}+experiments+{tuningJob.resource_name.split('/')[-1]}"

In [30]:
print(f'Review the Job here:\n{job_link}')
print(f'Review the TensorBoard From the Job here:\n{board_link}')
print(f'Review the TensorBoard From the Job here (direct link to HPARAMS dashboard):\n{board_link}/#hparams')

Review the Job here:
https://console.cloud.google.com/ai/platform/locations/us-central1/training/7346570834305089536?project=statmike-mlops-349915
Review the TensorBoard From the Job here:
https://us-central1.tensorboard.googleusercontent.com/experiment/projects+1026793852137+locations+us-central1+tensorboards+7179142426307592192+experiments+7346570834305089536
Review the TensorBoard From the Job here (direct link to HPARAMS dashboard):
https://us-central1.tensorboard.googleusercontent.com/experiment/projects+1026793852137+locations+us-central1+tensorboards+7179142426307592192+experiments+7346570834305089536/#hparams
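The Tensorboard link is built by replacing the slashes in the Tensorboard resource name with `+` and appending the tuning job id as the experiment. A standalone sketch of that construction, using the resource names printed earlier in this notebook (the helper name `tensorboard_experiment_link` is an assumption for illustration):

```python
def tensorboard_experiment_link(region, tensorboard_resource_name, job_resource_name):
    """Build the Tensorboard experiment URL for a tuning job."""
    job_id = job_resource_name.split('/')[-1]
    board = tensorboard_resource_name.replace('/', '+')
    return (
        f"https://{region}.tensorboard.googleusercontent.com/experiment/"
        f"{board}+experiments+{job_id}"
    )

link = tensorboard_experiment_link(
    'us-central1',
    'projects/1026793852137/locations/us-central1/tensorboards/7179142426307592192',
    'projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/7346570834305089536',
)
print(link)
print(link + '/#hparams')  # direct link to the HParams dashboard
```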


### Get Best Run

In [31]:
# use 0 for trials that did not succeed so they can never be selected as the best run
auprc = [trial.final_measurement.metrics[0].value if trial.state.name == 'SUCCEEDED' else 0 for trial in tuningJob.trials]
auprc

[0.9997206330299377,
 0.999586284160614,
 0.9995100498199463,
 0.9997671842575073,
 0.9997327327728271,
 0.9996916651725769,
 0.9998591542243958,
 0.9997910261154175,
 0.9998483061790466,
 0.9997384548187256,
 0.9997789859771729,
 0.9993014931678772,
 0.9997274279594421,
 0.9996792674064636,
 0.9995850920677185,
 0.9997789859771729,
 0.999750554561615,
 0.9995386600494385]

In [32]:
best = tuningJob.trials[auprc.index(max(auprc))]
best

id: "7"
state: SUCCEEDED
parameters {
  parameter_id: "lr"
  value {
    number_value: 0.1
  }
}
parameters {
  parameter_id: "m"
  value {
    number_value: 0.9
  }
}
final_measurement {
  step_count: 1
  metrics {
    metric_id: "auprc"
    value: 0.9998591542243958
  }
}
start_time {
  seconds: 1661544733
  nanos: 237426333
}
end_time {
  seconds: 1661545419
}

In [33]:
best.id

'7'
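Selecting the best run amounts to taking the maximum of the reported metric over the trials that succeeded. The sketch below shows the same selection on stand-in objects; `FakeTrial`, `best_trial`, and the metric values are assumptions for illustration and not part of the Vertex AI SDK:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakeTrial:
    """Stand-in for a tuning trial with only the fields needed here."""
    id: str
    state: str
    metric: Optional[float]

def best_trial(trials):
    """Return the succeeded trial with the highest metric value."""
    succeeded = [t for t in trials if t.state == 'SUCCEEDED' and t.metric is not None]
    if not succeeded:
        raise ValueError('no succeeded trials to choose from')
    return max(succeeded, key=lambda t: t.metric)

# Hypothetical trials: the failed one must never be selected.
trials = [
    FakeTrial('1', 'SUCCEEDED', 0.9997),
    FakeTrial('2', 'FAILED', None),
    FakeTrial('3', 'SUCCEEDED', 0.9999),
]
print(best_trial(trials).id)
```

Filtering out unsuccessful trials first avoids having to pick a placeholder metric value for them.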

---
## Serving

### Upload The Model

In [34]:
modelmatch = aiplatform.Model.list(filter = f'labels.series={SERIES} AND labels.experiment={EXPERIMENT}')
if modelmatch:
    print("Model Already in Registry:")
    if f'{RUN_NAME}-{best.id}' in modelmatch[0].version_aliases:
        print("This version already loaded, no action taken.")
        model = aiplatform.Model(model_name = modelmatch[0].resource_name)
    else:
        print('Loading model as new default version.')
        model = aiplatform.Model.upload(
            display_name = f'{EXPERIMENT}_{BQ_DATASET}',
            model_id = f'model_{EXPERIMENT}_{BQ_DATASET}',
            parent_model =  modelmatch[0].resource_name,
            serving_container_image_uri = DEPLOY_IMAGE,
            artifact_uri = f"{URI}/{TIMESTAMP}/{best.id}/model",
            is_default_version = True,
            version_aliases = [f'{RUN_NAME}-{best.id}'],
            version_description = f'{RUN_NAME}-{best.id}',
            labels = {'series' : f'{SERIES}', 'experiment' : f'{EXPERIMENT}', 'experiment_name' : f'{EXPERIMENT_NAME}', 'run_name' : f'{RUN_NAME}-{best.id}'}        
        )
else:
    print('This is a new model, creating in model registry')
    model = aiplatform.Model.upload(
        display_name = f'{EXPERIMENT}_{BQ_DATASET}',
        model_id = f'model_{EXPERIMENT}_{BQ_DATASET}',
        serving_container_image_uri = DEPLOY_IMAGE,
        artifact_uri = f"{URI}/{TIMESTAMP}/{best.id}/model",
        is_default_version = True,
        version_aliases = [f'{RUN_NAME}-{best.id}'],
        version_description = f'{RUN_NAME}-{best.id}',
        labels = {'series' : f'{SERIES}', 'experiment' : f'{EXPERIMENT}', 'experiment_name' : f'{EXPERIMENT_NAME}', 'run_name' : f'{RUN_NAME}-{best.id}'}
    )  

This is a new model, creating in model registry
Creating Model
Create Model backing LRO: projects/1026793852137/locations/us-central1/models/model_05i_fraud/operations/2528037060598562816
Model created. Resource name: projects/1026793852137/locations/us-central1/models/5850576138186784768@1
To use this Model in another session:
model = aiplatform.Model('projects/1026793852137/locations/us-central1/models/5850576138186784768@1')


**Note** on Version Aliases:
>Expectation is a name starting with `a-z` that can include `[a-zA-Z0-9-]`
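The rule in the note can be checked before uploading. The sketch below encodes only what the note states (a first character in `a-z`, then characters from `[a-zA-Z0-9-]`); the service may apply further limits such as a maximum length, so treat it as a pre-check and not as the full validation. The names `ALIAS_PATTERN` and `looks_like_valid_alias` are assumptions for illustration:

```python
import re

# First character a-z, remaining characters from [a-zA-Z0-9-].
ALIAS_PATTERN = re.compile(r'[a-z][a-zA-Z0-9-]*')

def looks_like_valid_alias(alias):
    """Return True if the alias matches the pattern described in the note."""
    return ALIAS_PATTERN.fullmatch(alias) is not None

for candidate in ['run-20220826194138-7', '7-run', 'run_7']:
    print(candidate, looks_like_valid_alias(candidate))
```

This is why the aliases in this notebook start with `run-` and are not bare timestamps or trial ids.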

**Retrieve a Model Resource**

[Resource](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.Model)
```Python
model = aiplatform.Model(model_name = f'model_{EXPERIMENT}_{BQ_DATASET}') # retrieves default version
model = aiplatform.Model(model_name = f'model_{EXPERIMENT}_{BQ_DATASET}@{RUN_NAME}-{best.id}') # retrieves specific version by alias
model = aiplatform.Model(model_name = f'model_{EXPERIMENT}_{BQ_DATASET}', version = f'{RUN_NAME}-{best.id}') # retrieves specific version by alias
```

### Vertex AI Experiment Update and Review

In [35]:
expRun = aiplatform.ExperimentRun(run_name = f'{RUN_NAME}-{best.id}', experiment = EXPERIMENT_NAME)

In [36]:
expRun.log_params({
    'model.uri': model.uri,
    'model.display_name': model.display_name,
    'model.resource_name': model.resource_name,
    'model.version_id': model.version_id,
    'model.versioned_resource_name': model.versioned_resource_name,
    'hyperparameterTuningJobs.display_name': tuningJob.display_name,
    'hyperparameterTuning.resource_name': tuningJob.resource_name,
    'hyperparameterTuning.link': job_link,
    'hyperparameterTuning.tensorboard': board_link
})

Complete the experiment run:

In [37]:
expRun.update_state(state = aiplatform.gapic.Execution.State.COMPLETE)

Add the `hyperparameterTuning` job information to every run of the experiment and mark each run complete:

In [38]:
for trial in tuningJob.trials:
    expRun = aiplatform.ExperimentRun(run_name = f'{RUN_NAME}-{trial.id}', experiment = EXPERIMENT_NAME)
    expRun.log_params({
        'hyperparameterTuningJobs.display_name': tuningJob.display_name,
        'hyperparameterTuning.resource_name': tuningJob.resource_name,
        'hyperparameterTuning.link': job_link,
        'hyperparameterTuning.tensorboard': board_link
    })
    expRun.update_state(state = aiplatform.gapic.Execution.State.COMPLETE)

Retrieve the experiment:

In [39]:
exp = aiplatform.Experiment(experiment_name = EXPERIMENT_NAME)

In [40]:
exp.get_data_frame()

Unnamed: 0,experiment_name,run_name,run_type,state,param.series,param.hyperparameterTuning.resource_name,param.data_source,param.hyperparameterTuningJobs.display_name,param.hyperparameterTuning.tensorboard,param.hyperparameter.learning_rate,...,metric.val_auprc,metric.test_auprc,metric.train_auprc,metric.train_loss,time_series_metric.train_auprc,time_series_metric.train_accuracy,time_series_metric.val_loss,time_series_metric.val_accuracy,time_series_metric.val_auprc,time_series_metric.train_loss
0,experiment-05-05i-tf-classification-dnn,run-20220826194138-18,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.001,...,0.999399,0.999403,0.999336,0.006489,0.999539,0.999053,0.008083,0.999079,0.999399,0.005853
1,experiment-05-05i-tf-classification-dnn,run-20220826194138-17,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.033156,...,0.999717,0.999674,0.999622,0.003292,0.999751,0.999491,0.004434,0.999292,0.999717,0.00254
2,experiment-05-05i-tf-classification-dnn,run-20220826194138-16,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.008662,...,0.99967,0.999629,0.999634,0.003335,0.999779,0.999448,0.004613,0.999292,0.99967,0.002691
3,experiment-05-05i-tf-classification-dnn,run-20220826194138-15,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.001,...,0.999432,0.99944,0.999381,0.005269,0.999585,0.999303,0.005562,0.999186,0.999432,0.004371
4,experiment-05-05i-tf-classification-dnn,run-20220826194138-14,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.003615,...,0.999576,0.999536,0.999512,0.004191,0.999679,0.999395,0.005305,0.999256,0.999576,0.003417
5,experiment-05-05i-tf-classification-dnn,run-20220826194138-13,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.027179,...,0.999671,0.999721,0.999646,0.003477,0.999727,0.999483,0.004483,0.999292,0.999671,0.002745
6,experiment-05-05i-tf-classification-dnn,run-20220826194138-12,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.001,...,0.999134,0.999237,0.999122,0.009234,0.999301,0.998864,0.010526,0.998938,0.999134,0.008103
7,experiment-05-05i-tf-classification-dnn,run-20220826194138-11,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.073547,...,0.999622,0.999628,0.999662,0.003273,0.999779,0.999522,0.004614,0.999327,0.999622,0.002491
8,experiment-05-05i-tf-classification-dnn,run-20220826194138-10,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.002678,...,0.999531,0.999628,0.999587,0.003581,0.999738,0.999399,0.00537,0.999292,0.999531,0.002907
9,experiment-05-05i-tf-classification-dnn,run-20220826194138-9,system.ExperimentRun,COMPLETE,5,projects/1026793852137/locations/us-central1/h...,bq://statmike-mlops-349915.fraud.fraud_prepped,05i_fraud_20220826194138,https://us-central1.tensorboard.googleusercont...,0.028017,...,0.999622,0.999673,0.999777,0.002866,0.999848,0.99943,0.004175,0.999292,0.999622,0.002272


Review the Experiments TensorBoard to compare runs:

In [41]:
print(f"The Experiment TensorBoard Link:\nhttps://{REGION}.tensorboard.googleusercontent.com/experiment/{tb.resource_name.replace('/', '+')}+experiments+{exp.name}")

The Experiment TensorBoard Link:
https://us-central1.tensorboard.googleusercontent.com/experiment/projects+1026793852137+locations+us-central1+tensorboards+7179142426307592192+experiments+experiment-05-05i-tf-classification-dnn
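
The link is plain string assembly: the `/` separators in the TensorBoard resource name become `+`, and the experiment name is appended. A minimal sketch follows; `experiment_tensorboard_url` is a hypothetical helper, not part of the SDK, and the inputs are the values printed in this notebook:

```python
def experiment_tensorboard_url(region: str, tensorboard_resource_name: str, experiment_name: str) -> str:
    """Assemble the experiment TensorBoard URL the same way the print statement does."""
    encoded = tensorboard_resource_name.replace('/', '+')
    return (
        f"https://{region}.tensorboard.googleusercontent.com/experiment/"
        f"{encoded}+experiments+{experiment_name}"
    )

url = experiment_tensorboard_url(
    region='us-central1',
    tensorboard_resource_name='projects/1026793852137/locations/us-central1/tensorboards/7179142426307592192',
    experiment_name='experiment-05-05i-tf-classification-dnn',
)
print(url)
```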


### Compare This Run Using Experiments

Get a list of all experiments in this project:

In [42]:
experiments = aiplatform.Experiment.list()

Remove experiments not in the SERIES:

In [43]:
experiments = [e for e in experiments if e.name.split('-')[0:2] == ['experiment', SERIES]]
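
The filter keeps a name only when its first two `-`-separated parts are `experiment` and the series value. A small sketch on plain strings, assuming `SERIES = '05'` (inferred from the experiment names in this notebook); the last two names are hypothetical and exist only to show what gets dropped:

```python
SERIES = '05'  # assumed value, inferred from the experiment names in this notebook

names = [
    'experiment-05-05i-tf-classification-dnn',
    'experiment-05-05-tf-classification-dnn',
    'experiment-04-04a-tf-classification-dnn',  # hypothetical: a different series
    'scratch-05-test',                          # hypothetical: wrong prefix
]

# Same test as the cell above, applied to strings instead of Experiment objects.
in_series = [n for n in names if n.split('-')[0:2] == ['experiment', SERIES]]
print(in_series)
```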

Combine the runs from all experiments in SERIES into a single dataframe:

In [44]:
results = []
for experiment in experiments:
    results.append(experiment.get_data_frame())
    print(experiment.name)
results = pd.concat(results)

experiment-05-05i-tf-classification-dnn
experiment-05-05h-tf-classification-dnn
experiment-05-05g-tf-classification-dnn
experiment-05-05f-tf-classification-dnn
experiment-05-05e-tf-classification-dnn
experiment-05-05d-tf-classification-dnn
experiment-05-05c-tf-classification-dnn
experiment-05-05b-tf-classification-dnn
experiment-05-05a-tf-classification-dnn
experiment-05-05-tf-classification-dnn


Create ranks for models within each experiment and across the entire SERIES:

In [45]:
def ranker(metric = 'metric.test_auprc'):
    ranks = results[['experiment_name', 'run_name', 'param.model.display_name', 'param.model.version_id', metric]].copy().reset_index(drop = True)
    ranks = ranks[~ranks['param.model.display_name'].isnull()]
    ranks['series_rank'] = ranks[metric].rank(method = 'dense', ascending = False)
    ranks['experiment_rank'] = ranks.groupby('experiment_name')[metric].rank(method = 'dense', ascending = False)
    return ranks.sort_values(by = ['experiment_name', 'run_name'])
    
ranks = ranker('metric.test_auprc')
ranks

Unnamed: 0,experiment_name,run_name,param.model.display_name,param.model.version_id,metric.test_auprc,series_rank,experiment_rank
56,experiment-05-05-tf-classification-dnn,run-20220825143943,05_fraud,1,0.999398,10.0,1.0
55,experiment-05-05-tf-classification-dnn,run-20220825161109,05_fraud,2,0.999397,11.0,2.0
54,experiment-05-05-tf-classification-dnn,run-20220825175329,05_fraud,3,0.999344,12.0,3.0
53,experiment-05-05a-tf-classification-dnn,run-20220826104731,05a_fraud,1,0.999627,6.0,1.0
52,experiment-05-05b-tf-classification-dnn,run-20220826114523,05b_fraud,1,0.999582,7.0,1.0
51,experiment-05-05c-tf-classification-dnn,run-20220826163231,05c_fraud,1,0.999674,3.0,1.0
50,experiment-05-05d-tf-classification-dnn,run-20220826170803,05d_fraud,1,0.999579,9.0,1.0
49,experiment-05-05e-tf-classification-dnn,run-20220826174636,05e_fraud,1,0.999581,8.0,1.0
48,experiment-05-05f-tf-classification-dnn,run-20220826182653,05f_fraud,1,0.999672,4.0,1.0
47,experiment-05-05g-tf-classification-dnn,run-20220826184958-7,05g_fraud,1,0.999671,5.0,1.0
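
For intuition about the two rank columns, here is a pure-Python sketch of descending dense ranking: the largest metric gets rank 1, ties share a rank, and no ranks are skipped. It is an illustration only, not how pandas implements `rank(method = 'dense')`. The helper `dense_rank_desc` and the data are hypothetical, and pandas returns floats where this returns integers:

```python
def dense_rank_desc(values):
    """Dense rank with the largest value ranked 1; ties share a rank, no gaps."""
    ordered = sorted(set(values), reverse=True)
    position = {value: rank for rank, value in enumerate(ordered, start=1)}
    return [position[v] for v in values]

# Hypothetical (experiment, metric) pairs, not taken from the runs above.
rows = [('exp_a', 0.91), ('exp_a', 0.95), ('exp_b', 0.95), ('exp_b', 0.90)]

# Rank across all rows (the idea behind 'series_rank').
series_rank = dense_rank_desc([metric for _, metric in rows])

# Rank within each experiment (the idea behind 'experiment_rank').
experiment_rank = [0] * len(rows)
for name in {exp for exp, _ in rows}:
    idx = [i for i, (exp, _) in enumerate(rows) if exp == name]
    for i, rank in zip(idx, dense_rank_desc([rows[j][1] for j in idx])):
        experiment_rank[i] = rank

print(series_rank)      # [2, 1, 1, 3]
print(experiment_rank)  # [2, 1, 1, 2]
```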


In [46]:
current_rank = ranks.loc[(ranks['param.model.display_name'] == model.display_name) & (ranks['param.model.version_id'] == model.version_id)]
current_rank

Unnamed: 0,experiment_name,run_name,param.model.display_name,param.model.version_id,metric.test_auprc,series_rank,experiment_rank
17,experiment-05-05i-tf-classification-dnn,run-20220826194138-7,05i_fraud,1,0.999812,1.0,1.0


In [47]:
print(f"The current model is ranked {current_rank['experiment_rank'].iloc[0]} within this experiment and {current_rank['series_rank'].iloc[0]} across this series.")

The current model is ranked 1.0 within this experiment and 1.0 across this series.


### Create/Retrieve The Endpoint For This Series

In [48]:
endpoints = aiplatform.Endpoint.list(filter = f"labels.series={SERIES}")
if endpoints:
    endpoint = endpoints[0]
    print(f"Endpoint Exists: {endpoints[0].resource_name}")
else:
    endpoint = aiplatform.Endpoint.create(
        display_name = f"{SERIES}_{BQ_DATASET}",
        labels = {'series' : f"{SERIES}"}    
    )
    print(f"Endpoint Created: {endpoint.resource_name}")

Endpoint Exists: projects/1026793852137/locations/us-central1/endpoints/4573537362990071808


In [49]:
endpoint.display_name

'05_fraud'

In [50]:
endpoint.traffic_split

{'261177992061911040': 100}

In [51]:
deployed_models = endpoint.list_models()
deployed_models

[id: "261177992061911040"
 model: "projects/1026793852137/locations/us-central1/models/model_05h_fraud"
 display_name: "05h_fraud"
 create_time {
   seconds: 1661546709
   nanos: 108896000
 }
 dedicated_resources {
   machine_spec {
     machine_type: "n1-standard-4"
   }
   min_replica_count: 1
   max_replica_count: 1
 }
 model_version_id: "1"]

### Should This Model Be Deployed?
Is it better than the model already deployed on the endpoint?

In [52]:
deploy = False
if deployed_models:
    for deployed_model in deployed_models:
        deployed_rank = ranks.loc[(ranks['param.model.display_name'] == deployed_model.display_name) & (ranks['param.model.version_id'] == deployed_model.model_version_id)]['series_rank'].iloc[0]
        model_rank = current_rank['series_rank'].iloc[0]
        if deployed_model.display_name == model.display_name and deployed_model.model_version_id == model.version_id:
            print(f'The current model/version is already deployed.')
            break
        elif model_rank <= deployed_rank:
            deploy = True
            print(f'The current model is ranked better ({model_rank}) than a currently deployed model ({deployed_rank}).')
            break
    if deploy == False: print(f'The current model is ranked worse ({model_rank}) than a currently deployed model ({deployed_rank})')
else: 
    deploy = True
    print('No models currently deployed.')

The current model is ranked better (1.0) than a currently deployed model (2.0).
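
The decision rule can also be written as a small pure function, which makes it easy to test without an endpoint. This is a simplified sketch, not the notebook's exact control flow: `should_deploy` is a hypothetical helper, and it checks for an already-deployed match before comparing ranks, where the loop above checks in iteration order.

```python
def should_deploy(model_key, model_rank, deployed_ranks):
    """Decide whether a candidate model should be deployed.

    deployed_ranks maps a (display_name, version_id) key to that model's series rank.
    """
    if not deployed_ranks:
        return True   # nothing is on the endpoint yet
    if model_key in deployed_ranks:
        return False  # this exact model/version is already deployed
    # Lower rank is better: deploy if the candidate ties or beats any deployed model.
    return any(model_rank <= rank for rank in deployed_ranks.values())

candidate = ('05i_fraud', '1')
print(should_deploy(candidate, 1.0, {('05h_fraud', '1'): 2.0}))  # True
print(should_deploy(candidate, 1.0, {candidate: 1.0}))           # False
print(should_deploy(candidate, 3.0, {('05h_fraud', '1'): 2.0}))  # False
print(should_deploy(candidate, 1.0, {}))                         # True
```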


### Deploy Model To Endpoint

In [53]:
if deploy:
    print(f'Deploying model with 100% of traffic...')
    endpoint.deploy(
        model = model,
        deployed_model_display_name = model.display_name,
        traffic_percentage = 100,
        machine_type = DEPLOY_COMPUTE,
        min_replica_count = 1,
        max_replica_count = 1
    )
else: print(f'Not deploying - current model is worse ({model_rank}) than the currently deployed model ({deployed_rank})')

Deploying model with 100% of traffic...
Deploying Model projects/1026793852137/locations/us-central1/models/5850576138186784768 to Endpoint : projects/1026793852137/locations/us-central1/endpoints/4573537362990071808
Deploy Endpoint model backing LRO: projects/1026793852137/locations/us-central1/endpoints/4573537362990071808/operations/1862630215654572032


Endpoint model deployed. Resource name: projects/1026793852137/locations/us-central1/endpoints/4573537362990071808


### Remove Deployed Models without Traffic

In [54]:
for deployed_model in endpoint.list_models():
    if deployed_model.id in endpoint.traffic_split:
        print(f"Model {deployed_model.display_name} with version {deployed_model.model_version_id} has traffic = {endpoint.traffic_split[deployed_model.id]}")
    else:
        endpoint.undeploy(deployed_model_id = deployed_model.id)
        print(f"Undeploying {deployed_model.display_name} with version {deployed_model.model_version_id} because it has no traffic.")

Undeploying Endpoint model: projects/1026793852137/locations/us-central1/endpoints/4573537362990071808
Undeploy Endpoint model backing LRO: projects/1026793852137/locations/us-central1/endpoints/4573537362990071808/operations/6595913424020963328
Endpoint model undeployed. Resource name: projects/1026793852137/locations/us-central1/endpoints/4573537362990071808
Undeploying 05h_fraud with version 1 because it has no traffic.
Model 05i_fraud with version 1 has traffic = 100
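
The selection step is a membership test against the traffic split. A sketch with plain Python data, using the two deployed model IDs shown in this notebook; `ids_without_traffic` is a hypothetical helper. Like the loop above, it checks only whether an ID is absent from the split, so it would not catch an entry listed with 0 percent traffic.

```python
def ids_without_traffic(deployed_model_ids, traffic_split):
    """Return deployed model IDs that have no entry in the traffic split."""
    return [model_id for model_id in deployed_model_ids if model_id not in traffic_split]

deployed_ids = ['261177992061911040', '4872864010489298944']
traffic_split = {'4872864010489298944': 100}

to_undeploy = ids_without_traffic(deployed_ids, traffic_split)
print(to_undeploy)  # ['261177992061911040']
```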


In [55]:
endpoint.traffic_split

{'4872864010489298944': 100}

In [56]:
endpoint.list_models()

[id: "4872864010489298944"
 model: "projects/1026793852137/locations/us-central1/models/model_05i_fraud"
 display_name: "05i_fraud"
 create_time {
   seconds: 1661548852
   nanos: 544172000
 }
 dedicated_resources {
   machine_spec {
     machine_type: "n1-standard-4"
   }
   min_replica_count: 1
   max_replica_count: 1
 }
 model_version_id: "1"]

---
## Prediction

See many more details on requesting predictions in the `05tools_1 Predictions` notebook.

### Prepare a record for prediction: instance and parameters lists

In [57]:
pred = bigquery.query(query = f"SELECT * FROM {BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE} WHERE splits='TEST' LIMIT 10").to_dataframe()

In [58]:
pred.head(4)

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V23,V24,V25,V26,V27,V28,Amount,Class,transaction_id,splits
0,35337,1.092844,-0.01323,1.359829,2.731537,-0.707357,0.873837,-0.79613,0.437707,0.39677,...,-0.167647,0.027557,0.592115,0.219695,0.03697,0.010984,0.0,0,a1b10547-d270-48c0-b902-7a0f735dadc7,TEST
1,60481,1.238973,0.035226,0.063003,0.641406,-0.260893,-0.580097,0.049938,-0.034733,0.405932,...,-0.057718,0.104983,0.537987,0.589563,-0.046207,-0.006212,0.0,0,814c62c8-ade4-47d5-bf83-313b0aafdee5,TEST
2,139587,1.870539,0.211079,0.224457,3.889486,-0.380177,0.249799,-0.577133,0.179189,-0.120462,...,0.180776,-0.060226,-0.228979,0.080827,0.009868,-0.036997,0.0,0,d08a1bfa-85c5-4f1b-9537-1c5a93e6afd0,TEST
3,162908,-3.368339,-1.980442,0.153645,-0.159795,3.847169,-3.516873,-1.209398,-0.292122,0.760543,...,-1.171627,0.214333,-0.159652,-0.060883,1.294977,0.120503,0.0,0,802f3307-8e5a-4475-b795-5d5d8d7d0120,TEST


In [59]:
newob = pred[pred.columns[~pred.columns.isin(VAR_OMIT.split()+[VAR_TARGET, 'splits'])]].to_dict(orient='records')[0]
#newob
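
The cell above drops the omitted, target, and `splits` columns and keeps the first row as a dictionary. The same idea on a plain dictionary is sketched below. The values `VAR_TARGET = 'Class'` and `VAR_OMIT = 'transaction_id'` are assumptions based on the columns shown, and only three feature columns from the first row are kept for brevity:

```python
VAR_TARGET = 'Class'         # assumed value
VAR_OMIT = 'transaction_id'  # assumed value: a space-separated string of columns to drop

# A shortened version of the first row shown above.
record = {
    'Time': 35337,
    'V1': 1.092844,
    'Amount': 0.0,
    'Class': 0,
    'transaction_id': 'a1b10547-d270-48c0-b902-7a0f735dadc7',
    'splits': 'TEST',
}

# Keep only the columns the model expects as input features.
drop = set(VAR_OMIT.split() + [VAR_TARGET, 'splits'])
instance = {key: value for key, value in record.items() if key not in drop}
print(instance)  # {'Time': 35337, 'V1': 1.092844, 'Amount': 0.0}
```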

In [60]:
instances = [json_format.ParseDict(newob, Value())]

### Get Predictions: Python Client

In [61]:
prediction = endpoint.predict(instances=instances)
prediction

Prediction(predictions=[[0.998993218, 0.00100678555]], deployed_model_id='4872864010489298944', model_version_id='1', model_resource_name='projects/1026793852137/locations/us-central1/models/model_05i_fraud', explanations=None)

In [62]:
prediction.predictions[0]

[0.998993218, 0.00100678555]

In [63]:
np.argmax(prediction.predictions[0])

0
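
The same result can be read without NumPy. This sketch uses the prediction values shown above and assumes the two output positions correspond to classes 0 and 1, in that order:

```python
probabilities = [0.998993218, 0.00100678555]  # values returned by the endpoint above

# Index of the largest value, equivalent to np.argmax for a flat list.
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
confidence = probabilities[predicted_class]

print(predicted_class)  # 0
print(confidence)       # 0.998993218
```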

### Get Predictions: REST

In [64]:
with open(f'{DIR}/request.json','w') as file:
    file.write(json.dumps({"instances": [newob]}))

In [65]:
!curl -X POST \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
-d @{DIR}/request.json \
https://{REGION}-aiplatform.googleapis.com/v1/{endpoint.resource_name}:predict

{
  "predictions": [
    [
      0.998993218,
      0.00100678555
    ]
  ],
  "deployedModelId": "4872864010489298944",
  "model": "projects/1026793852137/locations/us-central1/models/model_05i_fraud",
  "modelDisplayName": "05i_fraud",
  "modelVersionId": "1"
}
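
The REST call needs two pieces: a URL built from the region and the endpoint resource name, and a JSON body with an `instances` list. The sketch below assembles both without sending anything. `build_predict_request` is a hypothetical helper, and the instance is a shortened, illustrative record rather than a complete input for this model:

```python
import json

def build_predict_request(region, endpoint_resource_name, instances):
    """Return the (url, body) pair used by the curl command above."""
    url = f"https://{region}-aiplatform.googleapis.com/v1/{endpoint_resource_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body

url, body = build_predict_request(
    region='us-central1',
    endpoint_resource_name='projects/1026793852137/locations/us-central1/endpoints/4573537362990071808',
    instances=[{'Time': 35337, 'V1': 1.092844, 'Amount': 0.0}],  # shortened, illustrative record
)
print(url)
print(body)
```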


### Get Predictions: gcloud (CLI)

In [66]:
!gcloud beta ai endpoints predict {endpoint.name.rsplit('/',1)[-1]} --region={REGION} --json-request={DIR}/request.json

Using endpoint [https://us-central1-prediction-aiplatform.googleapis.com/]
[[0.998993218, 0.00100678555]]


---
## Remove Resources
See the `99 - Cleanup` notebook.