# 02d - Vertex AI AutoML - Tabular Workflows - TabNet

**IN DEVELOPMENT - INCOMPLETE**

- https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.utils.get_tabnet_study_spec_parameters_override
- https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.utils.get_tabnet_trainer_pipeline_and_parameters
- https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.utils.get_tabnet_hyperparameter_tuning_job_pipeline_and_parameters
- https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.TabNetHyperparameterTuningJobOp
- https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.TabNetTrainerOp


---
## Setup

### Package Installs (if needed)

This notebook uses the Python client for
- Google Service Usage
    - to enable APIs (Dataflow)

The cell below checks whether the required Python library is installed. If it is missing, the cell prints a message and runs the associated pip install command. The install must complete before continuing with this notebook.

In [19]:
try:
    import google.cloud.service_usage_v1
except ImportError:
    print('You need to pip install google-cloud-service-usage')
    !pip install google-cloud-service-usage -q

### Environment

inputs:

In [1]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [2]:
REGION = 'us-central1'
EXPERIMENT = '02d'
SERIES = '02'

# source data
BQ_PROJECT = PROJECT_ID
BQ_DATASET = 'fraud'
BQ_TABLE = 'fraud_prepped'

# Model Training
VAR_TARGET = 'Class'
VAR_OMIT = 'transaction_id' # add more variables to the string with space delimiters

packages:

In [20]:
from google.cloud import aiplatform
from google.cloud import storage
from google.cloud import bigquery
from google.cloud import service_usage_v1

from datetime import datetime
import json
import numpy as np
import pandas as pd

from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

from google_cloud_pipeline_components.experimental.automl.tabular import utils as automl_tabular_utils

clients:

In [21]:
aiplatform.init(project=PROJECT_ID, location=REGION)
bq = bigquery.Client()
gcs = storage.Client()
su_client = service_usage_v1.ServiceUsageClient()

parameters:

In [22]:
TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
BUCKET = PROJECT_ID
URI = f"gs://{BUCKET}/{SERIES}/{EXPERIMENT}"
DIR = f"temp/{EXPERIMENT}"

In [23]:
SERVICE_ACCOUNT = !gcloud config list --format='value(core.account)' 
SERVICE_ACCOUNT = SERVICE_ACCOUNT[0]
SERVICE_ACCOUNT

'1026793852137-compute@developer.gserviceaccount.com'

List the service account's current roles:

In [24]:
!gcloud projects get-iam-policy $PROJECT_ID --filter="bindings.members:$SERVICE_ACCOUNT" --format='table(bindings.role)' --flatten="bindings[].members"

ROLE
roles/bigquery.admin
roles/owner
roles/run.admin
roles/storage.objectAdmin


>Note: If the resulting list is missing [roles/storage.objectAdmin](https://cloud.google.com/storage/docs/access-control/iam-roles), [revisit the setup notebook](../00%20-%20Setup/00%20-%20Environment%20Setup.ipynb#permissions) and add this role to the service account using the instructions provided there.

environment:

In [25]:
!rm -rf {DIR}
!mkdir -p {DIR}

Experiment Tracking:

In [82]:
FRAMEWORK = 'tabnet'
TASK = 'classification'
MODEL_TYPE = 'dnn'
EXPERIMENT_NAME = f'experiment-{SERIES}-{EXPERIMENT}-{FRAMEWORK}-{TASK}-{MODEL_TYPE}'
RUN_NAME = f'run-{TIMESTAMP}'

### Enable APIs

Using Dataflow requires enabling the Dataflow API for the Google Cloud project.

There are two options for enabling it. This notebook uses option 2.
 1. Use the APIs & Services page in the console: https://console.cloud.google.com/apis
     - `+ Enable APIs and Services`
     - Search for Dataflow
 2. Use [Google Service Usage](https://cloud.google.com/service-usage/docs) API from Python
     - [Python Client For Service Usage](https://github.com/googleapis/python-service-usage)
     - [Python Client Library Documentation](https://cloud.google.com/python/docs/reference/serviceusage/latest)
     
The following code cells use the Service Usage Client to:
- get the state of the service
- if 'DISABLED':
    - Try enabling the service and return the state after trying
- if 'ENABLED' print the state for confirmation

#### Dataflow

In [26]:
dataflow = su_client.get_service(
    request = service_usage_v1.GetServiceRequest(
        name = f'projects/{PROJECT_ID}/services/dataflow.googleapis.com'
    )
).state.name


if dataflow == 'DISABLED':
    print(f'Dataflow is currently {dataflow} for project: {PROJECT_ID}')
    print(f'Trying to Enable...')
    operation = su_client.enable_service(
        request = service_usage_v1.EnableServiceRequest(
            name = f'projects/{PROJECT_ID}/services/dataflow.googleapis.com'
        )
    )
    response = operation.result()
    if response.service.state.name == 'ENABLED':
        print(f'Dataflow is now enabled for project: {PROJECT_ID}')
    else:
        print(response)
else:
    print(f'Dataflow already enabled for project: {PROJECT_ID}')

Dataflow is currently DISABLED for project: statmike-mlops-349915
Trying to Enable...
Dataflow is now enabled for project: statmike-mlops-349915


---
## TabNet Trainer - With Hyperparameter Tuning

Get the column information (schema) for the training data:

In [27]:
query = f"""
SELECT *
FROM {BQ_PROJECT}.{BQ_DATASET}.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '{BQ_TABLE}'
"""
schema = bq.query(query = query).to_dataframe()

Make a list of features from the table schema - omit the target variable (`VAR_TARGET`) and other variables that are not part of the model training:

In [28]:
# VAR_OMIT is a space-delimited string, so split it into individual column names
features = schema[~schema.column_name.isin(VAR_OMIT.split() + [VAR_TARGET] + ['splits'])].column_name.tolist()
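
As a small illustration of the filtering step, the sketch below applies the same exclusion logic to a plain list. The column names are hypothetical stand-ins for `schema.column_name`, and `VAR_OMIT` is given two names to show how a space-delimited value would be expanded.

```python
# Hypothetical column names standing in for schema.column_name
columns = ['transaction_id', 'Time', 'V1', 'Amount', 'Class', 'splits']

VAR_TARGET = 'Class'
VAR_OMIT = 'transaction_id Time'  # space-delimited, as the inputs cell describes

# str.split() with no argument splits on any run of whitespace
excluded = set(VAR_OMIT.split()) | {VAR_TARGET, 'splits'}

# Keep every column that is not omitted, not the target, and not the split key
features = [c for c in columns if c not in excluded]
print(features)
```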

Create a feature transformation configuration.  This example uses `auto_transforms` for all features, but many custom options are available.

In [29]:
transform_config = {"auto_transforms": features}

Save the `transform_config` to the bucket path for this pipeline:

In [30]:
pipeline_job_root_dir = f'{URI}/pipelines/{TIMESTAMP}'
bucket = gcs.lookup_bucket(BUCKET)
# the blob name is the pipeline root without the leading gs://<bucket>/ prefix
blob = bucket.blob(pipeline_job_root_dir.split(f'gs://{BUCKET}/')[1] + '/transform_config.json')
blob.upload_from_string(json.dumps(transform_config))
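
The upload above needs the object name without the `gs://<bucket>/` prefix. A minimal sketch of that split is shown below; `split_gcs_uri` is a hypothetical helper written for this illustration, not part of the `google-cloud-storage` client.

```python
from typing import Tuple

def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://bucket/path/to/blob' into ('bucket', 'path/to/blob')."""
    prefix = 'gs://'
    if not uri.startswith(prefix):
        raise ValueError(f'not a gs:// URI: {uri}')
    # partition on the first '/' after the scheme: bucket name, then object name
    bucket_name, _, blob_name = uri[len(prefix):].partition('/')
    return bucket_name, blob_name

bucket_name, blob_name = split_gcs_uri(
    'gs://statmike-mlops-349915/02/02d/pipelines/20221115172949/transform_config.json'
)
print(bucket_name)
print(blob_name)
```

Partitioning on the first `/` avoids depending on the bucket name appearing only once in the path.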

Use the utility function [`get_tabnet_hyperparameter_tuning_job_pipeline_and_parameters`](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.23/google_cloud_pipeline_components.experimental.automl.tabular.html#google_cloud_pipeline_components.experimental.automl.tabular.utils.get_tabnet_hyperparameter_tuning_job_pipeline_and_parameters) to create a pipeline template and set parameter values for hyperparameter tuning.

In [None]:
(template_path, parameter_values) = automl_tabular_utils.get_tabnet_hyperparameter_tuning_job_pipeline_and_parameters(
    project = PROJECT_ID,
    location = REGION,
    target_column = VAR_TARGET,
    prediction_type = 'classification',
    root_dir = pipeline_job_root_dir,
    transform_config = pipeline_job_root_dir + f'/transform_config.json',
    data_source_bigquery_table_path = f'bq://{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}',
    predefined_split_key = 'splits',
    study_spec_metric_id = 'loss',
    study_spec_metric_goal = 'MINIMIZE',
    study_spec_parameters_override = automl_tabular_utils.get_tabnet_study_spec_parameters_override(
        dataset_size_bucket = "small",
        prediction_type = 'classification',
        training_budget_bucket = "small",
    ),
    max_trial_count = 200,
    parallel_trial_count = 10,
    max_failed_trial_count = 10,
    worker_pool_specs_override = [{'machine_spec': {'machine_type': 'n1-standard-8'}, }],
    run_evaluation = True,
)

Create a Vertex AI Pipeline Job:

In [32]:
pipeline_job = aiplatform.PipelineJob(
    display_name = EXPERIMENT,
    job_id = f'series-{SERIES}-{EXPERIMENT}-{TIMESTAMP}',
    pipeline_root = pipeline_job_root_dir,
    template_path = template_path,
    parameter_values = parameter_values,
    enable_caching = False,
)

Run the Vertex AI Pipeline Job:

In [None]:
pipeline_job.run(
    sync = True
)

Creating PipelineJob
PipelineJob created. Resource name: projects/1026793852137/locations/us-central1/pipelineJobs/series-02-02d-20221115172949
To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/1026793852137/locations/us-central1/pipelineJobs/series-02-02d-20221115172949')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/series-02-02d-20221115172949?project=1026793852137
PipelineJob projects/1026793852137/locations/us-central1/pipelineJobs/series-02-02d-20221115172949 current state:
PipelineState.PIPELINE_STATE_RUNNING
PipelineJob projects/1026793852137/locations/us-central1/pipelineJobs/series-02-02d-20221115172949 current state:
PipelineState.PIPELINE_STATE_RUNNING
PipelineJob projects/1026793852137/locations/us-central1/pipelineJobs/series-02-02d-20221115172949 current state:
PipelineState.PIPELINE_STATE_RUNNING
PipelineJob projects/1026793852137/locations/us-central1/pipelineJobs/series-

## Get Best Trial's Parameters

In [41]:
# If the job above was run then this pointer is already defined as pipeline_job:
pipeline = aiplatform.PipelineJob.get(resource_name = f'series-{SERIES}-{EXPERIMENT}-{TIMESTAMP}')

In [42]:
pipeline_details = pipeline.gca_resource.job_detail.task_details

In [47]:
for task in pipeline_details:
    if task.task_name == 'get-best-hyperparameter-tuning-job-trial':
        task_details = task
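
The loop above leaves `task_details` undefined if no task carries the expected name. One way to make a missing task explicit is sketched below, using stand-in objects rather than a real pipeline; `find_task` is a hypothetical helper and `'feature-transform-engine'` is only a placeholder task name.

```python
from types import SimpleNamespace

# Stand-ins for the entries of pipeline.gca_resource.job_detail.task_details
pipeline_details = [
    SimpleNamespace(task_name='feature-transform-engine'),  # placeholder name
    SimpleNamespace(task_name='get-best-hyperparameter-tuning-job-trial'),
]

def find_task(tasks, name):
    """Return the first task whose task_name matches, or None if absent."""
    return next((t for t in tasks if t.task_name == name), None)

task_details = find_task(pipeline_details, 'get-best-hyperparameter-tuning-job-trial')
missing = find_task(pipeline_details, 'no-such-task')
print(task_details.task_name, missing)
```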

In [50]:
task_details.outputs['unmanaged_container_model']

artifacts {
  name: "projects/1026793852137/locations/us-central1/metadataStores/default/artifacts/3593481980663021634"
  display_name: "unmanaged_container_model"
  uri: "gs://statmike-mlops-349915/02/02d/pipelines/20221115172949/6571834875286913024/5278658018173517824/train/trial_155/model"
  etag: "1668571761193"
  create_time {
    seconds: 1668571719
    nanos: 531000000
  }
  update_time {
    seconds: 1668571761
    nanos: 193000000
  }
  state: LIVE
  schema_title: "system.Artifact"
  schema_version: "0.0.1"
  metadata {
    fields {
      key: "containerSpec"
      value {
        struct_value {
          fields {
            key: "healthRoute"
            value {
              string_value: "/health"
            }
          }
          fields {
            key: "imageUri"
            value {
              string_value: "us-docker.pkg.dev/vertex-ai/automl-tabular/prediction-server:20221020_1325_RC00"
            }
          }
          fields {
            key: "predictRoute"


In [70]:
for md in task_details.execution.metadata:
    if md == 'input:gcp_resources':
        hpt_uri = json.loads(task_details.execution.metadata[md])['resources'][0]['resourceUri']
        hpt_jobid = hpt_uri.rsplit('/', 1)[1]

In [69]:
hpt_uri, hpt_jobid

('https://us-central1-aiplatform.googleapis.com/v1/projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/8784187235424534528',
 '8784187235424534528')
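
To make the parsing step concrete, the sketch below runs the same logic on a sample payload. The payload is trimmed to the single field the cell reads, and `parse_resource` is a hypothetical helper; the URI is the one shown in the output above.

```python
import json

# Sample payload shaped like the 'input:gcp_resources' metadata value,
# reduced to the one field that the cell above reads
gcp_resources = json.dumps({
    'resources': [{
        'resourceUri': (
            'https://us-central1-aiplatform.googleapis.com/v1/projects/1026793852137'
            '/locations/us-central1/hyperparameterTuningJobs/8784187235424534528'
        )
    }]
})

def parse_resource(payload: str):
    """Return (resource_name, job_id) from a gcp_resources JSON string."""
    uri = json.loads(payload)['resources'][0]['resourceUri']
    # everything after the API version segment is the resource name
    resource_name = uri.split('/v1/', 1)[1]
    # the final path segment is the job id
    job_id = uri.rsplit('/', 1)[1]
    return resource_name, job_id

resource_name, job_id = parse_resource(gcp_resources)
print(resource_name)
print(job_id)
```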

In [75]:
tuningJob = aiplatform.HyperparameterTuningJob.get(hpt_jobid)

In [76]:
tuningJob.resource_name, tuningJob.display_name

('projects/1026793852137/locations/us-central1/hyperparameterTuningJobs/8784187235424534528',
 'tabnet-hyperparameter-tuning-job-6571834875286913024-5278658018173517824')

In [80]:
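# The study metric is 'loss' (goal: MINIMIZE), so 1 - loss is highest for the best trial; unsuccessful trials score 0
auprc = [1 - trial.final_measurement.metrics[0].value if trial.state.name == 'SUCCEEDED' else 0 for trial in tuningJob.trials]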
auprc

[0.8677037209272385,
 0.8977457359433174,
 0.9001611471176147,
 0.8908845335245132,
 0.915027841925621,
 0.902820460498333,
 0.8819985017180443,
 0.9205226227641106,
 0.8662902265787125,
 0.9123073294758797,
 0.8082566261291504,
 0.9481976218521595,
 0.9295452758669853,
 0.9225355759263039,
 0.9963639543857425,
 0.9134967848658562,
 0.9269111901521683,
 0.9158775210380554,
 0.9333544746041298,
 0.9913488542661071,
 0.9926850749179721,
 0.9961234505753964,
 0.9958049366250634,
 0.9955801549367607,
 0.9960433579981327,
 0.9965664104092866,
 0.9963713132310659,
 0.9957402837462723,
 0.9962091210763901,
 0.995978101156652,
 0.9955959049984813,
 0.995867426507175,
 0.995451639406383,
 0.9957091896794736,
 0.9961250720079988,
 0.9960567271336913,
 0.99637147388421,
 0.9957201932556927,
 0.9955389364622533,
 0.996281556552276,
 0.9849202241748571,
 0.9953474551439285,
 0.9963344179559499,
 0.9962636115960777,
 0.9965063377749175,
 0.995743207167834,
 0.9873314378783107,
 0.994895467069,
 0.99
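
The selection logic can be tried without a tuning job. The sketch below uses a hypothetical `FakeTrial` class with made-up loss values, and the neutral name `scores`, since the source only establishes that the study metric is `loss`. It also shows that picking the highest `1 - loss` gives the same trial as picking the lowest loss among successful trials.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakeTrial:
    """Hypothetical stand-in for a tuning job trial."""
    id: str
    state: str
    loss: Optional[float]

# Made-up values for illustration only
trials = [
    FakeTrial('1', 'SUCCEEDED', 0.13),
    FakeTrial('2', 'FAILED', None),
    FakeTrial('3', 'SUCCEEDED', 0.004),
]

# Same scoring rule as the cell above: 1 - loss, with 0 for unsuccessful trials
scores = [1 - t.loss if t.state == 'SUCCEEDED' else 0 for t in trials]
best = trials[scores.index(max(scores))]

# Equivalent selection directly on loss, skipping unsuccessful trials
best_by_loss = min((t for t in trials if t.state == 'SUCCEEDED'), key=lambda t: t.loss)
print(best.id, best_by_loss.id)
```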

In [81]:
best = tuningJob.trials[auprc.index(max(auprc))]
best

id: "155"
state: SUCCEEDED
parameters {
  parameter_id: "alpha_focal_loss"
  value {
    number_value: 0.99
  }
}
parameters {
  parameter_id: "batch_momentum"
  value {
    number_value: 0.5
  }
}
parameters {
  parameter_id: "batch_size"
  value {
    number_value: 512.0
  }
}
parameters {
  parameter_id: "batch_size_ratio"
  value {
    number_value: 0.5
  }
}
parameters {
  parameter_id: "class_weight"
  value {
    number_value: 1.0
  }
}
parameters {
  parameter_id: "decay_every"
  value {
    number_value: 5000.0
  }
}
parameters {
  parameter_id: "decay_rate"
  value {
    number_value: 0.5
  }
}
parameters {
  parameter_id: "feature_dim"
  value {
    number_value: 200.0
  }
}
parameters {
  parameter_id: "feature_dim_ratio"
  value {
    number_value: 0.39029597890948337
  }
}
parameters {
  parameter_id: "gamma_focal_loss"
  value {
    number_value: 4.0
  }
}
parameters {
  parameter_id: "large_category_dim"
  value {
    number_value: 5.0
  }
}
parameters {
  parameter_id:
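
For later use, the repeated `parameters` entries can be flattened into a dictionary. The sketch below uses stand-in objects carrying three of the values shown above. On the real trial object the attribute names `parameter_id` and `value` match the printed output, but `value` may come back as a protobuf wrapper rather than a plain float, so treat this as an assumption to verify.

```python
from types import SimpleNamespace

# Stand-ins for entries of best.parameters, using values from the output above
parameters = [
    SimpleNamespace(parameter_id='alpha_focal_loss', value=0.99),
    SimpleNamespace(parameter_id='batch_size', value=512.0),
    SimpleNamespace(parameter_id='feature_dim', value=200.0),
]

# Map each parameter id to its tuned value
best_params = {p.parameter_id: p.value for p in parameters}
print(best_params)
```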

---
## Serving

### Upload The Model
Upload only the best model from the hyperparameter tuning job:

In [24]:
modelmatch = aiplatform.Model.list(filter = f'display_name={SERIES}_{EXPERIMENT} AND labels.series={SERIES} AND labels.experiment={EXPERIMENT}')

# ASSUMPTION: the original cell left the serving image and artifact URI unset.
# Here both are read from the best trial's `unmanaged_container_model` artifact shown above.
best_artifact = task_details.outputs['unmanaged_container_model'].artifacts[0]
DEPLOY_IMAGE = best_artifact.metadata['containerSpec']['imageUri']

upload_model = True
parent_model = None # stays None when the model is new to the registry
if modelmatch:
    print("Model Already in Registry:")
    if f'{RUN_NAME}-{best.id}' in modelmatch[0].version_aliases:
        print("This version already loaded, no action taken.")
        upload_model = False
        model = aiplatform.Model(model_name = modelmatch[0].resource_name)
    else:
        print('Loading model as new default version.')
        parent_model = modelmatch[0].resource_name

else:
    print('This is a new model, creating in model registry')

if upload_model:
    model = aiplatform.Model.upload(
        display_name = f'{SERIES}_{EXPERIMENT}',
        model_id = f'model_{SERIES}_{EXPERIMENT}',
        parent_model = parent_model, # use the value set above; modelmatch is empty for a new model
        serving_container_image_uri = DEPLOY_IMAGE,
        artifact_uri = best_artifact.uri,
        is_default_version = True,
        version_aliases = [f'{RUN_NAME}-{best.id}'],
        version_description = f'{RUN_NAME}-{best.id}',
        labels = {'series' : f'{SERIES}', 'experiment' : f'{EXPERIMENT}', 'experiment_name' : f'{EXPERIMENT_NAME}', 'run_name' : f'{RUN_NAME}-{best.id}'}
    )

Model Already in Registry:
Loading model as new default version.
Creating Model
Create Model backing LRO: projects/1026793852137/locations/us-central1/models/model_05_05g/operations/4880038667457921024
Model created. Resource name: projects/1026793852137/locations/us-central1/models/3085066899818545152@2
To use this Model in another session:
model = aiplatform.Model('projects/1026793852137/locations/us-central1/models/3085066899818545152@2')
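
The upload cell handles three cases: no matching model, a matching model that already has this version alias, and a matching model without it. The decision can be sketched as a pure function, which makes the branches easy to check without calling Vertex AI. `plan_upload` and its arguments are hypothetical names for this illustration.

```python
def plan_upload(existing_aliases, new_alias, existing_resource_name=None):
    """Return (upload, parent_model) for the three registry cases.

    existing_aliases is None when no model matched the registry filter.
    """
    if existing_aliases is None:
        # New model: upload without a parent
        return True, None
    if new_alias in existing_aliases:
        # This version is already registered: nothing to upload
        return False, None
    # Existing model, new version: upload under the existing model
    return True, existing_resource_name

print(plan_upload(None, 'run-1-155'))
print(plan_upload(['run-1-155'], 'run-1-155', 'models/123'))
print(plan_upload(['run-0-7'], 'run-1-155', 'models/123'))
```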


>**Note** on Version Aliases:
>Expectation is a name starting with `a-z` that can include `[a-zA-Z0-9-]`
>
>**Retrieve a Model Resource**
>[aiplatform.Model()](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform.Model)
>```Python
>model = aiplatform.Model(model_name = f'model_{SERIES}_{EXPERIMENT}') # retrieves default version
>model = aiplatform.Model(model_name = f'model_{SERIES}_{EXPERIMENT}@time-{TIMESTAMP}') # retrieves specific version
>model = aiplatform.Model(model_name = f'model_{SERIES}_{EXPERIMENT}', version = f'time-{TIMESTAMP}') # retrieves specific version
>```
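
The alias rule in the note can be checked before uploading. The sketch below transcribes only what the note states (a first character in `a-z`, then characters from `[a-zA-Z0-9-]`); any further limits the service enforces, such as length, are not covered. `is_valid_alias` is a hypothetical helper.

```python
import re

# Pattern transcribed from the note: first character a-z, then letters, digits, or hyphens
ALIAS_PATTERN = re.compile(r'[a-z][a-zA-Z0-9-]*')

def is_valid_alias(alias: str) -> bool:
    """Check an alias against the rule stated in the note (not the full service rule)."""
    return ALIAS_PATTERN.fullmatch(alias) is not None

for candidate in ['run-20221115172949-155', '155-run', 'run_155']:
    print(candidate, is_valid_alias(candidate))
```

An alias built as `f'{RUN_NAME}-{best.id}'` passes this check because `RUN_NAME` starts with `run-`.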

In [25]:
print(f'Review the model in the Vertex AI Model Registry:\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/models/{model.name}?project={PROJECT_ID}')

Review the model in the Vertex AI Model Registry:
https://console.cloud.google.com/vertex-ai/locations/us-central1/models/3085066899818545152?project=statmike-mlops-349915
