# 02 Tools - AutoML Cloud Logging

Use the Vertex AI and Cloud Logging Python clients to parse the AutoML tuning and model ensemble logs.

### View Model Architecture in Cloud Logging

This [link](https://cloud.google.com/vertex-ai/docs/tabular-data/classification-regression/logging#before_you_begin) provides information on how to use Cloud Logging to view details about a Vertex AI model.

**Note**: By default, logs are deleted after ***30 days***.
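
Because of this retention window, it can help to check whether a model is recent enough for its logs to still exist before querying. The helper below is a minimal sketch: the function name `logs_likely_retained` and the constant `DEFAULT_RETENTION_DAYS` are hypothetical, and the 30-day value is taken from the note above rather than read from the project's actual log bucket settings.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 30 days is the default retention stated in the note above;
# change this if the project's log bucket uses a custom retention period.
DEFAULT_RETENTION_DAYS = 30


def logs_likely_retained(model_create_time, now=None, retention_days=DEFAULT_RETENTION_DAYS):
    """Return True if logs written at model_create_time should still be within retention.

    Both datetimes are expected to be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    return now - model_create_time < timedelta(days=retention_days)
```

With a model's `create_time` in hand, `logs_likely_retained(model.create_time)` gives a quick hint about whether an empty result is due to expired logs rather than a wrong filter.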

**Prerequisites:**
-  02b - Vertex AI - AutoML with clients (code)

inputs:

In [None]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

In [None]:
REGION = 'us-central1'
DATANAME = 'fraud'
NOTEBOOK = '02b'

# Resources
DEPLOY_COMPUTE = 'n1-standard-4'

# Model Training
VAR_TARGET = 'Class'
VAR_OMIT = 'transaction_id' # add more variables to the string with space delimiters

packages:

In [None]:
from google.cloud import aiplatform
from datetime import datetime
from google.cloud import bigquery
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value
import json
import numpy as np
import pandas as pd
import google.cloud.logging_v2 as logging
import pandas_gbq as pd_gbq
import matplotlib.pylab as plt

clients:

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION)
bq = bigquery.Client()  # separate name so the bigquery module is not shadowed when the cell is re-run

parameters:

In [None]:
model_list_filter = "labels.notebook=\"" + NOTEBOOK + "\""
model_list_filter

In [None]:
model_list = aiplatform.Model.list(filter=model_list_filter)
# Pick the most recently created model explicitly instead of relying on list order
latest_model = max(model_list, key=lambda m: m.create_time)
model_create_time = latest_model.create_time
# create_time is a timezone-aware datetime; format it for use in the log filter
model_run_ts = model_create_time.strftime("%Y-%m-%dT%H:%M:%SZ")
model_run_ts
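
The timestamp built in the cell above is later embedded in a Cloud Logging filter, so it needs to be a UTC string such as `2023-01-02T01:04:05Z`. As a minimal sketch of that conversion on its own, the hypothetical helper `to_rfc3339_utc` below normalizes any timezone-aware datetime to UTC before formatting; the example datetime is invented for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339_utc(ts):
    """Format a timezone-aware datetime as a UTC RFC 3339 string with second precision."""
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Convert to UTC first so the trailing 'Z' is accurate
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Invented example: 03:04:05 at UTC+2 is 01:04:05 in UTC
example = datetime(2023, 1, 2, 3, 4, 5, 678000, tzinfo=timezone(timedelta(hours=2)))
print(to_rfc3339_utc(example))  # 2023-01-02T01:04:05Z
```

Formatting the datetime directly avoids a round trip through `str()` and `strptime`, which fails when the string form has no fractional seconds.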

In [None]:
# # import google.cloud.logging_v2 as logging
# # import pandas as pd
# # import json
# # import matplotlib.pylab as plt
# def get_logs(log_level):
#     logging_client = logging.Client()
#     hp_list = []
#     FILTER = log_filter
#     entries = logging_client.list_entries(filter_=FILTER)
#     for ind, entry in enumerate(entries):
#         if type(entry.payload) != dict:
#             continue
#         parse_log = entry.to_api_repr()
#         if model_log_filter in parse_log["logName"]:
#             if log_level == "tuning":
#                 for hp in parse_log["jsonPayload"]["modelStructure"]["modelParameters"]:
#                     # print(hp["hyperparameters"])
#                     hp_dict = hp["hyperparameters"]
#                     hp_dict["training_objective_point"] = parse_log['jsonPayload']['trainingObjectivePoint']['value']
#                     hp_list.append(hp_dict)
#             elif log_level == "model":
#                 for hp in parse_log["jsonPayload"]["modelParameters"]:
#                     hp_dict = hp["hyperparameters"]
#                     hp_list.append(hp_dict)
#             df = pd.DataFrame(hp_list)
#     return df

# display(get_logs(log_level))

### Parsing AutoML logs using Cloud Logging API

This [link](https://cloud.google.com/python/docs/reference/logging/latest/index.html) contains details about using the **Python client** for Cloud Logging.

There are two log levels in AutoML:
- Tuning
- Model

To start with, we will retrieve the tuning logs by setting the **log level** as `tuning` in the parameter list below:


In [None]:
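log_level = "tuning"
# Log name prefix used to keep only the AutoML entries for this log level
model_log_filter = "projects/" + PROJECT_ID + "/logs/automl.googleapis.com%2F" + log_level
# Server-side filter: entries written after the model was created, for training jobs
log_filter = "timestamp > \"" + model_run_ts + "\" resource.type=\"cloudml_job\""
print(model_log_filter)
print(log_filter)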
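
The two strings built in the cell above are repeated later for the `model` log level, so they could be produced by one small function. This is only a sketch: `build_automl_log_filters` is a hypothetical name, and the project ID and timestamp in the usage line are placeholders rather than real values.

```python
def build_automl_log_filters(project_id, log_level, since_ts):
    """Return (log_name, log_filter) using the same string layout as the notebook cells."""
    if log_level not in ("tuning", "model"):
        raise ValueError("log_level must be 'tuning' or 'model'")
    # Log name that AutoML entries for this level are matched against
    log_name = f"projects/{project_id}/logs/automl.googleapis.com%2F{log_level}"
    # Filter passed to the Cloud Logging client: newer than since_ts, training-job resources
    log_filter = f'timestamp > "{since_ts}" resource.type="cloudml_job"'
    return log_name, log_filter


# Placeholder inputs for illustration only
name, flt = build_automl_log_filters("my-project", "tuning", "2023-01-02T01:04:05Z")
print(name)
print(flt)
```

Validating `log_level` up front catches typos that would otherwise silently return an empty DataFrame.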

In [None]:
def get_tuning_logs(log_level):
    # log_level is kept for the existing call signature; the filters come from
    # the notebook-level log_filter and model_log_filter defined above.
    logging_client = logging.Client()
    hp_list = []
    entries = logging_client.list_entries(filter_=log_filter)
    for entry in entries:
        # Only structured (JSON) payloads carry the hyperparameter details
        if not isinstance(entry.payload, dict):
            continue
        parse_log = entry.to_api_repr()
        if model_log_filter in parse_log["logName"]:
            for hp in parse_log["jsonPayload"]["modelStructure"]["modelParameters"]:
                hp_list.append(hp["hyperparameters"])
    # Build the DataFrame once, after the loop, so it is defined even when no entries match
    return pd.DataFrame(hp_list)

In [None]:
tuning_log_df = get_tuning_logs(log_level)
# Cast every value to str so all columns share one type (applymap is deprecated in recent pandas)
tuning_log_df = tuning_log_df.astype(str)
# tuning_log_df.to_gbq(destination_table="automl_log.tuning_logs")
# pd_gbq.to_gbq(tuning_log_df,"automl_log.tuning_logs_str",project_id=PROJECT_ID)
with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # show all rows and columns
    display(tuning_log_df)

The next step is to add the **training objective point** to the tuning logs. Refer to this [link](https://cloud.google.com/vertex-ai/docs/tabular-data/classification-regression/logging#before_you_begin) to learn more about the training objective point.

In [None]:
# log_level = "tuning" ##possible values are tuning or model
# model_log_filter = "projects/" + PROJECT_ID + "/logs/automl.googleapis.com%2F" + log_level
# model_log_filter
# log_filter = "timestamp > \"" + model_run_ts + "\" resource.type=\"cloudml_job\""
# model_log_filter
# log_filter

In [None]:
def get_tuning_logs_with_obj(log_level):
    # log_level is kept for the existing call signature; the filters come from
    # the notebook-level log_filter and model_log_filter defined above.
    logging_client = logging.Client()
    hp_list = []
    entries = logging_client.list_entries(filter_=log_filter)
    for entry in entries:
        # Only structured (JSON) payloads carry the hyperparameter details
        if not isinstance(entry.payload, dict):
            continue
        parse_log = entry.to_api_repr()
        if model_log_filter in parse_log["logName"]:
            # One objective value per log entry, attached to each of its parameter sets
            objective = parse_log["jsonPayload"]["trainingObjectivePoint"]["value"]
            for hp in parse_log["jsonPayload"]["modelStructure"]["modelParameters"]:
                hp_dict = hp["hyperparameters"]
                hp_dict["training_objective_point"] = objective
                hp_list.append(hp_dict)
    # Build the DataFrame once, after the loop, so it is defined even when no entries match
    return pd.DataFrame(hp_list)

tuning_with_obj_log_df = get_tuning_logs_with_obj(log_level)
# pd_gbq.to_gbq(tuning_with_obj_log_df,"automl_log.tuning_with_obj_logs",project_id=PROJECT_ID)
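
To make the flattening step easier to follow without calling Cloud Logging, the sketch below applies the same logic to a single in-memory payload. The function name `tuning_rows` is hypothetical, and the sample payload is invented: its keys mirror only what the notebook code reads (`modelStructure`, `modelParameters`, `hyperparameters`, `trainingObjectivePoint`), while the hyperparameter names and values are placeholders, not real AutoML output.

```python
def tuning_rows(json_payload):
    """Flatten one tuning log payload into a list of hyperparameter dicts."""
    # Missing keys yield None or an empty list instead of raising KeyError
    objective = json_payload.get("trainingObjectivePoint", {}).get("value")
    rows = []
    for params in json_payload.get("modelStructure", {}).get("modelParameters", []):
        row = dict(params.get("hyperparameters", {}))  # copy so the payload is not mutated
        row["training_objective_point"] = objective
        rows.append(row)
    return rows


# Invented payload for illustration; only the key layout follows the notebook code
sample_payload = {
    "modelStructure": {
        "modelParameters": [
            {"hyperparameters": {"model_type": "example_type_a", "example_depth": 3}},
            {"hyperparameters": {"model_type": "example_type_b"}},
        ]
    },
    "trainingObjectivePoint": {"value": 0.25},
}

rows = tuning_rows(sample_payload)
print(rows)
```

Using `.get()` with defaults is a design choice for robustness: entries that lack one of the expected keys are skipped rather than aborting the whole scan.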

To retrieve the model ensemble logs, set **log_level** to `model` in the parameter list below:

In [None]:
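log_level = "model"
# Log name prefix used to keep only the AutoML entries for this log level
model_log_filter = "projects/" + PROJECT_ID + "/logs/automl.googleapis.com%2F" + log_level
# Server-side filter: entries written after the model was created, for training jobs
log_filter = "timestamp > \"" + model_run_ts + "\" resource.type=\"cloudml_job\""
print(model_log_filter)
print(log_filter)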

In [None]:
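def get_model_logs(log_level):
    # log_level is kept for the existing call signature; the filters come from
    # the notebook-level log_filter and model_log_filter defined above.
    logging_client = logging.Client()
    hp_list = []
    entries = logging_client.list_entries(filter_=log_filter)
    for entry in entries:
        # Only structured (JSON) payloads carry the hyperparameter details
        if not isinstance(entry.payload, dict):
            continue
        parse_log = entry.to_api_repr()
        if model_log_filter in parse_log["logName"]:
            # Model-level entries list modelParameters directly under jsonPayload
            for hp in parse_log["jsonPayload"]["modelParameters"]:
                hp_list.append(hp["hyperparameters"])
    # Build the DataFrame once, after the loop, so it is defined even when no entries match
    return pd.DataFrame(hp_list)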

# display(get_model_logs(log_level))

In [None]:
model_log_df = get_model_logs(log_level)
# Cast every value to str so all columns share one type (applymap is deprecated in recent pandas)
model_log_df = model_log_df.astype(str)
# pd_gbq.to_gbq(model_log_df,"automl_log.model_logs",project_id=PROJECT_ID)
# display(model_log_df)

Join the **Tuning** and **Model** logs to display the common training parameters present in the model ensemble logs.

In [None]:
df_cd = pd.merge(tuning_log_df, model_log_df, how='inner')

In [None]:
with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
    display(df_cd)
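
When `pd.merge` is called with `how='inner'` and no `on` argument, as in the join above, pandas joins on every column name the two frames share and keeps only rows whose values match in all of those columns. The sketch below illustrates that behavior on two small, invented frames; the column names and values are placeholders, not real AutoML hyperparameters.

```python
import pandas as pd

# Invented stand-ins for the tuning and model log frames (all values as strings)
tuning = pd.DataFrame(
    {"model_type": ["a", "b", "c"], "depth": ["1", "2", "3"], "extra": ["x", "y", "z"]}
)
model = pd.DataFrame({"model_type": ["b", "c", "d"], "depth": ["2", "9", "4"]})

# The columns an implicit merge would join on, listed explicitly here
shared = [c for c in tuning.columns if c in model.columns]

# Inner join: a row survives only if it matches on every shared column
joined = pd.merge(tuning, model, how="inner", on=shared)
print(shared)
print(joined)
```

Passing `on=shared` explicitly makes the join keys visible, which helps when a row is unexpectedly dropped because one shared column differs.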