In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# E2E ML on GCP: MLOps stage 2 : experimentation: get started with Logging and Vertex AI Experiments

<table align="left">
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage2/get_started_vertex_experiments.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
    <td>
        <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage2/get_started_vertex_experiments.ipynb">
        <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
        </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage2/get_started_vertex_experiments.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>
<br/><br/><br/>

## Overview


This tutorial demonstrates how to use Vertex AI for E2E MLOps on Google Cloud in production. It covers stage 2, experimentation: get started with logging and Vertex AI Experiments.

### Objective

In this tutorial, you learn how to use Python logging and `Vertex AI Experiments` when training with `Vertex AI`.

This tutorial uses the following Google Cloud ML services:

- `Vertex AI Experiments`
- `Vertex AI ML Metadata`

The steps performed include:

- Use Python logging to log training configuration/results locally.
- Use Google Cloud Logging to log training configuration/results in the cloud.
- Create a Vertex AI `Experiment` resource.
- Instantiate an experiment run.
- Log parameters for the run.
- Log metrics for the run.
- Display the logged experiment run.

### Recommendations

When doing E2E MLOps on Google Cloud, the following are some of the best practices for logging data when experimenting or formally training a model.

#### Python Logging

Use Python's logging package when doing ad-hoc training locally.

#### Cloud Logging

Use `Google Cloud Logging` when doing training on the cloud.

#### Experiments

Use Vertex AI Experiments in conjunction with logging when performing experiments to compare results for different experiment configurations.

### Costs
This tutorial uses billable components of Google Cloud:

- Vertex AI

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing) and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.

## Installations

Install the following packages for executing this notebook.

In [None]:
import os

# The Google Cloud Notebook product has specific requirements
IS_GOOGLE_CLOUD_NOTEBOOK = os.path.exists("/opt/deeplearning/metadata/env_version")

# Google Cloud Notebook requires dependencies to be installed with '--user'
USER_FLAG = ""
if IS_GOOGLE_CLOUD_NOTEBOOK:
    USER_FLAG = "--user"

! pip3 install --upgrade google-cloud-logging $USER_FLAG

### Restart the kernel

Once you've installed the additional packages, you need to restart the notebook kernel so it can find the packages.

In [None]:
import os

if not os.getenv("IS_TESTING"):
    # Automatically restart kernel after installs
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

1. [Enable the Vertex AI, Compute Engine, Cloud Storage and Cloud Logging APIs](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com,compute_component,storage_component,logging).

1. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

1. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` into these commands.


#### Set your project ID

**If you don't know your project ID**, you may be able to get your project ID using `gcloud`.

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

In [None]:
if PROJECT_ID == "" or PROJECT_ID is None or PROJECT_ID == "[your-project-id]":
    # Get your GCP project id from gcloud
    shell_output = ! gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID:", PROJECT_ID)

In [None]:
! gcloud config set project $PROJECT_ID

#### Region

You can also change the `REGION` variable, which is used for operations
throughout the rest of this notebook.  Below are regions supported for Vertex AI. We recommend that you choose the region closest to you.

- Americas: `us-central1`
- Europe: `europe-west4`
- Asia Pacific: `asia-east1`

You may not use a multi-regional bucket for training with Vertex AI. Not all regions provide support for all Vertex AI services.

Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "[your-region]"  # @param {type: "string"}

if REGION == "[your-region]":
    REGION = "us-central1"

#### Timestamp

If you are in a live tutorial session, you might be using a shared test account or project. To avoid name collisions between users on resources created, you create a timestamp for each instance session, and append the timestamp onto the name of resources you create in this tutorial.

In [None]:
from datetime import datetime

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")

### Authenticate your Google Cloud account

**If you are using Google Cloud Notebooks**, your environment is already authenticated. Skip this step.

**If you are using Colab**, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.

**Otherwise**, follow these steps:

In the Cloud Console, go to the [Create service account key](https://console.cloud.google.com/apis/credentials/serviceaccountkey) page.

1. Click **Create service account**.

2. In the **Service account name** field, enter a name, and click **Create**.

3. In the **Grant this service account access to project** section, click the Role drop-down list. Type "Vertex AI" into the filter box, and select **Vertex AI Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

4. Click **Create**. A JSON file that contains your key downloads to your local environment.

5. Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.

In [None]:
# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

import os
import sys

# If on Google Cloud Notebook, then don't execute this code
if not os.path.exists("/opt/deeplearning/metadata/env_version"):
    if "google.colab" in sys.modules:
        from google.colab import auth as google_auth

        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    elif not os.getenv("IS_TESTING"):
        %env GOOGLE_APPLICATION_CREDENTIALS ''

### Set up variables

Next, set up some variables used throughout the tutorial.

### Import libraries

In [None]:
import logging

import google.cloud.aiplatform as aiplatform

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project and region.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION)

## Python Logging

The Python logging package is widely used for logging within Python scripts. Commonly used features:

- Set logging levels.
- Send log output to console.
- Send log output to a file.

### Logging Levels in Python Logging

The logging levels, in order from least to most severe, are:

1. Debugging
2. Informational
3. Warnings
4. Errors

Setting a level enables that level and every more severe level. Python also defines a critical level above errors.

By default, the logging level of the root logger is set to warning.
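As a quick check of the ordering, the standard library exposes each level as an integer constant, and a logger only processes records at or above its level. The sketch below is illustrative only; the logger name `levels_demo` is an assumption and is not used elsewhere in this tutorial.

```python
import logging

# Each level name maps to an integer; a higher value means more severe.
levels = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}
for name, value in sorted(levels.items(), key=lambda item: item[1]):
    print(f"{name:<8} = {value}")

# A hypothetical named logger, set explicitly to WARNING so the result
# does not depend on how the root logger is configured.
demo_logger = logging.getLogger("levels_demo")
demo_logger.setLevel(logging.WARNING)

# isEnabledFor() reports whether a record at that level would be processed.
enabled = {name: demo_logger.isEnabledFor(value) for name, value in levels.items()}
print(enabled)
```

With the level set to warning, only the warning, error, and critical entries are enabled.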

### Logging output to console

By default, the Python logging package outputs to the console. Note that in the example, the informational and debug messages are not output, since the default logging level is warning.

In [None]:
def logging_examples():
    logging.info("Model training started...")
    logging.warning("Using older version of package ...")
    logging.error("Training was terminated ...")
    logging.debug("Hyperparameters were ...")


logging_examples()

### Setting logging level

To set the logging level, you get a logger using `getLogger()`. You can have multiple loggers. When `getLogger()` is called without any arguments, it returns the root logger. With the logger, you set the logging level with the method `setLevel()`.
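Named loggers form a hierarchy based on dots in their names, and a logger without its own level uses the level of its nearest configured ancestor. The following minimal sketch illustrates this; the names `demo_app` and `demo_app.training` are assumptions chosen for the example.

```python
import logging

# Hypothetical logger names used only for this illustration.
parent = logging.getLogger("demo_app")
child = logging.getLogger("demo_app.training")

# A new logger starts at NOTSET and defers to its nearest configured ancestor.
parent.setLevel(logging.ERROR)
inherited_level = child.getEffectiveLevel()
print(logging.getLevelName(inherited_level))

# Setting a level on the child overrides what it inherits.
child.setLevel(logging.DEBUG)
own_level = child.getEffectiveLevel()
print(logging.getLevelName(own_level))
```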

In [None]:
logging.getLogger().setLevel(logging.DEBUG)

logging_examples()

### Clearing handlers

At times, you may want to reconfigure your logging. A common practice in this case is to first remove all existing logging handlers for a fresh start.
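On Python 3.8 and later, `logging.basicConfig()` also accepts `force=True`, which removes and closes any handlers attached to the root logger before applying the new configuration. The sketch below shows that alternative under the assumption that your kernel runs Python 3.8 or newer; the format string is only an example.

```python
import logging
import sys

# First configuration: attaches one stream handler to the root logger.
logging.basicConfig(stream=sys.stderr, level=logging.WARNING, force=True)
handlers_before = list(logging.root.handlers)

# Reconfigure in one call. force=True (Python 3.8+) removes and closes the
# existing root handlers, so the new settings take effect.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.DEBUG,
    format="%(levelname)s:%(name)s:%(message)s",
    force=True,
)
handlers_after = list(logging.root.handlers)

print(len(handlers_after), logging.root.level == logging.DEBUG)
```

If you run this snippet in the notebook, clear the root handlers again before continuing, because `basicConfig()` without `force=True` does nothing when the root logger already has handlers.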

In [None]:
for handler in logging.root.handlers[:]:
    logging.root.removeHandler(handler)

### Output to a local file

You can preserve your logging output in a file that is local to where the Python script is running with the function `basicConfig()`, which takes the following parameters:

- `filename`: The file path to the local file to write the log output to.
- `level`: Sets the level of logging that is written to the logging file.

*Note:* You cannot use a Cloud Storage bucket as the output file.
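If you need more control than `basicConfig()` offers, a `logging.FileHandler` can be attached to a specific logger. The sketch below is one way to do this; the logger name `file_demo`, the temporary file location, and the format string are assumptions made for the example.

```python
import logging
import os
import tempfile

# Write to a temporary directory so the example does not touch mylog.log.
log_path = os.path.join(tempfile.mkdtemp(), "training_demo.log")

# Hypothetical logger name; propagate=False keeps records away from root handlers.
file_logger = logging.getLogger("file_demo")
file_logger.setLevel(logging.INFO)
file_logger.propagate = False

file_handler = logging.FileHandler(log_path)
file_handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s")
)
file_logger.addHandler(file_handler)

file_logger.info("Model training started...")
file_logger.debug("Hyperparameters were ...")  # Below INFO, so not written.

# Flush and release the file before reading it back.
file_handler.close()
file_logger.removeHandler(file_handler)

with open(log_path) as f:
    log_lines = f.read().splitlines()
print(log_lines)
```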

In [None]:
logging.basicConfig(filename="mylog.log", level=logging.DEBUG)

logging_examples()

! cat mylog.log

## Logging with Google Cloud Logging

You can store your logging output in, and retrieve it from, the `Google Cloud Logging` service. Commonly used features:

- Set logging levels.
- Send log output to storage.
- Retrieve log output from storage.

### Logging Levels in Cloud Logging

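The logging levels, in order from least to most severe, are listed below. Setting a level enables that level and every more severe level:

1. Debugging
2. Informational
3. Warnings
4. Errors

By default, a Python logger that has no level of its own inherits the level of the root logger, which is warning unless you have changed it.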

### Configuring and storing log data

To use the `Google Cloud Logging` service, you do the following steps:

1. Create a client to the service.
2. Obtain a handler for the service.
3. Create a logger instance and set the logging level.
4. Attach the handler to the logger instance.

Learn more about [Logging client libraries](https://cloud.google.com/logging/docs/reference/libraries).

In [None]:
import google.cloud.logging
from google.cloud.logging.handlers import CloudLoggingHandler

# Connect to the Cloud Logging service
cl_client = google.cloud.logging.Client(project=PROJECT_ID)
handler = CloudLoggingHandler(cl_client, name="mylog")

# Create a logger instance and set the logging level
cloud_logger = logging.getLogger("cloudLogger")
cloud_logger.setLevel(logging.INFO)

# Attach the Cloud Logging handler to the logger instance.
cloud_logger.addHandler(handler)

# Log something
cloud_logger.error("bad news")

### Logging output

Logging output at specific levels uses the same methods and method names as Python logging. The only difference is that you call them on your cloud logger instance instead of on the `logging` module. Because the cloud logger's level is set to informational, the debug message is not sent.
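To see which records reach a handler without calling the cloud service, you can attach a small in-memory handler to a logger. The sketch below uses a hypothetical `ListHandler` class as a stand-in for `CloudLoggingHandler`; it is not part of the Cloud Logging library, and the logger name is also an assumption.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical stand-in for CloudLoggingHandler that keeps records in memory."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.levelname, record.getMessage()))


demo_logger = logging.getLogger("cloud_logger_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
list_handler = ListHandler()
demo_logger.addHandler(list_handler)

demo_logger.info("Model training started...")
demo_logger.warning("Using older version of package ...")
demo_logger.error("Training was terminated ...")
demo_logger.debug("Hyperparameters were ...")  # Filtered out: DEBUG < INFO.

captured = list_handler.records
print(captured)
```

The logger's level is checked before any handler is called, so the same filtering applies whichever handler is attached.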

In [None]:
cloud_logger.info("Model training started...")
cloud_logger.warning("Using older version of package ...")
cloud_logger.error("Training was terminated ...")
cloud_logger.debug("Hyperparameters were ...")

### Get logging entries

To get the logged output, you:

1. Retrieve a handle to the named log from the client.
2. Using the handle, call the method `list_entries()`.
3. Iterate through the entries.
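A log can accumulate many entries, so it may help to narrow the listing with a filter written in the Logging query language. The sketch below only builds the filter string; `build_log_filter` is a hypothetical helper, and passing the result through the `filter_` argument of `list_entries()` is shown as a comment because it assumes the client created earlier and valid credentials.

```python
from datetime import datetime, timedelta, timezone


def build_log_filter(min_severity, since):
    """Return a Logging query language filter string (hypothetical helper)."""
    timestamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'severity>={min_severity} AND timestamp>="{timestamp}"'


one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)
log_filter = build_log_filter("WARNING", one_hour_ago)
print(log_filter)

# With the client from the earlier cell, the filter could be passed as:
#   for entry in cl_client.logger("mylog").list_entries(filter_=log_filter):
#       print(entry.severity, entry.payload)
```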

In [None]:
logger = cl_client.logger("mylog")

for entry in logger.list_entries():
    timestamp = entry.timestamp.isoformat()
    print("* {}: {}: {}".format(timestamp, entry.severity, entry.payload))

## Logging with Vertex AI Experiments and Vertex AI ML Metadata

You can log results related to training experiments with `Vertex AI Experiments` and `ML Metadata`. With these services, you can:

- Preserve results of an experiment.
- Track multiple runs, i.e., training runs, within an experiment.
- Track parameters (configuration) and metrics (results).
- Retrieve and display the logged output.

Learn more about [Experiments](https://cloud.google.com/vertex-ai/docs/experiments/).

### Create experiment for tracking training related metadata

Set up tracking for parameters (configuration) and metrics (results) in each experiment:

- `aiplatform.init()` - Create an experiment instance
- `aiplatform.start_run()` - Track a specific run within the experiment.

Learn more about [Introduction to Vertex AI ML Metadata](https://cloud.google.com/vertex-ai/docs/ml-metadata/introduction).

In [None]:
# Specify a name for the experiment
EXPERIMENT_NAME = "[your-experiment-name]"

if EXPERIMENT_NAME == "[your-experiment-name]":
    EXPERIMENT_NAME = "example-" + TIMESTAMP

In [None]:
# Create experiment
aiplatform.init(experiment=EXPERIMENT_NAME)
aiplatform.start_run("run-1")

### Log parameters for the experiment

Typically, an experiment is associated with a specific dataset and a model architecture. Within an experiment, you may have multiple training runs, where each run tries a different configuration. For example:

- Dataset split
- Dataset sampling and boosting
- Depth and width of layers
- Hyperparameters

These configuration settings are referred to as parameters, which you store as key-value pairs using the method `log_params()`.
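Training configurations are often nested, while parameters are stored as flat key-value pairs. The following sketch shows one way to flatten a nested configuration before logging it. The helper `flatten_params`, the underscore separator, and the configuration values are all assumptions made for illustration.

```python
def flatten_params(config, prefix=""):
    """Flatten a nested dict into underscore-separated keys (hypothetical helper)."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_params(value, name))
        else:
            flat[name] = value
    return flat


# Illustrative configuration; the keys and values are made up.
config = {
    "optimizer": {"name": "adam", "learning_rate": 0.01},
    "data": {"batch_size": 32, "validation_split": 0.2},
    "epochs": 100,
}
flat_params = flatten_params(config)
print(flat_params)

# In the notebook, the flattened dictionary could then be logged with:
#   aiplatform.log_params(flat_params)
```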

In [None]:
hyperparams = {}
hyperparams["epochs"] = 100
hyperparams["batch_size"] = 32
hyperparams["learning_rate"] = 0.01
aiplatform.log_params(hyperparams)

### Log metrics for the experiment

At the completion or termination of a run within an experiment, you can log results that you use to compare runs. For example:

- Evaluation metrics
- Hyperparameter search selection
- Time to train the model
- Early stop trigger

These results are referred to as metrics, which you store as key-value pairs using the method `log_metrics()`.
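In practice, the metric values come from an evaluation step rather than being typed in by hand. The sketch below computes an accuracy and a timing value and collects them in a dictionary. The labels and predictions are made up for illustration, and the key `eval_time_secs` is a hypothetical name.

```python
import time

# Made-up labels and predictions standing in for a real evaluation set.
labels = [1, 0, 1, 1, 0, 1, 0, 0]
predictions = [1, 0, 1, 0, 0, 1, 1, 0]

start = time.perf_counter()
correct = sum(1 for y, y_hat in zip(labels, predictions) if y == y_hat)
accuracy = correct / len(labels)
elapsed_secs = time.perf_counter() - start

# Collect the results as key-value pairs with numeric values.
eval_metrics = {
    "test_acc": round(accuracy * 100, 1),
    "eval_time_secs": round(elapsed_secs, 4),
}
print(eval_metrics)

# In the notebook, the dictionary could then be logged with:
#   aiplatform.log_metrics(eval_metrics)
```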

In [None]:
metrics = {}
metrics["test_acc"] = 98.7
metrics["train_acc"] = 99.3
aiplatform.log_metrics(metrics)

### Get the experiment results

Next, you call the method `get_experiment_df()` to get the results as a pandas DataFrame, and then use the experiment name to select the rows that belong to your experiment.

In [None]:
experiment_df = aiplatform.get_experiment_df()
experiment_df = experiment_df[experiment_df.experiment_name == EXPERIMENT_NAME]
experiment_df.T

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

### Delete the experiment

Next, delete the experiment. You will need to get the context via the metadata to delete it.

In [None]:
c = aiplatform.metadata._Context(EXPERIMENT_NAME)
c.delete()