# Train a few models and add them to a Vertex AI Experiment

In this exercise, we train a few models and compare their results using Vertex AI Experiments, to see how changing a model's hyperparameters affects its accuracy. The steps are:

1. Download the data and create a training script with some adjustable hyperparameters. In this exercise, we train the models in this notebook instead of on Vertex AI to save time and cost.
2. For each model you train, log the hyperparameters and metrics to Vertex AI Experiments.
3. Fetch the results of all the runs and compare them.

In [None]:
!pip3 install --upgrade --user --force-reinstall tensorflow==2.5.0

In [None]:
! pip3 install --upgrade google-cloud-aiplatform --user -q

### Restart the kernel

After you install the additional packages, you need to restart the notebook kernel so it can find the packages.

In [None]:
# Automatically restart kernel after installs
import os

if not os.getenv("IS_TESTING"):
    # Automatically restart kernel after installs
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

#### Set your project ID and region

In [None]:
shell_output = ! gcloud config list --format 'value(core.project)' 2>/dev/null
PROJECT_ID = shell_output[0]
print("Project ID:", PROJECT_ID)

REGION = "us-central1"

In [None]:
! gcloud config set project $PROJECT_ID

#### UUID

Some resources, such as the Cloud Storage bucket, need a globally unique name. An easy way to get one is to append a short random suffix, referred to here as a UUID.

In [None]:
import random
import string


# Generate a random lowercase alphanumeric suffix of a specified length (default: 8)
def generate_uuid(length: int = 8) -> str:
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))


UUID = generate_uuid()
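
The helper above builds a random alphanumeric string rather than a standards-based UUID. As a minimal sketch of an alternative, the standard library `uuid` module can supply the suffix instead. The function name `generate_uuid_hex` is hypothetical and not part of the original notebook.

```python
import uuid


def generate_uuid_hex(length: int = 8) -> str:
    """Return the first `length` hex characters of a random UUID4.

    Hypothetical helper for illustration; a UUID4 has 32 hex characters.
    """
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32")
    return uuid.uuid4().hex[:length]


suffix = generate_uuid_hex()
print(suffix)
```

Either approach works for this exercise; a shorter suffix is easier to read but has a higher chance of colliding with an existing name.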

### Create a Cloud Storage bucket

In [None]:
BUCKET_NAME = "[your-bucket-name]"  # @param {type:"string"}
BUCKET_URI = f"gs://{BUCKET_NAME}"

In [None]:
if BUCKET_NAME == "" or BUCKET_NAME is None or BUCKET_NAME == "[your-bucket-name]":
    BUCKET_NAME = PROJECT_ID + "-aip-" + UUID
    BUCKET_URI = f"gs://{BUCKET_NAME}"
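
Before creating the bucket, it can help to check that the generated name is acceptable. The sketch below covers only the basic rules for names without dots (3 to 63 characters; lowercase letters, digits, dashes, and underscores; starting and ending with a letter or digit). It does not check reserved prefixes or global uniqueness. The helper names `is_valid_bucket_name` and `make_bucket_name` are hypothetical.

```python
import re

# Basic rules only: 3-63 characters, lowercase letters, digits, dashes and
# underscores, starting and ending with a letter or digit.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic checks above."""
    return _BUCKET_NAME_RE.fullmatch(name) is not None


def make_bucket_name(project_id: str, suffix: str, infix: str = "aip") -> str:
    """Join the parts with dashes and validate the result (illustrative helper)."""
    name = f"{project_id}-{infix}-{suffix}".lower()
    if not is_valid_bucket_name(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    return name


print(make_bucket_name("my-project", "a1b2c3d4"))
```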

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI

Finally, validate access to your Cloud Storage bucket by examining its contents:

In [None]:
! gsutil ls -al $BUCKET_URI

### Import libraries

In [None]:
import google.cloud.aiplatform as aiplatform
import pandas as pd
from google.cloud import bigquery
from tensorflow.keras import Sequential, layers

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project and corresponding bucket.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

In [None]:
bq_client = bigquery.Client(project=PROJECT_ID)

### Download data for training

In [None]:
import numpy as np
import pandas as pd

LABEL_COLUMN = "species"

# Define the BigQuery source dataset
BQ_SOURCE = "bigquery-public-data.ml_datasets.iris"

# Define NA values
NA_VALUES = ["NA", "."]

# Download a table
table = bq_client.get_table(BQ_SOURCE)
df = bq_client.list_rows(table).to_dataframe()

# Drop unusable rows
df = df.replace(to_replace=NA_VALUES, value=np.nan).dropna()

# Convert categorical columns to numeric
df["species"], species_values = pd.factorize(df["species"])


# Split into a training and holdout dataset
df_train = df.sample(frac=0.8, random_state=100)
df_for_prediction = df[~df.index.isin(df_train.index)]

# Map numeric values to string values
index_to_species = dict(enumerate(species_values))

# View the mapping from numeric index to species name
print(index_to_species)
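
To make the label encoding concrete, here is a pure-Python sketch of what `pd.factorize` does with its default settings: each distinct value receives an integer code in order of first appearance. The function below is an illustration only, not the pandas implementation, and it ignores missing values (which pandas encodes as -1).

```python
def factorize(values):
    """Return (codes, uniques), numbering values in order of first appearance."""
    codes = []
    uniques = []
    index_of = {}
    for value in values:
        if value not in index_of:
            index_of[value] = len(uniques)
            uniques.append(value)
        codes.append(index_of[value])
    return codes, uniques


codes, uniques = factorize(["setosa", "virginica", "setosa", "versicolor"])
index_to_label = dict(enumerate(uniques))

print(codes)
print(index_to_label)
```

The reverse mapping is what lets you turn a predicted class index back into a species name later.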


In [None]:
df_train.to_csv('train_df.csv', index=False)
df_for_prediction.to_csv('test_df.csv', index=False)

### Create a new Vertex AI Experiment

Create a Vertex AI TensorBoard instance, then initialize the Vertex AI SDK with the experiment name and that TensorBoard instance. If the experiment does not exist yet, it is created during initialization.

In [None]:
EXPERIMENT_NAME = "vertex-ai-experiments"  # @param {type:"string"}

In [None]:
aiplatform_tb = aiplatform.Tensorboard.create()

In [None]:
aiplatform.init(experiment=EXPERIMENT_NAME, experiment_tensorboard=aiplatform_tb)

### Write the training script

Because we train locally instead of on Vertex AI (to save time and cost), the training script reads the data from the CSV files saved locally in the previous step.

In [None]:
import argparse
import numpy as np
import os

import pandas as pd
import tensorflow as tf


# Load the dataset splits from the local CSV files
df_train = pd.read_csv('train_df.csv')
df_test = pd.read_csv('test_df.csv')

def convert_dataframe_to_dataset(
    df_train: pd.DataFrame,
):
    df_train_x, df_train_y = df_train, df_train.pop(LABEL_COLUMN)

    y_train = np.asarray(df_train_y).astype("float32")

    # Convert to numpy representation
    x_train = np.asarray(df_train_x).astype("float32")

    # Convert to one-hot representation
    num_species = len(df_train_y.unique())
    y_train = tf.keras.utils.to_categorical(y_train, num_classes=num_species)

    dataset_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    return dataset_train

def create_model(num_units=100):
    # Create model
    Dense = tf.keras.layers.Dense
    model = tf.keras.Sequential(
        [
            Dense(
                num_units,
                activation=tf.nn.relu,
                kernel_initializer="uniform",
                input_dim=4,
            ),
            Dense(75, activation=tf.nn.relu),
            Dense(50, activation=tf.nn.relu),            
            Dense(25, activation=tf.nn.relu),
            Dense(3, activation=tf.nn.softmax),
        ]
    )
    
    # Compile Keras model
    optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)
    model.compile(
        loss="categorical_crossentropy", metrics=["accuracy"], optimizer=optimizer
    )
    
    return model


def train_model(model, dataset_train, epochs):
    # Train the model
    history = model.fit(dataset_train, epochs=epochs)
    
    return history
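
The training script converts the integer species labels to one-hot rows, the form that the `categorical_crossentropy` loss expects. As a rough illustration of that encoding, the pure-Python sketch below produces the same kind of rows as `tf.keras.utils.to_categorical`, using plain lists instead of NumPy arrays. The function name `one_hot` is hypothetical.

```python
def one_hot(labels, num_classes):
    """Return one row per label with a 1.0 at the label's index."""
    rows = []
    for label in labels:
        index = int(label)
        if not 0 <= index < num_classes:
            raise ValueError(f"label {label!r} is outside 0..{num_classes - 1}")
        row = [0.0] * num_classes
        row[index] = 1.0
        rows.append(row)
    return rows


encoded = one_hot([0, 2, 1], num_classes=3)
for row in encoded:
    print(row)
```

Because the script derives the number of classes from the labels present in the training split, every species needs to appear in that split for the rows to have the expected width.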


In [None]:
# Create datasets
dataset_train = convert_dataframe_to_dataset(df_train)

# Set up datasets
dataset_train = dataset_train.batch(10)
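
Batching determines how many optimizer steps each epoch takes. With the default behavior of keeping the final partial batch, the step count is the row count divided by the batch size, rounded up. The small sketch below shows the arithmetic; the row counts are examples, not measurements from this dataset.

```python
import math


def steps_per_epoch(num_rows: int, batch_size: int) -> int:
    """Number of batches per epoch when the final partial batch is kept."""
    if num_rows < 0 or batch_size <= 0:
        raise ValueError("num_rows must be >= 0 and batch_size must be > 0")
    return math.ceil(num_rows / batch_size)


print(steps_per_epoch(120, 10))
print(steps_per_epoch(125, 10))
```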

### Create training experiments and log parameters for each experiment

Each experiment run trains the model with a different set of parameters. For each run, we log the parameter values as well as the resulting accuracy, so we can track how changing the parameters affects accuracy.

In [None]:
# Define experiment parameters
parameters = [
    {"num_units": 100, "epochs": 3},
    {"num_units": 50,  "epochs": 10},
    {"num_units": 25, "epochs": 20},
]

# Run experiments
for i, params in enumerate(parameters):

    # Initialize Vertex AI Experiment run
    aiplatform.start_run(run=f"model-{i}")

    # Log training parameters
    aiplatform.log_params(params)

    # Create the model
    model = create_model(num_units=params['num_units'])


    # Train model
    history = train_model(
        model,
        dataset_train,
        epochs=params["epochs"],
    )

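    # Log additional parameters reported by Keras
    aiplatform.log_params(history.params)

    # Log the final training loss and accuracy so the runs can be compared
    aiplatform.log_metrics(
        {
            "loss": float(history.history["loss"][-1]),
            "accuracy": float(history.history["accuracy"][-1]),
        }
    )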

    aiplatform.end_run()
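
The parameter list in the cell above is written by hand. If you want to try every combination of a few candidate values, one possible approach is to generate the list with `itertools.product`. The helper `build_grid` is hypothetical, and the candidate values are examples only.

```python
import itertools


def build_grid(**options):
    """Return one dict per combination of the given candidate values."""
    keys = list(options)
    combos = itertools.product(*(options[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


grid = build_grid(num_units=[25, 50, 100], epochs=[3, 10])
for i, params in enumerate(grid):
    print(f"model-{i}", params)
```

The resulting list has the same shape as `parameters`, so it could be passed to the same loop. Keep in mind that the number of runs grows multiplicatively with each added hyperparameter.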

### Fetch the experiment results

In [None]:
experiment_df = aiplatform.get_experiment_df()
experiment_df.T
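
Once the results are in a DataFrame, picking the best run is a matter of comparing one metric column. The sketch below works on plain records, such as those returned by `experiment_df.to_dict("records")`. It assumes that each run logged a metric named `accuracy` and that the column is named `metric.accuracy`; check the column names in your own DataFrame. The records shown are made-up placeholders, not real results.

```python
import math


def best_run(records, metric_key="metric.accuracy"):
    """Return the record with the highest numeric value under `metric_key`."""
    scored = [
        record
        for record in records
        if isinstance(record.get(metric_key), (int, float))
        and not math.isnan(record[metric_key])
    ]
    if not scored:
        raise ValueError(f"no record has a numeric value for {metric_key!r}")
    return max(scored, key=lambda record: record[metric_key])


# Placeholder records for illustration only (not real experiment output)
records = [
    {"run_name": "model-0", "param.num_units": 100, "metric.accuracy": 0.61},
    {"run_name": "model-1", "param.num_units": 50, "metric.accuracy": 0.88},
    {"run_name": "model-2", "param.num_units": 25, "metric.accuracy": float("nan")},
]

best = best_run(records)
print(best["run_name"], best["metric.accuracy"])
```

Runs with a missing metric are skipped rather than treated as zero, so an incomplete run cannot be selected by accident.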

### Delete the resources you created to avoid ongoing costs

In [None]:
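# Delete the experiment and its backing TensorBoard runs
exp = aiplatform.Experiment(EXPERIMENT_NAME)
exp.delete(delete_backing_tensorboard_runs=True)

# Delete the TensorBoard instance created for this experiment
aiplatform_tb.delete()

# Delete the Cloud Storage bucket and its contents
! gsutil rm -rf {BUCKET_URI}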