## CIFAR-10 Classification Experiment

This demo shows how to use the SageMaker Experiments Python SDK to organize, track, compare, and evaluate your machine learning (ML) model training experiments.

You can track artifacts for experiments, including datasets, algorithms, hyperparameters, and metrics. Experiments executed on SageMaker, such as SageMaker Autopilot jobs and training jobs, are tracked automatically. You can also track artifacts for additional steps in an ML workflow that come before or after model training, such as data pre-processing or post-training model evaluation.

The APIs also let you search and browse your current and past experiments, compare experiments, and identify the best-performing models.

Now we will demonstrate these capabilities through a `CIFAR-10` image classification example. The experiment will be organized as follows:

1. Download and prepare the CIFAR-10 dataset.
2. Train a ResNet-50 Convolutional Neural Network (CNN) model. Tune the hyperparameters that configure the number of epochs and the optimizer. Track the parameter configurations and the resulting model accuracy using the SageMaker Experiments Python SDK.
3. Finally, use the search and analytics capabilities of the Python SDK to search, compare, evaluate, and visualize the performance of all model versions generated from model tuning in Step 2.

Make sure you have selected the `Python 3 (TensorFlow 2.3 Python 3.7 CPU Optimized)` kernel.

### Install Python Packages

In [None]:
import sys
!{sys.executable} -m pip install sagemaker-experiments==0.1.31 matplotlib

### Setup

In [None]:
import os
import time
import boto3
import itertools
import numpy as np
from sagemaker.tensorflow import TensorFlow
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
import sagemaker
from sagemaker import get_execution_role

In [None]:
sess = boto3.Session()
sm = sess.client("sagemaker")
role = get_execution_role()
sagemaker_session = sagemaker.Session(boto_session=sess)

### Create an S3 bucket to hold data

In [None]:
# Name the S3 bucket that will hold the data; note that your account might already have created a bucket with the same name
account_id = sess.client("sts").get_caller_identity()["Account"]
bucket = "sagemaker-experiments-{}-{}".format(sess.region_name, account_id)
prefix = "tf2-cifar10-experiment"

print(bucket)
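
The cell above only derives a bucket name, while the upload step later expects the bucket to exist. Below is a minimal sketch of creating it, assuming the boto3 S3 client's `create_bucket` call. The helper names `create_bucket_kwargs` and `ensure_bucket` are hypothetical and not part of the original notebook. The one subtlety it shows is that `us-east-1` is requested without a `LocationConstraint`, while other regions need one.

```python
def create_bucket_kwargs(bucket_name, region_name):
    """Build keyword arguments for the boto3 S3 client's create_bucket call.

    us-east-1 is the default location and is requested without a
    CreateBucketConfiguration; other regions need a LocationConstraint.
    """
    kwargs = {"Bucket": bucket_name}
    if region_name != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region_name}
    return kwargs


def ensure_bucket(s3_client, bucket_name, region_name):
    """Try to create the bucket; print (rather than raise) if the call fails."""
    try:
        s3_client.create_bucket(**create_bucket_kwargs(bucket_name, region_name))
    except Exception as err:  # for example, the bucket already exists
        print(f"create_bucket did not create {bucket_name}: {err}")
```

In the notebook this could be called as `ensure_bucket(sess.client("s3"), bucket, sess.region_name)` before uploading any data.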

### Download cifar10 dataset and upload to Amazon S3

In [None]:
(X_train, y_train), (X_valid, y_valid) = cifar10.load_data()

# Create the local output directories first; open() fails if they do not exist
os.makedirs("./data/training", exist_ok=True)
os.makedirs("./data/validation", exist_ok=True)

with open("./data/training/train_data.npy", "wb") as f:
    np.save(f, X_train)

with open("./data/training/train_labels.npy", "wb") as f:
    np.save(f, y_train)

with open("./data/validation/validation_data.npy", "wb") as f:
    np.save(f, X_valid)

with open("./data/validation/validation_labels.npy", "wb") as f:
    np.save(f, y_valid)

In [None]:
# Reuse the SageMaker session created in the setup cell
s3_inputs_train = sagemaker_session.upload_data(
    path="data/training", bucket=bucket, key_prefix=prefix + "/training"
)
s3_inputs_validation = sagemaker_session.upload_data(
    path="data/validation", bucket=bucket, key_prefix=prefix + "/validation"
)
inputs = {"training": s3_inputs_train, "validation": s3_inputs_validation}
print(inputs)
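
The training script `cifar10_tf2.py` is not shown in this notebook. As a rough sketch of how an entry point could read the `.npy` files uploaded above, the function below loads one split and prepares it for training. It assumes SageMaker makes each channel available in a local directory (conventionally exposed through environment variables such as `SM_CHANNEL_TRAINING`). The function `load_split`, the pixel scaling, and the one-hot encoding are illustrative assumptions, not the script's actual contents.

```python
import os

import numpy as np


def load_split(channel_dir, data_file, labels_file, num_classes=10):
    """Load one split saved with np.save and prepare it for training."""
    images = np.load(os.path.join(channel_dir, data_file))
    labels = np.load(os.path.join(channel_dir, labels_file))
    # Scale uint8 pixel values to floats in [0, 1].
    images = images.astype("float32") / 255.0
    # cifar10.load_data() returns labels with shape (N, 1);
    # flatten them, then one-hot encode by indexing an identity matrix.
    one_hot = np.eye(num_classes, dtype="float32")[labels.reshape(-1)]
    return images, one_hot
```

Inside a training container, a call might look like `load_split(os.environ["SM_CHANNEL_TRAINING"], "train_data.npy", "train_labels.npy")`.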

### Step 1 - Set up the Experiment

Create an experiment to track all the model training iterations. Experiments are a great way to organize your data science work. You can create experiments to organize all your model development work for: [1] a business use case you are addressing (e.g. create an experiment named “customer churn prediction”), or [2] a data science team that owns the experiment (e.g. create an experiment named “marketing analytics experiment”), or [3] a specific data science and ML project. Think of it as a “folder” for organizing your “files”.

### Create an Experiment

In [None]:
from smexperiments.experiment import Experiment
from smexperiments.trial import Trial
from smexperiments.trial_component import TrialComponent
from smexperiments.tracker import Tracker

In [None]:
cifar10_experiment = Experiment.create(
    experiment_name=f"tf2-cifar10-classification-{int(time.time())}",
    description="Classification of CIFAR-10 dataset",
    sagemaker_boto_client=sm,
)
print(cifar10_experiment)

### Step 2 - Track Experiment

Now create a Trial for each training run to track its inputs, parameters, and metrics.

While training the ResNet-50 CNN model on SageMaker, we will experiment with several combinations of optimizer and number of epochs. We will create a Trial to track each training job run. Each training job is attached to its Trial as a trial component through the experiment configuration passed to the estimator, so the job's hyperparameters and metrics are recorded on the Trial.

In [None]:
hyperparam_options = {"optimizer": ["adam", "sgd", "rmsprop"], "epochs": [5, 10]}

hypnames, hypvalues = zip(*hyperparam_options.items())
trial_hyperparameter_set = [dict(zip(hypnames, h)) for h in itertools.product(*hypvalues)]
trial_hyperparameter_set
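
To make the grid expansion above concrete, here is a small standalone sketch of the same `zip`/`itertools.product` technique. The helper name `expand_grid` is hypothetical. With three optimizers and two epoch settings it yields six combinations, so the training loop that follows launches six jobs.

```python
import itertools


def expand_grid(options):
    """Return one dict per combination of the listed hyperparameter values."""
    names, values = zip(*options.items())
    return [dict(zip(names, combo)) for combo in itertools.product(*values)]


grid = expand_grid({"optimizer": ["adam", "sgd", "rmsprop"], "epochs": [5, 10]})
for combo in grid:
    # The same join is used later to build the job-name suffix, e.g. "adam-5".
    suffix = "-".join(str(value) for value in combo.values())
    print(combo, suffix)
```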

The loop below runs the training jobs sequentially, because `fit` waits for each job to finish before returning. If you want to run the training jobs asynchronously instead, you may need to increase your resource limit.

<b>Note that the execution of the following code takes around an hour.</b>

In [None]:
from sagemaker.tensorflow import TensorFlow

run_number = 1
for trial_hyp in trial_hyperparameter_set:
    # Use the trial-specific hyperparameters as-is (this example defines no static ones)
    hyperparams = trial_hyp

    # Create unique job name with hyperparameter and time
    time_append = int(time.time())
    hyp_append = "-".join([str(elm) for elm in trial_hyp.values()])
    training_job_name = f"tf2-cifar10-training-{hyp_append}-{time_append}"
    trial_name = f"trial-tf2-cifar10-training-{hyp_append}-{time_append}"
    trial_desc = f"my-tensorflow2-cifar10-run-{run_number}"

    # Create a new Trial for this training run
    tf2_cifar10_trial = Trial.create(
        trial_name=trial_name,
        experiment_name=cifar10_experiment.experiment_name,
        sagemaker_boto_client=sm,
        tags=[{"Key": "trial-desc", "Value": trial_desc}],
    )

    # Create an experiment config that associates training job to the Trial
    experiment_config = {
        "ExperimentName": cifar10_experiment.experiment_name,
        "TrialName": tf2_cifar10_trial.trial_name,
        "TrialComponentDisplayName": training_job_name,
    }

    metric_definitions = [
        {"Name": "loss", "Regex": "loss: ([0-9\\.]+)"},
        {"Name": "accuracy", "Regex": "accuracy: ([0-9\\.]+)"},
        {"Name": "val_loss", "Regex": "val_loss: ([0-9\\.]+)"},
        {"Name": "val_accuracy", "Regex": "val_accuracy: ([0-9\\.]+)"},
    ]

    # Create a TensorFlow Estimator with the Trial specific hyperparameters
    cifar10_estimator = TensorFlow(
        entry_point="cifar10_tf2.py",
        source_dir="source_dir",
        role=role,
        instance_count=1,
        instance_type="ml.p3.2xlarge",
        framework_version="2.4.1",
        hyperparameters=hyperparams,
        py_version="py37",
        metric_definitions=metric_definitions,
        enable_sagemaker_metrics=True,
        tags=[{"Key": "trial-desc", "Value": trial_desc}],
    )

    # Launch a training job
    cifar10_estimator.fit(inputs, job_name=training_job_name, experiment_config=experiment_config)

    # give it a while before dispatching the next training job
    time.sleep(2)
    run_number = run_number + 1
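
The `metric_definitions` used in the loop above are regular expressions applied to the training log. The local check below shows what each pattern captures, assuming a Keras-style progress line. The `sample_log_line` is made up for illustration and is not real job output. It also shows that the `loss` and `accuracy` patterns match inside `val_loss` and `val_accuracy`.

```python
import re

metric_definitions = [
    {"Name": "loss", "Regex": "loss: ([0-9\\.]+)"},
    {"Name": "accuracy", "Regex": "accuracy: ([0-9\\.]+)"},
    {"Name": "val_loss", "Regex": "val_loss: ([0-9\\.]+)"},
    {"Name": "val_accuracy", "Regex": "val_accuracy: ([0-9\\.]+)"},
]

# Hypothetical log line in the style Keras prints at the end of an epoch.
sample_log_line = "loss: 1.2345 - accuracy: 0.5678 - val_loss: 1.1111 - val_accuracy: 0.6000"

# Collect every capture of each pattern on the sample line.
captures = {
    definition["Name"]: re.findall(definition["Regex"], sample_log_line)
    for definition in metric_definitions
}
for name, values in captures.items():
    print(name, values)

# A word boundary keeps "loss" from matching inside "val_loss"
# (the underscore is a word character, so there is no boundary before "loss").
strict_loss = re.findall("\\bloss: ([0-9\\.]+)", sample_log_line)
print("strict loss", strict_loss)
```

The word-boundary variant is only a local Python demonstration. Whether SageMaker's metric parsing accepts the same pattern, and how it handles several matches on one line, should be verified before relying on it.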

### Compare the model training runs for an experiment

Now we will use the analytics capabilities of the Python SDK to query and compare the training runs and identify the best model produced by our experiment. You can retrieve trial components by using a search expression.

In [None]:
from sagemaker.analytics import ExperimentAnalytics

experiment_name = cifar10_experiment.experiment_name

trial_component_analytics = ExperimentAnalytics(
    sagemaker_session=sagemaker_session, experiment_name=experiment_name
)
trial_comp_ds_jobs = trial_component_analytics.dataframe()
trial_comp_ds_jobs
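
The cell above retrieves every trial component in the experiment without a filter. As a sketch of the search-expression option mentioned earlier, the helper below builds a filter on the trial component display name and passes it to `ExperimentAnalytics`, together with sorting and column selection. The function names are hypothetical, and the filter value assumes the `tf2-cifar10-training` prefix used for `TrialComponentDisplayName` in the training loop.

```python
def build_display_name_filter(name_fragment):
    """Search expression matching trial components whose display name contains the fragment."""
    return {
        "Filters": [
            {"Name": "DisplayName", "Operator": "Contains", "Value": name_fragment}
        ]
    }


def training_job_analytics(sagemaker_session, experiment_name):
    """Build an ExperimentAnalytics query restricted to the training jobs."""
    # Imported here so the filter helper can be used without the SageMaker SDK installed.
    from sagemaker.analytics import ExperimentAnalytics

    return ExperimentAnalytics(
        sagemaker_session=sagemaker_session,
        experiment_name=experiment_name,
        search_expression=build_display_name_filter("tf2-cifar10-training"),
        sort_by="metrics.accuracy.max",
        sort_order="Descending",
        metric_names=["accuracy", "val_accuracy"],
        parameter_names=["epochs", "optimizer"],
    )
```

Calling `training_job_analytics(sagemaker_session, experiment_name).dataframe()` would then return only the matching components, already sorted by maximum accuracy.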

Let's show the accuracy, number of epochs, and optimizer for each run, sorted by accuracy in descending order.

In [None]:
trial_comp_ds_jobs = trial_comp_ds_jobs.sort_values("accuracy - Last", ascending=False)
trial_comp_ds_jobs["epochs"] = trial_comp_ds_jobs["epochs"].astype("Int64").astype("str")
trial_comp_ds_jobs[["TrialComponentName", "accuracy - Last", "epochs", "optimizer"]]

### Visualize experiment

Now we plot the accuracy of each epochs/optimizer combination, in descending order of accuracy.

In [None]:
import matplotlib.pyplot as plt

trial_comp_ds_jobs["col_names"] = (
    trial_comp_ds_jobs["epochs"] + "-" + trial_comp_ds_jobs["optimizer"]
)

fig = plt.figure()
fig.set_size_inches([15, 10])
trial_comp_ds_jobs.plot.bar("col_names", "accuracy - Last", ax=plt.gca())