# MNIST Hand-written Digits Classification Experiment

<hr>

## Demo for using SageMaker Experiment Management (Version Alpha)

This demo shows how you can use the SageMaker Experiment Management Python SDK to organize, track, compare, and evaluate your machine learning (ML) model training experiments.

You can track artifacts for hundreds or thousands of experiments, including data sets, algorithms, hyperparameters, and metrics. Experiments executed on SageMaker are tracked automatically. In addition, you can use the APIs to track experiments executed outside SageMaker, e.g. models trained locally in your notebooks. You can also track artifacts for additional steps in an ML workflow that come before or after model training, e.g. data pre-processing or post-training model evaluation.

The APIs also let you query and compare experiments to pick the best performing models for your business use case.

We will now demonstrate these capabilities through an MNIST handwritten digits classification example. The experiment is organized as follows:

1. Download and prepare the MNIST dataset.

2. Train a 2-layer multilayer perceptron (MLP) network.

3. Tune the number of hidden units in the network. Use the SageMaker Experiment Management Python SDK APIs to track parameters and results.

4. Train a Convolutional Neural Network (CNN) model on SageMaker to compare its performance against the MLP-based approach. Because this training run uses a SageMaker estimator, the experiment parameters are tracked automatically, with no additional instrumentation required.

5. Finally, use the analytics capabilities of the Python SDK to visualize and compare the performance of all the model versions generated in the previous steps.

Note that the ML frameworks used throughout the experiment are `PyTorch` and `Scikit-learn`. Please switch the notebook kernel to `conda_pytorch_p36` if you haven't done so.

## Setup
<hr>

In [None]:
# add boto service model
!aws configure add-model --service-model file://./source/build/model/sagemaker-2017-07-24.normal.json --service-name sagemaker
!cp ./source/build/model/sagemaker-2017-07-24.paginators.json ~/.aws/models/sagemaker/2017-07-24/paginators.json
!cp ./source/build/model/sagemaker-2017-07-24.waiters-2.json ~/.aws/models/sagemaker/2017-07-24/waiters-2.json

In [None]:
import time

import boto3
import numpy as np
import pandas as pd
import multiprocessing
from matplotlib import pyplot as plt
import seaborn as sns; sns.set()
%config InlineBackend.figure_format = 'retina'

from torchvision import datasets, transforms

import sagemaker
from sagemaker.session import Session
from sagemaker.experiments.experiment import Experiment
from sagemaker.experiments.trial import Trial
from sagemaker.experiments.trial_component import TrialComponent
from sagemaker.analytics import TrialAnalytics

In [None]:
sess = boto3.Session()
role = sagemaker.get_execution_role()

In [None]:
# create a sagemaker client
sm = sess.client('sagemaker', region_name=sess.region_name)

### Create a S3 bucket to hold data

<div class="alert alert-block alert-warning">You will need permission to create an S3 bucket.</div>

In [None]:
# create a s3 bucket to hold data
account_id = sess.client('sts').get_caller_identity()["Account"]
bucket = 'sagemaker-experiments-alpha-{}-{}'.format(account_id, sess.region_name)
prefix = 'mnist'

# list buckets to ensure no bucket with the same name gets created
s3_client = sess.client('s3')
response = s3_client.list_buckets()
buckets = [bucket['Name'] for bucket in response['Buckets']]

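if bucket in buckets:
    print("{} already exists.".format(bucket))
elif sess.region_name == 'us-east-1':
    # us-east-1 is the default location and rejects an explicit LocationConstraint
    s3_client.create_bucket(Bucket=bucket)
else:
    # every other region must be named in the bucket configuration
    s3_client.create_bucket(Bucket=bucket, CreateBucketConfiguration={'LocationConstraint': sess.region_name})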

### Download, Transform and upload Dataset to S3

In [None]:
# download the dataset into the ./mnist folder, then load it with the transforms below attached
train_set = datasets.MNIST('mnist', train=True, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]), 
    download=True)
                           
test_set = datasets.MNIST('mnist', train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]),
    download=False)
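
The two transforms above work together: `ToTensor` scales each pixel to the range [0, 1], and `Normalize` then subtracts the mean and divides by the standard deviation. The plain-Python sketch below reproduces that arithmetic for the darkest and brightest pixel. The `normalize` helper is a hypothetical name used only for illustration and is not part of torchvision.

```python
# Mean and standard deviation passed to transforms.Normalize in the cell above.
MEAN, STD = 0.1307, 0.3081


def normalize(pixel):
    """Standardize one pixel that has already been scaled to [0, 1]."""
    return (pixel - MEAN) / STD


black = normalize(0.0)  # darkest possible pixel
white = normalize(1.0)  # brightest possible pixel
print(f"black -> {black:.4f}, white -> {white:.4f}")
```

A pixel equal to the mean maps to zero, so the transformed images are roughly centered around zero.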

In [None]:
plt.imshow(train_set.data[2].numpy())

In [None]:
# upload the data to s3
inputs = sagemaker.Session().upload_data(path='mnist', bucket=bucket, key_prefix=prefix)
print('input spec: {}'.format(inputs))

## Step 1 - Set up the Experiment
Create an experiment to track all the model training iterations. Experiments are a great way to organize your data science work. You can create an experiment to organize all your model development work for: [1] a business use case you are addressing (e.g. an experiment named “customer churn prediction”), [2] a data science team that owns the experiment (e.g. an experiment named “marketing analytics experiment”), or [3] a specific data science and ML project. Think of it as a “folder” for organizing your “files”.
<hr>

In [None]:
# creating an experiment to track training jobs for creating a mnist classifier.
mnist_experiment = Experiment.create(
    experiment_name="mnist-digits-classification", 
    description="Classification of mnist hand-written digits")

## Step 2 - Track Experiment
Track each of the model training iterations as “trials” within the experiment.
<hr>

### Create trials for training multiple versions of an MLP model, each with a different number of hidden units. Track the experiment artifacts using the logging API offered by the Experiment Management Python SDK

In [None]:
def train_mlp(
    train_set,
    test_set,
    tracker,
    num_layer=2, 
    num_hidden=64, 
    batch_size=128, 
    lr=0.003, 
    optimizer='adam',
    max_iter=300,
    random_state=42,
    train_sample_size=3000,
):
    from sklearn.neural_network import MLPClassifier
    
    # log the dataset
    tracker.log_input("mlp_training_dataset", inputs)
    
    parameters = {
        "num_layer": num_layer,
        "num_hidden": num_hidden,
        "batch_size": batch_size,
        "lr": lr,
        "optimizer": optimizer,
        "max_iter": max_iter,
        "train_sample_size": train_sample_size,
        "random_state": random_state,
        
    }
    # log all the parameters
    tracker.log_parameters(parameters)
    
    mlp = MLPClassifier(
        hidden_layer_sizes=[num_hidden]*num_layer, 
        batch_size=batch_size,
        solver=optimizer,
        learning_rate_init=lr,
        random_state=random_state,
        max_iter=max_iter,
    )
    
    # for demo purpose we sample train set to reduce training time
    np.random.seed(1)
    sample_indices = np.random.choice(np.arange(train_set.data.shape[0]), size=train_sample_size, replace=False)
    X_train, y_train = train_set.data[sample_indices], train_set.targets[sample_indices]
    
    X_train = X_train.numpy().flatten().reshape(-1, 28*28)
    y_train = y_train.numpy()
    
    mlp.fit(X_train, y_train)
        
    train_acc = mlp.score(X_train, y_train)
    
    # flatten each 28x28 test image into a 784-element row
    X_test = test_set.data.numpy().reshape(-1, 28*28)
    y_test = test_set.targets.numpy()
    
    test_acc = mlp.score(X_test, y_test)
        
    return [num_hidden, train_acc*100, test_acc*100]

Here we tune the number of hidden units in the network and record the accuracy for each local MLP training job. The Alpha version of the SageMaker Experiment Management Python SDK does not support APIs for tracking training metrics. This capability will be added when Experiment Management becomes generally available.

In [None]:
hiddens = [4, 10, 24, 32, 64, 86, 128, 256]
mlp_training_results = np.zeros((len(hiddens), 3))

for i, hidden in enumerate(hiddens):
    print(f"Training: {hidden} hidden units ...")
    trial = mnist_experiment.create_trial(trial_name=f"local-mlp-training-job-{int(time.time())}")
    with trial.create_tracker(component_name="training") as mlp_training_tracker:
        accs = train_mlp(train_set, test_set, num_hidden=hidden, tracker=mlp_training_tracker)
        mlp_training_results[i] = accs
    print(f"Done: {hidden} hidden units.")
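
To make the tracking pattern in the loop above easier to follow, here is a minimal in-memory sketch of a tracker used as a context manager. It is not the SDK implementation: `LocalTracker`, `create_tracker`, `RECORDS`, and the example S3 URI are hypothetical names invented for illustration. Only the method names `log_input` and `log_parameters` mirror the calls made in `train_mlp`.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory store; the real SDK persists records in SageMaker.
RECORDS = []


class LocalTracker:
    """Illustrative stand-in that mimics the two logging calls used above."""

    def __init__(self, component_name):
        self.record = {"component": component_name, "inputs": {}, "parameters": {}}

    def log_input(self, name, value):
        self.record["inputs"][name] = value

    def log_parameters(self, parameters):
        self.record["parameters"].update(parameters)


@contextmanager
def create_tracker(component_name):
    """Yield a tracker and store its record when the with-block exits."""
    tracker = LocalTracker(component_name)
    tracker.record["start"] = time.time()
    try:
        yield tracker
    finally:
        tracker.record["end"] = time.time()
        RECORDS.append(tracker.record)


for hidden in [4, 10]:
    with create_tracker("training") as tracker:
        tracker.log_input("mlp_training_dataset", "s3://example-bucket/mnist")
        tracker.log_parameters({"num_hidden": hidden, "lr": 0.003})

print([r["parameters"]["num_hidden"] for r in RECORDS])
```

Because the record is stored in the `finally` clause, it is kept even if training raises an exception part-way through.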

### Now training a CNN using SageMaker estimator

There are a few ways to log and track a training job executed on SageMaker.

1. You can simply provide the experiment name, and a trial will be automatically created in this experiment with the same name as the training job.


2. If you don't provide the experiment name, you can still go back and move the SageMaker training job to any experiment of your choice.

In this example, we supply experiment name as "mnist-digits-classification" to automatically track this training run in the ongoing experiment.

Training on SageMaker takes a few minutes to complete ...

In [None]:
from sagemaker.pytorch import PyTorch

# start the training job exactly as you would any other SageMaker training job;
# it is tracked automatically, without any additional instrumentation.
estimator = PyTorch(entry_point='mnist.py',
                    role=role,
                    sagemaker_session=sagemaker.Session(sagemaker_client=sm),
                    framework_version='1.1.0',
                    train_instance_count=2,
                    train_instance_type='ml.c4.xlarge',
                    source_dir='./source',
                    hyperparameters={
                        'epochs': 6,
                        'backend': 'gloo',
                        'dropout': 0.3,
                        'experiment-name':mnist_experiment.experiment_name,
                    },
                    metric_definitions=[
                        {'Name':'train:loss', 'Regex':'Train Loss: (.*?);'},
                        {'Name':'test:loss', 'Regex':'Test Average loss: (.*?),'},
                        {'Name':'test:accuracy', 'Regex':'Test Accuracy: (.*?)%;'}
                    ])
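
Each entry in `metric_definitions` pairs a metric name with a regular expression that is matched against the training job's log output. The captured group becomes the metric value. The sketch below applies the same three patterns with Python's `re` module. The log lines are assumptions made up for illustration, since the actual format depends on what `mnist.py` prints, and `extract_metrics` is a hypothetical helper.

```python
import re

metric_definitions = [
    {'Name': 'train:loss', 'Regex': 'Train Loss: (.*?);'},
    {'Name': 'test:loss', 'Regex': 'Test Average loss: (.*?),'},
    {'Name': 'test:accuracy', 'Regex': 'Test Accuracy: (.*?)%;'},
]

# Hypothetical log lines; the real format depends on what mnist.py prints.
log_lines = [
    "Train Epoch: 1 Train Loss: 0.4321;",
    "Test Average loss: 0.1234, Test Accuracy: 94.20%;",
]


def extract_metrics(lines, definitions):
    """Return the last captured value for every metric found in the lines."""
    found = {}
    for line in lines:
        for definition in definitions:
            match = re.search(definition['Regex'], line)
            if match:
                found[definition['Name']] = float(match.group(1))
    return found


metrics = extract_metrics(log_lines, metric_definitions)
print(metrics)
```

Checking the patterns locally like this is a quick way to confirm that the delimiters in the regex (`;`, `,`, `%;`) match what the training script prints.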

In [None]:
cnn_training_job_name = "cnn-training-job-{}".format(int(time.time()))

In [None]:
estimator.fit({'training': inputs}, job_name=cnn_training_job_name)

## Compare the model training runs for an experiment
Now we will use the analytics capabilities of the Python SDK to query, visualize, and compare the training runs and identify the best model produced by our experiment.
<hr>

In [None]:
# Analytics API returns all the logged metadata in a Pandas data frame for quick and easy analysis in notebook. 
# You can choose to get all metadata or a subset of it.
trial_analytics = TrialAnalytics(
    sagemaker_session=Session(sess, sm), 
    experiment_name=mnist_experiment.experiment_name,
    metric_names=['test:accuracy'],
    parameter_names=['trial_name', 'num_hidden', 'batch_size', 
                     'lr', 'optimizer', 
                     'num_layer', 'random_state', 'dropout', 
                     'InstanceCount', 'InstanceType']
)

In [None]:
analytic_table = trial_analytics.dataframe()

As we pointed out before, the Alpha version of the SageMaker Python SDK doesn’t yet support logging metrics. To simulate this experience, we will now manually add the test accuracy for each MLP model version. Once the capability for logging metrics is available, you will be able to use it in the same way as you log other experiment artifacts such as data sets and parameters.

In [None]:
for row_idx in range(analytic_table.shape[0]):
    row = analytic_table.iloc[row_idx]
    if row['trial_name'].startswith('local-mlp'):
        hidden = row['num_hidden']
        result_index = np.where(mlp_training_results[:,0] == hidden)[0]
        analytic_table.at[row_idx, 'test:accuracy - last'] = mlp_training_results[result_index][0][-1]

In [None]:
analytic_table.sort_values(by='test:accuracy - last', ascending=False)
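
The same ranking idea can be sketched without pandas. The rows below are made-up values for illustration, not results from this notebook, and `rank_trials` is a hypothetical helper. Like `sort_values` with its default settings, it places rows that lack the metric at the end.

```python
# Illustrative rows only; these accuracies are made up, not notebook results.
rows = [
    {"trial_name": "local-mlp-a", "num_hidden": 64, "test:accuracy - last": 88.0},
    {"trial_name": "trial-without-metric", "num_hidden": None, "test:accuracy - last": None},
    {"trial_name": "local-mlp-b", "num_hidden": 256, "test:accuracy - last": 90.0},
]


def rank_trials(rows, metric):
    """Sort rows by metric, best first, leaving rows without the metric last."""
    scored = [r for r in rows if r[metric] is not None]
    missing = [r for r in rows if r[metric] is None]
    return sorted(scored, key=lambda r: r[metric], reverse=True) + missing


ranked = rank_trials(rows, "test:accuracy - last")
print([r["trial_name"] for r in ranked])
```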

As we can see, for the MLP, `num_hidden=256` gives the best 2-layer result, about `90%` accuracy on the test set. The CNN model reaches about `94%` test accuracy.

In [None]:
# we plot the hidden units against train/test performances for 2-layer mlp.
plt.title("2-layer MLP performance")
plt.plot(mlp_training_results[:,0], mlp_training_results[:,1], label="train")
plt.plot(mlp_training_results[:,0], mlp_training_results[:,2], label="test")
plt.ylabel("accuracy")
plt.xlabel("hidden units")
plt.legend()
plt.show()

## Cleanup
<hr>

### Clean up experiment entities created

In [None]:
def cleanup(experiment):
    '''Clean up everything in the given experiment object'''
    for trial_summary in experiment.list_trials():
        trial = Trial.load(trial_name=trial_summary.trial_name)
        for trial_step in trial.list_trial_components():
            trial_step.delete()
            # to prevent throttling
            time.sleep(1)
        trial.delete()
    
    experiment.delete()

In [None]:
cleanup(mnist_experiment)
del mnist_experiment
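
The fixed `time.sleep(1)` in `cleanup` is a simple guard against throttling. A common alternative, shown here as a sketch rather than as something the SDK provides, is to retry a throttled call with exponential backoff. `ThrottlingError`, `call_with_backoff`, and `flaky_delete` are hypothetical names; a real client raises its own exception type, which you would catch instead.

```python
import time


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling error raised by a service client."""


def call_with_backoff(func, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call func, doubling the wait after each throttled attempt."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)


# Simulated delete that is throttled twice before it succeeds.
attempts = []


def flaky_delete():
    attempts.append(1)
    if len(attempts) < 3:
        raise ThrottlingError("rate exceeded")
    return "deleted"


delays = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky_delete, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper easy to test, because the waits can be recorded instead of performed.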

### Delete the s3 bucket created 
<div class="alert alert-block alert-warning">You can delete the bucket if you are able to attach the S3 delete-bucket permission to the current SageMaker execution role shown below.</div>

In [None]:
role

In [None]:
# s3_client.delete_bucket(Bucket=bucket)