# Getting Started with Amazon SageMaker Automatic Model Tuning

## Background

Machine learning model training is controlled by a set of values referred to as hyperparameters. In contrast to model parameters that are learned during training, such as node weights or biases, hyperparameters are set before training starts. With experience and expertise, one can tune them manually to obtain better model performance. Alternatively, this task can be automated with hyperparameter optimization, which searches a range of values for each hyperparameter. In this notebook we demonstrate how to achieve this with the Amazon SageMaker Automatic Model Tuning feature.

Amazon SageMaker Automatic Model Tuning (AMT) reduces the undifferentiated heavy lifting of exploring the hyperparameter space: it launches training jobs with different combinations of hyperparameter values and returns the best-performing set as a result.
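
The idea of trying several hyperparameter combinations and keeping the best one can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `toy_validation_score` is a hypothetical stand-in for the validation metric of a real training job, and the search below is simple random search, not the Bayesian strategy that the tuning job in this notebook uses.

```python
import random


def toy_validation_score(eta, max_depth):
    """Hypothetical stand-in for the validation metric of one training job."""
    # Peaks at eta=0.3 and max_depth=6; a real score comes from training a model.
    return 1.0 - (eta - 0.3) ** 2 - 0.01 * (max_depth - 6) ** 2


def random_search(n_trials, seed=0):
    """Evaluate n_trials random hyperparameter sets and return the best one."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        params = {"eta": rng.uniform(0.0, 1.0), "max_depth": rng.randint(1, 20)}
        score = toy_validation_score(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score


best_params, best_score = random_search(n_trials=50)
print(best_params, best_score)
```

Each loop iteration corresponds to one training job; AMT additionally runs such jobs in parallel and tracks their metrics for you.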

This tutorial will walk you through the Amazon SageMaker Automatic Model Tuning (AMT) feature, using a built-in XGBoost algorithm provided by the Amazon SageMaker library. Additional information can be found in the documentation pages below:
* For more information on running a simple hyperparameter tuning job: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-ex.html
* For documentation on using the HyperParameterTuner API with the SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/api/training/tuner.html

## Overview

This notebook is split into the following sections:
* Setup and Imports
* Load and Prepare dataset
* Train a SageMaker Built-In XGBoost Algorithm
* Train and Tune a SageMaker Built-In XGBoost Algorithm
* View the AMT job statistics 
* Visualize AMT job results and tuned Hyperparameters


## Setup and Imports

In [None]:
%load_ext autoreload
%autoreload 2

For this notebook, we recommend version 2.88.1 or higher of the SageMaker Python SDK. If your version is lower and you encounter issues, uncomment the code below to upgrade your pip and sagemaker versions. Make sure to restart your kernel after upgrading.

In [None]:
import sagemaker

sagemaker.__version__ 
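
When comparing the reported version with the recommended minimum, it helps to compare the numeric components rather than the raw strings. The helper below is a minimal sketch; `parse_version` and `meets_minimum` are hypothetical names and not part of the SageMaker SDK.

```python
def parse_version(version):
    """Turn '2.114.0' into (2, 114, 0), stopping at any non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev0' or 'post1'
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum="2.88.1"):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


print(meets_minimum("2.114.0"))  # True: 114 > 88 when compared numerically
print(meets_minimum("2.9.0"))    # False, although "2.9.0" > "2.88.1" as strings
```

In the notebook, `meets_minimum(sagemaker.__version__)` would perform the same check against the installed SDK.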

In [None]:
import sys

#!{sys.executable} -m pip install --upgrade pip --quiet  # upgrade pip to the latest version
#!{sys.executable} -m pip install --upgrade sagemaker==2.114.0 --quiet  # upgrade SageMaker to the recommended version

In [None]:
import io
import os
import argparse
import traceback
import boto3
import numpy as np
import pandas as pd

from pathlib import Path

In [None]:
# SDK setup
role = sagemaker.get_execution_role()
region = boto3.Session().region_name
sm = boto3.client('sagemaker')
boto_sess = boto3.Session(region_name=region)
sm_sess = sagemaker.session.Session(boto_session=boto_sess, sagemaker_client=sm)

In [None]:
BUCKET = sm_sess.default_bucket()
PREFIX = 'amt-visualize-demo/data'
s3_data_url = f's3://{BUCKET}/{PREFIX}'

# eventual output destination for our XGBoost model
output_path = f's3://{BUCKET}/{PREFIX}/output'
output_path

## Load and Prepare dataset 

In [None]:
!mkdir -p data

The dataset used in this notebook is the scikit-learn copy of the test set of the UCI ML hand-written digits dataset: https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits. Each data point is an 8x8 image of a digit, flattened into 64 feature columns.

In [None]:
from sklearn import datasets

digits         = datasets.load_digits()
digits_df      = pd.DataFrame(digits.data)
digits_df['y'] = digits.target
digits_df.insert(0, 'y', digits_df.pop('y')) # XGBoost expects the target to be the first column 
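
For CSV input, the built-in XGBoost algorithm expects the label in the first column and no header row. The sketch below shows that layout with the standard library only; `to_label_first_csv` is a hypothetical helper and the two samples are made up for illustration.

```python
import csv
import io


def to_label_first_csv(features, labels):
    """Serialize rows as 'label,feature_1,...,feature_n' with no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row, label in zip(features, labels):
        writer.writerow([label, *row])
    return buffer.getvalue()


# Two made-up samples with three features each (the digits data has 64).
csv_text = to_label_first_csv([[0.0, 5.0, 13.0], [0.0, 0.0, 12.0]], [0, 1])
print(csv_text)
```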

In [None]:
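# shuffle the rows, then split into train (70%) and validation (30%) sets
shuffled_df = digits_df.sample(frac=1, random_state=42)  # fixed seed for reproducibility
split_idx = int(0.7 * len(shuffled_df))
train_data, valid_data = shuffled_df.iloc[:split_idx], shuffled_df.iloc[split_idx:]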

In [None]:
train_data.to_csv("data/train.csv", index=False, header=False)
valid_data.to_csv("data/valid.csv", index=False, header=False)

We upload train and validation datasets into [Amazon S3](https://aws.amazon.com/s3/). Amazon SageMaker will interact with the data directly from S3.

In [None]:
boto_sess.resource("s3").Bucket(BUCKET).Object(os.path.join(PREFIX, "train/train.csv")
                                                 ).upload_file("data/train.csv")
boto_sess.resource("s3").Bucket(BUCKET).Object(os.path.join(PREFIX, "valid/valid.csv")
                                                 ).upload_file("data/valid.csv")
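
S3 object keys always use forward slashes, whereas `os.path.join` uses the separator of the local operating system. On a Linux-based notebook instance the two coincide, but a portable variant could build keys with `posixpath`. The helpers `s3_key` and `s3_uri` and the bucket name `example-bucket` are hypothetical.

```python
import posixpath


def s3_key(prefix, *parts):
    """Join key segments with '/', independent of the local operating system."""
    return posixpath.join(prefix, *parts)


def s3_uri(bucket, key):
    """Build an s3:// URI from a bucket name and an object key."""
    return f"s3://{bucket}/{key}"


key = s3_key("amt-visualize-demo/data", "train", "train.csv")
print(s3_uri("example-bucket", key))
```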

We will use the built-in XGBoost algorithm, which is distributed as a container image identified by an image URI, as described in the documentation:
https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html

## Train an Amazon SageMaker Built-In XGBoost Algorithm

In [None]:
%%time
from sagemaker import image_uris
from sagemaker.session import Session
from sagemaker.inputs import TrainingInput

hyperparameters = {
        "num_class": "10",
        "max_depth":"5",
        "eta":"0.2",
        "gamma":"1",
        "min_child_weight":"6",
        "subsample":"0.7",
        "objective":"multi:softmax",
        "eval_metric":"accuracy",
        "num_round":"50"}

# look up the image URI of the built-in XGBoost container for this region and version
xgboost_container = sagemaker.image_uris.retrieve("xgboost", region, "1.5-1")
print(xgboost_container)

# construct a SageMaker estimator that calls the XGBoost container
estimator = sagemaker.estimator.Estimator(image_uri=xgboost_container, 
                                          hyperparameters=hyperparameters,
                                          role=role,
                                          instance_count=1, 
                                          instance_type='ml.m5.large', 
                                          volume_size=5, # 5 GB 
                                          output_path=output_path)

# define the data type and paths to the training and validation datasets
s3_input_train = TrainingInput(
    s3_data=f's3://{BUCKET}/{PREFIX}/train', content_type="csv")
s3_input_valid = TrainingInput(
    s3_data=f's3://{BUCKET}/{PREFIX}/valid', content_type="csv")

# execute the XGBoost training job
estimator.fit({'train': s3_input_train, 'validation': s3_input_valid})

## Train and Tune an Amazon SageMaker Built-In XGBoost Algorithm

Amazon SageMaker AMT now orchestrates the individual training jobs (trials). We submit the tuning job without blocking and then call `tuner.wait()` to pause notebook execution until the AMT job is completed. Depending on the number of jobs and the level of parallelization, this may take some time; the example below may take up to 30 minutes for 50 jobs. During this time you can view the status of your jobs in the console by navigating to Amazon SageMaker > Training > Hyperparameter tuning jobs.

For more information on AMT job monitoring, see: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-monitor.html
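
The relationship between the number of jobs, the parallelism, and the total duration can be approximated with simple arithmetic. The sketch below assumes that jobs run in strict waves of `max_parallel_jobs` and that every job takes the same time; the per-job duration of 1.2 minutes is an assumption chosen to match the 30-minute figure above, not a measured value.

```python
import math


def estimate_wall_clock_minutes(max_jobs, max_parallel_jobs, minutes_per_job):
    """Rough estimate assuming jobs run in waves of max_parallel_jobs."""
    waves = math.ceil(max_jobs / max_parallel_jobs)
    return waves, waves * minutes_per_job


# 50 jobs with 2 running in parallel means 25 sequential waves.
waves, minutes = estimate_wall_clock_minutes(50, 2, minutes_per_job=1.2)
print(waves, minutes)
```

Raising the parallelism shortens the wall-clock time, but with a Bayesian strategy fewer completed results are available to inform each new batch of jobs, so there is a trade-off between speed and search quality.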

In [None]:
from sagemaker.tuner import IntegerParameter, CategoricalParameter
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

n_jobs = 50
n_parallel_jobs = 2

# redundant declaration - included for visibility 
hyperparameters = {
        "num_class": "10",
        "max_depth":"5",
        "eta":"0.2",
        "gamma":"1",
        "min_child_weight":"6",
        "subsample":"0.7",
        "objective":"multi:softmax",
        "eval_metric":"accuracy",
        "num_round":"50"}

hpt_ranges = {'eta': ContinuousParameter(0, 1),    # learning rate is real-valued in [0, 1]
              'alpha': ContinuousParameter(0, 2),  # L1 regularization term is real-valued
              'min_child_weight': IntegerParameter(1, 10),
              'max_depth': IntegerParameter(1, 20)
             }

tuner_parameters = {'estimator': estimator,
                    'base_tuning_job_name': 'bayesian',                   
                    'objective_metric_name': 'validation:accuracy',
                    'objective_type': 'Maximize',
                    'hyperparameter_ranges': hpt_ranges,
                    'strategy': 'Bayesian',
                    'max_jobs': n_jobs,
                    'max_parallel_jobs': n_parallel_jobs}
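
The choice of range type determines which values can be proposed for a hyperparameter. The sketch below illustrates the difference with uniform sampling from the standard library; it is not how AMT selects values, and `sample_integer` and `sample_continuous` are hypothetical helpers. For a real-valued hyperparameter such as the learning rate `eta`, an integer range over 0 to 1 could only ever yield 0 or 1.

```python
import random


def sample_integer(rng, low, high):
    """Integer range: only whole numbers in [low, high] can be proposed."""
    return rng.randint(low, high)


def sample_continuous(rng, low, high):
    """Continuous range: any real value in [low, high] can be proposed."""
    return rng.uniform(low, high)


rng = random.Random(0)
integer_draws = {sample_integer(rng, 0, 1) for _ in range(100)}
continuous_draws = [sample_continuous(rng, 0, 1) for _ in range(100)]

print(sorted(integer_draws))  # contains only 0 and/or 1
print(min(continuous_draws), max(continuous_draws))
```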

In [None]:
tuner = HyperparameterTuner(**tuner_parameters)
tuner.fit({'train': s3_input_train, 'validation': s3_input_valid}, wait=False)
tuner_name = tuner.describe()["HyperParameterTuningJobName"]
print(f'tuning job submitted: {tuner_name}.')

In [None]:
tuner.wait()
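
Conceptually, waiting for a tuning job amounts to polling its status until it reaches a terminal state. The following is a minimal sketch of that idea, not the SDK's implementation: `wait_for_tuning_job` is a hypothetical helper, and the fake responses only mimic the `HyperParameterTuningJobStatus` field returned by `describe_hyper_parameter_tuning_job()`.

```python
import time

TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_tuning_job(describe_fn, poll_seconds=30, sleep_fn=time.sleep, max_polls=1000):
    """Poll describe_fn() until the tuning job reaches a terminal status."""
    for _ in range(max_polls):
        status = describe_fn()["HyperParameterTuningJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep_fn(poll_seconds)
    raise TimeoutError("tuning job did not finish within the polling budget")


# Fake responses stand in for real API calls so the sketch runs offline.
fake_responses = iter([
    {"HyperParameterTuningJobStatus": "InProgress"},
    {"HyperParameterTuningJobStatus": "InProgress"},
    {"HyperParameterTuningJobStatus": "Completed"},
])
final_status = wait_for_tuning_job(lambda: next(fake_responses), sleep_fn=lambda seconds: None)
print(final_status)
```

In this notebook, `describe_fn` could be `lambda: sm.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=tuner_name)`.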

## View the AMT job statistics and results 

Your tuning jobs can be accessed from the Amazon SageMaker console at https://console.aws.amazon.com/sagemaker/. Select Hyperparameter tuning jobs from the Training menu to see the list. More information here: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-monitor.html

You can also check the results of the jobs programmatically and investigate the hyperparameters used, the final value achieved in the objective function and the total training time per job.

#### 1. Via the Amazon SageMaker Python SDK

In [None]:
sagemaker.HyperparameterTuningJobAnalytics(tuner_name).dataframe()[:10]

#### 2. Via the AWS SDK for Python (Boto3)

With the Boto3 client, we can review the results of the tuning job using the [`describe_hyper_parameter_tuning_job()`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeHyperParameterTuningJob.html) function.

In [None]:
#sm.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=tuner_name)   # to review all statistics
sm.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=tuner_name)['BestTrainingJob']
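
The `BestTrainingJob` entry is a nested dictionary. A small helper can reduce it to the fields of interest. `summarize_best_job` is a hypothetical helper, and the sample response below is a made-up fragment that only mirrors the shape of the real response.

```python
def summarize_best_job(response):
    """Extract name, objective metric, and tuned values of the best job."""
    best = response.get("BestTrainingJob")
    if not best:
        return None  # no training job has completed yet
    metric = best["FinalHyperParameterTuningJobObjectiveMetric"]
    return {
        "job_name": best["TrainingJobName"],
        "metric_name": metric["MetricName"],
        "metric_value": metric["Value"],
        "hyperparameters": dict(best["TunedHyperParameters"]),
    }


# Made-up response fragment for illustration; real values come from the API.
sample_response = {
    "BestTrainingJob": {
        "TrainingJobName": "example-job-001",
        "TunedHyperParameters": {"eta": "0.31", "max_depth": "7"},
        "FinalHyperParameterTuningJobObjectiveMetric": {
            "MetricName": "validation:accuracy",
            "Value": 0.97,
        },
    }
}
summary = summarize_best_job(sample_response)
print(summary)
```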

We can also use the Boto3 [`list_training_jobs_for_hyper_parameter_tuning_job()`](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/list-training-jobs-for-hyper-parameter-tuning-job.html) function to review the results. The results can be sorted by name, creation time, status, or final objective metric value. More functions available for Amazon SageMaker with Boto3 are described on this page: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html

In [None]:
hpo_jobs = sm.list_training_jobs_for_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuner_name,
    MaxResults=100,
    SortBy='FinalObjectiveMetricValue',
    SortOrder='Descending')

for job in hpo_jobs['TrainingJobSummaries'][:10]:
    job_descr = sm.describe_training_job(TrainingJobName=job['TrainingJobName'])
    metrics = {m['MetricName']:  m['Value'] for m in job_descr['FinalMetricDataList']}
    print(f'{job["TrainingJobName"]} Metrics: {metrics}')
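
A single call returns at most one page of results. For tuning jobs with more training jobs than fit in one page, the `NextToken` field of the response can be passed to the next call. The sketch below shows that loop; `list_all_training_jobs` is a hypothetical helper, and `FakeSageMakerClient` is a stand-in for `boto3.client('sagemaker')` so that the example runs without AWS access.

```python
def list_all_training_jobs(client, tuning_job_name, page_size=100):
    """Collect the training job summaries of a tuning job across all pages."""
    summaries, next_token = [], None
    while True:
        kwargs = {
            "HyperParameterTuningJobName": tuning_job_name,
            "MaxResults": page_size,
            "SortBy": "FinalObjectiveMetricValue",
            "SortOrder": "Descending",
        }
        if next_token:
            kwargs["NextToken"] = next_token
        page = client.list_training_jobs_for_hyper_parameter_tuning_job(**kwargs)
        summaries.extend(page["TrainingJobSummaries"])
        next_token = page.get("NextToken")
        if not next_token:
            return summaries


class FakeSageMakerClient:
    """Hypothetical offline stand-in that pages through a list of job names."""

    def __init__(self, job_names):
        self._job_names = job_names

    def list_training_jobs_for_hyper_parameter_tuning_job(self, **kwargs):
        start = int(kwargs.get("NextToken", 0))
        end = start + kwargs["MaxResults"]
        names = self._job_names[start:end]
        page = {"TrainingJobSummaries": [{"TrainingJobName": n} for n in names]}
        if end < len(self._job_names):
            page["NextToken"] = str(end)  # token encodes the next start index
        return page


fake_client = FakeSageMakerClient([f"job-{i}" for i in range(5)])
all_jobs = list_all_training_jobs(fake_client, "example-tuning-job", page_size=2)
print(len(all_jobs))
```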

## Visualize AMT job results and tuned Hyperparameters

Finally, we want to visualise how the objective metric behaves across different hyperparameter values.

To do this, we use the Vega-Altair statistical visualization library for Python together with two custom analysis scripts, `job_analytics.py` and `reporting_util.py`, which we make available with this notebook.

In [None]:
!pip install -Uq pip altair

Please ensure that the execution role used by SageMaker has an [IAM](https://console.aws.amazon.com/iam) policy that allows the `cloudwatch:ListMetrics` action.
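
A policy statement granting this permission could look like the following sketch, written as a Python dictionary and serialized to JSON. The variable names are hypothetical, and how the policy is attached to the role depends on your account setup.

```python
import json

# Minimal policy allowing only cloudwatch:ListMetrics; this action does not
# support resource-level restrictions, so the resource is "*".
list_metrics_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "cloudwatch:ListMetrics",
            "Resource": "*",
        }
    ],
}

policy_json = json.dumps(list_metrics_policy, indent=2)
print(policy_json)
```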

In [None]:
from reporting_util import analyze_hpo_job
analyze_hpo_job(tuner)