In [1]:
%%sh
pip -q install --upgrade pip
pip -q install sagemaker awscli boto3 --upgrade

In [2]:
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

# Direct Marketing with Keras and Hyperparameter Tuning

Last update: December 2nd, 2019

In this lab, we're going to use a simple neural network implemented with [Keras](https://keras.io), a popular, beginner-friendly deep learning library.

Here's a high-level overview of the Keras code below:
* Read hyperparameters, architecture parameters (number and width of dense layers), and environment variables passed by SageMaker (as per [script mode](https://sagemaker.readthedocs.io/en/stable/using_tf.html)),
* Read the full data set from the training channel,
* One-hot encode categorical variables,
* Separate samples (X) and labels (Y),
* Apply [min/max](https://en.wikipedia.org/wiki/Feature_scaling) scaling on numerical features,
* Split data set for training and validation,
* Build the neural network, with 1 to 'layers' dense layers, each one with 'dense_layer' neurons,
* Train the model, displaying precision, recall and F1 score,
* Score the model,
* Save the model.
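As a minimal sketch of the first step in the list above, here is one way a script-mode entry point could read its hyperparameters and SageMaker locations. The default values and the `parse_args` helper name are assumptions for illustration; the actual `dm_keras_tf.py` may be organized differently.

```python
import argparse
import os

def parse_args(argv=None):
    """Parse hyperparameters and SageMaker-provided locations (illustrative sketch)."""
    parser = argparse.ArgumentParser()

    # In script mode, hyperparameters arrive as command-line arguments.
    # The defaults below are assumptions, not values taken from the lab script.
    parser.add_argument('--epochs', type=int, default=1)
    parser.add_argument('--learning-rate', type=float, default=0.01)
    parser.add_argument('--batch-size', type=int, default=32)
    parser.add_argument('--layers', type=int, default=1)
    parser.add_argument('--dense-layer', type=int, default=16)

    # Input and output locations are exposed as environment variables in the container.
    parser.add_argument('--model-dir', type=str,
                        default=os.environ.get('SM_MODEL_DIR', '/opt/ml/model'))
    parser.add_argument('--training', type=str,
                        default=os.environ.get('SM_CHANNEL_TRAINING',
                                               '/opt/ml/input/data/training'))

    # Ignore any extra arguments the platform may add.
    args, _ = parser.parse_known_args(argv)
    return args

# argparse maps '--dense-layer' to the attribute 'dense_layer'
args = parse_args(['--layers', '3', '--dense-layer', '48'])
print(args.layers, args.dense_layer, args.learning_rate)
```

Note that hyphenated argument names become underscored attributes, which is why the tuner uses `dense-layer` while the script refers to `dense_layer`.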


In [3]:
!pygmentize dm_keras_tf.py

import argparse, os
import numpy as np
import pandas as pd

import tensorflow as tf
import keras

import subprocess
import sys

def install(package):
    subprocess.call([sys.executable, "-m", "pip", "install", package])

if __name__ == '__main__':

    # Keras-metrics brings additional metrics: precision, recall, f1

In [4]:
import sagemaker
import boto3

print(sagemaker.__version__)

sess   = sagemaker.Session()
bucket = sess.default_bucket()                     
prefix = 'sagemaker-autopilot/DEMO-hpo-keras-dm'
region = boto3.Session().region_name

# Role when working on a notebook instance
role = sagemaker.get_execution_role()
# Role when working locally
# role = ROLE_ARN

1.50.9


We upload the raw dataset to S3, as the Keras script itself will perform basic preprocessing.
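To make "basic preprocessing" concrete, here is a small sketch of label separation, min/max scaling and one-hot encoding with pandas. The toy rows and the `preprocess` helper are hypothetical stand-ins, not the code in `dm_keras_tf.py`. Numerical columns are scaled before one-hot encoding so that the generated indicator columns are left untouched, and the training/validation split is omitted.

```python
import pandas as pd

# Toy rows standing in for the real dataset (assumed column names and values)
data = pd.DataFrame({
    'age': [30, 45, 60],
    'job': ['admin.', 'technician', 'admin.'],
    'y':   ['no', 'yes', 'no'],
})

def preprocess(df, label_col='y'):
    # Separate labels (Y) from samples (X)
    labels = (df[label_col] == 'yes').astype(int)
    samples = df.drop(columns=[label_col])

    # Min/max scale numerical columns to the [0, 1] range
    num_cols = samples.select_dtypes(include='number').columns
    col_min, col_max = samples[num_cols].min(), samples[num_cols].max()
    span = (col_max - col_min).replace(0, 1)  # avoid dividing by zero on constant columns
    samples[num_cols] = (samples[num_cols] - col_min) / span

    # One-hot encode the remaining categorical columns
    samples = pd.get_dummies(samples)
    return samples, labels

features, labels = preprocess(data)
print(features)
print(labels.tolist())
```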

In [5]:
training_input_path = sess.upload_data('bank-additional/bank-additional-full.csv', key_prefix=prefix+'/training')

print(training_input_path)

s3://sagemaker-us-east-1-806570384721/sagemaker-autopilot/DEMO-hpo-keras-dm/training/bank-additional-full.csv


## Configure Automatic Model Tuning

In [6]:
from sagemaker.tensorflow import TensorFlow

tf_estimator = TensorFlow(entry_point='dm_keras_tf.py', 
                          role=role,
                          train_instance_count=1, 
                          train_instance_type='ml.c5.2xlarge',
                          framework_version='1.14', 
                          py_version='py3',
                          script_mode=True,
                          train_use_spot_instances=True,        # Use Spot Instances
                          train_max_run=600,                    # Max training time, in seconds
                          train_max_wait=3600                   # Max training time + Spot waiting time, in seconds
                         )

Let's try to tune our Keras model on two architecture parameters, the number of dense layers and the dense layer width, alongside the number of epochs, the learning rate and the batch size.

We're using the F1 metric again. It's not natively supported in Keras, and requires the addition of the keras-metrics package. Installation is done in the script itself. We also need to pass a regular expression so that SageMaker can locate and extract the metric from the training log.
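As a reminder, F1 is the harmonic mean of precision and recall. The sketch below is a plain-Python illustration of the formula, not the keras-metrics implementation; returning 0.0 when both inputs are zero is a convention assumed here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.5, 0.5))
print(f1_score(1.0, 0.5))
print(f1_score(0.0, 0.0))
```

Because it is a harmonic mean, F1 stays low whenever either precision or recall is low, which makes it a stricter objective than accuracy on imbalanced data.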

In [7]:
from sagemaker.tuner import IntegerParameter, ContinuousParameter, HyperparameterTuner

hyperparameter_ranges = {
    'epochs':        IntegerParameter(1, 5),
    'learning-rate': ContinuousParameter(0.001, 0.1, scaling_type='ReverseLogarithmic'), # requires a range within [0, 1)
    'batch-size':    IntegerParameter(16, 1024, scaling_type='Logarithmic'),
    'layers':        IntegerParameter(1, 4),
    'dense-layer':   IntegerParameter(4, 64)
}

objective_metric_name = 'f1_score'
objective_type = 'Maximize'
metric_definitions = [{'Name': 'f1_score', 'Regex': 'val_f1_score: ([0-9\\.]+)'}]

tuner = HyperparameterTuner(tf_estimator,
                            objective_metric_name,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=20,
                            max_parallel_jobs=2,
                            objective_type=objective_type)
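Before launching a tuning job, it can help to check the metric regular expression locally. The sketch below applies the same pattern as in `metric_definitions` to a hypothetical log line; the exact format of the real training log is an assumption.

```python
import re

# Same pattern as in metric_definitions, written as a raw string
pattern = re.compile(r'val_f1_score: ([0-9\.]+)')

# Hypothetical log line, shaped like Keras progress output
log_line = '1000/1000 - 2s - loss: 0.3012 - val_loss: 0.2871 - val_f1_score: 0.4615'

match = pattern.search(log_line)
value = float(match.group(1)) if match else None
print(value)
```

If `value` is `None`, the pattern does not match the log format and the tuner would have no objective value to optimize.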

In [8]:
tuner.fit({'training': training_input_path})

You can repeatedly run the cells below while the job is running.

In [11]:
# Low-level client; named sm_client so it does not shadow the imported sagemaker module
sm_client = boto3.Session().client(service_name='sagemaker')

job_name = tuner.latest_tuning_job.job_name

# Run this cell to check the current status of the hyperparameter tuning job
tuning_job_result = sm_client.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=job_name)

status = tuning_job_result['HyperParameterTuningJobStatus']
if status != 'Completed':
    print('Reminder: the tuning job has not been completed.')
    
job_count = tuning_job_result['TrainingJobStatusCounters']['Completed']
print("%d training jobs have completed" % job_count)

Reminder: the tuning job has not been completed.
10 training jobs have completed
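Instead of re-running the cell by hand, the same status check could be wrapped in a polling loop. This is a sketch under assumptions: `wait_for_tuning_job` is a hypothetical helper, the set of terminal states is an assumption about which statuses end a job, and `client` is expected to be a boto3 SageMaker client such as the one created above.

```python
import time

# Statuses treated as final in this sketch (assumption)
TERMINAL_STATES = {'Completed', 'Failed', 'Stopped'}

def wait_for_tuning_job(client, job_name, poll_seconds=60, sleep=time.sleep):
    """Poll the tuning job until it reaches a terminal state, then return that state."""
    while True:
        result = client.describe_hyper_parameter_tuning_job(
            HyperParameterTuningJobName=job_name)
        status = result['HyperParameterTuningJobStatus']
        completed = result['TrainingJobStatusCounters']['Completed']
        print('%s: %d training jobs have completed' % (status, completed))
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)

# Example call (requires AWS credentials and a running job):
# wait_for_tuning_job(boto3.client('sagemaker'), job_name)
```

The `sleep` parameter is injectable so the loop can be exercised with a stub client without actually waiting.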


## Inspect jobs with HyperparameterTuningJobAnalytics

In [12]:
from sagemaker.analytics import HyperparameterTuningJobAnalytics

exp = HyperparameterTuningJobAnalytics(
    sagemaker_session=sess, 
    hyperparameter_tuning_job_name=tuner.latest_tuning_job.name
)

In [13]:
df = exp.dataframe()

In [14]:
df

Unnamed: 0,FinalObjectiveValue,TrainingElapsedTimeSeconds,TrainingEndTime,TrainingJobName,TrainingJobStatus,TrainingStartTime,batch-size,dense-layer,epochs,layers,learning-rate
0,0.3007,89.0,2020-02-04 18:33:13+00:00,tensorflow-training-200204-1757-020-79334ef9,Completed,2020-02-04 18:31:44+00:00,115.0,59.0,1.0,1.0,0.097707
1,0.1775,61.0,2020-02-04 18:31:51+00:00,tensorflow-training-200204-1757-019-c2d7394a,Completed,2020-02-04 18:30:50+00:00,42.0,62.0,4.0,1.0,0.005088
2,0.3861,42.0,2020-02-04 18:29:37+00:00,tensorflow-training-200204-1757-018-6035fda7,Completed,2020-02-04 18:28:55+00:00,62.0,49.0,3.0,1.0,0.075698
3,0.3189,90.0,2020-02-04 18:28:50+00:00,tensorflow-training-200204-1757-017-8133cb0d,Completed,2020-02-04 18:27:20+00:00,21.0,43.0,2.0,3.0,0.020413
4,0.0,44.0,2020-02-04 18:25:07+00:00,tensorflow-training-200204-1757-016-2d479e6a,Completed,2020-02-04 18:24:23+00:00,438.0,57.0,2.0,4.0,0.079491
5,0.3574,122.0,2020-02-04 18:26:29+00:00,tensorflow-training-200204-1757-015-1df02159,Completed,2020-02-04 18:24:27+00:00,19.0,57.0,2.0,1.0,0.047803
6,0.3701,42.0,2020-02-04 18:22:28+00:00,tensorflow-training-200204-1757-014-71492293,Completed,2020-02-04 18:21:46+00:00,30.0,63.0,2.0,1.0,0.052032
7,0.4615,64.0,2020-02-04 18:22:13+00:00,tensorflow-training-200204-1757-013-917f8f1d,Completed,2020-02-04 18:21:09+00:00,16.0,61.0,4.0,1.0,0.076594
8,0.4563,63.0,2020-02-04 18:19:29+00:00,tensorflow-training-200204-1757-012-ac34996f,Completed,2020-02-04 18:18:26+00:00,17.0,60.0,4.0,1.0,0.076594
9,0.4573,44.0,2020-02-04 18:19:04+00:00,tensorflow-training-200204-1757-011-401e68a9,Completed,2020-02-04 18:18:20+00:00,46.0,39.0,2.0,2.0,0.091455


'FinalObjectiveValue' is the objective metric, here the validation F1 score extracted from the training log.

In [15]:
df.sort_values('FinalObjectiveValue', ascending=False).head(1)

Unnamed: 0,FinalObjectiveValue,TrainingElapsedTimeSeconds,TrainingEndTime,TrainingJobName,TrainingJobStatus,TrainingStartTime,batch-size,dense-layer,epochs,layers,learning-rate
7,0.4615,64.0,2020-02-04 18:22:13+00:00,tensorflow-training-200204-1757-013-917f8f1d,Completed,2020-02-04 18:21:09+00:00,16.0,61.0,4.0,1.0,0.076594
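Looking at more than the single best job can show which hyperparameter values the top jobs share. The sketch below uses three rows copied from the results table above (a subset of columns) and `DataFrame.nlargest`; the `sample_df` name is only for illustration, and on the real results you would call it on `df` directly.

```python
import pandas as pd

# Three rows copied from the results table above (subset of columns)
sample_df = pd.DataFrame({
    'FinalObjectiveValue': [0.3007, 0.4615, 0.4573],
    'batch-size':          [115.0, 16.0, 46.0],
    'dense-layer':         [59.0, 61.0, 39.0],
    'layers':              [1.0, 1.0, 2.0],
    'learning-rate':       [0.097707, 0.076594, 0.091455],
})

# Keep the two jobs with the highest objective value, best first
top = sample_df.nlargest(2, 'FinalObjectiveValue')
print(top)
```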


How does this compare to what you achieved in the first two labs?