<h1>Model Training</h1>

In this notebook, we will use the Amazon SageMaker built-in Linear Learner algorithm to train a binary classification model on the preprocessed data generated in step 1.

First, let's take a look at our preprocessed data.

In [14]:
import boto3
import sagemaker

role = sagemaker.get_execution_role()
region = boto3.Session().region_name

print(region)
print(role)

bucket_name = 'gianpo-predictive-maintenance'

eu-west-1
arn:aws:iam::825935527263:role/gianpo-path/SageMaker-Notebook-Role


In [15]:
import boto3

# Download the preprocessed file produced in step 1 from the 'data/' prefix of the bucket.
file_name = 'windturbine_raw_data.csv.out'
s3 = boto3.resource('s3')
s3.Bucket(bucket_name).download_file('data/' + file_name, file_name)

In [8]:
import pandas
import numpy

df = pandas.read_csv(file_name, header=None)
df.head(10)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,19,20,21,22,23,24,25,26,27,28
0,0.0,1.464884,0.536121,-0.055834,1.565308,-0.830572,-1.172415,-1.619257,1.317691,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
1,1.0,1.708953,1.365431,-0.189292,0.894531,-0.732993,-0.342473,1.622002,0.586022,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
2,0.0,-0.145972,-0.927368,-0.856581,0.335549,-0.196309,0.58511,1.622002,-0.877317,0.0,...,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
3,1.0,1.123187,0.975167,0.077624,-1.341394,-1.611203,0.78039,-0.461664,1.464025,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0
4,0.0,-1.659201,-1.317632,1.145287,-1.229598,1.267375,0.145729,1.390484,-1.023651,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0
5,0.0,1.367257,1.02395,-0.990039,-1.006006,0.77948,1.415052,-1.619257,1.512803,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0
6,0.0,1.464884,-1.220066,-0.055834,1.229919,-0.489045,1.219771,-0.924701,0.293354,0.0,...,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0
7,0.0,0.439794,-1.024934,-0.055834,-1.117802,-1.220887,1.610332,1.158965,0.244576,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0
8,0.0,0.39098,-1.659113,1.412203,-1.117802,-0.342677,-1.611796,-0.924701,-0.291982,0.0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
9,0.0,-1.317504,-0.585887,-0.856581,0.894531,-1.172098,-0.879494,0.695928,1.220135,1.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0
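
The preview above shows 29 columns, numbered 0 through 28. Assuming column 0 holds the label (Linear Learner expects the target in the first column of CSV training data), that leaves 28 feature columns. The following is a minimal sketch, using only the standard library and a tiny made-up sample, of how the feature count and the label balance could be checked before training. The helper name `describe_csv` and the sample values are hypothetical and not part of the original notebook.

```python
import csv
import io


def describe_csv(handle):
    """Return (feature_count, label_counts) for a headerless CSV whose first column is the label."""
    n_columns = None
    label_counts = {}
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if n_columns is None:
            n_columns = len(row)
        elif len(row) != n_columns:
            raise ValueError("inconsistent number of columns")
        label = float(row[0])
        label_counts[label] = label_counts.get(label, 0) + 1
    if n_columns is None:
        raise ValueError("no data rows found")
    # Every column except the label column counts as a feature.
    return n_columns - 1, label_counts


# Hypothetical three-row sample: one label column followed by two features.
sample = io.StringIO("0.0,1.46,0.53\n1.0,1.70,1.36\n0.0,-0.14,-0.92\n")
feature_dim, label_counts = describe_csv(sample)
print(feature_dim, label_counts)
```

For the real file, the same function could be called on `open(file_name, newline='')`. The label counts give a quick view of how imbalanced the two classes are.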


Let's split the data into training and test sets, and then copy them back to Amazon S3 to start training.

In [9]:
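# Use the first 800,000 rows for training and the remaining rows for testing.
train_set = df[:800000]
test_set = df[800000:]

# Write both sets without header or index, so that the CSV files contain only data.
train_set.to_csv('windturbine_data_train.csv', header=False, index=False)
test_set.to_csv('windturbine_data_test.csv', header=False, index=False)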
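
The cell above splits by row position, so it keeps whatever ordering the file has. If the rows happened to be ordered (for example by time or by turbine), a shuffled split might be preferable. The sketch below is one possible way to do that with a fixed seed; it is an illustration and not part of the original notebook, and the helper name `shuffled_split_indices`, the 80% fraction, and the seed are assumptions.

```python
import random


def shuffled_split_indices(n_rows, train_fraction=0.8, seed=42):
    """Return (train_indices, test_indices) after shuffling the row positions reproducibly."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle reproducible and leaves the global RNG untouched.
    random.Random(seed).shuffle(indices)
    cut = int(n_rows * train_fraction)
    return indices[:cut], indices[cut:]


# Small demonstration with 10 rows.
train_idx, test_idx = shuffled_split_indices(10, train_fraction=0.8)
print(len(train_idx), len(test_idx))
```

With the notebook's DataFrame, the resulting positions could be applied as `df.iloc[train_idx]` and `df.iloc[test_idx]`.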

In [12]:
import boto3

s3 = boto3.resource('s3')
target_bucket = s3.Bucket(bucket_name)

with open('windturbine_data_train.csv', 'rb') as data:
    target_bucket.upload_fileobj(data, 'data/windturbine_data_train.csv')
    
with open('windturbine_data_test.csv', 'rb') as data:
    target_bucket.upload_fileobj(data, 'data/windturbine_data_test.csv')

To start training, we need to specify the location of the Docker container that will be used for training.
Docker registry paths for the Amazon built-in algorithms are listed here: https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html

Alternatively, a utility function in the Amazon SageMaker Python SDK returns the path for us.

In [4]:
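import boto3
from sagemaker.amazon.amazon_estimator import get_image_uri

# Look up the Linear Learner training image for the current region.
# Note: get_image_uri belongs to version 1 of the SageMaker Python SDK;
# version 2 offers sagemaker.image_uris.retrieve('linear-learner', region) instead.
container = get_image_uri(boto3.Session().region_name, 'linear-learner', repo_version="latest")
print(container)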

438346466558.dkr.ecr.eu-west-1.amazonaws.com/linear-learner:latest
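
The printed path follows the usual Amazon ECR layout: account ID, region, repository, and tag. As a rough illustration, the sketch below splits such a path into its parts, which can help confirm that the image region matches the region of the training job. The pattern and the helper name `parse_ecr_image_uri` are assumptions made for this example; the pattern only covers the standard `amazonaws.com` domain.

```python
import re

# Assumed layout: <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
ECR_URI_PATTERN = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_image_uri(uri):
    """Split an ECR image path into account, region, repository, and tag."""
    match = ECR_URI_PATTERN.match(uri)
    if match is None:
        raise ValueError("not a recognised ECR image path: " + uri)
    return match.groupdict()


parts = parse_ecr_image_uri(
    "438346466558.dkr.ecr.eu-west-1.amazonaws.com/linear-learner:latest"
)
print(parts["region"], parts["repository"], parts["tag"])
```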


We can now start training by specifying the input and output settings and the required hyperparameters. The hyperparameters supported by the Linear Learner algorithm are listed here: https://docs.aws.amazon.com/sagemaker/latest/dg/ll_hyperparameters.html

You can also run the following cell several times, changing the hyperparameters or other settings such as the number of instances used for training.
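
When trying several settings, it can help to write the combinations down explicitly instead of editing the cell by hand. The sketch below builds a small grid of hyperparameter dictionaries. The base values mirror the ones used in this notebook, while the candidate values for `mini_batch_size` and `epochs` and the helper name `expand_grid` are hypothetical choices made for illustration.

```python
import itertools

# Settings kept fixed across all runs (taken from this notebook).
base = {"feature_dim": 28, "predictor_type": "binary_classifier"}

# Hypothetical values to explore; adjust to your own needs.
grid = {"mini_batch_size": [200, 1000], "epochs": [5, 15]}


def expand_grid(base, grid):
    """Yield one hyperparameter dict per combination of the grid values."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        params = dict(base)
        params.update(zip(keys, values))
        yield params


candidates = list(expand_grid(base, grid))
for params in candidates:
    print(params)
```

Each dictionary could then be passed to the estimator with `est.set_hyperparameters(**params)` before calling `est.fit(...)`. Keep in mind that every combination starts a separate, billable training job.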

In [16]:
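import sagemaker

# Trained model artifacts are written under this S3 prefix.
output_location = 's3://{0}/output'.format(bucket_name)

# Parameter names follow version 1 of the SageMaker Python SDK
# (version 2 renames them to instance_count and instance_type).
est = sagemaker.estimator.Estimator(container,
                                    role,
                                    train_instance_count=1,
                                    train_instance_type='ml.c5.4xlarge',
                                    output_path=output_location,
                                    base_job_name='predmain-training-ll')

# feature_dim is the number of feature columns: the CSV has 29 columns,
# and the first one holds the label.
est.set_hyperparameters(feature_dim=28,
                        predictor_type='binary_classifier',
                        mini_batch_size=200,
                        normalize_data=False,
                        normalize_label=False,
                        unbias_data=False,
                        unbias_label=False)

# s3_input is the version 1 name; version 2 uses sagemaker.inputs.TrainingInput.
train_config = sagemaker.session.s3_input('s3://{0}/data/windturbine_data_train.csv'.format(
    bucket_name), content_type='text/csv')
test_config = sagemaker.session.s3_input('s3://{0}/data/windturbine_data_test.csv'.format(
    bucket_name), content_type='text/csv')

# Launch the training job with separate train and test channels.
est.fit({'train': train_config, 'test': test_config})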

INFO:sagemaker:Creating training-job with name: predmain-training-ll-2019-05-04-14-18-50-815


2019-05-04 14:18:51 Starting - Starting the training job...
2019-05-04 14:18:53 Starting - Launching requested ML instances......
2019-05-04 14:19:58 Starting - Preparing the instances for training...
2019-05-04 14:20:51 Downloading - Downloading input data
2019-05-04 14:20:51 Training - Downloading the training image....
Docker entrypoint called with argument(s): train
[05/04/2019 14:21:23 INFO 140709393602368] Reading default configuration from /opt/amazon/lib/python2.7/site-packages/algorithm/default-input.json: {u'loss_insensitivity': u'0.01', u'epochs': u'15', u'init_bias': u'0.0', u'lr_scheduler_factor': u'auto', u'num_calibration_samples': u'10000000', u'accuracy_top_k': u'3', u'_num_kv_servers': u'auto', u'use_bias': u'true', u'num_point_for_scaler': u'10000', u'_log_level': u'info', u'quantile': u'0.5', u'bias_lr_mult': u'auto', u'lr_scheduler_step': u'auto', u'init_method': u'uniform', u'init_sigma': u'0.01', u'lr_scheduler_minimum_lr': u'auto', u'target_recall'

#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23535971304416656, "sum": 0.23535971304416656, "min": 0.23535971304416656}}, "EndTime": 1556979821.000698, "Dimensions": {"model": 0, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 1}, "StartTime": 1556979821.000646}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23578582885503768, "sum": 0.23578582885503768, "min": 0.23578582885503768}}, "EndTime": 1556979821.000761, "Dimensions": {"model": 1, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 1}, "StartTime": 1556979821.000752}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23537718517065048, "sum": 0.23537718517065048, "min": 0.23537718517065048}}, "EndTime": 1556979821.000793, "Dimensions": {"model": 2, "Host": "algo-1", "Operation": "traini

#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23534990886926652, "sum": 0.23534990886926652, "min": 0.23534990886926652}}, "EndTime": 1556979889.513293, "Dimensions": {"model": 0, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 2}, "StartTime": 1556979889.513241}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23563515264272689, "sum": 0.23563515264272689, "min": 0.23563515264272689}}, "EndTime": 1556979889.513354, "Dimensions": {"model": 1, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 2}, "StartTime": 1556979889.513346}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23536782847642898, "sum": 0.23536782847642898, "min": 0.23536782847642898}}, "EndTime": 1556979889.513379, "Dimensions": {"model": 2, "Host": "algo-1", "Operation": "traini

#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.2353486087870598, "sum": 0.2353486087870598, "min": 0.2353486087870598}}, "EndTime": 1556979957.648391, "Dimensions": {"model": 0, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 3}, "StartTime": 1556979957.648343}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.2355393171453476, "sum": 0.2355393171453476, "min": 0.2355393171453476}}, "EndTime": 1556979957.648451, "Dimensions": {"model": 1, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 3}, "StartTime": 1556979957.648442}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23536453356504441, "sum": 0.23536453356504441, "min": 0.23536453356504441}}, "EndTime": 1556979957.648483, "Dimensions": {"model": 2, "Host": "algo-1", "Operation": "training", "

#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23534752263069153, "sum": 0.23534752263069153, "min": 0.23534752263069153}}, "EndTime": 1556980025.059443, "Dimensions": {"model": 0, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 4}, "StartTime": 1556980025.059394}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23547798682689666, "sum": 0.23547798682689666, "min": 0.23547798682689666}}, "EndTime": 1556980025.059505, "Dimensions": {"model": 1, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 4}, "StartTime": 1556980025.059497}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23536310140609742, "sum": 0.23536310140609742, "min": 0.23536310140609742}}, "EndTime": 1556980025.059531, "Dimensions": {"model": 2, "Host": "algo-1", "Operation": "traini

#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23534660598754883, "sum": 0.23534660598754883, "min": 0.23534660598754883}}, "EndTime": 1556980092.750266, "Dimensions": {"model": 0, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 5}, "StartTime": 1556980092.750217}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23543764513015747, "sum": 0.23543764513015747, "min": 0.23543764513015747}}, "EndTime": 1556980092.750327, "Dimensions": {"model": 1, "Host": "algo-1", "Operation": "training", "Algorithm": "Linear Learner", "epoch": 5}, "StartTime": 1556980092.750318}
#metrics {"Metrics": {"train_binary_classification_cross_entropy_objective": {"count": 1, "max": 0.23536241074323655, "sum": 0.23536241074323655, "min": 0.23536241074323655}}, "EndTime": 1556980092.750351, "Dimensions": {"model": 2, "Host": "algo-1", "Operation": "traini

[05/04/2019 14:28:36 INFO 140709393602368] #train_score (algo-1) : ('binary_classification_cross_entropy_objective', 0.23534740452528)
[05/04/2019 14:28:36 INFO 140709393602368] #train_score (algo-1) : ('binary_classification_accuracy', 0.89819375000000001)
[05/04/2019 14:28:36 INFO 140709393602368] #train_score (algo-1) : ('binary_f_1.000', 0.5443101007670831)
[05/04/2019 14:28:36 INFO 140709393602368] #train_score (algo-1) : ('precision', 0.7012773564776102)
[05/04/2019 14:28:36 INFO 140709393602368] #train_score (algo-1) : ('recall', 0.444759388115245)
[05/04/2019 14:28:36 INFO 140709393602368] #quality_metric: host=algo-1, train binary_classification_cross_entropy_objective <loss>=0.235347404525
[05/04/2019 14:28:36 INFO 140709393602368] #quality_metric: host=algo-1, train binary_classification_accuracy <score>=0.89819375
[05/04/2019 14:28:36 INFO 140709393602368] #quality_metric: host=algo-1, train binary_f_1.000 