This notebook builds on preprocessing that has already been performed. Training and validation sets of images and LST files have been created and placed in the correct structure on S3.

We can now run the multiclass image classification training.

# Import

## Libraries

In [20]:
%%time
import sagemaker
from sagemaker import get_execution_role
from sagemaker.amazon.amazon_estimator import get_image_uri

CPU times: user 12 µs, sys: 2 µs, total: 14 µs
Wall time: 17.6 µs


# Set-Up

## Establish AWS Conditions

In [21]:
role = get_execution_role()
print(role)

bucket = "dsba-6190-final-team-project"
prefix_1 = "channels"
prefix_file_type = "rec"
prefix_split_type = "split_im2rec"

sess_sage = sagemaker.Session()

arn:aws:iam::726963482731:role/sagemaker_execution


## Import Sagemaker Model

In [22]:
training_image = get_image_uri(sess_sage.boto_region_name, 'image-classification', repo_version="latest")
print(training_image)

811284229777.dkr.ecr.us-east-1.amazonaws.com/image-classification:latest


# Model Training
Two datasets have been uploaded to S3: the complete dataset and a 10% sample of it. The 10% sample is used for troubleshooting the algorithm.

There are only two differences between fitting the model with the sample data and the complete dataset:

* Input Location: We need to point the algorithm to two different S3 locations. We will do this with the **prefix_dataset** variable, which will be defined at the beginning of each dataset's notebook section.
* Number of Training Samples: The number of training samples differs between the complete dataset and the sample. These values are available in the Jupyter notebook used to split the data and upload it to S3. We will define the number of **training** samples for each dataset below.

In [23]:
num_training_samples_complete = 15686
num_training_samples_10 = 1567
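
As a minimal sketch of how the two settings could be kept together, the lookup below maps a dataset name to its S3 prefix and training-sample count. The `DATASETS` table and the `select_dataset` helper are hypothetical, and the `"complete"` prefix is an assumption, since only the `"sample"` prefix appears in this notebook.

```python
# Hypothetical lookup table; the counts come from the cell above.
# The "complete" prefix name is an assumption; only "sample" is used in this notebook.
DATASETS = {
    "sample": {"prefix_dataset": "sample", "num_training_samples": 1567},
    "complete": {"prefix_dataset": "complete", "num_training_samples": 15686},
}


def select_dataset(name):
    """Return (prefix_dataset, num_training_samples) for a dataset name."""
    try:
        entry = DATASETS[name]
    except KeyError:
        raise ValueError("Unknown dataset: {!r}".format(name))
    return entry["prefix_dataset"], entry["num_training_samples"]


prefix_dataset, num_training_samples = select_dataset("sample")
print(prefix_dataset, num_training_samples)
```

Keeping both values in one place would make it harder to pair the sample prefix with the complete dataset's count by accident.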

## Sample Dataset
The following analysis will be for the 10% sample dataset.

In [24]:
prefix_dataset = "sample"
num_training_samples = num_training_samples_10

print("This section trains the image classification model on the {} data, which has {} images in its training sample.".format(prefix_dataset, num_training_samples))

This section trains the image classification model on the sample data, which has 1567 images in its training sample.


## Model Inputs

### Model Output Location

In [25]:
s3_output_location = 's3://{}/{}/{}/{}/{}/output'.format(bucket, prefix_1, prefix_file_type, prefix_split_type, prefix_dataset)
print(s3_output_location)

s3://dsba-6190-final-team-project/channels/rec/split_im2rec/sample/output


### Data Paths

First we establish the S3 paths for the two channels, train and validation.

In [26]:
s3train = 's3://{}/{}/{}/{}/{}/train/'.format(bucket, prefix_1, prefix_file_type, prefix_split_type, prefix_dataset)
s3validation = 's3://{}/{}/{}/{}/{}/validation/'.format(bucket, prefix_1, prefix_file_type, prefix_split_type, prefix_dataset)

print(s3train)
print(s3validation)

s3://dsba-6190-final-team-project/channels/rec/split_im2rec/sample/train/
s3://dsba-6190-final-team-project/channels/rec/split_im2rec/sample/validation/
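
The path construction above repeats the same prefix chain for every location. A minimal sketch of a helper that builds these URIs is shown below. `s3_uri` is a hypothetical name, not part of the SageMaker SDK, and the values are copied from the cells above.

```python
def s3_uri(bucket, *parts, trailing_slash=False):
    """Join a bucket name and key parts into an s3:// URI."""
    uri = "s3://{}".format("/".join((bucket,) + parts))
    return uri + "/" if trailing_slash else uri


# Values copied from the set-up cells in this notebook.
bucket = "dsba-6190-final-team-project"
base = ("channels", "rec", "split_im2rec", "sample")

s3train = s3_uri(bucket, *base, "train", trailing_slash=True)
s3validation = s3_uri(bucket, *base, "validation", trailing_slash=True)
s3_output_location = s3_uri(bucket, *base, "output")

print(s3train)
print(s3validation)
print(s3_output_location)
```

Switching to another dataset would then only require changing the last element of `base`.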


Then we define the channels as inputs into the image classification model.

In [27]:
train_data = sagemaker.session.s3_input(s3train, 
                                        distribution='FullyReplicated', 
                                        content_type='application/x-recordio', 
                                        s3_data_type='S3Prefix')

validation_data = sagemaker.session.s3_input(s3validation, 
                                             distribution='FullyReplicated', 
                                             content_type='application/x-recordio', 
                                             s3_data_type='S3Prefix')

data_channels = {'train': train_data, 
                 'validation': validation_data}

print(data_channels)

{'train': <sagemaker.inputs.s3_input object at 0x7f1e1921a550>, 'validation': <sagemaker.inputs.s3_input object at 0x7f1e1921a588>}


## Train Model

### Initialize Parameters

In [28]:
dist_drive_ic = sagemaker.estimator.Estimator(training_image,
                                              role,
                                              train_instance_count=1,
                                              train_instance_type='ml.p3.2xlarge',
                                              train_volume_size=50,  # EBS volume size in GB
                                              train_max_run=60,  # maximum training time in seconds
                                              input_mode='File',
                                              output_path=s3_output_location,
                                              sagemaker_session=sess_sage)

### Initialize Hyper-Parameters

In [29]:
dist_drive_ic.set_hyperparameters(num_layers=18,
                                  use_pretrained_model=1,  # start from pretrained weights
                                  image_shape="3,210,280",  # RGB pictures, 210 x 280
                                  num_classes=10,
                                  mini_batch_size=128,
                                  epochs=2,
                                  learning_rate=0.1,
                                  num_training_samples=num_training_samples,
                                  precision_dtype='float32')
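
To get a feel for these settings, the arithmetic below works out how many mini-batches one pass over the training data contains. It is a sketch only: how the algorithm treats the final partial batch is not stated in this notebook, so both the full-batch count and the leftover are reported. `batches_per_epoch` is a hypothetical helper.

```python
def batches_per_epoch(num_samples, batch_size):
    """Return (full_batches, leftover_samples) for one epoch."""
    return divmod(num_samples, batch_size)


mini_batch_size = 128

# Sample counts defined earlier in this notebook.
for name, n in [("sample", 1567), ("complete", 15686)]:
    full, leftover = batches_per_epoch(n, mini_batch_size)
    print("{}: {} full batches of {}, {} images left over".format(
        name, full, mini_batch_size, leftover))
```

With the sample dataset, each epoch covers only 12 full batches, which keeps a troubleshooting run short.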

### Run Model

In [30]:
%%time
dist_drive_ic.fit(inputs = data_channels, logs = True)

2020-03-30 15:05:05 Starting - Starting the training job...
2020-03-30 15:05:07 Starting - Launching requested ML instances.........
2020-03-30 15:06:39 Starting - Preparing the instances for training......
2020-03-30 15:07:59 Downloading - Downloading input data
2020-03-30 15:07:59 Training - Downloading the training image.....
Docker entrypoint called with argument(s): train
[03/30/2020 15:08:47 INFO 140350643181376] Reading default configuration from /opt/amazon/lib/python2.7/site-packages/image_classification/default-input.json: {u'beta_1': 0.9, u'gamma': 0.9, u'beta_2': 0.999, u'optimizer': u'sgd', u'use_pretrained_model': 0, u'eps': 1e-08, u'epochs': 30, u'lr_scheduler_factor': 0.1, u'num_layers': 152, u'image_shape': u'3,224,224', u'precision_dtype': u'float32', u'mini_batch_size': 32, u'weight_decay': 0.0001, u'learning_rate': 0.1, u'momentum': 0}
[03/30/2020 15:08:47 INFO 140350643181376] Merging with provided configuration from /opt/ml/input/config/hyper