# SageMaker Image Classification Algorithm
In this module you will use the Training and Validation datasets that you created in [Module 1](../1_DataExploration/Data_Exploration.ipynb) and use one of SageMaker's built-in algorithms ([Image Classification Algorithm](https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html)) to predict the steering angle of the vehicle.

In [39]:
import warnings
import boto3
import os
import boto3
import re
import time
import numpy as np
from scipy.misc import imsave
from sagemaker import get_execution_role
import urllib.request
from time import gmtime, strftime


warnings.simplefilter('ignore')

# Helper Functions
def download(url):
    filename = url.split('/')[-1]
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)

def upload2s3(folder, file):
    s3 = boto3.resource('s3')
    data = open(file, 'rb')
    key = folder+'/'+file
    s3.Bucket(bucket).put_object(Key=key, Body=data)

In [40]:
# Download `.rec` files and upload to S3 bucket
#bucket = <'S3 Bucket Name'>
bucket = 'sagemaker-us-west-2-500842391574'
download('https://s3-us-west-2.amazonaws.com/robostig-assets-us-west-2/train.rec')
upload2s3('train', 'train.rec')
download('https://s3-us-west-2.amazonaws.com/robostig-assets-us-west-2/valid.rec')
upload2s3('validation', 'valid.rec')

## Training parameters
__blah blah blah__

In [41]:
# Create the SageMaker permission and environmental variables
role = get_execution_role()
containers = {'us-west-2': '433757028032.dkr.ecr.us-west-2.amazonaws.com/image-classification:latest',
              'us-east-1': '811284229777.dkr.ecr.us-east-1.amazonaws.com/image-classification:latest',
              'us-east-2': '825641698319.dkr.ecr.us-east-2.amazonaws.com/image-classification:latest',
              'eu-west-1': '685385470294.dkr.ecr.eu-west-1.amazonaws.com/image-classification:latest',
              'ap-northeast-1': '501404015308.dkr.ecr.ap-northeast-1.amazonaws.com/image-classification:latest'}
training_image = containers[boto3.Session().region_name]

In [42]:
# The algorithm supports multiple network depth (number of layers). They are 18, 34, 50, 101, 152 and 200
# For this training, we will use 18 layers
num_layers = '50'
# we need to specify the input image shape for the training data
image_shape = '66,200,3'
# we also need to specify the number of training samples in the training set
# for caltech it is 15420
num_training_samples = '7232'
# specify the number of output classes
num_classes = '7232'
# batch size for training
mini_batch_size =  "64"
# number of epochs
epochs = "11"
# learning rate
learning_rate = "0.01"

In [43]:
# Configure S3 API
s3 = boto3.client('s3')
# create unique job name 
job_name_prefix = 'ImageClassification'
timestamp = time.strftime('-%Y-%m-%d-%H-%M-%S', time.gmtime())
job_name = job_name_prefix + timestamp
training_params = \
{
    # specify the training docker image
    "AlgorithmSpecification": {
        "TrainingImage": training_image,
        "TrainingInputMode": "File"
    },
    "RoleArn": role,
    "OutputDataConfig": {
        "S3OutputPath": 's3://{}/{}/output'.format(bucket, job_name_prefix)
    },
    "ResourceConfig": {
        "InstanceCount": 1,
        "InstanceType": "ml.p2.xlarge",
        "VolumeSizeInGB": 50
    },
    "TrainingJobName": job_name,
    "HyperParameters": {
        "image_shape": image_shape,
        "num_layers": str(num_layers),
        "num_training_samples": str(num_training_samples),
        "num_classes": str(num_classes),
        "mini_batch_size": str(mini_batch_size),
        "epochs": str(epochs),
        "learning_rate": str(learning_rate)
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 360000
    },
#Training data should be inside a subdirectory called "train"
#Validation data should be inside a subdirectory called "validation"
#The algorithm currently only supports fullyreplicated model (where data is copied onto each machine)
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": 's3://{}/train/'.format(bucket),
                    "S3DataDistributionType": "FullyReplicated"
                }
            },
            "ContentType": "application/x-recordio",
            "CompressionType": "None"
        },
        {
            "ChannelName": "validation",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": 's3://{}/validation/'.format(bucket),
                    "S3DataDistributionType": "FullyReplicated"
                }
            },
            "ContentType": "application/x-recordio",
            "CompressionType": "None"
        }
    ]
}
print('Training job name: {}'.format(job_name))
print('\nInput Data Location: {}'.format(training_params['InputDataConfig'][0]['DataSource']['S3DataSource']))

Training job name: ImageClassification-2018-06-08-18-56-40

Input Data Location: {'S3DataType': 'S3Prefix', 'S3Uri': 's3://sagemaker-us-west-2-500842391574/train/', 'S3DataDistributionType': 'FullyReplicated'}


In [44]:
# create the Amazon SageMaker training job
sagemaker = boto3.client(service_name='sagemaker')
sagemaker.create_training_job(**training_params)

# confirm that the training job has started
status = sagemaker.describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']
print('Training job current status: {}'.format(status))

try:
    # wait for the job to finish and report the ending status
    sagemaker.get_waiter('training_job_completed_or_stopped').wait(TrainingJobName=job_name)
    training_info = sagemaker.describe_training_job(TrainingJobName=job_name)
    status = training_info['TrainingJobStatus']
    print("Training job ended with status: " + status)
except:
    print('Training failed to start')
     # if exception is raised, that means it has failed
    message = sagemaker.describe_training_job(TrainingJobName=job_name)['FailureReason']
    print('Training failed with the following error: {}'.format(message))

Training job current status: InProgress
Training failed to start
Training failed with the following error: ClientError: image_shape must be smaller than the actual train image size.


In [45]:
training_info = sagemaker.describe_training_job(TrainingJobName=job_name)
status = training_info['TrainingJobStatus']
print("Training job ended with status: " + status)

Training job ended with status: Failed


---
# Appendix A: RecordIO Format

In [19]:
try:
    import multiprocessing
except ImportError:
    multiprocessing = None

def transform(x, y):
    """
    Reshape the numpy arrays as 4D Tensors for MXNet.
    
    Arguments:
    x -- Numpy Array of input images
    y -- Numpy Array of labels
    
    Returns:
    x -- Numpy Array as (NCHW).
    y -- Label as Column vector.
    """
    data  = x.reshape(-1, 3, 66, 200)
    label = y.reshape(-1, 1)
    return data, label

def load_data(f_path):
    """
    Retrieves and loads the training/testing data.
    
    Arguments:
    f_path -- Location for the training/testing input dataset.
    
    Returns:
    Pre-processed training and testing data along with training and testing labels.
    """
    train_x = np.load(f_path+'/train_X.npy')
    train_y = np.load(f_path+'/train_Y.npy')
#    X_train, y_train = transform(train_x, train_y)
    valid_x = np.load(f_path+'/valid_X.npy')
    valid_y = np.load(f_path+'/valid_Y.npy')
#    X_valid, y_valid = transform(valid_x, valid_y)
    return train_x, train_y, valid_x, valid_y
#    return X_train, y_train, X_valid, y_valid

In [20]:
# Load the data created from Module 1
train_X, train_y, valid_X, valid_y = load_data('/tmp/data')

In [21]:
# Build the make_list function
def make_lst(data, label, name):
    # Create local repository for the images based on name
    if not os.path.exists('./'+name):
        os.mkdir('./'+name)
        
    # Create the lst file
    lst_file = './'+name+'.lst'
    
    # Iterate through the numpy arrays and save as `.jpg`
    # and update the index file
    for i in range(len(data)):
        img = data[i]
        img_name = name+'/'+str(i)+'.jpg'
        imsave(img_name, img)
        with open(lst_file, 'a') as f:
            f.write("{}\t{}\t{}\n".format(str(i), str(label[i]), img_name))
            f.flush()
            f.close()

In [22]:
# Create `train.lst`
make_lst(train_X, train_y, name='train')

In [23]:
# Create `valid.lst'
make_lst(valid_X, valid_y, name='valid')

The image and lst files will be converted to RecordIO file internelly by the image classification algorithm. But if you want do the conversion, the following cell shows how to do it using the [im2rec](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) tool.

In [24]:
# Download the `imrec` tool
download('https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/im2rec.py')

Use the `imrec` tool to convert the RecordIO files as shown below. Additionally, more information on the tool can be found [here](https://mxnet.incubator.apache.org/faq/recordio.html?highlight=recordio).
>__Remember:__ If you are going to use RecordIO, make sure to set the `ContentType` SageMaker training parameter to `application/x-recordio`.

In [25]:
%%bash
# Create the RecordIO binary
python im2rec.py ./train.lst ./ --quality 100 --pass-through
python im2rec.py ./valid.lst ./ --quality 100 --pass-through

Creating .rec file from /home/ec2-user/SageMaker/RoboStig/modules/2_SageMakerImageClassification/train.lst in /home/ec2-user/SageMaker/RoboStig/modules/2_SageMakerImageClassification
multiprocessing not available, fall back to single threaded encoding
time: 0.00031113624572753906  count: 0
time: 0.06610679626464844  count: 1000
time: 0.06459736824035645  count: 2000
time: 0.06700468063354492  count: 3000
time: 0.06483006477355957  count: 4000
time: 0.06484079360961914  count: 5000
time: 0.06489443778991699  count: 6000
time: 0.06630492210388184  count: 7000
time: 0.0629739761352539  count: 8000
time: 0.062425851821899414  count: 9000
time: 0.06277894973754883  count: 10000
time: 0.0627601146697998  count: 11000
time: 0.06265068054199219  count: 12000
time: 0.06285715103149414  count: 13000
time: 0.06509757041931152  count: 14000
Creating .rec file from /home/ec2-user/SageMaker/RoboStig/modules/2_SageMakerImageClassification/valid.lst in /home/ec2-user/SageMaker/RoboStig/modules/2_SageM

Now that you have the data available in the correct format for training, the next step is to upload the `.rec` files and `.lst` files to the S3 bucket created in __Module 1__.

In [27]:
# Enter the S3 Bucket created in Module 1
#bucket = <'S3 Bucket Name'>
bucket = 'sagemaker-us-west-2-500842391574'
# Upload the .rec files to S3
upload2s3('train', 'train.rec')
upload2s3('validation', 'valid.rec')