# SageMaker Image Classification Algorithm
In this module you will take the training and validation datasets that you created in [Module 1](../1_DataExploration/Data_Exploration.ipynb) and use one of SageMaker's built-in algorithms, the [Image Classification Algorithm](https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html), to predict the steering angle of the vehicle.

In [10]:
import warnings
import os
import numpy as np
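try:
    from scipy.misc import imsave  # removed in SciPy 1.2 and later
except ImportError:
    # Assumed fallback: requires the `imageio` package; unlike imsave it does not rescale pixel values
    from imageio import imwrite as imsave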
import urllib.request
warnings.simplefilter('ignore')



---
# Appendix A: RecordIO Format

In [11]:
def download(url):
    filename = url.split("/")[-1]
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)
    return filename

try:
    import multiprocessing
except ImportError:
    multiprocessing = None

def load_data(f_path):
    """
    Loads the training and validation arrays saved by Module 1 from a local directory.

    Arguments:
    f_path -- Directory containing train_X.npy, train_Y.npy, valid_X.npy and valid_Y.npy.

    Returns:
    Training images, training labels, validation images and validation labels, in that order.
    """
    train_x = np.load(f_path+'/train_X.npy')
    train_y = np.load(f_path+'/train_Y.npy')
    test_x = np.load(f_path+'/valid_X.npy')
    test_y = np.load(f_path+'/valid_Y.npy')
    return train_x, train_y, test_x, test_y

In [12]:
# Download im2rec.py, the MXNet tool used below to pack the listed images into RecordIO files
download('https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/im2rec.py')

'im2rec.py'

In [13]:
# Load the data created from Module 1
train_X, train_y, valid_X, valid_y = load_data('/tmp/data')

In [14]:
# Build the make_lst function
def make_lst(data, label, name):
    # Create a local directory for the images, based on name
    os.makedirs('./'+name, exist_ok=True)

    # Path of the lst (index) file
    lst_file = './'+name+'.lst'

    # Iterate through the numpy arrays, save each image as `.jpg`
    # and write one tab-separated row per image: index, label, relative path.
    # Opening once in 'w' mode avoids duplicate rows when the cell is re-run.
    with open(lst_file, 'w') as f:
        for i in range(len(data)):
            img_name = name+'/'+str(i)+'.jpg'
            imsave(img_name, data[i])
            f.write("{}\t{}\t{}\n".format(str(i), str(label[i]), img_name))
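
Each row of the lst file written by `make_lst` has three tab-separated fields: the image index, the label, and the image path relative to the working directory. Before running `im2rec.py`, it can be useful to confirm that every row has this shape. The following is a minimal sketch using only the standard library; the helper name `check_lst` and the demo file are hypothetical and not part of the original notebook.

```python
import os
import tempfile


def check_lst(lst_path):
    """Return the number of well-formed rows in a .lst file.

    Raises ValueError on the first row that does not look like
    `index<TAB>label<TAB>relative_path`.
    """
    count = 0
    with open(lst_path) as f:
        for line_no, line in enumerate(f, start=1):
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 3:
                raise ValueError(
                    "line {}: expected 3 tab-separated fields, got {}".format(
                        line_no, len(fields)))
            index, label, rel_path = fields
            int(index)    # raises ValueError if the index is not an integer
            float(label)  # raises ValueError if the label is not numeric
            if not rel_path:
                raise ValueError("line {}: empty image path".format(line_no))
            count += 1
    return count


# Demo on a small, made-up lst file written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.lst")
    with open(demo_path, "w") as f:
        f.write("0\t1\tdemo/0.jpg\n")
        f.write("1\t0\tdemo/1.jpg\n")
    n_rows = check_lst(demo_path)

print(n_rows)
```

In the notebook, calling `check_lst('./train.lst')` after `make_lst` should return the same number of rows as `len(train_X)`, assuming the lst file was written fresh and not appended to by an earlier run.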

In [5]:
# Create `train.lst`
make_lst(train_X, train_y, name='train')

In [6]:
# Create `valid.lst`
make_lst(valid_X, valid_y, name='valid')

In [16]:
%%bash
# Create the RecordIO binary
python im2rec.py ./train.lst ./ --quality 100 --pass-through
python im2rec.py ./valid.lst ./ --quality 100 --pass-through

Creating .rec file from /home/ec2-user/SageMaker/RoboStig/modules/2_SageMakerImageClassification/train.lst in /home/ec2-user/SageMaker/RoboStig/modules/2_SageMakerImageClassification
multiprocessing not available, fall back to single threaded encoding
time: 0.0003056526184082031  count: 0
time: 0.06604743003845215  count: 1000
time: 0.06506943702697754  count: 2000
time: 0.06524467468261719  count: 3000
time: 0.06457138061523438  count: 4000
time: 0.06566810607910156  count: 5000
time: 0.06589531898498535  count: 6000
time: 0.0674893856048584  count: 7000
Creating .rec file from /home/ec2-user/SageMaker/RoboStig/modules/2_SageMakerImageClassification/valid.lst in /home/ec2-user/SageMaker/RoboStig/modules/2_SageMakerImageClassification
multiprocessing not available, fall back to single threaded encoding
time: 0.00038242340087890625  count: 0
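
Once both commands finish, `im2rec.py` should have written a `.rec` file and a matching `.idx` file next to each lst file. The sketch below, which uses only the standard library, reports the size of each expected output so that a missing or empty file is easy to spot. The helper name `recordio_outputs` is hypothetical, and the file names assume the `train` and `valid` prefixes used above.

```python
import os


def recordio_outputs(prefix, directory="."):
    """Map each expected output file name to its size in bytes, or None if it is missing."""
    sizes = {}
    for ext in (".rec", ".idx"):
        file_name = prefix + ext
        path = os.path.join(directory, file_name)
        sizes[file_name] = os.path.getsize(path) if os.path.exists(path) else None
    return sizes


# Report on the two datasets packed above
for prefix in ("train", "valid"):
    print(recordio_outputs(prefix))
```

A `None` entry or a size of zero suggests that the corresponding `im2rec.py` run did not complete.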
