## Fashion-MNIST Training with MXNet and Gluon using SageMaker

Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from 10 classes. Its authors intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms: it shares the same image size and the same structure of training and testing splits.

This tutorial shows how to train and test a Fashion-MNIST model on SageMaker using MXNet and the Gluon API.



In [None]:
import os
import boto3
import numpy as np  # needed by input_transformer below
import sagemaker
from sagemaker.mxnet import MXNet
from mxnet import gluon, nd
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

!~/sample-notebooks/sagemaker-python-sdk/mxnet_gluon_mnist/setup.sh
role = get_execution_role()

## Download training and test data

In [None]:
def input_transformer(data, label):
    return nd.transpose(data.astype(np.float32), (2,0,1))/255, label.astype(np.float32)

train = gluon.data.vision.FashionMNIST('./data_fmnist/train', train=True, transform=input_transformer)
test = gluon.data.vision.FashionMNIST('./data_fmnist/test', train=False, transform=input_transformer)
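
To make the transform concrete, here is a minimal NumPy-only sketch of what `input_transformer` does to the image of one sample. It assumes, as the transpose implies, that each raw image arrives in height x width x channel layout with `uint8` values. The function name `to_chw_float` and the random stand-in image are illustrative and not part of the original notebook.

```python
import numpy as np

def to_chw_float(image_hwc):
    """Convert a (H, W, C) uint8 image to (C, H, W) float32 scaled to [0, 1]."""
    return np.transpose(image_hwc.astype(np.float32), (2, 0, 1)) / 255

# Stand-in for one Fashion-MNIST sample: 28x28 pixels, 1 grayscale channel.
rng = np.random.default_rng(0)
sample = rng.integers(0, 256, size=(28, 28, 1), dtype=np.uint8)

converted = to_chw_float(sample)
print(converted.shape, converted.dtype)
```

Putting the channel axis first matches the layout that MXNet convolution layers expect by default.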

### Labels
Each training and test example is assigned to one of the following labels:

| Label | Description |
| --- | --- |
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |
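
When inspecting predictions it is handy to have this table available in code. The sketch below copies the descriptions from the table above into a list indexed by label id; `FASHION_MNIST_LABELS` and `label_name` are illustrative names, not part of any library.

```python
# Class descriptions indexed by label id, copied from the table above.
FASHION_MNIST_LABELS = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_name(label):
    """Return the description for a label id (accepts ints or floats such as 9.0)."""
    return FASHION_MNIST_LABELS[int(label)]

print(label_name(9))
```

The `int()` conversion is there because the transform in this notebook casts labels to `float32`.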

## Uploading the data

We use the `sagemaker.Session.upload_data` function to upload our datasets to an S3 location. The return value `inputs` identifies the location -- we will use this later when we start the training job.

In [None]:
inputs = sagemaker_session.upload_data(path='data_fmnist', key_prefix='data_fmnist/fmnist')
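
`inputs` is a plain S3 URI string. For a directory upload, the SDK documentation describes the returned form as `s3://{bucket}/{key_prefix}`. The sketch below splits such a URI into bucket and prefix using only the standard library. The bucket name and prefix are made-up placeholders, and `split_s3_uri` is a hypothetical helper, not an SDK function.

```python
from urllib.parse import urlparse

def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError("not an S3 URI: %r" % uri)
    return parsed.netloc, parsed.path.lstrip("/")

# Placeholder URI; a real one comes back from upload_data.
example_uri = "s3://example-bucket/example-prefix/fmnist"
bucket, prefix = split_s3_uri(example_uri)
print(bucket, prefix)
```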

## Implement the training function

We need to provide a training script that can run on the SageMaker platform. The script is essentially the same as one you would write for local training, except that it must define a `train` function. When SageMaker calls this function, it passes in arguments that describe the training environment. The script printed below shows how this works.
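
As a rough guide to the shape such a function can take, here is a hedged skeleton. It assumes SageMaker passes keyword arguments named `hyperparameters` and `channel_input_dirs`, and that the data arrives on a channel named `training`; these are assumptions about the container interface for this SDK version, and the actual `fmnist_cnn.py` may differ. The body only resolves settings and does no training.

```python
def train(hyperparameters, channel_input_dirs, **kwargs):
    """Skeleton entry point: resolve settings, then (in a real script) train.

    The argument names are assumptions about what the SageMaker MXNet
    container passes in; other environment details land in **kwargs.
    """
    settings = {
        "batch_size": int(hyperparameters.get("batch_size", 100)),
        "epochs": int(hyperparameters.get("epochs", 10)),
        "learning_rate": float(hyperparameters.get("learning_rate", 0.01)),
        "momentum": float(hyperparameters.get("momentum", 0.9)),
        "log_interval": int(hyperparameters.get("log_interval", 100)),
        "data_dir": channel_input_dirs["training"],
    }
    # A real script would build the network here, iterate over the data
    # found in settings["data_dir"], and return the trained model.
    return settings

# Local smoke test with hand-written arguments (the path is a placeholder).
resolved = train(
    hyperparameters={"batch_size": 100, "epochs": 10},
    channel_input_dirs={"training": "/opt/ml/input/data/training"},
    num_gpus=0,
)
print(resolved)
```

Accepting `**kwargs` lets the function ignore environment arguments it does not need.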

The script here is an adaptation of the [Gluon MNIST example](https://github.com/apache/incubator-mxnet/blob/master/example/gluon/mnist.py) provided by the [Apache MXNet](https://mxnet.incubator.apache.org/) project. 

In [None]:
!cat 'fmnist_cnn.py'

## Set the hyperparameters

In [None]:
batch_size = 100
epochs = 10
learning_rate = 0.01
momentum = 0.9
log_interval = 100
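
For a sense of scale, the arithmetic below works out how many parameter updates these settings imply on the 60,000-image training set. It assumes `log_interval` counts batches between log messages, as in the upstream Gluon MNIST example; `fmnist_cnn.py` may interpret it differently.

```python
import math

train_examples = 60_000  # size of the Fashion-MNIST training set
batch_size = 100
epochs = 10
log_interval = 100

batches_per_epoch = math.ceil(train_examples / batch_size)
total_updates = batches_per_epoch * epochs
logs_per_epoch = batches_per_epoch // log_interval

print(batches_per_epoch, total_updates, logs_per_epoch)  # 600 6000 6
```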

## Run the training script on SageMaker

The `MXNet` class allows us to run our training function on SageMaker infrastructure. We need to configure it with our training script, an IAM role, the number of training instances, and the training instance type. In this case `train_instance_type` is set to `"local"`, so the training job runs in a container on this notebook instance; to train on a managed instance instead, specify an instance type such as `ml.c4.xlarge`.

In [None]:
m = MXNet("fmnist_cnn.py", 
          role=role, 
          train_instance_count=1, 
          train_instance_type="local",
          hyperparameters={'batch_size': batch_size, 
                         'epochs': epochs, 
                         'learning_rate': learning_rate, 
                         'momentum': momentum, 
                         'log_interval': log_interval})

After we've constructed our `MXNet` object, we can fit it using the data we uploaded to S3. SageMaker makes sure our data is available in the local filesystem, so our training script can simply read the data from disk.
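
As a rough illustration of "reading from disk", the sketch below lists the files a script would find under its input directory. SageMaker containers conventionally expose each channel at `/opt/ml/input/data/<channel>`; here a temporary directory stands in for that path, and the `train/` and `test/` subfolders mirror the layout of the `data_fmnist` folder uploaded earlier. The helper `list_channel_files` and the empty placeholder files are illustrative.

```python
import os
import tempfile

def list_channel_files(channel_dir):
    """Return sorted paths (relative to channel_dir) of all files beneath it."""
    found = []
    for root, _dirs, files in os.walk(channel_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, channel_dir)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)

# A temporary directory stands in for /opt/ml/input/data/training.
with tempfile.TemporaryDirectory() as channel_dir:
    for rel in ("train/train-images-idx3-ubyte.gz",
                "test/t10k-images-idx3-ubyte.gz"):
        path = os.path.join(channel_dir, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "wb").close()  # empty placeholder file
    files = list_channel_files(channel_dir)

print(files)
```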


In [None]:
m.fit(inputs)

After training, we use the `MXNet` object to build and deploy an `MXNetPredictor` object. This creates a SageMaker endpoint that we can use to perform inference.

The endpoint lets us perform inference on JSON-encoded multi-dimensional arrays.
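
For instance, a single 28x28 grayscale image becomes a nested list of shape (1, 1, 28, 28), that is batch, channel, height, width, before it is encoded. A minimal standard-library sketch, using an all-zero image as a stand-in and a hypothetical `nested_shape` helper:

```python
import json

def nested_shape(value):
    """Shape of a non-empty, rectangular nested list, e.g. [[1, 2]] -> (1, 2)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0]
    return tuple(shape)

# Stand-in image: 28 rows of 28 zero-valued pixels.
blank_image = [[0.0] * 28 for _ in range(28)]

# Wrap in channel and batch dimensions, then encode and decode as JSON.
payload = json.dumps([[blank_image]])
decoded = json.loads(payload)

print(nested_shape(decoded))  # (1, 1, 28, 28)
```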

In [None]:
predictor = m.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

## Validate the model for use
Finally, we can validate the model. The endpoint returns predictions in response to HTTP POST requests. To make this easier, we again use the Amazon SageMaker Python SDK and specify how to serialize requests and deserialize responses for this model.

In [None]:
from sagemaker.predictor import csv_serializer, json_deserializer

predictor.content_type = 'text/csv'
predictor.serializer = csv_serializer
predictor.deserializer = json_deserializer

First, we need to get a single image from the validation dataset:

In [None]:
import numpy as np
import gzip
import struct
import matplotlib.pyplot as plt
%matplotlib inline

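def read_data(label_url, image_url):
    # Label file: 8-byte header (magic number, item count), then one byte per label.
    with gzip.open(label_url) as flbl:
        magic, num = struct.unpack(">II", flbl.read(8))
        # np.frombuffer replaces the deprecated binary mode of np.fromstring.
        label = np.frombuffer(flbl.read(), dtype=np.int8)
    # Image file: 16-byte header (magic number, item count, rows, cols), then raw pixels.
    with gzip.open(image_url, 'rb') as fimg:
        magic, num, rows, cols = struct.unpack(">IIII", fimg.read(16))
        image = np.frombuffer(fimg.read(), dtype=np.uint8).reshape(len(label), rows, cols)
    return (label, image)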
(val_lbl, val_img) = read_data(
    'data_fmnist/test/t10k-labels-idx1-ubyte.gz', 'data_fmnist/test/t10k-images-idx3-ubyte.gz')
idx = 32 # This number can be changed to get another image
plt.imshow(val_img[idx], cmap='Greys_r')
plt.axis('off')
plt.show()
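
The `struct.unpack` calls in `read_data` decode the IDX header that precedes the raw bytes: big-endian 32-bit integers giving a magic number (2049 for label files, 2051 for image files), the item count, and, for images, the number of rows and columns. The sketch below builds a tiny two-image IDX file in memory and parses it the same way; the 2x2 pixel data is made up.

```python
import gzip
import io
import struct

# Build a miniature gzip-compressed IDX image file: 2 images of 2x2 pixels.
pixels = bytes(range(8))
raw = struct.pack(">IIII", 2051, 2, 2, 2) + pixels
compressed = gzip.compress(raw)

# Parse it the same way read_data parses the real test images.
with gzip.open(io.BytesIO(compressed), "rb") as fimg:
    magic, num, rows, cols = struct.unpack(">IIII", fimg.read(16))
    body = fimg.read()

# Slice the flat pixel bytes into num images of rows x cols values.
images = [
    [list(body[(i * rows + r) * cols:(i * rows + r + 1) * cols])
     for r in range(rows)]
    for i in range(num)
]
print(magic, num, rows, cols, images)
```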

Now let's try getting a prediction for a single record.

In [None]:
image = nd.array(val_img[idx]).reshape((1,1,28,28)).asnumpy().tolist()
predictor.predict(str(image))
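
What `predict` returns depends on how `fmnist_cnn.py` formats its response, which is not shown here. If the response is a list with one score per class (an assumption), the predicted label is the index of the largest score, which can then be looked up in the Labels table. The scores below are made up for illustration.

```python
def argmax(scores):
    """Return the index of the largest value in a non-empty list."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical per-class scores for labels 0 through 9.
example_scores = [0.01, 0.02, 0.05, 0.02, 0.10, 0.01, 0.04, 0.70, 0.03, 0.02]

predicted_label = argmax(example_scores)
print(predicted_label)  # 7, which the Labels table lists as "Sneaker"
```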

We can also use boto3 (the AWS SDK for Python) to send a prediction request to the SageMaker endpoint:

In [None]:
import boto3
client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(
    EndpointName=predictor.endpoint,
    Body=str(image),
    ContentType='text/csv')
response['Body'].read()
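
Unlike the SDK predictor, this call does no deserialization: `response['Body']` is a stream whose `read()` returns raw bytes. The sketch below decodes such bytes, using `io.BytesIO` as a stand-in for the stream and assuming the endpoint replies with UTF-8 encoded JSON. The payload shown is made up.

```python
import io
import json

# Stand-in for response['Body']; a real body comes from invoke_endpoint.
fake_body = io.BytesIO(b"[0.1, 0.9]")

raw_bytes = fake_body.read()                      # bytes, as read() returns
result = json.loads(raw_bytes.decode("utf-8"))    # back to Python objects
print(result)
```

A stream can only be consumed once, so keep the bytes in a variable if they are needed again.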

## Cleanup

After you have finished with this example, remember to delete the prediction endpoint to release the instance(s) associated with it.

In [None]:
sagemaker.Session().delete_endpoint(predictor.endpoint)