# AWS re:Invent 2019
## Train and deploy custom deep learning models with AWS DeepLens and Amazon SageMaker
## Lab 2. Train a Classification Model

When the notebook launches for the first time, select **conda_mxnet_p36** as the kernel.

#### Install GluonCV and the required Python packages.
See the links below for GluonCV's `model_zoo` and `utils` packages.
- `model_zoo`: [https://gluon-cv.mxnet.io/model_zoo/index.html](https://gluon-cv.mxnet.io/model_zoo/index.html)
- `utils`: [https://gluon-cv.mxnet.io/api/utils.html](https://gluon-cv.mxnet.io/api/utils.html)

In [None]:
# Run only once to install the gluoncv package with the following code:
!pip install gluoncv==0.5.0

In [None]:
import zipfile, os
from gluoncv.utils import download

file_url = 'https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/classification/minc-2500-tiny.zip'
zip_file = download(file_url, path='./')
with zipfile.ZipFile(zip_file, 'r') as zin:
    zin.extractall(os.path.expanduser('./'))

Hyperparameters
----------

First, let's import the other libraries we need.



In [None]:
import mxnet as mx
import numpy as np
import os, time, shutil

from mxnet import gluon, image, init, nd
from mxnet import autograd as ag
from mxnet.gluon import nn
from mxnet.gluon.data.vision import transforms
from gluoncv.utils import makedirs
from gluoncv.model_zoo import get_model

We set the hyperparameters as follows:



In [None]:
# class: brown, polar, no bear
classes = 3

epochs = 10
lr = 0.001
per_device_batch_size = 256
num_gpus = 1
num_workers = 8
ctx = [mx.gpu(i) for i in range(num_gpus)] if num_gpus > 0 else [mx.cpu()]
batch_size = per_device_batch_size * max(num_gpus, 1)

Things to keep in mind:

1. `epochs = 10` is enough for this lab's small dataset. In your own experiments you can increase it, for instance to 40.

2. `per_device_batch_size` can be larger if your device has enough memory.

3. Remember to tune `num_gpus` and `num_workers` according to your machine.

4. A pre-trained model already starts from good weights, so we can start with a small `lr`.
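
To make the effect of `per_device_batch_size` and `num_gpus` concrete, here is a minimal sketch in plain Python. It mirrors the `batch_size` formula from the cell above; the helper names and the dataset size of 1000 images are hypothetical, and the batch count assumes the data loader keeps the last, smaller batch.

```python
import math

def effective_batch_size(per_device_batch_size, num_gpus):
    """Samples consumed per training step across all devices.

    max(num_gpus, 1) keeps the CPU-only case (num_gpus = 0) valid.
    """
    return per_device_batch_size * max(num_gpus, 1)

def batches_per_epoch(num_samples, batch_size):
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

# 1000 is a made-up dataset size used only for illustration.
for gpus in (0, 1, 4):
    bs = effective_batch_size(256, gpus)
    print('num_gpus=%d -> batch_size=%d, batches per epoch=%d'
          % (gpus, bs, batches_per_epoch(1000, bs)))
```

With more GPUs the total batch grows, so each epoch takes fewer parameter updates; that is one reason to revisit `lr` and `epochs` when you change the device count.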

Data Augmentation
-----------------

In transfer learning, data augmentation can also help.
We use the following augmentation in training:

1. Randomly crop the image and resize it to 224x224
2. Randomly flip the image horizontally
3. Randomly jitter color and add noise
4. Transpose the data from height \* width \* num_channels to num_channels \* height \* width, and map values from [0, 255] to [0, 1]
5. Normalize with the mean and standard deviation from the ImageNet dataset.




In [None]:
jitter_param = 0.4
lighting_param = 0.1

transform_train = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomFlipLeftRight(),
    transforms.RandomColorJitter(brightness=jitter_param, 
                                 contrast=jitter_param,
                                 saturation=jitter_param),
    transforms.RandomLighting(lighting_param),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

transform_test = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
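
The last two augmentation steps (layout change plus scaling, then normalization) are easy to check by hand. The sketch below reimplements them in plain Python on a toy 1x2 image so that the arithmetic is visible. It is only an illustration of what `ToTensor` and `Normalize` are described as doing above, not the library's implementation, and the helper names are hypothetical.

```python
MEAN = [0.485, 0.456, 0.406]  # ImageNet channel means used in the cell above
STD = [0.229, 0.224, 0.225]   # ImageNet channel standard deviations

def to_tensor(img_hwc):
    """Convert a height x width x channel image with values in [0, 255]
    to channel x height x width with values in [0, 1]."""
    height, width, channels = len(img_hwc), len(img_hwc[0]), len(img_hwc[0][0])
    return [[[img_hwc[h][w][c] / 255.0 for w in range(width)]
             for h in range(height)]
            for c in range(channels)]

def normalize(img_chw, mean=MEAN, std=STD):
    """Subtract the per-channel mean and divide by the per-channel std."""
    return [[[(value - mean[c]) / std[c] for value in row] for row in plane]
            for c, plane in enumerate(img_chw)]

# Toy image: 1 row, 2 pixels, 3 channels (R, G, B).
toy_image = [[[255, 0, 128], [0, 255, 128]]]

chw = to_tensor(toy_image)
normalized = normalize(chw)

print('shape (C, H, W):', (len(chw), len(chw[0]), len(chw[0][0])))
print('red channel after scaling:', chw[0][0])
print('red channel after normalization:', normalized[0][0])
```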

With the data augmentation functions, we can define our data loaders:



In [None]:
path = 'data'

train_path = os.path.join(path, 'train')
val_path = os.path.join(path, 'val')
test_path = os.path.join(path, 'test')

train_data = gluon.data.DataLoader(
    gluon.data.vision.ImageFolderDataset(train_path).transform_first(transform_train),
    batch_size=batch_size, shuffle=True, num_workers=num_workers)

val_data = gluon.data.DataLoader(
    gluon.data.vision.ImageFolderDataset(val_path).transform_first(transform_test),
    batch_size=batch_size, shuffle=False, num_workers=num_workers)

test_data = gluon.data.DataLoader(
    gluon.data.vision.ImageFolderDataset(test_path).transform_first(transform_test),
    batch_size=batch_size, shuffle=False, num_workers=num_workers)

Note that only ``train_data`` uses ``transform_train``, while
``val_data`` and ``test_data`` use ``transform_test`` to produce deterministic
results for evaluation.
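
It also helps to know how a folder-based dataset turns directory names into integer labels. To my understanding, `ImageFolderDataset` numbers the class sub-folders in sorted order. The sketch below reproduces that rule with the standard library only; the folder names `brown`, `no` and `polar` are assumed here for illustration, and `folder_labels` is a hypothetical helper, not a GluonCV function.

```python
import os
import tempfile

def folder_labels(root):
    """Map each class sub-folder under root to an integer label,
    numbering the folders in sorted (alphabetical) order."""
    folders = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name)))
    return {name: index for index, name in enumerate(folders)}

# Build a throwaway directory tree with three assumed class folders.
with tempfile.TemporaryDirectory() as root:
    for name in ('polar', 'brown', 'no'):
        os.makedirs(os.path.join(root, name))
    labels = folder_labels(root)

print(labels)
```

Under this assumption, any list of class names used at prediction time has to follow the same sorted order as the folders.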

Model and Trainer
-----------------

We use a pre-trained [``MobileNet1.0``](https://arxiv.org/pdf/1704.04861.pdf) model, which is useful for mobile and embedded vision applications due to its smaller model size and complexity.

![alt text](https://3.bp.blogspot.com/-ujGePiv1gZ8/WUBjrgwrPmI/AAAAAAAAB14/zOw9URnrMnIbe7Vv8ftYT4PsnH7S-gJIQCLcBGAs/s1600/image1.png "MobileNet1.0")

In [None]:
model_name = 'MobileNet1.0'

Here we introduce a common technique in transfer learning: fine-tuning. As shown in the figure below, **fine-tuning** consists of the following steps:

1. Load the pre-trained model (e.g. `MobileNet1.0`).
2. Replace the output layer with a new one whose output size equals the number of categories in the target dataset, and randomly initialize the parameters of this layer.
3. Train the target model on the target dataset.

![alt text](https://www.d2l.ai/_images/finetune.svg "Fine tuning")

In [None]:
finetune_net = get_model(model_name, pretrained=True)

with finetune_net.name_scope():
    finetune_net.output = nn.Dense(classes)
finetune_net.output.initialize(init.Xavier(), ctx = ctx)
finetune_net.collect_params().reset_ctx(ctx)
finetune_net.hybridize()

trainer = gluon.Trainer(finetune_net.collect_params(), 'adam', 
                        {'learning_rate': lr})

metric = mx.metric.Accuracy()
L = gluon.loss.SoftmaxCrossEntropyLoss()
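
To give a feel for what "randomly initialize" means for the new output layer, here is a rough sketch of Xavier-style uniform initialization in plain Python. It assumes the averaged fan-in/fan-out variant with magnitude 3, which I believe matches the defaults of `init.Xavier()`, and it assumes a 1024-wide feature vector feeding the 3-class layer. Both are assumptions for illustration, and `xavier_uniform` is a hypothetical helper.

```python
import math
import random

def xavier_uniform(fan_out, fan_in, magnitude=3.0, seed=0):
    """Draw a fan_out x fan_in weight matrix from U(-bound, bound),
    with bound = sqrt(magnitude / ((fan_in + fan_out) / 2))."""
    bound = math.sqrt(magnitude / ((fan_in + fan_out) / 2.0))
    rng = random.Random(seed)
    weight = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
              for _ in range(fan_out)]
    return weight, bound

# fan_in=1024 is an assumed feature width; fan_out=3 matches `classes`.
weight, bound = xavier_uniform(fan_out=3, fan_in=1024)

print('weight shape:', (len(weight), len(weight[0])))
print('sampling bound: %.4f' % bound)
```

The weights of the new layer start small and centered on zero, while every other layer keeps its pre-trained values.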

We define an evaluation function for validation and testing.

In [None]:
def test(net, val_data, ctx):
    metric = mx.metric.Accuracy()
    for i, batch in enumerate(val_data):
        data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0, even_split=False)
        label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0, even_split=False)
        outputs = [net(X) for X in data]
        metric.update(label, outputs)

    return metric.get()
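
The accuracy reported by this function is the fraction of samples whose highest-scoring class equals the label. A minimal plain-Python sketch of that computation, with made-up scores and labels, looks like this:

```python
def accuracy(labels, scores):
    """Fraction of samples whose arg-max score matches the label."""
    correct = 0
    for label, row in zip(labels, scores):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)

# Hypothetical network outputs for four images and three classes.
scores = [[2.0, 0.1, -1.0],   # predicts class 0
          [0.3, 0.2, 1.5],    # predicts class 2
          [0.0, 0.9, 0.4],    # predicts class 1
          [1.2, 1.1, 0.0]]    # predicts class 0
labels = [0, 2, 2, 0]

print('accuracy:', accuracy(labels, scores))  # 3 of 4 correct -> 0.75
```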

Training Loop
-------------

The main training loop is as follows.

In [None]:
num_batch = len(train_data)

for epoch in range(epochs):
    tic = time.time()
    train_loss = 0
    metric.reset()

    for i, batch in enumerate(train_data):
        data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0, even_split=False)
        label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0, even_split=False)
        with ag.record():
            outputs = [finetune_net(X) for X in data]
            loss = [L(yhat, y) for yhat, y in zip(outputs, label)]
        for l in loss:
            l.backward()

        trainer.step(batch_size)
        train_loss += sum([l.mean().asscalar() for l in loss]) / len(loss)

        metric.update(label, outputs)

    _, train_acc = metric.get()
    train_loss /= num_batch

    _, val_acc = test(finetune_net, val_data, ctx)

    print('[Epoch %d] Train-acc: %.3f, loss: %.3f | Val-acc: %.3f | time: %.1f' %
             (epoch, train_acc, train_loss, val_acc, time.time() - tic))

_, test_acc = test(finetune_net, test_data, ctx)
print('[Finished] Test-acc: %.3f' % (test_acc))
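
When reading the loss values printed by the loop, it helps to know what a softmax cross-entropy value means. The sketch below computes it by hand for a single sample in plain Python. It illustrates the standard formula and is not the Gluon implementation. With three classes, a model that has no preference yet gives a loss of ln 3, so values well below that indicate the new output layer is learning.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(logits, label):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[label])

# Hypothetical raw outputs for one image whose true class is 0.
uniform_loss = softmax_cross_entropy([0.0, 0.0, 0.0], 0)
confident_loss = softmax_cross_entropy([4.0, 0.0, 0.0], 0)

print('no preference: %.3f (ln 3 = %.3f)' % (uniform_loss, math.log(3)))
print('confident and correct: %.3f' % confident_loss)
```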

Predict with the Fine-tuned Model
-------------

We can now test the fine-tuned model on a few sample images.

In [None]:
%matplotlib inline
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

from gluoncv.utils import viz, download

Let's test with the first picture.

In [None]:
plt.rcParams['figure.figsize'] = (15, 9)

img = image.imread('samples/bear00.jpg')
viz.plot_image(img)
plt.show()

In [None]:
transform_fn = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

img = transform_fn(img)

In [None]:
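# Use a single device for inference; fall back to the CPU when no GPU is configured.
ctx = mx.gpu(0) if num_gpus > 0 else mx.cpu()
# Add a batch dimension and move the image to the chosen device.
pred = finetune_net(img.expand_dims(0).as_in_context(ctx))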

In [None]:
# The order must match the label indices of the training folders (alphabetical by folder name).
class_names = ['brown', 'no', 'polar']

topK = 3
ind = nd.topk(pred, k=topK).astype('int')[0]
for i in range(topK):
    print('[%s], with probability %.1f%%'%
         (class_names[ind[i].asscalar()], nd.softmax(pred)[0][ind[i]].asscalar()*100))
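
The ranking step above can be followed without MXNet. Here is a small plain-Python sketch of the same idea: turn raw scores into probabilities with softmax, then list the classes from most to least likely. The scores are made up and `top_k` is a hypothetical helper.

```python
import math

def top_k(logits, class_names, k):
    """Return the k most likely (class name, probability) pairs."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(class_names[i], probs[i]) for i in order[:k]]

# Hypothetical raw outputs for one image.
ranked = top_k([0.5, -1.0, 3.0], ['brown', 'no', 'polar'], k=3)

for name, prob in ranked:
    print('[%s], with probability %.1f%%' % (name, prob * 100))
```

Because softmax preserves the order of the raw scores, ranking by probability and ranking by raw output give the same top classes.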

Let's test with another picture.

In [None]:
img = image.imread('samples/bear02.jpg')
viz.plot_image(img)
plt.show()

In [None]:
transform_fn = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

img = transform_fn(img)

pred = finetune_net(img.expand_dims(0).as_in_context(ctx))

ind = nd.topk(pred, k=topK).astype('int')[0]
for i in range(topK):
    print('[%s], with probability %.1f%%'%
         (class_names[ind[i].asscalar()], nd.softmax(pred)[0][ind[i]].asscalar()*100))

This time let's try a more difficult example.

In [None]:
img = image.imread('samples/bear06.jpg')
viz.plot_image(img)
plt.show()

In [None]:
transform_fn = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

img = transform_fn(img)

pred = finetune_net(img.expand_dims(0).as_in_context(ctx))

ind = nd.topk(pred, k=topK).astype('int')[0]
for i in range(topK):
    print('[%s], with probability %.1f%%'%
         (class_names[ind[i].asscalar()], nd.softmax(pred)[0][ind[i]].asscalar()*100))

Congratulations! You have built your own object classification model based on a custom dataset.