
## Getting Started with MXNet: Training a NN on MNIST

In [0]:
# Check which CUDA toolkit version is installed in this runtime
!nvcc --version

In [0]:
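# Install the MXNet build that matches the CUDA version reported above
# (mxnet-cu100 targets CUDA 10.0)
!pip install mxnet-cu100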

In [3]:
from __future__ import print_function

import mxnet as mx
from mxnet import nd, gluon, autograd
from mxnet.gluon import nn


print(mx.__version__)

1.5.0


### MNIST Dataset

In this notebook we work with the MNIST dataset. It contains 28×28 grayscale images of handwritten digits together with their corresponding labels (the digits 0 through 9). The training split has 60,000 samples and the validation split has 10,000.



In [7]:

# MXNet's default layout is NCHW (batch, channel, height, width), whereas
# each raw MNIST sample is stored as HWC (height, width, channel)

def data_convention_normalization(data):
    """HWC -> CHW: move the channel axis (2) to the front (0) and scale pixels to [0, 1]."""
    return nd.moveaxis(data, 2, 0).astype('float32') / 255


train_data = gluon.data.vision.MNIST(train=True).transform_first(data_convention_normalization)
val_data = gluon.data.vision.MNIST(train=False).transform_first(data_convention_normalization)

print(len(train_data))
print(len(val_data))

60000
10000
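
The transform above can be checked in isolation. The following is a minimal sketch that uses NumPy instead of MXNet's `nd` module, on a randomly generated array standing in for one raw MNIST sample. The `(28, 28, 1)` `uint8` layout of `hwc_image` is an assumption based on the HWC comment in the cell above; the variable names are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for one raw MNIST sample: height x width x channel, uint8.
hwc_image = np.random.randint(0, 256, size=(28, 28, 1), dtype=np.uint8)

# Same steps as data_convention_normalization, written with NumPy:
# move the channel axis (2) to the front (0), cast to float32, scale to [0, 1].
chw_image = np.moveaxis(hwc_image, 2, 0).astype("float32") / 255

print(hwc_image.shape, hwc_image.dtype)
print(chw_image.shape, chw_image.dtype)
```

The pixel values themselves are untouched apart from the scaling; only the axis order changes, so `chw_image[0]` holds the same 28×28 grid as `hwc_image[:, :, 0]`.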


In [8]:
train_loader = gluon.data.DataLoader(train_data, shuffle=True, batch_size=64)
val_loader = gluon.data.DataLoader(val_data, shuffle=False, batch_size=64)

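# Run through one full epoch; after the loop, X and y hold the final batch,
# which can be smaller than batch_size
for X, y in train_loader:
    pass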

print(X.shape)
print(y.shape)


(32, 1, 28, 28)
(32,)
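
The printed batch dimension is 32 rather than 64 because the loop leaves `X` and `y` bound to the last batch, and 60000 is not a multiple of 64. The arithmetic can be sketched as follows, assuming the `DataLoader` keeps the final partial batch, which is consistent with the output above. The variable names are hypothetical.

```python
import math

num_samples = 60000  # size of the training set printed earlier
batch_size = 64      # value passed to the DataLoader

# Number of batches per epoch when the final partial batch is kept.
num_batches = math.ceil(num_samples / batch_size)

# Samples left over for the final batch.
last_batch_size = num_samples - (num_batches - 1) * batch_size

print(num_batches, last_batch_size)  # 938 32
```

Every batch except the last has shape `(64, 1, 28, 28)`.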


In [6]:
drop_prob = 0.2

net = nn.Sequential()
net.add(nn.Flatten(),
        nn.Dense(128, activation='relu'),
        nn.Dropout(drop_prob),
        nn.Dense(10))

net

Sequential(
  (0): Flatten
  (1): Dense(None -> 128, Activation(relu))
  (2): Dropout(p = 0.2, axes=())
  (3): Dense(None -> 10, linear)
)
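
The `None` input sizes in the summary appear because the layers were created without an explicit input size, so Gluon infers it from the first batch that passes through the network. For 1×28×28 inputs, the sizes and the resulting parameter counts can be worked out by hand. This is a rough sketch in plain Python; `dense_param_count` is a hypothetical helper, not part of MXNet.

```python
# Layer sizes taken from the network definition above.
input_features = 1 * 28 * 28  # Flatten turns each (1, 28, 28) image into 784 values
hidden_units = 128
num_classes = 10


def dense_param_count(in_units, out_units):
    """Weights (in_units * out_units) plus one bias per output unit."""
    return in_units * out_units + out_units


hidden_params = dense_param_count(input_features, hidden_units)
output_params = dense_param_count(hidden_units, num_classes)
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 100480 1290 101770
```

`Flatten` and `Dropout` have no trainable parameters, so only the two `Dense` layers contribute to the total.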