# Convolutional Neural Networks in ``gluon``

Now let's see how succinctly we can express a convolutional neural network using ``gluon``. You might be relieved to find out that this too requires hardly any more code than a logistic regression. 

In [2]:
from __future__ import print_function
import mxnet as mx
from mxnet import nd, autograd
from mxnet import gluon
import numpy as np
mx.random.seed(1)

## Set the context

In [3]:
ctx = mx.gpu(3)  # runs on GPU 3; use mx.gpu(0) or mx.cpu() to match your hardware

## Grab the MNIST dataset

In [4]:
mnist = mx.test_utils.get_mnist()
batch_size = 64
train_data = mx.io.NDArrayIter(mnist["train_data"], mnist["train_label"], batch_size, shuffle=True)
test_data = mx.io.NDArrayIter(mnist["test_data"], mnist["test_label"], batch_size, shuffle=True)

## Define a convolutional neural network

Again, a few lines are all we need to change the model. Let's add a couple of convolutional layers using ``gluon.nn``.

In [5]:
net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Conv2D(channels=20, kernel_size=5, activation='relu'))
    net.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))
    net.add(gluon.nn.Conv2D(channels=50, kernel_size=5, activation='relu'))
    net.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))
    net.add(gluon.nn.Flatten())
    net.add(gluon.nn.Dense(500, activation="relu"))
    net.add(gluon.nn.Dense(10))
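
To see what the first ``Dense`` layer receives, it can help to trace the feature-map sizes by hand. The sketch below assumes 28x28 single-channel MNIST inputs, no padding, and a convolution stride of 1 (none is set in the layers above). The helper ``conv_out`` is a hypothetical name of ours, not part of ``gluon``.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28    # assumed input height and width (MNIST)
shapes = []  # (channels, height, width) after each layer

size = conv_out(size, kernel=5)            # Conv2D(channels=20, kernel_size=5)
shapes.append((20, size, size))
size = conv_out(size, kernel=2, stride=2)  # MaxPool2D(pool_size=2, strides=2)
shapes.append((20, size, size))
size = conv_out(size, kernel=5)            # Conv2D(channels=50, kernel_size=5)
shapes.append((50, size, size))
size = conv_out(size, kernel=2, stride=2)  # MaxPool2D(pool_size=2, strides=2)
shapes.append((50, size, size))

flattened = 50 * size * size               # features seen by Dense(500)
print(shapes, flattened)
```

Under these assumptions the spatial size shrinks from 28 to 24, 12, 8 and finally 4, so ``Flatten`` hands 50 * 4 * 4 = 800 features to the dense layer. ``gluon`` infers this input size for us, which is why we never wrote it down.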

## Parameter initialization


In [6]:
net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)
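
For intuition about what ``magnitude=2.24`` does, here is a rough sketch of the range the Xavier initializer samples from. It assumes the initializer's uniform sampling with the "average" fan factor, which we believe are its defaults. The function ``xavier_uniform_bound`` is a hypothetical helper of ours, not an MXNet API.

```python
import math

def xavier_uniform_bound(shape, magnitude=2.24):
    """Bound b such that weights are drawn from U(-b, b).

    Assumes uniform sampling with factor = (fan_in + fan_out) / 2.
    `shape` is (out_channels, in_channels, *kernel) for a convolution
    or (out_units, in_units) for a dense layer.
    """
    receptive_field = 1
    for dim in shape[2:]:
        receptive_field *= dim
    fan_in = shape[1] * receptive_field
    fan_out = shape[0] * receptive_field
    factor = (fan_in + fan_out) / 2.0
    return math.sqrt(magnitude / factor)

bound_conv1 = xavier_uniform_bound((20, 1, 5, 5))  # first Conv2D layer
bound_dense = xavier_uniform_bound((500, 800))     # Dense(500), assuming 800 inputs
print(bound_conv1, bound_dense)
```

The point of the scaling is that layers with more incoming and outgoing connections start with smaller weights, which keeps activations from growing or shrinking too quickly as they pass through the network.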

## Softmax cross-entropy loss

In [7]:
loss = gluon.loss.SoftmaxCrossEntropyLoss()
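
This loss takes the raw scores (logits) from the last ``Dense`` layer together with integer class labels, which is why the network's output layer has no activation. As a minimal sketch of the quantity computed for a single example, written in plain Python rather than with ``gluon``, and with ``softmax_cross_entropy`` a hypothetical name of ours:

```python
import math

def softmax_cross_entropy(logits, label):
    """Negative log-probability of class `label` under softmax(logits)."""
    m = max(logits)  # subtract the max before exponentiating for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# With ten equal scores every class has probability 1/10.
uniform_loss = softmax_cross_entropy([0.0] * 10, label=3)

# A large score on the correct class drives the loss toward zero.
confident_loss = softmax_cross_entropy([10.0] + [0.0] * 9, label=0)
print(uniform_loss, confident_loss)
```

A freshly initialized ten-class network should therefore start with a loss near log(10), roughly 2.3, and anything well below that means the model has learned something.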

## Optimizer

In [8]:
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .1})
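
With only a learning rate configured, each call to ``trainer.step(batch_size)`` amounts to a plain gradient-descent update in which the gradient, summed over the batch by ``backward()``, is divided by the batch size. The sketch below illustrates that rule on Python lists. It assumes no momentum or weight decay, and ``sgd_step`` is a hypothetical helper of ours, not the ``gluon`` implementation.

```python
def sgd_step(weights, summed_grads, learning_rate, batch_size):
    """One SGD update using gradients summed over a batch."""
    scale = learning_rate / batch_size
    return [w - scale * g for w, g in zip(weights, summed_grads)]

# Two weights, gradients summed over a batch of 64 examples.
new_weights = sgd_step([1.0, -2.0], [6.4, -12.8],
                       learning_rate=0.1, batch_size=64)
print(new_weights)
```

Dividing by the batch size means the effective step does not grow when we feed larger batches, so the same learning rate stays reasonable if ``batch_size`` changes.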

## Write an evaluation loop to calculate accuracy

In [9]:
def evaluate_accuracy(data_iterator, net):
    acc = mx.metric.Accuracy()
    data_iterator.reset()
    for i, batch in enumerate(data_iterator):
        data = batch.data[0].as_in_context(ctx)
        label = batch.label[0].as_in_context(ctx)
        output = net(data)
        predictions = nd.argmax(output, axis=1)
        acc.update(preds=predictions, labels=label)
    return acc.get()[1]
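
Stripped of the iterator and the metric object, the loop above computes the fraction of examples whose highest-scoring class matches the label. A small plain-Python sketch of that idea, with made-up scores and a hypothetical helper name ``accuracy``:

```python
def accuracy(outputs, labels):
    """Fraction of rows in `outputs` whose argmax equals the label."""
    correct = 0
    for scores, label in zip(outputs, labels):
        predicted = max(range(len(scores)), key=lambda k: scores[k])
        correct += int(predicted == label)
    return correct / float(len(labels))

# Three made-up examples with three classes each; the last one is wrong.
toy_outputs = [[0.1, 2.0, -1.0],
               [3.0, 0.0, 0.5],
               [0.2, 0.1, 0.9]]
toy_labels = [1, 0, 1]
toy_accuracy = accuracy(toy_outputs, toy_labels)
print(toy_accuracy)
```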

## Training loop

In [10]:
epochs = 10

for e in range(epochs):
    train_data.reset()
    moving_loss = 0.
    for i, batch in enumerate(train_data):
        data = batch.data[0].as_in_context(ctx)
        label = batch.label[0].as_in_context(ctx)
        with autograd.record():
            output = net(data)
            cross_entropy = loss(output, label)
        cross_entropy.backward()
        trainer.step(data.shape[0])  # normalize the gradient by the batch size

        # Exponential moving average of the per-batch mean loss
        moving_loss = .99 * moving_loss + .01 * nd.mean(cross_entropy).asscalar()

    test_accuracy = evaluate_accuracy(test_data, net)
    train_accuracy = evaluate_accuracy(train_data, net)
    print("Epoch %s. Loss: %s, Train_acc %s, Test_acc %s" % (e, moving_loss, train_accuracy, test_accuracy))    

Epoch 0. Loss: 0.0783619640677, Train_acc 0.982342750533, Test_acc 0.983180732484
Epoch 1. Loss: 0.0434696892554, Train_acc 0.988439498934, Test_acc 0.987161624204
Epoch 2. Loss: 0.0282524623064, Train_acc 0.991038113006, Test_acc 0.98736066879
Epoch 3. Loss: 0.0194726106777, Train_acc 0.992620602345, Test_acc 0.988753980892
Epoch 4. Loss: 0.0135390548931, Train_acc 0.994869402985, Test_acc 0.990246815287
Epoch 5. Loss: 0.0102831624521, Train_acc 0.996018789979, Test_acc 0.990644904459
Epoch 6. Loss: 0.00781261315504, Train_acc 0.997101545842, Test_acc 0.991739649682
Epoch 7. Loss: 0.0062510033702, Train_acc 0.997451359275, Test_acc 0.991042993631
Epoch 8. Loss: 0.00542135675604, Train_acc 0.997667910448, Test_acc 0.991242038217
Epoch 9. Loss: 0.00397699619535, Train_acc 0.997901119403, Test_acc 0.99134156051
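
A note on reading the ``Loss`` column: it is an exponential moving average that restarts from zero every epoch, so early in an epoch it underestimates the true loss. With several hundred batches per epoch that bias has largely vanished by the time the value is printed. The sketch below shows the effect and one common correction, dividing by ``1 - decay ** steps``. The correction is our addition for illustration, and ``moving_average`` is a hypothetical helper, not something the training loop uses.

```python
def moving_average(values, decay=0.99, correct_bias=False):
    """Exponential moving average started at zero, optionally bias-corrected."""
    avg = 0.0
    steps = 0
    for v in values:
        avg = decay * avg + (1.0 - decay) * v
        steps += 1
    if correct_bias and steps > 0:
        avg /= 1.0 - decay ** steps
    return avg

constant_losses = [2.0] * 10  # pretend every batch had a loss of 2.0
raw = moving_average(constant_losses)
corrected = moving_average(constant_losses, correct_bias=True)
print(raw, corrected)
```

After only ten batches the raw average is still far below 2.0, while the corrected one recovers it.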


## Conclusion

You might notice that by using ``gluon``, we get code that runs much faster than a version written from scratch, whether on CPU or GPU. That's largely because ``gluon`` can call down to highly optimized layer implementations written in C++.

For whinges or inquiries, [open an issue on GitHub](https://github.com/zackchase/mxnet-the-straight-dope).