# Multilayer perceptrons in ``gluon``

Using `gluon`, we only need two additional lines of code to transform our logistic regression model into a multilayer perceptron.

In [1]:
from __future__ import print_function
import mxnet as mx
from mxnet import nd, autograd
from mxnet import gluon
import numpy as np

We'll also want to set the compute context for our modeling. Feel free to go ahead and change this to mx.gpu(0) if you're running on an appropriately endowed machine.

In [2]:
ctx = mx.cpu()

## The MNIST dataset

In [3]:
mnist = mx.test_utils.get_mnist()
batch_size = 64
num_inputs = 784
num_outputs = 10
train_data = mx.io.NDArrayIter(mnist["train_data"], mnist["train_label"], batch_size, shuffle=True)
test_data = mx.io.NDArrayIter(mnist["test_data"], mnist["test_label"], batch_size, shuffle=True)
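
The value `num_inputs = 784` comes from flattening each 28 by 28 grayscale image into a single vector, which is what the `reshape((-1,784))` calls later in this notebook do. Here is a minimal NumPy sketch of that step. The `fake_batch` array is a made-up stand-in for one batch of images, and its `(batch, 1, 28, 28)` layout is an assumption about how the iterator delivers the data.

```python
import numpy as np

# Stand-in for one batch of MNIST images: (batch, channel, height, width).
fake_batch = np.zeros((64, 1, 28, 28), dtype=np.float32)

# -1 lets NumPy infer the batch dimension; each image becomes a 784-vector.
flat = fake_batch.reshape((-1, 784))
print(flat.shape)  # (64, 784)
```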

## Define the model

*Here's the only real difference from the logistic regression model: we add two lines, one for each hidden layer!*

In [4]:
num_hidden = 256
net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net.add(gluon.nn.Dense(num_outputs))
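
To make the stacked `Dense` layers less of a black box, here is a rough NumPy sketch of the computation they describe: two affine maps each followed by a ReLU, then a final affine map that produces the logits. This is not how `gluon` implements it internally. The helper names (`dense`, `forward`, `params`) are made up for illustration, and the weights are small random placeholders rather than trained values.

```python
import numpy as np

rng = np.random.default_rng(0)
num_inputs, num_hidden, num_outputs = 784, 256, 10

def dense(x, W, b, relu=False):
    """Affine map x @ W.T + b, optionally followed by ReLU."""
    out = x @ W.T + b
    return np.maximum(out, 0) if relu else out

# One (weight, bias) pair per layer; weights have shape (units, in_units).
sizes = [num_inputs, num_hidden, num_hidden, num_outputs]
params = [(rng.normal(scale=0.01, size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x):
    h = dense(x, *params[0], relu=True)   # first hidden layer
    h = dense(h, *params[1], relu=True)   # second hidden layer
    return dense(h, *params[2])           # output logits, no activation

x = rng.normal(size=(64, num_inputs))     # a made-up batch of 64 inputs
logits = forward(x)
n_params = sum(W.size + b.size for W, b in params)
print(logits.shape, n_params)             # (64, 10) 269322
```

The parameter count is just (784 * 256 + 256) + (256 * 256 + 256) + (256 * 10 + 10), so the two extra lines account for nearly all of the model's capacity.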

## Parameter initialization


In [5]:
net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)
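
For intuition about what `mx.init.Xavier(magnitude=2.24)` does, here is a hedged NumPy sketch. It assumes the initializer's defaults are a uniform distribution with the "avg" factor type, so weights are drawn from U(-s, s) with s = sqrt(magnitude / ((fan_in + fan_out) / 2)). The function name `xavier_uniform` is ours, not part of MXNet.

```python
import numpy as np

def xavier_uniform(fan_out, fan_in, magnitude=2.24, seed=0):
    """Sample a (fan_out, fan_in) weight matrix from U(-scale, scale)."""
    factor = (fan_in + fan_out) / 2.0      # "avg" factor type (assumed default)
    scale = np.sqrt(magnitude / factor)
    rng = np.random.default_rng(seed)
    W = rng.uniform(-scale, scale, size=(fan_out, fan_in))
    return W, scale

# Shape of the first hidden layer's weights: 256 units, 784 inputs.
W, scale = xavier_uniform(256, 784)
print(W.shape, bool(np.abs(W).max() <= scale))
```

The point of scaling by the layer's fan-in and fan-out is to keep activations and gradients at a comparable magnitude from layer to layer.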

## Softmax cross-entropy loss

In [6]:
loss = gluon.loss.SoftmaxCrossEntropyLoss()
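
The loss takes raw logits and integer class labels, so the softmax never has to be applied by hand. A minimal NumPy sketch of the same quantity, one loss value per example, might look like the following. The function `softmax_cross_entropy` is our own illustration, not the `gluon` implementation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-example loss: -log softmax(logits)[label], computed stably."""
    # Subtracting the row maximum avoids overflow in exp without changing the result.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

logits = np.zeros((2, 10))      # uniform scores over 10 classes
labels = np.array([3, 7])
losses = softmax_cross_entropy(logits, labels)
print(losses)                   # each entry equals log(10)
```

A loss of log(10), roughly 2.3, is what an uninformed model scores on a 10-class problem, which makes it a handy sanity check for the first few iterations.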

## Optimizer

In [7]:
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .1})
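
With plain `'sgd'` and no momentum or weight decay, each update is just a step against the gradient. As far as we know, the argument passed to `trainer.step` is used to rescale the gradient by one over the batch size, so the sketch below divides a summed gradient accordingly. Both `sgd_step` and the numbers are hypothetical.

```python
import numpy as np

def sgd_step(weight, grad_sum, learning_rate, batch_size):
    """Plain SGD: scale the summed gradient by 1/batch_size, then step."""
    return weight - learning_rate * grad_sum / batch_size

w = np.array([1.0, -2.0])
grad_sum = np.array([6.4, 0.0])   # made-up gradient summed over 64 examples
w_new = sgd_step(w, grad_sum, learning_rate=0.1, batch_size=64)
print(w_new)                      # first weight moves by 0.1 * 6.4 / 64 = 0.01
```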

## Evaluation metric

In [8]:
def evaluate_accuracy(data_iterator, net):
    acc = mx.metric.Accuracy()
    data_iterator.reset()
    for i, batch in enumerate(data_iterator):
        data = batch.data[0].as_in_context(ctx).reshape((-1,784))
        label = batch.label[0].as_in_context(ctx)
        output = net(data)
        predictions = nd.argmax(output, axis=1)
        acc.update(preds=predictions, labels=label)
    return acc.get()[1]
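
Stripped of the iterator and the metric object, the accuracy computation is an argmax followed by a comparison. Here is a small NumPy sketch on made-up scores (the `accuracy` helper is ours):

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = logits.argmax(axis=1)
    return float((predictions == labels).mean())

logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 0.1, 1.5],
                   [0.9, 1.1, 0.0],
                   [0.3, 0.2, 0.1]])
labels = np.array([0, 2, 0, 0])
print(accuracy(logits, labels))   # 3 of 4 correct -> 0.75
```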

## Training loop

In [9]:
epochs = 10
moving_loss = 0.

for e in range(epochs):
    train_data.reset()
    for i, batch in enumerate(train_data):
        data = batch.data[0].as_in_context(ctx).reshape((-1,784))
        label = batch.label[0].as_in_context(ctx)
        with autograd.record():
            output = net(data)
            cross_entropy = loss(output, label)
        # Backpropagate outside the recording scope, then update the parameters.
        cross_entropy.backward()
        trainer.step(data.shape[0])
        
        ##########################
        #  Keep a moving average of the losses
        ##########################
        if i == 0:
            moving_loss = nd.mean(cross_entropy).asscalar()
        else:
            moving_loss = .99 * moving_loss + .01 * nd.mean(cross_entropy).asscalar()
            
    test_accuracy = evaluate_accuracy(test_data, net)
    train_accuracy = evaluate_accuracy(train_data, net)
    print("Epoch %s. Loss: %s, Train_acc %s, Test_acc %s" % (e, moving_loss, train_accuracy, test_accuracy))    
    

Epoch 0. Loss: 0.205855066015, Train_acc 0.948560767591, Test_acc 0.946257961783
Epoch 1. Loss: 0.127244201719, Train_acc 0.965668310235, Test_acc 0.961086783439
Epoch 2. Loss: 0.090797148292, Train_acc 0.974746801706, Test_acc 0.967257165605
Epoch 3. Loss: 0.0688326986894, Train_acc 0.979710820896, Test_acc 0.971536624204
Epoch 4. Loss: 0.0544558694687, Train_acc 0.982809168443, Test_acc 0.97273089172
Epoch 5. Loss: 0.0433249605557, Train_acc 0.985141257996, Test_acc 0.973228503185
Epoch 6. Loss: 0.0344692461933, Train_acc 0.987323427505, Test_acc 0.974621815287
Epoch 7. Loss: 0.0275306442179, Train_acc 0.989572228145, Test_acc 0.976015127389
Epoch 8. Loss: 0.021924259863, Train_acc 0.991537846482, Test_acc 0.976015127389
Epoch 9. Loss: 0.0175955079664, Train_acc 0.992987073561, Test_acc 0.976910828025
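
The `Loss` column above is not the loss of the last batch but an exponential moving average, seeded with the first batch of each epoch and then updated with weights .99 and .01. A plain-Python sketch of that bookkeeping (the `moving_average` function is ours, pulled out of the loop for clarity):

```python
def moving_average(losses, decay=0.99):
    """Exponential moving average, seeded with the first value."""
    avg = None
    for i, value in enumerate(losses):
        if i == 0:
            avg = value                      # same reset as `if i == 0` in the loop
        else:
            avg = decay * avg + (1 - decay) * value
    return avg

print(moving_average([1.0, 0.0]))            # 0.99
print(moving_average([0.5] * 100))           # stays at about 0.5
```

Because each new batch contributes only 1% of the average, the reported number is much smoother than the raw per-batch loss, at the cost of lagging behind it.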


## Conclusion

Now let's take a look at how to build convolutional neural networks.

For whinges or inquiries, [open an issue on GitHub.](https://github.com/zackchase/mxnet-the-straight-dope)