### Learning basic MXNet APIs with a synthetic data set. Think of this as "Hello World" for MXNet.

In [1]:
import mxnet as mx
import numpy as np

import logging

logging.basicConfig(level=logging.INFO)


Let's use **10,000 samples**: **8,000** for training and **2,000** for validation.
Each sample has **100 features** and belongs to one of **10 categories**.

In [2]:
sample_count = 10000
train_count = 8000
valid_count = sample_count - train_count

feature_count = 100
category_count = 10

batchsize = 16

Now, let's create a **synthetic** data set. 
- Samples are generated **randomly** from a uniform [0,1] distribution
- Labels are generated **randomly** between 0 and 9

In [3]:
X = mx.nd.uniform(low=0, high=1, shape=(sample_count,feature_count))

# mx.nd.empty() leaves values uninitialized, so every entry must be filled, including the last one
Y = mx.nd.empty((sample_count,))
for i in range(sample_count):
  Y[i] = np.random.randint(0,category_count)
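
The per-element loop above can also be written as a single vectorized draw. This is a minimal sketch using NumPy only, assuming the same `sample_count` and `category_count` values as above; the name `labels` is illustrative. The resulting array could then be wrapped with `mx.nd.array(labels)` to obtain an NDArray.

```python
import numpy as np

sample_count = 10000
category_count = 10

# Draw every label in one call: integers in [0, category_count), one per sample.
labels = np.random.randint(0, category_count, size=sample_count)

# Cast to float32 so the labels match the float labels printed later in this notebook.
labels = labels.astype(np.float32)

print(labels.shape, labels.min(), labels.max())
```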

We split the data set in two to build the **training set** and the **validation set**.

In [4]:
X_train = mx.nd.crop(X, begin=(0,0), end=(train_count,feature_count))
Y_train = Y[0:train_count]

X_valid = mx.nd.crop(X, begin=(train_count,0), end=(sample_count,feature_count))
Y_valid = Y[train_count:sample_count]

print(X.shape, Y.shape, X_train.shape, Y_train.shape, X_valid.shape, Y_valid.shape)

(10000, 100) (10000,) (8000, 100) (8000,) (2000, 100) (2000,)


We define a simple multi-layer perceptron model:
- an **input** layer,
- a fully connected **hidden layer** with 1024 neurons activated by the ReLU function,
- an **output layer** with 10 neurons (because we have 10 categories), holding probabilities computed by the SoftMax function.
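
To make the layer list concrete, here is a rough NumPy sketch of the computation such a network performs on one batch. It is not how MXNet implements it: the weights are random rather than trained, the names (`hidden_count`, `w1`, `b1`, `w2`, `b2`) are illustrative, and the weight layout is chosen for readability and may differ from how MXNet stores parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size, feature_count, hidden_count, category_count = 16, 100, 1024, 10

# Illustrative random inputs and weights (assumed values, not trained parameters).
x = rng.uniform(0, 1, size=(batch_size, feature_count))
w1 = rng.normal(0, 0.01, size=(feature_count, hidden_count))
b1 = np.zeros(hidden_count)
w2 = rng.normal(0, 0.01, size=(hidden_count, category_count))
b2 = np.zeros(category_count)

# Hidden layer: affine transform followed by ReLU.
hidden = np.maximum(x @ w1 + b1, 0)

# Output layer: affine transform followed by a numerically stable softmax.
logits = hidden @ w2 + b2
logits -= logits.max(axis=1, keepdims=True)
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

print(hidden.shape, probs.shape)
```

Each row of `probs` sums to 1, so it can be read as one probability per category for that sample.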

In [5]:
data = mx.sym.Variable('data')
fc1 = mx.sym.FullyConnected(data, name='fc1', num_hidden=1024)
relu1 = mx.sym.Activation(fc1, name='relu1', act_type="relu")
fc2 = mx.sym.FullyConnected(relu1, name='fc2', num_hidden=category_count)
out = mx.sym.SoftmaxOutput(fc2, name='softmax')
mod = mx.mod.Module(out, context=mx.gpu(0))  # use mx.cpu() if no GPU is available

We build the training iterator, which will serve the samples batch by batch.

In [6]:
train_iter = mx.io.NDArrayIter(data=X_train,label=Y_train,batch_size=batchsize)
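
As a rough picture of what "batch by batch" means, the sketch below slices a list of sample indices into consecutive batches. The helper `iterate_batches` is hypothetical and written for illustration only; it ignores features the real iterator offers, such as shuffling or handling of an incomplete last batch.

```python
def iterate_batches(samples, batch_size):
    """Yield consecutive slices of `samples`, each at most `batch_size` long."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

train_count = 8000
batchsize = 16

# Stand-in for the training samples: just their indices.
samples = list(range(train_count))

batches = list(iterate_batches(samples, batchsize))
print(len(batches), len(batches[0]))  # 500 16
```

With 8,000 training samples and a batch size of 16, one epoch is 500 batches.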

This is how you would print samples and labels in the training set:

In [7]:
for batch in train_iter:
  print(batch.data)
  print(batch.label)
train_iter.reset()

[
[[ 0.79415995  0.88229781  0.15982462 ...,  0.44772229  0.75755006
   0.87283832]
 [ 0.31222865  0.48420015  0.0504978  ...,  0.16934721  0.12651889
   0.80685824]
 [ 0.80629975  0.48412949  0.05866874 ...,  0.69313461  0.34740889
   0.30012631]
 ..., 
 [ 0.1909433   0.05038637  0.49769074 ...,  0.45061344  0.23130836
   0.89167297]
 [ 0.74067473  0.43734229  0.85627335 ...,  0.55485642  0.96582317
   0.07514168]
 [ 0.06060805  0.04075193  0.64845788 ...,  0.96851498  0.87457222
   0.03390705]]
<NDArray 16x100 @cpu(0)>]
[
[ 6.  7.  2.  0.  7.  4.  7.  7.  5.  1.  0.  7.  5.  8.  2.  6.]
<NDArray 16 @cpu(0)>]
[
[[ 0.39131808  0.8424558   0.34232843 ...,  0.28209016  0.79048556
   0.09554287]
 [ 0.00684901  0.91822499  0.25137281 ...,  0.37233368  0.12592906
   0.17417954]
 [ 0.5463286   0.37183946  0.53601027 ...,  0.65388352  0.89471936
   0.31712183]
 ..., 
 [ 0.23259538  0.02688592  0.1904531  ...,  0.53184462  0.2157436
   0.28168073]
 [ 0.4096612   0.37619251  0.27467784 ...,  0.

<NDArray 16 @cpu(0)>]
[
[[ 0.66902804  0.75116634  0.03036506 ...,  0.6936239   0.3352319
   0.57753277]
 [ 0.5730468   0.88176668  0.7800892  ...,  0.8200528   0.85268563
   0.05600134]
 [ 0.39673758  0.3887637   0.85516208 ...,  0.97731429  0.28071305
   0.143171  ]
 ..., 
 [ 0.61621928  0.12770559  0.37840554 ...,  0.26865017  0.90583742
   0.61402696]
 [ 0.13396385  0.26848149  0.34792879 ...,  0.08526403  0.2605229
   0.46793213]
 [ 0.49209645  0.60591483  0.38067213 ...,  0.40172821  0.28114173
   0.89103138]]
<NDArray 16x100 @cpu(0)>]
[
[ 5.  7.  8.  8.  4.  4.  9.  6.  9.  2.  3.  1.  4.  2.  6.  6.]
<NDArray 16 @cpu(0)>]
[
[[ 0.99486595  0.87763381  0.09057061 ...,  0.58535302  0.97956532
   0.02088019]
 [ 0.85892051  0.99487746  0.43323648 ...,  0.02908323  0.77606148
   0.82891971]
 [ 0.59770966  0.05094122  0.96412981 ...,  0.8261205   0.1357279
   0.4771066 ]
 ..., 
 [ 0.04125385  0.5684551   0.19242501 ...,  0.43609026  0.45281762
   0.52273256]
 [ 0.46818891  0.83619195 

[
[ 7.  0.  1.  9.  8.  1.  2.  4.  3.  8.  9.  4.  8.  9.  4.  5.]
<NDArray 16 @cpu(0)>]
[
[[ 0.79854983  0.95615119  0.93661362 ...,  0.31655958  0.73474675
   0.95700145]
 [ 0.21975629  0.99249876  0.76052123 ...,  0.20119256  0.36250052
   0.70623797]
 [ 0.99083954  0.12838337  0.4227334  ...,  0.95234078  0.53452057
   0.4602479 ]
 ..., 
 [ 0.80980104  0.38702521  0.75617623 ...,  0.33797401  0.35698557
   0.45704845]
 [ 0.51217759  0.70796239  0.16672492 ...,  0.40914595  0.54153788
   0.62832987]
 [ 0.57432806  0.37682259  0.25872138 ...,  0.65810829  0.85622329
   0.15403223]]
<NDArray 16x100 @cpu(0)>]
[
[ 0.  0.  6.  1.  1.  8.  7.  0.  1.  4.  3.  1.  9.  6.  7.  6.]
<NDArray 16 @cpu(0)>]
[
[[ 0.60158646  0.8699038   0.01654912 ...,  0.83552849  0.79979175
   0.49907556]
 [ 0.29139113  0.26906407  0.83696657 ...,  0.72357035  0.78630626
   0.6293987 ]
 [ 0.56025749  0.41358995  0.94739091 ...,  0.08730688  0.61713058
   0.55319476]
 ..., 
 [ 0.21865839  0.87685943  0.70341885

[
[[ 0.67067236  0.57418263  0.39447278 ...,  0.75165135  0.17844287
   0.19158003]
 [ 0.92923522  0.95089614  0.12752075 ...,  0.64115691  0.73417288
   0.64311689]
 [ 0.96476507  0.09910604  0.60213137 ...,  0.01402608  0.55430913
   0.26982191]
 ..., 
 [ 0.19871679  0.35308215  0.85469419 ...,  0.21453144  0.29219246
   0.31848943]
 [ 0.8144002   0.93451005  0.74371207 ...,  0.6734336   0.79485595
   0.29140443]
 [ 0.02574019  0.43656653  0.28264743 ...,  0.85168761  0.58797544
   0.20123743]]
<NDArray 16x100 @cpu(0)>]
[
[ 4.  1.  0.  1.  1.  9.  2.  9.  2.  7.  8.  0.  1.  8.  0.  1.]
<NDArray 16 @cpu(0)>]
[
[[ 0.76261526  0.0501318   0.12239667 ...,  0.98370898  0.22228327
   0.22181371]
 [ 0.10652475  0.5305286   0.1318524  ...,  0.12702952  0.86459887
   0.25235665]
 [ 0.01705168  0.73582983  0.82227004 ...,  0.76936239  0.88486272
   0.75780767]
 ..., 
 [ 0.92202181  0.63693452  0.51656175 ...,  0.26146206  0.13444725
   0.30601469]
 [ 0.10014711  0.17158529  0.44601864 ...,  0

[
[ 4.  4.  2.  5.  3.  8.  1.  0.  5.  2.  1.  3.  7.  0.  7.  9.]
<NDArray 16 @cpu(0)>]
[
[[ 0.17358691  0.64436299  0.72142416 ...,  0.45659572  0.26117387
   0.20622627]
 [ 0.63680714  0.08559559  0.70073205 ...,  0.70127225  0.20880967
   0.72583103]
 [ 0.49705762  0.67530251  0.79298365 ...,  0.26066947  0.62232339
   0.27538267]
 ..., 
 [ 0.53349966  0.24006721  0.08813935 ...,  0.26494884  0.28346032
   0.90784633]
 [ 0.14367923  0.33899197  0.66227216 ...,  0.35455871  0.68235701
   0.385575  ]
 [ 0.513668    0.46867728  0.96578044 ...,  0.76154667  0.02667239
   0.40754732]]
<NDArray 16x100 @cpu(0)>]
[
[ 9.  9.  3.  5.  7.  4.  9.  1.  5.  4.  6.  9.  2.  2.  9.  9.]
<NDArray 16 @cpu(0)>]
[
[[ 0.84111482  0.17400835  0.25344518 ...,  0.31392461  0.7030775
   0.29244834]
 [ 0.30589747  0.41618165  0.44316205 ...,  0.93174213  0.00134671
   0.4520891 ]
 [ 0.69292355  0.51534146  0.35538274 ...,  0.94198078  0.69217175
   0.29462492]
 ..., 
 [ 0.49804887  0.98865557  0.4766742  

Now, we need to:
- **bind** the model to the training set,
- **initialize** the parameters, i.e. set initial values for all weights,
- pick an **optimizer** and a **learning rate**, to adjust weights during backpropagation
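
To illustrate what the optimizer does with the learning rate, here is a minimal sketch of one plain stochastic gradient descent step. The function `sgd_step` and the numbers are made up for illustration; MXNet's `'sgd'` optimizer supports further options such as momentum and weight decay, which are not modeled here.

```python
def sgd_step(weights, gradients, learning_rate):
    """Return new weights after one plain SGD update: w <- w - lr * g."""
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

# Illustrative values, not taken from the model in this notebook.
weights = [0.5, -0.25, 1.0]
gradients = [1.0, -2.0, 0.0]

updated = sgd_step(weights, gradients, learning_rate=0.1)
print(updated)
```

Each weight moves against its gradient, and the learning rate scales the size of that move.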

In [13]:
mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
mod.init_params(initializer=mx.init.Xavier(magnitude=2.))
mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ))


Let's train!

In [14]:
mod.fit(train_iter, num_epoch=50, batch_end_callback=mx.callback.Speedometer(batchsize, 100))

INFO:root:Epoch[0] Batch [100]	Speed: 19200.24 samples/sec	accuracy=1.000000
INFO:root:Epoch[0] Batch [200]	Speed: 21068.96 samples/sec	accuracy=1.000000
INFO:root:Epoch[0] Train-accuracy=1.000000
INFO:root:Epoch[0] Time cost=0.184
INFO:root:Epoch[1] Batch [100]	Speed: 18720.08 samples/sec	accuracy=1.000000
INFO:root:Epoch[1] Batch [200]	Speed: 18675.54 samples/sec	accuracy=1.000000
INFO:root:Epoch[1] Batch [300]	Speed: 18713.87 samples/sec	accuracy=1.000000
INFO:root:Epoch[1] Batch [400]	Speed: 18392.24 samples/sec	accuracy=1.000000
INFO:root:Epoch[1] Train-accuracy=1.000000
INFO:root:Epoch[1] Time cost=0.430
INFO:root:Epoch[2] Batch [100]	Speed: 18623.04 samples/sec	accuracy=1.000000
INFO:root:Epoch[2] Batch [200]	Speed: 18978.48 samples/sec	accuracy=1.000000
INFO:root:Epoch[2] Batch [300]	Speed: 20375.72 samples/sec	accuracy=1.000000
INFO:root:Epoch[2] Batch [400]	Speed: 18954.68 samples/sec	accuracy=1.000000
INFO:root:Epoch[2] T

INFO:root:Epoch[20] Batch [400]	Speed: 18585.54 samples/sec	accuracy=1.000000
INFO:root:Epoch[20] Train-accuracy=1.000000
INFO:root:Epoch[20] Time cost=0.434
INFO:root:Epoch[21] Batch [100]	Speed: 18880.40 samples/sec	accuracy=1.000000
INFO:root:Epoch[21] Batch [200]	Speed: 19145.08 samples/sec	accuracy=1.000000
INFO:root:Epoch[21] Batch [300]	Speed: 19837.09 samples/sec	accuracy=1.000000
INFO:root:Epoch[21] Batch [400]	Speed: 18505.80 samples/sec	accuracy=1.000000
INFO:root:Epoch[21] Train-accuracy=1.000000
INFO:root:Epoch[21] Time cost=0.425
INFO:root:Epoch[22] Batch [100]	Speed: 21531.33 samples/sec	accuracy=1.000000
INFO:root:Epoch[22] Batch [200]	Speed: 18684.38 samples/sec	accuracy=1.000000
INFO:root:Epoch[22] Batch [300]	Speed: 18396.37 samples/sec	accuracy=1.000000
INFO:root:Epoch[22] Batch [400]	Speed: 18864.32 samples/sec	accuracy=1.000000
INFO:root:Epoch[22] Train-accuracy=1.000000
INFO:root:Epoch[22] Time cost=0.420
INFO:root:Epoch[23] Batch [100]	Speed: 18754.50 samples/se

INFO:root:Epoch[41] Batch [400]	Speed: 20415.33 samples/sec	accuracy=1.000000
INFO:root:Epoch[41] Train-accuracy=1.000000
INFO:root:Epoch[41] Time cost=0.418
INFO:root:Epoch[42] Batch [100]	Speed: 18772.34 samples/sec	accuracy=1.000000
INFO:root:Epoch[42] Batch [200]	Speed: 18807.38 samples/sec	accuracy=1.000000
INFO:root:Epoch[42] Batch [300]	Speed: 18540.82 samples/sec	accuracy=1.000000
INFO:root:Epoch[42] Batch [400]	Speed: 18425.67 samples/sec	accuracy=1.000000
INFO:root:Epoch[42] Train-accuracy=1.000000
INFO:root:Epoch[42] Time cost=0.434
INFO:root:Epoch[43] Batch [100]	Speed: 19037.71 samples/sec	accuracy=1.000000
INFO:root:Epoch[43] Batch [200]	Speed: 18445.72 samples/sec	accuracy=1.000000
INFO:root:Epoch[43] Batch [300]	Speed: 18505.69 samples/sec	accuracy=1.000000
INFO:root:Epoch[43] Batch [400]	Speed: 18585.23 samples/sec	accuracy=1.000000
INFO:root:Epoch[43] Train-accuracy=1.000000
INFO:root:Epoch[43] Time cost=0.434
INFO:root:Epoch[44] Batch [100]	Speed: 18894.11 samples/se

Unsurprisingly, we get to 100% training accuracy. What about **validation accuracy**?

In [15]:
val_iter = mx.io.NDArrayIter(data=X_valid,label=Y_valid, batch_size=batchsize)

metric = mx.metric.Accuracy()
mod.score(val_iter, metric)
print(metric.get())

('accuracy', 0.1085)


Bottom line: a neural network can learn ANYTHING, even random labels, by memorizing the training set. However, it will only predict correctly on new samples if your data set makes sense :)
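
With 10 equally likely categories, guessing yields about 10% accuracy, which is what the validation score above amounts to. The simulation below is a minimal sketch of that arithmetic using only the standard library: the `scores` are random stand-ins for model outputs, not values produced by the network.

```python
import random

random.seed(0)

category_count = 10
valid_count = 2000

# Random labels, as in the synthetic data set.
labels = [random.randrange(category_count) for _ in range(valid_count)]

# Stand-in for a model with no signal to exploit: one random score per category.
scores = [[random.random() for _ in range(category_count)]
          for _ in range(valid_count)]

# Predicted class = index of the highest score (argmax).
predictions = [max(range(category_count), key=row.__getitem__) for row in scores]

# Accuracy = fraction of predictions that match the labels.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / valid_count
print(accuracy)
```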

This is how you would print the labels and the predicted probabilities for the validation set:

In [11]:
val_iter.reset()
for batch in val_iter:
  print(batch.label)
  mod.forward(batch, is_train=False)
  prob = mod.get_outputs()[0].asnumpy()
  print(prob)

[
[ 6.  1.  4.  7.  8.  3.  6.  2.  4.  0.  8.  3.  4.  6.  8.  7.]
<NDArray 16 @cpu(0)>]
[[  1.37942648e-02   6.03255153e-01   2.06489302e-03   8.30709338e-02
    1.03880440e-04   5.65429311e-03   4.24057798e-04   1.77211404e-01
    4.91770841e-02   6.52440414e-02]
 [  8.68156016e-01   4.91774585e-07   2.29281606e-04   1.23517518e-03
    2.37869835e-06   4.97719797e-04   9.77883860e-03   5.09405835e-03
    4.54892255e-02   6.95168525e-02]
 [  7.68280864e-01   1.43530173e-03   5.37264347e-03   1.54011808e-02
    1.11141115e-01   8.03976599e-03   3.67176763e-05   4.43071080e-03
    2.47811479e-03   8.33835304e-02]
 [  9.12791610e-01   3.72620183e-03   2.75205262e-03   7.62972161e-02
    7.53777640e-05   3.93088732e-04   5.67593816e-05   3.49720079e-03
    1.26847463e-05   3.97736003e-04]
 [  3.84938670e-03   2.16755993e-03   2.88283359e-02   1.21681876e-01
    1.22812830e-01   5.86262380e-04   2.56185085e-02   3.50298011e-04
    5.74132195e-03   6.88363612e-01]
 [  3.27815115e-03   7.81

[[  1.00848777e-03   1.86742276e-01   2.61865847e-04   1.32809021e-03
    2.31349305e-03   3.01370572e-04   5.08721098e-02   6.39726082e-03
    5.76068123e-04   7.50198960e-01]
 [  1.64094294e-04   1.53913097e-05   1.90105609e-04   9.94634271e-01
    2.49209415e-06   8.03049770e-04   2.03003277e-04   1.41143464e-04
    3.13871331e-03   7.07797706e-04]
 [  2.01580450e-02   2.30296189e-03   6.96388662e-01   1.06040237e-03
    1.63822703e-03   8.84583915e-06   3.18873823e-02   3.71976494e-05
    2.46265441e-01   2.52819067e-04]
 [  1.40785414e-03   2.85767717e-04   2.05403940e-05   1.08414300e-01
    1.98363778e-05   3.60474386e-03   1.42936558e-02   3.48283770e-03
    8.59917760e-01   8.55258759e-03]
 [  9.28954422e-01   9.22629237e-03   9.67716624e-04   2.47957855e-02
    8.17820709e-03   1.05176563e-03   2.18487275e-03   7.62482305e-05
    1.97932851e-02   4.77134483e-03]
 [  4.34252955e-02   7.43183482e-05   3.02033406e-03   8.25518370e-02
    1.34504382e-02   6.66786975e-04   2.68080

[[  7.78579270e-05   3.73722683e-03   9.45603251e-01   1.21767458e-04
    1.04271551e-03   7.75052584e-04   5.62359625e-03   1.22084096e-02
    7.64707988e-03   2.31631361e-02]
 [  6.47183321e-03   2.85806730e-02   1.63784523e-06   2.57505346e-02
    1.60481548e-03   2.99682058e-02   3.13259847e-02   5.24849573e-04
    5.68627765e-05   8.75714660e-01]
 [  1.34842738e-03   6.50155604e-01   1.05500221e-03   2.09309999e-03
    5.33511222e-04   1.82903092e-03   6.86211511e-02   7.03883066e-04
    3.04149408e-02   2.43245319e-01]
 [  1.60184025e-03   4.20467183e-02   4.79126174e-04   6.09436072e-04
    6.57226294e-02   1.96402369e-04   3.27506930e-01   3.17646121e-03
    3.83281731e-04   5.58277249e-01]
 [  1.86505564e-03   1.80687092e-03   1.02620386e-02   6.75353557e-02
    6.57256678e-05   1.23617356e-04   4.93186344e-05   7.17152297e-01
    1.20019152e-07   2.01139629e-01]
 [  2.95974404e-01   8.76832288e-03   1.44869566e-01   8.37460568e-04
    3.89124215e-01   1.73457861e-02   1.92054