# Simple Multilayer-Perceptron for MNIST classification

Is there any framework out there in which it is easier to build and train a network 
than with Knet and NNHelferlein?

In [7]:
using Knet
using NNHelferlein
using MLDatasets: MNIST

### Get MNIST data from MLDatasets:

The data is already scaled to pixel values between 0.0 and 1.0.    
The only modification necessary is to set the class label for the digit "0" to 10
(Julia arrays are 1-based, so there is no array index 0):

In [8]:
mnist_dir = joinpath(NNHelferlein.DATA_DIR, "mnist")
xtrn,ytrn = MNIST.traindata(Float32, dir=mnist_dir)
ytrn[ytrn.==0] .= 10
@show dtrn = minibatch(xtrn, ytrn, 128; xsize = (28*28,:))

xvld,yvld = MNIST.testdata(Float32, dir=mnist_dir)
yvld[yvld.==0] .= 10
@show dvld = minibatch(xvld, yvld, 128; xsize = (28*28,:));

dtrn = minibatch(xtrn, ytrn, 128; xsize = (28 * 28, :)) = 468-element Knet.Train20.Data{Tuple{KnetArray{Float32}, Array{Int64}}}
dvld = minibatch(xvld, yvld, 128; xsize = (28 * 28, :)) = 78-element Knet.Train20.Data{Tuple{KnetArray{Float32}, Array{Int64}}}


Each minibatch is a 2-tuple of a 784×128 matrix holding the flattened pixel data and a 128-element vector holding the teaching input, i.e. the labels in the range 1-10.
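The layout described here can be reproduced in plain Julia. The following is only a minimal sketch with random stand-in data instead of MNIST; it does not call `minibatch` and says nothing about how Knet implements it. The names `x`, `y`, `xflat`, `batchsize`, `idx` and `batches` are made up for illustration. Dropping the incomplete last batch is an assumption that is consistent with the counts printed above (60000 ÷ 128 = 468 and 10000 ÷ 128 = 78).

```julia
# Hypothetical stand-in data: 1000 random "images" with labels 0-9
# (nothing here calls Knet or NNHelferlein).
x = rand(Float32, 28, 28, 1000)
y = rand(0:9, 1000)

# Julia arrays are 1-based, so class "0" is relabelled as class 10.
y[y .== 0] .= 10

# Flatten every 28×28 image into one column of 784 pixel values.
xflat = reshape(x, 28 * 28, :)            # 784×1000 matrix

# Cut the data into full minibatches of 128 samples.
# Assumption: the incomplete remainder is dropped.
batchsize = 128
nbatches = size(xflat, 2) ÷ batchsize     # 1000 ÷ 128 = 7

# Column range that belongs to minibatch number i.
idx(i) = ((i - 1) * batchsize + 1):(i * batchsize)

batches = [(xflat[:, idx(i)], y[idx(i)]) for i in 1:nbatches]

xb, yb = first(batches)
@show size(xb) size(yb)
```

Every element of `batches` is a tuple of a 784×128 matrix and a 128-element label vector, which is the same shape as `first(dtrn)`.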

If a functional GPU is detected, the array type is `KnetArray`; otherwise it is `Array`. 
Computations with KnetArrays are performed on the GPU, so the calling code does not need to care!

The data looks like this:

In [9]:
first(dtrn)[1]  # first minibatch:

784×128 Knet.KnetArrays.KnetMatrix{Float32}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     0.0  0.0  0.0  0.0  0.0  0.0  0.0
 ⋮

In [10]:
first(dtrn)[2]  # labels of first minibatch:

128-element Vector{Int64}:
  5
 10
  4
  1
  9
  2
  1
  3
  1
  4
  3
  5
  3
  ⋮
  9
  2
 10
 10
  2
 10
  2
  7
  1
  8
  6
  4

### Define the MLP with NNHelferlein types:

The wrapper type `Classifier` provides a signature with nll loss 
(negative log-likelihood; cross-entropy for single-label classification tasks). 
For correct calculation of the nll, the raw activations of the output layer are 
needed (no activation function applied):

In [11]:
mlp = Classifier(Dense(784, 256),
                 Dense(256, 64), 
                 Dense(64, 10, actf=identity))

Classifier((Dense(P(Knet.KnetArrays.KnetMatrix{Float32}(256,784)), P(Knet.KnetArrays.KnetVector{Float32}(256)), Knet.Ops20.sigm), Dense(P(Knet.KnetArrays.KnetMatrix{Float32}(64,256)), P(Knet.KnetArrays.KnetVector{Float32}(64)), Knet.Ops20.sigm), Dense(P(Knet.KnetArrays.KnetMatrix{Float32}(10,64)), P(Knet.KnetArrays.KnetVector{Float32}(10)), identity)))
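To illustrate why the output layer has to deliver raw activations, here is a minimal plain-Julia sketch of the same layer shapes and of a negative log-likelihood computed from raw scores. It is not the NNHelferlein or Knet implementation: `sigm`, `dense`, `forward` and `nll_sketch` are hypothetical helpers defined only for this example, the weights are random, and the input is stand-in data. The softmax is applied inside the loss, so applying an activation function in the last layer as well would distort the result.

```julia
using Statistics: mean

# Logistic sigmoid, the activation shown for the first two layers above.
sigm(x) = 1 / (1 + exp(-x))

# Hypothetical helper: one fully connected layer, f.(W*x .+ b).
dense(W, b, x, f) = f.(W * x .+ b)

# Randomly initialised parameters with the shapes of the printed model.
W1, b1 = 0.1f0 .* randn(Float32, 256, 784), zeros(Float32, 256)
W2, b2 = 0.1f0 .* randn(Float32, 64, 256),  zeros(Float32, 64)
W3, b3 = 0.1f0 .* randn(Float32, 10, 64),   zeros(Float32, 10)

# Forward pass: two sigmoid layers, then a linear output layer (raw scores).
forward(x) = dense(W3, b3, dense(W2, b2, dense(W1, b1, x, sigm), sigm), identity)

# Negative log-likelihood from raw scores (10×N) and labels in 1..10.
function nll_sketch(scores, labels)
    # log-softmax per column; the column maximum is subtracted for stability
    shifted = scores .- maximum(scores, dims=1)
    logp = shifted .- log.(sum(exp.(shifted), dims=1))
    # average the log-probability of the correct class over all samples
    return -mean(logp[labels[i], i] for i in eachindex(labels))
end

x = rand(Float32, 784, 128)    # stand-in minibatch
y = rand(1:10, 128)            # stand-in labels
scores = forward(x)
nparams = sum(length, (W1, b1, W2, b2, W3, b3))
@show size(scores) nll_sketch(scores, y) nparams
```

With all scores equal, every class gets probability 1/10 and the loss is log(10) ≈ 2.3, which is roughly what an untrained 10-class network should show.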

### Train with TensorBoard logger:

Training takes only a few seconds on a GPU. 

Training curves can be visualised with TensorBoard by pointing it to the
specified log directory:

In [12]:
tb_train!(mlp, Adam, dtrn, dvld, epochs=10,
        acc_fun=accuracy,
        eval_size=0.2, eval_freq=5, mb_loss_freq=100, 
        tb_name="mlp_run", tb_text="NNHelferlein example")

Training 10 epochs with 468 minibatches/epoch
    (and 78 validation mbs).
Evaluation is performed every 94 minibatches (with 16 mbs).
Watch the progress with TensorBoard at: /home/andreas/.julia/dev/NNHelferlein/examples/logs/mlp_run/2021-12-22T10-57-23


Progress: 100%|█████████████████████████████████████████| Time: 0:00:17


Training finished with:
Training loss:       0.04520638226571246
Training accuracy:   0.9873965010683761
Validation loss:     0.08399295689872442
Validation accuracy: 0.9752604166666666


Classifier((Dense(P(Knet.KnetArrays.KnetMatrix{Float32}(256,784)), P(Knet.KnetArrays.KnetVector{Float32}(256)), Knet.Ops20.sigm), Dense(P(Knet.KnetArrays.KnetMatrix{Float32}(64,256)), P(Knet.KnetArrays.KnetVector{Float32}(64)), Knet.Ops20.sigm), Dense(P(Knet.KnetArrays.KnetMatrix{Float32}(10,64)), P(Knet.KnetArrays.KnetVector{Float32}(10)), identity)))
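
The evaluation schedule printed in the training log can be reproduced with a little arithmetic. This is only a sketch: it assumes that `eval_freq` means evaluations per epoch, that `eval_size` is the fraction of minibatches used per evaluation, and that the results are rounded to the nearest integer. None of this is taken from the NNHelferlein source, and the variable names are made up.

```julia
n_train_mbs = 468    # training minibatches per epoch (from the log)
n_valid_mbs = 78     # validation minibatches (from the log)
eval_freq   = 5      # assumed meaning: evaluations per epoch
eval_size   = 0.2    # assumed meaning: fraction of minibatches per evaluation

# Assumption: both values are rounded to the nearest integer.
eval_interval = round(Int, n_train_mbs / eval_freq)   # 93.6 -> 94
eval_mbs      = round(Int, eval_size * n_valid_mbs)   # 15.6 -> 16

@show eval_interval eval_mbs
```

Both values match the log line "Evaluation is performed every 94 minibatches (with 16 mbs)".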