In [2]:
var max = require('lodash/max');
var findIndex = require('lodash/findIndex');

'use strict'

# Neural Networks & Deep[ish] Learning

There are a few js neural network libraries around.

 - [synaptic](https://github.com/cazala/synaptic)
 - [convnetjs](http://cs.stanford.edu/people/karpathy/convnetjs/)
 - [char-rnn](https://github.com/garywang/char-rnn.js)
 - [neuro.js](https://github.com/janhuenermann/neurojs)
 - even a proposal for [tensorflow](https://github.com/node-tensorflow/node-tensorflow) in node!
 
 
We're going to focus on ConvNetJS. Built by folks at Stanford, it is mature and stable. That said, synaptic looks great too, and we will be borrowing a lot from their excellent introductory docs.


# Neural Networks 101

Let's look at the first part of [synaptic's](https://github.com/cazala/synaptic/wiki/Neural-Networks-101) 101 guide.

### Activation functions

As we saw, a single neuron in a neural network encodes a weight vector and applies it to an input vector, reducing it to a single value. This is the projection of the input onto the weights within the n-dimensional weight space.

This reduced quantity is then put through an activation function in order to produce the neuron's output activation.
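
To make the two steps concrete, here is a minimal sketch of a single neuron in plain JavaScript. The names `sigmoid` and `neuronOutput`, the bias term and the example numbers are our own illustration; they are not part of any library used in this notebook.

```javascript
// Illustrative only: a sigmoid activation, one of many possible choices.
function sigmoid(z) {
    return 1 / (1 + Math.exp(-z));
}

// A single neuron: dot product of weights and input (plus an assumed bias),
// followed by an activation function.
function neuronOutput(weights, bias, input, activation) {
    var z = bias;
    for (var i = 0; i < weights.length; i++) {
        z += weights[i] * input[i]; // project the input onto the weights
    }
    return activation(z); // squash the reduced value
}

var exampleActivation = neuronOutput([0.5, -0.5], 0, [1, 1], sigmoid);
console.log(exampleActivation); // z is 0 here, so the sigmoid gives 0.5
```

Passing the activation in as a function makes it easy to swap sigmoid for another choice and compare the outputs.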

There are many possible [activation functions](https://en.wikipedia.org/wiki/Activation_function). Any library you use will likely offer several choices, and there is always scope to code your own.

Network convergence and stability can depend significantly on the choice because of the following:

 - blow-up (exploding activations or gradients)
 - [vanishing gradients](https://ayearofai.com/rohan-4-the-vanishing-gradient-problem-ec68f76ffb9b)
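
Both effects in the list above come from multiplying one local factor per layer during backpropagation. The sketch below is a deliberate simplification: it ignores the weights and just repeats a single factor. `sigmoidGrad` and `chainedFactor` are hypothetical helper names.

```javascript
// Derivative of the sigmoid at z; its largest value is 0.25 (at z = 0).
function sigmoidGrad(z) {
    var s = 1 / (1 + Math.exp(-z));
    return s * (1 - s);
}

// Multiply the same local factor once per layer, as the chain rule would.
function chainedFactor(localFactor, depth) {
    var g = 1;
    for (var layer = 0; layer < depth; layer++) {
        g *= localFactor;
    }
    return g;
}

[1, 5, 10].forEach(function (depth) {
    // a factor below 1 shrinks towards zero (vanishing)
    console.log(depth + ' sigmoid layers: ' + chainedFactor(sigmoidGrad(0), depth));
    // a factor above 1 grows without bound (blow-up)
    console.log(depth + ' layers with factor 2: ' + chainedFactor(2, depth));
});
```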



### Other Functions

Right back at the start of the day we looked at loss and error functions. The ConvNetJS docs talk about `loss layers`: these layers of the network have a built-in loss function that is used during training. ConvNetJS doesn't expose us to that detail; however, there is some other terminology in play that is worth mentioning:

 - softmax
 - SVM
 - 
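
For reference, here is a sketch of the textbook definitions behind the first two terms. This is written from the standard formulas, not taken from the ConvNetJS source, and the function names are our own.

```javascript
// Softmax: turn raw class scores into probabilities that sum to 1.
function softmaxProbs(rawScores) {
    var maxScore = Math.max.apply(null, rawScores); // subtract the max for numerical stability
    var exps = rawScores.map(function (s) { return Math.exp(s - maxScore); });
    var total = exps.reduce(function (a, b) { return a + b; }, 0);
    return exps.map(function (e) { return e / total; });
}

// Softmax (cross-entropy) loss: negative log probability of the correct class.
function softmaxLoss(rawScores, label) {
    return -Math.log(softmaxProbs(rawScores)[label]);
}

// Multiclass SVM (hinge) loss with a margin of 1: penalise any wrong class
// whose score comes within the margin of the correct class's score.
function svmLoss(rawScores, label) {
    var loss = 0;
    for (var j = 0; j < rawScores.length; j++) {
        if (j === label) continue;
        loss += Math.max(0, rawScores[j] - rawScores[label] + 1);
    }
    return loss;
}

var demoScores = [2.0, 0.5];
console.log(softmaxProbs(demoScores));
console.log(softmaxLoss(demoScores, 0), svmLoss(demoScores, 0));
```

The softmax loss is never exactly zero, whereas the SVM loss drops to zero as soon as the correct class wins by the margin.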

## convnetjs

ConvNetJS introduces the following main classes:

 - [Vol](https://github.com/karpathy/convnetjs/blob/master/src/convnet_vol.js) - a 3d volume internally organised as a list
 - [Layer](https://github.com/karpathy/convnetjs/tree/master/src) - a layer in a network. There are different types of layer; input, fc (fully connected), loss (softmax)
 - [Net](https://github.com/karpathy/convnetjs/blob/master/src/convnet_net.js) - the network itself, consisting of layers. Net is responsible for pushing data `forward` through the layers to produce an output. During training and backpropagation it is responsible for calling the `backward` function to calculate gradients.
 - [Trainer](https://github.com/karpathy/convnetjs/blob/master/src/convnet_trainers.js) - takes a network, parameters, examples and the associated correct labels and it will train the network
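
The "3d volume internally organised as a list" idea behind `Vol` can be sketched without the library. The layout below assumes depth is the fastest-varying index; as far as we can tell this mirrors `convnet_vol.js`, but treat it as an assumption and check the source. `flatIndex` and `sketchVol` are hypothetical names.

```javascript
// Hypothetical helper: position of element (x, y, d) in the flat list,
// for a volume of width sx and the given depth.
function flatIndex(sx, depth, x, y, d) {
    return ((sx * y) + x) * depth + d;
}

// A stand-in for a 2x2x3 volume: just sizes plus one flat array of values.
var sketchVol = { sx: 2, sy: 2, depth: 3, w: new Float64Array(2 * 2 * 3) };

// set the value at x = 1, y = 0, d = 2
sketchVol.w[flatIndex(sketchVol.sx, sketchVol.depth, 1, 0, 2)] = 7;

console.log(flatIndex(2, 3, 1, 0, 2)); // ((2 * 0) + 1) * 3 + 2 = 5
console.log(sketchVol.w);
```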


The simplest possible network: a 2D input and a single predicted class (0 or 1) as output, trained on the XOR function:


 A | B | Out
---|---|---:
 0 | 0 | 0 
 1 | 0 | 1 
 0 | 1 | 1 
 1 | 1 | 0 
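
The same table can be generated in code with JavaScript's bitwise XOR operator `^`, which is a handy sanity check on the labels used later. `xorRows` is just an illustrative name.

```javascript
// Build the XOR truth table in the same row order as the table above.
var xorRows = [[0, 0], [1, 0], [0, 1], [1, 1]].map(function (pair) {
    return { a: pair[0], b: pair[1], out: pair[0] ^ pair[1] };
});

xorRows.forEach(function (row) {
    console.log(row.a + ' | ' + row.b + ' | ' + row.out);
});
```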

In [3]:
var convnetjs = require('convnetjs');
// pull out the classes used in the cells below
var {Vol, Net, Trainer} = convnetjs;

'use strict'

#### Training Data

Let's set up a full training dataset, which is easy as we only have 4 cases. Each input vector needs to be packed within a `Vol`.

From the [docs](http://cs.stanford.edu/people/karpathy/convnetjs/docs.html):

    // create a Vol of size 32x32x3, and filled with random numbers
    var v = new convnetjs.Vol(32, 32, 3);
    var v = new convnetjs.Vol(32, 32, 3, 0.0); // same volume but init with zeros
    var v = new convnetjs.Vol(1, 1, 3); // a 1x1x3 Vol with random numbers

    // you can also initialize with a specific list. E.g. create a 1x1x3 Vol:
    var v = new convnetjs.Vol([1.2, 3.5, 3.6]);



NOTE: Remember to draw this out on the whiteboard!

In [4]:
var X = [
    new Vol([0,0]),
    new Vol([1,0]),
    new Vol([0,1]),
    new Vol([1,1])
];

Our data is held in a property `w` on each `Vol`, and the gradients associated with it in `dw`. Check the values of these for our training samples.

In [5]:
// console out the innards of a Vol here
console.log(Object.keys(X[0].dw))

[ '0', '1' ]


We will also need to show the network the correct output values, so we need an output array, which is a plain list.

In [6]:
var Y = [0,1,1,0]

In [7]:
var layer_defs = [];

layer_defs.push({type:'input', out_sx:1, out_sy:1, out_depth:2});

layer_defs.push({type:'softmax', num_classes:2});

// create a net
var net = new Net();
net.makeLayers(layer_defs);

In [8]:
var scores = net.forward(X[1]); // pass forward through network

// scores is now a Vol() of output activations
console.log('score for class 0 is assigned: ' + scores.w[0]);
console.log('score for class 1 is assigned: ' + scores.w[1]);
var maxp = max(scores.w);
console.log("Predicted Class: ", findIndex(scores.w, s => s === maxp));

score for class 0 is assigned: 0.6777023643113648
score for class 1 is assigned: 0.3222976356886352
Predicted Class:  0


#### Training

Now we are going to train the network with 10,000 passes through our training data.

In [9]:
var trainer = new Trainer(net, { learning_rate:0.01, l2_decay:0.001 });

for (var i = 0; i < 10000; i++) {
    trainer.train(X[0], 0);
    trainer.train(X[1], 1);
    trainer.train(X[2], 1);
    trainer.train(X[3], 0);
}



{ fwd_time: 0,
  bwd_time: 0,
  l2_decay_loss: 7.829163779883886e-7,
  l1_decay_loss: 0,
  cost_loss: 0.6981250709291885,
  softmax_loss: 0.6981250709291885,
  loss: 0.6981258538455665 }
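
To get a feel for what one training call does (roughly: a forward pass, a backward pass and a parameter update), here is a from-scratch sketch of one stochastic gradient descent step for a 2-input, 2-class linear softmax classifier. It is not ConvNetJS's implementation: it ignores `l2_decay` and any momentum, and `sgdStep` and `toyParams` are hypothetical names.

```javascript
// Illustrative only: one SGD step on a single example.
// Returns the loss measured before the update.
function sgdStep(params, input, label, learningRate) {
    // forward: scores = W x + b, then softmax
    var scores = params.W.map(function (row, k) {
        return row[0] * input[0] + row[1] * input[1] + params.b[k];
    });
    var maxScore = Math.max(scores[0], scores[1]);
    var exps = scores.map(function (s) { return Math.exp(s - maxScore); });
    var total = exps[0] + exps[1];
    var probs = exps.map(function (e) { return e / total; });

    // backward: gradient of the softmax loss with respect to score k
    // is probs[k] minus 1 for the correct class, probs[k] otherwise
    for (var k = 0; k < 2; k++) {
        var dScore = probs[k] - (k === label ? 1 : 0);
        params.W[k][0] -= learningRate * dScore * input[0];
        params.W[k][1] -= learningRate * dScore * input[1];
        params.b[k] -= learningRate * dScore;
    }
    return -Math.log(probs[label]);
}

var toyParams = { W: [[0, 0], [0, 0]], b: [0, 0] };
var lossBefore = sgdStep(toyParams, [1, 0], 1, 0.1);
var lossAfter = sgdStep(toyParams, [1, 0], 1, 0.1);
console.log(lossBefore, lossAfter); // the second loss should be smaller
```

Repeating the step on the same example lowers its loss; whether the loss can fall for all four XOR examples at once is a separate question.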

Recall the layer definitions we built earlier. In this first try we used the simplest possible network: 2 layers, 1x `input` layer and 1x loss layer, for which we used a softmax.

Try running the following cell:
 - what do you think of the output scores?
 - try different members of our training set X[n]
 - try running the cell multiple times with the input as X[1]; what do you notice? Do you know why?

In [10]:
var scores2 = net.forward(X[0]);
console.log('probability that x is class 0: ' + scores2.w[0]);
console.log('probability that x is class 1: ' + scores2.w[1]);

probability that x is class 0: 0.49753693886314754
probability that x is class 1: 0.5024630611368525


In [11]:
function test(net, X, Y) {
    X.map((x, idx) => {
        var scores = net.forward(x);
        var maxp = max(scores.w);
        var c = findIndex(scores.w, s => s === maxp);
        var rw = Y[idx] === c ? "RIGHT" : "WRONG"
        console.log("Pred:", c, "Actual:", Y[idx], "Prob:", maxp, rw)
    });
}

In [12]:
test(net, X, Y)

Pred: 1 Actual: 0 Prob: 0.5024630611368525 WRONG
Pred: 1 Actual: 1 Prob: 0.5014116065045738 RIGHT
Pred: 1 Actual: 1 Prob: 0.5012767524559826 RIGHT
Pred: 1 Actual: 0 Prob: 0.5002252844101499 WRONG


What do you think of these results? Great? Poor?

Unless you are very lucky these should be awful! Can you think of some reasons why?
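
As a hint, one way to investigate is to ask how well any single straight-line decision boundary can do on XOR. The sketch below assumes our two-layer network behaves like one such boundary, and brute-forces a coarse grid of weights and biases. `bestLinearScore` is a hypothetical helper and the grid range is arbitrary.

```javascript
var xorInputs = [[0, 0], [1, 0], [0, 1], [1, 1]];
var xorLabels = [0, 1, 1, 0];

// Try every (w1, w2, bias) on a grid from -2 to 2 in steps of 0.5 and
// return the highest number of XOR rows any of them classifies correctly.
function bestLinearScore(inputs, labels) {
    var best = 0;
    for (var i = -4; i <= 4; i++) {
        for (var j = -4; j <= 4; j++) {
            for (var k = -4; k <= 4; k++) {
                var w1 = i * 0.5, w2 = j * 0.5, bias = k * 0.5;
                var correct = 0;
                inputs.forEach(function (x, idx) {
                    // predict class 1 when the point is on the positive side of the line
                    var predicted = (w1 * x[0] + w2 * x[1] + bias > 0) ? 1 : 0;
                    if (predicted === labels[idx]) correct++;
                });
                best = Math.max(best, correct);
            }
        }
    }
    return best;
}

console.log('best correct out of 4:', bestLinearScore(xorInputs, xorLabels));
```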
