# Lasagne Notes

This document contains a series of notes on using Lasagne, starting with the sample code from the Lasagne GitHub page. For more information on Lasagne, visit the GitHub page: https://github.com/Lasagne/Lasagne

The following example is from the Lasagne GitHub page. It does not run as written, because `training_data` and `test_data` are never defined, but it shows how to use Lasagne to structure a simple neural network.

In [2]:
import lasagne 
import theano 
import theano.tensor as T 


# Sample neural network from the Lasagne GitHub page.
input_var = T.tensor4('X')
target_var = T.ivector('y')

# Create a convolutional neural network! 
from lasagne.nonlinearities import leaky_rectify, softmax 
network = lasagne.layers.InputLayer((None, 3, 32, 32), input_var)
network = lasagne.layers.Conv2DLayer(
    network, 
    64, 
    (3, 3), 
    nonlinearity=leaky_rectify
)

network = lasagne.layers.Conv2DLayer(
    network, 
    32, 
    (3, 3), 
    nonlinearity=leaky_rectify
)

network = lasagne.layers.Pool2DLayer(
    network,
    (3, 3),
    stride=2,
    mode='max'
)

network = lasagne.layers.DenseLayer(
    lasagne.layers.dropout(network, 0.5),
    128, 
    nonlinearity=leaky_rectify,
    W=lasagne.init.Orthogonal()
)

network = lasagne.layers.DenseLayer(
    lasagne.layers.dropout(network, 0.5),
    10,
    nonlinearity=softmax
)

prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, 
                                                   target_var)
loss = loss.mean() + 1e-4 * lasagne.regularization.regularize_network_params(
    network, lasagne.regularization.l2)

params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params,
                                            learning_rate=0.01,
                                            momentum=0.9)
train_fn = theano.function([input_var, target_var], 
                          loss, updates=updates)
for epoch in range(100):
    # Accumulate under a new name so the symbolic `loss` is not overwritten.
    epoch_loss = 0
    for input_batch, target_batch in training_data:
        epoch_loss += train_fn(input_batch, target_batch)
    print("Epoch %d: Loss %g" % (epoch + 1, epoch_loss / len(training_data)))
    
test_prediction = lasagne.layers.get_output(network, deterministic=True)
predict_fn = theano.function([input_var], T.argmax(test_prediction, axis=1))
print("Predicted class for first test input: %r" % predict_fn(test_data[0]))

For reference, this is the docstring of `lasagne.layers.InputLayer`:



    This layer holds a symbolic variable that represents a network input. A
    variable can be specified when the layer is instantiated, else it is
    created.

    Parameters
    ----------
    shape : tuple of `int` or `None` elements
        The shape of the input. Any element can be `None` to indicate that the
        size of that dimension is not fixed at compile time.

    input_var : Theano symbolic variable or `None` (default: `None`)
        A variable representing a network input. If it is not provided, a
        variable will be created.

    Raises
    ------
    ValueError
        If the dimension of `input_var` is not equal to `len(shape)`

    Notes
    -----
    The first dimension usually indicates the batch size. If you specify it,
    Theano may apply more optimizations while compiling the training or
    prediction function, but the compiled function will not accept data of a
    different batch size at runtime. To compile for a variable batch size, set
    the first shape element to `None` instead.

This code takes some explanation, so let's go through it layer by layer:

### InputLayer

An input layer is the first layer in the network. It is the starting point for all data and sets the initial dimensions: in the example above, the shape `(None, 3, 32, 32)` describes batches of 3-channel 32x32 inputs, with `None` leaving the batch size unspecified. Every layer type in Lasagne takes a number of parameters that specify its dimensions and behaviour. An explanation of all the parameters for any given layer type can be found in the `__doc__` for that layer.
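
To make the role of `None` in the input shape concrete, here is a small pure-Python sketch of the compatibility check it implies. The helper `shape_matches` is hypothetical and not part of Lasagne; it only mimics the idea that a `None` entry accepts any size.

```python
def shape_matches(declared, actual):
    """Return True if `actual` is compatible with `declared`.

    A `None` entry in `declared` matches any size, mirroring how the
    input shape (None, 3, 32, 32) leaves the batch size open.
    Hypothetical helper for illustration; not a Lasagne function.
    """
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))


declared_shape = (None, 3, 32, 32)
print(shape_matches(declared_shape, (16, 3, 32, 32)))   # batch of 16
print(shape_matches(declared_shape, (128, 3, 32, 32)))  # batch of 128
print(shape_matches(declared_shape, (16, 1, 32, 32)))   # wrong channel count
```

The first two calls differ only in batch size, so both are accepted; the third changes a fixed dimension and is rejected.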

### Conv2DLayer

The Conv2DLayer class describes a 2-dimensional convolutional layer. Convolutional layers are sparsely connected (that is, not fully connected): each output node depends only on a small local patch of the input instead of on every input node, and the same filter weights are reused at every position. A fully-connected layer over a large image would need one weight for every input-output pair, which quickly becomes too expensive to train and run, so convolutional layers are meant to reduce the total number of dependencies and greatly reduce the run time. Large-scale image recognition makes use of convolutional layers for this purpose.

Convolutional layers are made up of some number of filters. Each filter produces one output channel (feature map). A filter has a size, i.e. how many inputs it reads from the previous layer at each position, and a stride, which is how many nodes it moves between positions. In the example above, the two convolutional layers have 64 and 32 filters of size 3x3 and use the default stride of 1.
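
As a rough check of the numbers involved, the sketch below computes the output size and parameter count of the two convolutional layers in the example. It assumes no padding and a stride of 1 (which, as far as I know, are the Conv2DLayer defaults) plus one bias per filter. The helper names are made up for illustration and are not Lasagne functions.

```python
def conv_output_size(input_size, filter_size, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * pad - filter_size) // stride + 1


def conv_param_count(in_channels, num_filters, filter_size):
    """Weights plus one bias per filter for a square 2D filter."""
    weights = num_filters * in_channels * filter_size * filter_size
    return weights + num_filters


# First Conv2DLayer: 3 input channels, 64 filters of size 3x3, 32x32 input.
size1 = conv_output_size(32, 3)
params1 = conv_param_count(3, 64, 3)
print(size1, params1)    # 30 1792

# Second Conv2DLayer: 64 input channels, 32 filters of size 3x3.
size2 = conv_output_size(size1, 3)
params2 = conv_param_count(64, 32, 3)
print(size2, params2)    # 28 18464
```

Note that the parameter count does not depend on the image size at all, only on the filter size and channel counts. That is the saving that weight sharing buys.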

### Pool2DLayer

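This layer type describes pooling layers, which cut down the number of values passed on to the rest of the network. A pooling layer summarizes each small window of its input with a single value (the maximum of the window when `mode='max'`, as in the example above), which reduces the complexity of the data before the final prediction is made.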
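
To see what max pooling does to the data, here is a minimal pure-Python sketch on a single 2D grid. It assumes that windows which do not fit entirely inside the input are skipped (I believe this corresponds to Lasagne's default `ignore_border=True`). The function `max_pool_2d` is a made-up illustration, not the library's implementation.

```python
def max_pool_2d(image, pool_size, stride):
    """Max-pool a 2D list of numbers, skipping incomplete border windows."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for top in range(0, rows - pool_size + 1, stride):
        pooled_row = []
        for left in range(0, cols - pool_size + 1, stride):
            window = [
                image[r][c]
                for r in range(top, top + pool_size)
                for c in range(left, left + pool_size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


image = [
    [1, 3, 2, 0, 1],
    [4, 6, 5, 1, 0],
    [7, 2, 3, 3, 2],
    [0, 1, 4, 8, 5],
    [2, 2, 1, 0, 9],
]
# Same settings as the example network: 3x3 windows moved 2 steps at a time.
print(max_pool_2d(image, pool_size=3, stride=2))  # [[7, 5], [7, 9]]

# Under the same assumptions, a 28x28 feature map shrinks to 13x13.
blank = [[0] * 28 for _ in range(28)]
print(len(max_pool_2d(blank, pool_size=3, stride=2)))  # 13
```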

### DropoutLayer

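This layer type describes a dropout layer, which sets each of its inputs to 0 with some probability (0.5 in the example above, where it is created through the shortcut `lasagne.layers.dropout`). In other words, it randomly drops out certain inputs. Dropout is generally used as a regularizer: because the network cannot rely on any single input being present, it is less prone to overfitting. It is only wanted during training, which is why the test-time prediction above passes `deterministic=True` to `get_output`.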
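
A minimal pure-Python sketch of the idea follows. It assumes the surviving values are scaled up by `1 / (1 - p)` so that the expected sum stays the same (I believe this matches Lasagne's default rescaling, but treat that as an assumption). The function below is an illustration, not the library's code.

```python
import random


def dropout(values, p=0.5, deterministic=False, rng=None):
    """Zero each value with probability p and rescale the survivors.

    With deterministic=True the input is returned unchanged, which is the
    behaviour wanted at test time.
    """
    if deterministic or p == 0:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, p=0.5, rng=rng))             # some entries zeroed, the rest doubled
print(dropout(activations, p=0.5, deterministic=True))  # [1.0, 2.0, 3.0, 4.0]
```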

### DenseLayer

A DenseLayer is fully connected to the layer before it. It can be seen as the counterpart of the Conv2DLayer above: whereas convolutional layers are only sparsely connected through filters, dense layers are densely connected, and every output is connected to every input. In the example above, the dense layers are only used toward the end, since using them closer to the beginning, when there are far more nodes, would significantly increase the number of weights and the running time of the algorithm.
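
The sketch below shows a dense layer's forward pass in plain Python and compares how many parameters a 128-unit dense layer would need at two points in the example network. The feature-map sizes used (64 channels of 30x30 after the first convolution, 32 channels of 13x13 after pooling) are my own estimates assuming no padding and default strides, and the helper names are hypothetical.

```python
def dense_forward(inputs, weights, biases):
    """Fully-connected layer: every output sums over every input."""
    return [
        sum(x * w for x, w in zip(inputs, unit_weights)) + b
        for unit_weights, b in zip(weights, biases)
    ]


def dense_param_count(num_inputs, num_units):
    """One weight per input-output pair plus one bias per unit."""
    return num_inputs * num_units + num_units


# Tiny forward pass: 3 inputs, 2 output units (no nonlinearity applied).
outputs = dense_forward(
    inputs=[1.0, 2.0, 3.0],
    weights=[[0.5, 0.0, 1.0], [1.0, 1.0, 1.0]],  # one weight list per output unit
    biases=[0.0, -1.0],
)
print(outputs)  # [3.5, 5.0]

# 128 units right after the first convolution vs. after the pooling layer.
print(dense_param_count(64 * 30 * 30, 128))  # 7372928
print(dense_param_count(32 * 13 * 13, 128))  # 692352
```

Under these assumptions, placing the dense layer after pooling needs roughly a tenth of the weights it would need right after the first convolution.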