# Build Artificial Neural Networks

In BrainPy, artificial neural networks serve as components for building dynamical systems. This section covers only how to build a neural network and how to train it.

The [brainpy.training.layers](../apis/auto/training/layers.rst) module provides classes that represent the layers of a neural network. All of them are subclasses of the ``brainpy.training.layers.Module`` base class.

In [1]:
import brainpy as bp
import brainpy.math as bm

bp.math.set_platform('cpu')

## Creating a layer

A layer can be created as an instance of a ``brainpy.training.layers.Module`` subclass. For example, a dense layer can be created as follows:

In [2]:
l = bp.layers.Dense(num_hidden=100, num_input=128) 

In [3]:
type(l)

brainpy.training.layers.dense.Dense

This creates a dense layer with 100 units that accepts 128-dimensional input.
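
To make the shapes concrete, the following NumPy sketch mimics the computation a dense layer of this size typically performs. It does not use BrainPy: the names `w`, `b`, and `dense_forward` are illustrative assumptions, not the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)

num_input, num_hidden = 128, 100

# Illustrative parameters: a weight matrix and a bias vector.
w = rng.normal(0.0, 0.01, size=(num_input, num_hidden))
b = np.zeros(num_hidden)


def dense_forward(x):
    """Affine map of a batch of inputs: (batch, 128) -> (batch, 100)."""
    return x @ w + b


x = rng.normal(size=(32, num_input))  # a batch of 32 input vectors
y = dense_forward(x)
print(y.shape)  # (32, 100)
```

Each of the 100 output units is a weighted sum of all 128 input values plus its own bias term.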

## Creating a network

Chaining layer instances together lets you specify the desired network structure.

One way to do this is to subclass ``brainpy.training.layers.Module``:

In [4]:
class MLP(bp.layers.Module):
    def __init__(self, n_in, n_l1, n_l2, n_out):
        super(MLP, self).__init__()
        
        self.l1 = bp.layers.Dense(num_hidden=n_l1, num_input=n_in)
        self.l2 = bp.layers.Dense(num_hidden=n_l2, num_input=n_l1)
        self.l3 = bp.layers.Dense(num_hidden=n_out, num_input=n_l2)
        
    def update(self, x):
        x = bm.relu(self.l1(x))
        x = bm.relu(self.l2(x))
        x = self.l3(x)
        return x

In [5]:
mlp1 = MLP(10, 50, 100, 2)

Alternatively, use ``brainpy.training.layers.Sequential``:

In [6]:
mlp2 = bp.layers.Sequential(
    l1=bp.layers.Dense(num_hidden=50, num_input=10),
    r1=bp.layers.Activation('relu'), 
    l2=bp.layers.Dense(num_hidden=100, num_input=50),
    r2=bp.layers.Activation('relu'), 
    l3=bp.layers.Dense(num_hidden=2, num_input=100),
)
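
The idea behind a sequential container can be sketched in plain Python: named layers are stored in order, and the output of each becomes the input of the next. `TinySequential`, `scale`, and `relu` below are hypothetical stand-ins for illustration only and say nothing about how BrainPy implements its own container.

```python
def relu(x):
    """Element-wise rectifier on a list of floats."""
    return [max(0.0, v) for v in x]


def scale(factor):
    """Return a toy 'layer' that multiplies every element by `factor`."""
    return lambda x: [factor * v for v in x]


class TinySequential:
    """Hypothetical container that applies named layers in insertion order."""

    def __init__(self, **layers):
        # Keyword arguments keep their order (Python 3.7+).
        self.layers = dict(layers)

    def __call__(self, x):
        for layer in self.layers.values():
            x = layer(x)
        return x


net = TinySequential(l1=scale(2.0), r1=relu, l2=scale(0.5))
out = net([1.0, -3.0])
print(out)  # [1.0, 0.0]
```

Compared with subclassing, this style trades flexibility (no branching or custom logic between layers) for brevity.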

## Naming a layer

For convenience, you can name a layer by specifying the ``name`` keyword argument:

In [7]:
l_hidden = bp.layers.Dense(num_hidden=50, num_input=10, name='hidden_layer')

## Initializing parameters

Many types of layers, such as ``brainpy.training.layers.Dense``, have trainable parameters. These are referred to by short names that match the conventions used in modern deep learning literature. For example, a weight matrix is usually called *w*, and a bias vector is usually called *b*.

When you create a layer with trainable parameters, a ``TrainVar`` is created for each of them and initialized automatically. You can optionally specify your own initialization strategy by using keyword arguments that match the parameter variable names. For example:

In [8]:
l = bp.layers.Dense(num_hidden=50, num_input=10, w=bp.initialize.Normal(0.01))

The weight matrix *w* of this dense layer will be initialized using samples from a normal distribution with standard deviation 0.01 (see [brainpy.initialize](../apis/auto/training/initialize.rst) for more information).

There are several ways to manually initialize parameters:

- Tensors

If a tensor variable instance is provided, this is used unchanged as the parameter variable. For example:

In [9]:
w = bm.random.normal(0, 0.01, size=(10, 50))
bp.layers.Dense(num_hidden=50, num_input=10, w=w)

<brainpy.training.layers.dense.Dense at 0x1b157eeca90>

- Callables

If a callable is provided (e.g. a function or a ``brainpy.training.initialize.Initializer`` instance), the callable will be called with the desired shape to generate suitable initial parameter values. The variable is then initialized with those values. For example:

In [10]:
bp.layers.Dense(num_hidden=50, num_input=10, w=bp.initialize.Normal(0.01))

<brainpy.training.layers.dense.Dense at 0x1b1606a4520>

Or, using a custom initialization function:

In [11]:
def init_w(shape):
    return bm.random.normal(0, 0.01, shape)

bp.layers.Dense(num_hidden=50, num_input=10, w=init_w)

<brainpy.training.layers.dense.Dense at 0x1b161763520>

Some types of parameter variables can also be set to ``None`` at initialization (e.g. biases). In that case, the parameter variable will be omitted. For example, creating a dense layer without biases is done as follows:

In [12]:
bp.layers.Dense(num_hidden=50, num_input=10, b=None)

<brainpy.training.layers.dense.Dense at 0x1b1606a4fd0>
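
The three cases described above (a tensor, a callable, or ``None``) can be summarized in a small NumPy sketch. `resolve_param` is a hypothetical helper written for this illustration; it is not a BrainPy function, and the shape check it performs is an added assumption rather than documented library behavior.

```python
import numpy as np


def resolve_param(spec, shape):
    """Hypothetical helper: turn a parameter spec into an array, or None."""
    if spec is None:
        return None  # the parameter is omitted, e.g. a layer without bias
    if callable(spec):
        value = np.asarray(spec(shape))  # initializer is called with the shape
    else:
        value = np.asarray(spec)  # an existing array is used as given
    if value.shape != tuple(shape):
        # Assumed sanity check, added for this sketch only.
        raise ValueError(f'expected shape {tuple(shape)}, got {value.shape}')
    return value


rng = np.random.default_rng(0)

w_array = rng.normal(0, 0.01, size=(10, 50))
w_from_array = resolve_param(w_array, (10, 50))
w_from_fn = resolve_param(lambda shape: rng.normal(0, 0.01, size=shape), (10, 50))
b_omitted = resolve_param(None, (50,))
```

Accepting all three forms through one keyword argument keeps the layer constructor small while still allowing full control over the initial values.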

## Setting up training

Here, we show an example of training an MLP to classify MNIST images.

In [13]:
import numpy as np
import tensorflow as tf

# Data
(X_train, Y_train), (X_test, Y_test) = tf.keras.datasets.mnist.load_data()
num_train, num_test = X_train.shape[0], X_test.shape[0]
num_dim = bp.tools.size2num(X_train.shape[1:])
X_train = np.asarray(X_train.reshape((num_train, num_dim)) / 255.0, dtype=bm.float_)
X_test = np.asarray(X_test.reshape((num_test, num_dim)) / 255.0, dtype=bm.float_)
Y_train = np.asarray(Y_train.flatten(), dtype=bm.float_)
Y_test = np.asarray(Y_test.flatten(), dtype=bm.float_)

In [14]:
model = MLP(n_in=num_dim, n_l1=256, n_l2=128, n_out=10)

In [15]:
opt = bm.optimizers.Momentum(lr=1e-3, train_vars=model.train_vars())
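
For readers unfamiliar with momentum, the NumPy sketch below shows one common formulation of the update rule on a toy quadratic loss. The exact rule and default coefficients used by ``bm.optimizers.Momentum`` may differ; `momentum_step` and the value `momentum=0.9` are assumptions made for this illustration.

```python
import numpy as np


def momentum_step(w, v, grad, lr=1e-3, momentum=0.9):
    """One classical momentum update; returns the new weights and velocity."""
    v = momentum * v - lr * grad  # velocity accumulates past gradients
    return w + v, v


# Toy problem: minimize loss(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = momentum_step(w, v, grad=w, lr=0.1)

print(np.linalg.norm(w) < 1e-2)  # True
```

The velocity term smooths successive gradient steps, which often speeds up convergence compared with plain gradient descent.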

In [16]:
gv = bm.grad(lambda X, Y: bm.losses.cross_entropy_loss(model(X), Y),
             dyn_vars=model.vars(),
             grad_vars=model.train_vars(),
             return_value=True)

In [17]:
@bm.jit
@bm.function(nodes=(model, opt))
def train(x, y):
    grads, loss = gv(x, y)
    opt.update(grads=grads)
    return loss

In [18]:
predict = bm.jit(lambda X: bm.softmax(model(X)), dyn_vars=model.vars())
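
The loss and prediction steps can be illustrated without BrainPy. The NumPy sketch below assumes that the network outputs unnormalized scores (logits) and that labels are integer class indices; `softmax` and `cross_entropy` here are illustrative re-implementations, not the ``bm`` functions themselves.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(logits, labels):
    """Mean negative log-probability assigned to the correct class."""
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked))


logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])

probs = softmax(logits)        # each row sums to 1
pred = probs.argmax(axis=1)    # most probable class per sample
loss = cross_entropy(logits, labels)
```

Because softmax preserves the ordering of the scores, taking ``argmax`` of the probabilities selects the same class as taking ``argmax`` of the raw outputs.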

In [19]:
# Training
num_batch = 128  # number of samples per mini-batch
for epoch in range(30):
    # Train: visit the training set once, in a freshly shuffled order
    loss = []
    sel = np.arange(len(X_train))
    np.random.shuffle(sel)
    for it in range(0, X_train.shape[0], num_batch):
        batch = sel[it:it + num_batch]
        l = train(X_train[batch], Y_train[batch])
        loss.append(l)

    # Eval: compare the most probable class with the true label
    test_predictions = predict(X_test).argmax(1)
    accuracy = np.array(test_predictions).flatten() == Y_test
    print(f'Epoch {epoch + 1:4d}  Train Loss {np.mean(loss):.3f}  Test Accuracy {100 * np.mean(accuracy):.3f}')

Epoch    1  Train Loss 1.248  Test Accuracy 86.880
Epoch    2  Train Loss 0.475  Test Accuracy 89.680
Epoch    3  Train Loss 0.371  Test Accuracy 90.870
Epoch    4  Train Loss 0.328  Test Accuracy 91.660
Epoch    5  Train Loss 0.300  Test Accuracy 92.410
Epoch    6  Train Loss 0.279  Test Accuracy 92.740
Epoch    7  Train Loss 0.263  Test Accuracy 93.170
Epoch    8  Train Loss 0.249  Test Accuracy 93.410
Epoch    9  Train Loss 0.236  Test Accuracy 93.740
Epoch   10  Train Loss 0.225  Test Accuracy 93.780
Epoch   11  Train Loss 0.215  Test Accuracy 94.020
Epoch   12  Train Loss 0.207  Test Accuracy 94.270
Epoch   13  Train Loss 0.198  Test Accuracy 94.490
Epoch   14  Train Loss 0.191  Test Accuracy 94.550
Epoch   15  Train Loss 0.184  Test Accuracy 94.760
Epoch   16  Train Loss 0.177  Test Accuracy 94.860
Epoch   17  Train Loss 0.171  Test Accuracy 95.020
Epoch   18  Train Loss 0.166  Test Accuracy 95.100
Epoch   19  Train Loss 0.160  Test Accuracy 95.300
Epoch   20  Train Loss 0.155  T