You will build a simple network and test it on <a href="http://wwwold.ece.utep.edu/research/webfuzzy/docs/kk-thesis/kk-thesis-html/node19.html">the XOR problem</a>. Follow the steps below carefully to set up the topology of the network, build a dataset, and train the network.

## Building a Network

In PyBrain, networks are composed of Modules which are connected with Connections. You can think of a network as a directed acyclic graph, where the nodes are Modules and the edges are Connections. This makes PyBrain very flexible, but that level of control is not necessary in all cases.
For the common case there is a simple way to create networks, the <tt>buildNetwork</tt> shortcut:

In [7]:
from pybrain.tools.shortcuts import buildNetwork
net = buildNetwork(2, 3, 1)

This call returns a network with two input neurons, three hidden neurons and a single output neuron. In PyBrain, these layers are <tt>Module</tt> objects, and they are already connected with <tt>FullConnection</tt> objects.
The net is already initialized with random values, so we can calculate its output right away.
For this we use the <tt>.activate()</tt> method, which expects a list, tuple or array as input:

In [8]:
net.activate([2, 1])

array([ 1.05990648])
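
To get a feeling for what <tt>.activate()</tt> computes, here is a minimal sketch in plain Python of a forward pass through a 2-3-1 network: a linear input layer, a sigmoid hidden layer and a single linear output neuron. This is not PyBrain's implementation. The names <tt>forward</tt>, <tt>w_hidden</tt> and <tt>w_out</tt> are made up for illustration, bias terms are left out for brevity, and the Gaussian weight initialization is an assumption.

```python
import math
import random


def sigmoid(x):
    """Logistic squashing function applied by the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, w_out):
    """Forward pass of a bias-free network with one hidden layer.

    w_hidden holds one weight list per hidden neuron (one weight per input).
    w_out holds one weight per hidden neuron for the single linear output.
    """
    # The linear input layer passes the inputs through unchanged.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
              for row in w_hidden]
    # The linear output neuron is a plain weighted sum of the hidden values.
    return [sum(w * h for w, h in zip(w_out, hidden))]


random.seed(0)  # fixed seed so the sketch is reproducible
# Assumption: weights are drawn from a standard normal distribution.
w_hidden = [[random.gauss(0, 1) for _ in range(2)] for _ in range(3)]
w_out = [random.gauss(0, 1) for _ in range(3)]

output = forward([2, 1], w_hidden, w_out)
print(output)
```

Because the weights are random, the printed number will differ from the PyBrain output above; only the shape of the result (a single value) is comparable.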

How can we examine the structure of our network more closely? In PyBrain, every part of a network has a name by which you can access it. When building networks with the <tt>buildNetwork</tt> shortcut, the parts are named automatically:

In [9]:
net['in']

<LinearLayer 'in'>

In [10]:
net['hidden0']

<SigmoidLayer 'hidden0'>

In [11]:
net['out']

<LinearLayer 'out'>

The hidden layers have numbers at the end so that they can be told apart.
Of course, we want more flexibility when building networks. For instance, the hidden layer is constructed with the sigmoid squashing function by default, but in many cases this is not what we want. We can also supply different types of layers:

In [12]:
from pybrain.structure import TanhLayer
net = buildNetwork(2, 3, 1, hiddenclass=TanhLayer)
net['hidden0']

<TanhLayer 'hidden0'>
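
To see why the choice of hidden class matters, the following plain-Python sketch compares the two squashing functions on a few inputs. It uses only the standard <tt>math</tt> module; the helper <tt>sigmoid</tt> is written here for illustration and is not taken from PyBrain.

```python
import math


def sigmoid(x):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# tanh squashes into (-1, 1) and is centred on zero, unlike the sigmoid.
for x in (-2.0, 0.0, 2.0):
    print("x=%5.1f  sigmoid=%.4f  tanh=%.4f" % (x, sigmoid(x), math.tanh(x)))
```

The sigmoid never produces negative values, while tanh is symmetric around zero.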

There is more we can do. For example, we can also set a different class for the output layer:

In [13]:
from pybrain.structure import SoftmaxLayer
net = buildNetwork(2, 3, 2, hiddenclass=TanhLayer, outclass=SoftmaxLayer)
net.activate((2, 3))

array([ 0.79462493,  0.20537507])
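
Note that the two outputs above sum to 1: a softmax layer turns raw scores into a set of positive values that can be read as probabilities. A minimal sketch of that normalization in plain Python follows. The function <tt>softmax</tt> and the example scores are illustrative assumptions, not PyBrain code.

```python
import math


def softmax(values):
    """Map raw scores to positive numbers that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0])  # arbitrary example scores
print(probs)
print(sum(probs))
```

The larger score receives the larger share, and the shares always add up to 1 (up to floating point rounding).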

We can also tell the network to use a bias:

In [14]:
net = buildNetwork(2, 3, 1, bias=True)
net['bias']

<BiasUnit 'bias'>

This approach has some restrictions, of course: for example, we can only construct a feedforward topology. But PyBrain also makes it possible to create very sophisticated architectures, and this is one of the library's strengths.

## Building a DataSet

In order for our networks to learn anything, we need a dataset that contains inputs and targets. PyBrain has the <tt>pybrain.datasets</tt> package for this, and we will use the <tt>SupervisedDataSet</tt> class for our needs.
<p>The <tt>SupervisedDataSet</tt> class is used for standard supervised learning. It supports input and target values, whose sizes we have to specify on object creation:

In [15]:
from pybrain.datasets import SupervisedDataSet
ds = SupervisedDataSet(2, 1)

Here we have created a dataset that supports two-dimensional inputs and one-dimensional targets.
<p>
A classic example for neural network training is the XOR function, so let's build a dataset for it. We do this by adding samples to the dataset one at a time:

In [16]:
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))
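
The four samples above are simply the truth table of XOR, so they can also be generated instead of typed by hand. The sketch below builds them in plain Python; the list name <tt>samples</tt> is made up for illustration, and the call to <tt>ds.addSample</tt> only appears in a comment because the sketch does not depend on PyBrain.

```python
from itertools import product

# Enumerate every pair of binary inputs and use XOR (the ^ operator) as target.
samples = [((a, b), (a ^ b,)) for a, b in product((0, 1), repeat=2)]

for inpt, target in samples:
    print("%s -> %s" % (inpt, target))
    # With the dataset created above, one would call:
    # ds.addSample(inpt, target)
```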

We now have a dataset with 4 samples in it. We can check that with Python's idiomatic way of checking the size of something:

In [17]:
len(ds)

4

We can also iterate over it in the standard way:

In [18]:
for inpt, target in ds:
    print inpt, target

[ 0.  0.] [ 0.]
[ 0.  1.] [ 1.]
[ 1.  0.] [ 1.]
[ 1.  1.] [ 0.]


We can access the input and target fields directly as arrays:

In [19]:
ds['input']

array([[ 0.,  0.],
       [ 0.,  1.],
       [ 1.,  0.],
       [ 1.,  1.]])

In [20]:
ds['target']

array([[ 0.],
       [ 1.],
       [ 1.],
       [ 0.]])

## Training the Network

For adjusting parameters of modules in supervised learning, PyBrain has the concept of trainers. Trainers take a module and a dataset in order to train the module to fit the data in the dataset.
<p>A classic example for training is backpropagation. PyBrain comes with backpropagation, of course, and we will use the <tt>BackpropTrainer</tt> here:

In [21]:
from pybrain.supervised.trainers import BackpropTrainer

We have already built a dataset for XOR, and we have also learned to build networks that can handle such problems. Let's connect the two with a trainer:

In [22]:
net = buildNetwork(2, 3, 1,  hiddenclass=TanhLayer, bias=True)
trainer = BackpropTrainer(net, ds, momentum=0.9)

The trainer now knows about the network and the dataset and we can train the net on the data:

In [23]:
trainer.train()

0.81810168570349662

This call trains the net for one full epoch and returns a number proportional to the error (the sum of squared differences between the targets and the outputs). Training for another epoch should usually reduce that error further:

In [24]:
trainer.train()

0.6073940747412232
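
To make the returned number less mysterious, here is a minimal sketch of a squared-error measure over a whole dataset. The function name <tt>epoch_error</tt> is hypothetical, and the exact scaling used here (a factor of 0.5 and an average over the samples) is an assumption, not something confirmed by the text above.

```python
def epoch_error(outputs, targets):
    """Average over all samples of half the summed squared differences.

    outputs and targets are lists of equally long value lists, one per sample.
    The 0.5 factor and the averaging are assumptions about the scaling.
    """
    total = 0.0
    for out, tgt in zip(outputs, targets):
        total += 0.5 * sum((t - o) ** 2 for o, t in zip(out, tgt))
    return total / len(outputs)


# Network outputs that are all 0.5 on the four XOR samples.
outputs = [[0.5], [0.5], [0.5], [0.5]]
targets = [[0.0], [1.0], [1.0], [0.0]]
print(epoch_error(outputs, targets))
```

Whatever the exact scaling, the value shrinks towards zero as the outputs approach the targets.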

If we want to train the network until convergence, we must loop for several epochs:

In [26]:
for epoch in range(100):
    error = trainer.train()
print(error)

4.81421633717e-07
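
A fixed number of epochs is only a guess. A common alternative is to loop until the error falls below a threshold, with an upper limit on the number of epochs so that the loop always terminates. The sketch below shows that pattern in plain Python. The names <tt>train_until</tt> and <tt>fake_train_step</tt> are hypothetical, and the fake step merely stands in for a call such as <tt>trainer.train</tt> so that the example runs without PyBrain.

```python
def train_until(train_step, threshold, max_epochs=10000):
    """Call train_step() until the error it returns drops below threshold.

    Returns (epochs_run, last_error). The loop stops after max_epochs so
    that a network which never reaches the threshold cannot loop forever.
    """
    error = float("inf")
    epochs = 0
    while error >= threshold and epochs < max_epochs:
        error = train_step()
        epochs += 1
    return epochs, error


# Stand-in for a real training step: the error halves on every call.
state = {"error": 1.0}


def fake_train_step():
    state["error"] *= 0.5
    return state["error"]


epochs, error = train_until(fake_train_step, 1e-3)
print("%d epochs, final error %g" % (epochs, error))
```

With the trainer from above, one would presumably pass <tt>trainer.train</tt> as the first argument instead of the fake step.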


## Exercise

Write a script that performs all of these steps:
<ul>
<li>create a network
<li>create a dataset for the XOR problem
<li>display the output of the network <b>before</b> training, for all the samples of the data set
<li>train the network until the error is less than 1e-06
<li>display the output of the network <b>after</b> training, for all the samples of the data set
</ul>