The _Python_ package [neurolab](https://github.com/zueve/neurolab) provides a working environment for artificial neural networks (ANN).

# Perceptrons

We can implement a neural network with a single layer of perceptrons (apart from the input units) using the _neurolab_ package by calling the function `newp`, which returns a network object. To do so, we need to provide the following parameters:
* `minmax`: a list with the same length as the number of input neurons. The $i$-th element on this list is a list of two numbers, indicating the range of input values for the $i$-th neuron.
* `cn`: number of output neurons.
* `transf`: activation function (default value is threshold).

Therefore, when we choose 1 as the value of parameter `cn`, we will be representing a simple perceptron having as many inputs as the length of the list associated to `minmax`.

Let us start by creating a simple perceptron with two inputs, both of them ranging in $[0, 1]$, and with threshold activation function.

In [1]:
from neurolab import net

perceptron = net.newp(minmax=[[0, 1], [0, 1]], cn=1)

The instance that we just created has the following attributes:
* `inp_minmax`: range of input values.
* `co`: number of output neurons.
* `trainf`: training function (the only one specific for single-layer perceptrons is the Delta rule).
* `errorf`: error function (default value is half of SSE, _sum of squared errors_)

The layers of the neural network (input layer does not count, thus in our example there is only one) are stored in a list associated with the attribute `layers`. Each layer is an instance of the class `Layer` and has the following attributes:
* `ci`: number of inputs.
* `cn`: number of neurons on it.
* `co`: number of outputs.
* `np`: dictionary with an element `'b'` that stores an array with the neurons' biases (terms $a_0 w_0$, default value is 0) and an element `'w'` that stores an array with the weights of the incoming connections of each neuron (default value is 0).

In [2]:
print(perceptron.inp_minmax)
print(perceptron.co)
print(perceptron.trainf)
print(perceptron.errorf)
layer = perceptron.layers[0]
print(layer.ci)
print(layer.cn)
print(layer.co)
print(layer.np)

[[0. 1.]
 [0. 1.]]
1
Trainer(TrainDelta)
<neurolab.error.SSE object at 0x0000025EA79D7D90>
2
1
1
{'w': array([[0., 0.]]), 'b': array([0.])}


Next, let us train the perceptron so that it models the logic gate _and_.

First of all, let us define the training set. We do so by giving, on the one hand, an array or list of lists with the input values of the examples and, on the other hand, a separate array or list of lists with the expected output for each example.

In [3]:
import numpy

input_values = numpy.array([[0, 0], [0, 1], [1, 0], [1, 1]])
expected_outcomes = numpy.array([[0], [0], [0], [1]])

The method `step` allows us to calculate the output of the neural network for a single example, and the method `sim` for all the examples.

In [4]:
perceptron.step([1, 1])

array([0.])

In [5]:
perceptron.sim(input_values)

array([[0.],
       [0.],
       [0.],
       [0.]])

Let us check the initial error of the perceptron, before training.

__Important__: the arguments of the error function must be arrays.

In [6]:
perceptron.errorf(expected_outcomes, perceptron.sim(input_values))

0.5

Let us next proceed to train the perceptron. We shall check that, as expected (since the training set is linearly separable), we are able to decrease the value of the error down to zero.

__Note__: the method `train`, which runs the training algorithm on the neural network, returns a list with the value of the network error after each _epoch_. An epoch is the set of operations performed by the training algorithm until all the examples of the training set have been considered once.

In [7]:
perceptron.train(input_values, expected_outcomes)

The goal of learning is reached


[0.5, 1.5, 1.0, 0.5, 1.0, 0.0]

In [8]:
print(perceptron.layers[0].np)
print(perceptron.errorf(expected_outcomes, perceptron.sim(input_values)))

{'w': array([[0.02, 0.01]]), 'b': array([-0.02])}
0.0
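
To see where these numbers come from, the following is a minimal sketch of Delta-rule training written in plain Python, without _neurolab_. It rests on assumptions that are not stated above: the threshold activation outputs 1 only for a strictly positive net input, the weights are updated after each example as $w_i \leftarrow w_i + \eta (t - o) x_i$ with learning rate $\eta = 0.01$, and training stops as soon as half of the SSE is at most 0.01. The names `threshold`, `predict`, `half_sse` and `train_delta` are illustrative and not part of the library.

```python
def threshold(x):
    # Assumed threshold activation: 1 for a strictly positive net input, else 0
    return 1.0 if x > 0 else 0.0

def predict(w, b, x):
    # Output of a single perceptron with weights w and bias b
    return threshold(sum(wi * xi for wi, xi in zip(w, x)) + b)

def half_sse(w, b, inputs, targets):
    # Half of the sum of squared errors over the whole training set
    return 0.5 * sum((t - predict(w, b, x)) ** 2
                     for x, t in zip(inputs, targets))

def train_delta(inputs, targets, lr=0.01, goal=0.01, max_epochs=100):
    w = [0.0] * len(inputs[0])  # weights start at zero
    b = 0.0                     # bias starts at zero
    errors = []
    for _ in range(max_epochs):
        err = half_sse(w, b, inputs, targets)
        errors.append(err)
        if err <= goal:         # halting criterion
            break
        for x, t in zip(inputs, targets):  # one epoch: every example once
            delta = t - predict(w, b, x)
            w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
            b += lr * delta
    return w, b, errors

and_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
and_targets = [0, 0, 0, 1]
w, b, errors = train_delta(and_inputs, and_targets)
print(errors)  # [0.5, 1.5, 1.0, 0.5, 1.0, 0.0]
print(w, b)    # [0.02, 0.01] -0.02
```

Under these assumptions the sketch goes through the same error sequence and ends with the same weights and bias as the cells above, which suggests that the library follows a similar update scheme.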


# Feed forward perceptrons

The _neurolab_ package implements a feed forward artificial neural network through the function `newff`, which returns a network object. To create one, we need to provide the following parameters:
* `minmax`: a list with the same length as the number of input neurons. The $i$-th element of this list is a list of two numbers, indicating the range of input values for the $i$-th neuron.
* `size`: a list with the same length as the number of layers (except the input layer). The $i$-th element of this list is a number, indicating the number of neurons of the $i$-th layer.
* `transf`: a list with the same length as the number of layers (except the input layer). The $i$-th element of this list is the activation function for the neurons of the $i$-th layer (default value is [hyperbolic tangent](https://en.wikipedia.org/wiki/Hyperbolic_functions)).

Next, let us create a neural network with two inputs ranging over $[0, 1]$, one hidden layer having two neurons and an output layer with only one neuron. All neurons should have the sigmoid function as activation function (you may look for further available activation functions at https://pythonhosted.org/neurolab/lib.html#module-neurolab.trans).

In [9]:
from neurolab import trans

sigmoid_act_fun = trans.LogSig()
my_net = net.newff(minmax=[[0, 1], [0, 1]], size=[2, 1], transf=[sigmoid_act_fun]*2)

The instance that we just created has the following attributes:
* `inp_minmax`: range of input values.
* `co`: number of output neurons.
* `trainf`: training function (default value is [Broyden–Fletcher–Goldfarb–Shanno algorithm](https://en.wikipedia.org/wiki/Broyden%E2%80%93Fletcher%E2%80%93Goldfarb%E2%80%93Shanno_algorithm)).
* `errorf`: error function (default value is half of SSE, _sum of squared errors_)

The layers of the neural network (input layer excluded) are stored in a list associated with the attribute `layers`. Each layer is an instance of the class `Layer` and has the following attributes:
* `ci`: number of inputs.
* `cn`: number of neurons on it.
* `co`: number of outputs.
* `np`: dictionary with an element `'b'` that stores an array with the neurons' biases (terms $a_0 w_0$) and an element `'w'` that stores an array with the weights of the incoming connections of each neuron. Default values for the biases and the weights are calculated following the [Nguyen-Widrow initialization algorithm](https://web.stanford.edu/class/ee373b/nninitialization.pdf).

In [10]:
print(my_net.inp_minmax)
print(my_net.co)
print(my_net.trainf)
print(my_net.errorf)
hidden_layer = my_net.layers[0]
print(hidden_layer.ci)
print(hidden_layer.cn)
print(hidden_layer.co)
print(hidden_layer.np)
output_layer = my_net.layers[1]
print(output_layer.ci)
print(output_layer.cn)
print(output_layer.co)
print(output_layer.np)

[[0. 1.]
 [0. 1.]]
1
Trainer(TrainBFGS)
<neurolab.error.SSE object at 0x0000025EA8E10040>
2
2
2
{'w': array([[-2.50793972,  7.51200628],
       [-6.108985  ,  5.03987126]]), 'b': array([-1.04426859, -2.89068423])}
2
1
1
{'w': array([[ 2.6167767 , -4.95100795]]), 'b': array([2.33423124])}


It is possible to modify the initialization of the biases and weights; the available initialization options are listed at https://pythonhosted.org/neurolab/lib.html#module-neurolab.init.<br>
Let us, for example, set all of them to zero with the following instructions:

In [11]:
from neurolab import init

for l in my_net.layers:
    l.initf = init.init_zeros
my_net.init()
print(hidden_layer.np)
print(output_layer.np)

{'w': array([[0., 0.],
       [0., 0.]]), 'b': array([0., 0.])}
{'w': array([[0., 0.]]), 'b': array([0.])}


It is also possible to modify the training algorithm; the implemented options are listed at https://pythonhosted.org/neurolab/lib.html#module-neurolab.train.<br>
Let us, for example, switch to _gradient descent backpropagation_ with the following instructions:

In [12]:
from neurolab import train

my_net.trainf = train.train_gd

Finally, we can also modify the error function used during training; the available options are listed at https://pythonhosted.org/neurolab/lib.html#module-neurolab.error.<br>
Let us, for example, choose the _mean squared error_ with the following instructions:

In [13]:
from neurolab import error

my_net.errorf = error.MSE()

Next, let us train our neural network so that it models the behaviour of the _xor_ logic gate.

First, we need to split our training set into two components: on the one hand, an array or list of lists with the input data of each example, *xor_in*, and on the other hand, an array or list of lists with the expected output for each example, *xor_out* (remember that this time the training set is **not** linearly separable).

In [14]:
xor_in = numpy.array([[0, 0], [0, 1], [1, 0], [1, 1]])
xor_out = numpy.array([[0], [1], [1], [0]])

Let us measure the error of the initial neural network before training starts:

In [15]:
print(my_net.sim(xor_in))
print(my_net.errorf(xor_out, my_net.sim(xor_in)))

[[0.5]
 [0.5]
 [0.5]
 [0.5]]
0.25
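
These values can be checked by hand. With every weight and bias equal to zero, each neuron receives a net input of 0, and the sigmoid of 0 is 0.5, so the network answers 0.5 whatever the input. The sketch below reproduces this in plain Python, without _neurolab_. The names `sigmoid`, `layer_output`, `forward` and `mse` are illustrative, and the mean squared error is assumed to be the mean of the squared differences between expected and actual outputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_output(weights, biases, inputs):
    # One row of weights per neuron, as in the 'w' arrays printed earlier
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(layers, inputs):
    # Feed the outputs of each layer into the next one
    for weights, biases in layers:
        inputs = layer_output(weights, biases, inputs)
    return inputs

def mse(targets, outputs):
    squared = [(t - o) ** 2 for t, o in zip(targets, outputs)]
    return sum(squared) / len(squared)

zero_net = [([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),  # hidden layer
            ([[0.0, 0.0]], [0.0])]                   # output layer
xor_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
xor_targets = [0, 1, 1, 0]
outputs = [forward(zero_net, x)[0] for x in xor_inputs]
print(outputs)                    # [0.5, 0.5, 0.5, 0.5]
print(mse(xor_targets, outputs))  # 0.25
```

Each example contributes a squared difference of $0.5^2 = 0.25$, so the mean is also 0.25.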


Let us now run the training process on the neural network. The training functions accept the following arguments:
* `lr`: _learning rate_, default value 0.01.
* `epochs`: maximum number of epochs, default value 500.
* `show`: number of epochs that should be executed between two messages in the output log, default value 100.
* `goal`: maximum error accepted (halting criterion), default value 0.01.

In [16]:
my_net.train(xor_in, xor_out, lr=0.1, epochs=50, show=10, goal=0.001)
my_net.sim(xor_in)

Epoch: 10; Error: 0.25;
Epoch: 20; Error: 0.25;
Epoch: 30; Error: 0.25;
Epoch: 40; Error: 0.25;
Epoch: 50; Error: 0.25;
The maximum number of train epochs is reached


array([[0.5],
       [0.5],
       [0.5],
       [0.5]])
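
The error does not move because, at the all-zero starting point, the gradient itself is zero. The following is a hand-written backpropagation sketch for a 2-2-1 sigmoid network in plain Python. It is not the library's implementation, and it assumes batch gradient descent, that is, the gradient is accumulated over the whole training set before the weights are updated. It computes the gradient of half the summed squared error, which differs from the gradient of the mean squared error only by a constant factor. All names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def batch_gradient(w_hid, b_hid, w_out, b_out, inputs, targets):
    """Gradient of half the summed squared error of a 2-2-1 sigmoid network."""
    g_w_hid = [[0.0, 0.0], [0.0, 0.0]]
    g_b_hid = [0.0, 0.0]
    g_w_out = [0.0, 0.0]
    g_b_out = 0.0
    for x, t in zip(inputs, targets):
        # Forward pass: hidden activations h and network output o
        h = [sigmoid(sum(w * xi for w, xi in zip(w_hid[j], x)) + b_hid[j])
             for j in range(2)]
        o = sigmoid(sum(w * hj for w, hj in zip(w_out, h)) + b_out)
        # Backward pass: error term of the output neuron
        d_out = (o - t) * o * (1 - o)
        g_b_out += d_out
        for j in range(2):
            g_w_out[j] += d_out * h[j]
            # Error term of hidden neuron j, propagated through w_out[j]
            d_hid = d_out * w_out[j] * h[j] * (1 - h[j])
            g_b_hid[j] += d_hid
            for i in range(2):
                g_w_hid[j][i] += d_hid * x[i]
    return g_w_hid, g_b_hid, g_w_out, g_b_out

xor_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
xor_targets = [0, 1, 1, 0]
grads = batch_gradient([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],
                       [0.0, 0.0], 0.0, xor_inputs, xor_targets)
print(grads)  # every component is zero
```

The hidden-layer gradients vanish because they are multiplied by the output weights, which are zero. The output-layer gradients vanish because all hidden activations are 0.5 and the positive and negative errors of the four _xor_ examples cancel out. A zero gradient leaves the weights unchanged for any learning rate, so a random initialization is needed to break this symmetry.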

Let us now try a different setting. If we reset the neural network and we choose random numbers as initial values for the weights, we obtain the following:

In [17]:
numpy.random.seed(3287426346)  # we set this init seed only for class, so that we always get
                               # the same random numbers and we can compare
my_net.reset()
for l in my_net.layers:
    l.initf = init.InitRand([-1, 1], 'bw')  # 'b' means biases will be modified,
                                            # and 'w' the weights
my_net.init()
my_net.train(xor_in, xor_out, lr=0.1, epochs=10000, show=1000, goal=0.001)
my_net.sim(xor_in)

Epoch: 1000; Error: 0.20624769635154383;
Epoch: 2000; Error: 0.11489105507175529;
Epoch: 3000; Error: 0.01232716148235405;
Epoch: 4000; Error: 0.005060602419320548;
Epoch: 5000; Error: 0.003064045758926594;
Epoch: 6000; Error: 0.0021673030896454206;
Epoch: 7000; Error: 0.00166543486478401;
Epoch: 8000; Error: 0.0013471000008847842;
Epoch: 9000; Error: 0.0011281639233705698;
The goal of learning is reached


array([[0.02815856],
       [0.96705024],
       [0.96701362],
       [0.03213734]])

# _Iris_ dataset

_Iris_ is a classic multivariate dataset that has been exhaustively studied and has become a standard reference when analysing the behaviour of different machine learning algorithms.

_Iris_ gathers four measurements (length and width of sepal and petal) of 50 flowers of each one of the following three species of lilies: _Iris setosa_, _Iris virginica_ and _Iris versicolor_.

Let us start by reading the data from the file `iris.csv` that has been provided together with the practice. It suffices to evaluate the following expressions:

In [18]:
import pandas

iris = pandas.read_csv('iris.csv', header=None,
                       names=['Sepal length', 'sepal width',
                              'petal length', 'petal width',
                              'Species'])
iris.head(10)  # Display ten first examples

   Sepal length  sepal width  petal length  petal width      Species
0           5.1          3.5           1.4          0.2  Iris-setosa
1           4.9          3.0           1.4          0.2  Iris-setosa
2           4.7          3.2           1.3          0.2  Iris-setosa
3           4.6          3.1           1.5          0.2  Iris-setosa
4           5.0          3.6           1.4          0.2  Iris-setosa
5           5.4          3.9           1.7          0.4  Iris-setosa
6           4.6          3.4           1.4          0.3  Iris-setosa
7           5.0          3.4           1.5          0.2  Iris-setosa
8           4.4          2.9           1.4          0.2  Iris-setosa
9           4.9          3.1           1.5          0.1  Iris-setosa


Next, let us replace the species names with a numerical encoding.<br>
Then, we distribute the examples into two groups, training and test, and split each group into two components: input and expected output (goal).

In [19]:
# This piece of code might raise an error with some versions of sklearn
#from sklearn import preprocessing
#from sklearn import model_selection

#iris_training, iris_test = model_selection.train_test_split(
#    iris, test_size=.33, random_state=2346523,
#    stratify=iris['Species'])

#ohe = preprocessing.OneHotEncoder(sparse=False)
#input_training = iris_training.iloc[:, :4]
#goal_training = ohe.fit_transform(iris_training['Species'].values.reshape(-1, 1))
#input_test = iris_test.iloc[:, :4]
#goal_test = ohe.transform(iris_test['Species'].values.reshape(-1, 1))


#################
#try this instead if the previous does not work
import pandas
from sklearn import preprocessing
from sklearn import model_selection

iris2 = pandas.read_csv('iris_enc.csv', header=None,
                       names=['Sepal length', 'sepal width',
                              'petal length', 'petal width',
                              'Species'])
#iris2.head(10)  # Display ten first examples
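iris_training, iris_test = model_selection.train_test_split(
    iris2, test_size=.33, random_state=2346523,
    stratify=iris2['Species'])  # keep the proportion of each species in both groups

# In scikit-learn >= 1.2 this argument is named sparse_output instead of sparse
ohe = preprocessing.OneHotEncoder(sparse=False)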

input_training = iris_training.iloc[:, :4]
goal_training = ohe.fit_transform(iris_training['Species'].values.reshape(-1, 1))
goal_training[:10]  # this displays the 10 first expected output vectors (goal)
                    # associated with the training set examples

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 1.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 0., 1.],
       [1., 0., 0.],
       [0., 0., 1.],
       [1., 0., 0.]])
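
Each row above has a single 1, whose position identifies the species of the example. As a minimal sketch of what this _one-hot_ encoding does, the following plain Python function builds the same kind of vectors, assuming the categories are taken in sorted order. The function `one_hot` and the sample labels are illustrative and do not come from the dataset files.

```python
def one_hot(labels):
    # Assign one position to each distinct category, in sorted order
    categories = sorted(set(labels))
    position = {category: i for i, category in enumerate(categories)}
    encoded = []
    for label in labels:
        row = [0.0] * len(categories)
        row[position[label]] = 1.0  # mark the position of this label
        encoded.append(row)
    return categories, encoded

species = ['setosa', 'versicolor', 'virginica', 'virginica', 'setosa']
categories, encoded = one_hot(species)
print(categories)  # ['setosa', 'versicolor', 'virginica']
for row in encoded:
    print(row)
```

With three species, a network with three output neurons can then be trained to activate the neuron that corresponds to the right species.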

In [20]:
input_test = iris_test.iloc[:, :4]
goal_test = ohe.transform(iris_test['Species'].values.reshape(-1,1))

In [21]:
print(input_training.head(10))
print(goal_training[:10])

     Sepal length  sepal width  petal length  petal width
45            4.8          3.0           1.4          0.3
50            7.0          3.2           4.7          1.4
136           6.3          3.4           5.6          2.4
124           6.7          3.3           5.7          2.1
8             4.4          2.9           1.4          0.2
121           5.6          2.8           4.9          2.0
133           6.3          2.8           5.1          1.5
49            5.0          3.3           1.4          0.2
131           7.9          3.8           6.4          2.0
11            4.8          3.4           1.6          0.2
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]
 [0. 0. 1.]
 [1. 0. 0.]
 [0. 0. 1.]
 [0. 0. 1.]
 [1. 0. 0.]
 [0. 0. 1.]
 [1. 0. 0.]]


In [22]:
print(input_test.head(10))
print(goal_test[0:10])

     Sepal length  sepal width  petal length  petal width
57            4.9          2.4           3.3          1.0
32            5.2          4.1           1.5          0.1
87            6.3          2.3           4.4          1.3
17            5.1          3.5           1.4          0.3
79            5.7          2.6           3.5          1.0
138           6.0          3.0           4.8          1.8
73            6.1          2.8           4.7          1.2
82            5.8          2.7           3.9          1.2
33            5.5          4.2           1.4          0.2
2             4.7          3.2           1.3          0.2
[[0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [1. 0. 0.]]


__Exercise 1__: define a function **lily_species** that, given an array with three numbers as input, returns the position of the maximum value.

In [23]:
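def lily_species1(a):
    # Explicit search: return the first index that holds the maximum value
    max_value = max(a)
    for i in range(len(a)):
        if a[i] == max_value:
            return i

def lily_species2(a):
    # Convert to a list and use its index method
    b = list(a)
    return b.index(max(b))

def lily_species3(a):
    # Let numpy find the position of the maximum
    return numpy.argmax(a)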

print(lily_species1(numpy.array([2, 5, 0])))
print(lily_species2(numpy.array([2, 5, 0])))
print(lily_species3(numpy.array([2, 5, 0])))

1
1
1


__Exercise 2__: Create a feed forward neural network having the following features:
1. Has four input neurons, one for each attribute of the iris dataset.
2. Has three output neurons, one for each species.
3. Has one hidden layer with two neurons.
4. All neurons of all layers use the sigmoid as activation function.
5. The initial biases and weights are all equal to zero.
6. Training method is gradient descent backpropagation.
7. The error function is the mean squared error.

Once you have created it, train the network over the sets `input_training` and `goal_training`.

In [24]:
netEx2 = net.newff(minmax=[[4.0, 8.5], [1.5, 5.0], [0.5, 7.5], [0.0, 3.0]],
                   size=[2, 3],  # two hidden neurons, three output neurons
                   transf=[sigmoid_act_fun, sigmoid_act_fun])
for l in netEx2.layers:
    l.initf = init.init_zeros  # all biases and weights start at zero

netEx2.init()

netEx2.trainf = train.train_gd

netEx2.errorf = error.MSE()
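
The `minmax` argument needs one `[min, max]` pair per input attribute, and the cell above writes these ranges by hand. As an alternative, the ranges could be derived from the data. The sketch below shows the idea in plain Python on three rows copied from the training examples printed earlier; the function name `column_ranges` is illustrative, and ranges taken from a sample this small would of course be too narrow for the full dataset.

```python
def column_ranges(rows):
    # Transpose the rows into columns and take the extremes of each column
    columns = zip(*rows)
    return [[min(column), max(column)] for column in columns]

rows = [[4.8, 3.0, 1.4, 0.3],
        [7.0, 3.2, 4.7, 1.4],
        [6.3, 3.4, 5.6, 2.4]]
print(column_ranges(rows))
# [[4.8, 7.0], [3.0, 3.4], [1.4, 5.6], [0.3, 2.4]]
```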

In [25]:
netEx2.init()

netEx2.train(input_training, goal_training, lr=0.1, epochs=50, show=10, goal=0.001)

netEx2.sim(input_training)

Epoch: 10; Error: 0.22058139862224968;
Epoch: 20; Error: 0.18035997103431817;
Epoch: 30; Error: 0.14500117995857245;
Epoch: 40; Error: 0.12488101563573291;
Epoch: 50; Error: 0.11810431266169673;
The maximum number of train epochs is reached


array([[0.81377302, 0.20256418, 0.09580015],
       [0.1541085 , 0.3726433 , 0.41682074],
       [0.05371086, 0.44790842, 0.59019955],
       [0.05519951, 0.44599777, 0.58599176],
       [0.81049929, 0.20349237, 0.09692265],
       [0.0569165 , 0.44385531, 0.58125429],
       [0.06364244, 0.43602665, 0.56378069],
       [0.8180967 , 0.20132377, 0.09431232],
       [0.05704024, 0.44370335, 0.58091751],
       [0.81570945, 0.20201073, 0.09513457],
       [0.08810825, 0.41305686, 0.51129527],
       [0.11920787, 0.39139886, 0.46076201],
       [0.81821718, 0.20128896, 0.09427077],
       [0.81955271, 0.20090221, 0.09380987],
       [0.82026445, 0.20069541, 0.09356398],
       [0.81485332, 0.20225583, 0.09542899],
       [0.15985321, 0.36993406, 0.41049478],
       [0.05510545, 0.44611701, 0.58625484],
       [0.14763911, 0.37580516, 0.42421449],
       [0.05184145, 0.45038209, 0.59562265],
       [0.05466093, 0.44668318, 0.5875031 ],
       [0.81388685, 0.20253174, 0.09576106],
       [0.

__Exercise 3__: Calculate the performance of the network trained in the previous exercise on the sets `input_test` and `goal_test`. That is, calculate the fraction of the test set that the network classifies correctly.<br>
__Hint:__ To translate the output of the network into the predicted species, use the function from exercise 1.

In [26]:
print(goal_test[0:10])
print(input_test.head(10))

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [1. 0. 0.]]
     Sepal length  sepal width  petal length  petal width
57            4.9          2.4           3.3          1.0
32            5.2          4.1           1.5          0.1
87            6.3          2.3           4.4          1.3
17            5.1          3.5           1.4          0.3
79            5.7          2.6           3.5          1.0
138           6.0          3.0           4.8          1.8
73            6.1          2.8           4.7          1.2
82            5.8          2.7           3.9          1.2
33            5.5          4.2           1.4          0.2
2             4.7          3.2           1.3          0.2
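
One possible way to organise this computation is sketched below in plain Python. The output vectors are made up for illustration (they are not results of the trained network), and the names `predicted_class` and `accuracy` are hypothetical. The fraction of correct classifications is obtained by comparing, for each example, the position of the largest network output with the position of the 1 in the goal vector.

```python
def predicted_class(output):
    # Position of the largest value, as in exercise 1
    return max(range(len(output)), key=lambda i: output[i])

def accuracy(outputs, goals):
    # Fraction of examples whose predicted class matches the goal class
    hits = sum(predicted_class(o) == predicted_class(g)
               for o, g in zip(outputs, goals))
    return hits / len(goals)

# Illustrative network outputs and one-hot goals for four examples
outputs = [[0.81, 0.20, 0.10],
           [0.15, 0.37, 0.42],
           [0.05, 0.45, 0.59],
           [0.06, 0.44, 0.56]]
goals = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [0.0, 0.0, 1.0]]
print(accuracy(outputs, goals))  # 0.75
```

Here the second example is misclassified (the largest output is in position 2, while the goal is position 1), so three out of four predictions are correct.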


__Exercise 4__: try to create different variants of the network from exercise 2, by modifying the number of hidden layers and/or the number of neurons per layer, so that the performance over the test set improves.