# Training and Testing of Artificial Neural Networks

In [1]:
import numpy as np

## MADALINE

This is a two-layer artificial neural network in which only the first weight layer is trainable; the second layer is kept fixed. It can also have only one output unit. `MADALINE.py` contains a general model for **MADALINE**. First, we import the model:

In [2]:
from MADALINE import MADALINE

Next, we will define an instance of the model with 2 input units and 2 hidden units:

In [3]:
model = MADALINE(2, 2)

We can inspect all the variables that `model` contains:

In [4]:
print(vars(model))

{'W1': array([[-22.10280183,  -8.31868127],
       [  7.86545624,  -2.47535272]]), 'sum': array([[0.],
       [0.]]), 'W2': array([[-3.30342294, -2.93909068]]), 'b1': array([[ 7.97640437],
       [12.14962394]]), 'b2': array([[-1.52114289]]), 't': 0, 'vactivation': <numpy.vectorize object at 0x0000029F58A48A58>}


From the output above, we can see that the model has 4 main member variables, namely **W1**, **W2**, **b1** and **b2**. These correspond to the weights of the first layer, the weights of the second layer, the biases of the hidden layer and the bias of the output layer. Apart from these, there is one more member variable, **sum**, which stores the sums calculated for the hidden layer during the forward pass; these sums are used later to update the weights.

To be able to do the calculations manually, I will reinitialize the weights of **model** so that they are not random:

In [5]:
model.W1 = np.array([[0.1, 0.2], [0.3, 0.4]])
model.b1 = np.array([[0.5], [0.5]])
model.W2 = np.array([[0.5, 0.6]])
model.b2 = np.array([[0.5]])

In [6]:
print(vars(model))

{'W1': array([[0.1, 0.2],
       [0.3, 0.4]]), 'sum': array([[0.],
       [0.]]), 'W2': array([[0.5, 0.6]]), 'b1': array([[0.5],
       [0.5]]), 'b2': array([[0.5]]), 't': 0, 'vactivation': <numpy.vectorize object at 0x0000029F58A48A58>}


I am going to train the model to mimic an XOR gate, so we need to prepare data to pass through the neural net for it to learn. Inputs and targets use the bipolar values 1 and -1:

In [7]:
data = [(np.array([[1], [1]]), np.array([[-1]])), 
        (np.array([[1], [-1]]), np.array([[1]])), 
        (np.array([[-1], [1]]), np.array([[1]])), 
        (np.array([[-1], [-1]]), np.array([[-1]]))]
print(data)

[(array([[1],
       [1]]), array([[-1]])), (array([[ 1],
       [-1]]), array([[1]])), (array([[-1],
       [ 1]]), array([[1]])), (array([[-1],
       [-1]]), array([[-1]]))]


Time to train the MADALINE net. To show the weights after every update, I will train the net on one example at a time (one epoch each, with a learning rate of 0.5) instead of training on the whole dataset for multiple epochs at once.

In [8]:
for d in data:
    model.train([d], 0.5, 1)
    print("sums:", model.sum)
    print("weights(trainable):", model.W1)
    print("Bias (trainable):", model.b1)

Epoch:  1
Accuracy: 0.0
sums: [[0.8]
 [1.2]]
weights(trainable): [[-0.8 -0.7]
 [-0.8 -0.7]]
Bias (trainable): [[-0.4]
 [-0.6]]
Epoch:  1
Accuracy: 0.0
sums: [[-0.5]
 [-0.7]]
weights(trainable): [[-0.05 -1.45]
 [-0.8  -0.7 ]]
Bias (trainable): [[ 0.35]
 [-0.6 ]]
Epoch:  1
Accuracy: 0.0
sums: [[-1.05]
 [-0.5 ]]
weights(trainable): [[-0.05 -1.45]
 [-1.55  0.05]]
Bias (trainable): [[0.35]
 [0.15]]
Epoch:  1
Accuracy: 0.0
sums: [[1.85]
 [1.65]]
weights(trainable): [[ 1.375 -0.025]
 [-0.225  1.375]]
Bias (trainable): [[-1.075]
 [-1.175]]


These readings cover a single pass over all the examples in the dataset. I have calculated the weights by hand, and they match the values above.
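
The printed sums, weights and biases are consistent with the classic MRI training rule for MADALINE. The following is a minimal sketch of that rule, not the code in `MADALINE.py`: the function names `bipolar_step` and `mri_step` are hypothetical, and a bipolar threshold activation (+1 for a net input >= 0, otherwise -1) is assumed. Under those assumptions it reproduces the four updates shown above.

```python
import numpy as np

def bipolar_step(z):
    """Assumed activation: +1 for z >= 0, otherwise -1."""
    return np.where(z >= 0, 1, -1)

def mri_step(W1, b1, W2, b2, x, t, lr):
    """One MRI-style update of the trainable first layer for one example."""
    z_in = W1 @ x + b1                       # hidden net inputs (the "sums")
    y = bipolar_step(W2 @ bipolar_step(z_in) + b2)[0, 0]
    W1, b1 = W1.copy(), b1.copy()
    if y != t:                               # update only on a wrong output
        if t == 1:
            # push the unit whose net input is closest to zero towards +1
            units = [int(np.argmin(np.abs(z_in)))]
        else:
            # push every unit with a positive net input towards -1
            units = [int(j) for j in np.flatnonzero(z_in > 0)]
        for j in units:
            step = lr * (t - z_in[j, 0])
            W1[j] += step * x[:, 0]
            b1[j, 0] += step
    return z_in, W1, b1

# Same starting weights and XOR data as in the cells above
W1 = np.array([[0.1, 0.2], [0.3, 0.4]])
b1 = np.array([[0.5], [0.5]])
W2 = np.array([[0.5, 0.6]])
b2 = np.array([[0.5]])
xor_data = [([1, 1], -1), ([1, -1], 1), ([-1, 1], 1), ([-1, -1], -1)]

for inputs, target in xor_data:
    x = np.array(inputs, dtype=float).reshape(2, 1)
    z_in, W1, b1 = mri_step(W1, b1, W2, b2, x, target, lr=0.5)
    print("sums:", z_in.ravel(), "W1:", W1.tolist(), "b1:", b1.ravel())
```

For the first example, the sums are 0.1 + 0.2 + 0.5 = 0.8 and 0.3 + 0.4 + 0.5 = 1.2. The target is -1, so both units move by 0.5 * (-1 - sum), which gives the first printed weight matrix.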

In [9]:
print("weights for second layer:", model.W2)

weights for second layer: [[0.5 0.6]]


We can see that the first-layer weights have changed, whereas the weights of the second layer are the same. Now, we will train the model for 3 epochs and see the result:

In [10]:
model.train(data, 0.5, 3)

Epoch:  1
Accuracy: 50.0
Epoch:  2
Accuracy: 100.0
Epoch:  3
Accuracy: 100.0


Here, we can see that the model has reached 100% accuracy.
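
As a side check on the fixed second layer, the sketch below evaluates the output unit for every combination of hidden outputs, using the `W2` and `b2` values set earlier. It assumes a bipolar threshold activation (+1 for a net input >= 0, otherwise -1), which is an assumption and not taken from `MADALINE.py`. With these values the output unit behaves like a logical OR of the two hidden units, so training only has to shape the hidden units.

```python
import numpy as np

# Fixed second-layer parameters, as set explicitly in the notebook
W2 = np.array([[0.5, 0.6]])
b2 = np.array([[0.5]])

outputs = []
for h1, h2 in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    h = np.array([[h1], [h2]])
    net = (W2 @ h + b2)[0, 0]
    # assumed bipolar threshold activation
    outputs.append(1 if net >= 0 else -1)
    print((h1, h2), "->", outputs[-1])
```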

## Kohonen Neural Network

This is an unsupervised neural network. It can be used to cluster the given inputs into a specified number of outputs. To form the clusters, this model uses the **Euclidean distance**.

In [11]:
from Kohonen import KohonenNN

In [12]:
model1 = KohonenNN(2, 2)

In [13]:
print(vars(model1))

{'W': array([[-0.77761275,  0.02104756],
       [-0.51500453, -0.83832322]])}


The only member variable of `KohonenNN` is **W**, which corresponds to the weights between the input layer and the output layer and is used to calculate the **Euclidean distance**.

As with the previous model, we will again set the weights explicitly to see the behaviour properly:

In [14]:
model1.W = np.array([[0.1, 0.2], [0.3, 0.4]])
print(model1.W)

[[0.1 0.2]
 [0.3 0.4]]


First, we need to prepare the data for the model. We will use the same inputs as above, but here we do not need the output (desired) values.

In [15]:
data = [np.array([[1], [1]]), 
        np.array([[1], [-1]]),
        np.array([[-1], [1]]),
        np.array([[-1], [-1]])]
print(data)

[array([[1],
       [1]]), array([[ 1],
       [-1]]), array([[-1],
       [ 1]]), array([[-1],
       [-1]])]


I have designed the model so that we can get the clusters directly. Next, I will show the weights after each input and update, for one epoch:

In [16]:
for d in data:
    print(model1.cluster([d], learning_rate=0.5, R=1, R_decay=0.5, lr_decay= 0.5, epochs=1))
    print("Weights:", model1.W)

Epoch: 1
Overall Euclidean distance value: 2.3
Topological param: 1
Learning rate: 0.25

#----------------------------------------------------------------#

Clusters:
cluster  1
[]
cluster  2
[array([[1],
       [1]])]
None
Weights: [[0.1 0.6]
 [0.3 0.7]]
Epoch: 1
Overall Euclidean distance value: 5.55
Topological param: 1
Learning rate: 0.25

#----------------------------------------------------------------#

Clusters:
cluster  1
[array([[ 1],
       [-1]])]
cluster  2
[]
None
Weights: [[ 0.55  0.6 ]
 [-0.35  0.7 ]]
Epoch: 1
Overall Euclidean distance value: 6.875000000000001
Topological param: 1
Learning rate: 0.25

#----------------------------------------------------------------#

Clusters:
cluster  1
[]
cluster  2
[array([[-1],
       [ 1]])]
None
Weights: [[ 0.55 -0.2 ]
 [-0.35  0.85]]
Epoch: 1
Overall Euclidean distance value: 6.8875
Topological param: 1
Learning rate: 0.25

#----------------------------------------------------------------#

Clusters:
cluster  1
[array([[-1],
  

I have calculated the weights by hand, and they match the values above.
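
The hand calculation can be sketched as a winner-take-all update. This is a minimal sketch under assumptions, not the code in `Kohonen.py`: the name `kohonen_step` is hypothetical, each column of `W` is treated as one cluster unit, only the winning unit is updated, and the printed "Overall Euclidean distance value" is taken to be the sum of squared distances from the input to both units. Those assumptions reproduce the first three printed results.

```python
import numpy as np

def kohonen_step(W, x, lr):
    """Move the closest cluster unit (a column of W) towards the input x."""
    sq_dist = np.sum((W - x) ** 2, axis=0)   # squared distance to each column
    winner = int(np.argmin(sq_dist))
    W = W.copy()
    W[:, winner] += lr * (x[:, 0] - W[:, winner])
    return W, winner, float(sq_dist.sum())

W = np.array([[0.1, 0.2], [0.3, 0.4]])
history = []
for inputs in [[1, 1], [1, -1], [-1, 1]]:
    x = np.array(inputs, dtype=float).reshape(2, 1)
    W, winner, total = kohonen_step(W, x, lr=0.5)
    history.append((winner + 1, total))
    print("cluster", winner + 1, "total squared distance:", round(total, 4))
    print("Weights:", W.tolist())
```

For the first input, the squared distances are 0.81 + 0.49 = 1.3 to the first unit and 0.64 + 0.36 = 1.0 to the second, so the second unit wins and moves halfway towards the input.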

Now, we will train the model for 10 epochs and see the clusters formed:

In [17]:
model1.cluster(data, learning_rate=0.5, R=1, R_decay=0.5, lr_decay= 0.5, epochs=10)

Epoch: 1
Overall Euclidean distance value: 6.29296875
Topological param: 1
Learning rate: 0.25

#----------------------------------------------------------------#

Epoch: 2
Overall Euclidean distance value: 5.551797485351563
Topological param: 1
Learning rate: 0.125

#----------------------------------------------------------------#

Epoch: 3
Overall Euclidean distance value: 5.434660343080759
Topological param: 1
Learning rate: 0.0625

#----------------------------------------------------------------#

Epoch: 4
Overall Euclidean distance value: 5.401260014719867
Topological param: 1
Learning rate: 0.03125

#----------------------------------------------------------------#

Epoch: 5
Overall Euclidean distance value: 5.388931253025987
Topological param: 1
Learning rate: 0.015625

#----------------------------------------------------------------#

Epoch: 6
Overall Euclidean distance value: 5.383692340321264
Topological param: 1
Learning rate: 0.0078125

#---------------------------------

Check out the clusters formed.

## Backprop Neural Network

This neural network uses the backpropagation algorithm for learning. This is the most efficient neural network among those listed here. Learning is driven by the error between the actual (predicted) output and the target (desired) output.

In [18]:
from BackpropNN import BackpropNN

In [19]:
model2 = BackpropNN(2, 2, 1, 0.5)

In [20]:
print(vars(model2))

{'Wi': array([[ -9.92664624, -15.43047047],
       [ 16.54829646,  10.834122  ]]), 'bi': array([[-2.31126544],
       [ 0.16810834]]), 'Wh': array([[-0.03756395, -1.78244269]]), 'bh': array([[8.58771292]]), 's': 0.5}


Again, we will set the weights explicitly:

In [21]:
model2.Wi = np.array([[0.1, 0.2], [0.3, 0.4]])
model2.bi = np.array([[-0.5], [-0.5]])
model2.Wh = np.array([[0.5, 0.6]])
model2.bh = np.array([[-0.5]])

Prepared data for XOR:

In [22]:
data = [(np.array([[1], [1]]), np.array([[0]])),
       (np.array([[1], [-1]]), np.array([[1]])),
       (np.array([[-1], [1]]), np.array([[1]])),
       (np.array([[-1], [-1]]), np.array([[0]]))
       ]
print(data)

[(array([[1],
       [1]]), array([[0]])), (array([[ 1],
       [-1]]), array([[1]])), (array([[-1],
       [ 1]]), array([[1]])), (array([[-1],
       [-1]]), array([[0]]))]


In the next block, I show the weights after passing each example:

In [23]:
for d in data:
    model2.train([d], learning_rate=0.05, epochs=1)
    print("input to hidden weights:", model2.Wi)
    print("hidden layer bias:", model2.bi)
    print("hidden to output weights:", model2.Wh)
    print("output layer bias:", model2.bh)

[[0.00789391 0.00789391]
 [0.0094727  0.0094727 ]]
Epoch: 1
Loss: 0.2566049211303281

#----------------------------------------------------------------#

input to hidden weights: [[0.0996053  0.1996053 ]
 [0.29952637 0.39952637]]
hidden layer bias: [[-0.5003947 ]
 [-0.50047363]]
hidden to output weights: [[0.49699268 0.59667639]]
output layer bias: [[-0.50633093]]
[[-0.00767012  0.00767012]
 [-0.00920849  0.00920849]]
Epoch: 1
Loss: 0.25514734621233925

#----------------------------------------------------------------#

input to hidden weights: [[0.09998881 0.1992218 ]
 [0.29998679 0.39906594]]
hidden layer bias: [[-0.50001119]
 [-0.50001321]]
hidden to output weights: [[0.49967906 0.59936272]]
output layer bias: [[-0.50001758]]
[[ 0.00774002 -0.00774002]
 [ 0.00928404 -0.00928404]]
Epoch: 1
Loss: 0.2506734199359892

#----------------------------------------------------------------#

input to hidden weights: [[0.09960181 0.1996088 ]
 [0.29952259 0.39953014]]
hidden layer bias: [[-0.499

I have calculated the weights by hand, and they match the values above.
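
The first update can be reproduced with a short sketch. This is a reconstruction under assumptions, not the code in `BackpropNN.py`: the names `sigmoid` and `backprop_step` are hypothetical, the activation is assumed to be a sigmoid with steepness `s = 0.5` (the fourth constructor argument), and the loss is assumed to be the squared error `(y - t)**2`. With those assumptions, the first printed loss, gradient matrix and weights all match.

```python
import numpy as np

def sigmoid(z, s):
    """Sigmoid with steepness s (assumed activation)."""
    return 1.0 / (1.0 + np.exp(-s * z))

def backprop_step(Wi, bi, Wh, bh, x, t, s, lr):
    """One gradient-descent step on the squared error for one example."""
    h = sigmoid(Wi @ x + bi, s)                  # hidden activations
    y = sigmoid(Wh @ h + bh, s)                  # network output
    loss = float(np.sum((y - t) ** 2))
    d_out = 2 * (y - t) * s * y * (1 - y)        # dLoss / d(output net input)
    d_hid = (Wh.T @ d_out) * s * h * (1 - h)     # dLoss / d(hidden net input)
    grad_Wi = d_hid @ x.T
    new_params = (Wi - lr * grad_Wi, bi - lr * d_hid,
                  Wh - lr * d_out @ h.T, bh - lr * d_out)
    return new_params, grad_Wi, loss

# Same starting weights as in the cell above, first XOR example (target 0)
Wi = np.array([[0.1, 0.2], [0.3, 0.4]])
bi = np.array([[-0.5], [-0.5]])
Wh = np.array([[0.5, 0.6]])
bh = np.array([[-0.5]])
x = np.array([[1.0], [1.0]])
t = np.array([[0.0]])

(Wi, bi, Wh, bh), grad_Wi, loss = backprop_step(Wi, bi, Wh, bh, x, t, s=0.5, lr=0.05)
print(grad_Wi)
print("Loss:", loss)
print("input to hidden weights:", Wi)
print("hidden to output weights:", Wh)
```

Note that the hidden-layer gradient uses the hidden-to-output weights from before the update, which is what the printed gradient matrix corresponds to.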

Here, the output lies between 0 and 1, so we need to check the accuracy separately using the `predict` function, thresholding the prediction at 0.5:

In [24]:
accuracy = 0
for (x, y) in data:
    y_hat = 1 if model2.predict(x) >= 0.5 else 0
    print("prediction:", y_hat, "desired:", y[0, 0])
    if y_hat == y[0, 0]: accuracy += 1  # compare with the scalar target
print("accuracy:", (accuracy/len(data))*100)

prediction: 1 desired: 0
prediction: 0 desired: 1
prediction: 0 desired: 1
prediction: 0 desired: 0
accuracy: 25.0


The accuracy is low, so we need to train more. I will train the model for 5 epochs on the whole dataset, this time with a learning rate of 5, and then we will check the accuracy again:

In [25]:
model2.train(data, learning_rate=5, epochs=5)

[[0.0001491  0.00012687]
 [0.00028411 0.00025743]]
Epoch: 1
Loss: 0.2501214164702893

#----------------------------------------------------------------#

[[0.0001487  0.00012655]
 [0.00028292 0.00025643]]
Epoch: 2
Loss: 0.25011872659549883

#----------------------------------------------------------------#

[[0.00014824 0.00012614]
 [0.00028162 0.00025528]]
Epoch: 3
Loss: 0.25011662276670493

#----------------------------------------------------------------#

[[0.00014773 0.00012566]
 [0.00028025 0.00025404]]
Epoch: 4
Loss: 0.2501148840434798

#----------------------------------------------------------------#

[[0.00014718 0.00012513]
 [0.00027881 0.00025272]]
Epoch: 5
Loss: 0.2501133749335815

#----------------------------------------------------------------#



In [26]:
accuracy = 0
for (x, y) in data:
    y_hat = 1 if model2.predict(x) >= 0.5 else 0
    print("prediction:", y_hat, "desired:", y[0, 0])
    if y_hat == y[0, 0]: accuracy += 1  # compare with the scalar target
print("accuracy:", (accuracy/len(data))*100)

prediction: 1 desired: 0
prediction: 0 desired: 1
prediction: 1 desired: 1
prediction: 0 desired: 0
accuracy: 50.0


Here we can see that the accuracy has increased from 25% to 50%. In the same way, we have to keep tuning the weights and training again until we reach the desired accuracy.