<a href="https://colab.research.google.com/github/anaustinbeing/neural-networks/blob/main/backpropagation_(iris).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Implementing backpropagation to train the weights of a neural network

### Loading the dataset:

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris_dataset = load_iris()
X = iris_dataset['data']
y = iris_dataset['target']

print(X.shape, y.shape)

(150, 4) (150,)


Splitting the dataset into training and test sets (90% for training, 10% for testing):

In [None]:
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.1)
x_train.shape, y_train.shape, x_test.shape, y_test.shape

((135, 4), (135,), (15, 4), (15,))
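
Because `train_test_split` shuffles the data, the split (and every result that depends on it) changes from run to run unless a seed is fixed, for example through its `random_state` parameter. The following is a minimal NumPy-only sketch of that idea, not scikit-learn's implementation; the helper name `shuffled_split`, the stand-in data and the seed value are assumptions made for illustration.

```python
import numpy as np

def shuffled_split(X, y, test_size=0.1, seed=0):
    """Hypothetical helper: shuffle row indices with a fixed seed, then cut off a test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    n_test = int(round(test_size * X.shape[0]))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Stand-in data with the same shapes as the Iris arrays (150 samples, 4 features)
X_demo = np.arange(600, dtype=float).reshape(150, 4)
y_demo = np.repeat([0, 1, 2], 50)

a_train, a_test, b_train, b_test = shuffled_split(X_demo, y_demo)
print(a_train.shape, b_train.shape, a_test.shape, b_test.shape)

# Calling the helper again with the same seed reproduces the same split
a_train2, a_test2, b_train2, b_test2 = shuffled_split(X_demo, y_demo)
print(np.array_equal(a_test, a_test2))
```

With a fixed seed the same 15 samples land in the test set on every run, which makes the reported accuracy repeatable.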

In [None]:
y_train

array([1, 0, 1, 1, 2, 0, 1, 0, 0, 1, 1, 2, 2, 2, 1, 0, 0, 2, 1, 0, 0, 1,
       2, 2, 1, 1, 0, 2, 2, 2, 2, 2, 1, 2, 2, 1, 0, 0, 2, 2, 0, 2, 1, 2,
       0, 2, 0, 0, 1, 2, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 1, 2, 0, 2,
       0, 1, 0, 2, 2, 2, 0, 1, 0, 2, 0, 1, 2, 1, 1, 0, 0, 1, 1, 0, 2, 1,
       1, 0, 0, 2, 2, 1, 0, 0, 1, 2, 0, 2, 2, 2, 1, 2, 2, 1, 2, 0, 0, 2,
       0, 0, 2, 0, 1, 0, 2, 0, 0, 1, 0, 0, 2, 1, 0, 2, 2, 1, 1, 0, 1, 0,
       1, 0, 2])

### Implementing the BackPropagation algorithm:

Training the network using the training data (`x_train` and `y_train`):

In [None]:
import numpy as np

# Sigmoid activation function, applied element-wise
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def backprop(W1, W2, X, D):
    """Run one epoch of per-sample weight updates and return the updated W1, W2."""
    alpha = 0.9  # learning rate
    N = X.shape[0]  # number of training samples (was hard-coded to 135)
    for k in range(N):
        x = X[k, :]  # inputs from training data
        # one-hot encode the integer class label, e.g. 1 -> [0, 1, 0]
        d = np.eye(W2.shape[0])[D[k]]
        ##########################
        # Forward propagation step
        ##########################
        # calculate the weighted sum of the hidden nodes
        v1 = np.dot(W1, x)
        # pass the weighted sum to the activation function; this gives the hidden-layer outputs
        y1 = sigmoid(v1)
        # calculate the weighted sum of the output layer
        v = np.dot(W2, y1)
        # pass it to the activation function; this returns the output of the third layer
        y = sigmoid(v)
        # calculate the error: difference between the correct output and the computed output
        e = d - y
        # calculate delta: derivative of the activation function times the error
        # note that 𝜎′(𝑥) = 𝜎(𝑥)∙(1 − 𝜎(𝑥)) = y * (1 - y)
        delta = y * (1 - y) * e  # element-wise multiplication
        ###########################
        # Backward propagation step
        ###########################
        # propagate the output delta backward and calculate the deltas of the hidden layer
        e1 = np.dot(W2.T, delta)
        delta1 = y1 * (1 - y1) * e1  # element-wise multiplication
        # adjust the weights according to the learning rule;
        # np.outer builds the (5, 4) and (3, 5) update matrices without reshaping
        W1 = W1 + alpha * np.outer(delta1, x)
        W2 = W2 + alpha * np.outer(delta, y1)
    return W1, W2

# initialize the weights between the input layer and the hidden layer, uniformly in [-1, 1)
W1 = 2 * np.random.rand(5, 4) - 1
# initialize the weights between the hidden layer and the output layer, uniformly in [-1, 1)
W2 = 2 * np.random.rand(3, 5) - 1
print(W1.shape, W2.shape)
# run the backprop algorithm to train the weights
for epoch in range(1, 1000):  # 999 epochs over the training data
    W1, W2 = backprop(W1, W2, x_train, y_train)
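
The weight updates in `backprop` can be read as gradient descent on the squared error E = ½‖d − y‖² of a single sample. The sketch below checks that reading numerically: it compares the update directions built from `delta` and `delta1` with a finite-difference gradient of E. The loss function, the random test sample, the seed and the helper `numerical_grad` are assumptions introduced for this check; they are not part of the training code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def loss(W1, W2, x, d):
    """Squared error of one sample (assumed loss for this check)."""
    y = sigmoid(W2 @ sigmoid(W1 @ x))
    return 0.5 * np.sum((d - y) ** 2)

def numerical_grad(f, W, eps=1e-6):
    """Hypothetical helper: central-difference gradient of f() with respect to W."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = f()
        W[idx] = old - eps
        minus = f()
        W[idx] = old  # restore the original weight
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
W1 = 2 * rng.random((5, 4)) - 1
W2 = 2 * rng.random((3, 5)) - 1
x = rng.random(4)                 # made-up input sample
d = np.array([0.0, 1.0, 0.0])     # one-hot target for class 1

# Same forward and backward computations as in backprop (without alpha)
y1 = sigmoid(W1 @ x)
y = sigmoid(W2 @ y1)
delta = y * (1 - y) * (d - y)
delta1 = y1 * (1 - y1) * (W2.T @ delta)
step_W1 = np.outer(delta1, x)
step_W2 = np.outer(delta, y1)

grad_W1 = numerical_grad(lambda: loss(W1, W2, x, d), W1)
grad_W2 = numerical_grad(lambda: loss(W1, W2, x, d), W2)

# Each update direction should equal the negative gradient of the loss
print(np.allclose(step_W1, -grad_W1, atol=1e-7))
print(np.allclose(step_W2, -grad_W2, atol=1e-7))
```

If both comparisons hold, each update `W + alpha * step` moves the weights downhill on the per-sample error, with `alpha` acting as the step size.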



Printing the weight matrices after training:

In [None]:
W1, W2

(array([[-22.06251282, -15.34686957,  28.23229195,  29.80824248],
        [ 21.39238287,  14.29741113, -21.42533827, -37.9360814 ],
        [  0.5895355 ,   2.81921785,  -4.51679652,  -1.47606093],
        [ -1.27612237,   0.19053071,  -0.42267135,   0.73349472],
        [-11.98576905, -10.79213252,  13.70770794,  20.47061534]]),
 array([[-5.28511788, -5.16201004, 10.53377644,  0.78178917, -5.86125507],
        [-3.41974898,  3.73617261, -8.92031611,  0.81532679, -2.40380057],
        [ 3.42462065, -3.74400213, -2.86919184,  0.71624717,  2.4070266 ]]))

`W1` is the weight matrix between the input layer (4 neurons) and the hidden layer (5 neurons). The shape of `W1` is (5, 4).

`W2` is the weight matrix between the hidden layer (5 neurons) and the output layer (3 neurons). The shape of `W2` is (3, 5).
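
To make the shapes concrete, here is a small sketch of one forward pass with randomly initialised weights of the same shapes. The weights, the seed and the input measurements are illustrative assumptions, not the trained values printed above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1_demo = 2 * rng.random((5, 4)) - 1   # hidden layer: 5 neurons, 4 inputs each
W2_demo = 2 * rng.random((3, 5)) - 1   # output layer: 3 neurons, 5 inputs each

# Illustrative measurements: sepal length, sepal width, petal length, petal width (cm)
x_demo = np.array([5.1, 3.5, 1.4, 0.2])

hidden = sigmoid(W1_demo @ x_demo)     # (5, 4) @ (4,) -> (5,)
output = sigmoid(W2_demo @ hidden)     # (3, 5) @ (5,) -> (3,)

print(hidden.shape, output.shape)
```

Each row of a weight matrix holds the incoming weights of one neuron, which is why the row count matches the layer being computed and the column count matches the layer feeding it.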

### Testing the network against test data:

Testing on test data (`x_test` and `y_test`):

In [None]:
outputs = []
for k in range(x_test.shape[0]):
    x = x_test[k, :]
    d = y_test[k]
    # forward pass through the hidden layer
    v1 = np.dot(W1, x)
    y1 = sigmoid(v1)
    # calculate the weighted sum of the output layer
    v = np.dot(W2, y1)
    # pass it to the activation function; this returns the output of the third layer
    y = sigmoid(v)
    # round each output to 0 or 1 and read off the predicted class
    output = np.rint(y)
    output_class = 0 if output[0] == 1 else 1 if output[1] == 1 else 2
    print('\nThe iris species is:', end=' ')
    # label order in load_iris: 0 = setosa, 1 = versicolor, 2 = virginica
    print('Setosa' if output_class == 0
          else 'Versicolor' if output_class == 1
          else 'Virginica', '\nThe class value in the dataset is ', d)
    outputs.append(output_class)


The iris species is: Virginica 
The class value in the dataset is  2

The iris species is: Virginica 
The class value in the dataset is  2

The iris species is: Virginica 
The class value in the dataset is  2

The iris species is: Setosa 
The class value in the dataset is  0

The iris species is: Versicolor 
The class value in the dataset is  1

The iris species is: Virginica 
The class value in the dataset is  2

The iris species is: Versicolor 
The class value in the dataset is  1

The iris species is: Virginica 
The class value in the dataset is  2

The iris species is: Versicolor 
The class value in the dataset is  2

The iris species is: Versicolor 
The class value in the dataset is  2

The iris species is: Versicolor 
The class value in the dataset is  1

The iris species is: Setosa 
The class value in the dataset is  0

The iris species is: Versicolor 
The class value in the dataset is  1

The iris species is: Versicolor 
The class value in the dataset is  1

The iris species is:

There are three neurons in the output layer. All layers use the sigmoid activation function, so the output of each neuron is a value between 0 and 1.

We apply `np.rint()` to the output values to round them to the nearest whole number. If the rounded output is `[1, 0, 0]`, the class is "Setosa" (label 0). If it is `[0, 1, 0]`, the class is "Versicolor" (label 1). If it is `[0, 0, 1]`, the class is "Virginica" (label 2). This follows the label order used by `load_iris`. Note that the code falls back to class 2 whenever neither of the first two outputs rounds to 1.
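
Rounding works when exactly one output is close to 1, but it can behave unexpectedly when no output reaches 0.5. The sketch below compares the rounding rule from the test loop with picking the largest output via `np.argmax`. The `argmax` variant and the example output vectors are assumptions added for comparison; they are not what the notebook's results were produced with.

```python
import numpy as np

def decode_rint(y):
    """Decoding rule from the test loop: round, then check the first two outputs."""
    output = np.rint(y)
    return 0 if output[0] == 1 else 1 if output[1] == 1 else 2

def decode_argmax(y):
    """Alternative rule (an assumption, not the notebook's method): take the largest output."""
    return int(np.argmax(y))

confident = np.array([0.05, 0.90, 0.10])   # one output clearly dominates
uncertain = np.array([0.20, 0.45, 0.40])   # no output reaches 0.5

print(decode_rint(confident), decode_argmax(confident))  # 1 1
print(decode_rint(uncertain), decode_argmax(uncertain))  # 2 1
```

For the uncertain vector every output rounds to 0, so the rounding rule falls through to class 2, while `argmax` selects class 1, the neuron with the highest activation.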

Finding how accurate the predictions are:

In [None]:
print('Predicted outputs: ', outputs)
print('Test data target column values (original): ', y_test)
print('Comparing the predicted output with test data target column values: ')
print(outputs == y_test)

Predicted outputs:  [2, 2, 2, 0, 1, 2, 1, 2, 1, 1, 1, 0, 1, 1, 0]
Test data target column values (original):  [2 2 2 0 1 2 1 2 2 2 1 0 1 1 0]
Comparing the predicted output with test data target column values: 
[ True  True  True  True  True  True  True  True False False  True  True
  True  True  True]


The output shows the predicted classes alongside the true labels from the test set.

Comparing them, 13 of the 15 predictions are correct and two are wrong. Both misclassified samples belong to class 2 and were predicted as class 1.

In [None]:
accuracy_score(y_test, outputs)

0.8666666666666667

The accuracy is 13/15 ≈ 86.67%.
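
The accuracy figure can be reproduced by hand from the two arrays printed earlier. This is a minimal sketch that copies the predicted and true labels from the recorded output; the variable names are chosen for illustration.

```python
import numpy as np

# Values copied from the recorded output of this run
predicted = np.array([2, 2, 2, 0, 1, 2, 1, 2, 1, 1, 1, 0, 1, 1, 0])
actual = np.array([2, 2, 2, 0, 1, 2, 1, 2, 2, 2, 1, 0, 1, 1, 0])

correct = int(np.sum(predicted == actual))   # number of matching labels
accuracy = correct / len(actual)             # fraction of correct predictions

print(correct, len(actual), round(accuracy, 4))  # 13 15 0.8667

# Positions of the misclassified test samples
wrong = np.flatnonzero(predicted != actual)
print(wrong)  # [8 9]
```

With only 15 test samples, each misclassification changes the accuracy by about 6.7 percentage points, so the figure will vary noticeably between random splits.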