
# Backpropagation - Supervised Learning

Import libraries

In [19]:
import numpy as np
import pandas as pd

Load data

In [20]:
iris = pd.read_csv("https://raw.githubusercontent.com/G-Karishni/Pattern_Recognition/main/Iris.csv")
iris = iris.sample(frac=1).reset_index(drop=True)

In [21]:
X = iris[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
X = np.array(X)
X[:5]

array([[6.4, 2.7, 5.3, 1.9],
       [6.7, 3. , 5. , 1.7],
       [6.3, 3.3, 6. , 2.5],
       [6.6, 3. , 4.4, 1.4],
       [5.2, 3.5, 1.5, 0.2]])

One-hot encoding converts the class labels [0, 1, 2 = 'Setosa', 'Versicolor', 'Virginica'] to the vectors ([1, 0, 0], [0, 1, 0], [0, 0, 1]), with one column per class.

In [22]:
from sklearn.preprocessing import OneHotEncoder
one_hot_encoder = OneHotEncoder(sparse_output=False)  # this argument was named `sparse` before scikit-learn 1.2

Y = iris.Species
Y = one_hot_encoder.fit_transform(np.array(Y).reshape(-1, 1))
Y[:5]

array([[0., 0., 1.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [1., 0., 0.]])
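
To show what the encoder does to the labels, here is a minimal NumPy-only sketch of the same transformation. The label strings are made-up sample values, and the sketch assumes, as OneHotEncoder does by default, that the columns follow the sorted order of the distinct labels.

```python
import numpy as np

labels = np.array(["Virginica", "Setosa", "Versicolor", "Setosa"])  # made-up sample labels

# np.unique sorts the distinct labels and returns each label's index into that sorted list
classes, indices = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(len(classes))[indices]

print(classes)
print(one_hot)
```

The first row of `one_hot` is `[0, 0, 1]` because 'Virginica' is last in sorted order.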

Split into training, validation, and test sets

In [23]:
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.15)
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size=0.1)
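
The two calls above carve the data into three parts. The sketch below estimates the resulting sizes, assuming the CSV holds the usual 150 Iris rows and that a fractional test_size is rounded up, which is how train_test_split handles it. The helper `split_sizes` is a name introduced here for illustration only.

```python
import math

def split_sizes(n_samples, test_size):
    # Assumption: the held-out share is rounded up and the rest stays for training
    n_held_out = math.ceil(test_size * n_samples)
    return n_samples - n_held_out, n_held_out

n_rest, n_test = split_sizes(150, 0.15)    # 150 rows is an assumption
n_train, n_val = split_sizes(n_rest, 0.1)
print(n_train, n_val, n_test)  # 114 13 23
```

These sizes are consistent with the accuracies reported at the end of the notebook, for example a test accuracy of 22/23.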

The main function is NeuralNetwork, which will train the network for the specified number of epochs.

*   epochs: Number of epochs. The default is 10; the run below uses 100.
*   nodes: A list of integers. Each integer is the number of nodes in one layer, so the length of the list is the number of layers (input and output layers included).
*   lr: The learning rate of the back-propagation training algorithm. Here, lr = 0.15.

In [24]:
def NeuralNetwork(X_train, Y_train, X_val=None, Y_val=None, epochs=10, nodes=(), lr=0.15):
    weights = InitializeWeights(nodes)

    for epoch in range(1, epochs+1):
        weights = Train(X_train, Y_train, lr, weights)

        # Report progress every 20 epochs
        if epoch % 20 == 0:
            print("Epoch {}".format(epoch))
            print("Training Accuracy:{}".format(Accuracy(X_train, Y_train, weights)))
            # The validation set is optional
            if X_val is not None and Y_val is not None:
                print("Validation Accuracy:{}".format(Accuracy(X_val, Y_val, weights)))

    return weights

At first, the weights of the network are randomly initialized by InitializeWeights. This function takes nodes as input and returns weights, a list of matrices. Each element of the list is the weight matrix (bias weights included) feeding one layer after the input, that is, each hidden layer and the output layer.

In [25]:
def InitializeWeights(nodes):
    """Initialize weights with random values in [-1, 1) (including bias)"""
    layers, weights = len(nodes), []
    
    for i in range(1, layers):
        # One row per node in layer i, one column per node in layer i-1 plus the bias
        w = np.random.uniform(-1, 1, size=(nodes[i], nodes[i-1] + 1))
        weights.append(np.matrix(w))
    
    return weights
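
To make the layout concrete, the following sketch builds weights in the same way for the layer sizes used later in this notebook, [4, 5, 10, 3], and prints the shape of each matrix. The helper `init_weights` is a stand-in introduced here and uses plain arrays rather than np.matrix.

```python
import numpy as np

def init_weights(nodes):
    # Hypothetical stand-in for InitializeWeights: one matrix per layer after the input,
    # with an extra column for the bias weight
    return [np.random.uniform(-1, 1, size=(nodes[i], nodes[i - 1] + 1))
            for i in range(1, len(nodes))]

shapes = [w.shape for w in init_weights([4, 5, 10, 3])]
print(shapes)  # [(5, 5), (10, 6), (3, 11)]
```

Each matrix has as many rows as the layer has nodes and one more column than the previous layer has nodes.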

*FeedForward propagation*

1.   Each layer receives an input and computes an output. The output is computed by first calculating the dot product between the input and the weights of the layer and then passing this dot product through an activation function (in this case, the sigmoid function).
2.   The output of each layer is the input of the next.
3.   The input of the first layer is the feature vector.
4.   The output of the final layer is the prediction of the network.
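
The steps above can be sketched with plain NumPy arrays as follows. This is an illustration under simplifying assumptions, not the notebook's np.matrix implementation: the names `sigmoid`, `forward` and `toy_weights` are introduced here, and the bias weight is assumed to sit in column 0 of each weight matrix.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(x, weights):
    # x: 1-D feature vector; weights: list of (n_out, n_in + 1) arrays, bias in column 0
    activations = [x]
    for w in weights:
        augmented = np.append(1, activations[-1])   # prepend the constant bias input
        activations.append(sigmoid(w @ augmented))  # weighted sum, then activation
    return activations

# Toy network with all-zero weights: every node outputs sigmoid(0) = 0.5
toy_weights = [np.zeros((2, 4)), np.zeros((3, 3))]
outputs = forward(np.array([0.2, -1.0, 3.5]), toy_weights)
print(outputs[-1])  # [0.5 0.5 0.5]
```

The last entry of the returned list is the network's prediction; the earlier entries are kept because the backward pass needs them.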



In [26]:
def ForwardPropagation(x, weights, layers):
    activations, layer_input = [x], x
    for j in range(layers):
        activation = Sigmoid(np.dot(layer_input, weights[j].T))
        activations.append(activation)
        layer_input = np.append(1, activation) # Augment with bias
    
    return activations

*Backward Propagation*

1.  Calculate error at final output.
2.  Propagate error backwards through the layers and perform corrections.
3.  Calculate delta: the back-propagated error of the current layer times the sigmoid derivative of the current layer's activation.
4.  Update the weights between the current layer and the previous layer: multiply delta by the activation of the previous layer and by the learning rate, and add this product to those weights.
5.  Calculate the error for the previous layer: remove the bias column from the weights just updated and multiply delta by the result.
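
In symbols, the steps above amount to the following per-sample updates, assuming row vectors throughout. The notation is introduced here and is not part of the notebook: $a^{(l)}$ is the activation of layer $l$, $\tilde{a}^{(l-1)}$ is the previous activation with a leading 1 for the bias, $\eta$ is the learning rate, and $W^{(l)}_{\setminus \text{bias}}$ is $W^{(l)}$ without its bias column.

$$
\begin{aligned}
e^{(L)} &= y - a^{(L)} \\
\delta^{(l)} &= e^{(l)} \odot a^{(l)} \odot \bigl(1 - a^{(l)}\bigr) \\
W^{(l)} &\leftarrow W^{(l)} + \eta \, \bigl(\delta^{(l)}\bigr)^{\top} \tilde{a}^{(l-1)} \\
e^{(l-1)} &= \delta^{(l)} \, W^{(l)}_{\setminus \text{bias}}
\end{aligned}
$$

The factor $a^{(l)} \odot (1 - a^{(l)})$ is the sigmoid derivative expressed through the layer's own output, which is why only the activations need to be stored during the forward pass.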

In [27]:
def BackPropagation(y, activations, weights, layers, lr):
    outputFinal = activations[-1]
    error = np.matrix(y - outputFinal) # Error at output
    
    for j in range(layers, 0, -1):
        currActivation = activations[j]
        
        if j > 1:
            # Augment previous activation with the bias input
            prevActivation = np.append(1, activations[j-1])
        else:
            # First layer: the input vector was already augmented in Train
            prevActivation = activations[0]
        
        delta = np.multiply(error, SigmoidDerivative(currActivation))
        weights[j-1] += lr * np.multiply(delta.T, prevActivation)
        w = np.delete(weights[j-1], [0], axis=1) # Remove bias from weights
        error = np.dot(delta, w) # Error propagated to the previous layer
    
    return weights

Then, in each epoch, the weights are updated by Train, which runs one forward pass and one backward pass per training sample.

In [28]:
def Train(X, Y, lr, weights):
    layers = len(weights)
    for i in range(len(X)):
        x, y = X[i], Y[i]
        x = np.matrix(np.append(1, x)) # Augment feature vector
        
        activations = ForwardPropagation(x, weights, layers)
        # Pass lr explicitly so the update does not depend on a global variable
        weights = BackPropagation(y, activations, weights, layers, lr)

    return weights

Activation function

In [29]:
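def Sigmoid(x):
    """Logistic sigmoid, applied element-wise"""
    return 1 / (1 + np.exp(-x))

def SigmoidDerivative(x):
    """Sigmoid derivative, where x is the sigmoid output (the activation), not the pre-activation"""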
    return np.multiply(x, 1-x)
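
SigmoidDerivative relies on the identity that the derivative of the sigmoid equals s * (1 - s), where s is the sigmoid's output. The sketch below checks this numerically with a central finite difference; the function names and the test point 0.7 are chosen here for illustration.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_derivative_from_activation(a):
    # Takes the sigmoid *output* a = sigmoid(x), as SigmoidDerivative does
    return a * (1 - a)

x, h = 0.7, 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)  # central difference
analytic = sigmoid_derivative_from_activation(sigmoid(x))
print(abs(numeric - analytic) < 1e-8)  # True
```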

In [30]:
def Predict(item, weights):
    """Return the one-hot prediction vector for a single feature vector"""
    layers = len(weights)
    item = np.append(1, item) # Augment feature vector
    
    # Forward propagation
    activations = ForwardPropagation(item, weights, layers)
    
    outputFinal = activations[-1].A1 # Flatten the 1 x n matrix to a 1-D array
    index = FindMaxActivation(outputFinal)

    # Initialize prediction vector to zeros
    y = [0] * len(outputFinal)
    y[index] = 1  # Set guessed class to 1

    return y # Return prediction vector

In [31]:
def FindMaxActivation(output):
    """Find the index of the max activation in output (first one on ties)"""
    return int(np.argmax(output))

Model evaluation

The Accuracy function returns the fraction of samples whose predicted class matches the label. NeuralNetwork calls it every 20 epochs to print the accuracy on the training and validation sets.

In [32]:
def Accuracy(X, Y, weights):
    """Run set through network, find overall accuracy"""
    correct = 0

    for i in range(len(X)):
        x, y = X[i], list(Y[i])
        guess = Predict(x, weights)

        if(y == guess):
            # Guessed correctly
            correct += 1

    return correct / len(X)
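
The same measure can be computed in one step when the raw network outputs for a whole set are available as a 2-D array. The sketch below is an alternative formulation, not the notebook's method; `accuracy_from_outputs` and the sample arrays are made up for illustration.

```python
import numpy as np

def accuracy_from_outputs(outputs, targets):
    # outputs: (n_samples, n_classes) network outputs; targets: one-hot rows of the same shape
    predicted = np.argmax(outputs, axis=1)  # most activated output node per sample
    actual = np.argmax(targets, axis=1)     # position of the 1 in each one-hot row
    return float(np.mean(predicted == actual))

outputs = np.array([[0.9, 0.1, 0.2],
                    [0.2, 0.7, 0.6],
                    [0.3, 0.8, 0.4],
                    [0.1, 0.2, 0.6]])
targets = np.array([[1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 1]])
print(accuracy_from_outputs(outputs, targets))  # 0.75
```

Three of the four samples are classified correctly here; the third row picks class 1 while the label is class 2.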

In [33]:
f = len(X[0]) # Number of features
o = len(Y[0]) # Number of outputs / classes

layers = [f, 5, 10, o] # Number of nodes in layers
lr, epochs = 0.15, 100

weights = NeuralNetwork(X_train, Y_train, X_val, Y_val, epochs=epochs, nodes=layers, lr=lr)

Epoch 20
Training Accuracy:0.9736842105263158
Validation Accuracy:1.0
Epoch 40
Training Accuracy:0.956140350877193
Validation Accuracy:1.0
Epoch 60
Training Accuracy:0.9035087719298246
Validation Accuracy:1.0
Epoch 80
Training Accuracy:0.9736842105263158
Validation Accuracy:1.0
Epoch 100
Training Accuracy:0.9736842105263158
Validation Accuracy:1.0


In [34]:
print("Testing Accuracy: {}".format(Accuracy(X_test, Y_test, weights)))

Testing Accuracy: 0.9565217391304348
