**Aim:** To build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.


**Theory:** Artificial Neural Networks (ANNs) are a family of Machine Learning models, within the broader field of Artificial Intelligence, that are used to solve prediction problems.
They are inspired by biological neural networks and loosely imitate how the human nervous system works, with its neurons, dendrites and so on.

An ANN consists of several nodes which behave as neurons. The nodes are connected by links (wires) for communication with one another. Each node takes input data, performs a small operation on it, and passes the result to other nodes (neurons). The output at a node is called its node value.
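To make the idea of a node value concrete, here is a minimal sketch of a single node, assuming the common choice of a weighted sum of the inputs plus a bias, followed by a sigmoid activation (the same activation used in the code later in this notebook). The names `node_value`, `inputs`, `weights` and `bias`, and all the numbers, are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_value(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical numbers, for illustration only
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

# The weighted sum here is 0.2 - 0.3 - 0.4 + 0.1 = -0.4
print(node_value(inputs, weights, bias))
```

The value printed is what this node would pass on to the nodes it is linked to.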

Neural networks have a remarkable ability to derive meaning from imprecise data, which is used to detect trends and extract patterns that are too complex for either humans or conventional computer programs to notice. A trained NN can be regarded as an "expert" in the category of information it has been given to analyse, and can be used to provide projections.

An ANN includes a large number of processing units working in parallel, which are arranged in layers. The first layer receives the raw data as input, similar to the optic nerves in human visual processing. Every successive layer receives the output of the previous layer rather than the raw input, similar to the way neurons further from the optic nerve receive signals from those closer to it. The final layer generates the output. The image below shows several layers.
There are multiple types of Neural Networks, such as Feedforward, Convolutional, Recurrent and Modular.
Artificial Neural Networks are used in many fields, such as pattern recognition, facial recognition, handwriting identification, speech-to-text conversion, weather prediction, stock market prediction and path/fare prediction.

Backpropagation is an algorithm commonly used to train neural networks. When the neural network is initialized, weights are set for the connections between its individual elements, called neurons. Inputs are loaded and passed through the network of neurons, and the network provides an output for each one, given the initial weights. Backpropagation then adjusts the weights so that the output comes closer and closer to the known true result.
It involves the following steps:
1. Visualizing the input data
2. Deciding the shapes of the weight and bias matrices
3. Initializing the matrices and the functions to be used
4. Implementing the forward propagation method
5. Implementing the cost calculation
6. Backpropagation and optimizing
7. Predicting and visualizing the output
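
The weight update in step 6 can be written out explicitly. The derivation below is a sketch that assumes a squared-error cost and sigmoid activations; the cost function is not stated in the theory above, so treat it as an assumption. The symbols $\eta$ (learning rate), $a^{(l)}$ (activations of layer $l$), $\tilde{a}^{(l)}$ (the same vector with a leading 1 for the bias) and $\delta^{(l)}$ are notation introduced here for illustration, and the sum in the hidden-layer rule runs over the non-bias weights only.

```latex
\begin{align}
E &= \tfrac{1}{2} \sum_k \left(y_k - a^{(L)}_k\right)^2
  && \text{assumed cost for one sample} \\
\delta^{(L)}_k &= \left(y_k - a^{(L)}_k\right) a^{(L)}_k \left(1 - a^{(L)}_k\right)
  && \text{output layer } L \\
\delta^{(l)}_j &= \Big(\sum_k \delta^{(l+1)}_k W^{(l+1)}_{kj}\Big)\, a^{(l)}_j \left(1 - a^{(l)}_j\right)
  && \text{hidden layer } l < L \\
W^{(l)}_{kj} &\leftarrow W^{(l)}_{kj} + \eta\, \delta^{(l)}_k\, \tilde{a}^{(l-1)}_j
  && \text{since } \partial E / \partial W^{(l)}_{kj} = -\delta^{(l)}_k\, \tilde{a}^{(l-1)}_j
\end{align}
```

Under these assumptions, adding $\eta\,\delta\,\tilde{a}$ to the weights is ordinary gradient descent on $E$, and the factor $a(1-a)$ is the derivative of the sigmoid expressed through its own output.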


**Code:**

The dataset used is the **Iris Dataset**.<br>
Importing Libraries

In [5]:
import pandas as pd
import numpy as np

In [6]:
iris = pd.read_csv('datasets_19_420_Iris.csv')
# Shuffle the rows so the species are not grouped together, then renumber the index
iris = iris.sample(frac=1).reset_index(drop=True)

In [7]:
X = iris[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
X = np.array(X)
X[:5]

array([[5.5, 2.5, 4. , 1.3],
       [4.8, 3.1, 1.6, 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [5.9, 3. , 5.1, 1.8],
       [5.4, 3.9, 1.7, 0.4]])

In [8]:
from sklearn.preprocessing import OneHotEncoder
# sparse=False returns a dense array; scikit-learn 1.2+ renames this argument to sparse_output
one_hot_encoder = OneHotEncoder(sparse=False)

Y = iris.Species
Y = one_hot_encoder.fit_transform(np.array(Y).reshape(-1, 1))
Y[:5]

array([[0., 1., 0.],
       [1., 0., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [1., 0., 0.]])

In [9]:
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.15)
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size=0.1)
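
The two calls above hold out 15% of the data for testing and then 10% of the remainder for validation. The following sketch works out the resulting set sizes, assuming the Iris CSV has the usual 150 rows and that `train_test_split` rounds the held-out fraction up to a whole number of samples. The helper `split_sizes` is hypothetical and is not part of scikit-learn.

```python
import math

def split_sizes(n_samples, held_out_fraction):
    """Return (n_kept, n_held_out), rounding the held-out part up."""
    n_held_out = math.ceil(held_out_fraction * n_samples)
    return n_samples - n_held_out, n_held_out

n_total = 150  # assumption: the standard Iris data set has 150 rows

n_rest, n_test = split_sizes(n_total, 0.15)   # first split: test set
n_train, n_val = split_sizes(n_rest, 0.10)    # second split: validation set

print(n_train, n_val, n_test)  # 114 13 23
```

These sizes agree with the results reported further down: the classification report has a total support of 23, and a validation accuracy of 0.769 corresponds to 10 correct out of 13.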

In [10]:
def NeuralNetwork(X_train, Y_train, X_val=None, Y_val=None, epochs=10, nodes=[], lr=0.15):
    """Train for `epochs` passes over the training set; report accuracy every 20 epochs."""
    weights = InitializeWeights(nodes)

    for epoch in range(1, epochs+1):
        weights = Train(X_train, Y_train, lr, weights)

        if epoch % 20 == 0:
            print("Epoch {}".format(epoch))
            print("Training Accuracy:{}".format(Accuracy(X_train, Y_train, weights)))
            # Validation data is optional, so guard against the None defaults
            if X_val is not None and Y_val is not None:
                print("Validation Accuracy:{}".format(Accuracy(X_val, Y_val, weights)))

    return weights

In [11]:
def InitializeWeights(nodes):
    """Initialize weights with random values in [-1, 1] (including bias)"""
    layers, weights = len(nodes), []
    
    for i in range(1, layers):
        w = [[np.random.uniform(-1, 1) for k in range(nodes[i-1] + 1)]
              for j in range(nodes[i])]
        weights.append(np.matrix(w))
    
    return weights
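
Each weight matrix built by `InitializeWeights` has one row per node in the current layer and one column per node in the previous layer, plus one extra column for the bias. The sketch below only computes those shapes for the layer sizes used later in this notebook (4 features, hidden layers of 5 and 10 nodes, 3 classes). The helper `weight_shapes` is a hypothetical name introduced for illustration.

```python
def weight_shapes(nodes):
    """Shape (rows, cols) of each weight matrix; the +1 column holds the bias."""
    return [(nodes[i], nodes[i - 1] + 1) for i in range(1, len(nodes))]

nodes = [4, 5, 10, 3]  # layer sizes used later in this notebook
shapes = weight_shapes(nodes)
n_parameters = sum(rows * cols for rows, cols in shapes)

print(shapes)        # [(5, 5), (10, 6), (3, 11)]
print(n_parameters)  # 118
```

Because the bias is stored as column 0 of every matrix, each layer's input must be augmented with a leading 1 before it is multiplied by the weights.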

In [12]:
def ForwardPropagation(x, weights, layers):
    activations, layer_input = [x], x
    for j in range(layers):
        activation = Sigmoid(np.dot(layer_input, weights[j].T))
        activations.append(activation)
        layer_input = np.append(1, activation) # Augment with bias
    
    return activations

In [13]:
def BackPropagation(y, activations, weights, layers, lr):
    """Update the weights in place for one sample; lr is the learning rate."""
    outputFinal = activations[-1]
    error = np.matrix(y - outputFinal) # Error at output
    
    for j in range(layers, 0, -1):
        currActivation = activations[j]
        
        if(j > 1):
            # Augment previous activation with the bias input
            prevActivation = np.append(1, activations[j-1])
        else:
            # First hidden layer: prevActivation is the input, already augmented with the bias in Train
            prevActivation = activations[0]
        
        delta = np.multiply(error, SigmoidDerivative(currActivation))
        weights[j-1] += lr * np.multiply(delta.T, prevActivation)

        # Drop the bias column, then propagate the error back to the previous layer
        # (note: this uses the weights after the update above)
        w = np.delete(weights[j-1], [0], axis=1)
        error = np.dot(delta, w)
    
    return weights


In [14]:
def Train(X, Y, lr, weights):
    """Run one epoch of per-sample (stochastic) weight updates."""
    layers = len(weights)
    for i in range(len(X)):
        x, y = X[i], Y[i]
        x = np.matrix(np.append(1, x)) # Augment feature vector
        
        activations = ForwardPropagation(x, weights, layers)
        weights = BackPropagation(y, activations, weights, layers, lr)

    return weights

In [15]:
def Sigmoid(x):
    return 1 / (1 + np.exp(-x))

def SigmoidDerivative(x):
    return np.multiply(x, 1-x)
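
Note that `SigmoidDerivative` expects the sigmoid output (the activation), not the raw pre-activation value, which is why `BackPropagation` passes `currActivation` to it. The sketch below checks this numerically with a central finite difference. The lower-case helper names and the test point `z = 0.7` are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative_from_activation(a):
    """Derivative of the sigmoid, written in terms of its output a = sigmoid(z)."""
    return a * (1.0 - a)

z = 0.7          # arbitrary test point
a = sigmoid(z)   # activation at that point
h = 1e-6         # step for the finite difference

numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
analytic = sigmoid_derivative_from_activation(a)

print(abs(numeric - analytic) < 1e-8)  # True
# Passing z instead of a gives a different (wrong) value:
print(sigmoid_derivative_from_activation(z))
```

The formula $a(1-a)$ only matches the true slope when it is given the activation, so the argument passed to it matters.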

In [16]:
def Predict(item, weights):
    layers = len(weights)
    item = np.append(1, item) # Augment feature vector
    
    ##_Forward Propagation_##
    activations = ForwardPropagation(item, weights, layers)
    
    outputFinal = activations[-1].A1
    index = FindMaxActivation(outputFinal)

    # Initialize prediction vector to zeros
    y = [0 for i in range(len(outputFinal))]
    y[index] = 1  # Set guessed class to 1

    return y # Return prediction vector


def FindMaxActivation(output):
    """Find max activation in output"""
    m, index = output[0], 0
    for i in range(1, len(output)):
        if(output[i] > m):
            m, index = output[i], i
    
    return index
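
The manual search in `FindMaxActivation` can also be expressed with NumPy. This is an alternative sketch, not the code used to produce the results below; `predict_one_hot` is a hypothetical name. Like the loop above, `np.argmax` returns the first position when several activations tie for the maximum.

```python
import numpy as np

def predict_one_hot(output):
    """One-hot list with a 1 at the position of the largest activation."""
    output = np.asarray(output, dtype=float)
    one_hot = np.zeros(len(output), dtype=int)
    one_hot[np.argmax(output)] = 1  # argmax picks the first maximum on ties
    return one_hot.tolist()

print(predict_one_hot([0.1, 0.7, 0.2]))  # [0, 1, 0]
```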

In [17]:
def Accuracy(X, Y, weights):
    """Run set through network, find overall accuracy"""
    correct = 0

    for i in range(len(X)):
        x, y = X[i], list(Y[i])
        guess = Predict(x, weights)

        if(y == guess):
            # Guessed correctly
            correct += 1

    return correct / len(X)

In [18]:
f = len(X[0]) # Number of features
o = len(Y[0]) # Number of outputs / classes

layers = [f, 5, 10, o] # Number of nodes in layers
lr, epochs = 0.15, 100

weights = NeuralNetwork(X_train, Y_train, X_val, Y_val, epochs=epochs, nodes=layers, lr=lr)

Epoch 20
Training Accuracy:0.868421052631579
Validation Accuracy:0.7692307692307693
Epoch 40
Training Accuracy:0.9736842105263158
Validation Accuracy:1.0
Epoch 60
Training Accuracy:0.9736842105263158
Validation Accuracy:1.0
Epoch 80
Training Accuracy:0.9649122807017544
Validation Accuracy:1.0
Epoch 100
Training Accuracy:0.9736842105263158
Validation Accuracy:1.0


In [19]:
print("Testing Accuracy: {}".format(Accuracy(X_test, Y_test, weights)))

Testing Accuracy: 0.9565217391304348


In [20]:
from sklearn.metrics import confusion_matrix

In [22]:
y_pred=[]
for i in range(len(X_test)):
    x= X_test[i]
    guess = Predict(x, weights)
    y_pred.append(guess)
    

In [40]:
def fun(ls):
    """Map a one-hot vector to a short class label."""
    if ls[0]==1:
        return "s"
    elif ls[1]==1:
        return "versi"
    else:
        return "virginca"

In [41]:
y_final_pred=[]
y_test_final=[]
for i in range(len(X_test)):
    y_final_pred.append(fun(y_pred[i]))
    y_test_final.append(fun(Y_test[i]))

In [33]:
confusion_matrix(y_test_final,y_final_pred)

array([[9, 0, 0],
       [0, 7, 1],
       [0, 0, 6]], dtype=int64)

In [43]:
from sklearn import metrics
# Label key: s = Setosa, versi = Versicolor, virginca = Virginica
# Print the confusion matrix
print(metrics.confusion_matrix(y_test_final,y_final_pred))

# Print the precision and recall, among other metrics
print(metrics.classification_report(y_test_final,y_final_pred, digits=3))

[[9 0 0]
 [0 7 1]
 [0 0 6]]
              precision    recall  f1-score   support

           s      1.000     1.000     1.000         9
       versi      1.000     0.875     0.933         8
    virginca      0.857     1.000     0.923         6

    accuracy                          0.957        23
   macro avg      0.952     0.958     0.952        23
weighted avg      0.963     0.957     0.957        23
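
The figures in the report can be recomputed by hand from the confusion matrix printed above, reading rows as true classes and columns as predicted classes (the scikit-learn convention). The sketch below does that arithmetic in plain Python; the helper `per_class_metrics` is a hypothetical name introduced for illustration.

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = [[9, 0, 0],
      [0, 7, 1],
      [0, 0, 6]]
labels = ["s", "versi", "virginca"]

def per_class_metrics(cm):
    """Return a list of (precision, recall) pairs, one per class."""
    n = len(cm)
    results = []
    for k in range(n):
        true_positives = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = true_positives / predicted_k if predicted_k else 0.0
        recall = true_positives / actual_k if actual_k else 0.0
        results.append((precision, recall))
    return results

class_metrics = per_class_metrics(cm)
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

for label, (p, r) in zip(labels, class_metrics):
    print(f"{label}: precision={p:.3f} recall={r:.3f}")
print(f"accuracy={accuracy:.3f}")  # accuracy=0.957
```

The single off-diagonal entry is one versicolor sample predicted as virginica, which lowers versicolor recall to 7/8 and virginica precision to 6/7.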

