<a href="https://colab.research.google.com/github/riya461/Learn_ML/blob/main/Algorithms/Backpropogation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

$$ \mathbf{a}^{(n)} = \sigma(\mathbf{z}^{(n)}) $$
$$ \mathbf{z}^{(n)} = \mathbf{W}^{(n)}\mathbf{a}^{(n-1)} + \mathbf{b}^{(n)} $$


$$ \sigma(\mathbf{z}) = \frac{1}{1 + \exp(-\mathbf{z})} $$
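
The three equations above can be sketched in plain Python. This is only an illustration: the helper names `sigmoid` and `layer_forward` and the small weight matrix are assumptions made for the example, not part of the code used later in this notebook.

```python
from math import exp

def sigmoid(z):
    # Logistic function: 1 / (1 + e^(-z))
    return 1.0 / (1.0 + exp(-z))

def layer_forward(W, a_prev, b):
    # z = W a_prev + b, computed one row (one neuron) at a time
    z = [sum(w * a for w, a in zip(row, a_prev)) + bias
         for row, bias in zip(W, b)]
    # a = sigmoid(z), applied element-wise
    return [sigmoid(v) for v in z]

# Hypothetical layer with 2 inputs and 2 neurons
W = [[0.5, -0.5], [1.0, 1.0]]
b = [0.0, -1.0]
a = layer_forward(W, [1.0, 1.0], b)
print(a)  # first entry is sigmoid(0) = 0.5, second is sigmoid(1)
```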


Backpropagation can be used for both classification and regression problems.

In classification, the best results are achieved when the network has one neuron in the output layer for each class value.

One-hot encoding: take a binary classification problem with the class values A and B. The expected outputs have to be transformed into binary vectors with one column for each class value, such as [1, 0] for A and [0, 1] for B.
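
A minimal sketch of that encoding, assuming the class values are available as a list (the helper name `one_hot` is illustrative, not something defined elsewhere in this notebook):

```python
def one_hot(label, classes):
    # One column per class; a 1 marks the position of `label`
    return [1 if c == label else 0 for c in classes]

classes = ['A', 'B']
encoded = [one_hot(label, classes) for label in ['A', 'B', 'A']]
print(encoded)  # [[1, 0], [0, 1], [1, 0]]
```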

#  Neural Network

1. Initialize network
2. Forward propagate
3. Back propagate error
4. Train network
5. Predict

### Neuron
- each neuron has a set of weights that need to be maintained
- one weight for each input
- an additional weight for the bias
- a dictionary represents each neuron

### Network
- organised into layers
- input layer - a row from our training dataset
- then a hidden layer
- followed by an output layer - one neuron for each class value

**Layers as arrays of dictionaries**

**Whole network as an array of layers**
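
To make that layout concrete, here is a small hand-written example. The name `toy_network` and all weight values are made up for illustration; the real network is built by the initialisation code further down.

```python
# Hypothetical network: 2 inputs, 1 hidden neuron, 2 output neurons.
# Each neuron is a dict; the last weight in each list is the bias.
hidden_layer = [{'weights': [0.1, 0.2, 0.3]}]
output_layer = [{'weights': [0.4, 0.5]}, {'weights': [0.6, 0.7]}]
toy_network = [hidden_layer, output_layer]

for depth, layer in enumerate(toy_network):
    for neuron in layer:
        n_in = len(neuron['weights']) - 1  # exclude the bias weight
        print('layer', depth, 'neuron with', n_in, 'input weight(s) plus bias')
```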



In [79]:
from random import random,seed
from math import exp



# Initialise Network
- creates a new neural network ready for training
- takes 3 parameters
  - number of inputs
  - number of neurons in the hidden layer
  - number of outputs
- the hidden layer has **n_hidden** neurons, each with **n_inputs + 1** weights: one for each input column in the dataset and one for the bias
- the output layer has **n_outputs** neurons, each with **n_hidden + 1** weights: one for each hidden neuron and one for the bias
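
As a quick check of those counts, the sketch below tallies the weights per layer. The helper `count_weights` is a hypothetical name introduced only for this example.

```python
def count_weights(n_inputs, n_hidden, n_outputs):
    # Each neuron has one weight per incoming value plus one bias weight
    hidden = n_hidden * (n_inputs + 1)
    output = n_outputs * (n_hidden + 1)
    return hidden, output

print(count_weights(2, 1, 2))  # (3, 4)
```

For 2 inputs, 1 hidden neuron and 2 outputs this gives 3 hidden-layer weights and 4 output-layer weights.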



In [80]:
def initialize_network(n_inputs, n_hidden, n_outputs):
  network = list()
  hidden_layer = [ {'weights':
                    [random() for i in range(n_inputs + 1)]}
                   for i in range(n_hidden)
                   ]
  network.append(hidden_layer)
  output_layer = [ {'weights':
                    [random() for i in range(n_hidden + 1)]}
                   for i in range(n_outputs)
                   ]
  network.append(output_layer)
  return network

In [81]:
seed(1)
network = initialize_network(2,1,2)
for layer in network:
  print(layer)

[{'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}]
[{'weights': [0.2550690257394217, 0.49543508709194095]}, {'weights': [0.4494910647887381, 0.651592972722763]}]


# Forward propagate
- calculate the output of a neural network by propagating an input signal through each layer until the output layer outputs its values

1. Neuron Activation
2. Neuron Transfer
3. Forward Propagation

### Neuron activation
- input - a row from the training dataset or the outputs of the neurons in the hidden layer
- activation - weighted sum of the inputs plus the bias
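
A worked example of one activation, using the hidden neuron's weights printed in the initialisation output above and the input `[1, 0]`. It assumes, as the notebook's code does, that the last weight in the list is the bias.

```python
weights = [0.13436424411240122, 0.8474337369372327, 0.763774618976614]
inputs = [1, 0]

bias = weights[-1]  # assumption: the bias is stored as the last weight
activation = bias + sum(w * x for w, x in zip(weights[:-1], inputs))
print(round(activation, 4))  # 0.8981
```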


In [82]:
# Neuron activation - weighted sum of the inputs plus the bias
def activate(weights, inputs):
  activation = weights[-1]  # the bias is assumed to be the last weight in the list
  for i in range(len(weights) - 1):
    activation += weights[i] * inputs[i]
  return activation


### Neuron Transfer
- traditionally the sigmoid (logistic) activation function; `tanh` (hyperbolic tangent) or the rectifier transfer function can also be used
- Sigmoid - output values lie between 0 and 1


In [83]:
# Neuron transfer - sigmoid of the activation
def transfer(activation):
  return 1.0 / (1.0 + exp(-activation))

### Forward Propagation
- the outputs from one layer become the inputs to the neurons in the next layer

In [84]:
# Forward propagation - returns the outputs of the last layer
def forward_propagate(network, row):
  inputs = row
  for layer in network:
    new_inputs = []
    for neuron in layer:
      activation = activate(neuron['weights'], inputs)
      neuron['output'] = transfer(activation)
      new_inputs.append(neuron['output'])
    inputs = new_inputs
  return inputs

In [85]:
# test
seed(1)
network = initialize_network(2,1,2)
for layer in network:
  print(layer)
row = [1, 0, None]  # the last element stands in for the class label, which activate() ignores
output = forward_propagate(network,row)
print(output)

[{'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}]
[{'weights': [0.2550690257394217, 0.49543508709194095]}, {'weights': [0.4494910647887381, 0.651592972722763]}]
[0.6629970129852887, 0.7253160725279748]


The output values are meaningless for now because the weights are still random.

# Back Propagate Error

1. Transfer Derivative
2. Error Backpropagation

### Transfer Derivative
- given the output of a neuron, calculate its slope
- the formula below assumes the sigmoid transfer function
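
The slope of the sigmoid can be written in terms of its own output, s * (1 - s). The sketch below compares that formula against a numerical central-difference estimate; the helper names and the test point `z = 0.3` are assumptions chosen for illustration.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def sigmoid_slope_from_output(output):
    # Slope expressed through the neuron's output: s * (1 - s)
    return output * (1.0 - output)

z, h = 0.3, 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)  # central difference
analytic = sigmoid_slope_from_output(sigmoid(z))
print(abs(numeric - analytic) < 1e-8)  # True
```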

In [86]:
def transfer_derivative(output):
  return output * (1.0 - output)

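### Error Backpropagation
- expected : the expected output value for the neuron
- output : the actual output value of the neuron
- transfer_derivative() : calculates the slope of the neuron's output value
- delta : the neuron's error multiplied by transfer_derivative(output)

- **error_j** : error signal (delta) from the jth neuron in the next layer
- **weight_k** : weight that connects the kth neuron in the next layer to the current neuron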
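
A small arithmetic sketch of both cases, assuming the error of an output neuron is taken as `output - expected` and the error of a hidden neuron is the weighted sum of the deltas in the next layer. The numbers are the outputs and weights of the example network used in the next cells.

```python
def slope(output):
    # Sigmoid slope expressed through the neuron's output
    return output * (1.0 - output)

# Output layer: error is the difference between output and expected value
out_values = [0.6213859615555266, 0.6573693455986976]
expected = [0, 1]
out_deltas = [(o - e) * slope(o) for o, e in zip(out_values, expected)]

# Hidden neuron: error is the weighted sum of the output-layer deltas
weights_to_hidden = [0.2550690257394217, 0.4494910647887381]
hidden_output = 0.7105668883115941
hidden_error = sum(w * d for w, d in zip(weights_to_hidden, out_deltas))
hidden_delta = hidden_error * slope(hidden_output)

print(round(out_deltas[0], 4))   # 0.1462
print(round(hidden_delta, 6))    # 0.000535
```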

In [87]:
def backward_propagate_error(network, expected):
  # walk the layers from the output layer back to the first hidden layer
  for i in reversed(range(len(network))):
    layer = network[i]
    errors = list()
    if i != len(network)-1:
      # hidden layer: weighted sum of the deltas of the next layer
      for j in range(len(layer)):
        error = 0.0
        for neuron in network[i + 1]:
          error += (neuron['weights'][j] * neuron['delta'])
        errors.append(error)
    else:
      # output layer: difference between output and expected value
      for j in range(len(layer)):
        neuron = layer[j]
        errors.append(neuron['output'] - expected[j])
    # delta = error scaled by the slope of the neuron's output
    for j in range(len(layer)):
      neuron = layer[j]
      neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

In [88]:
network = [[{'output': 0.7105668883115941, 'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}],
 [{'output': 0.6213859615555266, 'weights': [0.2550690257394217, 0.49543508709194095]}, {'output': 0.6573693455986976, 'weights': [0.4494910647887381, 0.651592972722763]}]]
expected = [0, 1]
backward_propagate_error(network, expected)
for layer in network:
  print(layer)

[{'output': 0.7105668883115941, 'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614], 'delta': 0.0005348048046610517}]
[{'output': 0.6213859615555266, 'weights': [0.2550690257394217, 0.49543508709194095], 'delta': 0.14619064683582808}, {'output': 0.6573693455986976, 'weights': [0.4494910647887381, 0.651592972722763], 'delta': -0.0771723774346327}]


# Train Network

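- the network is trained using stochastic gradient descent
- involves multiple iterations (epochs) over the training data; for each row: forward propagate, back propagate the error, update the weights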

1. Update Weights
2. Train Network

**Stochastic gradient descent**
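
A single weight update under this scheme can be sketched as follows. The function name `sgd_step` and the numbers are hypothetical, and the sign assumes the delta was computed from `output - expected`, so the weight moves against the error.

```python
def sgd_step(weight, delta, input_value, l_rate):
    # Move the weight against the error signal:
    # weight <- weight - l_rate * delta * input
    return weight - l_rate * delta * input_value

w = sgd_step(weight=0.5, delta=0.2, input_value=1.5, l_rate=0.1)
print(round(w, 3))  # 0.47
```

The bias weight follows the same rule with an input of 1.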


In [89]:
# Update Weights
# l_rate - learning rate, controls how much to change a weight to correct for the error
# delta - error signal calculated by the backpropagation procedure for the neuron
# input - value that caused the error
def update_weights(network, row, l_rate):
  for i in range(len(network)):
    inputs = row[:-1]  # first layer: the row without its class label
    if i != 0:
      inputs = [neuron['output'] for neuron in network[i - 1]]
    for neuron in network[i]:
      for j in range(len(inputs)):
        neuron['weights'][j] -= l_rate * neuron['delta'] * inputs[j]
      neuron['weights'][-1] -= l_rate * neuron['delta']  # bias update

In [90]:

# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
  for epoch in range(n_epoch):
    sum_error = 0  # reset the squared error at the start of every epoch
    for row in train:
      outputs = forward_propagate(network, row)
      expected = [0 for i in range(n_outputs)]
      expected[row[-1]] = 1  # one-hot encode the class label
      sum_error += sum([(expected[i]-outputs[i])**2 for i in range(len(expected))])
      backward_propagate_error(network, expected)
      update_weights(network, row, l_rate)
    print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))


In [91]:

# Test training backprop algorithm
seed(1)
dataset = [[2.7810836,2.550537003,0],
 [1.465489372,2.362125076,0],
 [3.396561688,4.400293529,0],
 [1.38807019,1.850220317,0],
 [3.06407232,3.005305973,0],
 [7.627531214,2.759262235,1],
 [5.332441248,2.088626775,1],
 [6.922596716,1.77106367,1],
 [8.675418651,-0.242068655,1],
 [7.673756466,3.508563011,1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs, 2, n_outputs)
train_network(network, dataset, 0.5, 20, n_outputs)
for layer in network:
  print(layer)

# Predict
- class prediction - forward propagate a row through the trained network
- choose the class value whose output neuron has the largest value
- this is the arg max function
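
A minimal arg max sketch over a list of output values. The numbers are hypothetical; with ties, `max` returns the first index that reaches the largest value.

```python
outputs = [0.2, 0.7, 0.1]  # hypothetical output-layer values, one per class

# Index of the largest output = predicted class
predicted_class = max(range(len(outputs)), key=lambda i: outputs[i])
print(predicted_class)  # 1
```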

In [93]:
def predict(network, row):
  outputs = forward_propagate(network, row)
  return outputs.index(max(outputs))

In [94]:
for row in dataset:
  prediction = predict(network, row)
  print('Expected=%d, Got=%d' % (row[-1], prediction))
