Backpropagation is the technique still used to train large deep learning networks.

The backpropagation algorithm is a supervised learning method for multilayer feed-forward networks from the field of artificial neural networks.

Technically, the backpropagation algorithm is a method for training the weights in a multilayer feed-forward neural network. As such, it requires a network structure to be defined of one or more layers where one layer is fully connected to the next layer. A standard network structure is one input layer, one hidden layer, and one output layer.

This example is broken into the steps below:
1. Initialize Network.
2. Forward Propagate.
3. Back Propagate Error.
4. Train Network.
5. Predict.
6. Seeds Dataset Case Study.

Each neuron has a set of weights and several other properties that need to be stored during training, so we use a dictionary to store the properties of each neuron.

A network is organized into layers. The input layer is really just a row from our training dataset. The first real layer is the hidden layer. This is followed by the output layer that has one neuron for each class value.

In [1]:
from random import random

# Build a network with one hidden layer and one output layer.
# Each neuron is a dict whose 'weights' list holds one weight per input plus a trailing bias.
def initialize_network(num_of_inputs, num_of_hidden_neurons, num_of_outputs):
    network = list()
    hidden_layer = [{'weights':[random() for i in range(num_of_inputs+1)]} for i in range(num_of_hidden_neurons)]
    network.append(hidden_layer)
    output_layer = [{'weights':[random() for i in range(num_of_hidden_neurons+1)]} for i in range(num_of_outputs)]
    network.append(output_layer)
    return network

In [11]:
from math import exp
from random import seed
from random import random
seed(1)
network = initialize_network(2, 1, 2)
for layer in network:
    print(layer)

[{'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}]
[{'weights': [0.2550690257394217, 0.49543508709194095]}, {'weights': [0.4494910647887381, 0.651592972722763]}]
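
As a quick check on this layout, the sketch below counts the weights in a freshly initialized network. It repeats `initialize_network` so that it runs on its own; the helper name `count_weights` and the 3-4-2 layer sizes are assumptions made up for illustration.

```python
from random import seed, random

def initialize_network(num_of_inputs, num_of_hidden_neurons, num_of_outputs):
    hidden_layer = [{'weights': [random() for _ in range(num_of_inputs + 1)]}
                    for _ in range(num_of_hidden_neurons)]
    output_layer = [{'weights': [random() for _ in range(num_of_hidden_neurons + 1)]}
                    for _ in range(num_of_outputs)]
    return [hidden_layer, output_layer]

def count_weights(network):
    # Hypothetical helper: total number of weights, biases included
    return sum(len(neuron['weights']) for layer in network for neuron in layer)

seed(1)
network = initialize_network(3, 4, 2)
layer_sizes = [len(layer) for layer in network]
total = count_weights(network)
print(layer_sizes)  # [4, 2]
# 4 hidden neurons * (3 inputs + 1 bias) + 2 output neurons * (4 hidden + 1 bias)
print(total)        # 26
```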


We can calculate an output from a neural network by propagating an input signal through each layer until the output layer outputs its values. We call this forward-propagation.

We can break forward propagation down into three parts:
1. Neuron Activation.
2. Neuron Transfer.
3. Forward Propagation.

The activation is the weighted sum of the inputs plus a bias: activation = sum(weight_i * input_i) + bias. In the function that follows, the bias is stored as the last element of the weights list.

In [7]:
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += inputs[i]*weights[i]
    return activation
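
A small worked example may help here. The weights and inputs are made-up values chosen so the arithmetic is easy to follow by hand; `activate` is repeated so the snippet runs on its own. Note that a trailing class-label slot in the row is never read, because the loop only covers the non-bias weights.

```python
def activate(weights, inputs):
    # The last weight is the bias; the others pair up with the inputs
    activation = weights[-1]
    for i in range(len(weights) - 1):
        activation += weights[i] * inputs[i]
    return activation

weights = [0.5, -0.25, 1.0]  # two input weights followed by the bias
row = [2.0, 4.0, None]       # two inputs plus an unused class-label slot

# 1.0 + 0.5 * 2.0 + (-0.25) * 4.0 = 1.0
result = activate(weights, row)
print(result)  # 1.0
```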

Once a neuron is activated, we need to transfer the activation to see what the neuron output actually is.

It is traditional to use the sigmoid activation function.

For the transfer we will use the following sigmoid function: output = 1 / (1 + e^(-activation))

In [12]:
def transfer(activation):
    return 1/(1+exp(-activation))
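
To get a feel for the sigmoid, the sketch below evaluates it at a few points. It also shows one thing to be aware of: with `math.exp`, a very negative activation overflows. The `stable_transfer` variant is an assumption of this sketch, not part of the original code; it is one common way to avoid the overflow.

```python
from math import exp

def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

def stable_transfer(activation):
    # Hypothetical variant: only ever exponentiates a non-positive number
    if activation >= 0:
        return 1.0 / (1.0 + exp(-activation))
    e = exp(activation)
    return e / (1.0 + e)

print(transfer(0.0))  # 0.5, the midpoint of the curve
for a in (-4.0, -1.0, 1.0, 4.0):
    print(a, transfer(a))  # values approach 0 on the left and 1 on the right

try:
    transfer(-1000.0)  # exp(1000.0) is too large for a float
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)                # True
print(stable_transfer(-1000.0))  # 0.0
```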

In [13]:
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'], inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

In [None]:
# test forward propagation
network = [[{'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}],
		[{'weights': [0.2550690257394217, 0.49543508709194095]}, {'weights': [0.4494910647887381, 0.651592972722763]}]]
row = [1, 0, None]
output = forward_propagate(network, row)
print(output)

The derivative of the sigmoid function, expressed in terms of the neuron's output, is: derivative = output * (1.0 - output)

In [15]:
def transfer_derivative(output):
	return output * (1.0 - output)
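
One way to convince yourself of this formula is to compare it with a numerical slope. The sketch below does that with a central difference; the test point `a = 0.3` and step size `h = 1e-6` are arbitrary choices for illustration.

```python
from math import exp

def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

def transfer_derivative(output):
    # Takes the neuron's output, not its activation
    return output * (1.0 - output)

a = 0.3   # arbitrary activation value
h = 1e-6  # small step for the central difference

numeric = (transfer(a + h) - transfer(a - h)) / (2 * h)
analytic = transfer_derivative(transfer(a))
print(numeric, analytic)  # the two values should agree to many decimal places
```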

The error for a given neuron can be calculated as follows: error = (expected - output) * transfer_derivative(output)

This error calculation is used for neurons in the output layer. The expected value is the class value itself. In the hidden layer, things are a little more complicated.

The error signal for a neuron in the hidden layer is calculated as the weighted error of each neuron in the output layer. Think of the error traveling back along the weights of the output layer to the neurons in the hidden layer.

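The back-propagated error signal is accumulated and then used to determine the error for the neuron in the hidden layer, as follows: error = sum_k(weight_k * error_k) * transfer_derivative(output), where error_k is the error signal of the k-th neuron in the output layer, weight_k is the weight that connects the current neuron to that k-th neuron, and output is the output of the current neuron.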
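
The calculation for a single hidden neuron can be worked through by hand. The sketch below uses the hidden-neuron output, the output-layer weights, and the output-layer deltas from the small test network printed later in this notebook; the dictionary layout with `'weight'` and `'delta'` keys is only an illustration, not the data structure used by the network.

```python
def transfer_derivative(output):
    return output * (1.0 - output)

hidden_output = 0.7105668883115941

# For each output neuron: the weight coming from our hidden neuron, and its delta
output_neurons = [
    {'weight': 0.2550690257394217, 'delta': -0.14619064683582808},
    {'weight': 0.4494910647887381, 'delta': 0.0771723774346327},
]

# Accumulate the error flowing back along each connection
error = 0.0
for neuron in output_neurons:
    error += neuron['weight'] * neuron['delta']

hidden_delta = error * transfer_derivative(hidden_output)
print(hidden_delta)  # roughly -0.000535, in line with the test output in this notebook
```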

In [17]:
def backward_propagate_error(network, expected):
	for i in reversed(range(len(network))):
		layer = network[i]
		errors = list()
		if i != len(network)-1:
			for j in range(len(layer)):
				error = 0.0
				for neuron in network[i + 1]:
					error += (neuron['weights'][j] * neuron['delta'])
				errors.append(error)
		else:
			for j in range(len(layer)):
				neuron = layer[j]
				errors.append(expected[j] - neuron['output'])
		for j in range(len(layer)):
			neuron = layer[j]
			neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

In [18]:
# test backpropagation of error
network = [[{'output': 0.7105668883115941, 'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614]}],
		[{'output': 0.6213859615555266, 'weights': [0.2550690257394217, 0.49543508709194095]}, {'output': 0.6573693455986976, 'weights': [0.4494910647887381, 0.651592972722763]}]]
expected = [0, 1]
backward_propagate_error(network, expected)
for layer in network:
	print(layer)

[{'output': 0.7105668883115941, 'weights': [0.13436424411240122, 0.8474337369372327, 0.763774618976614], 'delta': -0.0005348048046610517}]
[{'output': 0.6213859615555266, 'weights': [0.2550690257394217, 0.49543508709194095], 'delta': -0.14619064683582808}, {'output': 0.6573693455986976, 'weights': [0.4494910647887381, 0.651592972722763], 'delta': 0.0771723774346327}]


Now let’s use the backpropagation of error to train the network.

Training involves multiple iterations of exposing a training dataset to the network and for each row of data forward propagating the inputs, backpropagating the error and updating the network weights.

Once errors are calculated for each neuron in the network via the back propagation method above, they can be used to update weights.

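Network weights are updated as follows: weight = weight + learning_rate * error * input, where error is the delta stored on the neuron by backpropagation and input is the value that arrived on that weight. The bias is updated the same way with an input of 1.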

In [19]:
def update_weights(network, row, l_rate):
	for i in range(len(network)):
		inputs = row[:-1]
		if i != 0:
			inputs = [neuron['output'] for neuron in network[i - 1]]
		for neuron in network[i]:
			for j in range(len(inputs)):
				neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
			neuron['weights'][-1] += l_rate * neuron['delta']
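
Here is a small hand-checkable run of the update rule. The single-neuron network, its `delta`, and the training row are made-up values chosen so the arithmetic is exact; `update_weights` is repeated so the snippet runs on its own.

```python
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]  # the first layer reads the row without its class label
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']  # bias, input is 1

# One layer with one neuron: two input weights and a bias
network = [[{'weights': [0.5, -0.5, 1.0], 'delta': 0.25}]]
row = [2.0, 4.0, 0]  # two inputs and a class label

update_weights(network, row, 0.5)
new_weights = network[0][0]['weights']
# 0.5  + 0.5 * 0.25 * 2.0 = 0.75
# -0.5 + 0.5 * 0.25 * 4.0 = 0.0
# 1.0  + 0.5 * 0.25       = 1.125
print(new_weights)  # [0.75, 0.0, 1.125]
```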

Training involves first looping for a fixed number of epochs and within each epoch updating the network for each row in the training dataset.

Because updates are made for each training pattern, this type of learning is called online learning. If errors were instead accumulated across an epoch before updating the weights, it would be called batch learning or batch gradient descent.
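
For contrast, here is a rough sketch of what the batch variant could look like. It is not part of the original walkthrough: the function name `train_network_batch`, the choice to average the accumulated updates over the number of rows, and the compact rewrites of the helper functions are all assumptions of this sketch. The toy dataset is the same one used in the training test in this notebook.

```python
from math import exp
from random import seed, random

def initialize_network(n_inputs, n_hidden, n_outputs):
    hidden = [{'weights': [random() for _ in range(n_inputs + 1)]} for _ in range(n_hidden)]
    output = [{'weights': [random() for _ in range(n_hidden + 1)]} for _ in range(n_outputs)]
    return [hidden, output]

def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            w = neuron['weights']
            activation = w[-1] + sum(w[i] * inputs[i] for i in range(len(w) - 1))
            neuron['output'] = 1.0 / (1.0 + exp(-activation))
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        for j, neuron in enumerate(network[i]):
            if i == len(network) - 1:
                error = expected[j] - neuron['output']
            else:
                error = sum(n['weights'][j] * n['delta'] for n in network[i + 1])
            neuron['delta'] = error * neuron['output'] * (1.0 - neuron['output'])

def train_network_batch(network, train, l_rate, n_epoch, n_outputs):
    errors = []
    for epoch in range(n_epoch):
        # One accumulator per weight, reset at the start of every epoch
        grads = [[[0.0] * len(n['weights']) for n in layer] for layer in network]
        sum_error = 0.0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0] * n_outputs
            expected[row[-1]] = 1
            sum_error += sum((expected[k] - outputs[k]) ** 2 for k in range(n_outputs))
            backward_propagate_error(network, expected)
            # Accumulate the update instead of applying it immediately
            for i, layer in enumerate(network):
                inputs = row[:-1] if i == 0 else [n['output'] for n in network[i - 1]]
                for j, neuron in enumerate(layer):
                    for k in range(len(inputs)):
                        grads[i][j][k] += neuron['delta'] * inputs[k]
                    grads[i][j][-1] += neuron['delta']
        # Apply the averaged update once per epoch
        for i, layer in enumerate(network):
            for j, neuron in enumerate(layer):
                for k in range(len(neuron['weights'])):
                    neuron['weights'][k] += l_rate * grads[i][j][k] / len(train)
        errors.append(sum_error)
    return errors

seed(1)
dataset = [[2.7810836, 2.550537003, 0], [1.465489372, 2.362125076, 0],
           [3.396561688, 4.400293529, 0], [1.38807019, 1.850220317, 0],
           [3.06407232, 3.005305973, 0], [7.627531214, 2.759262235, 1],
           [5.332441248, 2.088626775, 1], [6.922596716, 1.77106367, 1],
           [8.675418651, -0.242068655, 1], [7.673756466, 3.508563011, 1]]
network = initialize_network(2, 2, 2)
errors = train_network_batch(network, dataset, 0.5, 20, 2)
print('first epoch error=%.3f, last epoch error=%.3f' % (errors[0], errors[-1]))
```

With only one averaged update per epoch, this variant typically needs more epochs than the online version to reach a comparable error, which is one reason online and mini-batch updates are popular in practice.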

In [20]:
def train_network(network, train, l_rate, n_epoch, n_outputs):
	for epoch in range(n_epoch):
		sum_error = 0
		for row in train:
			outputs = forward_propagate(network, row)
			expected = [0 for i in range(n_outputs)]
			expected[row[-1]] = 1
			sum_error += sum([(expected[i]-outputs[i])**2 for i in range(len(expected))])
			backward_propagate_error(network, expected)
			update_weights(network, row, l_rate)
		print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))

In [21]:
# Test training backprop algorithm
seed(1)
dataset = [[2.7810836,2.550537003,0],
	[1.465489372,2.362125076,0],
	[3.396561688,4.400293529,0],
	[1.38807019,1.850220317,0],
	[3.06407232,3.005305973,0],
	[7.627531214,2.759262235,1],
	[5.332441248,2.088626775,1],
	[6.922596716,1.77106367,1],
	[8.675418651,-0.242068655,1],
	[7.673756466,3.508563011,1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs, 2, n_outputs)
train_network(network, dataset, 0.5, 20, n_outputs)
for layer in network:
	print(layer)

>epoch=0, lrate=0.500, error=6.350
>epoch=1, lrate=0.500, error=5.531
>epoch=2, lrate=0.500, error=5.221
>epoch=3, lrate=0.500, error=4.951
>epoch=4, lrate=0.500, error=4.519
>epoch=5, lrate=0.500, error=4.173
>epoch=6, lrate=0.500, error=3.835
>epoch=7, lrate=0.500, error=3.506
>epoch=8, lrate=0.500, error=3.192
>epoch=9, lrate=0.500, error=2.898
>epoch=10, lrate=0.500, error=2.626
>epoch=11, lrate=0.500, error=2.377
>epoch=12, lrate=0.500, error=2.153
>epoch=13, lrate=0.500, error=1.953
>epoch=14, lrate=0.500, error=1.774
>epoch=15, lrate=0.500, error=1.614
>epoch=16, lrate=0.500, error=1.472
>epoch=17, lrate=0.500, error=1.346
>epoch=18, lrate=0.500, error=1.233
>epoch=19, lrate=0.500, error=1.132
[{'weights': [-1.4688375095432327, 1.850887325439514, 1.0858178629550297], 'output': 0.029980305604426185, 'delta': -0.0059546604162323625}, {'weights': [0.37711098142462157, -0.0625909894552989, 0.2765123702642716], 'output': 0.9456229000211323, 'delta': 0.0026279652850863837}]
[{'weights

In [34]:
# Backpropagation Algorithm With Stochastic Gradient Descent
def back_propagation(train, test, l_rate, n_epoch, n_hidden):
	n_inputs = len(train[0]) - 1
	n_outputs = len(set([row[-1] for row in train]))
	network = initialize_network(n_inputs, n_hidden, n_outputs)
	train_network(network, train, l_rate, n_epoch, n_outputs)
	predictions = list()
	for row in test:
		prediction = predict(network, row)
		predictions.append(prediction)
	return predictions

In [22]:
# Make a prediction with a network
def predict(network, row):
	outputs = forward_propagate(network, row)
	return outputs.index(max(outputs))
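
The prediction step is just an arg-max over the output neurons. The sketch below isolates that step with made-up output values; `predict_from_outputs` is a hypothetical helper used only for this illustration.

```python
def predict_from_outputs(outputs):
    # Index of the largest output is the predicted class
    return outputs.index(max(outputs))

first = predict_from_outputs([0.12, 0.81, 0.33])
tie = predict_from_outputs([0.5, 0.5])
print(first)  # 1
print(tie)    # 0, because index() returns the first match on a tie
```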

In [23]:
from random import seed
from random import randrange
from random import random
from csv import reader
from math import exp

In [24]:
# Load a CSV file
def load_csv(filename):
	dataset = list()
	with open(filename, 'r') as file:
		csv_reader = reader(file)
		for row in csv_reader:
			if not row:
				continue
			dataset.append(row)
	return dataset

In [25]:
# Convert string column to float
def str_column_to_float(dataset, column):
	for row in dataset:
		row[column] = float(row[column].strip())

In [26]:
# Convert string column to integer
def str_column_to_int(dataset, column):
	class_values = [row[column] for row in dataset]
	unique = set(class_values)
	lookup = dict()
	for i, value in enumerate(unique):
		lookup[value] = i
	for row in dataset:
		row[column] = lookup[row[column]]
	return lookup
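
One caveat: the iteration order of a set of strings can differ between Python runs, so the class-to-integer mapping above is not guaranteed to be the same every time. If a reproducible mapping matters, a possible variant is to sort the unique values first. The function name `str_column_to_int_sorted` and the tiny dataset are assumptions for illustration.

```python
def str_column_to_int_sorted(dataset, column):
    # Sorting the unique values makes the mapping the same on every run
    unique = sorted(set(row[column] for row in dataset))
    lookup = {value: i for i, value in enumerate(unique)}
    for row in dataset:
        row[column] = lookup[row[column]]
    return lookup

dataset = [[0.1, '3'], [0.2, '1'], [0.3, '2'], [0.4, '1']]
lookup = str_column_to_int_sorted(dataset, 1)
labels = [row[1] for row in dataset]
print(lookup)  # {'1': 0, '2': 1, '3': 2}
print(labels)  # [2, 0, 1, 0]
```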

In [27]:
# Find the min and max values for each column
def dataset_minmax(dataset):
	# zip(*dataset) transposes the rows into columns
	stats = [[min(column), max(column)] for column in zip(*dataset)]
	return stats

In [28]:
# Rescale dataset columns to the range 0-1
def normalize_dataset(dataset, minmax):
	for row in dataset:
		for i in range(len(row)-1):
			row[i] = (row[i] - minmax[i][0]) / (minmax[i][1] - minmax[i][0])
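
A short example of the rescaling on a made-up three-row dataset. Both helpers are repeated so the snippet runs on its own. Note that a column whose values are all identical would make the denominator zero, so this sketch assumes every input column has at least two distinct values.

```python
def dataset_minmax(dataset):
    return [[min(column), max(column)] for column in zip(*dataset)]

def normalize_dataset(dataset, minmax):
    for row in dataset:
        for i in range(len(row) - 1):  # the last column is the class label
            row[i] = (row[i] - minmax[i][0]) / (minmax[i][1] - minmax[i][0])

dataset = [[1.0, 10.0, 0],
           [3.0, 20.0, 1],
           [5.0, 40.0, 0]]
minmax = dataset_minmax(dataset)
normalize_dataset(dataset, minmax)
for row in dataset:
    print(row)
# The first column becomes 0.0, 0.5, 1.0; the class labels are left untouched
```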

In [29]:
# Split a dataset into k folds
def cross_validation_split(dataset, n_folds):
	dataset_split = list()
	dataset_copy = list(dataset)
	fold_size = int(len(dataset) / n_folds)
	for i in range(n_folds):
		fold = list()
		while len(fold) < fold_size:
			index = randrange(len(dataset_copy))
			fold.append(dataset_copy.pop(index))
		dataset_split.append(fold)
	return dataset_split
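
It is worth seeing what the split does when the dataset size is not a multiple of the fold count. In the sketch below (a made-up dataset of 10 single-value rows, split into 3 folds), each fold gets 3 rows and the one leftover row is not used at all.

```python
from random import seed, randrange

def cross_validation_split(dataset, n_folds):
    dataset_split = list()
    dataset_copy = list(dataset)
    fold_size = int(len(dataset) / n_folds)
    for i in range(n_folds):
        fold = list()
        while len(fold) < fold_size:
            index = randrange(len(dataset_copy))
            fold.append(dataset_copy.pop(index))  # pop() ensures no row is reused
        dataset_split.append(fold)
    return dataset_split

seed(1)
dataset = [[i] for i in range(10)]
folds = cross_validation_split(dataset, 3)
fold_sizes = [len(fold) for fold in folds]
used = [row[0] for fold in folds for row in fold]
print(fold_sizes)                 # [3, 3, 3]
print(len(used), len(set(used)))  # 9 9: nine distinct rows, one row left out
```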

In [30]:
# Calculate accuracy percentage
def accuracy_metric(actual, predicted):
	correct = 0
	for i in range(len(actual)):
		if actual[i] == predicted[i]:
			correct += 1
	return correct / float(len(actual)) * 100.0
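
A tiny check of the metric with made-up labels: three of the four predictions match, so the accuracy is 75%.

```python
def accuracy_metric(actual, predicted):
    correct = 0
    for i in range(len(actual)):
        if actual[i] == predicted[i]:
            correct += 1
    return correct / float(len(actual)) * 100.0

actual = [0, 1, 1, 0]
predicted = [0, 1, 0, 0]
score = accuracy_metric(actual, predicted)
print(score)  # 75.0
```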

In [31]:
# Evaluate an algorithm using a cross validation split
def evaluate_algorithm(dataset, algorithm, n_folds, *args):
	folds = cross_validation_split(dataset, n_folds)
	scores = list()
	for fold in folds:
		train_set = list(folds)
		train_set.remove(fold)
		train_set = sum(train_set, [])
		test_set = list()
		for row in fold:
			row_copy = list(row)
			test_set.append(row_copy)
			row_copy[-1] = None
		predicted = algorithm(train_set, test_set, *args)
		actual = [row[-1] for row in fold]
		accuracy = accuracy_metric(actual, predicted)
		scores.append(accuracy)
	return scores

In [35]:
# Test Backprop on Seeds dataset
seed(1)
# load and prepare data
filename = 'wheat-seeds.csv'
dataset = load_csv(filename)
for i in range(len(dataset[0])-1):
	str_column_to_float(dataset, i)
# convert class column to integers
str_column_to_int(dataset, len(dataset[0])-1)
# normalize input variables
minmax = dataset_minmax(dataset)
normalize_dataset(dataset, minmax)
# evaluate algorithm
n_folds = 5
l_rate = 0.3
n_epoch = 500
n_hidden = 5
scores = evaluate_algorithm(dataset, back_propagation, n_folds, l_rate, n_epoch, n_hidden)
print('Scores: %s' % scores)
print('Mean Accuracy: %.3f%%' % (sum(scores)/float(len(scores))))


>epoch=0, lrate=0.300, error=129.898
>epoch=1, lrate=0.300, error=107.123
>epoch=2, lrate=0.300, error=97.406
>epoch=3, lrate=0.300, error=86.025
>epoch=4, lrate=0.300, error=76.640
>epoch=5, lrate=0.300, error=70.130
>epoch=6, lrate=0.300, error=65.591
>epoch=7, lrate=0.300, error=62.087
>epoch=8, lrate=0.300, error=58.997
>epoch=9, lrate=0.300, error=55.957
>epoch=10, lrate=0.300, error=52.779
>epoch=11, lrate=0.300, error=49.414
>epoch=12, lrate=0.300, error=45.940
>epoch=13, lrate=0.300, error=42.536
>epoch=14, lrate=0.300, error=39.394
>epoch=15, lrate=0.300, error=36.630
>epoch=16, lrate=0.300, error=34.261
>epoch=17, lrate=0.300, error=32.250
>epoch=18, lrate=0.300, error=30.542
>epoch=19, lrate=0.300, error=29.084
>epoch=20, lrate=0.300, error=27.828
>epoch=21, lrate=0.300, error=26.735
>epoch=22, lrate=0.300, error=25.777
>epoch=23, lrate=0.300, error=24.929
>epoch=24, lrate=0.300, error=24.173
>epoch=25, lrate=0.300, error=23.495
>epoch=26, lrate=0.300, error=22.883
>epoch=27

>epoch=220, lrate=0.300, error=9.303
>epoch=221, lrate=0.300, error=9.278
>epoch=222, lrate=0.300, error=9.253
>epoch=223, lrate=0.300, error=9.229
>epoch=224, lrate=0.300, error=9.204
>epoch=225, lrate=0.300, error=9.180
>epoch=226, lrate=0.300, error=9.156
>epoch=227, lrate=0.300, error=9.132
>epoch=228, lrate=0.300, error=9.107
>epoch=229, lrate=0.300, error=9.083
>epoch=230, lrate=0.300, error=9.059
>epoch=231, lrate=0.300, error=9.035
>epoch=232, lrate=0.300, error=9.011
>epoch=233, lrate=0.300, error=8.987
>epoch=234, lrate=0.300, error=8.963
>epoch=235, lrate=0.300, error=8.939
>epoch=236, lrate=0.300, error=8.914
>epoch=237, lrate=0.300, error=8.890
>epoch=238, lrate=0.300, error=8.866
>epoch=239, lrate=0.300, error=8.841
>epoch=240, lrate=0.300, error=8.817
>epoch=241, lrate=0.300, error=8.792
>epoch=242, lrate=0.300, error=8.767
>epoch=243, lrate=0.300, error=8.742
>epoch=244, lrate=0.300, error=8.717
>epoch=245, lrate=0.300, error=8.692
>epoch=246, lrate=0.300, error=8.667
>

>epoch=449, lrate=0.300, error=5.087
>epoch=450, lrate=0.300, error=5.071
>epoch=451, lrate=0.300, error=5.055
>epoch=452, lrate=0.300, error=5.040
>epoch=453, lrate=0.300, error=5.024
>epoch=454, lrate=0.300, error=5.009
>epoch=455, lrate=0.300, error=4.993
>epoch=456, lrate=0.300, error=4.977
>epoch=457, lrate=0.300, error=4.962
>epoch=458, lrate=0.300, error=4.946
>epoch=459, lrate=0.300, error=4.931
>epoch=460, lrate=0.300, error=4.915
>epoch=461, lrate=0.300, error=4.899
>epoch=462, lrate=0.300, error=4.884
>epoch=463, lrate=0.300, error=4.868
>epoch=464, lrate=0.300, error=4.852
>epoch=465, lrate=0.300, error=4.837
>epoch=466, lrate=0.300, error=4.821
>epoch=467, lrate=0.300, error=4.805
>epoch=468, lrate=0.300, error=4.790
>epoch=469, lrate=0.300, error=4.774
>epoch=470, lrate=0.300, error=4.758
>epoch=471, lrate=0.300, error=4.743
>epoch=472, lrate=0.300, error=4.727
>epoch=473, lrate=0.300, error=4.711
>epoch=474, lrate=0.300, error=4.695
>epoch=475, lrate=0.300, error=4.680
>

>epoch=171, lrate=0.300, error=10.397
>epoch=172, lrate=0.300, error=10.380
>epoch=173, lrate=0.300, error=10.363
>epoch=174, lrate=0.300, error=10.345
>epoch=175, lrate=0.300, error=10.328
>epoch=176, lrate=0.300, error=10.311
>epoch=177, lrate=0.300, error=10.294
>epoch=178, lrate=0.300, error=10.277
>epoch=179, lrate=0.300, error=10.260
>epoch=180, lrate=0.300, error=10.243
>epoch=181, lrate=0.300, error=10.226
>epoch=182, lrate=0.300, error=10.209
>epoch=183, lrate=0.300, error=10.193
>epoch=184, lrate=0.300, error=10.176
>epoch=185, lrate=0.300, error=10.159
>epoch=186, lrate=0.300, error=10.142
>epoch=187, lrate=0.300, error=10.126
>epoch=188, lrate=0.300, error=10.109
>epoch=189, lrate=0.300, error=10.092
>epoch=190, lrate=0.300, error=10.076
>epoch=191, lrate=0.300, error=10.059
>epoch=192, lrate=0.300, error=10.042
>epoch=193, lrate=0.300, error=10.026
>epoch=194, lrate=0.300, error=10.009
>epoch=195, lrate=0.300, error=9.993
>epoch=196, lrate=0.300, error=9.976
>epoch=197, lr

>epoch=391, lrate=0.300, error=7.630
>epoch=392, lrate=0.300, error=7.622
>epoch=393, lrate=0.300, error=7.615
>epoch=394, lrate=0.300, error=7.607
>epoch=395, lrate=0.300, error=7.600
>epoch=396, lrate=0.300, error=7.592
>epoch=397, lrate=0.300, error=7.585
>epoch=398, lrate=0.300, error=7.578
>epoch=399, lrate=0.300, error=7.570
>epoch=400, lrate=0.300, error=7.563
>epoch=401, lrate=0.300, error=7.556
>epoch=402, lrate=0.300, error=7.549
>epoch=403, lrate=0.300, error=7.542
>epoch=404, lrate=0.300, error=7.535
>epoch=405, lrate=0.300, error=7.528
>epoch=406, lrate=0.300, error=7.521
>epoch=407, lrate=0.300, error=7.514
>epoch=408, lrate=0.300, error=7.507
>epoch=409, lrate=0.300, error=7.500
>epoch=410, lrate=0.300, error=7.493
>epoch=411, lrate=0.300, error=7.487
>epoch=412, lrate=0.300, error=7.480
>epoch=413, lrate=0.300, error=7.473
>epoch=414, lrate=0.300, error=7.466
>epoch=415, lrate=0.300, error=7.460
>epoch=416, lrate=0.300, error=7.453
>epoch=417, lrate=0.300, error=7.446
>

>epoch=125, lrate=0.300, error=11.370
>epoch=126, lrate=0.300, error=11.331
>epoch=127, lrate=0.300, error=11.293
>epoch=128, lrate=0.300, error=11.255
>epoch=129, lrate=0.300, error=11.218
>epoch=130, lrate=0.300, error=11.182
>epoch=131, lrate=0.300, error=11.145
>epoch=132, lrate=0.300, error=11.110
>epoch=133, lrate=0.300, error=11.074
>epoch=134, lrate=0.300, error=11.040
>epoch=135, lrate=0.300, error=11.005
>epoch=136, lrate=0.300, error=10.971
>epoch=137, lrate=0.300, error=10.938
>epoch=138, lrate=0.300, error=10.905
>epoch=139, lrate=0.300, error=10.872
>epoch=140, lrate=0.300, error=10.840
>epoch=141, lrate=0.300, error=10.808
>epoch=142, lrate=0.300, error=10.777
>epoch=143, lrate=0.300, error=10.746
>epoch=144, lrate=0.300, error=10.715
>epoch=145, lrate=0.300, error=10.685
>epoch=146, lrate=0.300, error=10.655
>epoch=147, lrate=0.300, error=10.625
>epoch=148, lrate=0.300, error=10.596
>epoch=149, lrate=0.300, error=10.567
>epoch=150, lrate=0.300, error=10.539
>epoch=151, 

>epoch=354, lrate=0.300, error=7.827
>epoch=355, lrate=0.300, error=7.820
>epoch=356, lrate=0.300, error=7.812
>epoch=357, lrate=0.300, error=7.805
>epoch=358, lrate=0.300, error=7.798
>epoch=359, lrate=0.300, error=7.791
>epoch=360, lrate=0.300, error=7.783
>epoch=361, lrate=0.300, error=7.776
>epoch=362, lrate=0.300, error=7.769
>epoch=363, lrate=0.300, error=7.762
>epoch=364, lrate=0.300, error=7.755
>epoch=365, lrate=0.300, error=7.748
>epoch=366, lrate=0.300, error=7.741
>epoch=367, lrate=0.300, error=7.734
>epoch=368, lrate=0.300, error=7.727
>epoch=369, lrate=0.300, error=7.720
>epoch=370, lrate=0.300, error=7.713
>epoch=371, lrate=0.300, error=7.706
>epoch=372, lrate=0.300, error=7.699
>epoch=373, lrate=0.300, error=7.692
>epoch=374, lrate=0.300, error=7.686
>epoch=375, lrate=0.300, error=7.679
>epoch=376, lrate=0.300, error=7.672
>epoch=377, lrate=0.300, error=7.665
>epoch=378, lrate=0.300, error=7.658
>epoch=379, lrate=0.300, error=7.652
>epoch=380, lrate=0.300, error=7.645
>

>epoch=84, lrate=0.300, error=14.096
>epoch=85, lrate=0.300, error=14.049
>epoch=86, lrate=0.300, error=14.002
>epoch=87, lrate=0.300, error=13.956
>epoch=88, lrate=0.300, error=13.911
>epoch=89, lrate=0.300, error=13.866
>epoch=90, lrate=0.300, error=13.821
>epoch=91, lrate=0.300, error=13.777
>epoch=92, lrate=0.300, error=13.734
>epoch=93, lrate=0.300, error=13.691
>epoch=94, lrate=0.300, error=13.648
>epoch=95, lrate=0.300, error=13.606
>epoch=96, lrate=0.300, error=13.564
>epoch=97, lrate=0.300, error=13.523
>epoch=98, lrate=0.300, error=13.482
>epoch=99, lrate=0.300, error=13.441
>epoch=100, lrate=0.300, error=13.401
>epoch=101, lrate=0.300, error=13.361
>epoch=102, lrate=0.300, error=13.322
>epoch=103, lrate=0.300, error=13.283
>epoch=104, lrate=0.300, error=13.244
>epoch=105, lrate=0.300, error=13.205
>epoch=106, lrate=0.300, error=13.167
>epoch=107, lrate=0.300, error=13.129
>epoch=108, lrate=0.300, error=13.092
>epoch=109, lrate=0.300, error=13.054
>epoch=110, lrate=0.300, err

>epoch=324, lrate=0.300, error=8.259
>epoch=325, lrate=0.300, error=8.244
>epoch=326, lrate=0.300, error=8.229
>epoch=327, lrate=0.300, error=8.214
>epoch=328, lrate=0.300, error=8.199
>epoch=329, lrate=0.300, error=8.184
>epoch=330, lrate=0.300, error=8.169
>epoch=331, lrate=0.300, error=8.154
>epoch=332, lrate=0.300, error=8.139
>epoch=333, lrate=0.300, error=8.124
>epoch=334, lrate=0.300, error=8.109
>epoch=335, lrate=0.300, error=8.094
>epoch=336, lrate=0.300, error=8.079
>epoch=337, lrate=0.300, error=8.064
>epoch=338, lrate=0.300, error=8.049
>epoch=339, lrate=0.300, error=8.034
>epoch=340, lrate=0.300, error=8.020
>epoch=341, lrate=0.300, error=8.005
>epoch=342, lrate=0.300, error=7.990
>epoch=343, lrate=0.300, error=7.975
>epoch=344, lrate=0.300, error=7.961
>epoch=345, lrate=0.300, error=7.946
>epoch=346, lrate=0.300, error=7.931
>epoch=347, lrate=0.300, error=7.916
>epoch=348, lrate=0.300, error=7.902
>epoch=349, lrate=0.300, error=7.887
>epoch=350, lrate=0.300, error=7.873
>

>epoch=55, lrate=0.300, error=17.544
>epoch=56, lrate=0.300, error=17.436
>epoch=57, lrate=0.300, error=17.331
>epoch=58, lrate=0.300, error=17.227
>epoch=59, lrate=0.300, error=17.123
>epoch=60, lrate=0.300, error=17.021
>epoch=61, lrate=0.300, error=16.920
>epoch=62, lrate=0.300, error=16.820
>epoch=63, lrate=0.300, error=16.720
>epoch=64, lrate=0.300, error=16.621
>epoch=65, lrate=0.300, error=16.522
>epoch=66, lrate=0.300, error=16.423
>epoch=67, lrate=0.300, error=16.324
>epoch=68, lrate=0.300, error=16.226
>epoch=69, lrate=0.300, error=16.128
>epoch=70, lrate=0.300, error=16.029
>epoch=71, lrate=0.300, error=15.931
>epoch=72, lrate=0.300, error=15.833
>epoch=73, lrate=0.300, error=15.734
>epoch=74, lrate=0.300, error=15.636
>epoch=75, lrate=0.300, error=15.537
>epoch=76, lrate=0.300, error=15.439
>epoch=77, lrate=0.300, error=15.341
>epoch=78, lrate=0.300, error=15.243
>epoch=79, lrate=0.300, error=15.145
>epoch=80, lrate=0.300, error=15.047
>epoch=81, lrate=0.300, error=14.950
>

>epoch=295, lrate=0.300, error=7.423
>epoch=296, lrate=0.300, error=7.411
>epoch=297, lrate=0.300, error=7.399
>epoch=298, lrate=0.300, error=7.387
>epoch=299, lrate=0.300, error=7.376
>epoch=300, lrate=0.300, error=7.364
>epoch=301, lrate=0.300, error=7.352
>epoch=302, lrate=0.300, error=7.341
>epoch=303, lrate=0.300, error=7.330
>epoch=304, lrate=0.300, error=7.318
>epoch=305, lrate=0.300, error=7.307
>epoch=306, lrate=0.300, error=7.295
>epoch=307, lrate=0.300, error=7.284
>epoch=308, lrate=0.300, error=7.273
>epoch=309, lrate=0.300, error=7.262
>epoch=310, lrate=0.300, error=7.251
>epoch=311, lrate=0.300, error=7.240
>epoch=312, lrate=0.300, error=7.229
>epoch=313, lrate=0.300, error=7.218
>epoch=314, lrate=0.300, error=7.207
>epoch=315, lrate=0.300, error=7.196
>epoch=316, lrate=0.300, error=7.185
>epoch=317, lrate=0.300, error=7.174
>epoch=318, lrate=0.300, error=7.163
>epoch=319, lrate=0.300, error=7.152
>epoch=320, lrate=0.300, error=7.142
>epoch=321, lrate=0.300, error=7.131
>

In [1]:
# Reference: https://machinelearningmastery.com/implement-backpropagation-algorithm-scratch-python/