# **Tasks**

Go to this [url](https://drive.google.com/file/d/1PU1AQVCgnFL2XbgIJRl3TwkHY5Ag3qIv/view?usp=sharing) and download the data first. In order to know more about the dataset please refer to these links - [UCI/iris](https://archive.ics.uci.edu/ml/datasets/Iris), or [Kaggle/iris_dataset](https://www.kaggle.com/uciml/iris).

**The "species" field refers to the Predicted attribute: class of iris plant.**

 Now try to do the following:

1. Apply an ANN to find/predict/classify the class of IRIS plants.
2. Apply different termination criteria to see if the accuracy changes.

In [31]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## **Loading the dataset**

In [32]:
from random import seed
from random import randrange
from random import random
from csv import reader
from math import exp

In [33]:
# Load a CSV file
def load_csv(filename):
	dataset = list()
	with open(filename, 'r') as file:
		csv_reader = reader(file)
		for row in csv_reader:
			if not row:
				continue
			dataset.append(row)
	return dataset

In [34]:
# Test Backprop on Seeds dataset
seed(1)
#load and prepare data
filename = '/content/drive/MyDrive/Iris.csv'
dataset = load_csv(filename)

In [19]:
dataset[:10]

[['Id',
  'SepalLengthCm',
  'SepalWidthCm',
  'PetalLengthCm',
  'PetalWidthCm',
  'Species'],
 ['1', '5.1', '3.5', '1.4', '0.2', 'Iris-setosa'],
 ['2', '4.9', '3.0', '1.4', '0.2', 'Iris-setosa'],
 ['3', '4.7', '3.2', '1.3', '0.2', 'Iris-setosa'],
 ['4', '4.6', '3.1', '1.5', '0.2', 'Iris-setosa'],
 ['5', '5.0', '3.6', '1.4', '0.2', 'Iris-setosa'],
 ['6', '5.4', '3.9', '1.7', '0.4', 'Iris-setosa'],
 ['7', '4.6', '3.4', '1.4', '0.3', 'Iris-setosa'],
 ['8', '5.0', '3.4', '1.5', '0.2', 'Iris-setosa'],
 ['9', '4.4', '2.9', '1.4', '0.2', 'Iris-setosa']]

## **Dataset Preprocessing**

As a neural network does not understand anything rather than numbers, then let's convert every entity into numbers first.

In [20]:
dataset[:10]

[['Id',
  'SepalLengthCm',
  'SepalWidthCm',
  'PetalLengthCm',
  'PetalWidthCm',
  'Species'],
 ['1', '5.1', '3.5', '1.4', '0.2', 'Iris-setosa'],
 ['2', '4.9', '3.0', '1.4', '0.2', 'Iris-setosa'],
 ['3', '4.7', '3.2', '1.3', '0.2', 'Iris-setosa'],
 ['4', '4.6', '3.1', '1.5', '0.2', 'Iris-setosa'],
 ['5', '5.0', '3.6', '1.4', '0.2', 'Iris-setosa'],
 ['6', '5.4', '3.9', '1.7', '0.4', 'Iris-setosa'],
 ['7', '4.6', '3.4', '1.4', '0.3', 'Iris-setosa'],
 ['8', '5.0', '3.4', '1.5', '0.2', 'Iris-setosa'],
 ['9', '4.4', '2.9', '1.4', '0.2', 'Iris-setosa']]

In [29]:
# Convert string column to float
def str_column_to_float(dataset, column):
	for row in dataset:
		row[column] = float(row[column].strip())

# Convert string column to integer
def str_column_to_int(dataset, column):
	class_values = [row[column] for row in dataset]
	unique = set(class_values)
	print(unique)
	lookup = dict()
	for i, value in enumerate(unique):
		lookup[value] = i
	for row in dataset:
		row[column] = lookup[row[column]]
	return lookup


In [30]:
for i in range(len(dataset[0])-1):
    str_column_to_float(dataset, i)

    # convert class column to integers
str_column_to_int(dataset, len(dataset[0])-1)

{'Iris-virginica', 'Iris-versicolor', 'Iris-setosa'}

{'Iris-virginica': 0, 'Iris-versicolor': 1, 'Iris-setosa': 2}

ValueError: ignored

In [23]:
dataset[:10]

[['Id', 'SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm', 2],
 ['1', '5.1', '3.5', '1.4', '0.2', 0],
 ['2', '4.9', '3.0', '1.4', '0.2', 0],
 ['3', '4.7', '3.2', '1.3', '0.2', 0],
 ['4', '4.6', '3.1', '1.5', '0.2', 0],
 ['5', '5.0', '3.6', '1.4', '0.2', 0],
 ['6', '5.4', '3.9', '1.7', '0.4', 0],
 ['7', '4.6', '3.4', '1.4', '0.3', 0],
 ['8', '5.0', '3.4', '1.5', '0.2', 0],
 ['9', '4.4', '2.9', '1.4', '0.2', 0]]

# **Input-Output Encoding**

One possible drawback of neural networks is that all attribute values must be encoded in a standardized manner, taking values between zero and 1, even for categorical variables. Later, when we examine the details of the back-propagation algorithm, we shall understand why this is necessary.

For now, however, how does one go about standardizing all the attribute values?

In [27]:
# Find the min and max values for each column
def dataset_minmax(dataset):
    minmax = list()
    stats = [[min(column), max(column)] for column in zip(*dataset)]
    return stats

# Rescale dataset columns to the range 0-1
def normalize_dataset(dataset, minmax):
    for row in dataset:
        for i in range(len(row)-1):
            row[i] = (row[i] - minmax[i][0]) / (minmax[i][1] - minmax[i][0])

In [28]:
# normalize input variables
minmax = dataset_minmax(dataset)
normalize_dataset(dataset, minmax)

TypeError: ignored

In [None]:
dataset[:10]

[[0.0,
  0.22222222222222213,
  0.6249999999999999,
  0.06779661016949151,
  0.04166666666666667,
  0],
 [0.006711409395973154,
  0.1666666666666668,
  0.41666666666666663,
  0.06779661016949151,
  0.04166666666666667,
  0],
 [0.013422818791946308,
  0.11111111111111119,
  0.5,
  0.05084745762711865,
  0.04166666666666667,
  0],
 [0.020134228187919462,
  0.08333333333333327,
  0.4583333333333333,
  0.0847457627118644,
  0.04166666666666667,
  0],
 [0.026845637583892617,
  0.19444444444444448,
  0.6666666666666666,
  0.06779661016949151,
  0.04166666666666667,
  0],
 [0.03355704697986577,
  0.30555555555555564,
  0.7916666666666665,
  0.11864406779661016,
  0.12500000000000003,
  0],
 [0.040268456375838924,
  0.08333333333333327,
  0.5833333333333333,
  0.06779661016949151,
  0.08333333333333333,
  0],
 [0.04697986577181208,
  0.19444444444444448,
  0.5833333333333333,
  0.0847457627118644,
  0.04166666666666667,
  0],
 [0.053691275167785234,
  0.027777777777777922,
  0.3749999999999999

Neural Network

In [None]:
# Initialize a network
def initialize_network(n_inputs, n_hidden, n_outputs):
	network = list()
	hidden_layer = [{'weights':[random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
	network.append(hidden_layer)
	output_layer = [{'weights':[random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
	network.append(output_layer)
	return network

In [None]:
# Update network weights with error
def update_weights(network, row, l_rate):
	for i in range(len(network)):
		inputs = row[:-1]
		if i != 0:
			inputs = [neuron['output'] for neuron in network[i - 1]]
		for neuron in network[i]:
			for j in range(len(inputs)):
				neuron['weights'][j] -= l_rate * neuron['delta'] * inputs[j]
			neuron['weights'][-1] -= l_rate * neuron['delta']

Sigmoid Activation Function

In [None]:
# Calculate neuron activation for an input
def activate(weights, inputs):
	activation = weights[-1]
	for i in range(len(weights)-1):
		activation += weights[i] * inputs[i]
	return activation

# Transfer neuron activation
def transfer(activation):
	return 1.0 / (1.0 + exp(-activation))

Back propagation

In [None]:
# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i]-outputs[i])**2 for i in range(len(expected))])
            backward_propagate_error(network, expected)
            update_weights(network, row, l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))

Gradient Descent Method

In [None]:
# Forward propagate input to a network output
def forward_propagate(network, row):
	inputs = row
	for layer in network:
		new_inputs = []
		for neuron in layer:
			activation = activate(neuron['weights'], inputs)
			neuron['output'] = transfer(activation)
			new_inputs.append(neuron['output'])
		inputs = new_inputs
	return inputs

In [None]:
# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i]-outputs[i])**2 for i in range(len(expected))])
            backward_propagate_error(network, expected)
            update_weights(network, row, l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))

Back Propagation Rules

In [None]:
# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
	for i in reversed(range(len(network))):
		layer = network[i]
		errors = list()
		if i != len(network)-1:
			for j in range(len(layer)):
				error = 0.0
				for neuron in network[i + 1]:
					error += (neuron['weights'][j] * neuron['delta'])
				errors.append(error)
		else:
			for j in range(len(layer)):
				neuron = layer[j]
				errors.append(neuron['output'] - expected[j])
		for j in range(len(layer)):
			neuron = layer[j]
			neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])


In [None]:
# Backpropagation Algorithm With Stochastic Gradient Descent
def back_propagation(train, test, l_rate, n_epoch, n_hidden):
	n_inputs = len(train[0]) - 1
	n_outputs = len(set([row[-1] for row in train]))
	network = initialize_network(n_inputs, n_hidden, n_outputs)
	train_network(network, train, l_rate, n_epoch, n_outputs)
	predictions = list()
	for row in test:
		prediction = predict(network, row)
		predictions.append(prediction)
	return(predictions)

In [None]:
# Calculate the derivative of an neuron output
def transfer_derivative(output):
	return output * (1.0 - output)

Termination Criteria



In [None]:
# Split a dataset into k folds
def cross_validation_split(dataset, n_folds):
	dataset_split = list()
	dataset_copy = list(dataset)
	fold_size = int(len(dataset) / n_folds)
	for i in range(n_folds):
		fold = list()
		while len(fold) < fold_size:
			index = randrange(len(dataset_copy))
			fold.append(dataset_copy.pop(index))
		dataset_split.append(fold)
	return dataset_split

# Calculate accuracy percentage
def accuracy_metric(actual, predicted):
	correct = 0
	for i in range(len(actual)):
		if actual[i] == predicted[i]:
			correct += 1
	return correct / float(len(actual)) * 100.0

# Evaluate an algorithm using a cross validation split
def evaluate_algorithm(dataset, algorithm, n_folds, *args):
	folds = cross_validation_split(dataset, n_folds)
	scores = list()
	for fold in folds:
		train_set = list(folds)
		train_set.remove(fold)
		train_set = sum(train_set, [])
		test_set = list()
		for row in fold:
			row_copy = list(row)
			test_set.append(row_copy)
			row_copy[-1] = None
		predicted = algorithm(train_set, test_set, *args)
		actual = [row[-1] for row in fold]
		accuracy = accuracy_metric(actual, predicted)
		scores.append(accuracy)
	return scores

In [None]:
# Make a prediction with a network
def predict(network, row):
	outputs = forward_propagate(network, row)
	return outputs.index(max(outputs))

Learning Rate

Trial 1

In [None]:
n_folds = 5
l_rate = 0.1
n_epoch = 500
n_hidden = 21
scores = evaluate_algorithm(dataset, back_propagation, n_folds, l_rate, n_epoch, n_hidden)
print('Scores: %s' % scores)
print('Mean Accuracy: %.3f%%' % (sum(scores)/float(len(scores))))

>epoch=0, lrate=0.100, error=239.894
>epoch=1, lrate=0.100, error=239.890
>epoch=2, lrate=0.100, error=239.886
>epoch=3, lrate=0.100, error=239.882
>epoch=4, lrate=0.100, error=239.878
>epoch=5, lrate=0.100, error=239.874
>epoch=6, lrate=0.100, error=239.869
>epoch=7, lrate=0.100, error=239.863
>epoch=8, lrate=0.100, error=239.857
>epoch=9, lrate=0.100, error=239.850
>epoch=10, lrate=0.100, error=239.843
>epoch=11, lrate=0.100, error=239.835
>epoch=12, lrate=0.100, error=239.826
>epoch=13, lrate=0.100, error=239.815
>epoch=14, lrate=0.100, error=239.803
>epoch=15, lrate=0.100, error=239.789
>epoch=16, lrate=0.100, error=239.773
>epoch=17, lrate=0.100, error=239.753
>epoch=18, lrate=0.100, error=239.730
>epoch=19, lrate=0.100, error=239.700
>epoch=20, lrate=0.100, error=239.662
>epoch=21, lrate=0.100, error=239.611
>epoch=22, lrate=0.100, error=239.537
>epoch=23, lrate=0.100, error=239.420
>epoch=24, lrate=0.100, error=239.203
>epoch=25, lrate=0.100, error=238.630
>epoch=26, lrate=0.100

For, n_folds = 5 l_rate = 0.1 n_epoch = 500 n_hidden = 21

Scores: [96.66666666666667, 96.66666666666667, 100.0, 96.66666666666667, 96.66666666666667]

Mean Accuracy: 97.333%

Trial 2

In [None]:
n_folds = 6
l_rate = 0.2
n_epoch = 600
n_hidden = 31
scores = evaluate_algorithm(dataset, back_propagation, n_folds, l_rate, n_epoch, n_hidden)
print('Scores: %s' % scores)
print('Mean Accuracy: %.3f%%' % (sum(scores)/float(len(scores))))

>epoch=0, lrate=0.200, error=249.997
>epoch=1, lrate=0.200, error=249.997
>epoch=2, lrate=0.200, error=249.997
>epoch=3, lrate=0.200, error=249.997
>epoch=4, lrate=0.200, error=249.997
>epoch=5, lrate=0.200, error=249.997
>epoch=6, lrate=0.200, error=249.997
>epoch=7, lrate=0.200, error=249.997
>epoch=8, lrate=0.200, error=249.997
>epoch=9, lrate=0.200, error=249.997
>epoch=10, lrate=0.200, error=249.997
>epoch=11, lrate=0.200, error=249.997
>epoch=12, lrate=0.200, error=249.997
>epoch=13, lrate=0.200, error=249.996
>epoch=14, lrate=0.200, error=249.996
>epoch=15, lrate=0.200, error=249.996
>epoch=16, lrate=0.200, error=249.996
>epoch=17, lrate=0.200, error=249.996
>epoch=18, lrate=0.200, error=249.996
>epoch=19, lrate=0.200, error=249.996
>epoch=20, lrate=0.200, error=249.996
>epoch=21, lrate=0.200, error=249.996
>epoch=22, lrate=0.200, error=249.996
>epoch=23, lrate=0.200, error=249.996
>epoch=24, lrate=0.200, error=249.996
>epoch=25, lrate=0.200, error=249.996
>epoch=26, lrate=0.200

KeyboardInterrupt: ignored

For, n_folds = 6, l_rate = 0.2, n_epoch = 600, n_hidden = 31

Scores: [28.000000000000004, 100.0, 100.0, 100.0, 36.0, 24.0]

Mean Accuracy: 64.667%

Here accuracy decreased significantly, let's tweak the values separately now to see which variable is responsible



Trial 3

In [None]:
n_folds = 7
l_rate = 0.3
n_epoch = 700
n_hidden = 41
scores = evaluate_algorithm(dataset, back_propagation, n_folds, l_rate, n_epoch, n_hidden)
print('Scores: %s' % scores)
print('Mean Accuracy: %.3f%%' % (sum(scores)/float(len(scores))))

>epoch=0, lrate=0.300, error=252.000
>epoch=1, lrate=0.300, error=252.000
>epoch=2, lrate=0.300, error=252.000
>epoch=3, lrate=0.300, error=252.000
>epoch=4, lrate=0.300, error=252.000
>epoch=5, lrate=0.300, error=252.000
>epoch=6, lrate=0.300, error=252.000
>epoch=7, lrate=0.300, error=252.000
>epoch=8, lrate=0.300, error=252.000
>epoch=9, lrate=0.300, error=252.000
>epoch=10, lrate=0.300, error=252.000
>epoch=11, lrate=0.300, error=252.000
>epoch=12, lrate=0.300, error=252.000
>epoch=13, lrate=0.300, error=252.000
>epoch=14, lrate=0.300, error=252.000
>epoch=15, lrate=0.300, error=252.000
>epoch=16, lrate=0.300, error=252.000
>epoch=17, lrate=0.300, error=252.000
>epoch=18, lrate=0.300, error=252.000
>epoch=19, lrate=0.300, error=252.000
>epoch=20, lrate=0.300, error=252.000
>epoch=21, lrate=0.300, error=252.000
>epoch=22, lrate=0.300, error=252.000
>epoch=23, lrate=0.300, error=252.000
>epoch=24, lrate=0.300, error=252.000
>epoch=25, lrate=0.300, error=252.000
>epoch=26, lrate=0.300

For n_folds = 7, l_rate = 0.3, n_epoch = 700, n_hidden = 41

Scores: [28.57142857142857, 33.33333333333333, 14.285714285714285, 28.57142857142857, 47.61904761904761, 28.57142857142857, 42.857142857142854]

Mean Accuracy: 31.973%

Here we can see that the accuracy remains still low even after decreasing n_epoch. That mean n_hidden layer is responsible for such low accuracy. Now let's see what happens if we decrease it less than 21.