<a href="https://colab.research.google.com/github/MihaiDogariu/Keysight-Deep-Learning-Fundamentals/blob/main/Unit%20%234%20-%20Basic%20neural%20network%20learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Basic neural network learning

This notebook introduces the backpropagation algorithm on a simple task: classifying iris flowers based on four of their measured characteristics. The network in this example deliberately leaves out the bias term, so that the influence of its absence on the final outcome can be observed.


In [None]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

First, we need to load the data. The Iris dataset is a popular benchmark for classification in machine learning. It contains 150 samples split evenly across 3 classes, so each of the 3 iris types occurs 50 times. Each sample has 4 features: sepal length, sepal width, petal length and petal width.
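
The figures quoted above can be checked directly on the object returned by `load_iris`. The snippet below is a small sketch for that purpose; the variable names `iris` and `class_counts` are introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# Feature matrix: one row per flower, one column per measurement
print(iris.data.shape)

# Number of samples for each class label (0, 1, 2)
class_counts = np.bincount(iris.target)
print(class_counts)

# Names of the four measured features and of the three classes
print(iris.feature_names)
print(iris.target_names)
```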

In [None]:
# load the "Iris" dataset: https://archive.ics.uci.edu/ml/datasets/iris
data = load_iris()
# Extract data and labels from the dataset
X = data.data
y = data.target
# Split the data in an 80-20 ratio for train-test, respectively
# (a float test_size is a fraction; an integer would be an absolute sample count)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)
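
Note that `train_test_split` interprets an integer `test_size` as an absolute number of samples and a float between 0 and 1 as a fraction of the dataset. The sketch below illustrates the difference on stand-in arrays with the same number of samples as Iris; `X_demo` and `y_demo` are hypothetical and are not part of the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 150 samples, 4 features, 3 balanced classes
X_demo = np.arange(150 * 4).reshape(150, 4)
y_demo = np.repeat([0, 1, 2], 50)

# Integer test_size: exactly 20 samples are held out
_, X_test_int, _, _ = train_test_split(X_demo, y_demo, test_size=20, random_state=4)

# Float test_size: 20% of the samples are held out
_, X_test_frac, _, _ = train_test_split(X_demo, y_demo, test_size=0.2, random_state=4)

print(len(X_test_int), len(X_test_frac))  # 20 30
```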

In [None]:
# Set the hyperparameters
learning_rate = 0.1
iterations = 5000
N = y_train.size

# Setting the network's dimensions
input_size = 4 # corresponding to the 4 input features
hidden_size = 3
output_size = 3 # corresponding to the 3 target classes

# We will store the results in a pd.DataFrame for easier post-processing
results = pd.DataFrame(columns=["mse", "accuracy"])
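
The later cells shown here collect the per-iteration metrics in plain Python lists rather than in `results`. One possible way to move such lists into a DataFrame with the same two columns is sketched below. The numbers are placeholders, not measurements from this notebook, and the names `mse_history`, `acc_history` and `results_demo` are introduced only for this example.

```python
import pandas as pd

# Placeholder values standing in for metrics recorded during training
mse_history = [0.25, 0.20, 0.15]
acc_history = [0.40, 0.65, 0.80]

# One row per iteration, with the same "mse" and "accuracy" columns
results_demo = pd.DataFrame({"mse": mse_history, "accuracy": acc_history})

# Index of the iteration with the lowest cost
best_iteration = results_demo["mse"].idxmin()
print(results_demo)
print(best_iteration)
```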

We need to set up the weights of a two-layer network. No bias vectors are created, since this example leaves the bias term out.

In [None]:
# Initialize the weights between the input and the hidden layer
W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))

# Initialize the weights between the hidden and the output layer
W2 = np.random.normal(scale=0.5, size=(hidden_size, output_size))
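
To see how these two matrices fit together, the sketch below pushes a small random batch through a network of the same shape and prints the intermediate shapes. It also shows one consequence of having no bias: an all-zero input always produces hidden activations of exactly 0.5, whatever the weights are. All names ending in `_demo`, the seeded generator and the batch of 5 samples are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make this sketch repeatable

def sigmoid(x):
  return 1 / (1 + np.exp(-x))

# Same layer sizes as in the notebook: 4 inputs, 3 hidden units, 3 outputs
W1_demo = rng.normal(scale=0.5, size=(4, 3))
W2_demo = rng.normal(scale=0.5, size=(3, 3))

# A hypothetical batch of 5 samples with 4 features each
X_demo = rng.normal(size=(5, 4))
hidden_demo = sigmoid(X_demo @ W1_demo)       # (5, 4) @ (4, 3) -> (5, 3)
output_demo = sigmoid(hidden_demo @ W2_demo)  # (5, 3) @ (3, 3) -> (5, 3)
print(hidden_demo.shape, output_demo.shape)

# Without a bias, a zero input maps to sigmoid(0) = 0.5 in every hidden unit
zero_input = np.zeros((1, 4))
zero_hidden = sigmoid(zero_input @ W1_demo)
print(zero_hidden)  # [[0.5 0.5 0.5]]
```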

We also need to define the helper functions ourselves: the activation function, the cost function, the accuracy metric and the conversion of labels to one-hot encoded vectors.

In [None]:
# Defining the sigmoid function
def sigmoid(x):
  return 1 / (1 + np.exp(-x))

# Defining the Mean Squared Error (halved, which simplifies its derivative)
def mean_squared_error(y_pred, y_true):
  return ((y_pred - y_true)**2).sum() / (2*y_pred.size)

# Defining accuracy
def accuracy(y_pred, y_true):
  acc = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
  return acc.mean()

# Defining one-hot encoding
def one_hot(x):
  result = np.zeros((x.size, x.max()+1))
  result[np.arange(x.size), x] = 1
  return result
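
A quick way to get a feel for these helpers is to call them on a tiny hand-made example. The sketch below repeats the definitions so that it runs on its own; the arrays `labels_demo` and `outputs_demo` are hypothetical values chosen for illustration.

```python
import numpy as np

def mean_squared_error(y_pred, y_true):
  return ((y_pred - y_true)**2).sum() / (2*y_pred.size)

def accuracy(y_pred, y_true):
  acc = y_pred.argmax(axis=1) == y_true.argmax(axis=1)
  return acc.mean()

def one_hot(x):
  result = np.zeros((x.size, x.max()+1))
  result[np.arange(x.size), x] = 1
  return result

labels_demo = np.array([0, 2, 1])
targets_demo = one_hot(labels_demo)
print(targets_demo)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]

# Hypothetical network outputs: rows 1 and 2 pick the right class, row 3 does not
outputs_demo = np.array([[0.8, 0.1, 0.1],
                         [0.2, 0.2, 0.6],
                         [0.7, 0.2, 0.1]])
acc_demo = accuracy(outputs_demo, targets_demo)            # 2 of 3 correct
mse_demo = mean_squared_error(outputs_demo, targets_demo)  # 1.44 / 18
print(acc_demo, mse_demo)
```

Note that `one_hot` infers the number of classes from `x.max()`, so a label array that happens to lack the highest class would produce a matrix with fewer columns.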

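Train the model with full-batch gradient descent: each iteration runs a forward pass over the whole training set, then backpropagates the error and updates both weight matrices.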
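
For reference, the update rule used in the training loop can be written out as follows. This is a sketch under one assumption: the cost being minimised is the sum of squared errors divided by 2N, which differs from `mean_squared_error` as defined earlier only by the constant factor `output_size`. The symbols δ₁, δ₂, η (the learning rate) and ⊙ (element-wise product) are notation introduced here, not taken from the notebook.

```latex
\begin{aligned}
A_1 &= \sigma(X W_1), \qquad A_2 = \sigma(A_1 W_2), \qquad
L = \frac{1}{2N} \sum_{n,k} \left(A_{2,nk} - Y_{nk}\right)^2 \\
\delta_2 &= (A_2 - Y) \odot A_2 \odot (1 - A_2), \qquad
\frac{\partial L}{\partial W_2} = \frac{1}{N} A_1^{\top} \delta_2 \\
\delta_1 &= \left(\delta_2 W_2^{\top}\right) \odot A_1 \odot (1 - A_1), \qquad
\frac{\partial L}{\partial W_1} = \frac{1}{N} X^{\top} \delta_1 \\
W_i &\leftarrow W_i - \eta \, \frac{\partial L}{\partial W_i}, \qquad i \in \{1, 2\}
\end{aligned}
```

The factors of the form A ⊙ (1 − A) come from the derivative of the sigmoid, σ'(z) = σ(z)(1 − σ(z)).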

In [None]:
# Transform the labels into one-hot encoded vectors
one_hot_y_train = one_hot(y_train)
one_hot_y_test = one_hot(y_test)

mse_list = []
acc_list = []

for itr in range(iterations):
  # Forward propagate the input through the first layer
  A1 = sigmoid(np.dot(X_train, W1))

  # Forward propagate the output of the first layer towards the output
  A2 = sigmoid(np.dot(A1, W2))

  # Compute the cost function (mse) and the metric (acc)
  mse = mean_squared_error(A2, one_hot_y_train)
  acc = accuracy(A2, one_hot_y_train)

  # Save both mse and acc
  mse_list.append(mse)
  acc_list.append(acc)

  # Backpropagation
  # Delta at the output layer: the output error times the sigmoid derivative A2 * (1 - A2)
  E2 = A2 - one_hot_y_train
  delta2 = E2 * A2 * (1 - A2)

  # Delta at the hidden layer: the output delta propagated back through W2
  E1 = np.dot(delta2, W2.T)
  delta1 = E1 * A1 * (1 - A1)

  # Gradients with respect to the weights, averaged over the N training samples
  W2_update = np.dot(A1.T, delta2) / N
  W1_update = np.dot(X_train.T, delta1) / N

  # Gradient descent step
  W2 = W2 - learning_rate * W2_update
  W1 = W1 - learning_rate * W1_update
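
A common sanity check for hand-written backpropagation is to compare the analytic gradients with finite-difference estimates. The sketch below does this on small random data, assuming the cost is the sum of squared errors divided by 2N (the cost whose gradient matches the updates in the loop). The helper names `loss_chk`, `analytic_grads` and `numeric_grad`, as well as all arrays ending in `_chk`, are introduced only for this check.

```python
import numpy as np

def sigmoid(x):
  return 1 / (1 + np.exp(-x))

def loss_chk(W1, W2, X, Y):
  # Assumed cost: sum of squared errors divided by 2N
  A2 = sigmoid(sigmoid(X @ W1) @ W2)
  return ((A2 - Y)**2).sum() / (2 * X.shape[0])

def analytic_grads(W1, W2, X, Y):
  # Same formulas as the backpropagation step in the training loop
  N = X.shape[0]
  A1 = sigmoid(X @ W1)
  A2 = sigmoid(A1 @ W2)
  delta2 = (A2 - Y) * A2 * (1 - A2)
  delta1 = (delta2 @ W2.T) * A1 * (1 - A1)
  return X.T @ delta1 / N, A1.T @ delta2 / N

def numeric_grad(f, W, eps=1e-6):
  # Central finite differences, perturbing one weight at a time in place
  grad = np.zeros_like(W)
  for idx in np.ndindex(W.shape):
    original = W[idx]
    W[idx] = original + eps
    plus = f()
    W[idx] = original - eps
    minus = f()
    W[idx] = original
    grad[idx] = (plus - minus) / (2 * eps)
  return grad

rng = np.random.default_rng(0)
X_chk = rng.normal(size=(6, 4))                  # hypothetical inputs
Y_chk = np.eye(3)[rng.integers(0, 3, size=6)]    # hypothetical one-hot targets
W1_chk = rng.normal(scale=0.5, size=(4, 3))
W2_chk = rng.normal(scale=0.5, size=(3, 3))

gW1, gW2 = analytic_grads(W1_chk, W2_chk, X_chk, Y_chk)
nW1 = numeric_grad(lambda: loss_chk(W1_chk, W2_chk, X_chk, Y_chk), W1_chk)
nW2 = numeric_grad(lambda: loss_chk(W1_chk, W2_chk, X_chk, Y_chk), W2_chk)

# Largest absolute disagreement for each weight matrix; both should be tiny
err_W1 = np.abs(gW1 - nW1).max()
err_W2 = np.abs(gW2 - nW2).max()
print(err_W1, err_W2)
```

If the two estimates disagree by more than a small tolerance, the usual suspects are a missing sigmoid derivative or a transposed matrix in one of the delta terms.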

Test the model by running forward propagation on the test dataset.

In [None]:
# Run the trained model on the test dataset
Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)

Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)

acc = accuracy(A2, one_hot_y_test)

# Plot the training accuracy recorded at each iteration
plt.plot(np.arange(iterations), acc_list)
plt.xlabel("Iteration")
plt.ylabel("Training accuracy")
plt.show()
print("Test accuracy: {}".format(acc))