<a href="https://colab.research.google.com/github/MihaiDogariu/Keysight-Deep-Learning-Fundamentals--v2-/blob/main/scripts/Unit_7_Fundamental_elements_of_a_neural_network.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fundamental elements of a neural network
This notebook covers several basic aspects of neural networks. The application aims to forecast whether it will rain on the following day, based on the current day's observations. The problem can therefore be treated as a binary classification task.

In [1]:
import torch
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns
import matplotlib.pyplot as plt

RANDOM_SEED = 1 # for reproducibility
torch.manual_seed(RANDOM_SEED)

<torch._C.Generator at 0x1c1ac8647b0>

## 1. Choosing the data

Download the ["Rain in Australia"](https://www.kaggle.com/datasets/jsphyg/weather-dataset-rattle-package?resource=download) dataset:

In [3]:
!gdown 1D-Ua952YzK95yPCJxzr8SWu7ZJGVROJd

Downloading...
From: https://drive.google.com/uc?id=1D-Ua952YzK95yPCJxzr8SWu7ZJGVROJd
To: D:\Keysight\Programs\weatherAUS.csv



Each dataset entry contains 23 fields, i.e. 23 descriptors (features). We can have a look at the first few entries by calling `head()` on the `pd.DataFrame`.

In [None]:
df = pd.read_csv('weatherAUS.csv')
print(df.shape)
print(list(df.columns))
df.head(5)

## 2. Data pre-processing

In [None]:
# The dataset contains a large set of attributes, but we are not interested in all of them
# Let us make a list with the interesting attributes only
keep = ['MinTemp', 'MaxTemp', 'Rainfall', 'Humidity3pm', 'Pressure9am', 'RainToday', 'RainTomorrow']

# We keep only the attributes that belong to the above list
# .copy() gives an independent DataFrame, so the assignments below do not act on a view of df
df_keep = df[keep].copy()

# Replace the literal strings 'Yes'/'No' with the numerical values 1/0 (missing values stay NaN)
df_keep['RainToday'] = df_keep['RainToday'].map({'No': 0, 'Yes': 1})
df_keep['RainTomorrow'] = df_keep['RainTomorrow'].map({'No': 0, 'Yes': 1})

# The dataset also contains NaN values, which must be dealt with. The simplest solution is to drop every row that contains one.
df_keep = df_keep.dropna(how='any')
df_keep.head(5)

Let's look at the number of rainy and non-rainy days:

In [None]:
print("No. of days when it did not rain: ", df_keep['RainTomorrow'].value_counts()[0])
print("No. of days when it rained: ", df_keep['RainTomorrow'].value_counts()[1])

# Display these values as a ratio of the entire dataset
df_keep['RainTomorrow'].value_counts() / df_keep.shape[0]

We can observe that the dataset is not balanced: there are roughly 3.5 times as many days without rain as days with rain.
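To see why this imbalance matters, consider a trivial model that always predicts "no rain". The sketch below is only a back-of-the-envelope calculation and assumes the approximate 3.5:1 ratio quoted above; the variable names are illustrative.

```python
# Approximate ratio of "no rain" days to "rain" days, taken from the text above
ratio_no_rain_to_rain = 3.5

# Share of the dataset that belongs to the majority class ("no rain")
share_no_rain = ratio_no_rain_to_rain / (ratio_no_rain_to_rain + 1)

# A model that always answers "no rain" is right on every "no rain" day...
baseline_accuracy = share_no_rain
# ...and never detects a rainy day
baseline_rain_recall = 0.0

print("Always-'no rain' accuracy: {:.3f}".format(baseline_accuracy))
print("Recall on rainy days: {:.1f}".format(baseline_rain_recall))
```

Such a baseline would score close to 78% accuracy without learning anything, which is why accuracy on an unbalanced dataset can be misleading.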

In [None]:
# We can balance the dataset by under-sampling the over-represented subset
under_sample = True

if under_sample:
  rain = df_keep[df_keep['RainTomorrow']==1] # select only the entries where rain was present
  no_rain = df_keep[df_keep['RainTomorrow']==0] # select only the entries where rain was not present
  no_rain = no_rain.sample(n=len(rain), random_state=RANDOM_SEED) # pick len(rain) samples randomly from no_rain (seeded for reproducibility)
  df_keep = pd.concat([rain,no_rain],axis=0) # concatenate the 2 subsets and obtain a balanced dataset

print("No. of days when it did not rain: ", df_keep['RainTomorrow'].value_counts()[0])
print("No. of days when it rained: ", df_keep['RainTomorrow'].value_counts()[1])

The newly formed dataset will now be split into inputs and outputs. The outputs are the labels associated with the input data. In this context, the label is the rain/no rain decision for the following day.

In [None]:
x = df_keep[keep[:-1]]
y = df_keep[keep[-1]]

We split the dataset according to a 70-15-15 train-val-test ratio. The function `train_test_split` splits the data into only two subsets. Therefore, we must call it twice:
1.   Divide the original dataset into `train_val` and `test`;
2.   Divide `train_val` into `train` and `val`.
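The second call needs a rescaled `test_size`, because it is applied to `train_val` (85% of the data) rather than to the full dataset. A minimal sketch of the arithmetic, assuming a hypothetical dataset of 1000 samples (a number chosen only for illustration):

```python
# Ratios of the 70-15-15 split
train_ratio, val_ratio, test_ratio = 0.70, 0.15, 0.15

n_total = 1000  # hypothetical dataset size, for illustration only

# First split: 'test' takes 15% of everything, 'train_val' keeps the rest
n_test = round(n_total * test_ratio)
n_train_val = n_total - n_test

# Second split: 'val' must be 15% of the ORIGINAL data,
# which is 0.15 / 0.85 of what is left in 'train_val'
val_share_of_train_val = val_ratio / (val_ratio + train_ratio)
n_val = round(n_train_val * val_share_of_train_val)
n_train = n_train_val - n_val

print("test_size for the second call: {:.4f}".format(val_share_of_train_val))
print("train/val/test sizes:", n_train, n_val, n_test)
```

Passing `val_ratio` directly to the second call would make `val` only 15% of 85%, i.e. about 12.75% of the original data.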



In [None]:
train_ratio = 0.7
val_ratio = 0.15
test_ratio = 0.15

# Divide the original dataset into 'train_val' and 'test' (random_state makes the split reproducible):
x_train_val, x_test, y_train_val, y_test = train_test_split(x, y, test_size=test_ratio, random_state=RANDOM_SEED)
print('train_val subset dimensions: {}\ttest subset dimensions: {}'.format(x_train_val.shape, x_test.shape))

# Divide 'train_val' into 'train' and 'val'
x_train, x_val, y_train, y_val = train_test_split(x_train_val, y_train_val, test_size=val_ratio/(val_ratio + train_ratio), random_state=RANDOM_SEED)
print('train subset dimensions: {}\tval subset dimensions: {}\ttest subset dimensions: {}'.format(x_train.shape, x_val.shape, x_test.shape))

We switch the data from `pandas.DataFrame` format to `Tensor` format in order for them to be processed by PyTorch.

In [None]:
x_train = torch.from_numpy(x_train.to_numpy()).float()
y_train = torch.squeeze(torch.from_numpy(y_train.to_numpy()).float())

x_val = torch.from_numpy(x_val.to_numpy()).float()
y_val = torch.squeeze(torch.from_numpy(y_val.to_numpy()).float())

x_test = torch.from_numpy(x_test.to_numpy()).float()
y_test = torch.squeeze(torch.from_numpy(y_test.to_numpy()).float())

print(x_train.shape, y_train.shape)
print(x_val.shape, y_val.shape)
print(x_test.shape, y_test.shape)

## 3. Training and validating the model

The first step is creating the neural network model. In PyTorch, a model is usually written as a class that inherits from `torch.nn.Module`. The class must define two methods:


1.   `__init__()` - declares the layers that make up the network;
2.   `forward()` - defines how an input is propagated through those layers (the forward pass).



In [None]:
class Net(torch.nn.Module):

  def __init__(self, n_features): # the n_features argument is used to establish the dimension of the input layer
    super(Net, self).__init__()
    self.fc1 = torch.nn.Linear(n_features, 5) # fully connected layer that connects the input to a 5-neuron layer
    self.fc2 = torch.nn.Linear(5, 3) # fully connected layer that connects the previous layer to a 3-neuron layer
    self.fc3 = torch.nn.Linear(3, 1) # fully connected layer that connects the previous layer to a 1-neuron layer - binary decision

  def forward(self, x): # applying the activation function after running each fully connected layer
    x = torch.nn.functional.relu(self.fc1(x))
    x = torch.nn.functional.relu(self.fc2(x))
    return torch.sigmoid(self.fc3(x)) # sigmoid for binary classification
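As a quick sanity check on this architecture, the number of trainable parameters can be counted by hand: a fully connected layer with `n_in` inputs and `n_out` outputs holds `n_in * n_out` weights plus `n_out` biases. The sketch below assumes 6 input features, matching the six input columns kept during pre-processing; the helper name `linear_params` is hypothetical.

```python
def linear_params(n_in, n_out):
    """Number of parameters in a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

n_features = 6  # assumption: the six input columns selected earlier
layer_sizes = [n_features, 5, 3, 1]  # input -> fc1 -> fc2 -> fc3

# Pair each layer size with the next one: (6, 5), (5, 3), (3, 1)
per_layer = [linear_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print("Parameters per layer:", per_layer)
print("Total trainable parameters:", total_params)
```

Under that assumption the three layers hold 35, 18 and 4 parameters respectively, which is a very small network.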

Initializing the network, the cost function and the optimizer

In [None]:
net = Net(x_train.shape[1]) # calls Net.__init__() to initialize the model; the input size is the number of features
criterion = torch.nn.BCELoss() # Binary Cross-Entropy Loss is the loss function
optimizer = torch.optim.Adam(net.parameters(), lr=0.01) # Adam optimizer
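For reference, binary cross-entropy averages $-[y \log p + (1 - y) \log(1 - p)]$ over the samples, where $p$ is the predicted probability and $y$ the 0/1 label. The plain-Python sketch below illustrates that formula on made-up predictions and labels; it is not a replacement for `torch.nn.BCELoss`, and the function name is illustrative.

```python
import math

def binary_cross_entropy(y_pred, y_true):
    """Mean binary cross-entropy for probabilities y_pred and 0/1 labels y_true."""
    losses = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
              for p, y in zip(y_pred, y_true)]
    return sum(losses) / len(losses)

y_true = [1, 0, 1]            # made-up labels
confident = [0.9, 0.1, 0.8]   # predictions close to the labels
unsure = [0.5, 0.5, 0.5]      # predictions that carry no information

loss_confident = binary_cross_entropy(confident, y_true)
loss_unsure = binary_cross_entropy(unsure, y_true)

print("confident predictions: {:.4f}".format(loss_confident))
print("uninformative predictions: {:.4f}".format(loss_unsure))
```

Predicting 0.5 for every sample gives a loss of $\ln 2 \approx 0.693$, so a training loss near that value suggests the network has not yet learned anything useful.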

Moving the model, the loss function and the data to the GPU (if one is available; otherwise everything stays on the CPU)

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

x_train = x_train.to(device)
y_train = y_train.to(device)

x_val = x_val.to(device)
y_val = y_val.to(device)

x_test = x_test.to(device)
y_test = y_test.to(device)

net = net.to(device)

criterion = criterion.to(device)

Auxiliary functions

In [None]:
# Compute accuracy
def calculate_accuracy(y_true, y_pred):
  predicted = y_pred.ge(.5).view(-1)
  return (y_true == predicted).sum().float() / len(y_true)

# Define a rounding function
def round_tensor(t, decimal_places=3):
  return round(t.item(), decimal_places)
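To make the thresholding in `calculate_accuracy` concrete, here is the same computation on plain Python lists. The values are made up for illustration; a prediction of exactly 0.5 counts as "rain", mirroring the greater-or-equal comparison used above.

```python
y_pred = [0.2, 0.7, 0.5, 0.4]  # made-up network outputs (probabilities of rain)
y_true = [0, 1, 1, 1]          # made-up ground-truth labels

# Threshold at 0.5: probabilities >= 0.5 become class 1, the rest class 0
predicted = [int(p >= 0.5) for p in y_pred]

# Accuracy = fraction of samples whose predicted class matches the label
correct = sum(int(p == t) for p, t in zip(predicted, y_true))
accuracy = correct / len(y_true)

print("predicted classes:", predicted)
print("accuracy:", accuracy)
```

Three of the four predictions match their labels here, so the accuracy is 0.75.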

Model training

In [None]:
for epoch in range(1000):

    y_pred = net(x_train) # forward propagation
    y_pred = torch.squeeze(y_pred)
    train_loss = criterion(y_pred, y_train) # compute cost function

    if epoch % 100 == 0: # perform validation once in a while
      train_acc = calculate_accuracy(y_train, y_pred) # compute train accuracy

      with torch.no_grad(): # gradients are not needed for validation
        y_val_pred = torch.squeeze(net(x_val)) # forward propagation on the validation subset
        val_loss = criterion(y_val_pred, y_val) # compute cost function

      val_acc = calculate_accuracy(y_val, y_val_pred) # compute validation accuracy
      print("epoch {}\nTrain set - loss: {}, accuracy: {}\nVal   set - loss: {}, accuracy: {}"
            .format(epoch,
                    round_tensor(train_loss), round_tensor(train_acc),
                    round_tensor(val_loss), round_tensor(val_acc)))

    optimizer.zero_grad() # erase existing gradients

    train_loss.backward() # compute gradients for current iteration

    optimizer.step() # perform weights update

After training and optimization on `train` and `val`, we perform another test on the `test` subset.

In [None]:
classes = ['No rain', 'Raining']

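with torch.no_grad(): # gradients are not needed for inference
  y_pred = net(x_test)

y_pred = y_pred.ge(.5).view(-1).int().cpu() # threshold the probabilities at 0.5 to obtain 0/1 labels
y_test = y_test.int().cpu()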

print(classification_report(y_test, y_pred, target_names=classes))

In [None]:
cm = confusion_matrix(y_test, y_pred)
df_cm = pd.DataFrame(cm, index=classes, columns=classes)

hmap = sns.heatmap(df_cm, annot=True, fmt="d")
hmap.yaxis.set_ticklabels(hmap.yaxis.get_ticklabels(), rotation=0, ha='right')
hmap.xaxis.set_ticklabels(hmap.xaxis.get_ticklabels(), rotation=30, ha='right')
plt.ylabel('True label')
plt.xlabel('Predicted label');