# Learning Objectives

Using the classic Iris dataset, we'll train a simple neural network in the style of the 1990s.

### Learning Objectives

- define a `Dataset` class to hold the training and testing data for your model
- define a model class (`Module`) for a simple NN with one hidden layer
- run the training process (epochs, batches of data)

### Requirements

To benefit from this content, it is preferable to know:
- how Neural Nets work (backprop)

In [1]:
import torch

You should already be familiar with the [Iris flower data set](https://en.wikipedia.org/wiki/Iris_flower_data_set). Here's a short description:
- 4 numerical attributes
- 1 multi-class target (the values `0`, `1`, `2` encode the flower species)
- fairly easily separable classes

![Iris dataset scatter plot](https://upload.wikimedia.org/wikipedia/commons/thumb/5/56/Iris_dataset_scatterplot.svg/1200px-Iris_dataset_scatterplot.svg.png)

## 1. Creating classes and instances

### 1.1. `Dataset` and `DataLoader`

We're getting the data from scikit-learn.

In [2]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()

X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.33)
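The split above changes on every run. If you want the same split each time, a minimal sketch could look like the following. Note that `random_state=0` and `stratify=data.target` are assumptions added here for illustration; they are not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import numpy as np

data = load_iris()

# random_state and stratify are illustrative assumptions:
# - random_state fixes the shuffle so the split is reproducible
# - stratify keeps the class proportions similar in both parts
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.33, random_state=0, stratify=data.target
)

print(X_train.shape, X_test.shape)                # feature matrices
print(np.bincount(y_train), np.bincount(y_test))  # samples per class
```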

We're packaging it in a `Dataset`.

In [3]:
from torch.utils.data.dataset import Dataset
import numpy as np

# this converts a column of integer class labels (1D) into one-hot vectors (2D tensor)
def one_hot(x, class_count):
    return torch.eye(class_count)[x,:]

# see examples at https://github.com/utkuozbulak/pytorch-custom-dataset-examples
class BasicLabelledDataset(Dataset):
    def __init__(self, inputs_array, targets_array):
        self.inputs_array = inputs_array # numpy
        self.inputs_tensor = torch.tensor(self.inputs_array).float()
        self.targets_array = targets_array # numpy
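        self.class_count = len(np.unique(targets_array))
        if self.class_count > 2:
            # multi-class: one-hot encode the integer labels
            self.targets_tensor = one_hot(self.targets_array, self.class_count)
        else:
            # binary: keep a single float column so __getitem__ always works
            self.targets_tensor = torch.tensor(self.targets_array).float().unsqueeze(1)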

    def __getitem__(self, index):
        return (self.inputs_tensor[index], self.targets_tensor[index])

    def __len__(self):
        return len(self.targets_array)

iris_training_dataset = BasicLabelledDataset(X_train, y_train)
iris_testing_dataset = BasicLabelledDataset(X_test, y_test)
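To see what the one-hot encoding does, here is a small standalone sketch of the identity-matrix indexing trick. The `labels` array is made up for illustration, and the explicit conversion to a `long` tensor is an assumption added to keep the indexing portable.

```python
import numpy as np
import torch

labels = np.array([0, 2, 1, 2])  # hypothetical class labels
index = torch.as_tensor(labels, dtype=torch.long)

# row i of the 3x3 identity matrix is the one-hot vector for class i
one_hot_rows = torch.eye(3)[index, :]
print(one_hot_rows)

# argmax along each row recovers the original label
recovered = torch.argmax(one_hot_rows, dim=1)
print(recovered)
```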

This custom class will be used by a `DataLoader` to create batches of data to feed into the NN.

In [4]:
# this will batch the data for you (given you have a Dataset for it)
iris_training_loader = torch.utils.data.DataLoader(
    dataset=iris_training_dataset,
    batch_size=10,
    shuffle=True
)
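To check what a loader yields, you can pull a single batch and look at its shapes. The sketch below uses random stand-in data wrapped in a `TensorDataset` rather than the Iris dataset, assuming 100 training samples with 4 attributes and 3 classes.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# stand-in data with Iris-like shapes (assumption: 100 samples, 4 attributes, 3 classes)
inputs = torch.rand(100, 4)
targets = torch.eye(3)[torch.randint(0, 3, (100,))]

loader = DataLoader(TensorDataset(inputs, targets), batch_size=10, shuffle=True)

# each iteration yields one (inputs, targets) pair for a whole batch
batch_inputs, batch_targets = next(iter(loader))
print(batch_inputs.shape, batch_targets.shape)
print(len(loader))  # number of batches per epoch
```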

### 1.2. NN model class as a `Module`

The model is defined as a `Module` subclass, which must implement the `__init__()` and `forward()` methods.

Note: the backward pass is derived by autograd from the operations used in `forward()`.
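As a rough illustration of that point, the sketch below writes only a forward computation (a sigmoid of a dot product) and lets autograd produce the gradient, then compares it with the gradient worked out by hand. The tensors `w` and `x` are made-up values.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)  # hypothetical weights
x = torch.tensor([3.0, 4.0])                       # hypothetical input

y = torch.sigmoid((w * x).sum())  # forward pass only
y.backward()                      # autograd derives the backward pass

# analytic gradient for comparison: d sigmoid(u) / dw = sigmoid(u) * (1 - sigmoid(u)) * x
manual = (y * (1 - y)).detach() * x
print(w.grad, manual)
```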

![simple neural network model for IRIS data](https://i1.wp.com/www.parallelr.com/wp-content/uploads/2016/02/iris_network.png?resize=456%2C277)

_Note: see the end of this notebook for a **simplification of this model** definition using `Sequential`._

In [5]:
class BasicNeuralNet(torch.nn.Module):
    def __init__(self, input_size, output_size, hidden_size):
        super(BasicNeuralNet, self).__init__()
        self.x_to_z = torch.nn.Linear(input_size, hidden_size, bias=True)
        self.z_to_h = torch.nn.Sigmoid()
        self.h_to_s = torch.nn.Linear(hidden_size, output_size, bias=True)
        self.s_to_y = torch.nn.Softmax(dim=1)
        
    def forward(self, x):
        z = self.x_to_z(x)  # Linear
        h = self.z_to_h(z) # Sigmoid
        s = self.h_to_s(h)  # Linear
        y = self.s_to_y(s)  # SoftMax
        return y

In [6]:
# this creates an instance with the right sizes
# but doesn't do anything else
model = BasicNeuralNet(
    4,  # input has size 4 (attributes)
    3,  # output has size 3 (one-hot, 3 classes)
    6   # hidden layer (param)
)
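Before training, it can help to run a quick sanity check on the output shape and the parameter count. The sketch below uses a `Sequential` stand-in with the same layers and sizes (4, 6, 3), so that it runs on its own; `batch` is random made-up input.

```python
import torch

# stand-in with the same layers and sizes as BasicNeuralNet(4, 3, 6)
net = torch.nn.Sequential(
    torch.nn.Linear(4, 6), torch.nn.Sigmoid(),
    torch.nn.Linear(6, 3), torch.nn.Softmax(dim=1),
)

batch = torch.rand(5, 4)  # 5 hypothetical flowers, 4 attributes each
with torch.no_grad():
    probs = net(batch)

print(probs.shape)       # one row of class probabilities per flower
print(probs.sum(dim=1))  # softmax rows sum to 1

# weights and biases: (4*6 + 6) + (6*3 + 3)
n_params = sum(p.numel() for p in net.parameters())
print(n_params)
```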

## 2. Training

### 2.1. Creating an optimizer

We'll apply plain SGD with a mean-squared-error criterion (`MSELoss`). SGD is initialized on the `parameters` of the `BasicNeuralNet` instance (`model`).

In [7]:
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = torch.nn.MSELoss()
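To make these two objects less of a black box, the sketch below runs a single update on a small stand-in layer and checks it against the formulas by hand: `MSELoss` as the mean of squared differences, and one plain SGD step as `weight - lr * gradient`. The layer, inputs and targets are made up for illustration.

```python
import torch

torch.manual_seed(0)
layer = torch.nn.Linear(4, 3)  # stand-in model, not the notebook's network
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
criterion = torch.nn.MSELoss()

inputs = torch.rand(10, 4)                          # hypothetical batch
targets = torch.eye(3)[torch.randint(0, 3, (10,))]  # hypothetical one-hot targets

optimizer.zero_grad()
outputs = torch.softmax(layer(inputs), dim=1)
loss = criterion(outputs, targets)
loss.backward()

# MSELoss (default reduction) averages the squared error over every element
manual_loss = ((outputs - targets) ** 2).mean()

# plain SGD (no momentum): new_weight = weight - lr * gradient
expected_weight = layer.weight.detach() - 0.1 * layer.weight.grad
optimizer.step()

print(torch.allclose(loss, manual_loss))
print(torch.allclose(layer.weight.detach(), expected_weight))
```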

### 2.2. Iterating on epochs and batches

In [8]:
epochs = 1000

for epoch in range(epochs):  # loop over the dataset multiple times
    running_loss = 0.0  # sum of the per-batch mean losses over this epoch
    
    for i, data in enumerate(iris_training_loader, 0):  # iterate on batches
        # note: data here is a whole batch, a tensor of the data of that batch

        # zero the parameter gradients
        optimizer.zero_grad()

        # get the inputs; data is a list of [inputs, labels]
        inputs, targets = data

        # forward pass: compute class probabilities from the attributes
        outputs = model(inputs)

        # compute the loss (NOTE: this is a tensor as well)
        loss = criterion(outputs, targets)

        # backward pass: compute gradients of the loss w.r.t. the parameters
        loss.backward()

        # update the parameters using those gradients
        optimizer.step()

        # accumulate the batch loss for reporting
        running_loss += loss.item()
    
    # print the loss only 20 times over the whole run
    if epoch % (epochs // 20) == 0:
        print('[epoch=%d]\t loss=%.3f' % (epoch, running_loss))

print('Finished Training')

[epoch=0]	 loss=2.343
[epoch=50]	 loss=1.575
[epoch=100]	 loss=1.120
[epoch=150]	 loss=0.887
[epoch=200]	 loss=0.632
[epoch=250]	 loss=0.435
[epoch=300]	 loss=0.318
[epoch=350]	 loss=0.263
[epoch=400]	 loss=0.208
[epoch=450]	 loss=0.175
[epoch=500]	 loss=0.159
[epoch=550]	 loss=0.138
[epoch=600]	 loss=0.124
[epoch=650]	 loss=0.113
[epoch=700]	 loss=0.111
[epoch=750]	 loss=0.099
[epoch=800]	 loss=0.097
[epoch=850]	 loss=0.090
[epoch=900]	 loss=0.081
[epoch=950]	 loss=0.084
Finished Training


## 3. Testing

We'll compute accuracy from scratch here.

In [9]:
# batch the testing data as well
iris_testing_loader = torch.utils.data.DataLoader(
    dataset=iris_testing_dataset,
    batch_size=10,
    shuffle=False  # sample order does not affect accuracy
)

correct = 0
total = 0

with torch.no_grad():  # deactivate autograd during testing
    for data in iris_testing_loader:  # iterate on batches
        # get testing data batch
        inputs, targets = data
        
        # apply the NN
        outputs = model(inputs)                 # compute output class tensor
        predicted = torch.argmax(outputs, dim=1)  # get argmax of P(y_hat|x)
        actual = torch.argmax(targets, dim=1)     # get y

        # compute score
        total += targets.size(0)
        correct += (predicted == actual).sum().item()

print("Accuracy: {:.2f}".format(100 * correct / total))

Accuracy: 94.00
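A single accuracy figure hides which classes get confused with each other. One possible extension is a confusion matrix built from the `predicted` and `actual` tensors. The sketch below uses made-up values for both, so its numbers say nothing about the model trained above.

```python
import torch

# hypothetical predictions and labels, for illustration only
predicted = torch.tensor([0, 1, 2, 2, 1, 0, 2, 1])
actual = torch.tensor([0, 1, 2, 1, 1, 0, 2, 2])

confusion = torch.zeros(3, 3, dtype=torch.long)
for a, p in zip(actual.tolist(), predicted.tolist()):
    confusion[a, p] += 1  # rows: actual class, columns: predicted class

# correct predictions sit on the diagonal
accuracy = confusion.diag().sum().item() / confusion.sum().item()
print(confusion)
print(accuracy)
```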


# Notes

The model definition consists only of a sequence of layers. PyTorch provides a class for exactly that, `Sequential`, so you don't need to define your own class. You can run section 2 again with this model:

In [10]:
input_size = 4  # input has size 4 (attributes)
output_size = 3  # output has size 3 (one-hot, 3 classes)
hidden_size = 6   # hidden layer (param)

model = torch.nn.Sequential(
    torch.nn.Linear(input_size, hidden_size, bias=True),
    torch.nn.Sigmoid(),
    torch.nn.Linear(hidden_size, output_size, bias=True),
    torch.nn.Softmax(dim=1)
)
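One thing to watch when swapping models: an optimizer holds references to the parameters of the model it was built from, so it has to be created again for the new `model`. A minimal sketch of that step, reusing the learning rate and criterion from section 2.1:

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4, 6, bias=True),
    torch.nn.Sigmoid(),
    torch.nn.Linear(6, 3, bias=True),
    torch.nn.Softmax(dim=1),
)

# rebuild the optimizer so that it updates the parameters of the new model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = torch.nn.MSELoss()

# check: the optimizer tracks exactly the new model's parameters
optimized = {id(p) for group in optimizer.param_groups for p in group["params"]}
print(optimized == {id(p) for p in model.parameters()})
```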