<a href="https://colab.research.google.com/github/Martinmbiro/Pytorch-classification-basics/blob/main/02%20Model%20building.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Data loading & model building**
> In this notebook, I'll wrap the code that loads the [Iris dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set) in a function, then build a neural network for the multi-class classification task

### Loading data
> Here, I wrap the code for loading the Iris dataset in a function. The function performs the following steps:
+ Load the Iris dataset
+ Preprocess the features (impute any missing values with the median, then scale each feature to the `0` to `1` range)
+ Return `X`, `y` and the target names

In [1]:
# import
import numpy as np
# for type hinting
from typing import Tuple

In [2]:
# define function to load dataset
def load_dataset() -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
  # make necessary imports
  from sklearn import datasets
  # for data preprocessing
  from sklearn.preprocessing import MinMaxScaler
  from sklearn.impute import SimpleImputer
  from sklearn.pipeline import make_pipeline

  # load the dataset
  iris = datasets.load_iris()

  # make pipeline for preprocessing the features
  preprocessor = make_pipeline(
      SimpleImputer(strategy='median'), # handle missing values (if any)
      MinMaxScaler() # scale data from 0 -> 1
  )

  # preprocess the features
  X = preprocessor.fit_transform(iris.data)
  # get labels
  y = iris.target

  # return X, y, target_names
  return X, y, iris.target_names

In [3]:
# test out the function
X, y, target_names = load_dataset()

In [4]:
# first 5 rows in X
X[:5]

array([[0.22222222, 0.625     , 0.06779661, 0.04166667],
       [0.16666667, 0.41666667, 0.06779661, 0.04166667],
       [0.11111111, 0.5       , 0.05084746, 0.04166667],
       [0.08333333, 0.45833333, 0.08474576, 0.04166667],
       [0.19444444, 0.66666667, 0.06779661, 0.04166667]])
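
> As a rough illustration of what the scaling step does, the sketch below applies the min-max formula `(x - min) / (max - min)` column by column with plain NumPy. The `toy` array is a made-up example, not the Iris data, and this is only meant to mirror the default behaviour of `MinMaxScaler`, not to replace it.

```python
import numpy as np

# hypothetical toy feature matrix (NOT the Iris data): 3 samples, 2 features
toy = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [4.0, 40.0]])

# min-max scaling per column: (x - min) / (max - min)
col_min = toy.min(axis=0)
col_max = toy.max(axis=0)
toy_scaled = (toy - col_min) / (col_max - col_min)

# every column now has a minimum of 0 and a maximum of 1
print(toy_scaled)
```

> The same check (`X.min(axis=0)` and `X.max(axis=0)`) can be run on the `X` returned by `load_dataset()` to confirm each feature spans `0` to `1`.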

In [5]:
# last 10 values in y
y[140:]

array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [6]:
# get labels at 5 random indices (randint's upper bound is exclusive)
ls = [y[i] for i in np.random.randint(0, len(y), size=(5,))]

# print target names of random labels in y
[target_names[x] for x in ls]

['versicolor', 'setosa', 'versicolor', 'setosa', 'versicolor']

In [40]:
# how many unique elements do we have as target?
np.unique(y).size

3

> 🎆 **Bingo!**

> The function to load our data seems to be working fine. Now let's build the model

### Building the model
> ✋ **Info**  

> For the neural network,
+ We'll stack [`nn.Linear`](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#linear) with [`nn.Sequential`](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential) such that we'll end up with two _hidden layers_ between the _input_ and _output layers._
+ The [`nn.ReLU`](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU) activation function will be used to introduce non-linearity
+ Since this is a multi-class classification task (we have `3` distinct classes), the output layer will have `3` neurons

In [8]:
# import torch
import torch, torch.nn as nn
torch.__version__

'2.5.1+cu121'

In [9]:
# stacking the layers
model = nn.Sequential(
    nn.Linear(in_features=X.shape[1], out_features=8), # input -> hidden layer 1
    nn.ReLU(),
    nn.Linear(8, 8), # hidden layer 1 -> hidden layer 2
    nn.ReLU(),
    nn.Linear(8, 3), # hidden layer 2 -> output layer
)
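
> To see what the `3` output neurons produce, here is a minimal sketch of a forward pass through a network with the same layer sizes. The names `sketch_model` and `batch` are illustrative, the input is random data standing in for the scaled features, and the model is untrained, so the predicted classes carry no meaning yet.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only so the sketch is repeatable

# same layer sizes as the model above; 4 input features assumed (Iris)
sketch_model = nn.Sequential(
    nn.Linear(4, 8), nn.ReLU(),
    nn.Linear(8, 8), nn.ReLU(),
    nn.Linear(8, 3),
)

# hypothetical batch of 5 samples in [0, 1), standing in for scaled X
batch = torch.rand(5, 4)

with torch.inference_mode():
    logits = sketch_model(batch)          # raw scores, shape (5, 3)
    probs = torch.softmax(logits, dim=1)  # each row sums to 1
    preds = probs.argmax(dim=1)           # class index per sample

print(logits.shape, probs.sum(dim=1), preds)
```

> To feed the real data instead, `X` would first need converting to a `float32` tensor, for example with `torch.from_numpy(X).float()`, since `nn.Linear` weights default to `float32` while the NumPy array is `float64`.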

#### Visualize the structure of the neural network
> To do this, we'll use [`torchinfo`](https://github.com/TylerYep/torchinfo)

In [10]:
# install the library
!pip install torchinfo
# import summary
from torchinfo import summary

Collecting torchinfo
  Downloading torchinfo-1.8.0-py3-none-any.whl.metadata (21 kB)
Downloading torchinfo-1.8.0-py3-none-any.whl (23 kB)
Installing collected packages: torchinfo
Successfully installed torchinfo-1.8.0


In [13]:
summary(model, input_size=(4,))

Layer (type:depth-idx)                   Output Shape              Param #
Sequential                               [3]                       --
├─Linear: 1-1                            [8]                       40
├─ReLU: 1-2                              [8]                       --
├─Linear: 1-3                            [8]                       72
├─ReLU: 1-4                              [8]                       --
├─Linear: 1-5                            [3]                       27
Total params: 139
Trainable params: 139
Non-trainable params: 0
Total mult-adds (M): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
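
> The `Param #` column can be reproduced by hand: each `nn.Linear` layer holds an `in_features * out_features` weight matrix plus one bias per output. The sketch below does that arithmetic in plain Python; `layer_shapes` is just an illustrative list of the sizes used in the model above.

```python
# (in_features, out_features) for each nn.Linear layer in the model above
layer_shapes = [(4, 8), (8, 8), (8, 3)]

# weights (in * out) plus one bias per output neuron
per_layer = [n_in * n_out + n_out for n_in, n_out in layer_shapes]
total = sum(per_layer)

print(per_layer, total)  # [40, 72, 27] 139
```

> With `model` in scope, `sum(p.numel() for p in model.parameters())` should give the same total.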

> ▶️ **Up Next**

> Model training and evaluation