<a href="https://colab.research.google.com/github/Apoak/Deep-Learning-Projects/blob/main/Basic_NN_in_Pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Lab 3.1: Basic Neural Network in PyTorch

Let's create a linear classifier once more, this time using PyTorch's automatic differentiation and built-in optimizers. You will then extend the linear classifier into a multi-layer perceptron (MLP).

In [None]:
import numpy as np
import torch

When creating a tensor, we need to tell PyTorch explicitly that we will later want its gradient, by passing `requires_grad=True`.

In [None]:
a = torch.tensor(5.,requires_grad=True)
a

In [None]:
b = torch.tensor(6.,requires_grad=True)
c = 2*a+3*b
c

To compute the gradients, we first need to call `backward()` on the output `c`.

In [None]:
c.backward()

Now, to get the gradient of `c` with respect to any input variable, we simply access the `grad` attribute of that variable.

In [None]:
a.grad

In [None]:
b.grad
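
Since `c = 2*a + 3*b`, the gradients should be 2 and 3 regardless of the values of `a` and `b`. The following minimal sketch repeats the computation and also illustrates that `grad` accumulates across calls to `backward()`, which is why the training loop later in this lab calls `opt.zero_grad()` on every iteration. The variable names `c2`, `grad_a_first`, and so on are introduced here only for illustration.

```python
import torch

# Same setup as above: c = 2*a + 3*b
a = torch.tensor(5., requires_grad=True)
b = torch.tensor(6., requires_grad=True)
c = 2 * a + 3 * b
c.backward()

# dc/da = 2 and dc/db = 3, independent of the values of a and b
grad_a_first = a.grad.item()
grad_b_first = b.grad.item()

# Gradients accumulate: a second backward pass adds to the stored .grad
c2 = 2 * a + 3 * b
c2.backward()
grad_a_accumulated = a.grad.item()

# Reset the stored gradients in place (optimizers do this via zero_grad())
a.grad.zero_()
b.grad.zero_()
grad_a_reset = a.grad.item()

print(grad_a_first, grad_b_first, grad_a_accumulated, grad_a_reset)
# prints: 2.0 3.0 4.0 0.0
```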

Let's load and format the Palmer penguins dataset for multi-class classification.

In [None]:
!pip install scikit-learn palmerpenguins mlxtend
from palmerpenguins import load_penguins
from matplotlib import pyplot as plt

In [None]:
df = load_penguins()

# drop rows with missing values
df.dropna(inplace=True)

# get two features
X = df[['flipper_length_mm','bill_length_mm']].values

# convert species labels to integers
y = df['species'].map({'Adelie':0,'Chinstrap':1,'Gentoo':2}).values

To make the learning algorithm work more smoothly, we will subtract the mean of each feature.

Here `np.mean` calculates a mean, and `axis=0` tells NumPy to calculate the mean over the rows (calculate the mean of each column).

In [None]:
X -= np.mean(X,axis=0)
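
To see what `axis=0` does, here is a small sketch on made-up data. The array `X_toy` is hypothetical and stands in for the penguin features: three samples (rows) and two features (columns).

```python
import numpy as np

# Hypothetical toy data: 3 samples (rows), 2 features (columns)
X_toy = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 60.0]])

col_means = np.mean(X_toy, axis=0)  # one mean per column, shape (2,)
X_centered = X_toy - col_means      # broadcasting subtracts each column's mean

print(col_means)                    # prints: [ 2. 30.]
print(np.mean(X_centered, axis=0))  # prints: [0. 0.]
```

Note that the in-place form `X -= ...` used above requires `X` to have a floating-point dtype; on an integer array NumPy would refuse to cast the float means back to integers.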

Now we will convert our `X` and `y` arrays to torch Tensors.

In [None]:
X = torch.tensor(X).float()
y = torch.tensor(y).long()
print(X.shape)
print(y)
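
The `.float()` and `.long()` calls matter because of dtypes. The sketch below, on hypothetical arrays `X_np` and `y_np`, shows the conversion: NumPy floats default to 64 bits, whereas `nn.Linear` parameters default to 32-bit floats, and `CrossEntropyLoss` expects class indices as 64-bit integers.

```python
import numpy as np
import torch

# Hypothetical stand-ins for the feature and label arrays
X_np = np.array([[0.5, -1.0], [1.5, 2.0]])  # NumPy defaults to float64
y_np = np.array([0, 2])

X_t = torch.tensor(X_np)          # keeps NumPy's float64
X_f = torch.tensor(X_np).float()  # float32, matching nn.Linear's default parameters
y_l = torch.tensor(y_np).long()   # int64 class indices for CrossEntropyLoss

print(X_t.dtype, X_f.dtype, y_l.dtype)
# prints: torch.float64 torch.float32 torch.int64
```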

In [None]:
from torch import nn

The `torch.nn.Sequential` class creates a feed-forward network from a list of `nn.Module` objects.  Here we provide a single `nn.Linear` module, which performs an affine transformation ($Wx+b$), so that we will have a linear classifier.

In [None]:
linear_model = torch.nn.Sequential(
    torch.nn.Linear(2,3),
    # two inputs, three outputs
)
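
As a quick check of what `nn.Linear` computes, the sketch below compares the layer's output with the affine transformation written out by hand. The names `layer`, `x`, and `z_manual` are illustrative only; the input is random rather than the penguin data.

```python
import torch

torch.manual_seed(0)
layer = torch.nn.Linear(2, 3)  # weight W has shape (3, 2), bias b has shape (3,)
x = torch.randn(5, 2)          # a batch of 5 samples with 2 features each

z_layer = layer(x)
# For a batch stored row-wise, Wx + b becomes x @ W.T + b
z_manual = x @ layer.weight.T + layer.bias

print(layer.weight.shape, layer.bias.shape, z_layer.shape)
print(torch.allclose(z_layer, z_manual, atol=1e-6))
```

Each row of the output holds three scores, one per penguin species.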

Now we create a cross-entropy loss function object and a stochastic gradient descent (SGD) optimizer.

In [None]:
loss_fn = torch.nn.CrossEntropyLoss()
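
`CrossEntropyLoss` takes the raw scores (logits) directly, which is why the model above has no softmax layer. As a sketch of what it computes with default settings, the example below reproduces the loss by hand as the mean negative log-softmax probability of the correct class. The tensors `z_toy` and `y_toy` are made-up values for illustration.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for 2 samples and 3 classes, with their true labels
z_toy = torch.tensor([[2.0, 0.5, -1.0],
                      [0.1, 0.2, 3.0]])
y_toy = torch.tensor([0, 2])

loss_builtin = torch.nn.CrossEntropyLoss()(z_toy, y_toy)

# By hand: log-softmax over classes, pick the true class, negate, average
log_probs = F.log_softmax(z_toy, dim=1)
loss_manual = -log_probs[torch.arange(len(y_toy)), y_toy].mean()

print(torch.allclose(loss_builtin, loss_manual))
```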

In [None]:
lr = 1e-2
opt = torch.optim.SGD(linear_model.parameters(), lr=lr)
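
With no momentum or weight decay (the defaults), one SGD step replaces each parameter $p$ with $p - \text{lr} \cdot \nabla p$. The sketch below checks this on a small throwaway model; `model`, `x`, `target`, and `expected` are hypothetical names and the data is random.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(2, 3)
lr = 1e-2
opt = torch.optim.SGD(model.parameters(), lr=lr)

# Random stand-in batch: 4 samples, 2 features, labels in {0, 1, 2}
x = torch.randn(4, 2)
target = torch.tensor([0, 1, 2, 1])

opt.zero_grad()
loss = torch.nn.CrossEntropyLoss()(model(x), target)
loss.backward()

# What plain SGD should produce: p - lr * grad for every parameter
expected = [p.detach() - lr * p.grad for p in model.parameters()]

opt.step()  # updates the parameters in place

matches = all(
    torch.allclose(p.detach(), e) for p, e in zip(model.parameters(), expected)
)
print(matches)
```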

Finally we can iteratively optimize the model.

In [None]:
epochs = 100
for epoch in range(epochs):
    opt.zero_grad() # zero out the gradients

    z = linear_model(X) # compute z values
    loss = loss_fn(z,y) # compute loss

    loss.backward() # compute gradients

    opt.step() # apply gradients

    print(f'epoch {epoch}: loss is {loss.item()}')

In [None]:
# calculating accuracy
with torch.no_grad():  # no gradients are needed for evaluation
    y_pred = linear_model(X)

# the predicted class is the index of the largest score in each row
max_idx = torch.argmax(y_pred, dim=1)

# count predictions that match the true labels and divide by the total
accuracy = (max_idx == y).sum().item() / len(y)
print(accuracy)

### Exercises

Extend the above code to implement an MLP with a single hidden layer of size 100.

Write code to compute the accuracy of each model.

Can you get the MLP to outperform the linear model?

In [None]:
multilayer_model = torch.nn.Sequential(
    torch.nn.Linear(2,100),
    # two inputs, 100 hidden units
    torch.nn.ReLU(),
    # non-linearity between the two linear layers
    torch.nn.Linear(100,3),
    # 100 hidden units, three outputs
)

loss_fn = torch.nn.CrossEntropyLoss()
opt = torch.optim.SGD(multilayer_model.parameters(), lr=lr)
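
To get a sense of how much larger the MLP is, the sketch below counts the trainable parameters of both architectures. It rebuilds the two models under the hypothetical names `linear_demo` and `mlp_demo` so that it runs on its own; `count_parameters` is a helper introduced here, not part of PyTorch.

```python
import torch

linear_demo = torch.nn.Sequential(torch.nn.Linear(2, 3))
mlp_demo = torch.nn.Sequential(
    torch.nn.Linear(2, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 3),
)

def count_parameters(model):
    """Hypothetical helper: total number of elements over all parameters."""
    return sum(p.numel() for p in model.parameters())

n_linear = count_parameters(linear_demo)  # 2*3 + 3 = 9
n_mlp = count_parameters(mlp_demo)        # (2*100 + 100) + (100*3 + 3) = 603

print(n_linear, n_mlp)
# prints: 9 603
```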

In [None]:
epochs = 100
for epoch in range(epochs):
    opt.zero_grad() # zero out the gradients

    z = multilayer_model(X) # compute z values
    loss = loss_fn(z,y) # compute loss

    loss.backward() # compute gradients

    opt.step() # apply gradients

    print(f'epoch {epoch}: loss is {loss.item()}')

In [None]:
# calculating accuracy
y_pred = multilayer_model(X)
num_pred = y_pred.size(dim=0)
max_val, max_idx = torch.max(y_pred, dim = 1)

total = 0
# iterate through the tensor, look to see if the y_pred matched the value of the actual y
#print(max_idx)
for i in range(num_pred):
  if max_idx[i] == y[i]:
    total += 1

accuracy = total/num_pred
print(accuracy)
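
Because the training and accuracy code is identical for both models, one possible refactoring is to wrap each in a function. The sketch below is one way to do this, not the only one. The helpers `train` and `accuracy` are hypothetical, and the data is a synthetic stand-in (three Gaussian clusters) so that the example runs without the penguins dataset; substitute the real `X` and `y` to compare the two models on the lab data.

```python
import torch
from torch import nn

def train(model, X, y, lr=1e-2, epochs=100):
    """Full-batch SGD with cross-entropy loss; returns the loss from the final epoch."""
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()              # zero out the gradients
        loss = loss_fn(model(X), y)  # forward pass and loss
        loss.backward()              # compute gradients
        opt.step()                   # apply gradients
    return loss.item()

def accuracy(model, X, y):
    """Fraction of samples whose highest-scoring class equals the label."""
    with torch.no_grad():
        return (model(X).argmax(dim=1) == y).sum().item() / len(y)

# Synthetic stand-in data (an assumption, not the penguins): 3 clusters of 30 points
torch.manual_seed(0)
centers = torch.tensor([[-3.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
y_demo = torch.arange(3).repeat_interleave(30)
X_demo = centers[y_demo] + 0.5 * torch.randn(90, 2)

linear_demo = nn.Sequential(nn.Linear(2, 3))
mlp_demo = nn.Sequential(nn.Linear(2, 100), nn.ReLU(), nn.Linear(100, 3))

linear_loss = train(linear_demo, X_demo, y_demo)
mlp_loss = train(mlp_demo, X_demo, y_demo)
linear_acc = accuracy(linear_demo, X_demo, y_demo)
mlp_acc = accuracy(mlp_demo, X_demo, y_demo)

print(f'linear: loss {linear_loss:.3f}, accuracy {linear_acc:.3f}')
print(f'MLP:    loss {mlp_loss:.3f}, accuracy {mlp_acc:.3f}')
```

Keeping `lr` and `epochs` as arguments makes it easy to experiment with them when trying to get the MLP to outperform the linear model.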


