# *Iris* flower classification with PyTorch
In this notebook, you will reimplement your 3-layer (4-16-16-3) fully connected
network from the first project. We will skip some of the steps for the sake of
brevity; the goal is just to get you familiar with the PyTorch environment.

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import torch

DEVICE = torch.device("cpu") # Put your device string here.
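
If you are unsure which device string to use, one common pattern is to pick the GPU only when PyTorch can see one. The following is a minimal sketch, assuming a CUDA GPU is the only accelerator of interest; the variable names `device` and `x` are illustrative and not part of the notebook. Note that tensors are created on the CPU by default, so inputs must be moved to the same device as the model before a forward pass.

```python
import torch

# Assumption: fall back to the CPU whenever no CUDA GPU is visible.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors live on the CPU by default and must be moved explicitly.
x = torch.ones(2, 3).to(device)
print(x.device.type)
```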

We will first set up our dataset as before:

In [2]:
iris = load_iris()
train_x, test_x, train_y, test_y = train_test_split(iris.data, iris.target, test_size=0.2)
train_x = torch.from_numpy(train_x).to(torch.float32)
train_y = torch.from_numpy(train_y).to(torch.long)
test_x = torch.from_numpy(test_x).to(torch.float32)
test_y = torch.from_numpy(test_y).to(torch.long)

Now we will define our new PyTorch model! In Torch, models are defined as
classes that extend `nn.Module`, similar to how we defined our MLP in micrograd.

In [3]:
import torch.nn as nn
import torch.nn.functional as F

class IrisNet(nn.Module):
  # First, we must define our constructor.
  def __init__(self):
    super().__init__()
    # Define our layers here (4, 16, 16, 3).
    self.layer1 = nn.Linear(4, 16)
    self.layer2 = nn.Linear(16, 16)
    self.layer3 = nn.Linear(16, 3)
  
  # Now we need to instruct how to compute the forward pass.
  def forward(self, x):
    x = F.relu(self.layer1(x))
    x = F.relu(self.layer2(x))
    x = self.layer3(x)
    return x
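
As a quick sanity check on the 4-16-16-3 architecture, the number of trainable parameters can be counted by hand and compared with what PyTorch reports. The sketch below rebuilds the three linear layers on their own (the list `layers` is illustrative, not part of the notebook) and relies on the fact that each `nn.Linear(in, out)` stores an `out x in` weight matrix plus a bias vector of length `out`.

```python
import torch.nn as nn

# Same layer sizes as the network above: 4 -> 16 -> 16 -> 3.
layers = [nn.Linear(4, 16), nn.Linear(16, 16), nn.Linear(16, 3)]

total = 0
for layer in layers:
  # Each nn.Linear holds a weight matrix (out x in) and a bias vector (out).
  count = sum(p.numel() for p in layer.parameters())
  total += count
  print(tuple(layer.weight.shape), tuple(layer.bias.shape), count)

print("Total parameters:", total)  # 80 + 272 + 51 = 403
```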

Now that our model is defined, we can instantiate it here:

In [4]:
model = IrisNet().to(DEVICE) # to() moves the model's parameters to DEVICE (e.g. a GPU).

Again, let's try evaluating our model on the first flower in the training set.
We don't need to implement our own softmax function this time, as it is built-in
to PyTorch!

In [5]:
F.softmax(model(train_x[0]), dim=-1) # dim=-1 normalizes over the class scores.


tensor([0.4786, 0.3113, 0.2101], grad_fn=<SoftmaxBackward0>)
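
To see what the built-in function computes, it can be compared against a hand-written softmax. This is a small sketch using made-up scores (the tensor `logits` is hypothetical and not produced by the model above): exponentiate each score, then divide by the sum so the results form a probability distribution.

```python
import torch
import torch.nn.functional as F

# Hypothetical raw scores (logits) for one flower; not taken from the model above.
logits = torch.tensor([2.0, 1.0, 0.1])

# Softmax by hand: exponentiate, then normalize so the entries sum to 1.
manual = torch.exp(logits) / torch.exp(logits).sum()

# Built-in version; dim=-1 normalizes over the last (class) dimension.
builtin = F.softmax(logits, dim=-1)

print(torch.allclose(manual, builtin))  # True
```

Passing `dim` explicitly matters once inputs are batched: for a tensor of shape `(N, 3)`, `dim=-1` (or `dim=1`) normalizes each row separately.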

Of course, we still need to train the model. PyTorch makes this easy with
built-in optimizers and loss functions.
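
For reference, `nn.CrossEntropyLoss` expects raw scores (logits) rather than softmax outputs, and it accepts integer class indices as targets. The sketch below uses made-up scores and labels (all values are hypothetical) to show that the built-in loss matches a hand computation: take the log-softmax, pick out the log-probability of the true class, and average the negatives. It also checks that one-hot float targets, which recent PyTorch versions accept as class probabilities, give the same value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical scores for two samples and three classes.
scores = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])  # integer class indices

loss_fn = nn.CrossEntropyLoss()
builtin = loss_fn(scores, labels)

# By hand: log-softmax, pick the log-probability of the true class, average.
log_probs = F.log_softmax(scores, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

# One-hot float targets are treated as class probabilities and agree here.
one_hot = loss_fn(scores, F.one_hot(labels, num_classes=3).float())

print(torch.allclose(builtin, manual), torch.allclose(builtin, one_hot))
```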

In [6]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # In practice, we can use other optimizers, such as Adam.

for epoch in range(1000):
  # forward
  scores = model(train_x) # Note we can just pass the whole dataset at once!
  loss = loss_fn(scores, train_y) # CrossEntropyLoss takes class indices directly.

  # backward
  optimizer.zero_grad() # Zero out the gradients.
  loss.backward() # Compute the gradients.
  optimizer.step() # Update the parameters automatically!

  if epoch % 100 == 0:
    print(f"Epoch {epoch} | Accuracy: {torch.sum(torch.argmax(scores, dim=1) == train_y).item() / len(train_y)}")

Epoch 0 | Accuracy: 0.36666666666666664
Epoch 100 | Accuracy: 0.675
Epoch 200 | Accuracy: 0.8583333333333333
Epoch 300 | Accuracy: 0.9333333333333333
Epoch 400 | Accuracy: 0.9666666666666667
Epoch 500 | Accuracy: 0.975
Epoch 600 | Accuracy: 0.9916666666666667
Epoch 700 | Accuracy: 0.9916666666666667
Epoch 800 | Accuracy: 0.9916666666666667
Epoch 900 | Accuracy: 0.9916666666666667
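
To make the `zero_grad()` / `backward()` / `step()` sequence less of a black box, here is a minimal sketch on a toy problem with a single parameter. The parameter `w`, the loss `(w - 3)^2`, and the learning rate are all illustrative choices, not part of the notebook. For plain SGD without momentum or weight decay, `step()` applies `w_new = w - lr * grad`.

```python
import torch

# A single hypothetical parameter and a toy loss, loss = (w - 3)^2.
w = torch.tensor([1.0], requires_grad=True)
lr = 0.1
optimizer = torch.optim.SGD([w], lr=lr)

loss = ((w - 3.0) ** 2).sum()
optimizer.zero_grad()  # Clear any gradient left over from a previous step.
loss.backward()        # d(loss)/dw = 2 * (w - 3) = -4

# What plain SGD will do: w_new = w - lr * grad = 1 - 0.1 * (-4) = 1.4.
expected = (w - lr * w.grad).detach().clone()

optimizer.step()
print(torch.allclose(w.detach(), expected))  # True
```

Other optimizers such as Adam follow the same three-call pattern but use a different update rule inside `step()`.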


In [7]:
with torch.no_grad(): # Gradients are not needed for evaluation.
  print("Final accuracy:", torch.sum(torch.argmax(model(test_x), dim=1) == test_y).item() / len(test_y))

Final accuracy: 0.9666666666666667
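
The accuracy expression appears in both the training loop and the final evaluation, so it could be factored into a small helper. The function below is a hypothetical convenience (the name `accuracy` and the sample tensors are assumptions, not part of the notebook); it compares the highest-scoring class in each row with the true label and returns the fraction of matches.

```python
import torch

def accuracy(scores, labels):
  """Fraction of rows whose highest-scoring class matches the label."""
  return (scores.argmax(dim=1) == labels).float().mean().item()

# Hypothetical scores for three samples; the first two are classified correctly.
scores = torch.tensor([[2.0, 0.1, 0.3], [0.2, 0.1, 1.5], [0.9, 0.8, 0.1]])
labels = torch.tensor([0, 2, 1])
print(accuracy(scores, labels))
```

With such a helper, the final check would read `accuracy(model(test_x), test_y)`.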
