# MNIST Model From "Scratch" -- Attempt 5

Goal: reproduce the third part of Chapter 4, where we move from the "from scratch" approach to fastai Learners and PyTorch nets.

### Copied Piece 
Mostly written my own way, checking shapes against the reference along the way.

In [1]:
from fastai.data.all import untar_data, URLs
from pathlib import Path
from PIL import Image
import torch
from numpy import array  # only `array` is used; avoids shadowing builtins with a star import

path = untar_data(URLs.MNIST_SAMPLE)
Path.BASE_PATH = path
threes = (path/'train'/'3').ls().sorted()
sevens = (path/'train'/'7').ls().sorted()
seven_tensors = [torch.as_tensor(array(Image.open(o))) for o in sevens]
three_tensors = [torch.as_tensor(array(Image.open(o))) for o in threes]

In [2]:
stacked_sevens = torch.stack(seven_tensors).float() / 255
stacked_threes = torch.stack(three_tensors).float() / 255
stacked_sevens.shape, stacked_threes.shape

(torch.Size([6265, 28, 28]), torch.Size([6131, 28, 28]))

In [3]:
train_x = torch.cat([stacked_threes, stacked_sevens]).view(-1, 28*28)
train_x.shape

torch.Size([12396, 784])

In [4]:
# This might be where I messed up last time...

train_y = torch.as_tensor(
    array([1]*len(threes) + [0]*len(sevens))
).unsqueeze(1)
train_x.shape,train_y.shape

(torch.Size([12396, 784]), torch.Size([12396, 1]))
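The `unsqueeze(1)` matters because the model's predictions come out with shape `[N, 1]`. Here is a minimal sketch, using made-up toy tensors rather than the MNIST data, of what happens when the targets are left as shape `[N]`: the comparison doesn't raise an error, it silently broadcasts to `[N, N]`.

```python
import torch

# Toy stand-ins (hypothetical values, not the MNIST tensors)
preds = torch.tensor([[0.9], [0.2], [0.7], [0.4]])  # shape [4, 1], like the model output
flat_tars = torch.tensor([1, 0, 1, 0])              # shape [4]
col_tars = flat_tars.unsqueeze(1)                   # shape [4, 1]

# Shapes [4, 1] vs [4] broadcast to [4, 4]: every pred is compared to every target
bad = (preds > 0.5) == flat_tars
# Shapes [4, 1] vs [4, 1] compare element by element
good = (preds > 0.5) == col_tars

print(bad.shape)
print(good.shape)
```

Any mean taken over the `[N, N]` result would still be a number, just not the one intended, which makes this kind of bug easy to miss.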

In [5]:
dset = list(zip(train_x,train_y))
x,y = dset[0]
x.shape,y

(torch.Size([784]), tensor([1]))

In [6]:
valid_3_tens = torch.stack([torch.as_tensor(array(Image.open(o))) 
                            for o in (path/'valid'/'3').ls()])
valid_3_tens = valid_3_tens.float()/255
valid_7_tens = torch.stack([torch.as_tensor(array(Image.open(o))) 
                            for o in (path/'valid'/'7').ls()])
valid_7_tens = valid_7_tens.float()/255
valid_3_tens.shape,valid_7_tens.shape

(torch.Size([1010, 28, 28]), torch.Size([1028, 28, 28]))

In [7]:
valid_x = torch.cat([valid_3_tens, valid_7_tens]).view(-1, 28*28)
valid_y = torch.as_tensor(array([1]*len(valid_3_tens) + [0]*len(valid_7_tens))).unsqueeze(1)
valid_dset = list(zip(valid_x,valid_y))

In [8]:
def param_init(shape): return torch.randn(shape).requires_grad_()

In [9]:
weights = param_init((28*28, 1))
bias = param_init(1)
weights.shape, bias.shape

(torch.Size([784, 1]), torch.Size([1]))

In [10]:
weights[0:5], bias

(tensor([[ 0.8479],
         [-2.1539],
         [-0.1274],
         [ 0.0936],
         [-1.0767]], grad_fn=<SliceBackward0>),
 tensor([-0.0205], requires_grad=True))

In [11]:
def mnist_loss(preds, tars): return torch.where(tars == 1, 1-preds, preds).mean()

In [12]:
def accuracy(preds, tars): return ((preds > 0.5) == tars).float().mean()

In [13]:
test_preds = torch.as_tensor([0.4, 0.7, 0.1])
test_tars = torch.as_tensor([0, 0, 1])
test_loss = mnist_loss(test_preds, test_tars)
test_acc = accuracy(test_preds, test_tars)
test_loss.item(), test_acc.item()

(0.6666666865348816, 0.3333333432674408)
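To check those two numbers by hand, here is the same arithmetic in plain Python (no tensors), using the three test predictions and targets from the cell above.

```python
preds = [0.4, 0.7, 0.1]
tars = [0, 0, 1]

# Loss per item: 1 - p when the target is 1, otherwise p itself
per_item_loss = [1 - p if t == 1 else p for p, t in zip(preds, tars)]
loss = sum(per_item_loss) / len(per_item_loss)

# An item is correct when (p > 0.5) agrees with (target == 1)
correct = [(p > 0.5) == (t == 1) for p, t in zip(preds, tars)]
acc = sum(correct) / len(correct)

print(per_item_loss, loss)
print(correct, acc)
```

Only the first prediction lands on the right side of 0.5, so the accuracy is 1/3, and the per-item losses 0.4, 0.7 and 0.9 average to 2/3.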

In [14]:
def step(lr=1):
    for p in (weights, bias):
        p.data -= p.grad * lr
        p.grad = None
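
To see what `step` does in isolation, here is a sketch on a one-parameter toy problem. The parameter `w`, the loss `w ** 2` and the learning rate of 0.1 are made up for illustration; only the update rule is taken from the function above.

```python
import torch

w = torch.tensor(3.0, requires_grad=True)  # hypothetical single parameter
lr = 0.1

loss = w ** 2            # d(loss)/dw = 2 * w, which is 6 at w = 3
loss.backward()
grad_before = w.grad.item()

w.data -= w.grad * lr    # same update rule as step(): 3 - 6 * 0.1
w.grad = None            # clear the gradient so the next backward() starts fresh

print(grad_before, w.item())
```

Updating through `.data` keeps the update itself out of the autograd graph, and resetting `.grad` matters because `backward()` otherwise adds new gradients to the old ones.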

In [15]:
def linNet(xb): return xb@weights + bias

In [16]:
from fastai.data.load import DataLoader

dset = DataLoader(dset, bs=256)
dset.one_batch()[0].shape, dset.one_batch()[1].shape

(torch.Size([256, 784]), torch.Size([256, 1]))

In [17]:
def one_epoch():
    for xb, yb in dset:
        preds = linNet(xb).sigmoid_()
        loss = mnist_loss(preds, yb)
        # print(f"preds: {preds[10:12]}, yb: {yb[10:12]}, loss: {loss}")
        loss.backward()
        step()

In [18]:
def get_accuracy():
    with torch.no_grad():
        acc = torch.as_tensor([accuracy(linNet(xb).sigmoid_(), yb) for xb, yb in valid_dset]).mean()
    return acc

In [19]:
get_accuracy()

tensor(0.3273)

In [20]:
one_epoch()
get_accuracy()

tensor(0.6546)

In [21]:
def run_n_epochs(n):
    for i in range(n):
        one_epoch()
        print(f"acc: {round(get_accuracy().item(), 4)}, loss: {round(calc_loss().item(), 4)}")

In [22]:
def calc_loss():
    with torch.no_grad():
        return torch.as_tensor([mnist_loss(linNet(xb).sigmoid_(), yb) for xb, yb in dset]).mean()

In [23]:
run_n_epochs(10)

acc: 0.6658, loss: 0.3273
acc: 0.842, loss: 0.1613
acc: 0.9097, loss: 0.1022
acc: 0.9303, loss: 0.0783
acc: 0.9421, loss: 0.0647
acc: 0.9504, loss: 0.0561
acc: 0.9534, loss: 0.0502
acc: 0.9578, loss: 0.0457
acc: 0.9622, loss: 0.0424
acc: 0.9637, loss: 0.0399


### Starting Without Reference Here

Okay, so let's try to remember what we need to do next...

My guess is that we're going to need the following:
- a Learner object
- PyTorch nets

I think we can keep the loss and accuracy functions we wrote... maybe accuracy has a built-in equivalent in one of the APIs?

In [46]:
from fastai.basics import Learner
# from fastai.data.core import Dataloaders
from fastai.data.core import DataLoaders 

dls = DataLoaders(dset)
print(dls)

learner = Learner(dls, linNet, loss_func=mnist_loss)

<fastai.data.core.DataLoaders object at 0x7f12c20cef20>


In [48]:
learner.fit(5)

AttributeError: 'function' object has no attribute 'parameters'

So at this point, the model we wrote is just a plain function: it has no `parameters()` method for the Learner to pass to an optimizer. So we'll move to using the PyTorch models.
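
For comparison, here is a sketch of how the hand-written model could be wrapped so that it does expose `parameters()`. The class name `LinNet` and its constructor arguments are hypothetical; this is roughly what `nn.Linear` provides out of the box, not something from the chapter.

```python
import torch
from torch import nn

class LinNet(nn.Module):
    """Hypothetical nn.Module version of the hand-written linNet function."""
    def __init__(self, n_in=28*28, n_out=1):
        super().__init__()
        # nn.Parameter registers the tensors so parameters() can find them
        self.weights = nn.Parameter(torch.randn(n_in, n_out))
        self.bias = nn.Parameter(torch.randn(n_out))

    def forward(self, xb):
        return xb @ self.weights + self.bias

model = LinNet()
param_shapes = [tuple(p.shape) for p in model.parameters()]
out = model(torch.zeros(4, 28*28))

print(param_shapes)
print(out.shape)
```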

In [53]:
from torch import nn

linear2 = nn.Linear(28*28, 1)

In [54]:
learner = Learner(dls, linear2, loss_func=mnist_loss)

In [55]:
learner.fit(5)

IndexError: list index out of range

So we didn't add the validation set to the `DataLoaders` object: it only holds the training loader, and `fit` also expects a validation loader.

In [58]:
valid_dset_dl = DataLoader(valid_dset, bs=256)

dls = DataLoaders(dset, valid_dset_dl)

In [59]:
learner = Learner(dls, linear2, loss_func=mnist_loss)

In [61]:
learner.fit(5)

[0, 0.6831236481666565, -0.8868663311004639, '00:00']
[1, -0.05327514931559563, -1.8254307508468628, '00:00']
[2, -0.8207753300666809, -2.7827961444854736, '00:00']
[3, -1.6448577642440796, -3.747657060623169, '00:00']
[4, -2.5066113471984863, -4.716002464294434, '00:00']


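Going by fastai's logging order, these rows should be epoch, training loss, validation loss, and time. The losses go negative because I skipped the sigmoid bit... again! Without it the predictions aren't limited to the 0-1 range, so `mnist_loss` has no lower bound.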
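
A quick sketch of how this loss behaves on raw, un-squashed outputs. The two raw values below are made up; `mnist_loss` is the same function as in the notebook.

```python
import torch

def mnist_loss(preds, tars):
    return torch.where(tars == 1, 1 - preds, preds).mean()

raw = torch.tensor([[5.0], [-3.0]])  # made-up raw linear outputs
tars = torch.tensor([[1], [0]])

# Raw outputs: (1 - 5) and (-3) average to a negative "loss"
raw_loss = mnist_loss(raw, tars)
# After a sigmoid, every prediction is in (0, 1), so the loss is too
squashed_loss = mnist_loss(raw.sigmoid(), tars)

print(raw_loss.item(), squashed_loss.item())
```

With raw outputs the optimizer can keep lowering the loss just by pushing the outputs further out, which would explain a loss that keeps falling below zero from epoch to epoch.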

In [69]:
linear3 = nn.Sequential(
    nn.Linear(28*28, 1),
    nn.Sigmoid()
)

learner = Learner(dls, linear3, loss_func=mnist_loss)
learner.fit(5)

[0, 0.6135900020599365, 0.4180690050125122, '00:00']
[1, 0.4393807053565979, 0.23326237499713898, '00:00']
[2, 0.2810157835483551, 0.14868852496147156, '00:00']
[3, 0.19064800441265106, 0.11205869913101196, '00:00']
[4, 0.1383933573961258, 0.09233704209327698, '00:00']


Better? I'm not sure. This was pretty good for a first take at this material, so I'm going to look now and see what I got wrong.

Things I didn't do:
- create an optimizer
- use SGD in the learner
- import whatever is needed to make the table output look better (something in fastbook?)

In [70]:
from fastai.basics import SGD

# from chapter (more or less)
learner2 = Learner(dls, nn.Linear(28*28, 1), opt_func=SGD, loss_func=mnist_loss, metrics=accuracy)
learner2.fit(5)

[0, 0.2988013029098511, 0.13028033077716827, 0.8498528003692627, '00:00']
[1, 0.07242824137210846, -0.20603643357753754, 0.9430814385414124, '00:00']
[2, -0.18895958364009857, -0.5423532724380493, 0.9582924246788025, '00:00']
[3, -0.4732130169868469, -0.8786702156066895, 0.9587831497192383, '00:00']
[4, -0.770643413066864, -1.21498703956604, 0.9612364768981934, '00:00']


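Solid run on accuracy! The losses go negative again, though, since this model has no sigmoid and my `mnist_loss` doesn't apply one.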
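
A sketch of one way to keep a bare `nn.Linear` while bounding the loss: apply the sigmoid inside the loss and the metric instead of inside the model. As far as I recall this is closer to how the chapter defines them. The names `mnist_loss_sig` and `batch_accuracy` here, and the random toy batch, are my own stand-ins.

```python
import torch
from torch import nn

def mnist_loss_sig(preds, tars):
    # Squash raw outputs into (0, 1) before measuring distance from the target
    preds = preds.sigmoid()
    return torch.where(tars == 1, 1 - preds, preds).mean()

def batch_accuracy(preds, tars):
    # sigmoid(x) > 0.5 is the same test as x > 0 on the raw output
    return ((preds.sigmoid() > 0.5) == tars).float().mean()

torch.manual_seed(0)
model = nn.Linear(28*28, 1)        # no Sigmoid layer in the model
xb = torch.rand(16, 28*28)         # random toy batch, not MNIST
yb = torch.randint(0, 2, (16, 1))

loss = mnist_loss_sig(model(xb), yb)
acc = batch_accuracy(model(xb), yb)
print(loss.item(), acc.item())
```

These should be usable the same way as above, via `loss_func=mnist_loss_sig` and `metrics=batch_accuracy` on the `Learner`, though I haven't rerun the training with them here.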