# Chapter 4, Exercise 1: Implement your own Learner

> Create your own implementation of Learner from scratch, based on the training loop shown in this chapter.

As a reminder, the loop is:

- Init
- Predict
- Loss 
- Gradient
- Step
- Stop
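
To make those steps concrete, here is a minimal sketch of the loop on a toy straight-line fit. The data, the learning rate, and the epoch count are assumptions picked for illustration; none of them come from the book.

```python
import torch

torch.manual_seed(0)

# Toy data (an assumption for illustration): y = 2x + 1 with no noise
x = torch.linspace(-1, 1, 20).unsqueeze(1)
y = 2 * x + 1

# Init: parameters start at zero and track gradients
w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1

losses = []
for epoch in range(200):               # Stop: after a fixed number of epochs
    pred = x * w + b                   # Predict
    loss = ((pred - y) ** 2).mean()    # Loss
    loss.backward()                    # Gradient
    with torch.no_grad():              # Step
        w -= lr * w.grad
        b -= lr * b.grad
        w.grad.zero_()
        b.grad.zero_()
    losses.append(loss.item())
```

After the loop, `w` and `b` should sit close to 2 and 1, and `losses` should shrink from start to finish.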

Let's start with the boilerplate:

In [1]:
%matplotlib inline
from matplotlib import pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.set_printoptions(edgeitems=2)
torch.manual_seed(42) # Life, the Universe, and Everything

<torch._C.Generator at 0x7f328d652c90>

I'll use the signature from the book; however, for now I'm going to leave out metrics.  I may come back to this later.

Let's create our model. [Weights are initialized for us](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear).

In [2]:
my_model = nn.Sequential(
    nn.Linear(in_features=28*28, out_features=30),
    nn.ReLU(),
    nn.Linear(30, 1)
)
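
As a quick sanity check, the sketch below rebuilds the same architecture under a different name (`sketch_model`, a name made up here so the snippet runs on its own) and lists the tensors that `nn.Linear` created for us.

```python
import torch.nn as nn

# Same architecture as above, rebuilt so the snippet runs on its own
sketch_model = nn.Sequential(
    nn.Linear(in_features=28*28, out_features=30),
    nn.ReLU(),
    nn.Linear(30, 1),
)

# named_parameters() yields (name, tensor) pairs for every learnable tensor
param_shapes = {name: tuple(p.shape) for name, p in sketch_model.named_parameters()}
n_params = sum(p.numel() for p in sketch_model.parameters())

print(param_shapes)
print(n_params)  # 784*30 + 30 + 30*1 + 1 = 23581
```

The `ReLU` layer has no parameters, which is why only indices `0` and `2` show up in the names.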

Next, the data loader...which I guess means we'll need some data.  We'll use the fastai `MNIST_SAMPLE` set, which holds only 3s and 7s.

In [3]:
from fastai.vision.all import *
path = untar_data(URLs.MNIST_SAMPLE)
path.ls()

(#3) [Path('/home/aardvark/.fastai/data/mnist_sample/labels.csv'),Path('/home/aardvark/.fastai/data/mnist_sample/valid'),Path('/home/aardvark/.fastai/data/mnist_sample/train')]

Let's load those into tensors:

In [4]:
training_3 = torch.stack([tensor(Image.open(o)) for o in (path /'train/3').ls().sorted()]).float() / 255.0
training_7 = torch.stack([tensor(Image.open(o)) for o in (path /'train/7').ls().sorted()]).float() / 255.0
len(training_3), len(training_7)

(6131, 6265)

In [5]:
all_training_data = torch.cat([training_3, training_7])
all_training_data.shape

torch.Size([12396, 28, 28])

Let's take a moment to remember how to change the shape to match our model input:

In [6]:
training_3[0].view(-1).shape, all_training_data.view(-1, 28*28).shape

(torch.Size([784]), torch.Size([12396, 784]))

Time for some labels.

In [7]:
labels_3 = torch.ones([len(training_3)])
labels_7 = torch.zeros([len(training_7)])
all_training_labels = torch.cat([labels_3, labels_7])
all_training_labels.shape

torch.Size([12396])
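
One thing to keep in mind: `torch.cat` puts every 3 before every 7, so anything that walks these tensors in order sees one class and then the other. Below is a minimal sketch of shuffling images and labels together with a shared permutation. The small stand-in tensors are assumptions, not the real data.

```python
import torch

torch.manual_seed(0)

# Stand-in data: 4 "threes" (label 1) followed by 6 "sevens" (label 0).
# Row i is filled with the value i so we can track where each row ends up.
demo_data = torch.arange(10).float().unsqueeze(1).repeat(1, 784)
demo_labels = torch.cat([torch.ones(4), torch.zeros(6)])

# One permutation applied to both tensors keeps each image with its label
perm = torch.randperm(len(demo_data))
shuffled_data = demo_data[perm]
shuffled_labels = demo_labels[perm]
```

Indexing both tensors with the same `perm` is what keeps the pairs aligned; shuffling them separately would scramble the labels.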

Now time for the loader:

In [8]:
class NoMoreData(Exception):
    '''Nothing left, yo
    '''

class MyLoader():
    def __init__(self, training_data, labels, batch_size=1):
        assert len(training_data) == len(labels)
        self.length = len(training_data)
        self.training_data = training_data
        self.labels = labels
        self.batch_size = batch_size
        self.counter = 0
        
    def next(self):
        '''Yield next batch of data.
        '''
        if self.counter == -1:
            # Using this as a signal we're at the end of our rope
            raise NoMoreData
        if (self.length - self.counter > self.batch_size):
            start = self.counter
            end = self.counter + self.batch_size
            self.counter += self.batch_size
            training_data_to_return =  self.training_data[start:end]
            training_labels_to_return = self.labels[start:end]
        elif (self.length - self.counter <= self.batch_size):
            start = self.counter
            self.counter = -1
            training_data_to_return = self.training_data[start:]
            training_labels_to_return = self.labels[start:]
        return (training_data_to_return, training_labels_to_return)
    
    def reset(self):
        '''Reset counter so we can get more data
        '''
        self.counter = 0

It would be interesting to try to make this more like C library calls (not sure what the usual practice is there, but I'll bet you I'm not following it 🤣).  It would also be interesting to make this a Python generator.  But for now, I'll stick with this.
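
For the record, a generator version could look something like the sketch below. `batch_generator` is a hypothetical name, and this is only one way to do it; it is not what the rest of this notebook uses.

```python
import torch

def batch_generator(data, labels, batch_size=1):
    """Yield (data, labels) batches in order; the last batch may be smaller."""
    assert len(data) == len(labels)
    for start in range(0, len(data), batch_size):
        end = start + batch_size
        yield data[start:end], labels[start:end]

# Stand-in tensors (assumptions) just to show the batch sizes
demo_data = torch.zeros(10, 784)
demo_labels = torch.zeros(10)
batch_sizes = [len(xb) for xb, yb in batch_generator(demo_data, demo_labels, batch_size=4)]
print(batch_sizes)  # [4, 4, 2]
```

Calling the function again returns a fresh generator, so there is no counter to reset and no exception to catch at the end of an epoch.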

In [9]:
my_training_loader = MyLoader(training_data=all_training_data.view(-1, 28*28), labels=all_training_labels)

Let's make sure this works as expected:

In [10]:
a, b = my_training_loader.next()
a.shape, b.shape, my_training_loader.counter

(torch.Size([1, 784]), torch.Size([1]), 1)

We'll reset the counter and check again:

In [11]:
my_training_loader.reset()
my_training_loader.counter

0

Next up is the optimizer.  I'm going to use the PyTorch SGD optimizer here:

In [12]:
my_optimizer = optim.SGD(my_model.parameters(), lr=0.01, momentum=0.9)
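
To see what `momentum=0.9` does, the sketch below applies the update by hand and compares it with `optim.SGD` on a toy parameter. The toy loss and starting values are assumptions, and the hand-written rule only covers the default settings (no dampening, no Nesterov, no weight decay).

```python
import torch
import torch.optim as optim

# Two copies of the same parameter: one updated by hand, one by optim.SGD
p_manual = torch.tensor([1.0, -2.0])
p_torch = torch.tensor([1.0, -2.0], requires_grad=True)
opt = optim.SGD([p_torch], lr=0.01, momentum=0.9)

lr, momentum = 0.01, 0.9
velocity = torch.zeros_like(p_manual)

for _ in range(3):
    # Toy loss (an assumption): sum of squares, whose gradient is 2 * p
    loss = (p_torch ** 2).sum()
    loss.backward()
    opt.step()
    opt.zero_grad()

    grad = 2 * p_manual
    velocity = momentum * velocity + grad   # decayed running sum of gradients
    p_manual = p_manual - lr * velocity     # step along the velocity
```

Because the velocity keeps accumulating past gradients, the effective step can be several times larger than `lr` alone suggests.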

As for the loss function, let's keep it simple and use mean squared error (MSE).  We'll save something fancier for when we get into MNIST.

In [13]:
def my_loss(predicted, actual):
    # Mean of the squared errors: square first, then average
    return torch.mean((predicted - actual)**2)
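
One detail worth checking in any hand-written MSE is the order of the operations: squaring has to happen before averaging, otherwise positive and negative errors cancel. The sketch below uses a made-up two-element example and a hypothetical `mse_sketch` helper, and compares the result against `F.mse_loss`.

```python
import torch
import torch.nn.functional as F

def mse_sketch(predicted, actual):
    # Square each error first, then average
    return torch.mean((predicted - actual) ** 2)

predicted = torch.tensor([2.0, 0.0])
actual = torch.tensor([1.0, 1.0])

mean_of_squares = mse_sketch(predicted, actual)         # errors +1 and -1 -> 1.0
square_of_mean = torch.mean(predicted - actual) ** 2    # errors cancel    -> 0.0
reference = F.mse_loss(predicted, actual)               # PyTorch's MSE    -> 1.0
```

With a batch size of 1 the two orderings give the same number, so the difference only shows up once batches get bigger.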

Now it's time to try some training!

In [14]:
class MyLearner():
    
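    def __init__(self, loader=None, model=None, opt_func=None, loss_func=None, metrics=None):
        self.loader = loader
        self.model = model
        self.opt_func = opt_func
        self.loss_func = loss_func
        # Build a fresh dict per instance; a dict default argument would be shared across instances
        self.metrics = metrics if metrics is not None else {'loss': 0}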
        
    def fit(self, epochs=10, verbose=False):
        '''Fit method 
        '''
        for i in range(0, epochs):
            while True:
                try:
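                    X, y = self.loader.next()
                    # Only the first item of the batch is used (fine while batch_size=1)
                    X = torch.squeeze(X[0])
                    pred = self.model(X)
                    loss = self.loss_func(pred, y)
                    # .item() stores a plain float, so the autograd graph is not kept alive
                    self.metrics['loss'] += loss.item()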
                    loss.backward()
                    self.opt_func.step()
                    self.opt_func.zero_grad()
                    if verbose is True and self.loader.counter % 1000 == 0:
                        print("Pred: {}, Actual: {}, Loss: {}".format(pred, y, loss))
                except NoMoreData:
                    self.loader.reset()
                    
                    break
                    
    def fit_once(self):
        X, y = self.loader.next()
        X = torch.squeeze(X[0])
        pred = self.model(X)
        loss = self.loss_func(pred, y)
        loss.backward()
        self.opt_func.step()
        self.opt_func.zero_grad()
        return X

    def print_epoch_loss(self):
        self.metrics['loss'] /= self.loader.length
        print("Epoch loss: {}".format(self.metrics['loss']))
        self.metrics['loss'] = 0
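
If the learner were ever fed whole batches rather than a single image at a time, the shapes would need some care. The sketch below uses a stand-in model and random stand-in data (both assumptions) to show the broadcasting trap between a `[batch, 1]` prediction and a `[batch]` label tensor.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

sketch_model = nn.Sequential(nn.Linear(28*28, 30), nn.ReLU(), nn.Linear(30, 1))

# Stand-in batch (an assumption): 8 flattened images and 8 labels
xb = torch.rand(8, 28*28)
yb = torch.ones(8)

pred = sketch_model(xb)                 # shape [8, 1]
broadcast_diff = pred - yb              # [8, 1] - [8] broadcasts to [8, 8]
aligned_diff = pred.squeeze(1) - yb     # [8] - [8] stays [8]
loss = (aligned_diff ** 2).mean()       # scalar loss over the batch
```

The `[8, 8]` result raises no error, which makes this kind of mismatch easy to miss.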

And now to put it all together:

In [15]:
my_learner = MyLearner(loader=my_training_loader, model=my_model, opt_func=my_optimizer,
                     loss_func=my_loss)


Let's try out the `fit_once()` method:

In [16]:
X = my_learner.fit_once()

We'll reset the counter...

In [17]:
my_learner.loader.reset()
my_learner.loader.counter

0

Now let's try it out!

In [None]:
my_learner.fit()

# Status

## Predictions

Sigh...I'm getting `tensor([nan], grad_fn=<AddBackward0>)` when running my model.  I'm not sure what I'm doing wrong.  When I run the kernel from the start, the `my_model(X)` call does work...
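
One possible explanation, which I haven't confirmed, is that a weight blew up during training: once any parameter is NaN, every later prediction is NaN too, and that would fit with the model working again after a kernel restart. Below is a minimal sketch of a check for that. `non_finite_parameters` is a hypothetical helper, demonstrated on a throwaway model rather than on `my_model`.

```python
import torch
import torch.nn as nn

def non_finite_parameters(model):
    """Return names of parameters that contain NaN or infinity."""
    return [name for name, p in model.named_parameters()
            if not torch.isfinite(p).all()]

# Demonstration on a throwaway model (not the notebook's my_model)
demo_model = nn.Linear(4, 1)
before = non_finite_parameters(demo_model)

with torch.no_grad():
    demo_model.weight[0, 0] = float('nan')   # simulate a diverged weight
after = non_finite_parameters(demo_model)

print(before, after)  # [] ['weight']
```

Running the same check on the real model after a few hundred steps would show whether the weights are the problem, or whether the NaN is coming from somewhere else.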

## Overall

I think this is a bit of a mess at the moment.  

This chapter of the book has two big sections:

- One where a simple linear model is used to fit a parabola, with all of the steps coded from scratch;
- And one where a neural network coded in PyTorch is used to build a model for MNIST.

The two exercises match these sections.

However, what I've done above is a mix of the two.  The `MyLearner` class takes its signature from the second section (the PyTorch one) and its steps from the first.  `my_model` is a PyTorch model.  Making these things match each other is a bit messy.