<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><ul class="toc-item"><li><span><a href="#Improvements-pursued-in-this-notebook" data-toc-modified-id="Improvements-pursued-in-this-notebook-0.1"><span class="toc-item-num">0.1&nbsp;&nbsp;</span>Improvements pursued in this notebook</a></span></li></ul></li><li><span><a href="#First-round-of-improvements:-multi-label-classifier-&amp;-sessions" data-toc-modified-id="First-round-of-improvements:-multi-label-classifier-&amp;-sessions-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>First round of improvements: multi-label classifier &amp; sessions</a></span></li></ul></div>

## Improvements pursued in this notebook

1. Change from a binary classifier to a multi-category classifier:
    - add images for all digits 0-9
    - change the loss function to cross-entropy loss with softmax
    - change the size of the final activation from 1 to 10
    - change the labels to one-hot encoding
2. Add RGB
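
The first item can be illustrated on toy data. The sketch below is not taken from this notebook's helpers: it assumes PyTorch's `torch.nn.functional` and uses made-up logits and labels to show what one-hot labels look like for 10 classes and how cross-entropy with softmax can be computed from them.

```python
import torch
import torch.nn.functional as F

n_cls = 10
targets = torch.tensor([3, 0, 7])            # hypothetical digit labels for a batch of 3
one_hot = F.one_hot(targets, n_cls).float()  # shape (3, 10): a single 1 per row

torch.manual_seed(0)
logits = torch.randn(3, n_cls)               # stand-in for the model's 10 final activations

# Cross-entropy with softmax: -log(probability assigned to the true class),
# averaged over the batch.
manual_ce = -(one_hot * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
builtin_ce = F.cross_entropy(logits, targets)  # same quantity, computed from class indices
```

Multiplying by the one-hot row simply picks out the log-probability of the correct class, which is why the two values should agree.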

Super-short version with all of the helpers:

# First round of improvements: multi-label classifier & sessions

**Create tools for training a model:**

In [None]:
from fastai.vision.all import *

### Data ###
def init_data(path, im_size, n_cls, batch_size):
    ## Train
    # ims
    for i in range(n_cls):
        new_ims = torch.stack(
            [tensor(Image.open(fn)) for fn in (path/'training'/f'{i}').ls()]
        ).float()/255
        if i == 0: ims = new_ims
        else: ims = torch.cat([ims,new_ims])
    train_ims = ims.view(-1,im_size)
    # lbls
    train_lbls = []
    for i in range(n_cls):
        l = L([0]*n_cls)
        l[i] = 1
        train_lbls += [l] * len((path/'training'/f'{i}').ls())    
    train_lbls = tensor(train_lbls)
    ## Valid
    # ims
    for i in range(n_cls):
        new_ims = torch.stack(
            [tensor(Image.open(fn)) for fn in (path/'testing'/f'{i}').ls()]
        ).float()/255
        if i == 0: ims = new_ims
        else: ims = torch.cat([ims,new_ims])
    valid_ims = ims.view(-1,im_size)
    # lbls
    valid_lbls = []
    for i in range(n_cls):
        l = L([0]*n_cls)
        l[i] = 1
        valid_lbls += [l] * len((path/'testing'/f'{i}').ls())    
    valid_lbls = tensor(valid_lbls)
    ## DataLoaders
    train_ds = L(zip(train_ims, train_lbls))
    valid_ds = L(zip(valid_ims, valid_lbls))
    train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
    valid_dl = DataLoader(valid_ds, batch_size=batch_size, shuffle=True)
    # NOTE: only the training loader is returned; valid_dl is built but never used
    return train_dl

### Model ###
def init_mod(im_size, n_cls, hidden_params):
    mod = nn.Sequential(
        nn.Linear(im_size,hidden_params),
        nn.ReLU(),
        nn.Linear(hidden_params,n_cls)
    )
    return mod

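### Loss ###
def softmax(t):
    # normalize exponentiated activations so each row sums to 1
    if len(t.shape) == 1: return torch.exp(t) / torch.exp(t).sum()
    else:                 return torch.exp(t) / torch.exp(t).sum(dim=1, keepdim=True)
def loss(yp, y):
    # y is one-hot, so (y*softmax(yp)).sum(dim=1) is the probability given to the true class;
    # the loss is 1 minus that probability, averaged over the mini-batch
    return (1 - (y*softmax(yp)).sum(dim=1, keepdim=True)).mean()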

### Calculate gradients for use in train_one_epoch ###
def calc_grad(x,y,mod):
    yp = mod(x)     # get predictions
    ls = loss(yp,y) # calculate loss
    ls.backward()   # compute gradients of the loss w.r.t. the parameters

### Create SGD Stepper; args = (mod.parameters(), lr) ###
class ParamStepper:
    def __init__(self, p, lr): self.p,self.lr = list(p),lr # initialize w/ mod.params & lr
        
    def step(self, *args, **kwargs):                       # take step
        for o in self.p: o.data -= o.grad.data * self.lr
            
    def zero_grad(self, *args, **kwargs):                  # reset grad
        for o in self.p: o.grad = None

### Train parameters by performing SGD on each mini-batch in the dl ###
def train_one_epoch(dl, mod, stepper):
    for xb,yb in dl:           # for every minibatch (xb,yb) in the dataloader:
        calc_grad(xb, yb, mod) # calc grad(loss(mod(xb),yb))
        stepper.step()         # take step
        stepper.zero_grad()    # reset grad

### Get accuracy of mod on a mini-batch ###
def mb_acc(yp,y):
    yp_max,yp_i = torch.max(yp, dim=1, keepdim=True)
    y_max, y_i  = torch.max(y,  dim=1, keepdim=True)
    return (yp_i==y_i).float().mean()
        
### Get accuracy of mod on a dataloader (takes avg of all mbs in dl) ###
def epoch_acc(dl, mod):
    a = [mb_acc(mod(xb), yb) for xb,yb in dl]
    return round(torch.stack(a).mean().item(), 5)          # avg acc over all mini-batches

### Run `train_one_epoch` `nepochs` times given data `dl`, model `mod`, and stepper `stepper`; return the accuracy after each epoch
def train_n_epochs(dl, mod, stepper, nepochs):
    l = L()
    for i in range(nepochs):
        print('.',end='')
        train_one_epoch(dl, mod, stepper)
        l += epoch_acc(dl, mod)
    print('',end='\t')
    return l

### Perform n training sessions ###
def train_n_sessions(dl, im_size, n_cls, hidden_params, nepochs, lr, nsessions):
    l = L()
    print('Progress:',end='\n')
    for i in range(nsessions):
        print(i,end='')
        mod = init_mod(im_size, n_cls, hidden_params)
        stepper = ParamStepper(mod.parameters(), lr)
        l += train_n_epochs(dl, mod, stepper, nepochs)
    print('Done')
    return tensor(l).reshape(nsessions,nepochs)
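
As a sanity check on the hand-written stepper, the sketch below compares its update rule with `torch.optim.SGD` at default settings (no momentum, no weight decay). It is a standalone illustration under stated assumptions: the mini-batch is synthetic, the class is repeated so the block runs on its own, and `F.cross_entropy` stands in for the notebook's `loss`.

```python
import copy
import torch
from torch import nn
import torch.nn.functional as F

class ParamStepper:  # same update rule as the class above, repeated to run standalone
    def __init__(self, p, lr): self.p, self.lr = list(p), lr
    def step(self):
        for o in self.p: o.data -= o.grad.data * self.lr
    def zero_grad(self):
        for o in self.p: o.grad = None

torch.manual_seed(0)
mod_a = nn.Sequential(nn.Linear(784, 30), nn.ReLU(), nn.Linear(30, 10))
mod_b = copy.deepcopy(mod_a)                 # identical starting weights

xb = torch.randn(16, 784)                    # synthetic mini-batch (assumption)
yb = torch.randint(0, 10, (16,))             # synthetic class indices (assumption)

step_a = ParamStepper(mod_a.parameters(), lr=0.1)
step_b = torch.optim.SGD(mod_b.parameters(), lr=0.1)

# One identical training step with each stepper
for mod, stepper in ((mod_a, step_a), (mod_b, step_b)):
    F.cross_entropy(mod(xb), yb).backward()
    stepper.step()
    stepper.zero_grad()

# True if both models ended up with (numerically) the same parameters
same = all(torch.allclose(pa, pb)
           for pa, pb in zip(mod_a.parameters(), mod_b.parameters()))
```

If `same` is true, `ParamStepper` could be swapped for the built-in optimizer without changing the training behaviour.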

**Train model once to make sure it works:**

In [None]:
### Train n epochs ###
# params
path          = untar_data(URLs.MNIST)
n_cls         = 10
im_size       = 28*28
batch_size    = 64*2*2*2
hidden_params = 30
lr            = .1
nepochs       = 20

# inits
dl            = init_data(path, im_size, n_cls, batch_size)
mod           = init_mod(im_size, n_cls, hidden_params)
stepper       = ParamStepper(mod.parameters(), lr)

# train
train_n_epochs(dl, mod, stepper, nepochs)

....................	

(#20) [0.1911,0.46966,0.64053,0.65437,0.66237,0.66664,0.73297,0.74156,0.74684,0.74923...]
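
Note that these accuracies are measured on the same loader the model was trained on, since `init_data` returns only `train_dl`. In addition, `epoch_acc` averages per-batch accuracies, so a smaller final batch counts as much as a full one. A minimal sketch of a sample-weighted, gradient-free alternative that could be pointed at a held-out loader follows; `dl_accuracy` is a hypothetical name, the data is synthetic, and PyTorch's own `DataLoader` is used here rather than fastai's.

```python
import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def dl_accuracy(dl, mod):
    """Fraction of correctly classified samples over a whole loader (hypothetical helper)."""
    correct, total = 0, 0
    with torch.no_grad():                       # no autograd graph needed for evaluation
        for xb, yb in dl:
            pred = mod(xb).argmax(dim=1)        # predicted class per sample
            true = yb.argmax(dim=1)             # one-hot label -> class index
            correct += (pred == true).sum().item()
            total += len(xb)
    return correct / total

torch.manual_seed(0)
x = torch.randn(100, 784)                       # synthetic stand-in for held-out images
y = F.one_hot(torch.randint(0, 10, (100,)), 10) # synthetic one-hot labels
valid_dl = DataLoader(TensorDataset(x, y), batch_size=32)  # final batch has only 4 samples

mod = nn.Sequential(nn.Linear(784, 30), nn.ReLU(), nn.Linear(30, 10))
acc = dl_accuracy(valid_dl, mod)                # untrained model, so expect roughly chance level
```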

**Train model n sessions to find max accuracy:**

In [None]:
### Train n sessions ###
# params
path          = untar_data(URLs.MNIST)
n_cls         = 10
im_size       = 28*28
batch_size    = 64*2*2*2

# inits
dl = init_data(path, im_size, n_cls, batch_size)

# train
test1 = train_n_sessions(dl, im_size, n_cls, hidden_params=30, nepochs=50, lr=.1, nsessions=8)

# print max acc
print("Max accuracy:",test1.max().item())

# print training sessions
pd.DataFrame(test1.numpy())

Progress:
0..................................................	1..................................................	2..................................................	3..................................................	4..................................................	5..................................................	6..................................................	7..................................................	Done
Max accuracy: 0.9184899926185608


| session | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.19816 | 0.46343 | 0.64134 | 0.65687 | 0.6637 | 0.66759 | 0.73384 | 0.74391 | 0.7494 | 0.75184 | ... | 0.85144 | 0.85216 | 0.85299 | 0.85288 | 0.85288 | 0.85292 | 0.85344 | 0.85316 | 0.85412 | 0.85388 |
| 1 | 0.19196 | 0.37878 | 0.57947 | 0.64849 | 0.6572 | 0.66395 | 0.66559 | 0.66729 | 0.66886 | 0.67025 | ... | 0.75618 | 0.7563 | 0.75632 | 0.75618 | 0.75589 | 0.75707 | 0.75724 | 0.75636 | 0.09861 | 0.0984 |
| 2 | 0.33979 | 0.5415 | 0.62218 | 0.71938 | 0.73084 | 0.73652 | 0.74053 | 0.74377 | 0.74542 | 0.74738 | ... | 0.83571 | 0.83559 | 0.83667 | 0.83729 | 0.83769 | 0.83786 | 0.83818 | 0.83907 | 0.83917 | 0.83998 |
| 3 | 0.32561 | 0.39221 | 0.55516 | 0.57349 | 0.65191 | 0.66174 | 0.66531 | 0.6668 | 0.68988 | 0.73967 | ... | 0.83193 | 0.83251 | 0.83293 | 0.83427 | 0.83475 | 0.83443 | 0.09854 | 0.09861 | 0.0989 | 0.09911 |
| 4 | 0.4204 | 0.52091 | 0.63931 | 0.7216 | 0.73606 | 0.80492 | 0.81567 | 0.82236 | 0.82609 | 0.82894 | ... | 0.91747 | 0.09883 | 0.09897 | 0.09875 | 0.09868 | 0.09854 | 0.09868 | 0.0989 | 0.09854 | 0.09875 |
| 5 | 0.20793 | 0.43703 | 0.54513 | 0.70941 | 0.77675 | 0.80554 | 0.81683 | 0.82273 | 0.82707 | 0.83012 | ... | 0.85274 | 0.85304 | 0.85336 | 0.85382 | 0.85374 | 0.85432 | 0.8544 | 0.85451 | 0.85502 | 0.85538 |
| 6 | 0.32405 | 0.28171 | 0.47103 | 0.68722 | 0.78754 | 0.80963 | 0.81622 | 0.8226 | 0.8267 | 0.82913 | ... | 0.09861 | 0.09847 | 0.09854 | 0.09911 | 0.0989 | 0.09825 | 0.09847 | 0.0989 | 0.09861 | 0.09875 |
| 7 | 0.21432 | 0.3616 | 0.68459 | 0.77192 | 0.79975 | 0.81397 | 0.81914 | 0.82467 | 0.82668 | 0.83039 | ... | 0.85334 | 0.85391 | 0.85433 | 0.85418 | 0.85461 | 0.85513 | 0.85529 | 0.85502 | 0.85561 | 0.83147 |
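
Several sessions above fall abruptly to about 0.0986 and never recover, which is close to what a model that predicts a single class for every image would score. One plausible cause (an assumption, not verified against these runs) is overflow in the hand-written `softmax`: `torch.exp` returns `inf` for large float32 inputs, `inf/inf` is `nan`, and a `nan` loss would turn every parameter into `nan` at the next step. The sketch below reproduces the overflow on made-up logits and shows the common remedy of subtracting the row maximum before exponentiating; `naive_softmax` and `stable_softmax` are hypothetical names.

```python
import torch

def naive_softmax(t):
    # Same formula as the 2-D branch of the notebook's softmax
    return torch.exp(t) / torch.exp(t).sum(dim=1, keepdim=True)

def stable_softmax(t):
    # Subtracting the row max leaves the result unchanged mathematically
    # but keeps every exponent <= 0, so exp() cannot overflow.
    t = t - t.max(dim=1, keepdim=True).values
    return torch.exp(t) / torch.exp(t).sum(dim=1, keepdim=True)

logits = torch.tensor([[100.0, 0.0, -5.0],   # exp(100) overflows float32
                       [1.0, 2.0, 3.0]])     # well-behaved row for comparison

naive = naive_softmax(logits)    # first row contains nan (inf / inf)
stable = stable_softmax(logits)  # finite everywhere, each row sums to 1
```

On well-behaved rows the two versions agree, so switching to the stable form should not change results for runs that never overflow.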
