# Jonathan's EMNIST Neural Network trainer
---
Welcome!

This notebook uses the data exported from the previous notebook to train a neural network and save it, which will then be used in the next notebook as the core of an OCR program.

*Note:
This notebook was written for data exported in tensor format. Data exported as images will not load with the code below, and that format is also much slower to read, so using it is not advised.*

**Run this notebook with GPU hardware acceleration if possible**

## Libraries

In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms, io
from torch.utils.data import DataLoader, WeightedRandomSampler, Dataset
from torchsummary import summary
import cv2
import matplotlib.pyplot as plt
from google.colab import drive
import copy
from tqdm import tqdm

In [None]:
drive.mount('/content/drive')

Mounted at /content/drive


## Setup & Settings
Define file paths and adjust Dataloader settings if needed

Define parent directory of project

In [None]:
dir = '/content/drive/MyDrive/Datasets/EMNIST/'

`src` refers to the data source. If default folder names were used in the preceding notebook, use `raw` for the normal EMNIST format and `data` for enriched data.

In [None]:
src = 'raw' # 'raw' for pure EMNIST, 'data' for my punctuation-expanded EMNIST
train_data = torch.load(dir+src+'/train_data.dmp').float()/255
test_data = torch.load(dir+src+'/test_data.dmp').float()/255
train_keys = torch.load(dir+src+'/train_keys.dmp')
test_keys = torch.load(dir+src+'/test_keys.dmp')
class_names = torch.load(dir+src+'/class_names.dmp')

In [None]:
data = {'train':train_data,'valid':test_data}
keys = {'train':train_keys,'valid':test_keys}

To make the data usable by a PyTorch `DataLoader`, I had to write my own subclass of `torch.utils.data.Dataset`. It builds a dataset object from a tensor of samples and a tensor of keys (labels).

In [None]:
class JDataset(Dataset):
  """Dataset backed by a tensor of samples and a tensor of labels."""
  def __init__(self,data,labels):
    # Both inputs must be tensors, with exactly one label per sample
    if not (torch.is_tensor(data) and torch.is_tensor(labels)):
      raise TypeError('Samples and labels must both be tensors')
    if data.shape[0] != labels.shape[0]:
      raise ValueError('Samples and labels not same length')
    self.samples = data
    self.labels = labels
  def __len__(self):
    return len(self.labels)
  def __getitem__(self,index):
    # Return one (sample, label) pair
    return self.samples[index], self.labels[index]

This cell uses the `JDataset` class to create the training and validation datasets.

In [None]:
jdataset = {x: JDataset(data[x],keys[x]) for x in ['train','valid']}
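For comparison, PyTorch's built-in `TensorDataset` gives similar indexing behaviour for tensors that share a first dimension. The sketch below is not what this notebook uses; the random tensors and their shapes are stand-ins chosen for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in tensors (assumed shapes): 10 samples of 3x28x28 and 10 integer labels
samples = torch.rand(10, 3, 28, 28)
labels = torch.arange(10)

# TensorDataset pairs up the tensors along their first dimension
toy_dataset = TensorDataset(samples, labels)
sample, label = toy_dataset[4]

# A DataLoader stacks individual items into batches
loader = DataLoader(toy_dataset, batch_size=5)
first_inputs, first_labels = next(iter(loader))

print(len(toy_dataset), sample.shape, label.item(), first_inputs.shape)
```

Indexing either class returns one `(sample, label)` pair, which is all the `DataLoader` needs.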

These are the settings for the dataloader. `batch` is the number of samples in a single batch fed to the neural network, and `batches` is the number of batches fed to it per epoch. I recommend choosing `batch` and `batches` so that their product equals the number of samples in that set.

---

**Recommended Settings:**

- Raw Data:
 - `batches = {'train':300, 'valid':50}`
 - `batch = {'train':376, 'valid':376}`
- Default Enriched Data (140 fonts, 20 characters/font):
 - `batches = {'train':384, 'valid':64}`
 - `batch = {'train':300, 'valid':300}`

In [None]:
batches = {'train': 384, 'valid': 64}
batch = {'train': 300, 'valid': 300}
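The recommendation that `batch` times `batches` should match the size of the set can be checked with a little arithmetic. This is a minimal sketch: `samples_per_epoch` is a hypothetical helper, and the comparison against 112,800 and 18,800 samples assumes the raw data is the EMNIST Balanced split. The enriched set sizes are not stated in this notebook, so only the products are shown for them.

```python
def samples_per_epoch(batch, batches):
    """Number of samples drawn per epoch for each split."""
    return {split: batch[split] * batches[split] for split in batch}

# Recommended settings for the raw data
raw = samples_per_epoch({'train': 376, 'valid': 376},
                        {'train': 300, 'valid': 50})

# Recommended settings for the default enriched data
enriched = samples_per_epoch({'train': 300, 'valid': 300},
                             {'train': 384, 'valid': 64})

# Assumption: EMNIST Balanced has 112,800 training and 18,800 test samples
raw_matches_emnist_balanced = raw == {'train': 112800, 'valid': 18800}

print(raw)
print(enriched)
print(raw_matches_emnist_balanced)
```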

This cell sets up the weighted random sampler and the dataloader. The loader provides batches of tensors to feed to the neural network. The weighted random sampler draws each sample with a weight inversely proportional to the size of its class, so the classes added by the enriched data, which have fewer samples by default, are represented equally.

In [None]:
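# Per-class weight: 1 / (number of samples in that class)
weights = {x: 1./np.array(np.unique(jdataset[x].labels, return_counts=True)[1]) for x in ['train','valid']}
# Per-sample weight, looked up from each sample's label (labels must run from 0 to n_classes-1)
sample_weights = {x: torch.from_numpy(weights[x][jdataset[x].labels]) for x in ['train','valid']}

# Each epoch draws batch*batches samples (with replacement) according to those weights
Sampler = {x: WeightedRandomSampler(sample_weights[x], batch[x]*batches[x]) for x in ['train','valid']}
dataloader = {x: DataLoader(jdataset[x], batch_size=batch[x], num_workers=1, sampler=Sampler[x], pin_memory=True) for x in ['train','valid']}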
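To see why inverse-frequency weights balance the classes, here is a small NumPy-only sketch of the same weighting on a made-up label array. The toy labels are hypothetical; the point is that each class ends up with the same total weight, however many samples it has.

```python
import numpy as np

# Toy labels (hypothetical): class 0 appears 3 times, class 1 once, class 2 twice
toy_labels = np.array([0, 0, 0, 1, 2, 2])

# Inverse class frequency; index c holds the weight for class c
class_counts = np.unique(toy_labels, return_counts=True)[1]
class_weights = 1.0 / class_counts

# Look up each sample's weight from its label
toy_sample_weights = class_weights[toy_labels]

# Total weight per class: every class gets the same share of the draws
per_class_total = np.bincount(toy_labels, weights=toy_sample_weights)

print(toy_sample_weights)
print(per_class_total)
```

Note that the lookup `class_weights[toy_labels]` only lines up when the labels are consecutive integers starting at 0.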

Device definition (please use a GPU!)

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") #use GPU cores if available, else cpu
dataset_sizes = {x: len(jdataset[x]) for x in ['train','valid']} #lengths of datasets

## Model Preparation

This step is needed because the standard AlexNet expects 224x224x3 images, while my OCR needs it to take the 28x28 images provided by EMNIST. The images can be stacked to emulate 3 color channels, but the model still needs adjusting to fit their small size.

Import an untrained AlexNet

In [None]:
alexnet = models.alexnet(weights = None)

Adjust to fit my data:

This deletes two convolutional layers (along with their ReLU activations) and one max pooling layer from AlexNet. The remaining convolutional layers are replaced with versions that use smaller kernels, a stride of 1, and `'same'` padding, so they no longer shrink the already small 28x28 input. The final layer is replaced to output all 67 classes.

In [None]:
jnet = copy.deepcopy(alexnet)
del(jnet.features[5:10])
jnet.features[0] = nn.Conv2d(3, 32, kernel_size=(5,5), stride=(1,1), padding='same')
jnet.features[3] = nn.Conv2d(32, 64, kernel_size=(3,3), stride=(1,1), padding='same')
jnet.features[5] = nn.Conv2d(64, 256, kernel_size=(3,3), stride=(1,1), padding='same')
jnet.classifier[6] = nn.Linear(4096, 67)
jnet = jnet.to(device)
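The effect of these changes on the image size can be traced by hand with the usual output-size formula for convolution and pooling layers. The sketch below is plain arithmetic and builds no model; `out_size` is a hypothetical helper, and the pooling layers are AlexNet's 3x3, stride-2 max pools.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output width of a conv or pooling layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

# AlexNet's original first layer (11x11 kernel, stride 4, padding 2) on a 28px input
original_first_layer = out_size(28, kernel=11, stride=4, padding=2)

# The adjusted feature extractor, layer by layer
size = out_size(28, kernel=5, padding=2)     # Conv 5x5 with 'same' padding
size = out_size(size, kernel=3, stride=2)    # MaxPool 3x3, stride 2
after_first_pool = size
size = out_size(size, kernel=3, padding=1)   # Conv 3x3 with 'same' padding
size = out_size(size, kernel=3, padding=1)   # Conv 3x3 with 'same' padding
size = out_size(size, kernel=3, stride=2)    # MaxPool 3x3, stride 2

# 256 channels are flattened before the first fully connected layer
flattened = 256 * size * size

print(original_first_layer, after_first_pool, size, flattened)
```

The unmodified first layer alone would cut a 28px image down to 6px, whereas the adjusted stack reaches 6x6 only after the last pooling layer, giving the 9216 inputs that the first `Linear` layer expects.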

In [None]:
summary(jnet,(3,28,28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 32, 28, 28]           2,432
              ReLU-2           [-1, 32, 28, 28]               0
         MaxPool2d-3           [-1, 32, 13, 13]               0
            Conv2d-4           [-1, 64, 13, 13]          18,496
              ReLU-5           [-1, 64, 13, 13]               0
            Conv2d-6          [-1, 256, 13, 13]         147,712
              ReLU-7          [-1, 256, 13, 13]               0
         MaxPool2d-8            [-1, 256, 6, 6]               0
 AdaptiveAvgPool2d-9            [-1, 256, 6, 6]               0
          Dropout-10                 [-1, 9216]               0
           Linear-11                 [-1, 4096]      37,752,832
             ReLU-12                 [-1, 4096]               0
          Dropout-13                 [-1, 4096]               0
           Linear-14                 [-

## Model Training

Once everything else is prepared, get in here to start training the model!

This is the primary training loop. Feel free to adjust the settings to your liking to fiddle with how it learns. Running this cell multiple times continues training the same model, although the preparation section can be run again if you wish to start over.

Settings:

- `num_epochs` - The number of epochs (full passes over the sampled training data) to train the neural network before stopping. Be careful not to train for too long, or the model will overfit!
- `lr` - Learning rate. This scales how far the model's parameters move on each update. Values that are too low make training slow, while values that are too high make it less precise. I've found 0.005 works well, but feel free to change it.
- `momentum` - The amount of influence previous updates have on the current one. I haven't tried changing this, but 0.9 works well: it keeps training consistent while leaving room to change direction if needed.
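As a rough illustration of how `lr` and `momentum` interact, the sketch below applies the update rule in the form `torch.optim.SGD` uses (without dampening or Nesterov momentum) to a single number. The function name and the toy objective are hypothetical and stand in for the real network.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.005, momentum=0.9):
    """One SGD-with-momentum update on a single scalar parameter."""
    # The velocity accumulates past gradients, scaled down by `momentum`
    velocity = momentum * velocity + grad
    # The parameter moves against the velocity, scaled by the learning rate
    param = param - lr * velocity
    return param, velocity

# Toy problem (hypothetical): minimise f(x) = x**2, whose gradient is 2*x
x, v = 1.0, 0.0
history = []
for _ in range(3):
    x, v = sgd_momentum_step(x, 2 * x, v)
    history.append(x)

print(history)
```

Each step moves further than the one before it because the velocity keeps building up, which is the "influence of previous updates" described above.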

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(jnet.parameters(), lr=0.005, momentum=0.9)
num_epochs = 1000

for epoch in range(num_epochs):
  # Training pass
  jnet.train()
  traincorrect = 0
  for inputs, labels in dataloader['train']:
    inputs = inputs.to(device)
    labels = labels.to(device)

    outputs = jnet(inputs)
    preds = torch.max(outputs,1)[1]
    traincorrect += torch.sum(preds == labels).item()

    optimizer.zero_grad()
    loss = criterion(outputs,labels)
    loss.backward()
    optimizer.step()

  # Validation pass; gradients are not needed here
  jnet.eval()
  testcorrect = 0
  with torch.no_grad():
    for inputs, labels in dataloader['valid']:
      inputs = inputs.to(device)
      labels = labels.to(device)

      outputs = jnet(inputs)
      preds = torch.max(outputs, 1)[1]
      testcorrect += torch.sum(preds == labels).item()

  # The sampler draws batch*batches samples per epoch, so that is the denominator
  train_acc = traincorrect*100/(batch['train']*batches['train'])
  test_acc = testcorrect*100/(batch['valid']*batches['valid'])
  print('Epoch {:03d}'.format(epoch+1),'-'*15,'Training Accuracy: {:.2f}%'.format(train_acc),'-'*5,'Testing Accuracy: {:.2f}%'.format(test_acc))

Epoch 001 --------------- Training Accuracy: 51.13% ----- Testing Accuracy: 74.01%
Epoch 002 --------------- Training Accuracy: 76.30% ----- Testing Accuracy: 81.65%


KeyboardInterrupt: ignored
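The accuracy figures printed by the loop come from comparing the highest-scoring class for each sample with its label. A minimal sketch of that step on made-up scores (the tensors below are hypothetical, not model output):

```python
import torch

# Hypothetical scores for 4 samples over 3 classes
outputs = torch.tensor([[2.0, 0.1, 0.3],
                        [0.2, 1.5, 0.1],
                        [0.9, 0.8, 3.0],
                        [1.2, 0.4, 0.3]])
labels = torch.tensor([0, 1, 2, 1])

# torch.max returns (values, indices); the indices are the predicted classes
preds = torch.max(outputs, 1)[1]

# The same predictions can be obtained with argmax
same_as_argmax = torch.equal(preds, outputs.argmax(dim=1))

# Count matches and convert to a percentage
correct = (preds == labels).sum().item()
accuracy = 100 * correct / len(labels)

print(preds.tolist(), correct, accuracy)
```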

## Saving & Loading
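Use this section to save the trained model for later use, or to load a saved model back in if you wish to keep training it. Loading restores weights into the existing `jnet`, so run the Model Preparation section first.

`this_model` is the file path, relative to `dir`, that the model is written to and loaded from. The folder it points to must already exist.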

In [None]:
this_model = 'saved models/jnet.dat'

In [None]:
torch.save(jnet.state_dict(),dir+this_model)

In [None]:
# map_location lets a model saved on a GPU be loaded on a CPU-only runtime
jnet.load_state_dict(torch.load(dir+this_model, map_location=device))
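The save and load cells can be tried out without touching Drive. The sketch below round-trips a `state_dict` through a temporary file, using a small stand-in `nn.Linear` model and a hypothetical file name rather than `jnet` and the path above.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Small stand-in models (hypothetical); the notebook saves and loads jnet instead
model = nn.Linear(4, 2)
restored = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'toy_model.dat')
    # Only the parameters are written, not the model class itself
    torch.save(model.state_dict(), path)
    # map_location lets a file written on a GPU load on a CPU-only machine
    state = torch.load(path, map_location='cpu')
    restored.load_state_dict(state)

weights_match = torch.equal(model.weight, restored.weight)
biases_match = torch.equal(model.bias, restored.bias)
print(weights_match, biases_match)
```

Because only the parameters are stored, the receiving model must be built with the same architecture before `load_state_dict` is called.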