# Model Training

We've integrated the data and looked at outstanding issues such as missing values, so we can proceed to testing models. Some candidates that should work well:
1. A naive softmax neural network (a single linear layer) as the baseline
1. A vanilla CNN with different pooling and batch normalization options
1. Established image classification architectures such as AlexNet, LeNet, and ResNet

## Part 1: Training and Test Split

In [19]:
import pandas as pd 
import numpy as np 
from PIL import Image 
import os 
import torch
from torch.utils.data import Dataset, DataLoader 
from torchvision import transforms

In [38]:
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(device)

mps


In [9]:
data = pd.read_csv('../data/raw/chinese_mnist_classes.csv')
data.head()

Unnamed: 0,suite_id,sample_id,code,value,character,class,img_name,width,height
0,1,1,10,9,九,9,input_1_1_10,64,64
1,1,10,10,9,九,9,input_1_10_10,64,64
2,1,2,10,9,九,9,input_1_2_10,64,64
3,1,3,10,9,九,9,input_1_3_10,64,64
4,1,4,10,9,九,9,input_1_4_10,64,64


In [14]:
# Dataset that pairs each image file with its class label from the annotations CSV
class ChineseMNISTDataset(Dataset): 
    def __init__(self, csv_file, root_dir, transform=None): 
        self.annotations = pd.read_csv(csv_file)  # one row per image
        self.root_dir = root_dir                  # directory holding the .jpg files
        self.transform = transform                # optional transform applied to each image

    def __len__(self): 
        return len(self.annotations)
    
    def __getitem__(self, idx): 
        row = self.annotations.iloc[idx]
        # build the image path from the img_name column
        img_name = os.path.join(self.root_dir, row['img_name'] + '.jpg')
        image = Image.open(img_name).convert('L')  # load as single-channel grayscale
        label = int(row['class'])
        
        if self.transform: 
            image = self.transform(image)
        
        return image, label


In [15]:
root_dir = '../data/raw/images'
csv_file = '../data/raw/chinese_mnist_classes.csv'

dataset = ChineseMNISTDataset(csv_file=csv_file, root_dir=root_dir, transform=transforms.ToTensor())

In [16]:
sample_loader = DataLoader(dataset, batch_size = 4, shuffle=True)
for images,labels in sample_loader: 
    print(images.shape)
    print(labels.shape)
    break

torch.Size([4, 1, 64, 64])
torch.Size([4])


We now have a working dataset, so we can split it into training and test sets:

In [24]:
from torch.utils.data import random_split
dataset_size = len(dataset)
train_size = int(0.8 * dataset_size)
test_size = dataset_size - train_size 

train_dataset, test_dataset = random_split(dataset, [train_size, test_size])

train_loader = DataLoader(dataset=train_dataset, batch_size=128, shuffle=True)
test_loader=DataLoader(dataset=test_dataset, batch_size=128, shuffle=False)
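The split above draws a fresh random permutation each time the cell runs, so the train and test sets can differ between sessions. A minimal sketch of one way to make it reproducible is to pass a seeded `torch.Generator` to `random_split`. The toy `TensorDataset`, the `split` helper, and the seed value 42 are illustrative assumptions, not part of the notebook.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy stand-in for the real dataset (assumption: 100 samples, 15 classes)
toy = TensorDataset(torch.arange(100).unsqueeze(1), torch.arange(100) % 15)
train_size = int(0.8 * len(toy))
test_size = len(toy) - train_size

def split(seed):
    # A seeded generator makes the random permutation deterministic
    gen = torch.Generator().manual_seed(seed)
    return random_split(toy, [train_size, test_size], generator=gen)

train_a, test_a = split(42)
train_b, test_b = split(42)

# Same seed, same indices; train and test never overlap
same_split = train_a.indices == train_b.indices
disjoint = set(train_a.indices).isdisjoint(test_a.indices)
print(same_split, disjoint, len(train_a), len(test_a))
```

With the real data, the same `generator=` argument could be added to the existing `random_split(dataset, [train_size, test_size])` call.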


## Part 2: Simple NN Model 

Let's start with a simple linear softmax model (a single fully connected layer) to establish baseline performance.

In [25]:
from torch import nn

class nn_softmax_baseline(nn.Module): 
    def __init__(self, height, width, num_classes): 
        super(nn_softmax_baseline, self).__init__()
        self.fc = nn.Linear(height * width, num_classes)
        self.flatten = nn.Flatten()
    def forward(self, x): 
        x = self.flatten(x)
        x = self.fc(x)
        return x
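Note that the model returns raw scores (logits) and never calls softmax itself. That works because `nn.CrossEntropyLoss` applies log-softmax internally. The sketch below checks this equivalence on random data and also computes the loss a 15-class model would get from a uniform guess, which is ln(15) ≈ 2.708. This is a useful reference point when reading the first-epoch losses. The batch size of 8 and the random inputs are assumptions for illustration.

```python
import math
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)
num_classes = 15
logits = torch.randn(8, num_classes)             # stand-in for model outputs
labels = torch.randint(0, num_classes, (8,))     # stand-in for targets

# CrossEntropyLoss on logits equals NLL loss on log-softmax of the logits
ce = nn.CrossEntropyLoss()(logits, labels)
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

# All-zero logits give a uniform distribution over the classes
uniform_loss = nn.CrossEntropyLoss()(torch.zeros(8, num_classes), labels)
chance_loss = math.log(num_classes)

print(torch.allclose(ce, manual), uniform_loss.item(), chance_loss)
```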

In [45]:
def train_model(model, num_epochs=10,  train_loader=train_loader, test_loader=test_loader, learning_rate= 0.001, device=device): 
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    model = model.to(device)
    
    
    train_losses = [] 
    test_losses = []
    
    for epoch in range(num_epochs): 
        model.train()
        total_loss_tr = 0
        total_tr_predictions = 0 
        total_tr_correct = 0
        for (images, labels) in train_loader: 
            images,labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            total_loss_tr+=loss.item()
            
            predicted = outputs.argmax(dim=1)  # predicted class = index of the largest logit
            total_tr_predictions+=labels.size(0)
            total_tr_correct +=(predicted == labels).sum().item()
            
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()          
        
        model.eval()
        with torch.no_grad(): 
            total_t_loss = 0
            total_predictions = 0 
            correct_predictions = 0
            for images, labels in test_loader: 
                images,labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)
                total_t_loss+=loss.item()
                
                predicted = outputs.argmax(dim=1)  # predicted class = index of the largest logit
                total_predictions+=labels.size(0)
                correct_predictions +=(predicted == labels).sum().item()
        
        train_losses.append(total_loss_tr/len(train_loader))
        test_losses.append(total_t_loss/len(test_loader))
        test_accuracy = correct_predictions/total_predictions
        train_accuracy = total_tr_correct/total_tr_predictions
        
        print(f"Epoch {epoch+1}")
        print(f"Train Loss: {total_loss_tr/len(train_loader)}")
        print(f"Test Loss: {total_t_loss/len(test_loader)}")
        print(f"Train Accuracy: {train_accuracy}")
        print(f"Test Accuracy: {test_accuracy}")
        print()
        
    return model, train_losses, test_losses

In [46]:
baseline_model = nn_softmax_baseline(64, 64, 15)
_, baseline_tr, baseline_t = train_model(baseline_model, num_epochs=10)

Epoch 1
Train Loss: 2.47991184985384
Test Loss: 2.3274452884991965
Train Accuracy: 0.21591666666666667
Test Accuracy: 0.3416666666666667

Epoch 2
Train Loss: 2.1996408498033566
Test Loss: 2.154573361078898
Train Accuracy: 0.398
Test Accuracy: 0.41233333333333333

Epoch 3
Train Loss: 2.0489703835325037
Test Loss: 2.055066933234533
Train Accuracy: 0.45575
Test Accuracy: 0.44966666666666666

Epoch 4
Train Loss: 1.950087622125098
Test Loss: 1.990242376923561
Train Accuracy: 0.4821666666666667
Test Accuracy: 0.4663333333333333

Epoch 5
Train Loss: 1.8787282159987917
Test Loss: 1.9429818938175838
Train Accuracy: 0.5024166666666666
Test Accuracy: 0.471

Epoch 6
Train Loss: 1.822484864833507
Test Loss: 1.9091433236996334
Train Accuracy: 0.5159166666666667
Test Accuracy: 0.479

Epoch 7
Train Loss: 1.7770469784736633
Test Loss: 1.8820263743400574
Train Accuracy: 0.5263333333333333
Test Accuracy: 0.48233333333333334

Epoch 8
Train Loss: 1.7383843254535756
Test Loss: 1.8603791048129399
Train Accur

## Part 3: Vanilla CNN Networks

Next we test vanilla CNNs: convolution alone, convolution with pooling, and variants that add batch normalization, dropout, and other components.

Our goal is to outperform the baseline model, which flattens each image into a vector and therefore exploits none of the spatial structure (translation invariance/equivariance) of the images.
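To make the equivariance point concrete, here is a small sketch: shifting the input of a convolution shifts its output by the same amount, while a linear layer on flattened pixels has no such property. To make the equality exact, the sketch assumes circular padding. With the zero padding used in this notebook, the property holds only away from the image borders. The layer sizes and the shift are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Circular padding is an assumption that makes shift-equivariance exact
conv = nn.Conv2d(1, 4, kernel_size=3, padding=1, padding_mode="circular")
fc = nn.Linear(64 * 64, 15)
flatten = nn.Flatten()

x = torch.randn(1, 1, 64, 64)
shift = (3, 5)
x_shifted = torch.roll(x, shifts=shift, dims=(2, 3))

with torch.no_grad():
    # Convolution: shifting then convolving equals convolving then shifting
    shifted_then_conv = conv(x_shifted)
    conv_then_shifted = torch.roll(conv(x), shifts=shift, dims=(2, 3))
    equivariant = torch.allclose(shifted_then_conv, conv_then_shifted, atol=1e-5)

    # Linear on flattened pixels: the output changes arbitrarily under a shift
    linear_same = torch.allclose(fc(flatten(x)), fc(flatten(x_shifted)), atol=1e-5)

print(equivariant, linear_same)
```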

In [47]:
class vanilla_cnn(nn.Module): 
    def __init__(self, num_classes): 
        super(vanilla_cnn, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(64*16*16, 128)
        self.fc2 = nn.Linear(128, num_classes)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
    def forward(self, x): 
        x = self.conv1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x
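The `64*16*16` input size of `fc1` follows from the layer arithmetic: each 3x3 convolution with stride 1 and padding 1 keeps the spatial size, and each 2x2 max pool halves it. A minimal sketch of that calculation, using the standard output-size formula for convolution and pooling layers (the `conv_out` helper is a hypothetical name introduced here):

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output size along one spatial dimension for a conv or pooling layer
    return (size + 2 * padding - kernel) // stride + 1

size = 64                                         # input images are 64x64
size = conv_out(size, kernel=3, stride=1, padding=1)  # conv1 keeps 64
size = conv_out(size, kernel=2, stride=2)             # maxpool -> 32
size = conv_out(size, kernel=3, stride=1, padding=1)  # conv2 keeps 32
size = conv_out(size, kernel=2, stride=2)             # maxpool -> 16

channels = 64                                     # output channels of conv2
flattened = channels * size * size
print(size, flattened)
```

If the input resolution or the number of pooling stages changes, the first argument of `fc1` has to be recomputed in the same way.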

In [48]:
baseline_cnn = vanilla_cnn(15)
_, baseline_cnn_tr, baseline_cnn_t = train_model(baseline_cnn, num_epochs=10)

Epoch 1
Train Loss: 1.9464068121098457
Test Loss: 1.3930198748906453
Train Accuracy: 0.3888333333333333
Test Accuracy: 0.5736666666666667

Epoch 2
Train Loss: 0.9742066612903107
Test Loss: 0.7541897793610891
Train Accuracy: 0.7063333333333334
Test Accuracy: 0.76

Epoch 3
Train Loss: 0.5693226938552045
Test Loss: 0.4977073259651661
Train Accuracy: 0.816
Test Accuracy: 0.8403333333333334

Epoch 4
Train Loss: 0.39931658956598726
Test Loss: 0.40543611099322635
Train Accuracy: 0.8729166666666667
Test Accuracy: 0.8543333333333333

Epoch 5
Train Loss: 0.2964889812976756
Test Loss: 0.30255187427004177
Train Accuracy: 0.9056666666666666
Test Accuracy: 0.909

Epoch 6
Train Loss: 0.22684275010164748
Test Loss: 0.24922095922132334
Train Accuracy: 0.9291666666666667
Test Accuracy: 0.925

Epoch 7
Train Loss: 0.17697744348898847
Test Loss: 0.25416907481849194
Train Accuracy: 0.94825
Test Accuracy: 0.9223333333333333

Epoch 8
Train Loss: 0.14692367145672758
Test Loss: 0.2045219587162137
Train Accuracy

## Part 4: Transfer Learning Using Pretrained Architectures

### ResNet 18

In [52]:
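# Part 4: ResNet18 adapted to 1-channel input and 15 classes
# Note: models.resnet18() with no `weights` argument is randomly initialized,
# so this run trains the architecture from scratch rather than fine-tuning pretrained weights.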
from torchvision import models 
resnet18 = models.resnet18()
num_classes = 15
resnet18.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
resnet18.fc = nn.Linear(resnet18.fc.in_features, num_classes)

_, resnet18_tr, resnet18_t = train_model(resnet18, num_epochs=10)


Epoch 1
Train Loss: 0.5459261225100528
Test Loss: 0.15133033708358803
Train Accuracy: 0.82175
Test Accuracy: 0.951

Epoch 2
Train Loss: 0.09516149787034126
Test Loss: 1.3060758287707965
Train Accuracy: 0.9704166666666667
Test Accuracy: 0.6793333333333333

Epoch 3
Train Loss: 0.0685760673294042
Test Loss: 0.244675158833464
Train Accuracy: 0.978
Test Accuracy: 0.9253333333333333

Epoch 4
Train Loss: 0.045600317453252194
Test Loss: 0.15023984014987946
Train Accuracy: 0.9855833333333334
Test Accuracy: 0.951

Epoch 5
Train Loss: 0.049547036143733146
Test Loss: 0.1523511977866292
Train Accuracy: 0.9849166666666667
Test Accuracy: 0.961

Epoch 6
Train Loss: 0.024462277624518313
Test Loss: 0.15614262782037258
Train Accuracy: 0.9926666666666667
Test Accuracy: 0.957

Epoch 7
Train Loss: 0.0207369190658086
Test Loss: 0.050020152780537806
Train Accuracy: 0.994
Test Accuracy: 0.9843333333333333

Epoch 8
Train Loss: 0.03634533886872034
Test Loss: 0.15214982318381468
Train Accuracy: 0.9881666666666666

### LeNet-5