# Development of a dog breed classification model 

We have a small dataset covering 133 dog breeds, and the task is to build a classifier for them. Since the dataset is too small to train a model from scratch, we will use transfer learning. Let's test a ResNet-family model and MobileNet: I will compare the accurate but large ResNeXt-101 (32x8d) model with the small and fast MobileNet v3 (Large) model.

## Preprocessing

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


To augment the training data, I use RandomHorizontalFlip and ColorJitter. I chose a batch size of 256, a value that generally gives stable convergence for image classifiers. I use three subsets of the data: one for training, one for validation (to monitor overfitting), and one for testing.

In [None]:
from os.path import join as path
from torchvision.transforms import (Compose, Resize, CenterCrop, ColorJitter,
                                    RandomHorizontalFlip, ToTensor, Normalize)
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

tform = {'train': Compose([
                           Resize(256),
                           CenterCrop(227),
                           RandomHorizontalFlip(),
                           ColorJitter(0.4, 0.4, 0.4),
                           ToTensor(),
                           Normalize((0.485, 0.456, 0.406),
                                     (0.229, 0.224, 0.225))
                           ]),
         'valid': Compose([
                           Resize(256),
                           CenterCrop(227),
                           ToTensor(),
                           Normalize((0.485, 0.456, 0.406),
                                     (0.229, 0.224, 0.225))]),
         'test': Compose([
                           Resize(256),
                           CenterCrop(227),
                           ToTensor(),
                           Normalize((0.485, 0.456, 0.406),
                                     (0.229, 0.224, 0.225))])}

data_dir = '/content/drive/MyDrive/Colab Notebooks/dogImages'
dataset = {x: ImageFolder(path(data_dir, x), tform[x]) for x in tform.keys()}
loaders = {x: DataLoader(dataset=dataset[x],
                         batch_size=256,
                         shuffle=(x=='train'),
                         num_workers=4,
                         pin_memory=True,
                         drop_last=False) for x in dataset.keys()}
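
To illustrate what the `Normalize` step does, the sketch below applies the same per-channel means and standard deviations to a single pixel in plain Python. The helper name `normalize_pixel` and the sample pixels are hypothetical; the real transform applies this arithmetic to whole image tensors.

```python
# Per-channel statistics copied from the transforms above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Apply (x - mean) / std per channel to an RGB pixel scaled to [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(rgb, MEAN, STD))


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(MEAN))
# A pure white pixel ends up a little over two standard deviations above zero.
print(normalize_pixel((1.0, 1.0, 1.0)))
```

After this step the inputs are roughly zero-centred with unit scale, which is the input range the pretrained weights expect.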

I use the cross-entropy loss, the standard choice for multiclass classification, and the Adam optimizer, since it converges well and needs little tuning.

In [None]:
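from torch.nn import CrossEntropyLoss
from torch.optim import Adam

criterion = CrossEntropyLoss()
# Note: `model` must already exist when this line runs, so execute this cell
# after the model-definition cell of the section you are training.
optimizer = Adam(model.parameters(), lr=1e-3)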

## ResNet

I need to replace the output layer of the classifier so that it predicts the 133 breeds. In addition, I add a batch normalization layer in front of it to speed up convergence.

In [None]:
from torch.nn import Sequential, Linear, BatchNorm1d
from torch import cuda
from torchvision.models import resnext101_32x8d

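model = resnext101_32x8d(pretrained=True)

# Freeze the pretrained backbone; only the new head below is trained.
for param in model.parameters():
    param.requires_grad = False

# New head: batch normalization followed by a 133-way linear classifier.
model.fc = Sequential(
    BatchNorm1d(model.fc.in_features),
    Linear(model.fc.in_features, 133, bias=True))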

use_cuda = cuda.is_available()
if use_cuda:
    model = model.cuda()

The model overfits quickly, so I train for only a small number of epochs. With the help of a scheduler, I halve the learning rate at epochs 8, 10 and 11 to get the best result out of this short run.

In [None]:
from torch.optim.lr_scheduler import MultiStepLR

scheduler = MultiStepLR(optimizer, milestones=[8, 10, 11], gamma=0.5)
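
To make the schedule concrete, here is a framework-free sketch of the step-decay arithmetic: each milestone that has been reached multiplies the base rate by `gamma` once. The function name `multistep_lr` is hypothetical, and the sketch assumes the scheduler is stepped once at the end of every epoch, as in the training loop of this notebook.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate in effect during a 0-based epoch under a step schedule."""
    # Each milestone that has been reached multiplies the rate by gamma once.
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


schedule = [multistep_lr(1e-3, [8, 10, 11], 0.5, e) for e in range(12)]
for epoch, lr in enumerate(schedule):
    print(f'epoch index {epoch}: lr = {lr}')
```

Under these assumptions the first eight epochs run at 1e-3, the next two at 5e-4, and the last two at 2.5e-4 and 1.25e-4.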

The train function runs two phases per epoch: training and validation. The loss is printed every 10 batches. The model is saved whenever the validation loss reaches a new minimum.

In [None]:
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
from torch import save, load, set_grad_enabled


def train(model, loaders, criterion, optimizer, epochs):
    min_loss = float('inf')
    for epoch in range(epochs):
        print(f'epoch: {epoch+1}')

        for phase in ['train', 'valid']:
            if phase == 'train':
                model.train()
            elif phase == 'valid':
                model.eval()

            epoch_loss = 0.0
            for idx, (data, cls) in enumerate(loaders[phase]):
                if use_cuda:
                    data = data.cuda()
                    cls = cls.cuda()

                if phase == 'train':
                    optimizer.zero_grad()

                # Track gradients only while training; validation needs none.
                with set_grad_enabled(phase == 'train'):
                    loss = criterion(model(data), cls)

                # Weight the batch-mean loss by the batch size so the epoch
                # mean stays correct when the last batch is smaller.
                epoch_loss += loss.item() * data.size(0)
                if not idx % 10:
                    print(loss.item())

                if phase == 'train':
                    loss.backward()
                    optimizer.step()

            epoch_loss /= len(loaders[phase].dataset)

            print(f'epoch {epoch+1}: {phase} phase is completed. mean loss: {epoch_loss}')

        scheduler.step()

        if min_loss > epoch_loss:
            save(model.state_dict(), 'resnet_model.pt')
            min_loss = epoch_loss
            print(f'minimum loss: {min_loss}. Model saved')
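
The loop above multiplies each batch loss by the batch size before dividing by the dataset size. The sketch below shows why, using made-up numbers: with `drop_last=False` the final batch can be much smaller, and a plain average over batches would give it too much weight. The function name `epoch_mean_loss` is hypothetical.

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Per-sample mean loss from per-batch mean losses and batch sizes."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Hypothetical numbers: two full batches of 256 and a final batch of 8.
losses = [0.5, 0.3, 2.0]
sizes = [256, 256, 8]

weighted = epoch_mean_loss(losses, sizes)
naive = sum(losses) / len(losses)
print(f'weighted: {weighted:.4f}, naive: {naive:.4f}')
```

In this example the naive average is dominated by the eight samples of the last batch, while the weighted mean reflects the per-sample loss.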

Let's start the training process.

In [None]:
epochs = 12

train(model, loaders, criterion, optimizer, epochs)

epoch: 1
5.176853179931641
1.0089534521102905
0.5885645747184753
epoch 1: train phase is completed. mean loss: 1.4243048998410117
0.3846698999404907
epoch 1: valid phase is completed. mean loss: 0.45516235421517653
minimum loss: 0.45516235421517653. Model saved
epoch: 2
0.2627578675746918
0.18864288926124573
0.24788738787174225
epoch 2: train phase is completed. mean loss: 0.2520779275965548
0.2616994380950928
epoch 2: valid phase is completed. mean loss: 0.32592301682797736
minimum loss: 0.32592301682797736. Model saved
epoch: 3
0.18819408118724823
0.15649135410785675
0.20243898034095764
epoch 3: train phase is completed. mean loss: 0.16111699854899308
0.2419833093881607
epoch 3: valid phase is completed. mean loss: 0.296657872271395
minimum loss: 0.296657872271395. Model saved
epoch: 4
0.10141529142856598
0.1573948711156845
0.12641480565071106
epoch 4: train phase is completed. mean loss: 0.12540101578492605
0.23808136582374573
epoch 4: valid phase is completed. mean loss: 0.31017033

Testing the classifier.

In [None]:
from torch import save, load, no_grad
from torch import cuda

def test(model, loaders, criterion):
    model.eval()
    running_loss = 0.0
    accuracy = 0.0
    total = 0
    # Gradients are not needed for evaluation.
    with no_grad():
        for data, cls in loaders['test']:
            if use_cuda:
                data = data.cuda()
                cls = cls.cuda()

            pred = model(data)
            loss = criterion(pred, cls)
            pred = pred.argmax(dim=1)  # index of the highest-scoring class

            running_loss += loss.item() * data.size(0)
            accuracy += (pred == cls).sum()  # number of correct predictions
            total += data.size(0)

    running_loss /= len(loaders['test'].dataset)
    accuracy /= total
    print(f'loss: {running_loss}, accuracy: {accuracy*100}%')


params = load('resnet_model.pt', map_location='cuda' if use_cuda else 'cpu')
model.load_state_dict(params)

test(model, loaders, criterion)
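
For reference, the accuracy reported by `test` is the fraction of samples whose highest-scoring class equals the label. A plain-Python sketch of that computation follows; the scores and labels are made up, and the helper names `argmax` and `accuracy` are hypothetical.

```python
def argmax(scores):
    """Index of the largest value in a list of scores."""
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(score_rows, labels):
    """Fraction of rows whose top-scoring class matches the label."""
    correct = sum(argmax(row) == label
                  for row, label in zip(score_rows, labels))
    return correct / len(labels)


# Hypothetical scores for four samples and three classes.
scores = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3], [0.0, 0.1, 0.9], [0.4, 0.3, 0.2]]
labels = [1, 0, 2, 1]
print(accuracy(scores, labels))  # 3 of 4 correct -> 0.75
```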

loss: 0.30240592403274974, accuracy: 91.02870178222656%


## MobileNet

Now let's build the MobileNet classifier. I replace the Dropout layer with a batch normalization layer to speed up convergence. I found that, for this task, initializing the last layer with zeros works best.

In [2]:
from torchvision.models import mobilenet_v3_large

model = mobilenet_v3_large(pretrained=True)

for param in model.parameters():
    param.requires_grad = False

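# classifier[2] is the Dropout layer; classifier[3] is the output layer.
model.classifier[2] = BatchNorm1d(1280)
model.classifier[3] = Linear(1280, 133, bias=True)

# Zero-initialize the output layer.
model.classifier[3].weight.data.fill_(0.0)
model.classifier[3].bias.data.fill_(0.0)

use_cuda = cuda.is_available()
if use_cuda:
    model = model.cuda()

# The optimizer must track the new model's parameters, so re-create it here.
optimizer = Adam(model.parameters(), lr=1e-3)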
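
One consequence of the zero initialization can be checked by hand: a zero-initialized output layer produces identical logits for every class, so the first predictions are uniform and the cross-entropy equals ln(133). The plain-Python sketch below works this out; the helper name `cross_entropy` is hypothetical and handles a single sample only.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


num_classes = 133
zero_logits = [0.0] * num_classes  # what a zero-initialized layer outputs
loss = cross_entropy(zero_logits, target=7)  # the target index is arbitrary
print(loss, math.log(num_classes))
```

Both printed values are about 4.8903, which agrees with the first training loss logged for MobileNet in this notebook.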

Here I use a training loop with early stopping. The model with batch normalization overfits quickly, so when an epoch fails to improve the minimum validation loss, the loop halves the learning rate, and after a second such epoch in a row it stops.

In [None]:
epochs = 100

from torch import set_grad_enabled

def train(model, loaders, criterion, optimizer, epochs):
    min_loss = float('inf')
    useless_epochs = 0  # consecutive epochs without a new minimum loss
    default_lr = optimizer.defaults['lr']

    for epoch in range(epochs):
        print(f'epoch: {epoch+1}')
        # The learning rate actually used lives in param_groups, not defaults.
        print('learning rate: {}'.format(optimizer.param_groups[0]['lr']))

        for phase in ['train', 'valid']:
            if phase == 'train':
                model.train()
            elif phase == 'valid':
                model.eval()

            epoch_loss = 0.0
            for idx, (data, cls) in enumerate(loaders[phase]):
                if use_cuda:
                    data = data.cuda()
                    cls = cls.cuda()

                if phase == 'train':
                    optimizer.zero_grad()

                # Track gradients only while training; validation needs none.
                with set_grad_enabled(phase == 'train'):
                    loss = criterion(model(data), cls)

                epoch_loss += loss.item() * data.size(0)
                if not idx % 10:
                    print(loss.item())

                if phase == 'train':
                    loss.backward()
                    optimizer.step()

            epoch_loss /= len(loaders[phase].dataset)

            print(f'epoch {epoch+1}: {phase} phase is completed. mean loss: {epoch_loss}')

        useless_epochs += 1

        if min_loss > epoch_loss:
            save(model.state_dict(), 'mobilenet_model.pt')
            min_loss = epoch_loss
            print(f'minimum loss: {min_loss}. Model saved')
            useless_epochs = 0

        if useless_epochs > 0:
            if useless_epochs >= 2:
                # Second epoch in a row without improvement: restore the
                # initial learning rate and stop early.
                for group in optimizer.param_groups:
                    group['lr'] = default_lr
                break

            # First epoch without improvement: halve the learning rate.
            for group in optimizer.param_groups:
                group['lr'] *= 0.5

train(model, loaders, criterion, optimizer, epochs)
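
The stopping rule is easier to follow when replayed on a list of validation losses. The sketch below does that in plain Python: a new minimum saves the model, the first epoch without improvement halves the learning rate, and the second one in a row stops training. The function name `simulate_policy` and the loss values are hypothetical and not taken from any run in this notebook; unlike the training loop, the sketch does not restore the initial rate when it stops.

```python
def simulate_policy(valid_losses, lr=1e-3, factor=0.5, patience=2):
    """Replay the halve-then-stop rule on a list of validation losses."""
    min_loss = float('inf')
    bad_epochs = 0  # consecutive epochs without a new minimum
    history = []    # (epoch number, learning rate used, action taken)
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < min_loss:
            min_loss = loss
            bad_epochs = 0
            history.append((epoch, lr, 'saved'))
            continue
        bad_epochs += 1
        if bad_epochs >= patience:
            history.append((epoch, lr, 'stopped'))
            break
        history.append((epoch, lr, 'halved'))
        lr *= factor
    return history


# Hypothetical validation losses.
for step in simulate_policy([1.0, 0.8, 0.7, 0.72, 0.69, 0.71, 0.73]):
    print(step)
```

Note that an improvement resets the counter but keeps the already-halved learning rate, so the rate can only shrink during a run.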

epoch: 1
learning rate: 0.001
4.890349864959717
2.933051347732544
2.005187511444092
epoch 1: train phase is completed. mean loss: 2.897489925344547
0.8594317436218262
epoch 1: valid phase is completed. mean loss: 1.14082393132284
minimum loss: 1.14082393132284. Model saved
epoch: 2
learning rate: 0.001
1.2356833219528198
1.0186142921447754
0.9835569858551025
epoch 2: train phase is completed. mean loss: 1.0377937084186577
0.5307849645614624
epoch 2: valid phase is completed. mean loss: 0.7435367631341169
minimum loss: 0.7435367631341169. Model saved
epoch: 3
learning rate: 0.001
0.7315253019332886
0.683749258518219
0.6761021614074707
epoch 3: train phase is completed. mean loss: 0.6562500330502402
0.45249080657958984
epoch 3: valid phase is completed. mean loss: 0.651704239060065
minimum loss: 0.651704239060065. Model saved
epoch: 4
learning rate: 0.001
0.49228665232658386
0.4172075092792511
0.40234169363975525
epoch 4: train phase is completed. mean loss: 0.47058061189994127
0.4230929

Testing the model 

In [None]:
# test() is reused unchanged from the ResNet section.


params = load('mobilenet_model.pt', map_location='cuda' if use_cuda else 'cpu')
model.load_state_dict(params)

test(model, loaders, criterion)

loss: 0.6637992391175631, accuracy: 82.29664611816406%


## Conclusion

Both models trained quickly. Because the dataset is small, they also overfit quickly; the learning-rate strategies used during training limited this tendency. The resulting models reach test accuracies of 91.0% (ResNeXt-101) and 82.3% (MobileNet v3 Large), with model sizes of 333 MiB and 17 MiB, respectively. MobileNet is therefore preferable when speed and memory matter, at the cost of roughly 9 percentage points of accuracy.
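
As a rough sanity check on the model sizes, the sketch below estimates them from parameter counts, assuming 4 bytes per float32 parameter. The two stock parameter counts are the values I believe torchvision documents for the 1000-class models; treat them as assumptions and verify them against your installed version. All helper names are hypothetical.

```python
MIB = 2 ** 20


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    """Learnable scale and shift of a batch normalization layer."""
    return 2 * n_features


def estimated_size_mib(stock_params, old_head, new_head, bytes_per_param=4):
    """Size estimate after swapping the stock head for a new one."""
    return (stock_params - old_head + new_head) * bytes_per_param / MIB


# Assumed parameter counts of the stock 1000-class torchvision models.
RESNEXT101_32X8D_PARAMS = 88_791_336
MOBILENET_V3_LARGE_PARAMS = 5_483_032

resnext_mib = estimated_size_mib(
    RESNEXT101_32X8D_PARAMS,
    old_head=linear_params(2048, 1000),
    new_head=batchnorm_params(2048) + linear_params(2048, 133))
mobilenet_mib = estimated_size_mib(
    MOBILENET_V3_LARGE_PARAMS,
    old_head=linear_params(1280, 1000),
    new_head=batchnorm_params(1280) + linear_params(1280, 133))
print(f'{resnext_mib:.1f} MiB, {mobilenet_mib:.1f} MiB')
```

Under these assumptions the estimates land close to the 333 MiB and 17 MiB quoted above; the small remainder is plausibly non-learnable buffers (such as batch normalization running statistics) and file-format overhead.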