<a href="https://www.kaggle.com/code/marinabalakina/dll30-dz4-2?scriptVersionId=155042589" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Homework: Convolutional Network Architectures

Goal: learn to work with ready-made models from torchvision.

Context

You need to choose a base model for your task. You try training several models on "your" data and, based on the results, select the best one for further training.

Assignment

Run initial-training experiments with several models and compare the results.

1. Take the EMNIST dataset from torchvision.

2. Train the following models on it: ResNet 18, VGG 16, Inception v3, DenseNet 161 (from scratch, 10 epochs each).

3. Collect the training results (loss curves) in a table and compare them.


Bonus assignment*

* Do the same assignment using the hymenoptera_data dataset.

Instructions

* Load the dataset, look at sample images, and check which classes are present and how imbalanced they are.

* Create a model of the current type for the required number of classes using the torchvision interface.

* Train the model from scratch for 10 epochs. Record the loss values in a list for later plotting.

Repeat steps 2 and 3 for each of the listed models.

Submission format

Attach a link to the finished solution in your personal account. The work can be submitted as a link to a Python notebook on GitHub, Google Colaboratory, or a similar platform. Do not forget to enable view and comment access.

Grading criteria

On completing the assignment you will receive a pass.

The assignment counts as completed if:

* you trained each model until its quality showed some improvement

* a comparison table of the training results has been compiled

The assignment will be returned for revision if:

* not all of the model types were used

* no summary table of the results has been compiled

# Imports and helper functions

In [1]:
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pandas as pd

In [2]:
import torch
from torch import nn
import torchvision as tv  # popular datasets, model architectures, and common image transformations for computer vision; used here for the ready-made networks
import time

In [3]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [4]:
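def evaluate_accuracy(data_iter, net):
    acc_sum, n = 0, 0
    net.eval()  # disable dropout and use the running batch-norm statistics
    with torch.no_grad():  # gradients are not needed for evaluation
        for X, y in data_iter:
            X, y = X.to(device), y.to(device)
            acc_sum += (net(X).argmax(dim=1) == y).sum().item()
            n += y.shape[0]
    return acc_sum / n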

In [5]:
def train(net, train_iter, test_iter, trainer, num_epochs):
    net.to(device)
    loss = nn.CrossEntropyLoss(reduction='sum')
    train_accuracy, train_losses, test_accuracy = [], [], []
    for epoch in range(num_epochs):
        net.train()  # evaluate_accuracy() leaves the model in eval mode, so training mode is restored every epoch
        train_l_sum, train_acc_sum, n, start = 0.0, 0.0, 0, time.time()

        for i, (X, y) in enumerate(train_iter):
            X, y = X.to(device), y.to(device)
            trainer.zero_grad()
            y_hat = net(X)
            l = loss(y_hat, y)
            l.backward()
            trainer.step()
            train_l_sum += l.item()
            train_acc_sum += (y_hat.argmax(axis=1) == y).sum().item()
            n += y.shape[0]

            if i % 10 == 0:
                print(f"Step {i}. time since epoch: {time.time() - start:.3f}. "
                      f"Train acc: {train_acc_sum / n:.3f}. Train Loss: {train_l_sum / n:.3f}")
        test_acc = evaluate_accuracy(test_iter, net)
        print('-' * 20)
        print(f'epoch {epoch + 1}, loss {train_l_sum / n:.4f}, train acc {train_acc_sum / n:.3f}'
              f', test acc {test_acc:.3f}, time {time.time() - start:.1f} sec')
        train_accuracy.append(train_acc_sum / n)
        train_losses.append(train_l_sum / n)
        test_accuracy.append(test_acc)
    return train_accuracy, train_losses, test_accuracy
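
`train` uses `reduction='sum'` and divides the accumulated loss by the number of samples seen, so the reported value is a per-sample average. The sketch below checks that equivalence on random stand-in tensors (the tensors and their sizes are illustrative assumptions, not data from the notebook).

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(8, 10)            # stand-in for net(X) on a batch of 8
targets = torch.randint(0, 10, (8,))   # stand-in for the labels y

sum_loss = nn.CrossEntropyLoss(reduction='sum')(logits, targets)
mean_loss = nn.CrossEntropyLoss(reduction='mean')(logits, targets)

# Dividing the summed loss by the number of samples gives the per-sample mean.
per_sample = sum_loss.item() / targets.shape[0]
print(abs(per_sample - mean_loss.item()) < 1e-5)
```

One consequence of the summed loss is that the gradient magnitude grows with the batch size, which is worth keeping in mind when comparing learning rates across batch sizes.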

In [6]:
BATCH_SIZE = 256
# Resize the images to 224x224 and convert them to tensors
transforms = tv.transforms.Compose([
    tv.transforms.Resize((224, 224)),
    tv.transforms.ToTensor()
])
train_dataset = tv.datasets.EMNIST('.', split='mnist', train=True, transform=transforms, download=True)
test_dataset = tv.datasets.EMNIST('.', split='mnist', train=False, transform=transforms, download=True)
train_iter = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE)
test_iter = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE)

## 1.1. Visualization

In [None]:
train_dataset.classes

In [None]:
# Labels are shown as the ASCII codes of the digits: '48' is the character '0', ..., '57' is '9'
labels_map = {
    0: '48',
    1: '49',
    2: '50',
    3: '51',
    4: '52',
    5: '53',
    6: '54',
    7: '55',
    8: '56',
    9: '57',
}

figure = plt.figure(figsize=(10, 10))
cols, rows = 3, 3

for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(train_dataset), size=(1,)).item()
    image, label = train_dataset[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(labels_map[label])
    plt.axis('off')
    plt.imshow(image.squeeze(), cmap='gray')
plt.show()

In [None]:
# Count the number of samples in each class
targets = train_dataset.targets
counts = torch.bincount(targets)
for i, count in enumerate(counts):
    print(f"{i}: {count}")

The classes are balanced.
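
To put a number on the balance check, the ratio between the largest and smallest class counts can be computed from the same `torch.bincount` output. This is a minimal sketch: `imbalance_ratio` is a hypothetical helper and the toy targets are made up for illustration, they are not EMNIST labels.

```python
import torch

# Hypothetical helper: ratio of the most frequent to the least frequent class.
def imbalance_ratio(targets: torch.Tensor) -> float:
    counts = torch.bincount(targets)
    counts = counts[counts > 0]          # ignore label values that never occur
    return (counts.max() / counts.min()).item()

toy_targets = torch.tensor([0, 0, 1, 1, 2, 2, 2, 2])
print(imbalance_ratio(toy_targets))  # 2.0
```

A ratio of 1.0 means perfectly balanced classes; on the notebook's data it could be applied as `imbalance_ratio(train_dataset.targets)`.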

# 2. Training the models

## 2.1. ResNet 18

In [None]:
classification_models = tv.models.list_models(module=tv.models)

In [None]:
classification_models

In [None]:
# The pretrained models in torchvision.models were trained on (224, 224, 3) images, so the datasets have to be re-transformed accordingly
transforms = tv.transforms.Compose([
    tv.transforms.Grayscale(3),
    tv.transforms.Resize((224, 224)),
    tv.transforms.ToTensor()
])

train_dataset = tv.datasets.MNIST('.', train=True, transform=transforms, download=True)
test_dataset = tv.datasets.MNIST('.', train=False, transform=transforms, download=True)
train_iter = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE)
test_iter = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE)

In [None]:
model = tv.models.resnet18(weights=tv.models.ResNet18_Weights.IMAGENET1K_V1)  # replaces the deprecated pretrained=True

In [None]:
model

In [None]:
# Freeze the pretrained parameters (no gradients are computed for them):
for param in model.parameters():
    param.requires_grad = False

In [None]:
model.fc

In [None]:
model.fc = nn.Linear(in_features=512, out_features=10)

In [None]:
print("Params to learn:")
params_to_update = []
for name, param in model.named_parameters():
    if param.requires_grad:
        params_to_update.append(param)
        print("\t",name)

In [None]:
trainer = torch.optim.Adam(params_to_update, lr=0.001)

In [None]:
train_accuracy, train_losses, test_accuracy = train(model, train_iter, test_iter, trainer, 10)

In [None]:
df_results = pd.DataFrame(columns=['model', 'train_accuracy', 'train_loss', 'test_accuracy', 'epoch'])
for i in range(len(train_losses)):
    df_results.loc[len(df_results.index)] = ['resnet18', train_accuracy[i], train_losses[i], test_accuracy[i], i]

In [None]:
df_results

In [None]:
df_results.to_csv('resnet.csv', index=False)

## 2.2. VGG16

In [None]:
torch.cuda.empty_cache()

In [None]:
BATCH_SIZE = 256
# Resize the images to 224x224 and convert them to tensors
transforms = tv.transforms.Compose([
    tv.transforms.Resize((224, 224)),
    tv.transforms.ToTensor()
])
train_dataset = tv.datasets.EMNIST('.', split='mnist', train=True, transform=transforms, download=True)
test_dataset = tv.datasets.EMNIST('.', split='mnist', train=False, transform=transforms, download=True)
train_iter = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE)
test_iter = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE)

In [7]:
def vgg_block(num_convs, input_channels, num_channels):
    blk = nn.Sequential(nn.Conv2d(input_channels, num_channels, kernel_size=3, padding=1), nn.ReLU())
    for i in range(num_convs - 1):
        blk.add_module("conv{}".format(i), nn.Conv2d(num_channels, num_channels, kernel_size=3, padding=1))
        blk.add_module("relu{}".format(i), nn.ReLU())
    blk.add_module("pool", nn.MaxPool2d(2, stride=2))
    return blk

In [8]:
def vgg(conv_arch):
    net = nn.Sequential()

    for i, (num_convs, input_ch, num_channels) in enumerate(conv_arch):
        net.add_module("block{}".format(i), vgg_block(num_convs, input_ch, num_channels))


    classifier = nn.Sequential(
        nn.Flatten(),

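        # Full-width network (512 channels * 7 * 7): nn.Linear(25088, 4096), nn.ReLU(), nn.Dropout(0.5),
        # Network reduced by ratio = 4 (128 channels * 7 * 7 = 6272):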
        nn.Linear(6272, 4096), nn.ReLU(), nn.Dropout(0.5),

        nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),
        nn.Linear(4096, 10))

    net.add_module('classifier', classifier)
    return net

In [9]:
conv_arch = ((1, 1, 64), (1, 64, 128), (2, 128, 256), (2, 256, 512), (2, 512, 512))

In [10]:
ratio = 4
small_conv_arch = [(v[0], max(v[1] // ratio, 1), v[2] // ratio) for v in conv_arch]
net = vgg(small_conv_arch)
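
The input width of the first linear layer (6272) follows from the architecture: every block ends with a 2x2 max pool that halves the spatial size, the 3x3 convolutions with `padding=1` preserve it, and the last block has `512 // ratio` channels. A short sketch of that arithmetic (`flatten_features` is a hypothetical helper, and the 224x224 input size is taken from the transforms above):

```python
conv_arch = ((1, 1, 64), (1, 64, 128), (2, 128, 256), (2, 256, 512), (2, 512, 512))
ratio = 4
input_size = 224

# Hypothetical helper: number of features entering the classifier after nn.Flatten().
def flatten_features(conv_arch, ratio, input_size):
    side = input_size // (2 ** len(conv_arch))   # five poolings: 224 -> 7
    channels = conv_arch[-1][2] // ratio         # channels of the last block
    return channels * side * side

print(flatten_features(conv_arch, ratio, input_size))  # 6272
print(flatten_features(conv_arch, 1, input_size))      # 25088
```

The second value, 25088, is the width the classifier would need if the channel counts were not reduced.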

In [11]:
lr, num_epochs = 0.001, 10
trainer = torch.optim.Adam(net.parameters(), lr=lr)

In [12]:
train_accuracy, train_losses, test_accuracy  = train(net, train_iter, test_iter, trainer, num_epochs)

Step 0. time since epoch: 1.671. Train acc: 0.098. Train Loss: 2.302
Step 10. time since epoch: 5.247. Train acc: 0.096. Train Loss: 2.307
Step 20. time since epoch: 8.833. Train acc: 0.100. Train Loss: 2.305
Step 30. time since epoch: 12.392. Train acc: 0.116. Train Loss: 2.284
Step 40. time since epoch: 15.940. Train acc: 0.222. Train Loss: 2.066
Step 50. time since epoch: 19.526. Train acc: 0.332. Train Loss: 1.788
Step 60. time since epoch: 23.118. Train acc: 0.422. Train Loss: 1.556
Step 70. time since epoch: 26.701. Train acc: 0.492. Train Loss: 1.374
Step 80. time since epoch: 30.248. Train acc: 0.547. Train Loss: 1.232
Step 90. time since epoch: 33.834. Train acc: 0.591. Train Loss: 1.113
Step 100. time since epoch: 37.465. Train acc: 0.627. Train Loss: 1.019
Step 110. time since epoch: 41.080. Train acc: 0.657. Train Loss: 0.939
Step 120. time since epoch: 44.652. Train acc: 0.683. Train Loss: 0.870
Step 130. time since epoch: 48.245. Train acc: 0.705. Train Loss: 0.812
Step 1

In [13]:
df_results = pd.DataFrame(columns=['model', 'train_accuracy', 'train_loss', 'test_accuracy', 'epoch'])
for i in range(len(train_losses)):
    df_results.loc[len(df_results.index)] = ['vgg16', train_accuracy[i], train_losses[i], test_accuracy[i], i]

In [15]:
df_results

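|   | model | train_accuracy | train_loss | test_accuracy | epoch |
|---|-------|----------------|------------|---------------|-------|
| 0 | vgg16 | 0.823417 | 0.491901 | 0.9816 | 0 |
| 1 | vgg16 | 0.985533 | 0.048897 | 0.9875 | 1 |
| 2 | vgg16 | 0.99045 | 0.031727 | 0.9901 | 2 |
| 3 | vgg16 | 0.992567 | 0.024981 | 0.9918 | 3 |
| 4 | vgg16 | 0.994383 | 0.019147 | 0.9923 | 4 |
| 5 | vgg16 | 0.995317 | 0.015705 | 0.9911 | 5 |
| 6 | vgg16 | 0.996283 | 0.012776 | 0.9911 | 6 |
| 7 | vgg16 | 0.9958 | 0.012743 | 0.9921 | 7 |
| 8 | vgg16 | 0.996333 | 0.011288 | 0.9871 | 8 |
| 9 | vgg16 | 0.997017 | 0.009972 | 0.9912 | 9 |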
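
The per-epoch results can be queried directly with pandas, for instance to find the epoch with the best test accuracy. The sketch below re-enters the `train_loss` and `test_accuracy` columns from the output above by hand; in the notebook itself the same lines could run on `df_results`.

```python
import pandas as pd

# Values copied from the vgg16 results printed above.
test_accuracy = [0.9816, 0.9875, 0.9901, 0.9918, 0.9923,
                 0.9911, 0.9911, 0.9921, 0.9871, 0.9912]
train_loss = [0.491901, 0.048897, 0.031727, 0.024981, 0.019147,
              0.015705, 0.012776, 0.012743, 0.011288, 0.009972]
results = pd.DataFrame({"epoch": range(10),
                        "train_loss": train_loss,
                        "test_accuracy": test_accuracy})

# Row with the highest test accuracy.
best = results.loc[results["test_accuracy"].idxmax()]
print(int(best["epoch"]), best["test_accuracy"])  # 4 0.9923
```

The training loss keeps falling through epoch 9 while the test accuracy peaks at epoch 4 and then fluctuates, which is why the two columns are worth reading together.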


In [16]:
df_results.to_csv('vgg16.csv', index=False)

## 2.3. Inception v3

In [None]:
torch.cuda.empty_cache()
model = tv.models.inception_v3(weights=tv.models.Inception_V3_Weights.IMAGENET1K_V1)  # replaces the deprecated pretrained=True

In [None]:
model

In [None]:
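# Freeze the pretrained parameters (no gradients are computed for them):
for param in model.parameters():
    param.requires_grad = False
# New trainable 10-class head; the auxiliary head is dropped so that train() receives a plain tensor
model.fc = nn.Linear(model.fc.in_features, 10)
model.aux_logits, model.AuxLogits = False, None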

In [None]:
print("Params to learn:")
params_to_update = []
for name, param in model.named_parameters():
    if param.requires_grad:
        params_to_update.append(param)
        print("\t",name)

In [None]:
lr, num_epochs = 0.001, 1
trainer = torch.optim.Adam(params_to_update, lr=lr)
train_accuracy, train_losses, test_accuracy  = train(model, train_iter, test_iter, trainer, num_epochs)

In [None]:
for i in range(len(train_losses)):  # num_epochs is 1 above, so a hard-coded range(10) would raise IndexError
    df_results.loc[len(df_results.index)] = ['inception_v3', train_accuracy[i], train_losses[i], test_accuracy[i], i]

## 2.4. DenseNet 161

In [None]:
torch.cuda.empty_cache()

In [None]:
model = tv.models.densenet161(weights=tv.models.DenseNet161_Weights.IMAGENET1K_V1)  # replaces the deprecated pretrained=True

In [None]:
model

In [None]:
# Freeze the pretrained parameters (no gradients are computed for them):
for param in model.parameters():
    param.requires_grad = False

In [None]:
model.classifier

In [None]:
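model.classifier = nn.Linear(model.classifier.in_features, 10)  # new trainable 10-class head; without it nothing is left to optimize
print("Params to learn:")
params_to_update = []
for name, param in model.named_parameters():
    if param.requires_grad:
        params_to_update.append(param)
        print("\t", name)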

In [None]:
trainer = torch.optim.Adam(params_to_update, lr=0.001)
train_accuracy, train_losses, test_accuracy  = train(model, train_iter, test_iter, trainer, num_epochs)

In [None]:
for i in range(len(train_losses)):  # iterate over the epochs that were actually run
    df_results.loc[len(df_results.index)] = ['densenet161', train_accuracy[i], train_losses[i], test_accuracy[i], i]
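
With every model's rows in one long-format frame, the comparison table the assignment asks for could be produced with a pivot. The sketch below is one possible way to do it: `summarize` is a hypothetical helper, and the demo numbers are placeholders for illustration only, not measured results.

```python
import pandas as pd

# Hypothetical helper: one row per epoch, one column per model, values are training losses.
def summarize(df_results: pd.DataFrame) -> pd.DataFrame:
    return df_results.pivot(index="epoch", columns="model", values="train_loss")

# Placeholder numbers for illustration only; they are not measured results.
demo = pd.DataFrame({
    "model": ["model_a", "model_a", "model_b", "model_b"],
    "epoch": [0, 1, 0, 1],
    "train_loss": [1.0, 0.5, 2.0, 1.5],
})
table = summarize(demo)
print(table)
```

Calling `summarize(df_results).plot()` on the real results would draw one loss curve per model on shared axes, which covers the "loss curves" part of the comparison.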