# Homework: Convolutional Network Architectures

**Context**

Choose a baseline model for your task. Try training several different models on your own data, then pick the best-performing one for further training.

**Assignment**

Run initial training experiments with several models and compare the results.

1. Take the EMNIST dataset from torchvision.
2. Train the following models on it: ResNet 18, VGG 16, Inception v3, DenseNet 161:
    1) preferably, train each model from scratch for 10 epochs
    2) if your computer or Colab does not have enough resources, training each model for 1-2 epochs is sufficient
3. Collect the training results of the models (loss curves) in a table and compare them.

**Optional assignment**

Complete the same assignment using the [hymenoptera_data](https://www.kaggle.com/datasets/ajayrana/hymenoptera-data/code) dataset.

In [1]:
import torch
from torch import nn
import torchvision as tv
from torchsummary import summary
import pandas as pd
import time

In [2]:
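def evaluate_accuracy(data_iter, net):
    """Return the fraction of correctly classified samples in data_iter."""
    acc_sum, n = 0, 0
    net.eval()  # switch BatchNorm/Dropout layers to inference behaviour
    with torch.no_grad():  # gradients are not needed for evaluation
        for X, y in data_iter:
            X, y = X.to(device), y.to(device)
            acc_sum += (net(X).argmax(dim=1) == y).sum().item()
            n += y.shape[0]
    return acc_sum / n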

In [3]:
def train(net, train_iter, test_iter, trainer, num_epochs):
    net.to(device)
    loss = nn.CrossEntropyLoss(reduction='sum')
    train_accuracy, train_losses, test_accuracy = [], [], []
    for epoch in range(num_epochs):
        net.train()  # evaluate_accuracy() leaves the net in eval mode, so re-enable training mode every epoch
        train_l_sum, train_acc_sum, n, start = 0.0, 0.0, 0, time.time()

        for i, (X, y) in enumerate(train_iter):
            X, y = X.to(device), y.to(device)
            trainer.zero_grad()
            y_hat = net(X)
            l = loss(y_hat, y)
            l.backward()
            trainer.step()
            train_l_sum += l.item()
            train_acc_sum += (y_hat.argmax(axis=1) == y).sum().item()
            n += y.shape[0]

            if i % 100 == 0:
                print(f"Step {i}. time since epoch: {time.time() - start:.3f}. "
                      f"Train acc: {train_acc_sum / n:.3f}. Train Loss: {train_l_sum / n:.3f}")
        test_acc = evaluate_accuracy(test_iter, net)
        train_accuracy.append(train_acc_sum / n)
        train_losses.append(train_l_sum / n)
        test_accuracy.append(test_acc)
        print('-' * 20)
        print(f'epoch {epoch + 1}, loss {train_l_sum / n:.4f}, train acc {train_acc_sum / n:.3f}'
              f', test acc {test_acc:.3f}, time {time.time() - start:.1f} sec')
    return train_accuracy, train_losses, test_accuracy
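
`train` uses `nn.CrossEntropyLoss(reduction='sum')` and divides the accumulated sum by the number of samples seen, so the reported loss is a per-sample mean even when the last batch is smaller than the rest. The sketch below illustrates how this differs from averaging per-batch means. The per-sample losses and variable names are made up for illustration and are not taken from the notebook.

```python
# Hypothetical per-sample losses for 5 samples, split into batches of size 2
# (the last batch is smaller, as happens when len(dataset) % batch_size != 0).
losses = [1.0, 3.0, 2.0, 2.0, 7.0]
batch_size = 2
batches = [losses[i:i + batch_size] for i in range(0, len(losses), batch_size)]

# Approach used in train(): accumulate batch sums, divide by the sample count.
loss_sum, n = 0.0, 0
for batch in batches:
    loss_sum += sum(batch)   # what reduction='sum' yields for one batch
    n += len(batch)
mean_from_sums = loss_sum / n

# Alternative: average the per-batch means. The short last batch is
# over-weighted, so the result drifts away from the true per-sample mean.
batch_means = [sum(b) / len(b) for b in batches]
mean_of_means = sum(batch_means) / len(batch_means)

print(mean_from_sums)   # 3.0
print(mean_of_means)
```

One side effect worth keeping in mind: with `reduction='sum'` the gradient magnitude also scales with the batch size.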

In [4]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [5]:
BATCH_SIZE = 32
# torchvision classification models expect 3-channel input, so the
# single-channel EMNIST images are replicated to 3 channels and upscaled.
transforms = tv.transforms.Compose([
    tv.transforms.Grayscale(3),
    tv.transforms.Resize((224, 224)),
    tv.transforms.ToTensor()
])
train_dataset = tv.datasets.EMNIST('.', split='mnist', train=True, transform=transforms, download=True)
test_dataset = tv.datasets.EMNIST('.', split='mnist', train=False, transform=transforms, download=True)
train_iter = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE)
test_iter = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE)

Downloading https://www.itl.nist.gov/iaui/vip/cs_links/EMNIST/gzip.zip to ./EMNIST/raw/gzip.zip


100%|██████████| 561753746/561753746 [00:35<00:00, 15826331.97it/s]


Extracting ./EMNIST/raw/gzip.zip to ./EMNIST/raw
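
For orientation when reading the `Step i` lines in the training logs, the number of optimizer steps per epoch follows from the dataset size and `BATCH_SIZE`. This small sketch assumes the `mnist` split of EMNIST contains 60,000 training and 10,000 test images; the helper name is illustrative.

```python
import math

BATCH_SIZE = 32                      # same value as in the notebook
n_train, n_test = 60_000, 10_000     # assumed sizes of the EMNIST 'mnist' split

def steps_per_epoch(n_samples, batch_size):
    """Number of batches a DataLoader yields when drop_last=False."""
    return math.ceil(n_samples / batch_size)

train_steps = steps_per_epoch(n_train, BATCH_SIZE)
test_steps = steps_per_epoch(n_test, BATCH_SIZE)
print(train_steps, test_steps)
```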


## ResNet 18

In [6]:
model = tv.models.resnet18(weights=None)  # random init; `weights=` replaces the deprecated `pretrained=` flag (torchvision >= 0.13)



In [7]:
model

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

In [8]:
summary(model.to(device), input_size=(3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
            Conv2d-5           [-1, 64, 56, 56]          36,864
       BatchNorm2d-6           [-1, 64, 56, 56]             128
              ReLU-7           [-1, 64, 56, 56]               0
            Conv2d-8           [-1, 64, 56, 56]          36,864
       BatchNorm2d-9           [-1, 64, 56, 56]             128
             ReLU-10           [-1, 64, 56, 56]               0
       BasicBlock-11           [-1, 64, 56, 56]               0
           Conv2d-12           [-1, 64, 56, 56]          36,864
      BatchNorm2d-13           [-1, 64, 56, 56]             128
             ReLU-14           [-1, 64,
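
The `Param #` column can be checked by hand. The sketch below recomputes the first few rows of the ResNet 18 summary from the layer shapes. The helper functions are illustrative and not part of torchsummary.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Weights: out_ch * in_ch * k * k, plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + (out_ch if bias else 0)

def batchnorm2d_params(channels):
    """Learnable scale (gamma) and shift (beta) per channel."""
    return 2 * channels

# Layer shapes taken from the printed ResNet 18 model and summary.
print(conv2d_params(3, 64, 7, bias=False))    # Conv2d-1
print(batchnorm2d_params(64))                 # BatchNorm2d-2
print(conv2d_params(64, 64, 3, bias=False))   # Conv2d-5
```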

In [9]:
model.fc

Linear(in_features=512, out_features=1000, bias=True)

In [10]:
model.fc = nn.Linear(in_features=512, out_features=10)

In [11]:
trainer = torch.optim.Adam(model.parameters(), lr=0.001)

In [12]:
train_accuracy, train_losses, test_accuracy = train(model, train_iter, test_iter, trainer, 1)

Step 0. time since epoch: 1.130. Train acc: 0.000. Train Loss: 2.439
Step 100. time since epoch: 14.193. Train acc: 0.872. Train Loss: 0.436
Step 200. time since epoch: 27.055. Train acc: 0.916. Train Loss: 0.288
Step 300. time since epoch: 39.981. Train acc: 0.934. Train Loss: 0.227
Step 400. time since epoch: 53.006. Train acc: 0.944. Train Loss: 0.193
Step 500. time since epoch: 66.364. Train acc: 0.950. Train Loss: 0.174
Step 600. time since epoch: 79.461. Train acc: 0.954. Train Loss: 0.157
Step 700. time since epoch: 92.663. Train acc: 0.958. Train Loss: 0.145
Step 800. time since epoch: 105.860. Train acc: 0.960. Train Loss: 0.135
Step 900. time since epoch: 118.943. Train acc: 0.962. Train Loss: 0.128
Step 1000. time since epoch: 131.938. Train acc: 0.964. Train Loss: 0.121
Step 1100. time since epoch: 145.080. Train acc: 0.966. Train Loss: 0.117
Step 1200. time since epoch: 158.365. Train acc: 0.967. Train Loss: 0.113
Step 1300. time since epoch: 171.592. Train acc: 0.968. Tra

In [13]:
df_results = pd.DataFrame(columns=['model', 'train_accuracy', 'train_loss',
                                   'test_accuracy'])
df_results.loc[len(df_results.index)] = ['ResNet18', train_accuracy,
                                         train_losses, test_accuracy]

In [14]:
del model, trainer
torch.cuda.empty_cache()
print(torch.cuda.memory_summary())

|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      | 197884 KiB |    940 MiB |   8815 GiB |   8815 GiB |
|       from large pool | 184384 KiB |    926 MiB |   8801 GiB |   8801 GiB |
|       from small pool |  13500 KiB |     16 MiB |     14 GiB |     14 GiB |
|---------------------------------------------------------------------------|
| Active memory         | 197884 KiB |    940 MiB |   8815 GiB |   8815 GiB |
|       from large pool | 184384 KiB |    926 MiB |   8801 GiB |   8801 GiB |
|       from small pool |  13500 KiB |     16 MiB |     14 GiB |     14 GiB |
|---------------------------------------------------------------

## VGG 16

In [15]:
model = tv.models.vgg16(weights=None)  # random init; `weights=` replaces the deprecated `pretrained=` flag (torchvision >= 0.13)



In [16]:
model

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1

In [17]:
summary(model.to(device), input_size=(3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 224, 224]           1,792
              ReLU-2         [-1, 64, 224, 224]               0
            Conv2d-3         [-1, 64, 224, 224]          36,928
              ReLU-4         [-1, 64, 224, 224]               0
         MaxPool2d-5         [-1, 64, 112, 112]               0
            Conv2d-6        [-1, 128, 112, 112]          73,856
              ReLU-7        [-1, 128, 112, 112]               0
            Conv2d-8        [-1, 128, 112, 112]         147,584
              ReLU-9        [-1, 128, 112, 112]               0
        MaxPool2d-10          [-1, 128, 56, 56]               0
           Conv2d-11          [-1, 256, 56, 56]         295,168
             ReLU-12          [-1, 256, 56, 56]               0
           Conv2d-13          [-1, 256, 56, 56]         590,080
             ReLU-14          [-1, 256,

In [18]:
model.classifier[6]

Linear(in_features=4096, out_features=1000, bias=True)

In [19]:
model.classifier[6] = nn.Linear(in_features=4096, out_features=10)
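
To get a feel for how large VGG 16 remains after the head replacement, the parameter count can be tallied from the layer shapes. This is a back-of-the-envelope sketch: the channel configuration below is the standard VGG 16 layout, assumed here because the printed model above is truncated, and the names `VGG16_CFG` and `vgg16_param_count` are illustrative.

```python
# Standard VGG 16 feature extractor: numbers are conv output channels,
# 'M' marks a 2x2 max-pooling layer (no parameters).
VGG16_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']

def vgg16_param_count(num_classes):
    total, in_ch = 0, 3
    for v in VGG16_CFG:
        if v == 'M':
            continue
        total += v * in_ch * 3 * 3 + v      # 3x3 conv weights + biases
        in_ch = v
    # Classifier: 512*7*7 -> 4096 -> 4096 -> num_classes (weights + biases)
    for fan_in, fan_out in [(512 * 7 * 7, 4096), (4096, 4096), (4096, num_classes)]:
        total += fan_in * fan_out + fan_out
    return total

print(vgg16_param_count(1000))  # original 1000-class head
print(vgg16_param_count(10))    # after replacing classifier[6]
```

Most of the parameters sit in the first fully connected layer, which is unaffected by the number of classes, so the 10-class model is only slightly smaller than the original.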

In [20]:
trainer = torch.optim.Adam(model.parameters(), lr=0.001)

In [21]:
train_accuracy, train_losses, test_accuracy = train(model, train_iter, test_iter, trainer, 1)

Step 0. time since epoch: 0.564. Train acc: 0.156. Train Loss: 2.299
Step 100. time since epoch: 49.572. Train acc: 0.373. Train Loss: 3.275
Step 200. time since epoch: 98.390. Train acc: 0.641. Train Loss: 1.794
Step 300. time since epoch: 147.111. Train acc: 0.739. Train Loss: 1.278
Step 400. time since epoch: 195.806. Train acc: 0.795. Train Loss: 0.995
Step 500. time since epoch: 244.478. Train acc: 0.829. Train Loss: 0.824
Step 600. time since epoch: 293.134. Train acc: 0.851. Train Loss: 0.709
Step 700. time since epoch: 341.839. Train acc: 0.868. Train Loss: 0.623
Step 800. time since epoch: 390.469. Train acc: 0.880. Train Loss: 0.561
Step 900. time since epoch: 439.129. Train acc: 0.890. Train Loss: 0.512
Step 1000. time since epoch: 487.740. Train acc: 0.899. Train Loss: 0.470
Step 1100. time since epoch: 536.313. Train acc: 0.905. Train Loss: 0.437
Step 1200. time since epoch: 584.828. Train acc: 0.910. Train Loss: 0.411
Step 1300. time since epoch: 633.376. Train acc: 0.915

In [22]:
df_results.loc[len(df_results.index)] = ['VGG16',  train_accuracy, train_losses, test_accuracy]

In [23]:
del model, trainer
torch.cuda.empty_cache()
print(torch.cuda.memory_summary())

|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      |   2176 MiB |   5135 MiB |  54180 GiB |  54178 GiB |
|       from large pool |   2165 MiB |   5124 MiB |  54144 GiB |  54142 GiB |
|       from small pool |     11 MiB |     16 MiB |     36 GiB |     36 GiB |
|---------------------------------------------------------------------------|
| Active memory         |   2176 MiB |   5135 MiB |  54180 GiB |  54178 GiB |
|       from large pool |   2165 MiB |   5124 MiB |  54144 GiB |  54142 GiB |
|       from small pool |     11 MiB |     16 MiB |     36 GiB |     36 GiB |
|---------------------------------------------------------------

## Inception v3

In [24]:
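# Inception block in the style of GoogLeNet (Inception v1): four parallel
# branches whose outputs are concatenated along the channel dimension.
#   ic: input channels; c1: output channels of the 1x1 branch;
#   c2, c3: (reduction channels, output channels) of the 3x3 and 5x5 branches;
#   c4: output channels of the pooling branch.
class Inception(nn.Module):
    def __init__(self, ic, c1, c2, c3, c4, **kwargs):
        super().__init__(**kwargs)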
        self.p1_1 = nn.Sequential(nn.Conv2d(ic, c1, kernel_size=1), nn.ReLU())
        self.p2_1 = nn.Sequential(nn.Conv2d(ic, c2[0], kernel_size=1), nn.ReLU())
        self.p2_2 = nn.Sequential(nn.Conv2d(c2[0], c2[1], kernel_size=3, padding=1), nn.ReLU())
        self.p3_1 = nn.Sequential(nn.Conv2d(ic, c3[0], kernel_size=1), nn.ReLU())
        self.p3_2 = nn.Sequential(nn.Conv2d(c3[0], c3[1], kernel_size=5, padding=2), nn.ReLU())
        self.p4_1 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1))
        self.p4_2 = nn.Sequential(nn.Conv2d(ic, c4, kernel_size=1), nn.ReLU())

    def forward(self, x):
        p1 = self.p1_1(x)
        p2 = self.p2_2(self.p2_1(x))
        p3 = self.p3_2(self.p3_1(x))
        p4 = self.p4_2(self.p4_1(x))
        # Concatenate the outputs on the channel dimension.
        return torch.cat((p1, p2, p3, p4), dim=1)

In [25]:
b1 = nn.Sequential(nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
       nn.MaxPool2d(3, stride=2, padding=1))

b2 = nn.Sequential(
       nn.Conv2d(64, 64, kernel_size=1),
       nn.Conv2d(64, 192, kernel_size=3, padding=1),
       nn.MaxPool2d(3, stride=2, padding=1))

b3 = nn.Sequential(
       Inception(192, 64, (96, 128), (16, 32), 32),
       Inception(256, 128, (128, 192), (32, 96), 64),
       nn.MaxPool2d(3, stride=2, padding=1))
b4 = nn.Sequential(
       Inception(480, 192, (96, 208), (16, 48), 64),
       Inception(512, 160, (112, 224), (24, 64), 64),
       Inception(512, 128, (128, 256), (24, 64), 64),
       Inception(512, 112, (144, 288), (32, 64), 64),
       Inception(528, 256, (160, 320), (32, 128), 128),
       nn.MaxPool2d(3, stride=2, padding=1))

b5 = nn.Sequential(
       Inception(832, 256, (160, 320), (32, 128), 128),
       Inception(832, 384, (192, 384), (48, 128), 128),
       nn.AvgPool2d(7))

model = nn.Sequential(b1, b2, b3, b4, b5, nn.Flatten(), nn.Linear(1024, 10))
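
A quick arithmetic check confirms that the blocks fit together: each `Inception` block outputs `c1 + c2[1] + c3[1] + c4` channels, which must equal the `ic` of the next block, and the spatial size must be 7×7 before `nn.AvgPool2d(7)`. The sketch below redoes this bookkeeping in plain Python for a 224×224 input. The helper names are illustrative.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer (floor mode, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

# (ic, c1, c2, c3, c4) for the nine Inception blocks, copied from b3, b4, b5.
blocks = [
    (192, 64, (96, 128), (16, 32), 32),
    (256, 128, (128, 192), (32, 96), 64),
    (480, 192, (96, 208), (16, 48), 64),
    (512, 160, (112, 224), (24, 64), 64),
    (512, 128, (128, 256), (24, 64), 64),
    (512, 112, (144, 288), (32, 64), 64),
    (528, 256, (160, 320), (32, 128), 128),
    (832, 256, (160, 320), (32, 128), 128),
    (832, 384, (192, 384), (48, 128), 128),
]

channels = 192  # output channels of b2
for ic, c1, c2, c3, c4 in blocks:
    assert ic == channels, f"expected {channels} input channels, got {ic}"
    channels = c1 + c2[1] + c3[1] + c4   # concatenation of the four branches

size = 224
size = out_size(size, 7, stride=2, padding=3)      # 7x7 conv in b1
for _ in range(4):                                 # max-pools closing b1..b4
    size = out_size(size, 3, stride=2, padding=1)
size_before_avgpool = size
size = out_size(size, 7, stride=7)                 # AvgPool2d(7): stride defaults to the kernel size

print(channels, size_before_avgpool, size)
```

The final channel count matches the `in_features` of `nn.Linear(1024, 10)` after flattening a 1×1 feature map.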

In [26]:
model

Sequential(
  (0): Sequential(
    (0): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  )
  (1): Sequential(
    (0): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
    (1): Conv2d(64, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (2): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  )
  (2): Sequential(
    (0): Inception(
      (p1_1): Sequential(
        (0): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1))
        (1): ReLU()
      )
      (p2_1): Sequential(
        (0): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1))
        (1): ReLU()
      )
      (p2_2): Sequential(
        (0): Conv2d(96, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (p3_1): Sequential(
        (0): Conv2d(192, 16, kernel_size=(1, 1), stride=(1, 1))
        (1): ReLU()
      )
      (p3_2): Sequenti

In [27]:
summary(model.to(device), input_size=(1, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 112, 112]           3,200
              ReLU-2         [-1, 64, 112, 112]               0
         MaxPool2d-3           [-1, 64, 56, 56]               0
            Conv2d-4           [-1, 64, 56, 56]           4,160
            Conv2d-5          [-1, 192, 56, 56]         110,784
         MaxPool2d-6          [-1, 192, 28, 28]               0
            Conv2d-7           [-1, 64, 28, 28]          12,352
              ReLU-8           [-1, 64, 28, 28]               0
            Conv2d-9           [-1, 96, 28, 28]          18,528
             ReLU-10           [-1, 96, 28, 28]               0
           Conv2d-11          [-1, 128, 28, 28]         110,720
             ReLU-12          [-1, 128, 28, 28]               0
           Conv2d-13           [-1, 16, 28, 28]           3,088
             ReLU-14           [-1, 16,

In [28]:
trainer = torch.optim.Adam(model.parameters(), lr=0.001)

In [29]:
# The custom network above takes 1-channel input, so Grayscale(3) is omitted here.
transforms_inc = tv.transforms.Compose([
    tv.transforms.Resize((224, 224)),
    tv.transforms.ToTensor()
])
train_dataset_inc = tv.datasets.EMNIST('.', split='mnist', train=True, transform=transforms_inc, download=True)
test_dataset_inc = tv.datasets.EMNIST('.', split='mnist', train=False, transform=transforms_inc, download=True)
train_iter_inc = torch.utils.data.DataLoader(train_dataset_inc, batch_size=BATCH_SIZE)
test_iter_inc = torch.utils.data.DataLoader(test_dataset_inc, batch_size=BATCH_SIZE)

In [30]:
train_accuracy, train_losses, test_accuracy = train(model, train_iter_inc, test_iter_inc, trainer, 1)

Step 0. time since epoch: 0.205. Train acc: 0.094. Train Loss: 2.303
Step 100. time since epoch: 11.728. Train acc: 0.097. Train Loss: 2.304
Step 200. time since epoch: 23.226. Train acc: 0.105. Train Loss: 2.304
Step 300. time since epoch: 34.733. Train acc: 0.101. Train Loss: 2.304
Step 400. time since epoch: 46.230. Train acc: 0.101. Train Loss: 2.303
Step 500. time since epoch: 57.696. Train acc: 0.100. Train Loss: 2.303
Step 600. time since epoch: 69.154. Train acc: 0.102. Train Loss: 2.303
Step 700. time since epoch: 80.628. Train acc: 0.103. Train Loss: 2.304
Step 800. time since epoch: 92.067. Train acc: 0.103. Train Loss: 2.303
Step 900. time since epoch: 103.480. Train acc: 0.103. Train Loss: 2.305
Step 1000. time since epoch: 114.905. Train acc: 0.129. Train Loss: 2.243
Step 1100. time since epoch: 126.332. Train acc: 0.176. Train Loss: 2.132
Step 1200. time since epoch: 137.739. Train acc: 0.221. Train Loss: 2.022
Step 1300. time since epoch: 149.190. Train acc: 0.265. Trai
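
The loss plateau around 2.303 during the first several hundred steps corresponds to a model that assigns equal probability to all 10 classes: the cross-entropy of a uniform prediction is ln(10). A short check in plain Python, assuming nothing beyond the 10-class setup:

```python
import math

num_classes = 10

# Cross-entropy of a uniform prediction: -log(1 / num_classes)
chance_loss = -math.log(1.0 / num_classes)
chance_accuracy = 1.0 / num_classes

print(f"{chance_loss:.3f}", chance_accuracy)
```

Read this way, the log suggests the network stayed at chance level (train accuracy about 0.10) until roughly step 1000 and only then started to learn.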

In [31]:
df_results.loc[len(df_results.index)] = ['Inception v3',  train_accuracy, train_losses, test_accuracy]

In [32]:
torch.backends.cuda.cufft_plan_cache.clear()

In [33]:
del model, trainer
torch.cuda.empty_cache()
print(torch.cuda.memory_summary())

|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      |   1242 MiB |   5135 MiB |  63967 GiB |  63966 GiB |
|       from large pool |   1181 MiB |   5124 MiB |  63819 GiB |  63818 GiB |
|       from small pool |     61 MiB |     74 MiB |    147 GiB |    147 GiB |
|---------------------------------------------------------------------------|
| Active memory         |   1242 MiB |   5135 MiB |  63967 GiB |  63966 GiB |
|       from large pool |   1181 MiB |   5124 MiB |  63819 GiB |  63818 GiB |
|       from small pool |     61 MiB |     74 MiB |    147 GiB |    147 GiB |
|---------------------------------------------------------------

## DenseNet 161

In [34]:
model = tv.models.densenet161(weights=None)  # random init; `weights=` replaces the deprecated `pretrained=` flag (torchvision >= 0.13)



In [35]:
model

DenseNet(
  (features): Sequential(
    (conv0): Conv2d(3, 96, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (norm0): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu0): ReLU(inplace=True)
    (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (denseblock1): _DenseBlock(
      (denselayer1): _DenseLayer(
        (norm1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU(inplace=True)
        (conv1): Conv2d(96, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (norm2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU(inplace=True)
        (conv2): Conv2d(192, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (denselayer2): _DenseLayer(
        (norm1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (rel

In [36]:
model.classifier

Linear(in_features=2208, out_features=1000, bias=True)

In [37]:
model.classifier = nn.Linear(in_features=2208, out_features=10)

In [38]:
trainer = torch.optim.Adam(model.parameters(), lr=0.001)

In [39]:
train_accuracy, train_losses, test_accuracy = train(model, train_iter, test_iter, trainer, 1)

Step 0. time since epoch: 0.722. Train acc: 0.219. Train Loss: 2.274
Step 100. time since epoch: 72.576. Train acc: 0.847. Train Loss: 0.510
Step 200. time since epoch: 144.536. Train acc: 0.897. Train Loss: 0.344
Step 300. time since epoch: 216.468. Train acc: 0.918. Train Loss: 0.273
Step 400. time since epoch: 288.532. Train acc: 0.929. Train Loss: 0.236
Step 500. time since epoch: 360.557. Train acc: 0.936. Train Loss: 0.212
Step 600. time since epoch: 432.568. Train acc: 0.943. Train Loss: 0.191
Step 700. time since epoch: 504.478. Train acc: 0.948. Train Loss: 0.176
Step 800. time since epoch: 576.493. Train acc: 0.951. Train Loss: 0.164
Step 900. time since epoch: 648.527. Train acc: 0.954. Train Loss: 0.155
Step 1000. time since epoch: 720.645. Train acc: 0.957. Train Loss: 0.147
Step 1100. time since epoch: 792.693. Train acc: 0.959. Train Loss: 0.141
Step 1200. time since epoch: 864.536. Train acc: 0.960. Train Loss: 0.136
Step 1300. time since epoch: 936.508. Train acc: 0.96

In [40]:
df_results.loc[len(df_results.index)] = ['DenseNet 161',  train_accuracy, train_losses, test_accuracy]

## Model results

In [41]:
df_results

|   | model | train_accuracy | train_loss | test_accuracy |
|---|-------|----------------|------------|---------------|
| 0 | ResNet18 | [0.9734166666666667] | [0.09018796292090944] | [0.9866] |
| 1 | VGG16 | [0.9337] | [0.29546628054646423] | [0.9812] |
| 2 | Inception v3 | [0.4672] | [1.4054471664613735] | [0.9657] |
| 3 | DenseNet 161 | [0.9677666666666667] | [0.10979927198883767] | [0.9891] |


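After one epoch, ResNet18 achieved the best result on the training set (accuracy 0.973, loss 0.090), while DenseNet 161 achieved the best result on the test set (accuracy 0.989). Note that the training metrics are running averages accumulated over the whole epoch, whereas test accuracy is measured once the epoch has finished, so the two columns are not directly comparable. This is most visible for the Inception model (0.467 on the training set versus 0.966 on the test set).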
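
Accuracy is only one side of the comparison; the logs also show how fast each model trains. The sketch below estimates the duration of the training part of one epoch from the `time since epoch` values printed at steps 0 and 1200. It assumes 1,875 steps per epoch (60,000 images at batch size 32) and a constant step time, so the figures are rough extrapolations rather than measurements. The dictionary and function names are illustrative.

```python
# (time at step 0, time at step 1200) in seconds, copied from the training logs.
log_times = {
    "ResNet18": (1.130, 158.365),
    "VGG16": (0.564, 584.828),
    "Inception (custom)": (0.205, 137.739),
    "DenseNet 161": (0.722, 864.536),
}
STEPS_PER_EPOCH = 1875  # assumed: ceil(60000 / 32)

def seconds_per_step(t_start, t_end, steps=1200):
    """Average step duration between two logged timestamps."""
    return (t_end - t_start) / steps

estimates = {
    name: seconds_per_step(t0, t1) * STEPS_PER_EPOCH
    for name, (t0, t1) in log_times.items()
}
ranking = sorted(estimates.items(), key=lambda kv: kv[1])
for name, secs in ranking:
    print(f"{name:<20} ~{secs / 60:.1f} min per training epoch")
```

On this estimate DenseNet 161 needs roughly five times as long per epoch as ResNet18 for a test-accuracy gain of about 0.25 percentage points, which is worth weighing when choosing the model for further training.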