# ResNet

<div style="display: flex; align-items: center;">
    <img src="../imgs/ResNet.jpg" alt="ResNet" width="500" style="margin-right: 20px;">
    <div>
        <p>In Chapter 7, we introduced a new dataset, CIFAR, and tried to tackle it, but our results were limited by the techniques we had learned up to that point. In the following chapters, we aim to reach 95% accuracy on both CIFAR10 and CIFAR100 while continually introducing new models. In this chapter, we introduce ResNet.</p>
        <p>Empirically, network depth is crucial to model performance. With more layers, a network can extract more complex feature patterns, so in theory a deeper model should achieve better results. But do deeper networks necessarily perform better? In practice, as depth increases, accuracy saturates and can even decrease. This is known as the degradation problem.</p>
        <p>The degradation problem indicates, at the very least, that deep networks are not easy to train. Residual learning addresses this difficulty: instead of learning the desired mapping directly, the stacked layers learn the residual, that is, the difference between the desired output and the input, which is easier to learn. When the residual is 0, the stacked layers simply perform an identity mapping, so network performance should at least not decrease. In practice, the residual is not 0, which lets the stacked layers learn new features on top of the input features and thus perform better. The connection that carries the input past the stacked layers resembles a "short circuit" in an electrical circuit, so it is called a shortcut connection.</p>
    </div>
</div>
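
As a minimal sketch of the idea above (not the model used in this chapter), the toy block below computes `x + F(x)`, where `F` is a single 3x3 convolution. The class name `TinyResidual` and the zero initialization are assumptions made purely for illustration: with all weights at zero, the residual `F(x)` is 0 and the block returns its input unchanged.

```python
import torch
import torch.nn as nn

class TinyResidual(nn.Module):
    """Hypothetical toy block: output = x + F(x), with F a single 3x3 conv."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        # Illustration only: zero weights make the residual F(x) exactly 0.
        nn.init.zeros_(self.conv.weight)

    def forward(self, x):
        # The shortcut connection adds the input back onto the residual.
        return x + self.conv(x)

x = torch.randn(2, 4, 8, 8)
block = TinyResidual(channels=4)
with torch.no_grad():
    y = block(x)

# With a zero residual, the block is an identity mapping.
is_identity = torch.equal(y, x)
print(is_identity)
```

The `BasicBlock` defined in the next section also applies a ReLU after the addition, so there the mapping is an exact identity only for non-negative inputs.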

## ResNet

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResNet(nn.Module):
    def __init__(self, input_channels, output_size):
        super(ResNet, self).__init__()
        self.in_channels = 64
        
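        # Stem: 7x7 convolution, batch norm, ReLU, and 3x3 max pooling.
        # The stride-2 convolution and the stride-2 pooling each halve the spatial size.
        self.conv1 = nn.Conv2d(input_channels, self.in_channels, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)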
        
        self.blocks_layer = nn.Sequential(
            self._make_layer(64),
            self._make_layer(128, stride=2),
            self._make_layer(256, stride=2)
        )
        
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(256, output_size)

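    def _make_layer(self, out_channels, stride=1):
        # Each stage stacks two BasicBlocks. Only the first block may change
        # the channel count or downsample (stride > 1); the second keeps both.
        layers = []
        layers.append(BasicBlock(self.in_channels, out_channels, stride))
        self.in_channels = out_channels
        layers.append(BasicBlock(self.in_channels, out_channels))
        return nn.Sequential(*layers)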

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.blocks_layer(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

class BasicBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)

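        # Identity shortcut by default. When the spatial size or channel count
        # changes, project the input with a 1x1 convolution so the shapes match.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels)
            )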

    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out += self.shortcut(identity)
        out = self.relu(out)
        return out
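
To see how this architecture treats CIFAR-sized inputs, the sketch below traces the spatial size of a 32x32 image through the layers defined above. It is a back-of-the-envelope helper, not part of the model: `conv_out` is a hypothetical function name, and it assumes the standard output-size formula for convolution and pooling with dilation 1, `floor((n + 2p - k) / s) + 1`.

```python
def conv_out(n, kernel, stride, padding):
    """Output size of a conv/pool layer along one spatial dimension (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 32  # CIFAR images are 32x32
trace = [("input", size)]

size = conv_out(size, kernel=7, stride=2, padding=3)  # self.conv1
trace.append(("conv1", size))
size = conv_out(size, kernel=3, stride=2, padding=1)  # self.maxpool
trace.append(("maxpool", size))

# In each stage, the first 3x3 conv of the first BasicBlock carries the stride;
# every other 3x3 conv (stride 1, padding 1) leaves the size unchanged.
for name, stride in [("layer 64", 1), ("layer 128", 2), ("layer 256", 2)]:
    size = conv_out(size, kernel=3, stride=stride, padding=1)
    trace.append((name, size))

for name, s in trace:
    print(f"{name:>9}: {s}x{s}")
```

Under these assumptions, the stem alone reduces a 32x32 image to 8x8 before the first residual block, and the last stage works on 2x2 feature maps with 256 channels, which is what `AdaptiveAvgPool2d` then averages into the 256 features fed to `self.fc`.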

## Train ResNet on CIFAR10/100

In [2]:
import sys
import torch.nn as nn
sys.path.append('../tools')
from CIFAR10 import CIFAR10Trainer
from CIFAR100 import CIFAR100Trainer

In [None]:
model = ResNet(input_channels=3, output_size=10)
trainer = CIFAR10Trainer(model, loss='CE', lr=0.01, optimizer='SGD', batch_size=128, epoch=30, model_type='classification')
trainer.train()
trainer.test()

Files already downloaded and verified
Files already downloaded and verified
2024-05-20 00:18:28
Epoch 1 / 30


[Train]: 100%|█████████████████████████| 352/352 [00:13<00:00, 26.19it/s, train_loss=1.34]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.92it/s, val_acc=59.2, val_loss=0.00917]


2024-05-20 00:18:42
Epoch 2 / 30


[Train]: 100%|████████████████████████| 352/352 [00:13<00:00, 25.55it/s, train_loss=0.924]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 33.29it/s, val_acc=66.9, val_loss=0.00749]


2024-05-20 00:18:57
Epoch 3 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.98it/s, train_loss=0.738]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.70it/s, val_acc=71.6, val_loss=0.00646]


2024-05-20 00:19:13
Epoch 4 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.69it/s, train_loss=0.615]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.50it/s, val_acc=73.7, val_loss=0.00612]


2024-05-20 00:19:28
Epoch 5 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.78it/s, train_loss=0.495]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 32.70it/s, val_acc=73.5, val_loss=0.00621]


2024-05-20 00:19:44
Epoch 6 / 30


[Train]: 100%|████████████████████████| 352/352 [00:13<00:00, 25.24it/s, train_loss=0.393]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 32.22it/s, val_acc=72.5, val_loss=0.00689]


2024-05-20 00:19:59
Epoch 7 / 30


[Train]: 100%|████████████████████████| 352/352 [00:13<00:00, 25.50it/s, train_loss=0.314]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 32.56it/s, val_acc=73.8, val_loss=0.00689]


2024-05-20 00:20:14
Epoch 8 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.61it/s, train_loss=0.247]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.64it/s, val_acc=75.2, val_loss=0.00718]


2024-05-20 00:20:30
Epoch 9 / 30


[Train]: 100%|█████████████████████████| 352/352 [00:14<00:00, 24.78it/s, train_loss=0.19]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 32.12it/s, val_acc=72.9, val_loss=0.00796]


2024-05-20 00:20:45
Epoch 10 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.28it/s, train_loss=0.153]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.72it/s, val_acc=72.9, val_loss=0.00881]


2024-05-20 00:21:01
Epoch 11 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.55it/s, train_loss=0.114]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.37it/s, val_acc=73.7, val_loss=0.00873]


2024-05-20 00:21:16
Epoch 12 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.84it/s, train_loss=0.096]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 30.54it/s, val_acc=74.6, val_loss=0.00865]


2024-05-20 00:21:32
Epoch 13 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.64it/s, train_loss=0.0716]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.09it/s, val_acc=75.5, val_loss=0.00911]


2024-05-20 00:21:47
Epoch 14 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.53it/s, train_loss=0.0659]
[Valid]: 100%|█████████████████████████| 40/40 [00:01<00:00, 30.95it/s, val_acc=74, val_loss=0.0102]


2024-05-20 00:22:03
Epoch 15 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 23.99it/s, train_loss=0.0635]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.20it/s, val_acc=75.4, val_loss=0.0096]


2024-05-20 00:22:19
Epoch 16 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.15it/s, train_loss=0.0514]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.10it/s, val_acc=76.4, val_loss=0.00985]


2024-05-20 00:22:35
Epoch 17 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.35it/s, train_loss=0.0337]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 30.65it/s, val_acc=77.2, val_loss=0.00951]


2024-05-20 00:22:51
Epoch 18 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.57it/s, train_loss=0.0197]
[Valid]: 100%|██████████████████████| 40/40 [00:01<00:00, 31.14it/s, val_acc=77.2, val_loss=0.00988]


2024-05-20 00:23:06
Epoch 19 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.35it/s, train_loss=0.0136]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.03it/s, val_acc=77.6, val_loss=0.0102]


2024-05-20 00:23:22
Epoch 20 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.83it/s, train_loss=0.00695]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.13it/s, val_acc=77.4, val_loss=0.0103]


2024-05-20 00:23:37
Epoch 21 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.01it/s, train_loss=0.00812]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.23it/s, val_acc=76.9, val_loss=0.0111]


2024-05-20 00:23:53
Epoch 22 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.16it/s, train_loss=0.0136]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 29.97it/s, val_acc=76.2, val_loss=0.0114]


2024-05-20 00:24:09
Epoch 23 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 23.95it/s, train_loss=0.0262]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 29.85it/s, val_acc=75.2, val_loss=0.0116]


2024-05-20 00:24:25
Epoch 24 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.31it/s, train_loss=0.0489]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.40it/s, val_acc=75.2, val_loss=0.0113]


2024-05-20 00:24:41
Epoch 25 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.52it/s, train_loss=0.0389]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.04it/s, val_acc=76.3, val_loss=0.0106]


2024-05-20 00:24:57
Epoch 26 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.29it/s, train_loss=0.0218]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.57it/s, val_acc=76.3, val_loss=0.0106]


2024-05-20 00:25:13
Epoch 27 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.02it/s, train_loss=0.0128]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.80it/s, val_acc=76.7, val_loss=0.0107]


2024-05-20 00:25:29
Epoch 28 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.08it/s, train_loss=0.00491]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.17it/s, val_acc=77.7, val_loss=0.0104]


2024-05-20 00:25:44
Epoch 29 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.14it/s, train_loss=0.0021]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.05it/s, val_acc=77.9, val_loss=0.0107]


2024-05-20 00:26:00
Epoch 30 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 23.91it/s, train_loss=0.0011]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.56it/s, val_acc=78.3, val_loss=0.0107]


In [None]:
model_100 = ResNet(input_channels=3, output_size=100)
trainer_100 = CIFAR100Trainer(model_100, loss='CE', lr=0.01, optimizer='SGD', batch_size=128, epoch=30, model_type='classification')
trainer_100.train()
trainer_100.test()

Files already downloaded and verified
Files already downloaded and verified
2024-05-20 00:27:53
Epoch 1 / 30


[Train]: 100%|█████████████████████████| 352/352 [00:13<00:00, 26.26it/s, train_loss=3.49]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.64it/s, val_acc=22.1, val_loss=0.0254]


2024-05-20 00:28:08
Epoch 2 / 30


[Train]: 100%|██████████████████████████| 352/352 [00:13<00:00, 26.09it/s, train_loss=2.7]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.79it/s, val_acc=31.8, val_loss=0.0212]


2024-05-20 00:28:23
Epoch 3 / 30


[Train]: 100%|██████████████████████████| 352/352 [00:14<00:00, 25.04it/s, train_loss=2.3]
[Valid]: 100%|█████████████████████████| 40/40 [00:01<00:00, 32.03it/s, val_acc=35.1, val_loss=0.02]


2024-05-20 00:28:38
Epoch 4 / 30


[Train]: 100%|█████████████████████████| 352/352 [00:13<00:00, 25.21it/s, train_loss=1.99]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.55it/s, val_acc=37.6, val_loss=0.0191]


2024-05-20 00:28:53
Epoch 5 / 30


[Train]: 100%|█████████████████████████| 352/352 [00:14<00:00, 24.80it/s, train_loss=1.71]
[Valid]: 100%|████████████████████████| 40/40 [00:01<00:00, 32.83it/s, val_acc=39.3, val_loss=0.019]


2024-05-20 00:29:08
Epoch 6 / 30


[Train]: 100%|█████████████████████████| 352/352 [00:14<00:00, 25.07it/s, train_loss=1.46]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.61it/s, val_acc=41.3, val_loss=0.0185]


2024-05-20 00:29:24
Epoch 7 / 30


[Train]: 100%|█████████████████████████| 352/352 [00:14<00:00, 25.11it/s, train_loss=1.19]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.46it/s, val_acc=40.1, val_loss=0.0194]


2024-05-20 00:29:39
Epoch 8 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.60it/s, train_loss=0.937]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.25it/s, val_acc=40.4, val_loss=0.0201]


2024-05-20 00:29:55
Epoch 9 / 30


[Train]: 100%|████████████████████████| 352/352 [00:13<00:00, 25.57it/s, train_loss=0.703]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.81it/s, val_acc=40.9, val_loss=0.0209]


2024-05-20 00:30:10
Epoch 10 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.77it/s, train_loss=0.485]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.97it/s, val_acc=40.6, val_loss=0.0217]


2024-05-20 00:30:25
Epoch 11 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.64it/s, train_loss=0.303]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.28it/s, val_acc=42.4, val_loss=0.0224]


2024-05-20 00:30:41
Epoch 12 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.65it/s, train_loss=0.171]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.97it/s, val_acc=42.4, val_loss=0.0231]


2024-05-20 00:30:56
Epoch 13 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.84it/s, train_loss=0.0888]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.56it/s, val_acc=44.2, val_loss=0.0232]


2024-05-20 00:31:11
Epoch 14 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.85it/s, train_loss=0.0392]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.78it/s, val_acc=45.7, val_loss=0.0229]


2024-05-20 00:31:27
Epoch 15 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.72it/s, train_loss=0.0164]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.85it/s, val_acc=45.3, val_loss=0.0228]


2024-05-20 00:31:42
Epoch 16 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.33it/s, train_loss=0.0106]
[Valid]: 100%|████████████████████████| 40/40 [00:01<00:00, 32.38it/s, val_acc=45.8, val_loss=0.023]


2024-05-20 00:31:58
Epoch 17 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 25.00it/s, train_loss=0.00708]
[Valid]: 100%|████████████████████████| 40/40 [00:01<00:00, 31.59it/s, val_acc=46.2, val_loss=0.023]


2024-05-20 00:32:14
Epoch 18 / 30


[Train]: 100%|████████████████████████| 352/352 [00:14<00:00, 24.48it/s, train_loss=0.006]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.65it/s, val_acc=46.2, val_loss=0.0231]


2024-05-20 00:32:29
Epoch 19 / 30


[Train]: 100%|██████████████████████| 352/352 [00:13<00:00, 25.53it/s, train_loss=0.00521]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.75it/s, val_acc=46.2, val_loss=0.0233]


2024-05-20 00:32:44
Epoch 20 / 30


[Train]: 100%|██████████████████████| 352/352 [00:13<00:00, 25.48it/s, train_loss=0.00438]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.12it/s, val_acc=46.4, val_loss=0.0233]


2024-05-20 00:32:59
Epoch 21 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.38it/s, train_loss=0.00446]
[Valid]: 100%|█████████████████████████| 40/40 [00:01<00:00, 31.16it/s, val_acc=46, val_loss=0.0234]


2024-05-20 00:33:15
Epoch 22 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.14it/s, train_loss=0.00396]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.78it/s, val_acc=46.5, val_loss=0.0235]


2024-05-20 00:33:31
Epoch 23 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.22it/s, train_loss=0.00336]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.33it/s, val_acc=46.2, val_loss=0.0236]


2024-05-20 00:33:47
Epoch 24 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.17it/s, train_loss=0.00333]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 31.11it/s, val_acc=46.4, val_loss=0.0237]


2024-05-20 00:34:03
Epoch 25 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.18it/s, train_loss=0.00307]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.01it/s, val_acc=46.2, val_loss=0.0238]


2024-05-20 00:34:18
Epoch 26 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.30it/s, train_loss=0.00292]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.05it/s, val_acc=46.3, val_loss=0.0238]


2024-05-20 00:34:34
Epoch 27 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.54it/s, train_loss=0.00281]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 32.05it/s, val_acc=46.3, val_loss=0.0238]


2024-05-20 00:34:50
Epoch 28 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.35it/s, train_loss=0.00288]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.44it/s, val_acc=46.4, val_loss=0.0239]


2024-05-20 00:35:06
Epoch 29 / 30


[Train]: 100%|██████████████████████| 352/352 [00:14<00:00, 24.13it/s, train_loss=0.00256]
[Valid]: 100%|████████████████████████| 40/40 [00:01<00:00, 31.72it/s, val_acc=46.3, val_loss=0.024]


2024-05-20 00:35:21
Epoch 30 / 30


[Train]: 100%|███████████████████████| 352/352 [00:14<00:00, 24.49it/s, train_loss=0.0026]
[Valid]: 100%|███████████████████████| 40/40 [00:01<00:00, 30.96it/s, val_acc=46.6, val_loss=0.0241]
