- Change our model to use a 5 × 5 kernel with kernel_size=5 passed to the nn.Conv2d constructor.
    - What impact does this change have on the number of parameters in the model?
    - Does the change improve or degrade overfitting?
    - Read https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d.
    - Can you describe what kernel_size=(1,3) will do?
    - How does the model behave with such a kernel? (Not attempted.)
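
As a rough illustration for the kernel_size=(1,3) question: such a kernel is one row high and three columns wide, so each output value only mixes three horizontally adjacent pixels. The sketch below is plain Python rather than nn.Conv2d, with a single channel, no padding, and stride 1. The helper `conv2d_valid`, the difference kernel, and the toy images are all made up for this example.

```python
def conv2d_valid(image, kernel):
    """Naive single-channel cross-correlation, no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical 1x3 kernel: a horizontal difference filter.
kernel_1x3 = [[-1, 0, 1]]

# Toy 4x4 images (assumed data): a vertical edge and a horizontal edge.
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
horizontal_edge = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]]

resp_v = conv2d_valid(vertical_edge, kernel_1x3)
resp_h = conv2d_valid(horizontal_edge, kernel_1x3)

# Height is unchanged; width shrinks by 2 because there is no padding.
print(len(resp_v), len(resp_v[0]))
print(resp_v)  # nonzero: intensity changes along each row
print(resp_h)  # all zeros: every row is constant
```

A 1x3 kernel cannot see vertical neighbours within a single layer, so in this toy case it responds to the vertical edge and ignores the horizontal one.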

In [3]:
import torch.nn as nn
import torch
from torchvision import datasets, transforms
from matplotlib import pyplot as plt
import torch.optim as optim
import torch.nn.functional as F

- Prepare cifar2:

In [2]:
data_path = '../data-unversioned/p1ch7/'
cifar10 = datasets.CIFAR10(data_path, train=True, download=False,
                          transform=transforms.Compose([
                              transforms.ToTensor(),
                              transforms.Normalize((0.4915, 0.4823, 0.4468),
                                                   (0.2470, 0.2435, 0.2616))
                          ]))
cifar10_val = datasets.CIFAR10(data_path, train=False, download=False,
                              transform=transforms.Compose([
                                  transforms.ToTensor(),
                                  transforms.Normalize((0.4915, 0.4823, 0.4468),
                                                       (0.2470, 0.2435, 0.2616))
                              ]))
label_map = {0: 0, 2: 1}
class_names = ['airplane', 'bird']
cifar2 = [(img, label_map[label])
         for img, label in cifar10
         if label in [0, 2]]
cifar2_val = [(img, label_map[label])
             for img, label in cifar10_val
             if label in [0, 2]]

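- Since the kernel size becomes 5, the padding should also change to 2.
- Should the input size of the first Linear layer be read off the error message? It probably does not change, though, because the padding preserves the spatial size. The size should depend only on the max pooling.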
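
The reasoning above can be checked with a small shape calculation. This is a sketch in plain Python using the output-size formula from the nn.Conv2d documentation. It assumes 32x32 CIFAR inputs, stride 1, 2x2 max pooling with floor rounding, and a 3x3 baseline with padding=1. The helper names are made up for this example.

```python
def conv_out(size, kernel, padding=0, stride=1, dilation=1):
    """Output length along one dimension, per the nn.Conv2d shape formula."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def pool_out(size, kernel):
    """Output length for max pooling with stride == kernel size, no padding."""
    return (size - kernel) // kernel + 1

def fc1_in_features(kernel_size, padding, image_size=32, out_channels=8):
    """Trace conv -> pool -> conv -> pool and return the flattened size."""
    size = conv_out(image_size, kernel_size, padding)
    size = pool_out(size, 2)
    size = conv_out(size, kernel_size, padding)
    size = pool_out(size, 2)
    return out_channels * size * size

print(fc1_in_features(3, 1))  # assumed 3x3 baseline with padding=1
print(fc1_in_features(5, 2))  # 5x5 kernel with padding=2
print(fc1_in_features(5, 1))  # 5x5 kernel if padding were left at 1
```

With padding equal to half the (odd) kernel size, the convolutions preserve the spatial size, so only the two pooling steps shrink it (32 to 16 to 8) and fc1 keeps its 8 * 8 * 8 inputs.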

In [9]:
class Net(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
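        # padding = kernel_size // 2 keeps the spatial size for an odd integer
        # kernel_size (5 -> padding 2), so fc1 still sees 8 * 8 * 8 inputs.
        self.conv1 = nn.Conv2d(3, 16, kernel_size=kernel_size, padding=kernel_size // 2)
        self.conv2 = nn.Conv2d(16, 8, kernel_size=kernel_size, padding=kernel_size // 2)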
        self.fc1 = nn.Linear(8 * 8 * 8, 32)
        self.fc2 = nn.Linear(32, 2)
        
    def forward(self, x):
        out = F.max_pool2d(torch.tanh(self.conv1(x)), 2)
        out = F.max_pool2d(torch.tanh(self.conv2(out)), 2)
        out = out.view(-1, 8 * 8 * 8)
        out = torch.tanh(self.fc1(out))
        out = self.fc2(out)
        return out

- Test whether the model above can process an input image:

In [10]:
model = Net(kernel_size=5)
img, _ = cifar2[0]
model(img.unsqueeze(0))

tensor([[ 0.0445, -0.1149]], grad_fn=<AddmmBackward>)

- The image was processed, so the model looks fine!
- The number of parameters is:

In [6]:
numel_list = [p.numel() for p in model.parameters()]
sum(numel_list), numel_list

(20906, [1200, 16, 3200, 8, 16384, 32, 64, 2])
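
The totals above can be reproduced by hand, which also answers the question about the impact on the parameter count. A minimal sketch in plain Python follows. The helper names are made up for this example, and the 3x3 case assumes the same architecture with only the kernel size changed.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights (out x in x k x k) plus one bias per output channel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels

def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output feature."""
    return in_features * out_features + out_features

def net_params(kernel_size):
    """Total for conv1(3->16), conv2(16->8), fc1(512->32), fc2(32->2)."""
    return (conv2d_params(3, 16, kernel_size)
            + conv2d_params(16, 8, kernel_size)
            + linear_params(8 * 8 * 8, 32)
            + linear_params(32, 2))

print(net_params(3))                  # assumed 3x3 baseline
print(net_params(5))                  # 5x5 kernels
print(net_params(5) - net_params(3))  # increase comes only from the conv layers
```

Only the two convolution layers grow with the kernel size. fc1 still holds most of the parameters (16384 weights) in both cases.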

- conv1 accounts for 5x5x3x16 = 1200 weights: each of the 16 kernels is 5x5x3 so that it spans all three input channels.
- Training:

In [13]:
device = (torch.device('cuda') if torch.cuda.is_available()
         else torch.device('cpu'))
print(f"Training on device {device}.")

Training on device cuda.


In [14]:
import datetime

def training_loop(n_epochs, optimizer, model, loss_fn, train_loader):
    for epoch in range(1, n_epochs + 1):
        loss_train = 0.0
        for imgs, labels in train_loader:
            imgs = imgs.to(device=device)
            labels = labels.to(device=device)
            outputs = model(imgs)
            loss = loss_fn(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            loss_train += loss.item()
        if epoch == 1 or epoch % 10 == 0:
            print('{} Epoch {}, Training loss {}'.format(
                datetime.datetime.now(), epoch, loss_train / len(train_loader)))

In [15]:
train_loader = torch.utils.data.DataLoader(cifar2, batch_size=64, shuffle=True)
model = Net(kernel_size=5).to(device=device)
optimizer = optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

training_loop(n_epochs=100,
             optimizer=optimizer,
             model=model,
             loss_fn=loss_fn,
             train_loader=train_loader,)

2020-12-30 03:05:51.072684 Epoch 1, Training loss 0.5387514252571544
2020-12-30 03:05:53.797294 Epoch 10, Training loss 0.3192493832035429
2020-12-30 03:05:56.822801 Epoch 20, Training loss 0.2778432199339958
2020-12-30 03:05:59.859464 Epoch 30, Training loss 0.24807112464669404
2020-12-30 03:06:02.878734 Epoch 40, Training loss 0.21821043374621943
2020-12-30 03:06:05.895975 Epoch 50, Training loss 0.19998073959901075
2020-12-30 03:06:08.939146 Epoch 60, Training loss 0.1797535287062074
2020-12-30 03:06:11.950189 Epoch 70, Training loss 0.1627138643317921
2020-12-30 03:06:14.945885 Epoch 80, Training loss 0.14478921819074897
2020-12-30 03:06:17.927673 Epoch 90, Training loss 0.1261396531941025
2020-12-30 03:06:20.933380 Epoch 100, Training loss 0.10748946671463122


In [17]:
train_loader = torch.utils.data.DataLoader(cifar2, batch_size=64, shuffle=False)
val_loader = torch.utils.data.DataLoader(cifar2_val, batch_size=64, shuffle=False)

def validate(model, train_loader, val_loader):
    for name, loader in [("train", train_loader), ("val", val_loader)]:
        correct = 0
        total = 0
        
        with torch.no_grad():
            for imgs, labels in loader:
                outputs = model(imgs)
                _, predicted = torch.max(outputs, dim=1)
                total += labels.shape[0]
                correct += int((predicted == labels).sum())
        
        print("Accuracy {}: {:.2f}".format(name, correct / total))

validate(model.to(device=torch.device('cpu')), train_loader=train_loader, val_loader=val_loader)

Accuracy train: 0.93
Accuracy val: 0.88


- Overfitting got worse (0.93 train vs. 0.88 validation accuracy), which suggests that the smaller kernel generalizes better.