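When an image is classified by stacking fully connected layers alone, 
the number of weights and the amount of computation grow rapidly as the resolution increases, because every pixel is connected to every unit of the first layer. 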

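In addition, flattening the image into a vector discards the spatial arrangement of the pixels, so positional information is lost. 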

To address these problems, we add Convolution layers, which extract image features, in front of the Fully Connected layers. 
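
To make the cost argument concrete, here is a minimal sketch that counts the parameters of one fully connected layer applied to a flattened grayscale image and compares it with a small convolution layer. The hidden width of 512 and the helper names `linear_params` and `conv2d_params` are assumptions chosen for illustration; they do not appear in the original code.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def conv2d_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights plus biases of a square 2-D convolution layer."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


HIDDEN = 512  # hypothetical hidden width, for illustration only

for side in (28, 56, 112, 224):
    fc = linear_params(side * side, HIDDEN)
    conv = conv2d_params(1, 6, 5)  # does not depend on the image size
    print(f"{side}x{side}: linear={fc:,} params, conv={conv:,} params")
```

Doubling the side length roughly quadruples the parameter count of the first fully connected layer, while the convolution layer keeps the same 156 parameters because its kernel is shared across all positions. The computation of a convolution still grows with the image area, so this sketch compares parameter counts only.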

https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

We reuse the code from the previous section and change only the model. 


In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
import torch.nn.functional as F  # functional API (e.g. F.relu)

# Train dataset
training_data = datasets.FashionMNIST(
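    root="data",    # directory where the data is stored
    train=True,     # use the training split
    download=True,  # download the data if it is not already present
    transform=ToTensor(),   # transform applied to each image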
)

# Test Dataset
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

batch_size = 64

# Create the data loaders
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)
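
As a quick check of what `batch_size = 64` implies, the sketch below computes how many batches each loader yields per epoch. It assumes the standard FashionMNIST split sizes (60,000 training and 10,000 test images) and the `DataLoader` default of keeping the last partial batch (`drop_last=False`); the helper name `batch_sizes` is hypothetical.

```python
import math


def batch_sizes(dataset_size: int, batch_size: int) -> list[int]:
    """Sizes of the batches produced when the last partial batch is kept."""
    full, remainder = divmod(dataset_size, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


train_batches = batch_sizes(60_000, 64)
test_batches = batch_sizes(10_000, 64)

# Number of batches per epoch and the size of the final batch
print(len(train_batches), train_batches[-1])
print(len(test_batches), test_batches[-1])
assert len(train_batches) == math.ceil(60_000 / 64)
```

Because neither dataset size is a multiple of 64, the final batch of each epoch is smaller than the others. Code that reads the batch size from `len(X)` rather than from the constant handles this case correctly.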



Let's change the model to a CNN. 

https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d



In [2]:
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
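        self.conv1 = nn.Conv2d(1, 6, 5)   # 1 input channel -> 6 feature maps, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)    # 2x2 max pooling with stride 2
        self.conv2 = nn.Conv2d(6, 16, 5)  # 6 -> 16 feature maps, 5x5 kernel
        self.fc1 = nn.Linear(16 * 4 * 4, 120)  # 16 feature maps of 4x4 after two conv + pool stages
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)      # 10 output classes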

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)  # flatten to (batch, 256) as input to the fully connected layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
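
The `16 * 4 * 4` input size of `fc1` follows from how each layer shrinks a 28x28 image. The sketch below traces the side length through the network using the usual output-size formula for convolution and pooling windows without dilation; the helper name `conv_out` is hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a convolution or pooling window (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


side = 28                                   # FashionMNIST images are 28x28
side = conv_out(side, kernel=5)             # conv1: 28 -> 24
side = conv_out(side, kernel=2, stride=2)   # pool:  24 -> 12
side = conv_out(side, kernel=5)             # conv2: 12 -> 8
side = conv_out(side, kernel=2, stride=2)   # pool:  8 -> 4

flat_features = 16 * side * side            # 16 channels leave conv2
print(side, flat_features)
```

If the input resolution or a kernel size changes, the value passed to `nn.Linear` and to `x.view` has to be recomputed in the same way.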

In [3]:
model = CNN()

# Move the model to the GPU
device = "cuda" if torch.cuda.is_available() else "cpu"  # use the GPU if one is available, otherwise fall back to the CPU
print(device)

model = model.to(device)  

cuda


Let's take a look at the model structure. 

In [4]:
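from torchsummary import summary  # third-party package: pip install torchsummary
summary(model, input_size=(1, 28, 28), device=device)  # pass the device so this also works on CPU-only machines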

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 6, 24, 24]             156
         MaxPool2d-2            [-1, 6, 12, 12]               0
            Conv2d-3             [-1, 16, 8, 8]           2,416
         MaxPool2d-4             [-1, 16, 4, 4]               0
            Linear-5                  [-1, 120]          30,840
            Linear-6                   [-1, 84]          10,164
            Linear-7                   [-1, 10]             850
Total params: 44,426
Trainable params: 44,426
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.04
Params size (MB): 0.17
Estimated Total Size (MB): 0.22
----------------------------------------------------------------
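
The `Param #` column can be reproduced by hand, which is a useful sanity check on the layer definitions. The sketch below recomputes each count as weights plus biases and estimates the parameter memory, assuming 4-byte float32 parameters.

```python
layers = {
    "conv1": 1 * 6 * 5 * 5 + 6,       # weights + biases
    "conv2": 6 * 16 * 5 * 5 + 16,
    "fc1": 16 * 4 * 4 * 120 + 120,
    "fc2": 120 * 84 + 84,
    "fc3": 84 * 10 + 10,
}
total = sum(layers.values())
size_mb = total * 4 / (1024 ** 2)  # assuming 4 bytes per float32 parameter

for name, count in layers.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,} ({size_mb:.2f} MB)")
```

Most of the parameters sit in `fc1`, the first fully connected layer, even though the two convolution layers do the feature extraction. The pooling layers contribute no parameters.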


In [5]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train(dataloader, model, loss_fn, optimizer):  # model, loss_fn, etc. are passed in so they can be swapped out
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):  # enumerate adds the batch index
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():  
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
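
To show what the loss and accuracy numbers measure, here is a minimal pure-Python sketch of per-sample cross-entropy on raw logits and of argmax accuracy. It is an illustration of the underlying formulas, not the PyTorch implementation; the function names and the example logits are hypothetical.

```python
import math


def cross_entropy(logits: list[float], target: int) -> float:
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target]


def accuracy(batch_logits: list[list[float]], targets: list[int]) -> float:
    """Fraction of samples whose largest logit is at the target index."""
    hits = sum(
        max(range(len(row)), key=row.__getitem__) == t
        for row, t in zip(batch_logits, targets)
    )
    return hits / len(targets)


uniform = [0.0] * 10  # a model that favors none of the 10 classes
print(cross_entropy(uniform, target=3))  # equals ln(10)
print(accuracy([[0.1, 2.0, -1.0], [1.5, 0.2, 0.3]], [1, 2]))
```

A model that assigns equal scores to all 10 classes has a loss of ln(10), about 2.30, so a training loss near that value means the network has not yet learned to separate the classes. `nn.CrossEntropyLoss` averages this quantity over the batch by default, which is why `test` divides the accumulated loss by the number of batches.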

In [6]:
epochs = 20
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 2.318957  [   64/60000]
loss: 2.311727  [ 6464/60000]
loss: 2.310286  [12864/60000]
loss: 2.299121  [19264/60000]
loss: 2.297309  [25664/60000]
loss: 2.312116  [32064/60000]
loss: 2.294134  [38464/60000]
loss: 2.312366  [44864/60000]
loss: 2.309774  [51264/60000]
loss: 2.284011  [57664/60000]
Test Error: 
 Accuracy: 12.8%, Avg loss: 2.299037 

Epoch 2
-------------------------------
loss: 2.312370  [   64/60000]
loss: 2.306483  [ 6464/60000]
loss: 2.302641  [12864/60000]
loss: 2.293882  [19264/60000]
loss: 2.294157  [25664/60000]
loss: 2.301692  [32064/60000]
loss: 2.289362  [38464/60000]
loss: 2.302789  [44864/60000]
loss: 2.303329  [51264/60000]
loss: 2.277372  [57664/60000]
Test Error: 
 Accuracy: 22.9%, Avg loss: 2.291122 

Epoch 3
-------------------------------
loss: 2.304712  [   64/60000]
loss: 2.299892  [ 6464/60000]
loss: 2.293038  [12864/60000]
loss: 2.285955  [19264/60000]
loss: 2.289919  [25664/60000]
loss: 2.286181  [32064/600