![nn](img/pytorch_02.png)

## Lab Objectives

- Build and train a neural network model.
- Tune the model to reach the desired performance.

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

from torch.utils.data import DataLoader

import torchvision
import torchvision.transforms as transforms

In [2]:
# Select which GPU to use
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # GPU index for the first course session

![nn](img/pytorch_08.png)

In [3]:
# MNIST dataset 
train_dataset = torchvision.datasets.MNIST(root='datasets/', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = torchvision.datasets.MNIST(root='datasets/', train=False, transform=transforms.ToTensor())

# Data loader
# mini batch size
train_loader = DataLoader(dataset=train_dataset, batch_size=128, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=128, shuffle=False)
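
To make the mini-batch setup above concrete, here is a minimal sketch of what a `DataLoader` yields. It uses a small synthetic `TensorDataset` as a stand-in for MNIST (an assumption, so nothing has to be downloaded); the names `fake_images` and `fake_labels` are hypothetical.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Steps per epoch for the real training set: 60,000 images, batch size 128.
steps_per_epoch = math.ceil(60000 / 128)
print(steps_per_epoch)  # 469, the step total shown in the training log

# Synthetic stand-in (assumption): 1,000 samples with the same per-sample
# shape that transforms.ToTensor() produces for MNIST, i.e. 1 x 28 x 28.
fake_images = torch.zeros(1000, 1, 28, 28)
fake_labels = torch.zeros(1000, dtype=torch.long)
loader = DataLoader(TensorDataset(fake_images, fake_labels),
                    batch_size=128, shuffle=True)

# Each iteration yields one mini batch of images and labels.
images, labels = next(iter(loader))
print(images.shape)  # torch.Size([128, 1, 28, 28])
print(labels.shape)  # torch.Size([128])

# The last batch is smaller when the dataset size is not a multiple of 128.
batch_sizes = [len(y) for _, y in loader]
print(batch_sizes[-1])  # 104
```

Because the final batch can be smaller than `batch_size`, the training loop reshapes with `-1` in the batch dimension rather than a fixed 128.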

![nn](img/pytorch_07.png)

In [4]:
# Declare the model class
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size) 
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, output_size)
        self.sigmoid = nn.Sigmoid()
    
    def forward(self, x):
        out = self.fc1(x)
        out = self.sigmoid(out)
        out = self.fc2(out)
        out = self.sigmoid(out)
        out = self.fc3(out)
        return out

In [5]:
# Create a model instance
model = NeuralNet(784, 20, 10)  # calls __init__(784, 20, 10)
# input dim: 784 / hidden dim: 20 / output dim: 10

In [6]:
model

NeuralNet(
  (fc1): Linear(in_features=784, out_features=20, bias=True)
  (fc2): Linear(in_features=20, out_features=20, bias=True)
  (fc3): Linear(in_features=20, out_features=10, bias=True)
  (sigmoid): Sigmoid()
)
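
As a rough check on the printed structure, the sketch below counts the trainable parameters. It rebuilds the same layer sizes with `nn.Sequential` only so the block runs on its own; that substitution is an assumption, not the notebook's class.

```python
import torch.nn as nn

# Same layer sizes as the model printed above (784 -> 20 -> 20 -> 10).
layers = nn.Sequential(
    nn.Linear(784, 20),
    nn.Sigmoid(),
    nn.Linear(20, 20),
    nn.Sigmoid(),
    nn.Linear(20, 10),
)

# Each Linear layer holds in_features * out_features weights plus out_features biases.
counts = {name: p.numel() for name, p in layers.named_parameters()}
total = sum(counts.values())

for name, n in counts.items():
    print(name, n)
print('total:', total)  # 15700 + 420 + 210 = 16330
```

Almost all of the parameters sit in the first layer, because the input dimension (784) is much larger than the hidden dimension (20).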

In [7]:
# Move the model to the GPU
model = model.to('cuda')
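
The cell above assumes a CUDA device is present. If the notebook might also run on a machine without a GPU, one common pattern is to pick the device at runtime. This is a sketch of that alternative, not what the notebook itself does; the small `nn.Linear` layer is a hypothetical stand-in for the model.

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no CUDA device is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Stand-in layer (assumption) used only to show the .to(device) pattern.
layer = nn.Linear(784, 10).to(device)

# Inputs must live on the same device as the parameters.
x = torch.zeros(2, 784, device=device)
out = layer(x)
print(out.shape, out.device.type)
```

With this pattern, every `.to('cuda')` call in the training and test loops would become `.to(device)`.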

In [8]:
# Criterion for judging how well the model is learning
loss_fn = nn.CrossEntropyLoss()
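
`nn.CrossEntropyLoss` expects raw scores (logits), which is why the model's `forward` returns the output of the last linear layer without a softmax. The sketch below checks this on random logits (made-up values, for illustration only) and also shows the loss of a classifier that assigns equal probability to all 10 classes.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # made-up scores: 4 samples, 10 classes
targets = torch.tensor([3, 0, 7, 9])   # made-up class labels

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, targets)

# The same value computed in two explicit steps: log-softmax, then negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss, manual))  # True

# Equal scores for every class give a loss of ln(10).
uniform_loss = loss_fn(torch.zeros(4, 10), targets)
print(uniform_loss.item(), math.log(10))  # both about 2.3026
```

A loss near 2.30 therefore means the network is still close to guessing uniformly among the 10 digits, which matches the values at the start of the training log.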

In [9]:
optimizer = torch.optim.SGD(model.parameters(), lr=0.05) 
# torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
# torch.optim.Adam(model.parameters(), lr=0.05)
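
For reference, plain SGD without momentum moves each parameter by `lr` times its gradient. The toy example below verifies this on a two-element tensor; the tensor `w` and the quadratic loss are hypothetical and exist only for this illustration.

```python
import torch

# Toy parameter and loss (assumptions for illustration): loss = sum(w^2), so grad = 2w.
w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.05)

loss = (w ** 2).sum()
optimizer.zero_grad()
loss.backward()

# Update rule for plain SGD: w_new = w - lr * grad
expected = (w.detach() - 0.05 * w.grad).clone()

optimizer.step()
print(w.detach())  # tensor([ 0.9000, -1.8000])
print(torch.allclose(w.detach(), expected))  # True
```

The commented-out alternatives in the cell above (momentum, Adam) change how the step is computed from the gradient, while the `zero_grad`, `backward`, `step` sequence stays the same.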

![nn](img/pytorch_09.gif)

In [10]:
# Train the model
total_step = len(train_loader)

for epoch in range(10):
    for i, (images, labels) in enumerate(train_loader):  # mini batch for loop
        # Flatten each image to a 784-dim vector and move the batch to the GPU
        images = images.reshape(-1, 28*28).to('cuda')
        labels = labels.to('cuda')
        
        # Forward
        outputs = model(images)  # calls forward(images)
        loss = loss_fn(outputs, labels)  # (predictions, ground-truth labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()  # compute gradients via autograd
        optimizer.step()  # update the parameters that have requires_grad=True
        
        if (i+1) % 100 == 0:
            print ('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}' 
                   .format(epoch+1, 10, i+1, total_step, loss.item()))

Epoch [1/10], Step [100/469], Loss: 2.2968
Epoch [1/10], Step [200/469], Loss: 2.2902
Epoch [1/10], Step [300/469], Loss: 2.2916
Epoch [1/10], Step [400/469], Loss: 2.2966
Epoch [2/10], Step [100/469], Loss: 2.2727
Epoch [2/10], Step [200/469], Loss: 2.2706
Epoch [2/10], Step [300/469], Loss: 2.2578
Epoch [2/10], Step [400/469], Loss: 2.2406
Epoch [3/10], Step [100/469], Loss: 2.1911
Epoch [3/10], Step [200/469], Loss: 2.1588
Epoch [3/10], Step [300/469], Loss: 2.0711
Epoch [3/10], Step [400/469], Loss: 2.0070
Epoch [4/10], Step [100/469], Loss: 1.8577
Epoch [4/10], Step [200/469], Loss: 1.7323
Epoch [4/10], Step [300/469], Loss: 1.6590
Epoch [4/10], Step [400/469], Loss: 1.4602
Epoch [5/10], Step [100/469], Loss: 1.4875
Epoch [5/10], Step [200/469], Loss: 1.4059
Epoch [5/10], Step [300/469], Loss: 1.2327
Epoch [5/10], Step [400/469], Loss: 1.1442
Epoch [6/10], Step [100/469], Loss: 1.0709
Epoch [6/10], Step [200/469], Loss: 1.0047
Epoch [6/10], Step [300/469], Loss: 1.0573
Epoch [6/10

In [11]:
# Test the model
# In test phase, we don't need to compute gradients (for memory efficiency)
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28).to('cuda')
        labels = labels.to('cuda')
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)  # classification model: the top-1 label is the prediction
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))

Accuracy of the network on the 10000 test images: 83.44 %
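
The accuracy computation in the test loop relies on `torch.max(..., 1)` returning both the largest score per row and its index. A small sketch with made-up scores for two samples and three classes:

```python
import torch

# Made-up scores (assumption): 2 samples, 3 classes.
outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.2, 0.3]])
labels = torch.tensor([1, 2])

# torch.max along dim 1 returns (max values, indices of the max values).
values, predicted = torch.max(outputs, 1)
print(predicted)  # tensor([1, 0])

# torch.argmax returns only the indices, which is all the accuracy needs.
print(torch.equal(predicted, torch.argmax(outputs, dim=1)))  # True

correct = (predicted == labels).sum().item()
accuracy = 100 * correct / labels.size(0)
print(accuracy)  # 50.0
```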


### Quiz

#### Modify the code below to raise the MNIST classification test accuracy of the fully connected neural network to 95% or higher. There is more than one correct answer, and the changes required may be large or small.

In [12]:
# MNIST dataset 
train_dataset = torchvision.datasets.MNIST(root='datasets/', train=True, transform=transforms.ToTensor(), download=True)
test_dataset = torchvision.datasets.MNIST(root='datasets/', train=False, transform=transforms.ToTensor())

# Data loader
train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=32, shuffle=False)

In [13]:
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size) 
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, hidden_size)
        self.fc4 = nn.Linear(hidden_size, output_size)
    
    def forward(self, x):
        out = self.fc1(x)
        out = F.relu(out)
        out = self.fc2(out)
        out = F.relu(out)
        out = self.fc3(out)
        out = F.relu(out)
        out = self.fc4(out)
        return out

In [14]:
model = NeuralNet(784, 50, 10)
model = model.to('cuda')

In [15]:
loss_fn = nn.CrossEntropyLoss()

In [16]:
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

In [17]:
# Train the model
total_step = len(train_loader)
for epoch in range(10):
    for i, (images, labels) in enumerate(train_loader):  
        # Move tensors to the configured device
        images = images.reshape(-1, 28*28).to('cuda')
        labels = labels.to('cuda')
        
        # Forward pass
        outputs = model(images)
        loss = loss_fn(outputs, labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        if (i+1) % 100 == 0:
            print ('Epoch [{}], Step [{}/{}], Loss: {:.4f}' 
                   .format(epoch+1, i+1, total_step, loss.item()))

Epoch [1], Step [100/1875], Loss: 0.2540
Epoch [1], Step [200/1875], Loss: 0.3388
Epoch [1], Step [300/1875], Loss: 0.2909
Epoch [1], Step [400/1875], Loss: 0.3307
Epoch [1], Step [500/1875], Loss: 0.3641
Epoch [1], Step [600/1875], Loss: 0.3738
Epoch [1], Step [700/1875], Loss: 0.2374
Epoch [1], Step [800/1875], Loss: 0.3779
Epoch [1], Step [900/1875], Loss: 0.1987
Epoch [1], Step [1000/1875], Loss: 0.2675
Epoch [1], Step [1100/1875], Loss: 0.2308
Epoch [1], Step [1200/1875], Loss: 0.0732
Epoch [1], Step [1300/1875], Loss: 0.0305
Epoch [1], Step [1400/1875], Loss: 0.0852
Epoch [1], Step [1500/1875], Loss: 0.1668
Epoch [1], Step [1600/1875], Loss: 0.2521
Epoch [1], Step [1700/1875], Loss: 0.1314
Epoch [1], Step [1800/1875], Loss: 0.1734
Epoch [2], Step [100/1875], Loss: 0.0724
Epoch [2], Step [200/1875], Loss: 0.1043
Epoch [2], Step [300/1875], Loss: 0.0577
Epoch [2], Step [400/1875], Loss: 0.0516
Epoch [2], Step [500/1875], Loss: 0.2164
Epoch [2], Step [600/1875], Loss: 0.2721
Epoch [

In [18]:
# Test the model
# In test phase, we don't need to compute gradients (for memory efficiency)
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.reshape(-1, 28*28).to('cuda')
        labels = labels.to('cuda')
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: {} %'.format(100 * correct / total))

Accuracy of the network on the 10000 test images: 96.94 %


### Think About It

#### Besides the methods used in the previous quiz, what else could you try?
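
One possible direction, shown here only as a sketch: add batch normalization and dropout to the fully connected model, apply weight decay in the optimizer, and decay the learning rate with a scheduler. The class name `RegularizedNet` and all hyperparameter values are assumptions chosen for illustration; they have not been tuned, and no accuracy is claimed for them.

```python
import torch
import torch.nn as nn


class RegularizedNet(nn.Module):
    """Hypothetical variant of the quiz model with BatchNorm and Dropout."""

    def __init__(self, input_size=784, hidden_size=50, output_size=10, p_drop=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.BatchNorm1d(hidden_size),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden_size, hidden_size),
            nn.BatchNorm1d(hidden_size),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden_size, output_size),
        )

    def forward(self, x):
        return self.net(x)


model = RegularizedNet()
# weight_decay adds L2 regularization; the value here is an untuned assumption.
optimizer = torch.optim.Adam(model.parameters(), lr=0.005, weight_decay=1e-4)
# Halve the learning rate every 3 epochs; call scheduler.step() once per epoch
# after that epoch's optimizer updates.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.5)

x = torch.randn(8, 784)  # random stand-in for a flattened batch

model.train()            # dropout active, batch statistics used
train_out = model(x)

model.eval()             # dropout off, running statistics used
with torch.no_grad():
    eval_a = model(x)
    eval_b = model(x)

print(train_out.shape)               # torch.Size([8, 10])
print(torch.equal(eval_a, eval_b))   # True: eval mode is deterministic
```

With dropout or batch normalization in the model, the test loop needs `model.eval()` before evaluation and `model.train()` before training resumes, which the original loops do not require.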