# 8. Feedforward Neural Networks with PyTorch

## 1. About Feedforward Neural Networks

### 1.1 Transitioning from Logistic Regression to Neural Networks

#### Logistic Regression Review

<img src="./images/07-01.png">

In [1]:
import torch
import torch.nn as nn

In [2]:
class LogisticRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LogisticRegressionModel, self).__init__()
        self.linear = nn.Linear(input_dim, output_dim)
    def forward(self, x):
        out = self.linear(x)
        return out

In [3]:
input_dim = 28 * 28
output_dim = 10

model = LogisticRegressionModel(input_dim, output_dim)

In [4]:
print (model)

LogisticRegressionModel(
  (linear): Linear(in_features=784, out_features=10, bias=True)
)


**Limitations of logistic regression**

- It can represent **linear** functions well.
    - $y = 2x + 3$
    - $y = x_1 + x_2$
    - $y = x_1 + 3x_2 + 4x_3$
- It cannot represent **non-linear** functions well.
    - $y = 4x_1 + 2x_2^2 + 3x_3^3$
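
One way to see why a non-linearity is needed: composing two linear (affine) functions yields another affine function, so stacking linear layers alone adds no expressive power. A minimal pure-Python sketch with scalar weights (the values of `w1`, `b1`, `w2`, `b2` are arbitrary illustrations, not taken from the text):

```python
# Two stacked scalar "linear layers": y = w2 * (w1 * x + b1) + b2
w1, b1 = 2.0, 3.0
w2, b2 = -1.5, 0.5

def two_linear_layers(x):
    hidden = w1 * x + b1        # first linear layer
    return w2 * hidden + b2     # second linear layer, no activation in between

# The same function collapsed into a single linear layer
w_combined = w2 * w1
b_combined = w2 * b1 + b2

def one_linear_layer(x):
    return w_combined * x + b_combined

# Both versions agree on every input we try
for x in [-2.0, 0.0, 1.0, 4.0]:
    assert abs(two_linear_layers(x) - one_linear_layer(x)) < 1e-12
```

Inserting a non-linear function between the two layers is what breaks this collapse.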

### 1.2 Introducing a Non-linear Function

<img src="./images/08-01.png">

### 1.3 A Closer Look at Non-linear Functions

- Function: takes a number and performs a mathematical operation on it
- Common non-linearities
    - ReLUs (Rectified Linear Units)
    - Sigmoid
    - Tanh

#### Sigmoid (Logistic)

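- $\sigma(x) = \frac{1}{1+e^{-x}}$ where $x$ are logits
- Input number $\rightarrow$ (0, 1)
    - Large negative numbers $\rightarrow$ 0
    - Large positive numbers $\rightarrow$ 1
- Cons:
    1. Activations saturate near 0 or 1, where **gradients** $\approx$ 0
        - Almost no signal reaches the weights to update them $\rightarrow$ **learning stalls**
        - Mitigation: initialize the weights carefully to keep units out of the saturated regions.
    2. Outputs are not centered around 0 (they lie between 0 and 1)
        - If a layer's outputs are always positive $\rightarrow$ the gradients of the next layer's weights all share one sign (all positive or all negative) $\rightarrow$ **inefficient gradient updates**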
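
The saturation problem can be checked numerically. The sketch below uses only the standard library; the helper names `sigmoid` and `sigmoid_grad` are chosen here for illustration, and the derivative uses the standard identity $\sigma'(x) = \sigma(x)(1 - \sigma(x))$:

```python
import math

def sigmoid(x):
    """Logistic function: 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# The gradient peaks at x = 0 and shrinks toward 0 as |x| grows
for x in [-10.0, -2.0, 0.0, 2.0, 10.0]:
    print(f"x={x:6.1f}  sigmoid={sigmoid(x):.5f}  grad={sigmoid_grad(x):.5f}")
```

The largest possible gradient is 0.25 (at $x = 0$), and at $|x| = 10$ it is already below $10^{-4}$.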

#### Tanh
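- $\tanh(x) = 2\sigma(2x) - 1$ where $\sigma$ is the sigmoid
    - It is a scaled and shifted sigmoid.
- Input number $\rightarrow$ (-1, 1)
- Cons:
    1. Activations saturate near -1 or 1, where **gradients** $\approx$ 0
        - Almost no signal reaches the weights to update them $\rightarrow$ **learning stalls**
        - **Mitigation**: initialize the weights carefully to keep units out of the saturated regions.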
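
The relation between tanh and the sigmoid can be verified numerically. This is a standard-library sketch; `tanh_via_sigmoid` and `tanh_grad` are names introduced here, and the derivative uses the identity $\tanh'(x) = 1 - \tanh^2(x)$:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    """tanh expressed as a scaled and shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    # The identity holds up to floating-point error
    assert abs(tanh_via_sigmoid(x) - math.tanh(x)) < 1e-12
    print(f"x={x:5.1f}  tanh={math.tanh(x):+.5f}  grad={tanh_grad(x):.5f}")
```

Unlike the sigmoid, the outputs are centered on 0, but the gradient still vanishes once the unit saturates.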

#### ReLUs

- $f(x) = \max(0, x)$
- Pros:
    1. Speeds up convergence $\rightarrow$ **faster training**
    2. **Computationally cheap**, whereas Sigmoid/Tanh involve exponentials.
- Cons:
    1. Many ReLU units can "die" $\rightarrow$ once a unit's **gradient = 0**, its parameters receive no updates.
        - **Mitigation**: choose the learning rate carefully.
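
A "dead" ReLU is a unit whose pre-activation is negative for every input, so its gradient is always 0. A minimal sketch, assuming a single scalar unit with hypothetical weight `w` and bias `b` (the gradient value 0 at exactly $x = 0$ is a convention chosen here):

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise (0 at x == 0 by convention here)
    return 1.0 if x > 0 else 0.0

# A unit whose bias has been pushed far negative, e.g. by one overly large update
w, b = 0.5, -10.0
inputs = [0.0, 1.0, 2.0, 3.0]

pre_activations = [w * x + b for x in inputs]   # all negative
outputs = [relu(z) for z in pre_activations]    # all 0.0
grads = [relu_grad(z) for z in pre_activations] # all 0.0: no update signal

print(outputs, grads)
```

Because every gradient is 0, gradient descent cannot move `w` or `b` back into the active region for these inputs.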

## 2. Building a Feedforward Neural Network with PyTorch

### Model A: 1 Hidden Layer Feedforward Neural Network (Sigmoid Activation)

<img src="./images/08-02.png">

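### Steps
- Step 1: Load the dataset
- Step 2: Make the dataset iterable
- Step 3: Create the model class
- Step 4: Instantiate the model class
- Step 5: Instantiate the loss class
- Step 6: Instantiate the optimizer class
- Step 7: Train the model

### Step 1: Load the MNIST Training Dataset
**Images of the digits 0 through 9**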

In [5]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable

In [6]:
train_dataset = dsets.MNIST(root = './data',
                            train = True,
                            transform = transforms.ToTensor(),
                            download = True)

test_dataset = dsets.MNIST(root = './data',
                           train = False,
                           transform = transforms.ToTensor())

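### Step 2: Make the Dataset Iterable

With 60,000 training images and a batch size of 100, one epoch is 600 iterations, so 3,000 iterations correspond to 5 epochs.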

In [7]:
60000 / 100

600.0

In [8]:
3000 / 600

5.0

In [9]:
batch_size = 100
n_iters = 3000
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
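
The epoch arithmetic in this step can be written with integer operations. The helper `epochs_for_iterations` is a hypothetical name; the sketch assumes the final partial batch, if any, still counts as one iteration, which matches `DataLoader`'s default of `drop_last=False`:

```python
import math

def epochs_for_iterations(n_iters, dataset_size, batch_size):
    """Number of whole epochs that fit into n_iters parameter updates."""
    iters_per_epoch = math.ceil(dataset_size / batch_size)
    return n_iters // iters_per_epoch

# MNIST training set: 60,000 images, batches of 100
iters_per_epoch = math.ceil(60000 / 100)
num_epochs = epochs_for_iterations(3000, 60000, 100)

print(iters_per_epoch, num_epochs)  # 600 5
```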

### Step 3: Create the Model Class

In [10]:
class FeedforwardNeuralNetModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(FeedforwardNeuralNetModel, self).__init__()
        # Linear function
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        # Non-linearity
        self.sigmoid = nn.Sigmoid()
        # Linear function (readout)
        self.fc2 = nn.Linear(hidden_dim, output_dim)
        
    def forward(self, x):
        # Linear function # LINEAR
        out = self.fc1(x)
        # Non-linearity # NON-LINEAR
        out = self.sigmoid(out)
        # Linear function (readout) # LINEAR
        out = self.fc2(out)
        return out

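### Step 4: Instantiate the Model Class

- **Input** dimension: **784**
    - Size of the image
    - 28 $\times$ 28 = 784
- **Output** dimension: **10**
    - 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
- **Hidden** dimension: **100**
    - Can be any number
    - Equivalent terms
        - Number of neurons
        - Number of non-linear activation units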

In [11]:
input_dim = 28*28
hidden_dim = 100
output_dim = 10

model = FeedforwardNeuralNetModel(input_dim, hidden_dim, output_dim)

### Step 5: Instantiate the Loss Class

- Feedforward neural network: **Cross Entropy Loss**
    - *Logistic regression*: **Cross Entropy Loss**
    - *Linear regression*: **MSE**

In [12]:
criterion = nn.CrossEntropyLoss()
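
`nn.CrossEntropyLoss` expects raw logits and class indices; it applies log-softmax internally and, by default, averages the per-example losses over the batch. That is why the model's `forward` returns the readout layer's output without a softmax. A pure-Python sketch of the per-example computation (the function name and the toy logits are illustrative assumptions):

```python
import math

def cross_entropy_from_logits(logits, target):
    """Negative log-probability of class `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Ten equal logits: the model is maximally unsure, loss = log(10)
uniform_loss = cross_entropy_from_logits([0.0] * 10, 3)

# One large logit: small loss if it is the correct class, large loss otherwise
peaked = [0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
confident_loss = cross_entropy_from_logits(peaked, 3)
wrong_loss = cross_entropy_from_logits(peaked, 0)

print(uniform_loss, confident_loss, wrong_loss)
```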

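### Step 6: Instantiate the Optimizer Class

- Simplified equation
    - $\theta = \theta - \eta \cdot \nabla_{\theta}$
        - $\theta$: parameters (our variables)
        - $\eta$: learning rate (how fast we want to learn)
        - $\nabla_{\theta}$: gradients of the loss with respect to the parameters

    - Even simpler equation
        - parameters = parameters - learning rate * gradients of the parameters
        - **On every iteration, we update the model's parameters**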
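
The update rule above can be traced by hand on a one-parameter problem. This is a pure-Python sketch of the plain rule only, not of everything `torch.optim.SGD` supports; the loss $L(\theta) = (\theta - 3)^2$ and the helper `sgd_step` are assumptions chosen for illustration:

```python
def sgd_step(params, grads, lr):
    """parameters = parameters - learning_rate * gradients"""
    return [p - lr * g for p, g in zip(params, grads)]

# Minimize L(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3)
theta = [0.0]
learning_rate = 0.1

for _ in range(50):
    grads = [2.0 * (theta[0] - 3.0)]
    theta = sgd_step(theta, grads, learning_rate)

print(theta[0])  # close to the minimizer 3.0
```

Each step shrinks the distance to the minimum by a factor of $1 - 2\eta = 0.8$, which shows how the learning rate controls the speed of convergence.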

In [13]:
learning_rate = 0.1

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

### A Closer Look at the Parameters

In [14]:
print (model.parameters())

<generator object Module.parameters at 0x113cd9e08>


In [15]:
print (len(list(model.parameters())))

4


In [16]:
# FC 1 Parameters
print (list(model.parameters())[0].size())

torch.Size([100, 784])


In [17]:
# FC 1 Bias Parameters
print (list(model.parameters())[1].size())

torch.Size([100])


In [18]:
# FC 2 Parameters
print (list(model.parameters())[2].size())

torch.Size([10, 100])


In [19]:
# FC 2 Bias Parameters
print (list(model.parameters())[3].size())

torch.Size([10])


<img src="./images/08-03.png">
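
The four sizes printed above follow from how `nn.Linear(in_features, out_features)` stores its parameters: a weight of shape `(out_features, in_features)` and a bias of shape `(out_features,)`. A small pure-Python sketch (the helper names `linear_param_shapes` and `numel` are hypothetical, defined here) reproduces them for Model A:

```python
def linear_param_shapes(in_features, out_features):
    """Shapes of the weight and bias of one fully connected layer."""
    return [(out_features, in_features), (out_features,)]

def numel(shape):
    """Number of scalar entries in a tensor of the given shape."""
    n = 1
    for d in shape:
        n *= d
    return n

# Model A: 784 -> 100 (fc1), 100 -> 10 (fc2)
shapes = []
for in_f, out_f in [(784, 100), (100, 10)]:
    shapes.extend(linear_param_shapes(in_f, out_f))

counts = [numel(s) for s in shapes]
for shape, count in zip(shapes, counts):
    print(shape, count)
```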

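### Step 7: Train the Model
- Process
    1. Convert inputs/labels to Variables
    2. Clear the gradient buffers
    3. Get the outputs for the given inputs
    4. Compute the loss
    5. Compute the gradients with respect to the parameters
    6. Update the parameters using the gradients
        - parameters = parameters - learning rate * gradients of the parameters
    7. Repeat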
In [20]:
iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Load images as Variable
        images = Variable(images.view(-1, 28*28))
        labels = Variable(labels)
        
        # Clear gradients w.r.t. parameters
        optimizer.zero_grad()
        
        # Forward pass to get output/logits
        outputs = model(images)
        
        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)
        
        # Getting gradients w.r.t parameters
        loss.backward()
        
        # Updating parameters
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            # Calculate Accuracy
            correct = 0
            total = 0
            # Iterate through test dataset
            for images, labels in test_loader:
                # Load images to a Torch Variable
                images = Variable(images.view(-1, 28*28))
                
                # Forward pass only to get logits/output
                outputs = model(images)
                
                # Get predictions from the maximum value
                _, predicted = torch.max(outputs.data, 1)
                
                # Total number of labels
                total += labels.size(0)
                
                # Total correct predictions
                correct += (predicted == labels).sum()
                
            accuracy = 100 * int(correct) / int(total)
            
            # Print Loss
            print (f'Iteration: {iter}, Loss: {loss.item()}, Accuracy: {accuracy}')              

Iteration: 500, Loss: 0.7069175243377686, Accuracy: 86.23
Iteration: 1000, Loss: 0.47253838181495667, Accuracy: 89.28
Iteration: 1500, Loss: 0.37439411878585815, Accuracy: 90.57
Iteration: 2000, Loss: 0.40091001987457275, Accuracy: 91.16
Iteration: 2500, Loss: 0.29052475094795227, Accuracy: 91.64
Iteration: 3000, Loss: 0.38746798038482666, Accuracy: 92.01
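
The accuracy check inside the loop takes, for each test image, the index of the largest logit as the predicted class and compares it with the label. A minimal pure-Python sketch of that computation (the helper names `argmax` and `accuracy_percent` and the toy logits are illustrative assumptions, not part of the notebook):

```python
def argmax(values):
    """Index of the largest entry (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy_percent(batch_logits, labels):
    """Percentage of rows whose arg-max class equals the label."""
    correct = sum(1 for logits, y in zip(batch_logits, labels)
                  if argmax(logits) == y)
    return 100 * correct / len(labels)

toy_logits = [
    [2.0, 0.1, -1.0],   # predicts class 0
    [0.3, 0.2, 1.5],    # predicts class 2
    [-0.5, 0.9, 0.1],   # predicts class 1
    [1.2, 1.1, 0.0],    # predicts class 0
]
toy_labels = [0, 2, 1, 1]  # the last prediction is wrong

print(accuracy_percent(toy_logits, toy_labels))  # 75.0
```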


### Model D: 2 Hidden Layer Feedforward Neural Network (ReLU Activation)

<img src="./images/08-02.png">

### Steps
- Step 1: Load the dataset
- Step 2: Make the dataset iterable
- **Step 3: Create the model class**
- Step 4: Instantiate the model class
- Step 5: Instantiate the loss class
- Step 6: Instantiate the optimizer class
- Step 7: Train the model

In [24]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable

'''
STEP 1: LOADING DATASET
'''

train_dataset = dsets.MNIST(root='./data',
                            train=True,
                            transform= transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root = './data',
                           train=False,
                           transform=transforms.ToTensor())

'''
STEP 2: MAKING DATASET ITERABLE
'''

batch_size = 100
n_iters = 3000
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)

'''
STEP 3: CREATE MODEL CLASS
'''

class FeedforwardNeuralNetModel(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, output_dim):
        
        super(FeedforwardNeuralNetModel, self).__init__()
        
        # Linear function 1: 784 --> 100
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        # Non-Linearity 1
        self.relu1 = nn.ReLU()
        
        # Linear function 2: 100 --> 100
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        # Non-Linearity 2
        self.relu2 = nn.ReLU()
        
        # Linear function 3 (readout): 100 --> 10
        self.fc3 = nn.Linear(hidden_dim, output_dim)
        
    def forward(self, x):
        # Linear function 1
        out = self.fc1(x)
        # Non-linearity 1
        out = self.relu1(out)
        
        # Linear function2
        out = self.fc2(out)
        # Non-Linearity 2
        out = self.relu2(out)
        
        # Linear function 3 (readout)
        out = self.fc3(out)
        return out
    
'''
STEP 4: INSTANTIATE MODEL CLASS
'''
input_dim = 28 * 28
hidden_dim = 100
output_dim = 10

model = FeedforwardNeuralNetModel(input_dim, hidden_dim, output_dim)

'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()

'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.1

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

'''
STEP 7: TRAIN THE MODEL
'''

iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Load images as Variable
        images = Variable(images.view(-1, 28*28))
        labels = Variable(labels)
        
        # Clear gradients w.r.t parameters
        optimizer.zero_grad()
        
        # Forward pass to get output/logits
        outputs = model(images)
        
        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)
        
        # Getting gradients w.r.t. parameters
        loss.backward()
        
        # Updating parameters
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            # Calculate Accuracy
            correct = 0
            total = 0
            for images, labels in test_loader:
                # Load images to a Torch Variable
                images = Variable(images.view(-1, 28*28))
                
                # Forward pass only to get logits/output
                outputs = model(images)
                
                # Get prediction from the maximum value
                _, predicted = torch.max(outputs.data, 1)
                
                # Total number of labels
                total += labels.size(0)
                
                # Total correct predictions
                correct += (predicted == labels).sum()
                
            accuracy = 100 * int(correct) / int(total)
            
            # Print Loss
            print (f'Iteration: {iter}, Loss: {loss.item()}, Accuracy: {accuracy}')

Iteration: 500, Loss: 0.30187758803367615, Accuracy: 90.4
Iteration: 1000, Loss: 0.39137744903564453, Accuracy: 93.48
Iteration: 1500, Loss: 0.22136268019676208, Accuracy: 94.68
Iteration: 2000, Loss: 0.1438414305448532, Accuracy: 95.6
Iteration: 2500, Loss: 0.1414695680141449, Accuracy: 96.16
Iteration: 3000, Loss: 0.02799205295741558, Accuracy: 96.49


### Model E: 3 Hidden Layer Feedforward Neural Network (ReLU Activation)

<img src="./images/08-04.png">

### Steps
- Step 1: Load the dataset
- Step 2: Make the dataset iterable
- **Step 3: Create the model class**
- Step 4: Instantiate the model class
- Step 5: Instantiate the loss class
- Step 6: Instantiate the optimizer class
- Step 7: Train the model

In [28]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable

'''
STEP 1: LOADING DATASET
'''

train_dataset = dsets.MNIST(root='./data',
                            train=True,
                            transform= transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root = './data',
                           train=False,
                           transform=transforms.ToTensor())

'''
STEP 2: MAKING DATASET ITERABLE
'''

batch_size = 100
n_iters = 3000
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)


'''
STEP 3: CREATE MODEL CLASS
'''
class FeedforwardNeuralNetModel(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, output_dim):
        
        super(FeedforwardNeuralNetModel, self).__init__()
        # Linear function 1: 784 --> 100
        self.fc1 = nn.Linear(input_dim, hidden_dim)
        # Non-Linearity 1
        self.relu1 = nn.ReLU()
        
        # Linear function 2: 100 --> 100
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        # Non-Linearity 2
        self.relu2 = nn.ReLU()
        
        # Linear function 3: 100 --> 100
        self.fc3 = nn.Linear(hidden_dim, hidden_dim)
        # Non-Linearity 3
        self.relu3 = nn.ReLU()
        
        # Linear function 4 (readout): 100 --> 10
        self.fc4 = nn.Linear(hidden_dim, output_dim)
        
    def forward(self, x):
        # Linear function 1
        out = self.fc1(x)
        # Non-linearity 1
        out = self.relu1(out)
        
        # Linear function 2
        out = self.fc2(out)
        # Non-linearity 2
        out = self.relu2(out)
        
        # Linear function 3
        out = self.fc3(out)
        # Non-linearity 3
        out = self.relu3(out)
        
        # Linear function 4 (readout)
        out = self.fc4(out)
        return out
    
'''
STEP 4: INSTANTIATE MODEL CLASS
'''
input_dim = 28 * 28
hidden_dim = 100
output_dim = 10

model = FeedforwardNeuralNetModel(input_dim, hidden_dim, output_dim)

'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()

'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.1

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

'''
STEP 7: TRAIN THE MODEL
'''

iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Load images as Variable
        images = Variable(images.view(-1, 28*28))
        labels = Variable(labels)
        
        # Clear gradients w.r.t parameters
        optimizer.zero_grad()
        
        # Forward pass to get output/logits
        outputs = model(images)
        
        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)
        
        # Getting gradients w.r.t. parameters
        loss.backward()
        
        # Updating parameters
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            # Calculate Accuracy
            correct = 0
            total = 0
            for images, labels in test_loader:
                # Load images to a Torch Variable
                images = Variable(images.view(-1, 28*28))
                
                # Forward pass only to get logits/output
                outputs = model(images)
                
                # Get prediction from the maximum value
                _, predicted = torch.max(outputs.data, 1)
                
                # Total number of labels
                total += labels.size(0)
                
                # Total correct predictions
                correct += (predicted == labels).sum()
                
            accuracy = 100 * int(correct) / int(total)
            
            # Print Loss
            print (f'Iteration: {iter}, Loss: {loss.item()}, Accuracy: {accuracy}')

Iteration: 500, Loss: 0.25381138920783997, Accuracy: 89.61
Iteration: 1000, Loss: 0.36012136936187744, Accuracy: 93.02
Iteration: 1500, Loss: 0.0649011954665184, Accuracy: 94.99
Iteration: 2000, Loss: 0.08892955631017685, Accuracy: 95.99
Iteration: 2500, Loss: 0.11272620409727097, Accuracy: 96.56
Iteration: 3000, Loss: 0.14828279614448547, Accuracy: 96.71


### Deep Learning
- Two ways to expand a neural network
    - More non-linear activation units (neurons) (100 => 200)
    - More hidden layers (3 => 5)
- Cons
    - Needs a larger dataset
        - Curse of dimensionality
    - Does not guarantee higher accuracy
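
To get a feel for what expanding capacity costs, one can count trainable parameters for a few layouts. The helper `mlp_param_count` below is a hypothetical illustration (not a PyTorch function); it assumes fully connected layers with biases, as in the models above:

```python
def mlp_param_count(input_dim, hidden_dims, output_dim):
    """Total weights and biases of a fully connected feedforward net."""
    dims = [input_dim] + list(hidden_dims) + [output_dim]
    # Each layer has in*out weights plus out biases
    return sum(i * o + o for i, o in zip(dims[:-1], dims[1:]))

layouts = {
    "1 hidden layer, 100 units": [100],
    "1 hidden layer, 200 units": [200],
    "3 hidden layers, 100 units": [100, 100, 100],
    "5 hidden layers, 100 units": [100] * 5,
}

for name, hidden in layouts.items():
    print(f"{name}: {mlp_param_count(784, hidden, 10)} parameters")
```

For MNIST-sized inputs, widening the first hidden layer roughly doubles the parameter count, while each extra 100-unit hidden layer adds only 10,100 parameters.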

## 3. Building a Feedforward Neural Network with PyTorch (GPU)

<img src="./images/08-04.png">

GPU: two things must be moved to the GPU
- model
- variables

### Steps
- Step 1: Load the dataset
- Step 2: Make the dataset iterable
- Step 3: Create the model class
- **Step 4: Instantiate the model class**
- Step 5: Instantiate the loss class
- Step 6: Instantiate the optimizer class
- **Step 7: Train the model**

In [30]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable

'''
STEP 1: LOADING DATASET
'''

train_dataset = dsets.MNIST(root='./data', 
                            train=True, 
                            transform=transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root='./data', 
                           train=False, 
                           transform=transforms.ToTensor())

'''
STEP 2: MAKING DATASET ITERABLE
'''

batch_size = 100
n_iters = 3000
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

'''
STEP 3: CREATE MODEL CLASS
'''
class FeedforwardNeuralNetModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(FeedforwardNeuralNetModel, self).__init__()
        # Linear function 1: 784 --> 100
        self.fc1 = nn.Linear(input_dim, hidden_dim) 
        # Non-linearity 1
        self.relu1 = nn.ReLU()
        
        # Linear function 2: 100 --> 100
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        # Non-linearity 2
        self.relu2 = nn.ReLU()
        
        # Linear function 3: 100 --> 100
        self.fc3 = nn.Linear(hidden_dim, hidden_dim)
        # Non-linearity 3
        self.relu3 = nn.ReLU()
        
        # Linear function 4 (readout): 100 --> 10
        self.fc4 = nn.Linear(hidden_dim, output_dim)  
    
    def forward(self, x):
        # Linear function 1
        out = self.fc1(x)
        # Non-linearity 1
        out = self.relu1(out)
        
        # Linear function 2
        out = self.fc2(out)
        # Non-linearity 2
        out = self.relu2(out)
        
        # Linear function 3
        out = self.fc3(out)
        # Non-linearity 3
        out = self.relu3(out)
        
        # Linear function 4 (readout)
        out = self.fc4(out)
        return out
'''
STEP 4: INSTANTIATE MODEL CLASS
'''
input_dim = 28*28
hidden_dim = 100
output_dim = 10

model = FeedforwardNeuralNetModel(input_dim, hidden_dim, output_dim)

#######################
#  USE GPU FOR MODEL  #
#######################

if torch.cuda.is_available():
    model.cuda()

'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()


'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.1

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

'''
STEP 7: TRAIN THE MODEL
'''
iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        
        #######################
        #  USE GPU FOR MODEL  #
        #######################
        if torch.cuda.is_available():
            images = Variable(images.view(-1, 28*28).cuda())
            labels = Variable(labels.cuda())
        else:
            images = Variable(images.view(-1, 28*28))
            labels = Variable(labels)
        
        # Clear gradients w.r.t. parameters
        optimizer.zero_grad()
        
        # Forward pass to get output/logits
        outputs = model(images)
        
        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)
        
        # Getting gradients w.r.t. parameters
        loss.backward()
        
        # Updating parameters
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Iterate through test dataset
            for images, labels in test_loader:
                #######################
                #  USE GPU FOR MODEL  #
                #######################
                if torch.cuda.is_available():
                    images = Variable(images.view(-1, 28*28).cuda())
                else:
                    images = Variable(images.view(-1, 28*28))
                
                # Forward pass only to get logits/output
                outputs = model(images)
                
                # Get predictions from the maximum value
                _, predicted = torch.max(outputs.data, 1)
                
                # Total number of labels
                total += labels.size(0)
                
                #######################
                #  USE GPU FOR MODEL  #
                #######################
                # Total correct predictions
                correct += (predicted.cpu() == labels.cpu()).sum()
            
            accuracy = 100 * int(correct) / int(total)
            
            # Print Loss
            print (f'Iteration: {iter}, Loss: {loss.item()}, Accuracy: {accuracy}')

Iteration: 500, Loss: 0.43487799167633057, Accuracy: 90.44
Iteration: 1000, Loss: 0.31847378611564636, Accuracy: 93.13
Iteration: 1500, Loss: 0.0910906195640564, Accuracy: 95.6
Iteration: 2000, Loss: 0.10492780059576035, Accuracy: 96.23
Iteration: 2500, Loss: 0.08852627128362656, Accuracy: 96.74
Iteration: 3000, Loss: 0.12088696658611298, Accuracy: 96.78


## Summary

- **Limitation of logistic regression**: representing non-linear functions
    - It cannot represent **non-linear** functions well
        - $y = 4x_1 + 2x_2^2 + 3x_3^3$
        - $y = x_1x_2$
- To build a neural network, we introduced a **non-linearity** into logistic regression.
- Types of non-linearity
    - Sigmoid
    - Tanh
    - ReLU
- Feedforward neural network **models**
    - Model A: 1 hidden layer (**sigmoid** activation)
    - Model B: 1 hidden layer (**tanh** activation)
    - Model C: 1 hidden layer (**ReLU** activation)
    - Model D: **2 hidden** layers (ReLU activation)
    - Model E: **3 hidden** layers (ReLU activation)
- Model variations in **code**
    - Only Step 3 needs to change
- Ways to expand a model's **capacity**
    - More non-linear activation units (**neurons**)
    - More **hidden layers**
- **Drawbacks** of expanding capacity
    - Needs more **data**
    - Does not guarantee better **accuracy**
- **GPU** code
    - Two things must be moved to the GPU
        - **model**
        - **variables**
    - Only **Step 4 & Step 7** need to change
- **7 Step** model building recap
    - Step 1: Load the dataset
    - Step 2: Make the dataset iterable
    - Step 3: Create the model class
    - Step 4: Instantiate the model class
    - Step 5: Instantiate the loss class
    - Step 6: Instantiate the optimizer class
    - Step 7: Train the model