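# Let's build the LeNet architecture


LeNet is one of the earliest convolutional neural network (CNN) architectures.

We build two versions of the model, one with Sigmoid activations and one with ReLU activations, and compare their performance.


## 1. Import the required libraries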

In [1]:
import numpy as np
import matplotlib.pyplot as plt

import torch
from torchvision import datasets
import torchvision.transforms as transforms

import torch.optim as optim

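import ssl
# Workaround for certificate errors during the MNIST download.
# This disables HTTPS certificate verification, so use it only in a trusted environment.
ssl._create_default_https_context = ssl._create_unverified_context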

## 2. Check which device the deep learning model will use

In [2]:
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

print('Using PyTorch version:', torch.__version__, ' Device:', device)

Using PyTorch version: 2.0.0+cpu  Device: cpu


## 3. Download the MNIST data

 1. Split the data into training data and test data
 
 2. Split the training data into training data and validation data

In [3]:
BATCH_SIZE = 64

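# Defined for a 32x32 LeNet input, but NOT passed to the datasets below,
# so the images stay 28x28 (which is what fc1 = nn.Linear(256, 120) expects).
transform = transforms.Compose(
    [
        transforms.Resize([32, 32]), 
        transforms.ToTensor(), 
        transforms.Normalize((0.5,), (1.0,))
    ])

train_data = datasets.MNIST('./data', train=True, download=True, transform=transforms.ToTensor())
test_data = datasets.MNIST('./data', train=False, download=True, transform=transforms.ToTensor())

# Hold out 10,000 of the 60,000 training images for validation.
train, val = torch.utils.data.random_split(train_data, [50000, 10000])
# Note: train_loader iterates the full train_data (60,000 images, matching the logs further down),
# so the validation images are also seen during training. Pass `train` here to avoid that overlap.
train_loader = torch.utils.data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=False)
val_loader = torch.utils.data.DataLoader(dataset=val, batch_size=BATCH_SIZE, shuffle=False)
test_loader = torch.utils.data.DataLoader(dataset=test_data, batch_size=BATCH_SIZE, shuffle=False)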
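
The loaders above feed the full `train_data` to training, so the held-out validation images are also trained on. A minimal sketch of a split without that overlap follows, assuming a small synthetic `TensorDataset` as a stand-in for MNIST. The names `toy_data`, `train_part`, `val_part`, and the `*_demo` loaders are illustrative and do not appear in the notebook.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for MNIST: 600 fake 1x28x28 images with labels 0-9.
toy_data = TensorDataset(torch.randn(600, 1, 28, 28), torch.randint(0, 10, (600,)))

# A seeded generator makes the split reproducible.
generator = torch.Generator().manual_seed(0)
train_part, val_part = random_split(toy_data, [500, 100], generator=generator)

# Train only on the training subset; shuffle the training batches only.
train_loader_demo = DataLoader(train_part, batch_size=64, shuffle=True)
val_loader_demo = DataLoader(val_part, batch_size=64, shuffle=False)

# The two subsets share no sample indices.
overlap = set(train_part.indices) & set(val_part.indices)
print(len(train_part), len(val_part), len(overlap))
```

With this arrangement `len(train_loader_demo.dataset)` reports the size of the training subset, so accuracy denominators reflect only the samples actually trained on.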


## 4. Build Model 1 with torch.nn

   1) Implement the LeNet architecture shown in the figure below
   
   2) Use the Sigmoid activation function
   
   
![](Comparison_image_neural_networks.svg.png)

In [4]:
import torch.nn as nn

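class Model_1(nn.Module):
    """LeNet-style CNN with Sigmoid activations (expects 1x28x28 inputs)."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)     # 1x28x28 -> 6x24x24
        self.sig1 = nn.Sigmoid()
        self.pool1 = nn.AvgPool2d(2)        # -> 6x12x12
        self.conv2 = nn.Conv2d(6, 16, 5)    # -> 16x8x8
        self.sig2 = nn.Sigmoid()
        self.pool2 = nn.AvgPool2d(2)        # -> 16x4x4
        self.fc1 = nn.Linear(256, 120)      # 16 * 4 * 4 = 256 flattened features
        self.sig3 = nn.Sigmoid()
        self.fc2 = nn.Linear(120, 84)
        self.sig4 = nn.Sigmoid()
        self.fc3 = nn.Linear(84, 10)        # one output per digit class
        self.sig5 = nn.Sigmoid()            # also applied to the output, so the scores lie in (0, 1)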

    def forward(self, x):
        x = self.conv1(x)
        x = self.sig1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.sig2(x)
        x = self.pool2(x)
        x = x.view(x.shape[0], -1)
        x = self.fc1(x)
        x = self.sig3(x)
        x = self.fc2(x)
        x = self.sig4(x)
        x = self.fc3(x)
        x = self.sig5(x)
        return x
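
The `256` input features of `fc1` follow from the layer sizes. As a rough check, the sketch below traces the spatial size through the two convolution and pooling stages using the standard output-size formula. `conv_out` and `lenet_flat_features` are hypothetical helpers written for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def lenet_flat_features(input_size, channels=16):
    """Flattened feature count after conv(5) -> pool(2) -> conv(5) -> pool(2)."""
    size = conv_out(input_size, kernel=5)        # conv1
    size = conv_out(size, kernel=2, stride=2)    # pool1
    size = conv_out(size, kernel=5)              # conv2
    size = conv_out(size, kernel=2, stride=2)    # pool2
    return channels * size * size

print(lenet_flat_features(28))  # 256: matches nn.Linear(256, 120)
print(lenet_flat_features(32))  # 400: what fc1 would need for 32x32 inputs
```

This is why the models here work on the raw 28x28 MNIST images; resizing the inputs to 32x32 would require changing `fc1` to take 400 features.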

## 5. Build Model 2 with torch.nn

   Use the ReLU activation function in the LeNet model.

In [5]:
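class Model_2(nn.Module):
    """LeNet-style CNN with ReLU activations (expects 1x28x28 inputs)."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)     # 1x28x28 -> 6x24x24
        self.relu1 = nn.ReLU()
        self.pool1 = nn.AvgPool2d(2)        # -> 6x12x12
        self.conv2 = nn.Conv2d(6, 16, 5)    # -> 16x8x8
        self.relu2 = nn.ReLU()
        self.pool2 = nn.AvgPool2d(2)        # -> 16x4x4
        self.fc1 = nn.Linear(256, 120)      # 16 * 4 * 4 = 256 flattened features
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(120, 84)
        self.relu4 = nn.ReLU()
        self.fc3 = nn.Linear(84, 10)        # one output per digit class
        self.relu5 = nn.ReLU()              # also applied to the output, clamping the scores to >= 0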

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.pool2(x)
        x = x.view(x.shape[0], -1)
        x = self.fc1(x)
        x = self.relu3(x)
        x = self.fc2(x)
        x = self.relu4(x)
        x = self.fc3(x)
        x = self.relu5(x)
        return x
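
Note that `Model_2` applies ReLU to the final layer as well. The sketch below is an illustration, not part of the original notebook. It shows one consequence under cross-entropy loss: when the pre-activation score of the target class is negative, ReLU clamps it to zero and no gradient flows back to it. The tensor `scores` is a made-up example.

```python
import torch
import torch.nn.functional as F

# Made-up pre-activation scores for 3 classes; the target class (index 0) is negative.
scores = torch.tensor([[-1.0, 2.0, 0.5]], requires_grad=True)
target = torch.tensor([0])

# Same pattern as Model_2: ReLU on the output, then cross-entropy.
loss = F.cross_entropy(torch.relu(scores), target)
loss.backward()

# The gradient for the clamped target score is exactly zero,
# so this score receives no learning signal from this sample.
print(scores.grad)
```

Whether this effect explains the training behavior of `Model_2` in this notebook is not verified here; the example only shows that the mechanism exists.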

## 7. Prepare for training

1) Write a function that trains the model for one epoch

2) Write a function that computes accuracy on the test and validation data

In [6]:
def training_epoch(train_loader, network, loss_func, optimizer, epoch):
    train_losses = []
    train_correct = 0
    log_interval = 300
    
    for batch_idx, (image, label) in enumerate(train_loader):
        image, label = image.to(device), label.to(device)

        # Reset the gradients
        optimizer.zero_grad()

        # Forward pass
        outputs = network(image)
        
        
        # Compute the cross-entropy loss and record it
        loss = loss_func(outputs, label)
        train_losses.append(loss.item())

        # Count correct predictions for the training accuracy
        pred = outputs.argmax(dim=1)
        train_correct += pred.eq(label).sum().item()

        # Backward pass: compute the gradients
        loss.backward()

        # Update the weights
        optimizer.step()

        # Print training progress
        if batch_idx % log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.2f}%)]\tLoss: {:.6f}'
                  .format(epoch, batch_idx * len(label), len(train_loader.dataset),100. * batch_idx / len(train_loader),
                          loss.item()))
            
    return train_losses, train_correct

In [7]:
def test_epoch(test_loader, network, loss_func, val = False):
    correct = 0
    
    test_losses = []
    
    with torch.no_grad():
        for batch_idx, (image, label) in enumerate(test_loader):
            image, label = image.to(device), label.to(device)

            # Forward pass
            outputs = network(image)

            # Compute the cross-entropy loss
            loss = loss_func(outputs, label)
            test_losses.append(loss.item())

            # Count correct predictions in this batch
            pred = outputs.argmax(dim=1)
            correct += pred.eq(label).sum().item()

        # Overall accuracy
        test_accuracy = 100. * correct / len(test_loader.dataset)

        # Print the result
        if val:
            print('Validation set: Accuracy: {}/{} ({:.2f}%)\n'
                  .format(correct, len(test_loader.dataset), test_accuracy))
        
        else:
            print('Test set: Accuracy: {}/{} ({:.2f}%)\n'
                  .format(correct, len(test_loader.dataset), test_accuracy))
        
    return test_losses, test_accuracy
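
Both functions count correct predictions the same way: take the `argmax` over the class dimension and compare it with the labels. A tiny worked example with made-up values follows; `toy_outputs` and `toy_labels` are illustrative names.

```python
import torch

# Made-up scores for a batch of 4 samples and 3 classes.
toy_outputs = torch.tensor([[0.1, 0.7, 0.2],
                            [0.9, 0.05, 0.05],
                            [0.2, 0.3, 0.5],
                            [0.6, 0.3, 0.1]])
toy_labels = torch.tensor([1, 0, 1, 0])

pred = toy_outputs.argmax(dim=1)               # predicted class per sample
correct = pred.eq(toy_labels).sum().item()     # number of matches as a Python int
accuracy = 100.0 * correct / len(toy_labels)

print(pred.tolist(), correct, accuracy)  # [1, 0, 2, 0] 3 75.0
```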


## 8. Build the training function from the functions defined above

Train the model with the Adam optimizer.

In [8]:
def training(network, learning_rate=0.001):
    
    epochs = 15
    
    cls_loss = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(network.parameters(), lr=learning_rate)
    
    train_losses_per_epoch = []
    test_losses_per_epoch = []   # holds the per-epoch validation losses
    
    train_accuracies = []
    test_accuracies = []         # holds the per-epoch validation accuracies
    
    
    for epoch in range(epochs):
                
        # Put the model in training mode
        network.train()
        
        train_losses, train_correct = training_epoch(train_loader, network, cls_loss, optimizer, epoch)
        
        # Mean loss and accuracy for this epoch
        average_loss = np.mean(train_losses)
        train_losses_per_epoch.append(average_loss)
        
        train_accuracy = train_correct / len(train_loader.dataset) * 100
        train_accuracies.append(train_accuracy)
        
        # Print the accuracy for this epoch
        print('\nTraining set: Accuracy: {}/{} ({:.2f}%)'
              .format(train_correct, len(train_loader.dataset),100. * train_correct / len(train_loader.dataset)))

        
        ### Evaluate on the validation set during training
        
        # Put the model in evaluation mode
        network.eval()
        
        # test_epoch already disables gradient tracking internally
        test_losses, test_accuracy = test_epoch(val_loader, network, cls_loss, True)

        test_losses_per_epoch.append(np.mean(test_losses))
        test_accuracies.append(test_accuracy)
        
    # Final evaluation on the test set after training
    test_losses, test_accuracy = test_epoch(test_loader, network, cls_loss, False)
        
    return train_losses_per_epoch, test_losses_per_epoch, train_accuracies, test_accuracies
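
`training` returns four per-epoch lists, but the notebook never plots them. One possible way to visualize them with matplotlib is sketched below. `plot_history` is a hypothetical helper, and the numbers passed to it are placeholders, not results from this notebook.

```python
import matplotlib.pyplot as plt

def plot_history(train_losses, val_losses, train_accs, val_accs):
    """Plot per-epoch loss and accuracy curves side by side."""
    epochs = range(1, len(train_losses) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    ax_loss.plot(epochs, train_losses, label='train')
    ax_loss.plot(epochs, val_losses, label='validation')
    ax_loss.set_xlabel('epoch')
    ax_loss.set_ylabel('loss')
    ax_loss.legend()

    ax_acc.plot(epochs, train_accs, label='train')
    ax_acc.plot(epochs, val_accs, label='validation')
    ax_acc.set_xlabel('epoch')
    ax_acc.set_ylabel('accuracy (%)')
    ax_acc.legend()

    fig.tight_layout()
    return fig

# Placeholder values for illustration only.
fig = plot_history([2.0, 1.5, 1.2], [1.9, 1.6, 1.3], [40.0, 60.0, 70.0], [45.0, 62.0, 71.0])
```

With the notebook's own results this could be called as `plot_history(*rlt_const)`. If the accuracy entries are tensors (for example on a GPU), converting each with `float(x)` first avoids conversion errors.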


In [9]:
network = Model_1().to(device)
rlt_const = training(network)


Training set: Accuracy: 36865/60000 (61.44%)
Validation set: Accuracy: 8276/10000 (82.76%)


Training set: Accuracy: 51339/60000 (85.57%)
Validation set: Accuracy: 8484/10000 (84.84%)


Training set: Accuracy: 51528/60000 (85.88%)
Validation set: Accuracy: 8566/10000 (85.66%)


Training set: Accuracy: 52365/60000 (87.28%)
Validation set: Accuracy: 8818/10000 (88.18%)


Training set: Accuracy: 54778/60000 (91.30%)
Validation set: Accuracy: 9423/10000 (94.23%)


Training set: Accuracy: 57526/60000 (95.88%)
Validation set: Accuracy: 9591/10000 (95.91%)


Training set: Accuracy: 57992/60000 (96.65%)
Validation set: Accuracy: 9685/10000 (96.85%)


Training set: Accuracy: 58245/60000 (97.07%)
Validation set: Accuracy: 9725/10000 (97.25%)


Training set: Accuracy: 58413/60000 (97.36%)
Validation set: Accuracy: 9760/10000 (97.60%)


Training set: Accuracy: 58577/60000 (97.63%)
Validation set: Accuracy: 9794/10000 (97.94%)


Training set: Accuracy: 58701/60000 (97.83%)
Validation set: Accuracy

In [10]:
network = Model_2().to(device)
rlt_const = training(network)


Training set: Accuracy: 38610/60000 (64.35%)
Validation set: Accuracy: 6807/10000 (68.07%)


Training set: Accuracy: 41262/60000 (68.77%)
Validation set: Accuracy: 6926/10000 (69.26%)


Training set: Accuracy: 41709/60000 (69.51%)
Validation set: Accuracy: 7012/10000 (70.12%)


Training set: Accuracy: 41943/60000 (69.90%)
Validation set: Accuracy: 7038/10000 (70.38%)


Training set: Accuracy: 42056/60000 (70.09%)
Validation set: Accuracy: 7053/10000 (70.53%)


Training set: Accuracy: 42150/60000 (70.25%)
Validation set: Accuracy: 7061/10000 (70.61%)


Training set: Accuracy: 42216/60000 (70.36%)
Validation set: Accuracy: 7070/10000 (70.70%)


Training set: Accuracy: 42285/60000 (70.47%)
Validation set: Accuracy: 7070/10000 (70.70%)


Training set: Accuracy: 42327/60000 (70.54%)
Validation set: Accuracy: 7087/10000 (70.87%)


Training set: Accuracy: 42362/60000 (70.60%)
Validation set: Accuracy: 7081/10000 (70.81%)


Training set: Accuracy: 42377/60000 (70.63%)
Validation set: Accuracy

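## 9. Compare the performance of the two models

Answer) The ReLU model had a lower loss than the Sigmoid model from the first epoch. However, although its loss kept decreasing as training progressed, its accuracy improved comparatively slowly and leveled off at about 70% in the logs above. The Sigmoid model's accuracy, in contrast, rose sharply as training progressed and reached about 98%.
In this experiment the Sigmoid model performs better, and the results show that loss and accuracy are not always in a strict inverse relationship. One plausible contributor to the ReLU model's plateau, not verified here, is that `Model_2` also applies ReLU to its final layer, which clamps the scores passed to `CrossEntropyLoss` to non-negative values.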
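
One reason loss and accuracy need not move together can be worked out by hand. `Model_1` passes its outputs through Sigmoid, so every score given to `CrossEntropyLoss` lies in (0, 1). The sketch below computes the lowest per-sample loss that can be approached under that constraint with 10 classes. It is an illustrative calculation, not a measurement from the runs above.

```python
import math

num_classes = 10

# Best case under the constraint: the correct class scores ~1, all others ~0.
best_prob = math.exp(1.0) / (math.exp(1.0) + (num_classes - 1) * math.exp(0.0))
loss_floor = -math.log(best_prob)

print(round(best_prob, 3), round(loss_floor, 3))  # 0.232 1.461
```

A model with Sigmoid on its output can therefore classify a sample correctly while its loss stays near 1.46, which is one way a higher loss can coexist with higher accuracy.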