## Assignment 1
Implement the ReLU activation function and its derivative.
- Hint: the np.maximum function makes this easy.
- Any other approach is fine too.


In [1]:
import numpy as np

def relu(x):
  x = np.maximum(0, x)
  return x

In [3]:
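def d_relu(x):
  # Return a new array instead of overwriting the caller's input in place.
  # The derivative at x == 0 is set to 0 by convention.
  return np.where(x > 0, 1.0, 0.0)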
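
One way to sanity-check the derivative is to compare it against a central finite difference of relu. The sketch below redefines both functions so it runs on its own; the sample points and the step size eps are arbitrary choices for illustration, and the points avoid x = 0, where ReLU is not differentiable.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def d_relu(x):
    # Non-mutating derivative; the value at x == 0 is 0 by convention.
    return np.where(x > 0, 1.0, 0.0)

x = np.array([-2.0, -0.5, 0.5, 3.0])
eps = 1e-6  # illustrative step size

# Central finite difference approximation of the slope at each point
numeric = (relu(x + eps) - relu(x - eps)) / (2 * eps)

print(d_relu(x))                        # analytic derivative
print(np.allclose(numeric, d_relu(x)))  # agreement with the numeric slope
print(x)                                # the input array is left unchanged
```

Because d_relu builds a new array, x can still be used afterwards, which matters when the same pre-activation values are reused during backpropagation.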

## Assignment 2
Referring to the "MLP implementation with Numpy library using MNIST dataset" code in the Deep Learning Basic code file,
complete the backward_pass function for a three-layer MLP.
- Hint: the example in the code file is a two-layer MLP.


In [4]:
def backward_pass(x, y_true, params):
  # d_sigmoid is assumed to be defined as in the reference code file.
  m = x.shape[1]  # number of samples in the batch

  # Gradient at the output pre-activation S3
  dS3 = params["A3"] - y_true

  grads = {}

  grads["dW3"] = np.dot(dS3, params["A2"].T) / m
  grads["db3"] = np.sum(dS3, axis=1, keepdims=True) / m

  # Propagate back through the second hidden layer
  dA2 = np.dot(params["W3"].T, dS3)
  dS2 = dA2 * d_sigmoid(params["S2"])

  grads["dW2"] = np.dot(dS2, params["A1"].T) / m
  grads["db2"] = np.sum(dS2, axis=1, keepdims=True) / m

  # Propagate back through the first hidden layer
  dA1 = np.dot(params["W2"].T, dS2)
  dS1 = dA1 * d_sigmoid(params["S1"])

  grads["dW1"] = np.dot(dS1, x.T) / m
  grads["db1"] = np.sum(dS1, axis=1, keepdims=True) / m

  return grads
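
The chain of gradients above can be checked numerically. The following is a minimal sketch, not the reference implementation: it assumes samples are stored as columns of x, sigmoid hidden activations, a softmax output with cross-entropy loss, and the cache keys S1..A3 used by backward_pass. The helpers forward_pass and loss_fn, the layer sizes, and the random data are all hypothetical and exist only for this check.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d_sigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))  # shift for stability
    return e / e.sum(axis=0, keepdims=True)

def forward_pass(x, params):
    # Hypothetical forward pass; stores the values backward_pass reads.
    params["S1"] = params["W1"] @ x + params["b1"]
    params["A1"] = sigmoid(params["S1"])
    params["S2"] = params["W2"] @ params["A1"] + params["b2"]
    params["A2"] = sigmoid(params["S2"])
    params["S3"] = params["W3"] @ params["A2"] + params["b3"]
    params["A3"] = softmax(params["S3"])
    return params["A3"]

def loss_fn(y_true, y_pred):
    # Mean cross-entropy over the batch (columns are samples)
    return -np.sum(y_true * np.log(y_pred)) / y_true.shape[1]

rng = np.random.default_rng(0)
m = 5                                            # toy batch size
x = rng.normal(size=(4, m))                      # 4 input features
y = np.eye(3)[:, rng.integers(0, 3, size=m)]     # one-hot columns, 3 classes
params = {
    "W1": rng.normal(size=(6, 4)), "b1": np.zeros((6, 1)),
    "W2": rng.normal(size=(5, 6)), "b2": np.zeros((5, 1)),
    "W3": rng.normal(size=(3, 5)), "b3": np.zeros((3, 1)),
}

# Analytic gradient for W1, following the same chain as backward_pass
forward_pass(x, params)
dS3 = params["A3"] - y
dS2 = (params["W3"].T @ dS3) * d_sigmoid(params["S2"])
dS1 = (params["W2"].T @ dS2) * d_sigmoid(params["S1"])
dW1 = dS1 @ x.T / m

def loss_at(delta):
    # Loss after nudging a single weight W1[0, 0] by delta
    p = {k: v.copy() for k, v in params.items()}
    p["W1"][0, 0] += delta
    return loss_fn(y, forward_pass(x, p))

eps = 1e-6
numeric = (loss_at(eps) - loss_at(-eps)) / (2 * eps)
print(np.isclose(numeric, dW1[0, 0], atol=1e-6))
```

Checking an entry of dW1 exercises the whole chain, so an extra division by the batch size or a misnamed dS3 anywhere upstream would show up as a mismatch here.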

## Assignment 3
Referring to the "MLP implementation with Pytorch library using MNIST dataset" code in the Deep Learning Basic code file,
build a three-layer MLP and train it.

Set the hyperparameters as follows:

- epochs : 100
- hidden sizes : 128, 64 (two hidden layers)
- learning_rate : 0.5

In [5]:
from torchvision import transforms, datasets
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

In [6]:
# Convert images to tensors
transform = transforms.Compose([
    transforms.ToTensor()
])

In [None]:
trainset = datasets.MNIST(
    root      = './.data/', 
    train     = True,
    download  = True,
    transform = transform
)
testset = datasets.MNIST(
    root      = './.data/', 
    train     = False,
    download  = True,
    transform = transform
)

In [8]:
BATCH_SIZE = 512
# Create a DataLoader for each of the train set and the test set.
# Pass shuffle=True to shuffle the data.
train_loader = DataLoader(trainset, batch_size=BATCH_SIZE, shuffle=True)
test_loader =  DataLoader(testset, batch_size=BATCH_SIZE, shuffle=True)

In [14]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer1 = nn.Linear(784,128)
        self.layer2 = nn.Linear(128,64)
        self.layer3 = nn.Linear(64,10)
        # ReLU has no parameters, so one instance is shared by both hidden layers
        self.relu = nn.ReLU()
        
    def forward(self, x):
        x = x.view(-1, 784)
        out = self.layer1(x)
        out = self.relu(out)
        out = self.layer2(out)
        out = self.relu(out)
        out = self.layer3(out)

        return out

In [15]:
model = Net()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.5)

In [16]:
def train(model, train_loader, optimizer):
    model.train()
    # List that collects the loss of each batch
    batch_losses = []

    for data, target in train_loader:
        # Reset the gradients held by the optimizer
        optimizer.zero_grad()

        # Forward pass: compute predictions
        output = model(data)
        # Cross-entropy loss against the ground-truth labels.
        # Store it as a plain float so the computation graph is not kept alive.
        loss = criterion(output, target)
        batch_losses.append(loss.item())

        # Backward pass: compute gradients
        loss.backward()

        # Update the weights
        optimizer.step()

    # Average loss per batch
    avg_loss = sum(batch_losses) / len(batch_losses)
    
    return avg_loss

In [17]:
def evaluate(model, test_loader):
    # Switch the model to evaluation mode
    model.eval()

    batch_losses = []
    correct = 0

    with torch.no_grad():
        for data, target in test_loader:
            # Forward pass: compute predictions
            output = model(data)

            # Compute the loss (same as in training)
            loss = criterion(output, target)
            batch_losses.append(loss.item())

            # Accuracy: the predicted class is the index of the largest output
            pred = output.max(1, keepdim=True)[1]

            # eq() is True where the prediction matches the target; sum() counts the matches
            correct += pred.eq(target.view_as(pred)).sum().item()

    # Average loss per batch
    avg_loss = sum(batch_losses) / len(batch_losses)

    # Accuracy as a percentage
    accuracy = 100. * correct / len(test_loader.dataset)

    return avg_loss, accuracy

In [18]:
EPOCHS = 100

for epoch in range(1, EPOCHS + 1):
    train_loss = train(model, train_loader, optimizer)
    test_loss, test_accuracy = evaluate(model, test_loader)
    
    print('[{}] Train Loss: {:.4f}\tTest Loss: {:.4f}\tAccuracy: {:.2f}%'.format(
          epoch, train_loss, test_loss, test_accuracy))

[1] Train Loss: 0.8268	Test Loss: 0.6319	Accuracy: 79.96%
[2] Train Loss: 0.2537	Test Loss: 0.1977	Accuracy: 93.90%
[3] Train Loss: 0.1684	Test Loss: 0.1492	Accuracy: 95.43%
[4] Train Loss: 0.1289	Test Loss: 0.1700	Accuracy: 94.88%
[5] Train Loss: 0.1078	Test Loss: 0.1242	Accuracy: 95.97%
[6] Train Loss: 0.0904	Test Loss: 0.1686	Accuracy: 94.74%
[7] Train Loss: 0.0798	Test Loss: 0.1687	Accuracy: 94.47%
[8] Train Loss: 0.0883	Test Loss: 0.1761	Accuracy: 94.90%
[9] Train Loss: 0.0601	Test Loss: 0.1106	Accuracy: 96.67%
[10] Train Loss: 0.0533	Test Loss: 0.1074	Accuracy: 96.51%
[11] Train Loss: 0.0462	Test Loss: 0.1437	Accuracy: 95.52%
[12] Train Loss: 0.1781	Test Loss: 0.1184	Accuracy: 96.46%
[13] Train Loss: 0.0558	Test Loss: 0.0949	Accuracy: 96.94%
[14] Train Loss: 0.0447	Test Loss: 0.0857	Accuracy: 97.34%
[15] Train Loss: 0.0387	Test Loss: 0.1292	Accuracy: 96.17%
[16] Train Loss: 0.0338	Test Loss: 0.0767	Accuracy: 97.71%
[17] Train Loss: 0.0285	Test Loss: 0.0944	Accuracy: 97.06%
[18] T

## Assignment 4
Improve on the performance from Assignment 3 using what you have learned so far.

- Hint : Activation function, hyperparameter setting

In [32]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer1 = nn.Linear(784,128)
        self.layer2 = nn.Linear(128,64)
        self.layer3 = nn.Linear(64,10)
        # LogSigmoid replaces ReLU; the attribute is named after what it holds
        self.log_sigmoid = nn.LogSigmoid()
        
    def forward(self, x):
        x = x.view(-1, 784)
        out = self.layer1(x)
        out = self.log_sigmoid(out)
        out = self.layer2(out)
        out = self.log_sigmoid(out)
        out = self.layer3(out)

        return out

In [33]:
model = Net()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.5)

In [34]:
EPOCHS = 100

for epoch in range(1, EPOCHS + 1):
    train_loss = train(model, train_loader, optimizer)
    test_loss, test_accuracy = evaluate(model, test_loader)
    
    print('[{}] Train Loss: {:.4f}\tTest Loss: {:.4f}\tAccuracy: {:.2f}%'.format(
          epoch, train_loss, test_loss, test_accuracy))

[1] Train Loss: 1.5529	Test Loss: 0.8070	Accuracy: 70.68%
[2] Train Loss: 0.4489	Test Loss: 0.4115	Accuracy: 87.21%
[3] Train Loss: 0.3084	Test Loss: 0.2798	Accuracy: 91.03%
[4] Train Loss: 0.2628	Test Loss: 0.2439	Accuracy: 92.17%
[5] Train Loss: 0.2125	Test Loss: 0.2271	Accuracy: 92.87%
[6] Train Loss: 0.1878	Test Loss: 0.2157	Accuracy: 93.10%
[7] Train Loss: 0.1709	Test Loss: 0.2481	Accuracy: 91.71%
[8] Train Loss: 0.1540	Test Loss: 0.1584	Accuracy: 95.13%
[9] Train Loss: 0.1416	Test Loss: 0.1934	Accuracy: 93.88%
[10] Train Loss: 0.1287	Test Loss: 0.1421	Accuracy: 95.33%
[11] Train Loss: 0.1196	Test Loss: 0.1330	Accuracy: 95.83%
[12] Train Loss: 0.1106	Test Loss: 0.1676	Accuracy: 95.00%
[13] Train Loss: 0.1030	Test Loss: 0.1085	Accuracy: 96.55%
[14] Train Loss: 0.0965	Test Loss: 0.1184	Accuracy: 96.46%
[15] Train Loss: 0.0910	Test Loss: 0.1120	Accuracy: 96.36%
[16] Train Loss: 0.0853	Test Loss: 0.1511	Accuracy: 95.16%
[17] Train Loss: 0.0810	Test Loss: 0.1086	Accuracy: 96.58%
[18] T

**Describe freely what you improved and why it helped (below)**

- Replaced every ReLU with LogSigmoid and retrained.
- ReLU result: Train Loss = 0.0004, Test Loss = 0.0996, Accuracy = 98.06%
- LogSigmoid result: Train Loss = 0.0055, Test Loss = 0.1380, Accuracy = 97.52%
- I expected the result to improve, but accuracy dropped by about 0.54 percentage points.
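
One possible factor, not verified on this model, is how the two activations behave for positive inputs. The sketch below uses plain NumPy rather than the PyTorch module, with hypothetical helpers log_sigmoid and d_log_sigmoid. It shows that log-sigmoid outputs are always negative and that its slope shrinks towards 0 as the input grows, whereas the ReLU slope stays at 1. Since the gap comes from single runs without a fixed seed, it may also be partly noise.

```python
import numpy as np

def log_sigmoid(x):
    # log(sigmoid(x)) = -log(1 + exp(-x)), computed in a numerically stable way
    return -np.logaddexp(0.0, -x)

def d_log_sigmoid(x):
    # The derivative of log(sigmoid(x)) is sigmoid(-x) = 1 / (1 + exp(x))
    return 1.0 / (1.0 + np.exp(x))

x = np.linspace(-5.0, 5.0, 11)

print(log_sigmoid(x).max() < 0)        # every output is below zero
print(d_log_sigmoid(x))                # slope decays for large positive x
print(np.where(x > 0, 1.0, 0.0))       # ReLU slope for comparison
```

Trying a different learning rate, more hidden units, or another activation and comparing over several seeds would give a clearer picture than a single run.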