# Q1
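A case where the bank did not actually accept the customer but the model predicts acceptance is a false positive (FP); its probability is the Type 1 error.
Conversely, a case where the bank did accept the customer but the model predicts rejection is a false negative (FN); its probability is the Type 2 error.
With the threshold set to 0.3, the model predicts accept even at fairly low scores, so the FP probability is high and the FN probability is low.
Therefore the Type 1 error is high and the Type 2 error is low.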
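
The effect of the threshold on FP and FN can be checked on a small example. The sketch below uses made-up scores and labels (not the bank data from the question) and a hypothetical helper `count_errors`; it simply counts false positives and false negatives at thresholds 0.5 and 0.3.

```python
# Hypothetical scores and labels for illustration only (1 = the bank accepted).
scores = [0.10, 0.25, 0.35, 0.40, 0.45, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def count_errors(scores, labels, threshold):
    """Return (FP, FN) when predicting 1 for every score >= threshold."""
    fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            fp += 1  # predicted accept, actually not accepted
        elif predicted == 0 and label == 1:
            fn += 1  # predicted reject, actually accepted
    return fp, fn


for threshold in (0.5, 0.3):
    fp, fn = count_errors(scores, labels, threshold)
    print(f"threshold={threshold}: FP={fp}, FN={fn}")
```

On this toy data, lowering the threshold from 0.5 to 0.3 raises FP from 1 to 3 and lowers FN from 1 to 0.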

# Q2

## Q2-1

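- Precision = TP/(TP+FP)
=> the fraction of samples the model classified as True that are actually True; also called positive predictive value (PPV)

- Recall = TP/(TP+FN)
=> the fraction of actually True samples that the model predicted as True; also called sensitivity or hit rate

- Accuracy = (TP+TN)/(TP+TN+FP+FN)
=> the fraction of all samples that the model classified correctly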
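
As a quick numeric check, the three metrics can be computed directly from confusion-matrix counts. This is a minimal sketch; the helper name `classification_metrics` and the counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)                  # correct among predicted positives
    recall = tp / (tp + fn)                     # found among actual positives
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # correct among all samples
    return precision, recall, accuracy


# Hypothetical counts for illustration only.
precision, recall, accuracy = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(f"precision={precision:.3f}, recall={recall:.3f}, accuracy={accuracy:.3f}")
```

With these counts, precision is 40/50, recall is 40/60, and accuracy is 70/100.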

## Q2-2
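Example) Lie detector
- Raising the threshold should lower FP, i.e. cases that are not lies but are predicted to be lies, at the cost of missing more actual lies (FN)
- If preventing one innocent victim is judged more important than catching 100 criminals, raising the threshold is the reasonable choice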

# Q3

In [1]:
import torch
import torch.nn as nn
import torchvision.datasets as dsets  # provides the MNIST dataset
import torchvision.transforms as transforms
from torch.autograd import Variable

In [2]:
# MNIST Dataset (Images and Labels) 
train_dataset = dsets.MNIST(root ='./data',  
                            train = True,  
                            transform = transforms.ToTensor(), 
                            download = True) 
  
test_dataset = dsets.MNIST(root ='./data',  
                           train = False,  
                           transform = transforms.ToTensor()) 

# Hyper Parameters  
input_size = 784 # 28*28
num_classes = 10 # 0~9
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# Dataset Loader (Input Pipeline) 
train_loader = torch.utils.data.DataLoader(dataset = train_dataset,  
                                           batch_size = batch_size,  
                                           shuffle = True) 
  
test_loader = torch.utils.data.DataLoader(dataset = test_dataset,  
                                          batch_size = batch_size,  
                                          shuffle = False) 

In [3]:
class LogisticRegression(nn.Module): 
    def __init__(self, input_size, num_classes): 
        super(LogisticRegression, self).__init__() 
        self.linear = nn.Linear(input_size, num_classes) 
  
    def forward(self, x): 
        out = self.linear(x) 
        return out 

In [4]:
# Define the model (logistic regression)
model = LogisticRegression(input_size, num_classes) 

In [5]:
criterion = nn.CrossEntropyLoss()  # define the loss function
optimizer = torch.optim.SGD(model.parameters(), lr = learning_rate)  # define the optimizer
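
`nn.CrossEntropyLoss` takes raw scores (logits) and applies log-softmax internally, which is why `forward` returns the linear output without a softmax. The per-sample loss can be sketched in pure Python as follows; the helper `cross_entropy_from_logits` is a hypothetical name for illustration, not part of PyTorch.

```python
import math


def cross_entropy_from_logits(logits, target):
    """Return -log(softmax(logits)[target]), computed stably via log-sum-exp."""
    largest = max(logits)  # subtract the max before exponentiating to avoid overflow
    log_sum_exp = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return log_sum_exp - logits[target]


# Ten equal logits give a uniform prediction over 10 classes, so the loss is ln(10).
uniform_loss = cross_entropy_from_logits([0.0] * 10, target=3)
print(round(uniform_loss, 4))
```

A model that spreads its probability evenly over the 10 digits has a loss of ln(10), about 2.30, so losses near that value indicate a model that has barely started learning.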

In [6]:
# Training the Model 
for epoch in range(num_epochs): 
    for i, (images, labels) in enumerate(train_loader): 
        images = Variable(images.view(-1, 28 * 28)) 
        labels = Variable(labels) 
  
        # Forward + Backward + Optimize 
        optimizer.zero_grad()  # reset gradients to zero
        outputs = model(images) 
        loss = criterion(outputs, labels)  # compute the loss
        loss.backward()  # backpropagate
        optimizer.step()  # update the weights
  
        if (i + 1) % 100 == 0: 
            print('Epoch: [% d/% d], Step: [% d/% d], Loss: %.4f'
                  % (epoch + 1, num_epochs, i + 1, 
                     len(train_dataset) // batch_size, loss.item())) 

Epoch: [ 1/ 5], Step: [ 100/ 600], Loss: 2.2262
Epoch: [ 1/ 5], Step: [ 200/ 600], Loss: 2.1327
Epoch: [ 1/ 5], Step: [ 300/ 600], Loss: 2.0013
Epoch: [ 1/ 5], Step: [ 400/ 600], Loss: 1.9808
Epoch: [ 1/ 5], Step: [ 500/ 600], Loss: 1.8718
Epoch: [ 1/ 5], Step: [ 600/ 600], Loss: 1.7979
Epoch: [ 2/ 5], Step: [ 100/ 600], Loss: 1.6759
Epoch: [ 2/ 5], Step: [ 200/ 600], Loss: 1.6922
Epoch: [ 2/ 5], Step: [ 300/ 600], Loss: 1.6339
Epoch: [ 2/ 5], Step: [ 400/ 600], Loss: 1.5618
Epoch: [ 2/ 5], Step: [ 500/ 600], Loss: 1.6027
Epoch: [ 2/ 5], Step: [ 600/ 600], Loss: 1.4629
Epoch: [ 3/ 5], Step: [ 100/ 600], Loss: 1.4148
Epoch: [ 3/ 5], Step: [ 200/ 600], Loss: 1.4236
Epoch: [ 3/ 5], Step: [ 300/ 600], Loss: 1.3738
Epoch: [ 3/ 5], Step: [ 400/ 600], Loss: 1.3134
Epoch: [ 3/ 5], Step: [ 500/ 600], Loss: 1.3618
Epoch: [ 3/ 5], Step: [ 600/ 600], Loss: 1.2960
Epoch: [ 4/ 5], Step: [ 100/ 600], Loss: 1.3180
Epoch: [ 4/ 5], Step: [ 200/ 600], Loss: 1.1899
Epoch: [ 4/ 5], Step: [ 300/ 600], Loss:

In [7]:
# Test the Model 
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in test_loader: 
        images = Variable(images.view(-1, 28 * 28)) 
        outputs = model(images) 
        _, predicted = torch.max(outputs, 1)  # index of the highest score
        total += labels.size(0) 
        correct += (predicted == labels).sum().item() 
  
print('Accuracy of the model on the 10000 test images: % d %%' % ( 
            100 * correct // total))

Accuracy of the model on the 10000 test images:  82 %



# Q4

## Q4-1

In [8]:
def SGD(f, theta0, alpha, num_iters): 
    """  
       Arguments: 
       f -- the function to optimize; it takes a single argument 
            and yields two outputs, a cost and the gradient 
            with respect to the argument 
       theta0 -- the initial point to start SGD from 
       alpha -- the learning rate (step size) 
       num_iters -- total iterations to run SGD for 
       Return: 
       theta -- the parameter value after SGD finishes 
    """
    start_iter = 0
    theta = theta0 
    for it in range(start_iter + 1, num_iters + 1): 
        _, grad = f(theta) 

        # Scale the gradient by alpha element-wise (no dot product)
        theta = theta - (alpha * grad)  
    return theta
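
As a usage sketch, the same update rule can be run on a simple quadratic. The block below is a self-contained restatement of the loop (named `sgd` here so it runs on its own), and the objective `quadratic`, f(theta) = (theta - 3)^2 with gradient 2(theta - 3), is a hypothetical example.

```python
def sgd(f, theta0, alpha, num_iters):
    """Plain gradient descent: repeatedly step against the gradient returned by f."""
    theta = theta0
    for _ in range(num_iters):
        _cost, grad = f(theta)
        theta = theta - alpha * grad
    return theta


def quadratic(theta):
    """Hypothetical objective: returns (cost, gradient) for f(theta) = (theta - 3)^2."""
    cost = (theta - 3.0) ** 2
    grad = 2.0 * (theta - 3.0)
    return cost, grad


theta_star = sgd(quadratic, theta0=0.0, alpha=0.1, num_iters=100)
print(theta_star)
```

With `alpha=0.1`, each step shrinks the distance to the minimizer by a factor of 0.8, so after 100 iterations `theta_star` is very close to 3.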

## Q4-2

In [9]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

## Adam

In [10]:
# Data
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[1], [2], [3]])

class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)
    
# Initialize the model
model = LinearRegressionModel()

# Set up the optimizer (Adam)
optimizer = optim.Adam(model.parameters(), lr=0.01)

nb_epochs = 1000
for epoch in range(nb_epochs + 1):
    
    # Compute H(x)
    prediction = model(x_train)
    
    # Compute the cost
    cost = F.mse_loss(prediction, y_train)
    
    # Improve H(x) using the cost
    optimizer.zero_grad()  # reset gradients to zero so they do not accumulate
    cost.backward()
    optimizer.step()
    
    # Print a log every 100 epochs
    if epoch % 100 == 0:
        params = list(model.parameters())
        W = params[0].item()
        b = params[1].item()
        print('Epoch {:4d}/{} W: {:.3f}, b: {:.3f} Cost: {:.6f}'.format(
            epoch, nb_epochs, W, b, cost.item()
        ))

Epoch    0/1000 W: 0.957, b: -0.034 Cost: 0.024429
Epoch  100/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  200/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  300/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  400/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  500/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  600/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  700/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  800/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  900/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch 1000/1000 W: 1.000, b: 0.000 Cost: 0.000000


## RMSprop

In [11]:
# Data
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[1], [2], [3]])

class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)
    
# Initialize the model
model = LinearRegressionModel()

# Set up the optimizer (RMSprop)
optimizer = optim.RMSprop(model.parameters(), lr=0.01)

nb_epochs = 1000
for epoch in range(nb_epochs + 1):
    
    # Compute H(x)
    prediction = model(x_train)
    
    # Compute the cost
    cost = F.mse_loss(prediction, y_train)
    
    # Improve H(x) using the cost
    optimizer.zero_grad()  # reset gradients to zero so they do not accumulate
    cost.backward()
    optimizer.step()
    
    # Print a log every 100 epochs
    if epoch % 100 == 0:
        params = list(model.parameters())
        W = params[0].item()
        b = params[1].item()
        print('Epoch {:4d}/{} W: {:.3f}, b: {:.3f} Cost: {:.6f}'.format(
            epoch, nb_epochs, W, b, cost.item()
        ))

Epoch    0/1000 W: -0.509, b: 0.286 Cost: 10.915822
Epoch  100/1000 W: 0.472, b: 1.114 Cost: 0.190032
Epoch  200/1000 W: 0.584, b: 0.920 Cost: 0.123859
Epoch  300/1000 W: 0.703, b: 0.658 Cost: 0.063537
Epoch  400/1000 W: 0.827, b: 0.384 Cost: 0.021695
Epoch  500/1000 W: 0.928, b: 0.160 Cost: 0.003819
Epoch  600/1000 W: 0.983, b: 0.038 Cost: 0.000222
Epoch  700/1000 W: 0.998, b: 0.004 Cost: 0.000002
Epoch  800/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch  900/1000 W: 1.000, b: 0.000 Cost: 0.000000
Epoch 1000/1000 W: 1.004, b: 0.004 Cost: 0.000205


## SGD with momentum=0.9

In [12]:
# Data
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[1], [2], [3]])

class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)
    
# Initialize the model
model = LinearRegressionModel()

# Set up the optimizer (SGD with momentum)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

nb_epochs = 1000
for epoch in range(nb_epochs + 1):
    
    # Compute H(x)
    prediction = model(x_train)
    
    # Compute the cost
    cost = F.mse_loss(prediction, y_train)
    
    # Improve H(x) using the cost
    optimizer.zero_grad()  # reset gradients to zero so they do not accumulate
    cost.backward()
    optimizer.step()
    
    # Print a log every 100 epochs
    if epoch % 100 == 0:
        params = list(model.parameters())
        W = params[0].item()
        b = params[1].item()
        print('Epoch {:4d}/{} W: {:.3f}, b: {:.3f} Cost: {:.6f}'.format(
            epoch, nb_epochs, W, b, cost.item()
        ))

Epoch    0/1000 W: 0.717, b: -0.904 Cost: 2.782944
Epoch  100/1000 W: 1.017, b: -0.031 Cost: 0.000212
Epoch  200/1000 W: 1.000, b: -0.001 Cost: 0.000000
Epoch  300/1000 W: 1.000, b: -0.000 Cost: 0.000000
Epoch  400/1000 W: 1.000, b: -0.000 Cost: 0.000000
Epoch  500/1000 W: 1.000, b: -0.000 Cost: 0.000000
Epoch  600/1000 W: 1.000, b: -0.000 Cost: 0.000000
Epoch  700/1000 W: 1.000, b: -0.000 Cost: 0.000000
Epoch  800/1000 W: 1.000, b: -0.000 Cost: 0.000000
Epoch  900/1000 W: 1.000, b: -0.000 Cost: 0.000000
Epoch 1000/1000 W: 1.000, b: -0.000 Cost: 0.000000
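
For reference, the three update rules used above can be written out by hand for a single scalar parameter. This is a pure-Python sketch in textbook form, not PyTorch's actual implementation. It assumes the commonly documented defaults (Adam: beta1=0.9, beta2=0.999, eps=1e-8; RMSprop: smoothing constant 0.99, eps=1e-8). The function names and the toy objective f(theta) = (theta - 1)^2 are hypothetical.

```python
import math


def grad(theta):
    """Gradient of the hypothetical objective f(theta) = (theta - 1)^2."""
    return 2.0 * (theta - 1.0)


def momentum_sgd(theta, lr=0.01, momentum=0.9, steps=1000):
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity + grad(theta)  # accumulate past gradients
        theta -= lr * velocity
    return theta


def rmsprop(theta, lr=0.01, alpha=0.99, eps=1e-8, steps=1000):
    sq_avg = 0.0
    for _ in range(steps):
        g = grad(theta)
        sq_avg = alpha * sq_avg + (1 - alpha) * g * g  # running mean of g^2
        theta -= lr * g / (math.sqrt(sq_avg) + eps)    # scale step by gradient RMS
    return theta


def adam(theta, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1 - beta1 ** t)         # bias correction
        v_hat = v / (1 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Compare the size of the very first step from theta = 0.
print("momentum:", momentum_sgd(0.0, steps=1))
print("rmsprop: ", rmsprop(0.0, steps=1))
print("adam:    ", adam(0.0, steps=1))
```

Starting from theta = 0, the first momentum step is `lr * grad` (0.02 here), the first Adam step has magnitude close to `lr` whatever the gradient scale, and the first RMSprop step is close to `lr / sqrt(1 - alpha)` (0.1 here). This may help explain why the three runs above follow different paths even with the same `lr=0.01`.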
