# "[NeuralNetwork] Computational Unit - Multi Layer Perceptron"
> KNU AIR week4

- toc: false
- badges: false
- comments: false
- categories: [computational unit]
- hide_{github,colab,binder,deepnote}_badge: true

__Content creators:__ HEESUNG YANG

__Content reviewers:__ 

# 1. Overview
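- Extension of the perceptron (the first supervised neural network model, 1957) with at least one hidden layer between input and output
- Multi-layer, multi-output neural network for multi-class classification, including datasets that are not linearly separable
- Model (one hidden layer, $\sigma$ : sigmoid activation) :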

$
\text{For} \,\ \mathbf{x} = [x_1, \,\ \cdots, \,\ x_m]^T \,\ \text{and} \,\ 
W_1 = 
\begin{bmatrix}
w_{1, 1}^{(1)} & \cdots & w_{1, m}^{(1)} \\
w_{2, 1}^{(1)} & \cdots & w_{2, m}^{(1)} \\
\vdots & \ddots & \vdots \\
w_{n, 1}^{(1)} & \cdots & w_{n, m}^{(1)} \\
\end{bmatrix}, \,\
W_2 = 
\begin{bmatrix}
w_{1, 1}^{(2)} & \cdots & w_{1, n}^{(2)} \\
w_{2, 1}^{(2)} & \cdots & w_{2, n}^{(2)} \\
\vdots & \ddots & \vdots \\
w_{o, 1}^{(2)} & \cdots & w_{o, n}^{(2)} \\
\end{bmatrix}, \,\
$<br>

$
\mathbf{b}_1 = [b_1^{(1)}, \,\ b_2^{(1)}, \,\ \cdots, \,\ b_n^{(1)}]^T, \,\ \mathbf{b}_2 = [b_1^{(2)}, \,\ b_2^{(2)}, \,\ \cdots, \,\ b_o^{(2)}]^T
$,

$$
\hat{\mathbf{y}} = \text{softmax}(W_2\sigma(W_1\mathbf{x} + \mathbf{b}_1) + \mathbf{b}_2).
$$
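
As a minimal sketch of the model equation above, the forward pass can be written directly with tensor operations. The sizes `m`, `n`, `o` and the random weights are placeholders chosen for illustration, and $\sigma$ is taken to be the logistic sigmoid, as in the example code in Section 2.

```python
import torch

torch.manual_seed(0)

m, n, o = 4, 3, 2            # input, hidden, output sizes (illustrative values)
x = torch.randn(m)           # input vector x
W1 = torch.randn(n, m)       # first-layer weights, shape (n, m)
b1 = torch.randn(n)          # first-layer bias
W2 = torch.randn(o, n)       # second-layer weights, shape (o, n)
b2 = torch.randn(o)          # second-layer bias

h = torch.sigmoid(W1 @ x + b1)               # hidden activations, shape (n,)
y_hat = torch.softmax(W2 @ h + b2, dim=0)    # class probabilities, shape (o,)

print(h.shape, y_hat.shape, y_hat.sum().item())
```

The output `y_hat` is a probability vector: its entries are positive and sum to 1 up to floating-point error.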

- Learning : Error back-propagation
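
Error back-propagation applies the chain rule from the loss back to every weight. In PyTorch this is what `loss.backward()` does. The sketch below compares the autograd result with gradients derived by hand for a tiny network; the sizes, the random input and the target class are made up for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

m, n, o = 4, 3, 2                               # illustrative layer sizes
x = torch.randn(m)
target = torch.tensor(1)                        # hypothetical true class index
W1 = torch.randn(n, m, requires_grad=True)
b1 = torch.zeros(n, requires_grad=True)
W2 = torch.randn(o, n, requires_grad=True)
b2 = torch.zeros(o, requires_grad=True)

h = torch.sigmoid(W1 @ x + b1)                  # hidden layer
z = W2 @ h + b2                                 # output logits
loss = F.cross_entropy(z.unsqueeze(0), target.unsqueeze(0))
loss.backward()                                 # back-propagate through both layers

# Hand-derived gradients via the chain rule
with torch.no_grad():
    delta2 = torch.softmax(z, dim=0) - F.one_hot(target, o).float()  # dL/dz
    grad_W2 = torch.outer(delta2, h)                                  # dL/dW2
    delta1 = (W2.T @ delta2) * h * (1 - h)                            # error at hidden layer
    grad_W1 = torch.outer(delta1, x)                                  # dL/dW1

print(torch.allclose(W2.grad, grad_W2, atol=1e-6),
      torch.allclose(W1.grad, grad_W1, atol=1e-6))
```

The term `h * (1 - h)` is the derivative of the sigmoid, and `W2.T @ delta2` is the output error propagated back to the hidden layer.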

------------------------

# 2. Example

Dataset

In [1]:
import torch
import torchvision.datasets as dsets
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import random

device = 'cpu'

In [2]:
# parameters
training_epochs = 10
batch_size = 16

In [3]:
# MNIST dataset
mnist_train = dsets.MNIST(root='dataset/',
                          train=True,
                          transform=transforms.ToTensor(), 
                          download=True)

mnist_test = dsets.MNIST(root='dataset/',
                         train=False,
                         transform=transforms.ToTensor(),
                         download=True)

In [4]:
# dataset loader
train_data_loader = torch.utils.data.DataLoader(dataset=mnist_train,
                                                batch_size=batch_size,
                                                shuffle=True)
test_data_loader = torch.utils.data.DataLoader(dataset=mnist_test,
                                               batch_size=batch_size,
                                               shuffle=False)  # order is irrelevant for evaluation
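
To see what one batch looks like without downloading MNIST, the sketch below wraps random tensors of the same shape in a `TensorDataset`. The fake data and the dataset size of 100 are assumptions for illustration; only the shapes carry over to the real loaders.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

batch_size = 16
fake_images = torch.rand(100, 1, 28, 28)        # stand-in for ToTensor() output in [0, 1]
fake_labels = torch.randint(0, 10, (100,))      # stand-in for digit labels 0-9

loader = DataLoader(TensorDataset(fake_images, fake_labels),
                    batch_size=batch_size, shuffle=True)

images, labels = next(iter(loader))
print(images.shape, labels.shape)   # torch.Size([16, 1, 28, 28]) torch.Size([16])
print(len(loader))                  # 7 batches: six of 16 and a final one of 4
```

With the real training set of 60,000 images and `batch_size = 16`, one epoch is 3,750 such batches.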

In [5]:
import torch.nn as nn

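class MLP(nn.Module):
    """One-hidden-layer perceptron: 784 -> 512 (sigmoid) -> 10 (softmax)."""

    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()           # (N, 1, 28, 28) -> (N, 784)
        self.fc1 = nn.Linear(28 * 28, 512)    # W_1, b_1
        self.fc2 = nn.Linear(512, 10)         # W_2, b_2
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        x = self.flatten(x)
        x = self.fc1(x)
        x = torch.sigmoid(x)
        x = self.fc2(x)
        # Caution: nn.CrossEntropyLoss (used below) applies log-softmax itself,
        # so feeding it these probabilities instead of raw logits flattens the loss.
        x = self.softmax(x)
        return x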
    
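def train(model, train_loader, optimizer):
    """Run one epoch of mini-batch SGD; uses the global loss_fn and device."""
    model.train()
    for data, label in train_loader:
        data = data.to(device)
        label = label.to(device)
        optimizer.zero_grad()           # clear gradients from the previous step
        output = model(data)            # forward pass
        loss = loss_fn(output, label)
        loss.backward()                 # error back-propagation
        optimizer.step()                # gradient-descent update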
        
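def evaluate(model, test_loader):
    """Return (test loss, accuracy in %) over test_loader."""
    model.eval()
    test_loss = 0
    correct = 0

    with torch.no_grad():
        for image, label in test_loader:
            image = image.to(device)
            label = label.to(device)
            output = model(image)
            test_loss += loss_fn(output, label).item()       # mean loss of this batch
            prediction = output.argmax(dim=1, keepdim=True)  # index of the largest output
            correct += prediction.eq(label.view_as(prediction)).sum().item()

    # loss_fn already averages within each batch, so dividing the summed batch means
    # by the number of samples reports roughly (mean loss per sample) / batch_size.
    test_loss /= len(test_loader.dataset)
    test_accuracy = 100. * correct / len(test_loader.dataset)
    return test_loss, test_accuracy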
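
`nn.CrossEntropyLoss()` returns the mean over a batch by default, so summing those means and dividing by the number of samples, as `evaluate` does, yields roughly the per-sample loss divided by `batch_size`. One possible way to get a per-sample average is `reduction='sum'`. The helper name `mean_loss`, the toy model and the random data below are assumptions for illustration, not part of the original notebook.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def mean_loss(model, loader):
    """Average cross-entropy per sample (hypothetical helper, not from the post)."""
    sum_loss_fn = nn.CrossEntropyLoss(reduction='sum')   # sum instead of batch mean
    total = 0.0
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            total += sum_loss_fn(model(x), y).item()
    return total / len(loader.dataset)

torch.manual_seed(0)
data = TensorDataset(torch.rand(40, 1, 28, 28), torch.randint(0, 10, (40,)))
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # toy model returning logits

full = mean_loss(toy_model, DataLoader(data, batch_size=40))
small = mean_loss(toy_model, DataLoader(data, batch_size=16))
print(abs(full - small) < 1e-4)   # the average no longer depends on batch_size
```

Computed this way, the reported loss is comparable across runs with different batch sizes.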

In [6]:
model = MLP().to(device)  # keep the model on the same device as the data
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.03)
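
`nn.CrossEntropyLoss` applies log-softmax internally, so it is normally given raw scores (logits). If the model already ends in a softmax, as `MLP.forward` does here, the loss receives values in [0, 1] and cannot drop below log(1 + 9/e) ≈ 1.46 for 10 classes, which may contribute to the test accuracy levelling off near 76%. A small check of that floor, using a made-up one-hot probability vector and made-up logits:

```python
import math
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
label = torch.tensor([3])                 # arbitrary true class for the demo

# Best case for a softmax output: probability 1 on the correct class.
probs = torch.zeros(1, 10)
probs[0, 3] = 1.0
loss_on_probs = ce(probs, label).item()

# The same confident prediction expressed as logits gives a loss near 0.
logits = torch.full((1, 10), -20.0)
logits[0, 3] = 20.0
loss_on_logits = ce(logits, label).item()

print(loss_on_probs, math.log(1 + 9 / math.e))   # both about 1.46
print(loss_on_logits)                            # close to 0
```

A common remedy, not applied in this post, is to return the output of `self.fc2` from `forward` and call softmax only when probabilities are needed. The `argmax` predictions are unchanged, because softmax preserves the ordering of the scores.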

In [7]:
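for epoch in range(1, training_epochs + 1):
    train(model, train_data_loader, optimizer)
    test_loss, test_accuracy = evaluate(model, test_data_loader)
    print("[EPOCH: {}], \tTest Loss: {:.4f}, \tTest Accuracy: {:.2f} %".format(
        epoch, test_loss, test_accuracy
    ))

# Original note: 0.0362 loss -> log(10)
# log(10) ≈ 2.303 is the cross-entropy of a uniform guess over 10 classes; the printed
# loss is scaled down by roughly batch_size because of the division in evaluate().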

[EPOCH: 1], 	Test Loss: 0.1266, 	Test Accuracy: 48.11 %
[EPOCH: 2], 	Test Loss: 0.1172, 	Test Accuracy: 64.19 %
[EPOCH: 3], 	Test Loss: 0.1099, 	Test Accuracy: 74.16 %
[EPOCH: 4], 	Test Loss: 0.1080, 	Test Accuracy: 75.35 %
[EPOCH: 5], 	Test Loss: 0.1073, 	Test Accuracy: 75.81 %
[EPOCH: 6], 	Test Loss: 0.1069, 	Test Accuracy: 75.96 %
[EPOCH: 7], 	Test Loss: 0.1067, 	Test Accuracy: 76.21 %
[EPOCH: 8], 	Test Loss: 0.1066, 	Test Accuracy: 76.47 %
[EPOCH: 9], 	Test Loss: 0.1064, 	Test Accuracy: 76.55 %
[EPOCH: 10], 	Test Loss: 0.1063, 	Test Accuracy: 76.58 %
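
Because `evaluate` divides a sum of per-batch mean losses by the dataset size, the printed values are roughly the per-sample loss divided by `batch_size`. The arithmetic below rescales the final value and compares it with two reference points: log(10), the loss of a uniform guess, and log(1 + 9/e), the floor when probabilities rather than logits are passed to `nn.CrossEntropyLoss`. This is an interpretation of the logged numbers, not a new measurement.

```python
import math

batch_size = 16
printed_loss = 0.1063                      # epoch 10 value from the log above

per_sample = printed_loss * batch_size     # approximate mean loss per sample
uniform = math.log(10)                     # loss of a uniform 10-class prediction
floor = math.log(1 + 9 / math.e)           # floor for softmax outputs fed to CrossEntropyLoss

print(round(per_sample, 4), round(uniform, 4), round(floor, 4))
# 1.7008 2.3026 1.4612
```

The rescaled loss of about 1.70 sits between the floor and the uniform-guess value, which is consistent with the loss having little room left to decrease in this setup.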
