# Samsung Electronics Advanced Technology Training Institute: Advanced Vision

- **Instructor**: Jongwoo Lim / Jiun Bae
- **Email**: [jlim@hanyang.ac.kr](mailto:jlim@hanyang.ac.kr) / [jiun.maydev@gmail.com](mailto:jiun.maydev@gmail.com)

## Neural Network Example

In this example you will practice with a simple neural network written using only [NumPy](https://www.numpy.org), the fundamental package for scientific computing with Python. The goals of this example are as follows:

- Understand **Neural Networks** and how they work.
- Learn the basics of how to **write and use code** (*NumPy*).

*If you are more familiar with PyTorch and TensorFlow (or Keras), you might wonder why we write everything from the ground up with NumPy instead of using a built-in framework. This process is essential for understanding how a neural network works, and once you understand it, writing it in code is not too difficult.*

This example is written as an [IPython Notebook](https://ipython.org/notebook.html), an interactive computational environment in which you can run code directly.

### Environments

In this assignment, we assume the following environment.

[Python](https://www.python.org) is a programming language that lets you work quickly and integrate systems more effectively. It is widely used in many fields, including machine learning.

[PyTorch](https://pytorch.org) is an open source deep learning platform that provides a seamless path from research to production.

[TensorFlow](https://www.tensorflow.org) is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications.

The [CUDA®](https://developer.nvidia.com/cuda-zone) Toolkit provides high-performance GPU-accelerated computation. Without GPU acceleration, deep learning models take a very long time to train. ~~Even with a GPU, it still takes a lot of time.~~


- [Python3](https://www.python.org/downloads/) (3.6 or above recommended)
- [PyTorch](https://pytorch.org) (1.0 recommended)
- [TensorFlow](https://tensorflow.org) (1.13.0 or above recommended, but below 2.0; *there are huge differences between 2.0 and earlier versions*)
- [NumPy](http://www.numpy.org), the fundamental package for scientific computing with Python


- (Optional) [Anaconda](https://www.anaconda.com/distribution/#download-section), *a popular Python data science platform*
- (Optional) [Jupyter](https://jupyter.org/) (Notebook or Lab)
- (Optional) [CUDA](https://developer.nvidia.com/cuda-downloads) for GPU support


Python packages can be installed with `pip install [package name]` or, when using **Anaconda**, with `conda install [package name]`.
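
After installing, it can help to confirm which packages Python can actually find. The snippet below is a minimal sketch of such a check; `check_packages` is a hypothetical helper name, and the package list is an assumption taken from the requirements above plus the `torchvision` and `tqdm` imports used later in the notebook.

```python
import importlib.util


def check_packages(names):
    """Return {package name: True/False} depending on whether Python can locate it."""
    # find_spec only looks the package up; it does not import it.
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages(["numpy", "torch", "torchvision", "tensorflow", "tqdm"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Anything reported as missing can then be installed with `pip` or `conda` as described above.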

*If you have trouble with installation or anything else, please contact the TA or jiun.maydev@gmail.com.*

# Code

### Import packages

NumPy is the basic scientific computing package; by convention it is imported as `np`.

In [None]:
import numpy as np

## Dataset

PyTorch's `torchvision` package provides the MNIST dataset and can download it while the code runs!

In [None]:
from torchvision import datasets, transforms

In [None]:
DATASET_DIR = './data' # path to download mnist dataset

TRAIN_DATASET = datasets.MNIST(DATASET_DIR,   # Dataset root path
                               train=True,    # Train data
                               download=True) # Download if not exist

TEST_DATASET = datasets.MNIST(DATASET_DIR,    # Dataset root path
                              train=False)    # Test data

## Network

This is a simple network of three dense (fully connected) layers with sigmoid activations. The code is quite easy.

The whole network architecture is as follows:

- Dense
- Sigmoid
- Dense
- Sigmoid
- Dense

In [None]:
class Layer:
    pass

class Dense(Layer):
    def __init__(self, input_units, output_units):
        self.weights = np.random.randn(input_units, output_units) * .01
        self.biases = np.zeros(output_units)
        
    def forward(self, inputs):
        self.inputs = inputs
        
        return np.dot(inputs, self.weights) + self.biases
      
    def backward(self, grads):
        # compute d f / d x = d f / d dense * d dense / d x
        # where d dense/ d x = weights transposed
        grad_input = np.dot(grads, np.transpose(self.weights))

        # compute gradient w.r.t. weights and biases
        self.grad_weights = np.dot(np.transpose(self.inputs), grads)
        self.grad_biases = np.sum(grads, axis=0)
        
        return grad_input

    def update(self, lr: float = .01):
        # Here we perform a stochastic gradient descent step.
        self.weights = self.weights - lr * self.grad_weights
        self.biases = self.biases - lr * self.grad_biases
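
A common way to gain confidence in a hand-written `backward` is to compare it against finite differences. The sketch below is not part of the original notebook: it repeats a compact copy of `Dense` so that it runs on its own, and `numerical_grad`, the toy shapes and the tolerance are assumptions chosen for illustration.

```python
import numpy as np


class Dense:
    """Compact copy of the Dense layer above, repeated so the check runs on its own."""

    def __init__(self, input_units, output_units):
        self.weights = np.random.randn(input_units, output_units) * .01
        self.biases = np.zeros(output_units)

    def forward(self, inputs):
        self.inputs = inputs
        return np.dot(inputs, self.weights) + self.biases

    def backward(self, grads):
        grad_input = np.dot(grads, self.weights.T)
        self.grad_weights = np.dot(self.inputs.T, grads)
        self.grad_biases = np.sum(grads, axis=0)
        return grad_input


def numerical_grad(f, x, eps=1e-6):
    """Central finite differences of the scalar f() w.r.t. each entry of x.

    x is perturbed in place and restored afterwards.
    """
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + eps
        plus = f()
        x[idx] = old - eps
        minus = f()
        x[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad


rng = np.random.RandomState(0)
layer = Dense(4, 3)
x = rng.randn(5, 4)
upstream = rng.randn(5, 3)  # stand-in for the gradient arriving from the next layer

# Scalar objective whose gradient w.r.t. the layer output is exactly `upstream`.
objective = lambda: np.sum(layer.forward(x) * upstream)

objective()
grad_input = layer.backward(upstream)

max_err_w = np.abs(numerical_grad(objective, layer.weights) - layer.grad_weights).max()
max_err_b = np.abs(numerical_grad(objective, layer.biases) - layer.grad_biases).max()
max_err_x = np.abs(numerical_grad(objective, x) - grad_input).max()
print(max_err_w, max_err_b, max_err_x)  # all three should be tiny
```

If any of the three errors is large, the corresponding line in `backward` is the first place to look.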

In [None]:
class Sigmoid(Layer):
    def forward(self, inputs):
        self.inputs = inputs
        return 1. / (1. + np.exp(-inputs))

    def backward(self, grads):
        # d sigmoid / d x = sigmoid(x) * (1 - sigmoid(x))
        r = self.forward(self.inputs)
        return grads * r * (1. - r)
    
    def update(self, lr):
        pass
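
The `backward` method above relies on the identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)). As a quick sanity check, the following sketch compares that formula with a central finite difference; the standalone `sigmoid` function, the sample points and `eps` are assumptions made for this illustration.

```python
import numpy as np


def sigmoid(x):
    return 1. / (1. + np.exp(-x))


x = np.linspace(-5., 5., 11)  # sample points, including x = 0
eps = 1e-5

analytic = sigmoid(x) * (1. - sigmoid(x))
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

max_diff = np.abs(analytic - numeric).max()
print(max_diff)  # should be very close to zero
```

The derivative peaks at 0.25 for x = 0 and shrinks toward zero for large |x|, which is one reason deep sigmoid networks can train slowly.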

In [None]:
def loss_fn(preds, labels):
    """Compute crossentropy from logits[batch,n_classes] and ids of correct answers"""
    return -preds[np.arange(len(preds)), labels] + np.log(np.sum(np.exp(preds),axis=-1))

def grad_fn(preds, labels):
    """Compute crossentropy gradient from logits[batch,n_classes] and ids of correct answers"""
    ones_for_answers = np.zeros_like(preds)
    ones_for_answers[np.arange(len(preds)), labels] = 1
    
    softmax = np.exp(preds) / np.exp(preds).sum(axis=-1, keepdims=True)
    
    return (- ones_for_answers + softmax) / preds.shape[0]
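
`np.exp(preds)` can overflow when logits become large. A standard remedy, sketched below, is to subtract the per-row maximum before exponentiating, which leaves the softmax and the cross-entropy unchanged. The names `stable_loss_fn` and `stable_grad_fn` are hypothetical variants and not part of the original notebook.

```python
import numpy as np


def stable_loss_fn(preds, labels):
    """Cross-entropy from logits, shifted by the row maximum to avoid overflow."""
    shifted = preds - preds.max(axis=-1, keepdims=True)
    log_sum_exp = np.log(np.sum(np.exp(shifted), axis=-1))
    return -shifted[np.arange(len(preds)), labels] + log_sum_exp


def stable_grad_fn(preds, labels):
    """Gradient of the mean cross-entropy w.r.t. the logits: (softmax - one_hot) / batch."""
    shifted = preds - preds.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    softmax = exp / exp.sum(axis=-1, keepdims=True)
    softmax[np.arange(len(preds)), labels] -= 1.
    return softmax / preds.shape[0]


# The first row would overflow with a plain np.exp(preds).
logits = np.array([[1000., 0., -1000.], [1., 2., 3.]])
labels = np.array([0, 2])

loss = stable_loss_fn(logits, labels)
grad = stable_grad_fn(logits, labels)
print(loss)
print(grad)
```

For moderate logits these functions should agree with `loss_fn` and `grad_fn` above up to floating-point error.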

In [None]:
from typing import List
from functools import reduce


def train(networks: List[Layer], X, y):
    preds = reduce(lambda inputs, layer: layer.forward(inputs), [X, *networks])
    
    loss = loss_fn(preds, y)
    grads = grad_fn(preds, y)
    
    grads = reduce(lambda grads, layer: layer.backward(grads), [grads, *reversed(networks)])

    # `lr` is the global learning rate defined in the Prepare section below.
    for layer in networks:
        layer.update(lr)
    
    return np.mean(loss)
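
The `reduce` calls in `train` simply thread a value through each layer in turn. The toy sketch below shows that this is equivalent to an explicit loop; `AddOne` and `Double` are made-up stand-in layers, and the sketch passes the starting value through `reduce`'s optional initial argument instead of prepending it to the list.

```python
from functools import reduce


class AddOne:
    def forward(self, inputs):
        return inputs + 1


class Double:
    def forward(self, inputs):
        return inputs * 2


layers = [AddOne(), Double(), AddOne()]

# reduce feeds the running value into each layer's forward, left to right.
chained = reduce(lambda inputs, layer: layer.forward(inputs), layers, 3)

# The same computation written as a plain loop.
looped = 3
for layer in layers:
    looped = layer.forward(looped)

print(chained, looped)  # (3 + 1) * 2 + 1 = 9 for both
```

The backward pass uses the same pattern over `reversed(networks)`, starting from the loss gradient.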

## Prepare

In [None]:
def get_batch(dataset, batch):
    # Yield full batches only; a trailing partial batch is dropped.
    for b in range(len(dataset) // batch):
        images = np.empty((batch, 28, 28), dtype=np.float32)
        labels = np.empty(batch, dtype=np.uint8)
        
        for i in range(batch):
            images[i], labels[i] = dataset[b * batch + i]
        
        images = np.reshape(images / 255., (batch, -1))
        
        yield images, labels
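
`get_batch` walks through the dataset in a fixed order. Stochastic gradient descent is often run on batches drawn in a fresh random order each epoch; the sketch below shows one way to do that for data already held in NumPy arrays. `iterate_minibatches` and the toy arrays are hypothetical and not part of the original notebook.

```python
import numpy as np


def iterate_minibatches(images, labels, batch, rng):
    """Yield shuffled (images, labels) batches; a trailing partial batch is dropped."""
    order = rng.permutation(len(images))
    for start in range(0, len(images) - batch + 1, batch):
        idx = order[start:start + batch]
        yield images[idx], labels[idx]


rng = np.random.RandomState(42)

# Toy stand-in data: 10 "images" of 4 pixels each, where row i starts with 4 * i.
toy_images = np.arange(10 * 4, dtype=np.float32).reshape(10, 4)
toy_labels = np.arange(10, dtype=np.uint8)

batches = list(iterate_minibatches(toy_images, toy_labels, 3, rng))
for batch_images, batch_labels in batches:
    print(batch_images.shape, batch_labels)
```

Because the same index array selects both images and labels, each image stays paired with its label after shuffling.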

In [None]:
np.random.seed(42)

In [None]:
lr = .3
batch = 128
epochs = 32

In [None]:
networks = [
    Dense(28*28, 100),
    Sigmoid(),
    Dense(100, 200),
    Sigmoid(),
    Dense(200, 10),
]
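
To get a feel for the size of this model, the number of trainable parameters can be worked out from the layer widths alone: each dense layer has `inputs * outputs` weights plus `outputs` biases. The sketch below does that arithmetic; `count_parameters` is a hypothetical helper, and the widths are copied from the `networks` list above.

```python
layer_sizes = [28 * 28, 100, 200, 10]  # input, two hidden layers, output


def count_parameters(sizes):
    """Sum weights (n_in * n_out) and biases (n_out) over consecutive layer pairs."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


total = count_parameters(layer_sizes)
print(total)  # 78500 + 20200 + 2010 = 100710
```

The sigmoid layers add no parameters, so the three dense layers account for the whole total.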

## Train

In [None]:
from tqdm import tqdm


for epoch in range(epochs):
    # Train scope
    train_loss, test_loss, test_acc = 0, 0, 0
    for images, labels in get_batch(TRAIN_DATASET, batch):
        train_loss += train(networks, images, labels)
    
    for images, labels in get_batch(TEST_DATASET, batch):
        preds = reduce(lambda inputs, layer: layer.forward(inputs), [images, *networks])

        test_loss += loss_fn(preds, labels).mean()
        test_acc += (preds.argmax(axis=-1) == labels).mean()
    
    # Average over the number of full batches that get_batch actually yields.
    print(f'Epoch: {epoch}')
    print(f'\tTrain Loss: {train_loss / (len(TRAIN_DATASET) // batch)}')
    print(f'\tTest Loss: {test_loss / (len(TEST_DATASET) // batch)}')
    print(f'\tTest Acc: {test_acc / (len(TEST_DATASET) // batch)}')

## Test

In [None]:
import random

In [None]:
image, label = random.choice(TEST_DATASET)

In [None]:
image

In [None]:
# Scale pixels to [0, 1] as in get_batch so the input matches what the network was trained on.
reduce(lambda inputs, layer: layer.forward(inputs), [np.reshape(np.array(image) / 255., (1, -1)), *networks]).argmax(axis=-1)[0]