## 1. Introduction

This notebook describes a demo-level implementation of a CNN based on the following architecture:

$[(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K,SOFTMAX$

Specifically, for simplicity, we set:

* $N = 1$
* $M = 1$
* $K = 1$

The implemented architecture is as follows:

$(CONV-RELU)-POOL-(FC-RELU),SOFTMAX$
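
To make the simplified pipeline more concrete, the sketch below traces how the shape of a single 28x28 image (the MNIST image size used later) might change from stage to stage. The filter size (3), number of filters (8), pool size (2), and the helper name `conv_output_size` are illustrative assumptions, not values fixed by this notebook.

```python
def conv_output_size(n: int, filter_size: int, padding: int = 0, stride: int = 1) -> int:
    """Output size along one axis for a sliding window (convolution or pooling)."""
    return (n + 2 * padding - filter_size) // stride + 1


# Assumed hyperparameters, chosen only for illustration
side, num_filters = 28, 8

conv_side = conv_output_size(side, filter_size=3)                  # CONV-RELU
pool_side = conv_output_size(conv_side, filter_size=2, stride=2)   # POOL
fc_inputs = pool_side * pool_side * num_filters                    # flattened vector fed to FC

print(f'after CONV-RELU: ({conv_side}, {conv_side}, {num_filters})')
print(f'after POOL:      ({pool_side}, {pool_side}, {num_filters})')
print(f'FC input length: {fc_inputs}')
```

Under these assumptions the spatial size shrinks from 28 to 26 after the convolution and to 13 after pooling, so the fully connected layer would receive 13 * 13 * 8 = 1352 values.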

## 2. Forward Pass

### Prepare Datasets

In [1]:
import mnist

In [6]:
# The mnist package handles the MNIST dataset for us!
# Learn more at https://github.com/datapythonista/mnist

train_images = mnist.train_images()
train_labels = mnist.train_labels()

test_images = mnist.test_images()
test_labels = mnist.test_labels()

print(f'train_images.shape: {train_images.shape}, train_labels.shape: {train_labels.shape}')
print(f'test_images.shape: {test_images.shape}, test_labels.shape: {test_labels.shape}')

train_images.shape: (60000, 28, 28), train_labels.shape: (60000,)
test_images.shape: (10000, 28, 28), test_labels.shape: (10000,)


### Convolution Layer

In [14]:
from typing import Callable

import numpy as np

In [17]:
class Conv:
    """Conv Layer
    
    To keep the focus on the core ideas of a CNN, the code is deliberately simple. It is neither well designed
    nor efficient, but it is enough to demonstrate how a CNN can be implemented by hand.
    """
    
    def __init__(self, input_: np.ndarray, stride: int, padding_size: int, activator: Callable, filter_size: int, 
                 num_filters: int = 1):
        """Initialization of the Conv layer"""
        self.input_ = input_
        self.stride = stride
        self.padding_size = padding_size
        self.activator = activator
        self.filter_size = filter_size
        self.num_filters = num_filters
        
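    def add_padding(self, x: np.ndarray) -> np.ndarray:
        """Add zero padding around a 2-D input array based on padding_size"""
        p = self.padding_size
        h, w = x.shape
        y = np.zeros((h + 2 * p, w + 2 * p))
        # Copy the original values into the centre of the zero-filled array
        y[p: h + p, p: w + p] = x
        
        return y
    
    def generate_filters(self) -> np.ndarray:
        """Generate random filters with shape (num_filters, filter_size, filter_size)"""
        return np.random.randn(self.num_filters, self.filter_size, self.filter_size)
        
    def forward(self) -> np.ndarray:
        """Forward computation: pad the input, slide every filter over it, then apply the activator"""
        y = self.add_padding(self.input_)
        filters = self.generate_filters()
        f, s = self.filter_size, self.stride
        out_h = (y.shape[0] - f) // s + 1
        out_w = (y.shape[1] - f) // s + 1
        output = np.zeros((out_h, out_w, self.num_filters))
        for i in range(out_h):
            for j in range(out_w):
                region = y[i * s: i * s + f, j * s: j * s + f]
                # Element-wise product with each filter, summed over the window
                output[i, j] = np.sum(region * filters, axis=(1, 2))
        
        return self.activator(output)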
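
A convolution is easier to trust once it has been checked against a result that can be worked out by hand. The following is a minimal standalone sketch, not part of the `Conv` class: the function name `conv2d` and the use of `np.pad` are assumptions made for illustration. It applies a kernel that is 1 at the centre and 0 elsewhere, so each output value should simply equal the centre pixel of its window.

```python
import numpy as np


def conv2d(x: np.ndarray, kernel: np.ndarray, stride: int = 1, padding: int = 0) -> np.ndarray:
    """Slide one square kernel over a 2-D input (cross-correlation, as in most CNN code)."""
    x = np.pad(x, padding)  # zero padding on every side
    f = kernel.shape[0]
    out_h = (x.shape[0] - f) // stride + 1
    out_w = (x.shape[1] - f) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride: i * stride + f, j * stride: j * stride + f]
            out[i, j] = np.sum(window * kernel)
    return out


x = np.arange(16, dtype=float).reshape(4, 4)
identity = np.zeros((3, 3))
identity[1, 1] = 1.0  # picks out the centre pixel of each window

same = conv2d(x, identity, padding=1)  # padding of 1 keeps the 4x4 shape
valid = conv2d(x, identity)            # no padding shrinks the output to 2x2
```

With a 3x3 kernel, a padding of 1 reproduces the input exactly, while no padding keeps only the inner 2x2 block. This matches the usual output size rule `(n + 2 * padding - filter_size) // stride + 1`.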


In [13]:
class Activator:
    
    @staticmethod
    def relu(x: np.ndarray) -> np.ndarray:
        """ReLU
        
        Definition: $f(x) = max(0, x)$
        
        See: https://en.wikipedia.org/wiki/Rectifier_(neural_networks)
        """
        return np.maximum(x, 0)

In [10]:
type(train_images)

numpy.ndarray

### Pooling Layer

- Makes the representations smaller and more manageable
- Operates over each activation map independently

In [20]:
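# Pooling is usually done with a simple operation such as max, min, or average. Here we use `max`.
class MaxPooling:
    
    def __init__(self, input_: np.ndarray, size: int, activator: Callable):
        self.input_ = input_
        self.size = size
        self.activator = activator
    
    def forward(self) -> np.ndarray:
        """Take the maximum of every non-overlapping size x size window"""
        s = self.size
        out_h, out_w = self.input_.shape[0] // s, self.input_.shape[1] // s
        # Keep any trailing channel axis so each activation map is pooled independently
        output = np.zeros((out_h, out_w) + self.input_.shape[2:])
        for i in range(out_h):
            for j in range(out_w):
                window = self.input_[i * s: (i + 1) * s, j * s: (j + 1) * s]
                output[i, j] = np.max(window, axis=(0, 1))
        
        return output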
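
Max pooling can also be written without explicit loops. The sketch below is one possible approach for a single 2-D activation map, assuming non-overlapping windows; the function name `max_pool2d` is hypothetical. It reshapes the map so that each window gets its own pair of axes, then takes the maximum over those axes.

```python
import numpy as np


def max_pool2d(x: np.ndarray, size: int = 2) -> np.ndarray:
    """Max over non-overlapping size x size windows of a 2-D array."""
    out_h, out_w = x.shape[0] // size, x.shape[1] // size
    # Drop rows and columns that do not fill a complete window
    cropped = x[:out_h * size, :out_w * size]
    # Axes 1 and 3 index the position inside each window
    return cropped.reshape(out_h, size, out_w, size).max(axis=(1, 3))


x = np.arange(16).reshape(4, 4)
pooled = max_pool2d(x, size=2)
print(pooled)
# [[ 5  7]
#  [13 15]]
```

The 4x4 input becomes 2x2, which illustrates how pooling makes the representation smaller while keeping the strongest response in each window.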


### Fully Connected Layer

- Contains neurons that connect to the entire input volume, as in ordinary neural networks
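
As a rough illustration of the `(FC-RELU),SOFTMAX` tail of the architecture, the sketch below flattens an input volume, applies one fully connected layer with ReLU, and normalizes the result with softmax. The function names, the `(13, 13, 8)` input shape, the 10 output classes, and the small random weights are all assumptions made for this example.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - np.max(z))
    return e / np.sum(e)


def fully_connected(x: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Flatten the input volume, apply an affine map, then ReLU."""
    z = weights @ x.ravel() + bias
    return np.maximum(z, 0)


rng = np.random.default_rng(0)
volume = rng.standard_normal((13, 13, 8))                 # assumed pooled output shape
weights = rng.standard_normal((10, volume.size)) * 0.01   # one row of weights per class
bias = np.zeros(10)

probs = softmax(fully_connected(volume, weights, bias))
print(probs.shape, probs.sum())
```

Every output neuron sees all `volume.size` input values, which is what "connect to the entire input volume" means in practice. The softmax output is non-negative and sums to 1, so it can be read as a probability per class.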

## 3. Back Propagation

## 4. Summary