<a href="https://colab.research.google.com/github/jchen8000/MachineLearning/blob/master/9%20Convolutional%20Neural%20Network/Convolutional_Neural_Network_Introduction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Convolution

In [1]:
import numpy as np

## Convolution

### Convolution for 1-dimensional array

For example, convolving a 6-element array with a 3-element kernel, without padding, yields a 4-element result:

$\begin{bmatrix}3 & 4 & 1 &0  & 2 & 5\end{bmatrix} \ast \begin{bmatrix}2 & 3 & 1 \end{bmatrix} = \begin{bmatrix}17 & 7 & 5 & 16\end{bmatrix}$

$\begin{bmatrix}3 & 4 & 1 &0  & 2 & 5\end{bmatrix}$

$\begin{bmatrix}1 & 3 & 2 \end{bmatrix}\rightarrow $

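The kernel is reversed and slid across the input from left to right. With the reversed kernel aligned to the first three input elements, the first item of the convolution result is: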

$2 \times 1 + 3 \times 4 + 1 \times 3  =  17$

$\begin{bmatrix}3 & 4 & 1 &0  & 2 & 5\end{bmatrix}$

$\quad\  \begin{bmatrix}1 & 3 & 2 \end{bmatrix}\rightarrow $

Shifting the kernel one position to the right gives the second item of the convolution result:

$2 \times 0 + 3 \times 1 + 1 \times 4  =  7$


... And so on.

In [2]:
f = np.array([2, 3, 1])
g = np.array([3, 4, 1, 0, 2, 5])
conv = np.convolve(f, g, mode = 'valid')
print(conv)

[17  7  5 16]
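The flip-and-slide steps described above can also be written out by hand. The following is a minimal sketch, not how `np.convolve` is implemented internally; the helper name `convolve1d_valid` is hypothetical and introduced only for illustration.

```python
import numpy as np

def convolve1d_valid(signal, kernel):
    """Hypothetical helper: 1D convolution without padding ('valid' mode)."""
    signal = np.asarray(signal)
    flipped = np.asarray(kernel)[::-1]        # reverse the kernel: [2, 3, 1] -> [1, 3, 2]
    n_out = len(signal) - len(flipped) + 1    # number of positions where the kernel fits
    # At each position, take the dot product of the window and the reversed kernel.
    return np.array([np.dot(signal[i:i + len(flipped)], flipped)
                     for i in range(n_out)])

manual = convolve1d_valid([3, 4, 1, 0, 2, 5], [2, 3, 1])
print(manual)  # -> [17  7  5 16]
```

The result agrees with `np.convolve(f, g, mode='valid')` for the same inputs.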


### Convolution for 2-dimensional array

In [3]:
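def convolution2d(image, kernel, stride=(1, 1), padding=(0, 0)):
    # Slide `kernel` over `image` without flipping it (cross-correlation).
    # `stride` and `padding` are (height, width) pairs.
    p_h, p_w = padding
    s_h, s_w = stride
    # Zero-pad p_h rows at the top and bottom, p_w columns at the left and right.
    image = np.pad(image,
                   [(p_h, p_h), (p_w, p_w)],
                   mode='constant',
                   constant_values=0)

    k_h, k_w = kernel.shape
    i_h, i_w = image.shape  # shape after padding

    # Number of kernel positions that fit along each axis.
    output_h = (i_h - k_h) // s_h + 1
    output_w = (i_w - k_w) // s_w + 1

    output = np.zeros((output_h, output_w))

    for y in range(output_h):
        for x in range(output_w):
            # Window whose top-left corner is at (y*s_h, x*s_w).
            window = image[y*s_h : y*s_h+k_h, x*s_w : x*s_w+k_w]
            output[y, x] = np.sum(window * kernel)
    return output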

Example of 2D convolution:

$\begin{bmatrix}
1 & 3 & 1 & 0 & 2 & 1 & 0 \\ 
1 & 1 & 1 & 2 & 1 & 2 & 1 \\ 
2 & 1 & 9 & 9 & 8 & 2 & 0 \\ 
0 & 2 & 9 & 1 & 9 & 0 & 1 \\ 
1 & 0 & 9 & 0 & 8 & 2 & 1 \\ 
3 & 1 & 1 & 2 & 0 & 2 & 2 \\ 
1 & 3 & 1 & 3 & 3 & 2 & 0
\end{bmatrix} \ast \begin{bmatrix}
1 & 1 & 1\\ 
1 & 0 & 1\\ 
1 & 0 & 1
\end{bmatrix} = \begin{bmatrix}
 18 & 17 & 22 & 18 & 13\\
 23 & 17 & 39 & 17 & 22\\
 31 & 22 & 61 & 22 & 29\\
 25 & 15 & 37 & 16 & 21\\
 16 & 18 & 22 & 19 & 16
\end{bmatrix}$

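The first item in the convolution result is the sum of the element-wise products of the kernel and the 3 x 3 region of the input highlighted below (note that `convolution2d`, unlike `np.convolve`, does not flip the kernel):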

$\begin{bmatrix}
1 & 3 & 1 & . & . & . & . \\ 
1 & 1 & 1 & . & . & . & . \\ 
2 & 1 & 9 & . & . & . & . \\ 
. & . & . & . & . & . & . \\
. & . & . & . & . & . & . \\
. & . & . & . & . & . & . \\
. & . & . & . & . & . & . 
\end{bmatrix} \ast \begin{bmatrix}
1 & 1 & 1\\ 
1 & 0 & 1\\ 
1 & 0 & 1
\end{bmatrix} = \begin{bmatrix}
 18  & . & . & . & .\\
 . & . & . & . & .\\
 . & . & . & . & .\\
 . & . & . & . & .\\
 . & . & . & . & .
\end{bmatrix}$

The second item in the convolution result is computed the same way after the kernel moves one column to the right, covering the region highlighted below:

$\begin{bmatrix}
. & 3 & 1 & 0 & . & . & . \\ 
. & 1 & 1 & 2 & . & . & . \\ 
. & 1 & 9 & 9 & . & . & . \\ 
. & . & . & . & . & . & . \\
. & . & . & . & . & . & . \\
. & . & . & . & . & . & . \\
. & . & . & . & . & . & . 
\end{bmatrix} \ast \begin{bmatrix}
1 & 1 & 1\\ 
1 & 0 & 1\\ 
1 & 0 & 1
\end{bmatrix} = \begin{bmatrix}
 18  & 17 & . & . & .\\
 . & . & . & . & .\\
 . & . & . & . & .\\
 . & . & . & . & .\\
 . & . & . & . & .
\end{bmatrix}$

and so on.

In [4]:
image = np.array([[1, 3, 1, 0, 2, 1 ,0],
                  [1, 1, 1, 2, 1, 2 ,1],
                  [2, 1, 9, 9, 8, 2 ,0],
                  [0, 2, 9, 1, 9, 0 ,1],
                  [1, 0, 9, 0, 8, 2 ,1],
                  [3, 1, 1, 2, 0, 2 ,2],
                  [1, 3, 1, 3, 3, 2 ,0]])

kernel = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 0, 1]])

conv2d = convolution2d(image, kernel)
print(conv2d)

[[18. 17. 22. 18. 13.]
 [23. 17. 39. 17. 22.]
 [31. 22. 61. 22. 29.]
 [25. 15. 37. 16. 21.]
 [16. 18. 22. 19. 16.]]
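The same result can be reproduced without explicit loops, which also makes it easy to see the effect of flipping the kernel. This is a sketch under the assumption that NumPy 1.20 or later is available (for `sliding_window_view`); the variable names `xcorr` and `true_conv` are chosen here for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.array([[1, 3, 1, 0, 2, 1, 0],
                  [1, 1, 1, 2, 1, 2, 1],
                  [2, 1, 9, 9, 8, 2, 0],
                  [0, 2, 9, 1, 9, 0, 1],
                  [1, 0, 9, 0, 8, 2, 1],
                  [3, 1, 1, 2, 0, 2, 2],
                  [1, 3, 1, 3, 3, 2, 0]])

kernel = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 0, 1]])

# All 3x3 windows of the image, arranged as an array of shape (5, 5, 3, 3).
windows = sliding_window_view(image, kernel.shape)

# Kernel applied as-is (cross-correlation): matches the output printed above.
xcorr = (windows * kernel).sum(axis=(2, 3))

# Kernel flipped along both axes (convolution in the strict mathematical sense).
true_conv = (windows * kernel[::-1, ::-1]).sum(axis=(2, 3))

print(xcorr[0, 0], true_conv[0, 0])  # -> 18 16
```

Because this kernel is not symmetric top to bottom, the two results differ. Deep learning libraries generally apply the kernel without flipping, as `convolution2d` does.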


Padding = [1, 1]:

In [5]:
conv2d = convolution2d(image, kernel, padding=[1,1])
print(conv2d)

[[ 4.  4.  6.  5.  5.  4.  3.]
 [ 6. 18. 17. 22. 18. 13.  5.]
 [ 5. 23. 17. 39. 17. 22.  5.]
 [ 5. 31. 22. 61. 22. 29.  4.]
 [ 3. 25. 15. 37. 16. 21.  5.]
 [ 5. 16. 18. 22. 19. 16.  7.]
 [ 7.  7. 10.  7.  9.  7.  6.]]
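To see where the new border values come from, it can help to look at the padded image directly. The sketch below repeats the `np.pad` call used inside `convolution2d` and recomputes only the top-left output item; the names `padded` and `window` are illustrative.

```python
import numpy as np

image = np.array([[1, 3, 1, 0, 2, 1, 0],
                  [1, 1, 1, 2, 1, 2, 1],
                  [2, 1, 9, 9, 8, 2, 0],
                  [0, 2, 9, 1, 9, 0, 1],
                  [1, 0, 9, 0, 8, 2, 1],
                  [3, 1, 1, 2, 0, 2, 2],
                  [1, 3, 1, 3, 3, 2, 0]])

kernel = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 0, 1]])

# One row/column of zeros on every side turns the 7x7 image into a 9x9 one.
padded = np.pad(image, [(1, 1), (1, 1)], mode='constant', constant_values=0)
print(padded.shape)  # -> (9, 9)

# The first window now overlaps the zero border.
window = padded[0:3, 0:3]
first_item = np.sum(window * kernel)
print(first_item)  # -> 4
```

With a 3 x 3 kernel and a stride of 1, this padding keeps the output the same size as the original 7 x 7 input.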


Padding = [0,0]; Stride = [2, 2]:

In [6]:
conv2d = convolution2d(image, kernel, stride=[2,2])
print(conv2d)

[[18. 22. 13.]
 [31. 61. 29.]
 [16. 22. 16.]]
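The output shapes of the three examples above (5 x 5, 7 x 7 and 3 x 3) follow from the size calculation inside `convolution2d`. As a small sketch, the same arithmetic can be isolated in a helper; the name `conv_output_size` is hypothetical and applies to one axis at a time.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Hypothetical helper: output length along one axis.

    Mirrors `(i_h - k_h) // s_h + 1` in convolution2d, where i_h is the
    padded input size, i.e. size + 2 * padding.
    """
    return (size + 2 * padding - kernel) // stride + 1

no_padding = conv_output_size(7, 3)               # default: stride 1, padding 0
with_padding = conv_output_size(7, 3, padding=1)  # padding = [1, 1]
with_stride = conv_output_size(7, 3, stride=2)    # stride = [2, 2]

print(no_padding, with_padding, with_stride)  # -> 5 7 3
```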


## Max Pooling

Example of max pooling with kernel = [3, 3], applied to the padded convolution result from above:

$\begin{bmatrix}
4 & 4 & 6 & 5 & 5 & 4 & 3 \\ 
6 & 18 & 17 & 22 & 18 & 13 & 5 \\ 
5 & 23 & 17 & 39 & 17 & 22 & 5 \\ 
5 & 31 & 22 & 61 & 22 & 29 & 4 \\ 
3 & 25 & 15 & 37 & 16 & 21 & 5 \\ 
5 & 16 & 18 & 22 & 19 & 16 & 7 \\ 
7 & 7 & 10 & 7 & 9 & 7 & 6
\end{bmatrix} \Rightarrow  \begin{bmatrix}
 23 & 39 & 5\\
 31 & 61 & 7\\
 10 & 9 & 6
\end{bmatrix}$

The first item in max pooling result is the max value of the area covered by the 3 x 3 kernel at the upper-left corner:

$\begin{bmatrix}
4 & 4 & 6 & . & . & . & . \\ 
6 & 18 & 17 & . & . & . & . \\ 
5 & 23 & 17 & . & . & . & . \\ 
. & . & . & . & . & . & . \\ 
. & . & . & . & . & . & . \\ 
. & . & . & . & . & . & . \\ 
. & . & . & . & . & . & . 
\end{bmatrix} \Rightarrow  \begin{bmatrix}
 23 & . & .\\
 . & . & .\\
 . & . & .
\end{bmatrix}$

The kernel then moves to the right by its own width, skipping the area already covered, so the windows do not overlap. The second item in the max pooling result is the max value of the area now covered by the kernel:

$\begin{bmatrix}
. & . & . & 5 & 5 & 4 & . \\ 
. & . & . & 22 & 18 & 13 & . \\ 
. & . & . & 39 & 17 & 22 & . \\ 
. & . & . & . & . & . & . \\ 
. & . & . & . & . & . & . \\ 
. & . & . & . & . & . & . \\ 
. & . & . & . & . & . & . 
\end{bmatrix} \Rightarrow  \begin{bmatrix}
 23 & 39 & .\\
 . & . & .\\
 . & . & .
\end{bmatrix}$

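and so on, until the kernel traverses the entire input matrix. Because 7 is not a multiple of 3, the windows along the right and bottom edges cover only the single remaining column or row.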

In [7]:
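def maxpooling2d(image, kernel=(3, 3), stride=(0, 0), padding=(0, 0)):
    # NOTE: `stride` here is the extra gap between consecutive windows, so the
    # window advances by kernel size + stride; (0, 0) gives non-overlapping windows.
    p_h, p_w = padding
    s_h, s_w = stride
    k_h, k_w = kernel
    # Zero padding; a padded zero exceeds any negative input in border windows.
    image = np.pad(image,
                   [(p_h, p_h), (p_w, p_w)],
                   mode='constant',
                   constant_values=0)

    i_h, i_w = image.shape

    # Ceiling division: a partial window at the right/bottom edge still counts.
    output_h = -(-i_h // (k_h + s_h))
    output_w = -(-i_w // (k_w + s_w))

    output = np.zeros((output_h, output_w))

    for y in range(output_h):
        for x in range(output_w):
            y_, x_ = y*(s_h+k_h), x*(s_w+k_w)
            # Slicing past the edge simply truncates the window.
            window = image[y_ : y_+k_h, x_ : x_+k_w]
            output[y, x] = np.amax(window)
    return output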

In [8]:
conv2d = convolution2d(image, kernel, padding=[1,1])
maxp2d = maxpooling2d(conv2d, kernel=[3,3])
print(maxp2d)

[[23. 39.  5.]
 [31. 61.  7.]
 [10.  9.  6.]]
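For the non-overlapping case shown above, the same pooling can be sketched without loops by padding the input up to a multiple of the kernel size and reshaping it into blocks. This is an alternative technique, not the method used in `maxpooling2d`; the function name `maxpool_nonoverlapping` is hypothetical, and padding with `-inf` is an assumption made here so that padded cells never win the max.

```python
import numpy as np

feature_map = np.array([[4,  4,  6,  5,  5,  4, 3],
                        [6, 18, 17, 22, 18, 13, 5],
                        [5, 23, 17, 39, 17, 22, 5],
                        [5, 31, 22, 61, 22, 29, 4],
                        [3, 25, 15, 37, 16, 21, 5],
                        [5, 16, 18, 22, 19, 16, 7],
                        [7,  7, 10,  7,  9,  7, 6]], dtype=float)

def maxpool_nonoverlapping(x, k=3):
    """Hypothetical helper: k x k max pooling with non-overlapping windows."""
    h, w = x.shape
    # Rows/columns needed to reach the next multiple of k (2 each for a 7x7 input).
    pad_h, pad_w = (-h) % k, (-w) % k
    padded = np.pad(x, [(0, pad_h), (0, pad_w)],
                    mode='constant', constant_values=-np.inf)
    # Split into (block_row, row_in_block, block_col, col_in_block).
    blocks = padded.reshape(padded.shape[0] // k, k, padded.shape[1] // k, k)
    # Take the max over the two within-block axes.
    return blocks.max(axis=(1, 3))

pooled = maxpool_nonoverlapping(feature_map)
print(pooled)
```

For this input, `pooled` holds the same values as the `maxpooling2d` output above.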
