<br><br>
<font size = '6'><b>Convolutional Neural Networks</b></font>

- <a href="./reference_files/deep_learning_tutorial_2015.pdf" target="_blank">Slides</a> by Phillip Isola
- <a href="./reference_files/cnn1.pdf" target="_blank">Slides</a> by Prof. Ali Ghodsi

<table style="border-style: hidden; border-collapse: collapse;" width = "90%"> 
    <tr style="border-style: hidden; border-collapse: collapse;">
        <td width = 60% style="border-style: hidden; border-collapse: collapse;">
             
        </td>
        <td width = 30%>
        Prof. Seungchul Lee<br>
        iSystems<br>
        UNIST<br>
        http://isystems.unist.ac.kr/
        </td>
    </tr>
</table>

Table of Contents
<div id="toc"></div>



# 1. Traditional Machine Learning vs. Neural Networks

__Object recognition using machine learning__

<img src="./image_files/ML_clown_fish.png" width = 500>

__Neural Network__

<table style="border-style: hidden; border-collapse: collapse;" width = "96%"> 
    <tr style="border-style: hidden; border-collapse: collapse;">
        <td width = 48% style="border-style: hidden; border-collapse: collapse;">
<img src="./image_files/nn_clown_fish.png" width = 350>
        </td>
        <td width = 48%>
<img src="./image_files/nn_fish.png" width = 350>
        </td>
    </tr>
</table>

___Deep_ Neural Network__

<img src="./image_files/deep_fish.png" width = 500>


# 2. Convolutional Neural Networks

CNNs are simply neural networks that use _convolution_ in place of general matrix multiplication in at least one of their layers.

## 2.1. Convolution and cross-correlation

- Many machine learning libraries implement cross-correlation (the same sliding product, but without flipping the kernel) and call it convolution
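The difference between the two operations is easiest to see on a small 1-D signal. The following is a minimal sketch in plain Python; the function names `cross_correlate` and `convolve` are chosen here for illustration and do not come from any particular library.

```python
def cross_correlate(signal, kernel):
    """Slide the kernel over the signal without flipping it ('valid' positions only)."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]


def convolve(signal, kernel):
    """True convolution: cross-correlation with the kernel reversed."""
    return cross_correlate(signal, kernel[::-1])


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]          # asymmetric, so the flip matters

xcorr = cross_correlate(signal, kernel)   # [-2, -2, -2]
conv = convolve(signal, kernel)           # [2, 2, 2]
```

For a symmetric kernel such as `[1, 2, 1]` the two results coincide. When the kernel is learned, the flip only changes which orientation of the kernel ends up stored, which is presumably why libraries do not bother with it.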

In [1]:
%%html
<iframe src="https://www.youtube.com/embed/Ma0YONjMZLI" 
width="560" height="315" frameborder="0" allowfullscreen></iframe>

- Discrete convolution can be viewed as multiplication by a matrix

<img src="./image_files/conv_animation.gif" width = 400>

<table style="border-style: hidden; border-collapse: collapse;" width = "96%"> 
    <tr style="border-style: hidden; border-collapse: collapse;">
        <td width = 48% style="border-style: hidden; border-collapse: collapse;">
<img src="./image_files/conv.png" width = 400>
        </td>
        <td width = 48%>
<img src="./image_files/cnn_conv.png" width = 400>
        </td>
    </tr>
</table>
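The view of discrete convolution as multiplication by a matrix can be sketched as follows. This is an illustrative 1-D example in plain Python; `conv_matrix` and `matvec` are hypothetical helper names. Each row of the matrix holds the same (flipped) kernel, shifted one position to the right.

```python
def conv_matrix(kernel, n):
    """Build the (n-k+1) x n matrix whose product with a length-n signal
    equals the 'valid' convolution of that signal with the kernel."""
    k = len(kernel)
    flipped = kernel[::-1]
    rows = []
    for i in range(n - k + 1):
        row = [0] * n
        row[i:i + k] = flipped      # same weights in every row, shifted by one
        rows.append(row)
    return rows


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]

W = conv_matrix(kernel, len(signal))
# W == [[-1,  0,  1,  0,  0],
#       [ 0, -1,  0,  1,  0],
#       [ 0,  0, -1,  0,  1]]
result = matvec(W, signal)          # [2, 2, 2]
```

Most entries of `W` are zero and the nonzero entries repeat along the diagonals, which is the matrix form of the two properties discussed next.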


__Sparse interactions__

- CNNs typically have sparse connectivity (sparse weights)

- This is accomplished by making the kernel (convolution mask) smaller than the input
 

__Parameter sharing__

- In CNNs each element of the kernel is used at every position of the input

- Instead of learning a separate set of parameters for every location, we learn only one set
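A rough count makes the effect of sparse connectivity and parameter sharing concrete. The sketch below compares a fully connected layer with a 1-D convolution layer of the same output size. The sizes are arbitrary illustrative values, and biases are ignored.

```python
def dense_layer_weights(n_inputs, n_outputs):
    """Fully connected: every output has its own weight for every input."""
    return n_inputs * n_outputs


def conv_layer_connections(n_outputs, kernel_size):
    """Sparse connectivity: each output only looks at kernel_size inputs."""
    return n_outputs * kernel_size


def conv_layer_weights(kernel_size):
    """Parameter sharing: one kernel is reused at every position."""
    return kernel_size


n_inputs = 1000
kernel_size = 3
n_outputs = n_inputs - kernel_size + 1        # 'valid' positions: 998

dense_weights = dense_layer_weights(n_inputs, n_outputs)            # 998000
conv_connections = conv_layer_connections(n_outputs, kernel_size)   # 2994
conv_weights = conv_layer_weights(kernel_size)                      # 3
```

Sparse connectivity reduces the number of multiplications from 998000 to 2994. Parameter sharing then reduces the number of values that have to be learned and stored from 2994 to 3.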


__Equivariance__

- A function $f(x)$ is equivariant to a function $g$ if $f(g(x)) = g(f(x))$

- A convolution layer has equivariance to translation

- If we translate $x$ and then apply convolution, the result is the same as if we apply convolution to $x$ and then translate the output

- Note that convolution is not equivariant to some other transformations, such as changes in the scale or rotation of an image
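Translation equivariance can be checked numerically. The sketch below assumes circular (wrap-around) boundaries so that the identity $f(g(x)) = g(f(x))$ holds exactly. With zero padding it holds only away from the borders. The helper names `shift` and `circular_conv` are illustrative.

```python
def shift(x, s):
    """Circularly translate a list by s positions to the right."""
    s %= len(x)
    return x[-s:] + x[:-s] if s else list(x)


def circular_conv(x, kernel):
    """Sliding product with wrap-around, so the output has the same length as x."""
    n = len(x)
    return [sum(x[(i + j) % n] * w for j, w in enumerate(kernel))
            for i in range(n)]


x = [0, 0, 1, 3, 2, 0, 0, 0]
kernel = [1, -1]

shifted_then_conv = circular_conv(shift(x, 2), kernel)
conv_then_shifted = shift(circular_conv(x, kernel), 2)
# both are [0, 0, 0, -1, -2, 1, 2, 0]
```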

## 2.2. Computation in a neural net

__1) Linear combination__

<img src="./image_files/linear_sum.png" width = 320>

__2) Nonlinear activation function__

<img src="./image_files/ReLU.png" width = 500>
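As a minimal sketch of steps 1) and 2) for a single unit: a weighted sum of the inputs plus a bias, followed by a ReLU. The input values, weights, and bias are made-up numbers for illustration.

```python
def linear_combination(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias term."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def relu(z):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, z)


inputs = [1.0, -2.0, 0.5]
weights = [0.4, 0.3, -0.2]
bias = 0.1

z = linear_combination(inputs, weights, bias)   # 0.4 - 0.6 - 0.1 + 0.1, about -0.2
a = relu(z)                                     # 0.0, because z is negative
```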


__3) Pooling functions__

<img src="./image_files/max_pool.png" width = 400>

- The maximum of a rectangular neighborhood (max pooling operation)

<img src="./image_files/max_pooling.png" width = 300>

- Other candidates
    - the average of a rectangular neighborhood
    - the $L_2$ norm of a rectangular neighborhood


- Pooling with downsampling
    - reduce the representation size by a factor of 2, which reduces the computational and statistical burden on the next layer

<img src="./image_files/pooling_downsampling.png" width = 400>
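The pooling functions listed above can be sketched with one generic routine that applies a reduction to each window. This example assumes a 2×2 window with stride 2, which halves each spatial dimension. `pool2d`, `average`, and `l2_norm` are illustrative names.

```python
import math


def pool2d(image, size, stride, reduce_fn):
    """Apply reduce_fn to each size x size window, moving by `stride`."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce_fn(window))
        out.append(out_row)
    return out


def average(window):
    return sum(window) / len(window)


def l2_norm(window):
    return math.sqrt(sum(v * v for v in window))


image = [[1, 3, 2, 0],
         [4, 6, 1, 1],
         [0, 2, 3, 4],
         [1, 1, 0, 4]]

max_pooled = pool2d(image, 2, 2, max)        # [[6, 2], [2, 4]]
avg_pooled = pool2d(image, 2, 2, average)    # [[3.5, 1.0], [1.0, 2.75]]
l2_pooled = pool2d(image, 2, 2, l2_norm)     # top-left entry is sqrt(62)
```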

- Pooling and translations
    - Pooling helps to make the representation become invariant to small translations of the input
<img src="./image_files/pooling_translation.png" width = 500>
    - Invariance to local translation can be a very useful property if we care more about whether some feature is present than exactly where it is.

    - For example, we need not know the exact location of the eyes in a face

<img src="./image_files/pool_edge.png" width = 500>
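A small 1-D experiment illustrates both the invariance and its limits. The sketch assumes non-overlapping max pooling with window size 2, and the feature values are made up.

```python
def max_pool1d(x, size):
    """Non-overlapping max pooling (stride equals the window size)."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


feature      = [0, 0, 5, 0, 0, 0, 0, 0]   # a detector fires at position 2
shifted_by_1 = [0, 0, 0, 5, 0, 0, 0, 0]   # same response, one step to the right
shifted_by_2 = [0, 0, 0, 0, 5, 0, 0, 0]   # two steps to the right

pooled = max_pool1d(feature, 2)           # [0, 5, 0, 0]
pooled_1 = max_pool1d(shifted_by_1, 2)    # [0, 5, 0, 0]  (unchanged)
pooled_2 = max_pool1d(shifted_by_2, 2)    # [0, 0, 5, 0]  (response moved)
```

The one-step shift stays inside the same pooling window, so the pooled output does not change. The two-step shift crosses into the next window and the output changes, which is why the invariance holds only for small translations.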

__4) Classification__

<img src="./image_files/classification.png" width = 450>

__CNN summary__

<table style="border-style: hidden; border-collapse: collapse;" width = "96%"> 
    <tr style="border-style: hidden; border-collapse: collapse;">
        <td width = 48% style="border-style: hidden; border-collapse: collapse;">
<img src="./image_files/layer_of_cnn.png" width = 450>
        </td>
        <td width = 48%>
<img src="./image_files/cnn.png" width = 400>
        </td>
    </tr>
</table>

## 2.3. Ingredients

- Select important features of the data
    - linear filters
    - pointwise nonlinearity


- Group features that all indicate the same thing
    - pooling


- Repeat to achieve greater abstraction

<img src="./image_files/cnn_ideas.png" width = 500>
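The three ingredients can be chained into a toy two-stage pipeline. This is only a sketch: the kernels below are hand-picked for illustration, whereas in a real CNN they would be learned, and the signal is made up.

```python
def conv1d(x, kernel):
    """Linear filter: sliding weighted sum ('valid' positions only)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def relu(x):
    """Pointwise nonlinearity."""
    return [max(0, v) for v in x]


def max_pool(x, size=2):
    """Group neighbouring responses by keeping the strongest one."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def stage(x, kernel):
    """One filter -> nonlinearity -> pooling block."""
    return max_pool(relu(conv1d(x, kernel)))


signal = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0]

edge_kernel = [-1, 1]          # stage 1: responds to rising edges
pair_kernel = [1, 0, 0, 1]     # stage 2: responds to two edges three steps apart

h1 = stage(signal, edge_kernel)   # [1, 0, 0, 1, 0, 0]
h2 = stage(h1, pair_kernel)       # [2]
```

The first stage marks where edges occur. The second stage works on those edge responses and fires on a particular arrangement of edges, a slightly more abstract feature than the raw signal values.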

# 3. Learning

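- Backpropagation computes the gradient of the loss with respect to every weight

- Stochastic gradient descent uses these gradients to update the weights, a few training examples at a time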
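The learning loop can be sketched on the smallest possible model, a single linear unit with a squared-error loss, where the chain rule of backpropagation has only one step. The data, learning rate, and number of epochs are illustrative assumptions, and `sgd_fit` is a hypothetical name.

```python
import random


def sgd_fit(data, lr=0.1, epochs=200, seed=0):
    """Fit y = w * x + b by stochastic gradient descent, one example per update."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)                 # visit the examples in random order
        for x, y in data:
            pred = w * x + b              # forward pass
            err = pred - y                # dL/dpred for L = 0.5 * (pred - y)**2
            grad_w = err * x              # chain rule: dL/dw = dL/dpred * dpred/dw
            grad_b = err                  # chain rule: dL/db = dL/dpred * 1
            w -= lr * grad_w              # gradient step
            b -= lr * grad_b
    return w, b


# Noise-free samples of y = 2x + 1
data = [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, b = sgd_fit(data)                      # w close to 2, b close to 1
```

In a deep network the forward pass and the chain rule run through many layers, but the update rule at the end is the same.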

# 4. How to avoid overfitting?

- Convolutional nets use a prior that stuff in the world does not change identity as it translates

- Data Augmentation
    - Augment the training data by adding jittered versions of each image

<img src="./image_files/data_aug.png" width = 500>
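A minimal sketch of jittering, assuming images are stored as lists of rows and that the label is unchanged by small shifts and by mirroring (the mirroring assumption does not hold for every task, for example text). `jitter` and `augment` are illustrative names.

```python
import random


def jitter(image, max_shift, rng):
    """Return a randomly shifted (zero-padded) and possibly mirrored copy."""
    rows, cols = len(image), len(image[0])
    dr = rng.randint(-max_shift, max_shift)
    dc = rng.randint(-max_shift, max_shift)
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            src_r, src_c = r - dr, c - dc
            if 0 <= src_r < rows and 0 <= src_c < cols:
                out[r][c] = image[src_r][src_c]
    if rng.random() < 0.5:                 # mirror left-right half of the time
        out = [row[::-1] for row in out]
    return out


def augment(images, labels, copies, max_shift=1, seed=0):
    """Keep the originals and add `copies` jittered versions of each image."""
    rng = random.Random(seed)
    aug_images, aug_labels = list(images), list(labels)
    for image, label in zip(images, labels):
        for _ in range(copies):
            aug_images.append(jitter(image, max_shift, rng))
            aug_labels.append(label)       # the label does not change
    return aug_images, aug_labels


images = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
          [[0, 1, 0], [1, 1, 1], [0, 1, 0]]]
labels = ["a", "b"]

aug_images, aug_labels = augment(images, labels, copies=3)   # 8 images, 8 labels
```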

- Dropout
    - Randomly drop units (set their activations to zero) at each training step
    - Makes the network less sensitive to local changes
    - Acts as regularization
    
<img src="./image_files/dropout.png" width = 500>
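One common way to implement dropout is the "inverted" variant sketched below. It is an assumption here, not necessarily the variant in the figure. Surviving activations are scaled by `1 / keep_prob` during training so that no rescaling is needed at test time.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero each activation with probability drop_prob
    and rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)           # identity at test time
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)
hidden = [1.0] * 10000

dropped = dropout(hidden, 0.5, rng)                     # entries are 0.0 or 2.0
mean_after = sum(dropped) / len(dropped)                # close to 1.0
at_test_time = dropout(hidden[:3], 0.5, rng, training=False)   # [1.0, 1.0, 1.0]
```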

# 5. How do deep neural nets work?

- Hierarchy of simple, repeated computations

- Sift through data by filtering it

- Build up invariance by pooling alike features

- Can be learned with vanilla SGD

# 6. Software Tools

- Caffe
    - fast and popular
    - hard to use
    - C++ with limited MATLAB and Python interfaces

- Theano
    - Symbolic computation and automatic differentiation in Python

- Torch
    - Lua



<font size='5'><b>Online Video Lectures</b></font>

- <a href="./reference_files/deep_learning_tutorial_2015.pdf" target="_blank">Slides</a> by Phillip Isola

In [4]:
%%html
<iframe src="https://www.youtube.com/embed/bL1Zymz1b7g" 
width="560" height="315" frameborder="0" allowfullscreen></iframe>

- <a href="./reference_files/cnn1.pdf" target="_blank">Slides</a> by Prof. Ali Ghodsi

In [5]:
%%html
<iframe src="https://www.youtube.com/embed/ZMBp7_qqtLE" 
width="560" height="315" frameborder="0" allowfullscreen></iframe>

In [3]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')
