# Convolutional Neural Networks
---


## ImageNet


![ImageNet](Images/imagenet.jpg)


## What happens if we use this for images? (hint: dimensions)
![Deep](Images/deep.png)
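
To make the hint concrete, the sketch below counts the parameters of a single fully connected layer applied to a flattened image. The image size (224x224 RGB) and the hidden width of 1000 units are assumptions chosen for illustration; neither comes from the figure above.

```python
# Parameter count of one fully connected layer on a flattened image.
# Assumed sizes (illustrative only): 224x224 RGB input, 1000 hidden units.
height, width, channels = 224, 224, 3
hidden_units = 1000

inputs = height * width * channels   # length of the flattened image vector
weights = inputs * hidden_units      # one weight per (input, hidden unit) pair
biases = hidden_units                # one bias per hidden unit

print(f"inputs:  {inputs:,}")
print(f"weights: {weights:,}")
print(f"total:   {weights + biases:,}")
```

Under these assumptions a single layer already needs roughly 150 million weights, which is the dimensionality problem the heading hints at.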


## Neurological Origins


![Cat](Images/cat.jpg)


Hubel and Wiesel. Receptive fields of single neurons in the cat's striate cortex. 1959


* Convolutional networks were developed by Yann LeCun (1989)
* Goal was to build automatic mail-sorting machines
* First "real-world" application of Neural Networks


![title](Images/cnn.jpg)  
* Weights are shared across space
* Weight sharing lets a filter respond to a feature wherever it appears, which provides the basis for translational invariance

## Convolution


![Convolution](Images/convolution.png)


![Algorithm](Images/alg.png)


![Time Complexity](Images/time.png)


The equation above gives the run-time complexity of the convolution operation, where $F_{n}$ is the number of filters, $H$ and $W$ are the height and width of the input image, and $f_{h}$, $f_{w}$, and $f_{d}$ are the height, width, and depth of each filter. Note that the equation describes only the forward pass of a convolutional layer. **Backpropagation through the network is roughly twice as costly as the forward pass, because we must compute the gradient of the loss with respect to both the weights and the inputs.**
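
As a rough illustration of where that cost comes from, here is a naive NumPy forward pass that also counts multiply-accumulate operations. It assumes stride 1 and zero padding chosen so the output keeps the input's height and width (which requires odd filter sizes). The function name `conv_forward` and the loop structure are illustrative, not the algorithm from the figure. Under those assumptions the count equals $F_{n} \cdot H \cdot W \cdot f_{h} \cdot f_{w} \cdot f_{d}$.

```python
import numpy as np

def conv_forward(image, filters):
    """Naive 'same' convolution (cross-correlation) with stride 1.

    image:   (H, W, f_d) input
    filters: (F_n, f_h, f_w, f_d) filter bank, with odd f_h and f_w
    Returns the (H, W, F_n) output and the number of multiply-accumulates.
    """
    H, W, f_d = image.shape
    F_n, f_h, f_w, _ = filters.shape
    pad_h, pad_w = f_h // 2, f_w // 2
    # Zero-pad height and width so the output has the same spatial size.
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w), (0, 0)))
    out = np.zeros((H, W, F_n))
    macs = 0
    for n in range(F_n):          # every filter
        for i in range(H):        # every output row
            for j in range(W):    # every output column
                patch = padded[i:i + f_h, j:j + f_w, :]
                out[i, j, n] = np.sum(patch * filters[n])
                macs += f_h * f_w * f_d
    return out, macs

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8, 3))
filters = rng.standard_normal((4, 3, 3, 3))
out, macs = conv_forward(image, filters)
print(out.shape, macs)
```

Real libraries avoid these Python loops, for example by rewriting the convolution as a matrix multiplication, but the operation count is the same.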




![Matrix](Images/conv_matrix.png)


## Rectified Linear Unit (ReLU)


![ReLU](Images/relu.png)


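* Very inexpensive to compute
* Mitigates the vanishing gradient problem (the gradient is 1 for all positive inputs)
* Differentiable at all points except 0
* Can kill neurons: a unit whose input stays negative receives zero gradient and stops learning (Leaky ReLU addresses this)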
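
A minimal NumPy sketch of ReLU, its gradient, and the leaky variant. The negative slope `alpha=0.01` is a commonly used default assumed here, and taking the gradient at exactly 0 to be 0 (or `alpha`) is a convention rather than something stated above.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # 1 where x > 0, else 0 (the value at x == 0 is a convention)
    return (x > 0).astype(float)

def leaky_relu(x, alpha=0.01):
    # alpha is an assumed small slope for negative inputs
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # the gradient is never exactly zero, so a unit cannot "die"
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))
print(relu_grad(x))
print(leaky_relu(x))
print(leaky_relu_grad(x))
```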


![Leaky ReLU](Images/leakyrelu.png)


## Pooling



![Pooling](Images/pooling.png)
* Gives a small amount of translational invariance at each level
    * Location information of the most active neuron is thrown away
* Reduces the number of inputs to the next layer
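
The sketch below implements 2x2 max pooling with stride 2 on a single-channel feature map. The window size and stride are assumptions, and `max_pool_2x2` is an illustrative name. Moving the strongest activation within its window leaves the output unchanged, which is the small translational invariance described above.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on an (H, W) array; H and W must be even."""
    H, W = feature_map.shape
    # Split into non-overlapping 2x2 blocks, then take the max of each block.
    blocks = feature_map.reshape(H // 2, 2, W // 2, 2)
    return blocks.max(axis=(1, 3))

a = np.array([[9, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 5],
              [0, 0, 0, 0]])
# Same map with the 9 moved one pixel to the right, inside the same window.
b = np.array([[0, 9, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 5],
              [0, 0, 0, 0]])

print(max_pool_2x2(a))
print(np.array_equal(max_pool_2x2(a), max_pool_2x2(b)))
```

The 4x4 input becomes a 2x2 output, so the next layer sees a quarter as many inputs.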


## Fully Connected


In [None]:
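# forward pass of a 3-layer neural network
import numpy as np
f = lambda x: 1.0 / (1.0 + np.exp(-x))  # activation function (sigmoid)
x = np.random.randn(3, 1)  # random input vector of three numbers (3x1)
W1, b1 = np.random.randn(4, 3), np.zeros((4, 1))  # first hidden layer parameters (random for illustration)
W2, b2 = np.random.randn(4, 4), np.zeros((4, 1))  # second hidden layer parameters
W3, b3 = np.random.randn(1, 4), np.zeros((1, 1))  # output layer parameters
h1 = f(np.dot(W1, x) + b1)  # first hidden layer activations (4x1)
h2 = f(np.dot(W2, h1) + b2)  # second hidden layer activations (4x1)
out = np.dot(W3, h2) + b3  # output neuron (1x1)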

## Softmax Output


$$y_{i} = \dfrac{e^{z_{i}}}{\sum_{j=1}^{N}e^{z_{j}}}$$
* Output sums to one
* Represents a probability distribution across discrete, mutually exclusive alternatives


In [None]:
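def softmax(x):
    # shift by the max for numerical stability; the result is unchanged (assumes numpy imported as np)
    e = np.exp(x - np.max(x))
    return e / np.sum(e)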

## [Softmax Derivative](https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative/)


$$\frac{\partial y_{i}}{\partial z_{i}} = y_{i}(1-y_{i})$$

For $i \neq j$, the off-diagonal terms are $\frac{\partial y_{i}}{\partial z_{j}} = -y_{i} y_{j}$.

In [None]:
def softmax_derivative(y):
    # y: softmax output; returns only the diagonal terms dy_i/dz_i
    return y * (1 - y)
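
The function above covers only the diagonal terms. As a sketch, the full Jacobian $\frac{\partial y_{i}}{\partial z_{j}} = y_{i}(\delta_{ij} - y_{j})$ can be written as $\mathrm{diag}(y) - y y^{T}$. The name `softmax_jacobian` is illustrative, and the block defines its own numerically stabilized `softmax` so that it runs on its own.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift by the max for numerical stability
    return e / np.sum(e)

def softmax_jacobian(y):
    """J[i, j] = dy_i / dz_j = y_i * (delta_ij - y_j) for a 1-D softmax output y."""
    return np.diag(y) - np.outer(y, y)

z = np.array([1.0, 2.0, 3.0])
y = softmax(z)
J = softmax_jacobian(y)

# Compare against central finite differences.
eps = 1e-6
J_num = np.zeros((3, 3))
for j in range(3):
    dz = np.zeros(3)
    dz[j] = eps
    J_num[:, j] = (softmax(z + dz) - softmax(z - dz)) / (2 * eps)

print(np.allclose(J, J_num, atol=1e-8))
print(np.allclose(np.diag(J), y * (1 - y)))
```

The diagonal of this matrix matches `y * (1 - y)`, and each column sums to zero because the outputs always sum to one.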

## Cross-entropy Cost Function


$$C = - \sum_{j} t_{j} \log y_{j}$$

* The gradient with respect to the output $y_{j}$ is very large when the target value is 1 and the output is close to 0

$$\frac{\partial C}{\partial z_{i}} = \sum_{j} \frac{\partial C}{\partial y_{j}} \frac{\partial y_{j}}{\partial z_{i}} = y_{i} - t_{i}$$


In [3]:
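def cost(y,t):
    # y: predicted probabilities (softmax output), t: target
    # returns the gradient of the cross-entropy cost with respect to the logits z
    return y - t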
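
To connect the two formulas, the sketch below evaluates the cross-entropy cost itself and checks numerically that its gradient with respect to the logits is $y - t$. The helpers `softmax` and `cross_entropy` are illustrative names, and the check assumes a target vector that sums to one (here one-hot).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift by the max for numerical stability
    return e / np.sum(e)

def cross_entropy(y, t):
    # C = -sum_j t_j * log(y_j)
    return -np.sum(t * np.log(y))

z = np.array([0.5, -1.0, 2.0])
t = np.array([0.0, 0.0, 1.0])    # one-hot target

analytic = softmax(z) - t         # dC/dz_i = y_i - t_i

# Central finite differences on the logits.
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(len(z)):
    dz = np.zeros_like(z)
    dz[i] = eps
    numeric[i] = (cross_entropy(softmax(z + dz), t)
                  - cross_entropy(softmax(z - dz), t)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))
```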

## Dropout


![Dropout](Images/dropout.png)

* Combats overfitting
* Randomly drops units (along with their connections) during training
* Acts as a form of regularization
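
A minimal sketch of dropout applied to a layer's activations, using the common "inverted" formulation in which kept activations are rescaled during training so that nothing changes at test time. The function name `dropout`, the keep probability of 0.5, and the inverted scaling are assumptions for illustration; the original paper instead scales the weights at test time.

```python
import numpy as np

def dropout(activations, keep_prob=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability 1 - keep_prob."""
    if not training:
        return activations        # test time: use all units unchanged
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(activations.shape) < keep_prob
    # Rescale so the expected activation matches the no-dropout case.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
h = np.ones((1000, 100))
dropped = dropout(h, keep_prob=0.5, rng=rng)

print(np.unique(dropped))         # only 0.0 and 2.0 remain
print(dropped.mean())             # close to 1.0 on average
```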


## Understanding CNNs


#### [Weight Visualization](https://www.youtube.com/watch?v=AgkfIQ4IGaM)


![Conv1](Images/conv1.png)


![Conv3](Images/conv3.png)

![Conv5](Images/conv5.png)

#### [Class Activation Mapping](http://cnnlocalization.csail.mit.edu/)


![Code](Images/code.png)


## [TensorFlow](https://www.tensorflow.org/)

* Open source software library for numerical computation using data flow graphs
* Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors)
* Allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API
* Originally developed by researchers and engineers working on the Google Brain Team


## [Keras](https://keras.io/)

* High-level API
* Able to use different libraries as a backend
    * TensorFlow, Theano, CNTK (Microsoft)
    

## Transfer Learning


![Transfer Learning](Images/transfer.png)

## References

[Stanford CS 231n](http://cs231n.github.io/convolutional-networks/)  
[Keras](https://keras.io/)  
[Keras Tutorials](https://github.com/keras-team/keras/tree/master/examples)  
[TensorFlow](https://www.tensorflow.org/)  
[Deep Visualization Toolbox](https://www.youtube.com/watch?v=AgkfIQ4IGaM)  
[Transfer Learning using Keras](https://towardsdatascience.com/transfer-learning-using-keras-d804b2e04ef8)  
[Dropout](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf)  
