# Convolutional Neural Networks

### References

* http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
* https://pdfs.semanticscholar.org/450c/a19932fcef1ca6d0442cbf52fec38fb9d1e5.pdf
* https://arxiv.org/pdf/1609.04112.pdf
* https://arxiv.org/pdf/1502.01852.pdf
* http://ais.uni-bonn.de/papers/icann2010_maxpool.pdf
* https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
* https://rdipietro.github.io/friendly-intro-to-cross-entropy-loss/
* https://peterroelants.github.io/posts/cross-entropy-softmax/

ToDo:

* What do the different objects and parametrizations in Keras do?
  * https://keras.io/layers/convolutional/
  * https://keras.io/layers/pooling/
  * https://keras.io/preprocessing/image/

# Theoretical Foundations

## The Convolution Operation

ToDo: 

* Jianxin Wu paper https://pdfs.semanticscholar.org/450c/a19932fcef1ca6d0442cbf52fec38fb9d1e5.pdf
* Jay Kuo paper https://arxiv.org/pdf/1609.04112.pdf
* Kaiming He paper https://arxiv.org/pdf/1502.01852.pdf

$$ (f \ast g)(t) = \int_{-\infty}^{\infty}f(\tau)g(t-\tau)\,d\tau $$
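Images and filters are discrete, so in practice the integral becomes a finite sum, $(f \ast g)[n] = \sum_k f[k]\,g[n-k]$. A minimal one-dimensional sketch of that sum follows; the function name `convolve_1d` and the sample values are illustrative assumptions, not taken from any of the referenced papers.

```python
def convolve_1d(f, g):
    """Full discrete convolution: (f * g)[n] = sum_k f[k] * g[n - k]."""
    n_out = len(f) + len(g) - 1
    out = [0.0] * n_out
    for n in range(n_out):
        for k in range(len(f)):
            # g[n - k] only contributes when the index falls inside g
            if 0 <= n - k < len(g):
                out[n] += f[k] * g[n - k]
    return out


signal = [1.0, 2.0, 3.0]   # made-up input values
kernel = [0.0, 1.0, 0.5]   # made-up filter values
print(convolve_1d(signal, kernel))  # [0.0, 1.0, 2.5, 4.0, 1.5]
```

Note that the kernel is flipped by the $g[n-k]$ index. Deep learning libraries typically skip the flip and compute a cross-correlation instead, which makes no practical difference when the filter values are learned.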

<img src='resources/cnn-components.png'>

## Pooling

ToDo:

* Dominik Scherer paper http://ais.uni-bonn.de/papers/icann2010_maxpool.pdf

Pooling induces a degree of *spatial invariance* in the feature map and reduces dimensionality: it keeps the strongest detected features while shrinking the spatial size of the map.
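As a rough illustration, here is a sketch of max pooling, one common pooling variant, over non-overlapping 2x2 windows. The helper `max_pool_2d`, its default `size` and `stride`, and the feature map values are assumptions made for this example.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, rows - size + 1, stride):
        pooled_row = []
        for j in range(0, cols - size + 1, stride):
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool_2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]

# Move the strongest activation of the top-left window by one pixel:
shifted = [row[:] for row in feature_map]
shifted[0][0], shifted[1][1] = shifted[1][1], shifted[0][0]
print(max_pool_2d(shifted) == pooled)  # True
```

The 4x4 map shrinks to 2x2, and the small shift inside a window leaves the pooled output unchanged, which is the invariance described above.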

## Softmax Activation and Cross-Entropy Loss

The **softmax function**, or **normalized exponential function**, is a generalization of the logistic function. For $K\geq2$, it *"squashes"* a $K$-dimensional vector $z$ of arbitrary real values into a $K$-dimensional vector $\sigma(z)$ whose entries each lie in the interval $(0,1)$ and sum to $1$, so the output can be read as a probability distribution over $K$ classes.

$$ \sigma_j(z) = \frac{e^{z_j}}{\sum_ke^{z_k}} $$
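A minimal sketch of this formula in plain Python is shown below. Subtracting $\max_k z_k$ before exponentiating is a common numerical-stability trick that is not part of the formula itself; it cancels in the ratio, so the result is the same. The function name `softmax` and the input values are illustrative.

```python
import math


def softmax(z):
    """Map a list of real scores to probabilities that sum to 1."""
    shift = max(z)  # cancels in the ratio, but prevents overflow in exp
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # [0.659, 0.242, 0.099]
print(softmax([1000.0, 1000.0]))     # [0.5, 0.5], no overflow
```

For $K=2$ with scores $(z, 0)$, the first output equals the logistic function $1/(1+e^{-z})$, which is the sense in which softmax generalizes it.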

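The **cross-entropy** between a true distribution $p$ and a predicted distribution $q$ (here, the softmax output) measures how poorly $q$ matches $p$, and serves as the classification loss: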

$$ H(p,q) = -\sum_xp(x)\log q(x) $$
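The sketch below evaluates $H(p,q)$ for a one-hot target against two made-up predictions. It assumes the natural logarithm, and the `eps` clamp is an added safeguard against $\log 0$ rather than part of the definition; both are choices made for this example.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) * log q(x), using the natural logarithm."""
    # Terms with p(x) = 0 contribute nothing; eps avoids log(0)
    return -sum(p_x * math.log(max(q_x, eps))
                for p_x, q_x in zip(p, q) if p_x > 0)


target = [1.0, 0.0, 0.0]        # one-hot label: class 0 is correct
confident = [0.9, 0.05, 0.05]   # hypothetical prediction favoring class 0
wrong = [0.1, 0.45, 0.45]       # hypothetical prediction favoring the others

print(cross_entropy(target, confident) < cross_entropy(target, wrong))  # True
```

With a one-hot target the sum collapses to $-\log q(\text{correct class})$, so the loss is small when the model assigns high probability to the right class and grows quickly as that probability approaches zero.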

## The Convolutional Neural Network Process

ToDo:

* Adit Deshpande blog post https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html

---

1. Transform the input image into a matrix of pixel intensities.
2. Convolve the input with a set of filters (feature detectors) and pass each result through an activation function (usually ReLU or Leaky ReLU).
3. Apply a pooling operation to induce spatial invariance and reduce dimensionality.
4. Repeat steps 2 and 3 to extract progressively higher-level features.
5. Flatten the pooled feature maps.
6. Feed the flattened feature vector into a fully connected neural network for regression or classification.
7. Backpropagate the errors to update the filters and the fully connected weights.
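The forward part of this process (steps 1, 2, 3, 5 and 6, with a single convolution and pooling stage) can be sketched in plain Python as follows. Everything here is an assumption made for illustration: the 5x5 image, the two hand-picked filters, and the fixed dense weights. In a real network the filters and weights are learned through step 7, which this sketch omits.

```python
import math


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]


def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


def flatten(fmaps):
    return [v for fmap in fmaps for row in fmap for v in row]


def dense_softmax(x, weights, biases):
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    shift = max(logits)
    exps = [math.exp(l - shift) for l in logits]
    return [e / sum(exps) for e in exps]


# Step 1: the image as a matrix of pixel intensities (a vertical bright line)
image = [[0.0, 0.0, 1.0, 0.0, 0.0] for _ in range(5)]

# Step 2-3: two hypothetical filters, each followed by ReLU and max pooling
filters = [
    [[-1.0, 1.0], [-1.0, 1.0]],   # responds to dark-to-bright vertical edges
    [[1.0, 1.0], [-1.0, -1.0]],   # responds to horizontal edges
]
pooled_maps = [max_pool(relu(conv2d_valid(image, k))) for k in filters]

# Step 5: flatten all pooled maps into one feature vector
features = flatten(pooled_maps)

# Step 6: a fully connected layer with made-up weights, then softmax
weights = [[0.5] * len(features), [-0.5] * len(features)]
probs = dense_softmax(features, weights, [0.0, 0.0])

print(pooled_maps[0])  # [[2.0, 0.0], [2.0, 0.0]]
print(pooled_maps[1])  # [[0.0, 0.0], [0.0, 0.0]]
print(probs[0] > probs[1])  # True
```

Only the vertical-edge filter fires on this image, so its pooled map carries all the signal that reaches the classifier.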