# Introduction to CNNs

## 1. Introduction to convolutional neural networks

### 1.1 Why?

#### 1.1.1 CNNs deal better with large images (the images used until now were fairly small)

- Imagine a color image of 500 x 500 pixels. You would end up with 500 x 500 x 3 = 750,000 input features, $(x_1, \ldots, x_{750,000})$.
- Next, imagine having 2000 hidden units in the first hidden layer. The weight matrix $w^{[1]}$ would then have dimensions (2000 x 750,000), which means 1.5 billion parameters. So it becomes a very high-dimensional problem!
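The arithmetic above can be checked in a few lines. This is only a sketch; the variable names are illustrative and biases are left out of the count.

```python
# Parameter count for a dense first layer on a 500 x 500 color image.
height, width, channels = 500, 500, 3
n_inputs = height * width * channels   # number of input features

n_hidden = 2000                        # hidden units in the first hidden layer
n_weights = n_hidden * n_inputs        # entries in the weight matrix w^[1]

print(f"input features: {n_inputs:,}")
print(f"weights in the first layer: {n_weights:,}")
```

For comparison, a single 3 x 3 filter applied to the same 3-channel image has 3 x 3 x 3 = 27 weights (plus a bias), independent of the image size.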

#### 1.1.2 CNNs can identify patterns in images thanks to the "convolution operation"

- Dense layers learn global patterns in their input feature space

- Convolution layers learn local patterns, and this leads to the following interesting features:
    - Unlike a densely connected network, once a convolutional neural network has learned to recognize a pattern in, say, the upper-right corner of a picture, it can recognize that pattern anywhere else in the picture.
    - Deeper convolutional neural networks can learn spatial hierarchies. A first layer learns small local patterns, a second layer learns larger patterns built from the features of the first layer, and so on.
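The first property can be illustrated with a small NumPy sketch. It assumes a hand-written helper `correlate2d_valid` and a made-up 2 x 2 pattern (neither comes from these notes). Note that the "convolution" used in deep learning does not flip the filter, so strictly speaking it is a cross-correlation.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    n_rows = image.shape[0] - kernel.shape[0] + 1
    n_cols = image.shape[1] - kernel.shape[1] + 1
    out = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            window = image[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            out[i, j] = np.sum(window * kernel)
    return out

# A hypothetical 2 x 2 pattern, also used as the filter.
pattern = np.array([[1.0, 0.0],
                    [0.0, 1.0]])

# Same pattern, two different locations.
image_a = np.zeros((6, 6))
image_a[0:2, 4:6] = pattern            # upper-right corner
image_b = np.zeros((6, 6))
image_b[4:6, 0:2] = pattern            # lower-left corner

response_a = correlate2d_valid(image_a, pattern)
response_b = correlate2d_valid(image_b, pattern)

# The peak response is identical; only its position changes.
peak_a = np.unravel_index(response_a.argmax(), response_a.shape)
peak_b = np.unravel_index(response_b.argmax(), response_b.shape)
print(response_a.max(), peak_a)
print(response_b.max(), peak_b)
```

Because the same filter weights are reused at every position, nothing has to be relearned when the pattern moves.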
     

### 1.2 What are they used for?
- Image classification
- Object detection in images
- Neural style transfer for pictures

## 2. The convolution operation 

### 2.1 The basic convolution operation 

The idea: detect edges in your image. Typically, we'll detect vertical or horizontal edges. Let's look at what horizontal edge detection would look like!

![title](conv.png)

This is a simplified 5 x 5 pixel greyscale image. You use a so-called "filter" (shown on the right) to perform a convolution operation. This particular filter detects horizontal edges. The matrix on the left should contain numbers (pixel intensities from 0 to 255, or let's assume we rescaled them to the range 1-10). The output is a 3 x 3 matrix. (*This example is for computational clarity; it contains no clear edges.*)

What dimension would the output matrix have had if we had started from a 7 x 7 matrix? And a 64 x 64 matrix?
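The output size follows a simple rule: an n x n image convolved with an f x f filter (no padding, moving one pixel at a time) gives an (n - f + 1) x (n - f + 1) output. The sketch below assumes the same 3 x 3 filter as in the example; `conv_output_size` is an illustrative helper, not a library function.

```python
def conv_output_size(n, f):
    """Output side length for an n x n image and an f x f filter (no padding, stride 1)."""
    return n - f + 1

for n in (5, 7, 64):
    size = conv_output_size(n, f=3)
    print(f"{n} x {n} image, 3 x 3 filter -> {size} x {size} output")
```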

(*Then, create a new example with a clear edge and look at the output*)
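One possible version of such an example is sketched below in NumPy. The filter values are an assumption (a common horizontal edge filter), since the filter in the figure is not reproduced here, and `correlate2d_valid` is a hand-written helper.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    n_rows = image.shape[0] - kernel.shape[0] + 1
    n_cols = image.shape[1] - kernel.shape[1] + 1
    out = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            window = image[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            out[i, j] = np.sum(window * kernel)
    return out

# 6 x 6 image: bright top half (10), dark bottom half (0),
# so there is one clear horizontal edge in the middle.
image = np.zeros((6, 6))
image[:3, :] = 10.0

# Assumed horizontal edge filter: positive top row, negative bottom row.
horizontal_filter = np.array([[ 1.0,  1.0,  1.0],
                              [ 0.0,  0.0,  0.0],
                              [-1.0, -1.0, -1.0]])

edges = correlate2d_valid(image, horizontal_filter)
print(edges)
```

The output is zero wherever the window covers a uniform region and large (30 here) in the rows where the window straddles the edge.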

In Keras, the convolution step is implemented by the `Conv2D` layer.

### 2.2 Padding

Downsides of using filters on images:
- The image shrinks with each convolution layer, so you throw away information in each layer!
    - Starting from a 5 x 5 matrix and using a 3 x 3 filter, you end up with a 3 x 3 image.
    - Starting from a 10 x 10 matrix and using a 3 x 3 filter, you end up with an 8 x 8 image.
    - etc.
- Pixels around the edges are used much less in the outputs than pixels in the middle, because fewer filter positions cover them.

Solution for both of these problems: pad your image before applying the convolution! With a 3 x 3 filter, just one layer of pixels around the edges preserves the image size.
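The effect of padding on the output size can be checked with NumPy. This sketch assumes zero padding (via `np.pad`), a made-up averaging filter, and a hand-written helper `correlate2d_valid`.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    n_rows = image.shape[0] - kernel.shape[0] + 1
    n_cols = image.shape[1] - kernel.shape[1] + 1
    out = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            window = image[i:i + kernel.shape[0], j:j + kernel.shape[1]]
            out[i, j] = np.sum(window * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0          # hypothetical 3 x 3 averaging filter

# Without padding the 5 x 5 image shrinks to 3 x 3.
unpadded_output = correlate2d_valid(image, kernel)

# One layer of zeros around the border turns 5 x 5 into 7 x 7 ...
padded_image = np.pad(image, pad_width=1, mode="constant", constant_values=0)
# ... and the output is 5 x 5 again.
padded_output = correlate2d_valid(padded_image, kernel)

print(unpadded_output.shape, padded_image.shape, padded_output.shape)
```

In general, with padding p the output side length is n + 2p - f + 1, so an odd filter size f needs p = (f - 1) / 2 to keep the size unchanged.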



Typical layers in a CNN: convolution layers, pooling layers, and fully connected layers
- Convolution layer:
   - Performing a basic convolution: how filters work (to find "edges")
   - Introduce padding (adding a border of pixels to pictures to avoid shrinkage and loss of information)
   - Introduce striding (how far the filter is shifted along the image at each step)
   - Build your own convolution layer taking into account: padding, striding, filter size, number of filters
- Pooling layer: downsamples the feature maps, makes detected features more robust, and works well in practice
  - Preserves important features
  - Has no parameters!
  - Max pooling, also average pooling
- Add fully connected layers towards the end
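Striding and max pooling from the outline above can be sketched as follows. Both helpers (`conv_output_size`, `max_pool2d`) and the 4 x 4 feature map are illustrative assumptions, not part of any library.

```python
import numpy as np

def conv_output_size(n, f, p=0, s=1):
    """Output side length for an n x n input, f x f filter, padding p, stride s."""
    return (n + 2 * p - f) // s + 1

def max_pool2d(feature_map, size=2, stride=2):
    """Take the maximum of each size x size window, moving `stride` pixels at a time."""
    n_rows = (feature_map.shape[0] - size) // stride + 1
    n_cols = (feature_map.shape[1] - size) // stride + 1
    out = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            r, c = i * stride, j * stride
            out[i, j] = feature_map[r:r + size, c:c + size].max()
    return out

# A stride of 2 roughly halves the output size.
print(conv_output_size(7, 3, p=0, s=2))

# Hypothetical 4 x 4 feature map.
feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 0],
                        [7, 2, 9, 8],
                        [1, 0, 3, 4]], dtype=float)

pooled = max_pool2d(feature_map)
print(pooled)
```

Max pooling keeps only the strongest response in each window and involves no learned weights, which is why the pooling layer has no parameters.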


# Extra reading

https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html

https://datascience.stackexchange.com/questions/16463/what-is-are-the-default-filters-used-by-keras-convolution2d