## Convolution

At its core, convolution is a mathematical operation that combines two functions or signals to produce a third. The result shows how the shape of one is modified by the other: a flipped copy of one signal is slid across the other, and at each position the overlapping values are multiplied and summed.
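As a minimal sketch of the definition above, the discrete one-dimensional case can be written in a few lines of NumPy. The helper name `convolve_1d` and the example values are illustrative assumptions, not part of the original text; the result is compared against NumPy's built-in `np.convolve`.

```python
import numpy as np

def convolve_1d(signal, kernel):
    """Full discrete convolution: y[n] = sum_k signal[k] * kernel[n - k]."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    out = np.zeros(len(signal) + len(kernel) - 1)
    # Each pair (signal[i], kernel[j]) contributes to output position i + j.
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

signal = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, 0.5]  # illustrative two-point averaging kernel
result = convolve_1d(signal, kernel)
print(result)                                            # [0.5 1.5 2.5 3.5 2. ]
print(np.allclose(result, np.convolve(signal, kernel)))  # True
```

Here the averaging kernel smooths the signal: each output value is the mean of two neighbouring inputs, with the first and last values showing the partial overlap at the edges.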

How can we use convolution in neural networks?

The convolutional layer is the core building block of a CNN, and it is where the majority of the computation occurs. It involves three components: the input data, a filter, and a feature map. Let's assume that the input is a color image, stored as a 3D array of pixel values. The input therefore has three dimensions: height, width, and depth, where the depth corresponds to the three RGB channels. A feature detector, also known as a kernel or a filter, moves across the receptive fields of the image and checks whether the feature is present.

![convolution1.png](attachment:convolution1.png)

The feature detector is a two-dimensional (2-D) array of weights that represents a small pattern to look for in the image (for a multi-channel input such as RGB, the filter has one such 2-D slice per channel). Filters can vary in size, but a 3x3 matrix is typical; the filter size also determines the size of the receptive field. The filter is applied to an area of the image, and a dot product is calculated between the input pixels in that area and the filter weights. This dot product becomes one entry of an output array. The filter then shifts by a stride, and the process repeats until the kernel has swept across the entire image. The final output of this series of dot products is known as a feature map, activation map, or convolved feature.
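The sliding dot product described above can be sketched directly in NumPy. This is a minimal single-channel version with no padding; the function name `conv2d_single_channel`, the 5x5 input, and the vertical-edge kernel are assumptions chosen for illustration. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def conv2d_single_channel(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and collect the dot products."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]  # receptive field
            feature_map[i, j] = np.sum(patch * kernel)   # dot product
    return feature_map

# Illustrative input: pixel values increase by 1 from left to right.
image = np.arange(25, dtype=float).reshape(5, 5)
# Illustrative 3x3 kernel that responds to horizontal intensity changes.
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

feature_map = conv2d_single_channel(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # every entry is -6.0
```

Because the input has the same left-to-right gradient everywhere, the filter produces the same response at every position, which is what a uniform feature map should look like.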

Note that the weights in the feature detector remain fixed as it moves across the image; this is known as parameter sharing. The weight values themselves are learned during training through backpropagation and gradient descent. However, three hyperparameters that affect the volume size of the output must be set before training begins: the number of filters, the stride, and the padding. They are described at the end of this section.

### Why is convolution useful in neural network design?

__Local Feature Detection__: Convolution operations are excellent for detecting local features such as edges, textures, and shapes in images. Since these local features are often the building blocks of more complex patterns, convolution allows neural networks to build up an understanding of images in a hierarchical manner.

__Parameter Efficiency__: Traditional neural networks (fully connected networks) require a vast number of parameters, leading to increased computational cost and higher risk of overfitting. Convolution layers share parameters across the entire input, significantly reducing the number of parameters and making the network more efficient and scalable.
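To make the parameter saving concrete, here is a rough count for one layer of each kind. The layer sizes (a 64x64 RGB input, 100 dense units, 16 filters of size 3x3) and both helper functions are assumptions picked for illustration only.

```python
def dense_param_count(n_inputs, n_outputs):
    """Weights plus one bias per output unit in a fully connected layer."""
    return n_inputs * n_outputs + n_outputs

def conv_param_count(kernel_h, kernel_w, in_channels, n_filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters

n_inputs = 64 * 64 * 3                         # flattened 64x64 RGB image
dense_params = dense_param_count(n_inputs, 100)
conv_params = conv_param_count(3, 3, 3, 16)

print(dense_params)  # 1228900
print(conv_params)   # 448
```

The convolutional count does not depend on the image size at all, because the same filter weights are reused at every position.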

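__Translational Invariance__: Because the same filter is applied at every position, a convolutional layer responds to a pattern in the same way wherever it appears in the input. Once the network learns a feature (like an edge in a specific orientation), it can detect that feature anywhere in the input. Strictly speaking, the convolution itself is translation equivariant (shifting the input shifts the feature map by the same amount); combined with pooling, this gives the network approximate translational invariance.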
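The claim that a learned feature is detected wherever it appears can be checked on a toy input: shifting the input shifts the filter response by the same amount. The sketch below assumes an 8x8 image with a single bright pixel and an arbitrary 3x3 kernel; the helper `filter_response` is hypothetical and applies the filter without padding.

```python
import numpy as np

def filter_response(image, kernel):
    """Apply `kernel` at every valid position of `image` (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.arange(9, dtype=float).reshape(3, 3)  # arbitrary illustrative weights

image = np.zeros((8, 8))
image[3, 2] = 1.0                                 # a single "feature" pixel
shifted_image = np.roll(image, 2, axis=1)         # same feature, 2 columns right

response = filter_response(image, kernel)
shifted_response = filter_response(shifted_image, kernel)

# The response pattern is identical, just moved 2 columns to the right.
print(np.array_equal(shifted_response, np.roll(response, 2, axis=1)))  # True
```

The strongest response has the same value in both feature maps; only its location changes, which is why a later pooling step can make the detection largely independent of position.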

__Hierarchical Feature Learning__: In CNNs, layers are arranged hierarchically. Lower layers tend to learn simple features like edges or basic textures, while higher layers combine these simple features to detect more complex patterns. This hierarchical approach is effective for understanding complex structures in data.

__Versatility and Adaptability__: Convolutional layers can be adapted for various types of data and tasks. While they are predominantly used for image data, they can also be applied to other types of data like time series, audio, and even text, where there is some spatial or temporal structure.

__Improved Performance in Vision Tasks__: For tasks such as image classification, object detection, and segmentation, CNNs have been shown to significantly outperform traditional machine learning approaches due to their ability to directly learn from raw image data.

__End-to-End Learning__: Convolutional layers allow for end-to-end learning, where features are automatically learned from the data rather than relying on hand-crafted features, which might miss important characteristics or be biased towards the designer's assumptions.



The three hyperparameters that determine the volume size of the output are:

 - Number of filters (kernels). This sets the depth of the output, with one feature map per filter. For example, if the spatial size of a 64x64 input is preserved, 1 filter gives an output of shape (64,64,1) and 3 filters give (64,64,3).
 - Stride is the "jump" the kernel makes as it moves across the image. A stride > 1 "compresses" the feature response; for example, a stride of 2 will transform an input of (64,64,1) into (32,32,1).
 - Padding controls how the filter behaves in the border areas.
    - Valid padding: This is also known as no padding. No zeros are added, so positions where the filter would extend past the border are dropped, and the output is smaller than the input.
    - Same padding: Zeros are added around the border so that, with a stride of 1, the output layer has the same size as the input layer.
    - Full padding: Enough zeros are added to the border that the filter covers every partial overlap with the input, so the output is larger than the input.
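The effect of stride and padding on the output size can be sketched with the standard formula `floor((n + 2p - k) / s) + 1`, where `n` is the input size along one dimension, `k` the kernel size, `s` the stride, and `p` the number of zeros added on each side. The function below is a hypothetical helper, and it assumes symmetric zero padding with an odd kernel size: `p = 0` for valid, `p = (k - 1) // 2` for same, and `p = k - 1` for full. Individual libraries may pad slightly differently.

```python
def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Output length along one dimension for a given kernel, stride, and padding mode."""
    if padding == "valid":
        p = 0                    # no zeros added
    elif padding == "same":
        p = (kernel - 1) // 2    # assumes an odd kernel size
    elif padding == "full":
        p = kernel - 1           # every partial overlap is included
    else:
        raise ValueError(f"unknown padding mode: {padding!r}")
    return (n + 2 * p - kernel) // stride + 1

# 64-pixel input with a 3x3 kernel (sizes are illustrative).
print(conv_output_size(64, 3, padding="valid"))           # 62
print(conv_output_size(64, 3, padding="same"))            # 64
print(conv_output_size(64, 3, padding="full"))            # 66
print(conv_output_size(64, 3, stride=2, padding="same"))  # 32
```

The last line matches the stride example in the list: with a stride of 2, a 64-pixel dimension is reduced to 32.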



![convolution2.png](attachment:convolution2.png)