# CONVOLUTION

In PyTorch, convolutions are implemented using the __nn.Conv1d, nn.Conv2d, or nn.Conv3d__ modules, depending on the dimensionality of your data. Convolutional layers are the backbone of many deep learning models, particularly in computer vision and time-series applications.

A convolution operation applies a filter (kernel) to an input (e.g., an image or a sequence) to produce a feature map. The filter slides across the input and, at each position, performs an element-wise multiplication followed by a summation, which captures local patterns in the data.

![2D Convolution Animation](https://upload.wikimedia.org/wikipedia/commons/1/19/2D_Convolution_Animation.gif)
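The sliding multiply-and-sum described above can be sketched by hand and checked against PyTorch. This is a minimal illustration, assuming a single-channel input, stride 1, and no padding; `slide_filter` is a hypothetical helper name, not part of PyTorch. Note that PyTorch's convolution does not flip the kernel (it computes a cross-correlation), which is why the two results agree.

```python
import torch
import torch.nn.functional as F


def slide_filter(image: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Slide `kernel` over a 2D `image` (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = torch.zeros(out_h, out_w)
    for i in range(out_h):
        for j in range(out_w):
            # Take the window under the kernel, multiply element-wise, then sum.
            window = image[i:i + kh, j:j + kw]
            out[i, j] = (window * kernel).sum()
    return out


# A 4x4 "image" holding the values 0..15 and a 2x2 kernel.
image = torch.arange(16, dtype=torch.float32).reshape(4, 4)
kernel = torch.tensor([[1.0, 0.0], [0.0, -1.0]])

manual = slide_filter(image, kernel)

# F.conv2d expects input of shape (batch, channels, H, W)
# and weight of shape (out_channels, in_channels, kH, kW).
builtin = F.conv2d(image[None, None], kernel[None, None])[0, 0]

print(manual)
print(torch.allclose(manual, builtin))
```

Each output value here is the top-left element of a window minus its bottom-right element, so the feature map is constant for this particular input.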


## Parameters of a Convolutional Layer

1. in_channels
2. out_channels
3. kernel_size
4. stride
5. padding
6. dilation
7. groups
8. bias
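As a rough sketch, all eight parameters can be passed by keyword to `nn.Conv2d`. The specific values below (3 input channels, 16 output channels, a 32x32 input, and so on) are illustrative assumptions; the values given for `stride`, `dilation`, `groups`, and `bias` match PyTorch's defaults, while `padding=1` is chosen here so the spatial size is preserved.

```python
import torch
from torch import nn

conv = nn.Conv2d(
    in_channels=3,    # channels in the input (e.g. RGB)
    out_channels=16,  # number of feature maps produced
    kernel_size=3,    # 3x3 filters
    stride=1,         # move the filter one pixel at a time
    padding=1,        # one pixel of zero padding on each side
    dilation=1,       # no gaps between kernel elements
    groups=1,         # every filter sees every input channel
    bias=True,        # learn one bias term per output channel
)

x = torch.randn(8, 3, 32, 32)  # (batch, channels, height, width)
y = conv(x)
print(y.shape)  # torch.Size([8, 16, 32, 32])
```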

### In_channels

The __in_channels__ parameter specifies the number of input channels the layer expects. Each filter spans all input channels, so this value also sets the depth of every filter. Channels represent the feature dimensions of the input data. For an __RGB__ image, the channels are __Red, Green, Blue__, so __in_channels = 3__. For a __grayscale__ image, there is only one channel (intensity, i.e., how bright or dark each pixel is), so __in_channels = 1__. In audio or other sequential data, channels might represent features such as MFCCs, so __in_channels__ equals the number of extracted features.
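The following sketch shows how `in_channels` shapes the filters and what happens when the input does not match. The batch sizes, image sizes, and the choice of 13 MFCC features are assumptions made for illustration only.

```python
import torch
from torch import nn

rgb_conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
gray_conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3)

# Weight shape is (out_channels, in_channels, kernel_h, kernel_w):
# each filter is as deep as the number of input channels.
print(rgb_conv.weight.shape)   # torch.Size([8, 3, 3, 3])
print(gray_conv.weight.shape)  # torch.Size([8, 1, 3, 3])

rgb_batch = torch.randn(4, 3, 28, 28)   # four RGB images
gray_batch = torch.randn(4, 1, 28, 28)  # four grayscale images

rgb_out = rgb_conv(rgb_batch)
gray_out = gray_conv(gray_batch)

# Feeding a 1-channel input to a layer that expects 3 channels fails.
try:
    rgb_conv(gray_batch)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
print(mismatch_raised)

# Sequential data: assume 13 MFCC features per time step, 100 time steps.
audio_conv = nn.Conv1d(in_channels=13, out_channels=32, kernel_size=3)
audio_out = audio_conv(torch.randn(4, 13, 100))  # (batch, features, time)
print(audio_out.shape)
```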

### Out_channels

The __out_channels__ parameter in a convolutional layer determines the number of output channels (or feature maps) that the layer produces after applying the convolution operation. It equals the number of filters the layer learns, with each filter producing one feature map.

Each filter in the convolution layer learns a different set of features from the input, so increasing __out_channels__ increases the number of features the network can learn, at the cost of more parameters and computation.
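To make the effect of `out_channels` concrete, the sketch below builds three layers that differ only in that value and records the output shape and parameter count of each. The input size and the values 4, 16, and 64 are arbitrary choices for illustration.

```python
import torch
from torch import nn

x = torch.randn(1, 3, 32, 32)  # one RGB image, 32x32

results = []
for out_channels in (4, 16, 64):
    conv = nn.Conv2d(3, out_channels, kernel_size=3, padding=1)
    y = conv(x)
    # Weights: out_channels * 3 * 3 * 3, plus one bias per output channel.
    n_params = sum(p.numel() for p in conv.parameters())
    results.append((out_channels, tuple(y.shape), n_params))

for row in results:
    print(row)
```

The channel dimension of the output matches `out_channels`, and the parameter count grows linearly with it.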

### Kernel Size

The __kernel_size__ parameter determines the spatial dimensions of the filters (or kernels) applied to the input data. It defines the size of the window that slides over the input tensor to perform the convolution operation.

__2D Convolutions (Images):__

The kernel is a 2D matrix that slides over the 2D input image. The size of the kernel is defined by two dimensions: height and width.
Common choices for kernel sizes are __3x3, 5x5,__ and __7x7__, though other sizes can also be used depending on the problem.
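As a small sketch, `kernel_size` for `nn.Conv2d` can be given as a single integer (a square kernel) or as a `(height, width)` tuple. The 28x28 single-channel input and the particular sizes are assumptions for illustration.

```python
import torch
from torch import nn

x = torch.randn(1, 1, 28, 28)  # one grayscale 28x28 image

square = nn.Conv2d(1, 1, kernel_size=5)      # 5x5 kernel
rect = nn.Conv2d(1, 1, kernel_size=(3, 7))   # 3 rows high, 7 columns wide

print(square.kernel_size)  # an int is expanded to (5, 5)
print(rect.kernel_size)

# With stride 1 and no padding, each spatial dimension
# shrinks by (kernel size - 1).
square_out = square(x)
rect_out = rect(x)
print(square_out.shape)
print(rect_out.shape)
```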

__1D Convolutions (Sequences or Audio):__

The kernel is a 1D vector that slides over the 1D input sequence.
The kernel size is typically an odd number (e.g., 3 or 5) so that the kernel has a center element.
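One practical consequence of having a center element, sketched below under the assumption of stride 1 and symmetric zero padding: with an odd kernel size `k`, padding by `k // 2` keeps the output the same length as the input, whereas an even kernel size does not. The sequence length of 50 is an arbitrary choice.

```python
import torch
from torch import nn

signal = torch.randn(1, 1, 50)  # (batch, channels, length)

lengths = {}
for k in (3, 4, 5):
    conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2)
    # Output length = input length + 2 * padding - k + 1 (stride 1).
    lengths[k] = conv(signal).shape[-1]

print(lengths)  # odd kernels preserve the length; k=4 yields 51
```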

Smaller kernels (e.g., 3x3 or 5x5) capture local features in a fine-grained way, allowing the network to focus on small spatial details.

Larger kernels (e.g., 7x7 or more) cover larger portions of the input and may capture more global features, but they also require more parameters and computation.
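The parameter cost of larger kernels can be checked directly. This sketch assumes 16 input and 16 output channels with no bias, so the count is simply `16 * 16 * k * k`.

```python
from torch import nn

weight_counts = {}
for k in (3, 5, 7):
    conv = nn.Conv2d(16, 16, kernel_size=k, bias=False)
    weight_counts[k] = conv.weight.numel()

print(weight_counts)  # {3: 2304, 5: 6400, 7: 12544}
```

Going from a 3x3 to a 7x7 kernel multiplies the weight count by roughly 5.4 (49 / 9), since the count grows with the square of the kernel size.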