### 1. What exactly is a feature?


In computer vision and image processing, a feature is a piece of information about the content of an image; typically about whether a certain region of the image has certain properties. Features may be specific structures in the image such as points, edges or objects.

More practically, a feature is a measurable property of an image that helps identify a specific object. It may be a distinct color or a specific shape such as a line, an edge, or an image segment. A good feature distinguishes objects from one another.

### 2. For a top edge detector, write out the convolutional kernel matrix.


In image processing, a kernel, convolution matrix, or mask is a small matrix used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between the kernel and an image.
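
The question asks for a concrete matrix. A minimal sketch, assuming one common choice of top edge kernel (-1 across the top row, 0 in the middle, 1 across the bottom), which is not the only valid one. The helper `apply_kernel` and the toy image are hypothetical and exist only for illustration.

```python
import numpy as np

# One common top-edge kernel (an illustrative choice, not the only one).
top_edge = np.array([[-1, -1, -1],
                     [ 0,  0,  0],
                     [ 1,  1,  1]], dtype=float)

def apply_kernel(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: two dark rows above three bright rows.
image = np.array([[0, 0, 0, 0],
                  [0, 0, 0, 0],
                  [1, 1, 1, 1],
                  [1, 1, 1, 1],
                  [1, 1, 1, 1]], dtype=float)

response = apply_kernel(image, top_edge)
print(response)
```

Windows that straddle the dark-to-bright transition give a response of 3, while the window that covers only bright rows gives 0, so the kernel lights up where a bright region begins below a dark one.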

Given two arrays X[] and H[] of length N and M respectively, the task is to find their circular convolution using the matrix method. Multiplying the circularly shifted matrix by the column vector gives the circular convolution of the arrays.

Examples: 

Input: X[] = {1, 2, 4, 2}, H[] = {1, 1, 1} 

Output: 7 5 7 8

Input: X[] = {5, 7, 3, 2}, H[] = {1, 5} 

Output: 15 32 38 17 


Explanation: 
 
* Create a circularly shifted matrix circular_shift_mat of size K x K from the elements of the longer array (X in this case), where K = max(N, M).
![image.png](attachment:image.png)
* Create a column-vector col_vec of length K
* Insert the elements of the array H into col_vec at positions [0, M).
* Since K = max(N, M) = N here and M < K, fill the remaining positions [M, K) of col_vec with 0. For the first example this gives col_vec = { 1, 1, 1, 0 }.
* Multiply the circular_shift_mat and the col_vec
* Multiplication of the Circularly Shifted Matrix (circular_shift_mat) and the column-vector (col_vec) is the Circular-Convolution of the arrays.

Approach: 
 

* Create a circularly shifted matrix of size K x K, where K = max(N, M), using the elements of the longer array.
* Create a column vector of length K from the elements of the other array and fill the remaining positions with 0.
* Multiplication of Matrix and the column-vector is the Circular-Convolution of arrays.
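
The approach above can be sketched in NumPy as follows. The function name `circular_convolution` is hypothetical, and this version zero-pads both arrays to length K rather than picking the longer one explicitly, which gives the same result because circular convolution is commutative.

```python
import numpy as np

def circular_convolution(x, h):
    """Circular convolution of x and h via a circularly shifted matrix."""
    k = max(len(x), len(h))
    # Zero-pad both arrays to the common length K.
    x_pad = np.pad(np.asarray(x, dtype=float), (0, k - len(x)))
    h_pad = np.pad(np.asarray(h, dtype=float), (0, k - len(h)))
    # Column j of the matrix is x_pad circularly shifted down by j positions.
    circular_shift_mat = np.column_stack([np.roll(x_pad, j) for j in range(k)])
    # Matrix times column vector gives the circular convolution.
    return circular_shift_mat @ h_pad

first = circular_convolution([1, 2, 4, 2], [1, 1, 1])
second = circular_convolution([5, 7, 3, 2], [1, 5])
print(first, second)
```

These two calls reproduce the example outputs listed above: 7 5 7 8 and 15 32 38 17.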

### 3. Describe the mathematical operation that a 3x3 kernel performs on a single pixel in an image.



The use of Kernels - also known as convolution matrices or masks - is invaluable to image processing. Techniques such as blurring, edge detection, and sharpening all rely on kernels - small matrices of numbers - to be applied across an image in order to process the image as a whole.

So what is a kernel? In image processing, a kernel is simply a 2-dimensional matrix of numbers. While this matrix can vary in size, for simplicity this section sticks to 3x3 kernels. An example of a kernel is shown below:
![image.png](attachment:image.png)
How does this matrix relate to image processing? An image is just a 2-dimensional matrix of numbers, or pixels. Each pixel is represented by a number, and the range depends on the image format: in an 8-bit RGB image, each pixel has a red, green, and blue component with a value ranging from 0 to 255. A kernel works by operating on these pixel values using straightforward mathematics to construct a new image. Let's take the above kernel and do some math: for each pixel, center the kernel over the pixel, multiply each kernel value by the corresponding pixel value, and add the results. This final sum is the new value of the current pixel.
![image-2.png](attachment:image-2.png)
As each pixel is processed, a new image emerges based upon the calculated values. The new image is highly dependent upon the kernel used - each kernel has specific properties depending upon its values. Take the kernel demonstrated above: the mathematics of this matrix results in a value that is the average of all pixels in a 3x3 pixel grid. In short - each pixel is the average of its neighbors - this results in a blurred image.
![image-3.png](attachment:image-3.png)
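
As a minimal sketch of the per-pixel operation described above, assuming a hypothetical 5x5 grayscale patch and an averaging kernel whose weights are all 1/9 (the function name `new_pixel_value` is also hypothetical):

```python
import numpy as np

# Hypothetical 5x5 grayscale patch; the values are illustrative only.
image = np.array([[10, 20, 30, 40, 50],
                  [20, 30, 40, 50, 60],
                  [30, 40, 50, 60, 70],
                  [40, 50, 60, 70, 80],
                  [50, 60, 70, 80, 90]], dtype=float)

# Averaging (box blur) kernel: every weight is 1/9.
kernel = np.full((3, 3), 1.0 / 9.0)

def new_pixel_value(image, kernel, row, col):
    """Centre the 3x3 kernel on (row, col), multiply elementwise, and sum."""
    neighbourhood = image[row - 1:row + 2, col - 1:col + 2]
    return float(np.sum(neighbourhood * kernel))

value = new_pixel_value(image, kernel, 2, 2)
print(value)
```

For the centre pixel the result is, up to floating-point rounding, the mean of the nine values in its 3x3 neighbourhood (50 here), which is why this kernel blurs the image.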

### 4. What is the significance of a convolutional kernel added to a 3x3 matrix of zeroes?



If all the border values of a larger kernel (for example, 5x5) are set to zero, it behaves like a 3x3 kernel, because the zero weights contribute nothing to the sum. In the same way, applying a kernel to a 3x3 patch of zeros always gives zero.
The filter visits every pixel of the image in turn. For each one, which we will call the “initial pixel”, it multiplies the value of this pixel and the values of the 8 surrounding pixels by the corresponding kernel values. Then it adds the results, and the initial pixel is set to this sum.
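
One way to read the statement about zero border values is sketched below, under stated assumptions: a hypothetical 3x3 kernel is embedded in a 5x5 kernel with a zero border, and both are applied around the same centre pixel of a random patch. The helper `kernel_response` is hypothetical.

```python
import numpy as np

def kernel_response(patch, kernel):
    """Multiply a patch by a same-sized kernel elementwise and sum the products."""
    return float(np.sum(patch * kernel))

rng = np.random.default_rng(0)
patch_5x5 = rng.integers(0, 256, size=(5, 5)).astype(float)

# Hypothetical 3x3 kernel, then the same weights inside a 5x5 kernel with a zero border.
kernel_3x3 = np.array([[ 0, -1,  0],
                       [-1,  5, -1],
                       [ 0, -1,  0]], dtype=float)
kernel_5x5 = np.pad(kernel_3x3, 1)  # np.pad fills with zeros by default

# The zero border adds nothing, so both kernels give the same value at the centre.
same = np.isclose(kernel_response(patch_5x5, kernel_5x5),
                  kernel_response(patch_5x5[1:4, 1:4], kernel_3x3))

# Any kernel applied to an all-zero 3x3 patch gives zero.
zero_response = kernel_response(np.zeros((3, 3)), kernel_3x3)
print(same, zero_response)
```

Both checks follow from the same fact: a product with a zero factor contributes nothing to the sum.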

### 5. What exactly is padding?



Padding adds layers of pixels, usually zeros, around the border of an input image. It addresses two problems of unpadded convolution: the output shrinks with every convolution, and pixels at the border contribute to fewer outputs than pixels in the middle.

![image.png](attachment:image.png)
* Padding prevents shrinking: if p is the number of layers of zeros added to the border, an (n x n) image becomes an (n + 2p) x (n + 2p) image after padding. Convolving it with an (f x f) filter at stride 1 then outputs an (n + 2p - f + 1) x (n + 2p - f + 1) image. For example, adding one layer of padding to an (8 x 8) image and using a (3 x 3) filter gives an (8 x 8) output.

* This increases the contribution of the pixels at the border of the original image by bringing them into the middle of the padded image. Thus, information on the borders is preserved as well as the information in the middle of the image.
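
The size arithmetic above can be checked with a short sketch. The function name `padded_output_size` is hypothetical, and stride 1 is assumed throughout.

```python
import numpy as np

def padded_output_size(n, f, p):
    """Output side length for an n x n image, f x f filter, p layers of padding, stride 1."""
    return n + 2 * p - f + 1

image = np.ones((8, 8))
padded = np.pad(image, 1)  # one layer of zeros on every side

print(padded.shape)                 # (10, 10)
print(padded_output_size(8, 3, 0))  # 6: without padding the image shrinks
print(padded_output_size(8, 3, 1))  # 8: one layer of padding keeps the size
```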

### 6. What is the concept of stride?



Stride is the number of pixels the filter shifts over the input matrix at each step. With a stride of 1, the filter moves 1 pixel at a time. With a stride of 2, it moves 2 pixels at a time, and so on.

Stride is a parameter of convolutional neural networks (CNNs), a family of networks well suited to image and video data. It controls how far the network's filter moves over the image or video at each step. For example, if the stride is set to 1, the filter moves one pixel, or unit, at a time. Because the filter lands on discrete pixel positions, stride is set to a whole number rather than a fraction or decimal.

* How does Stride work?
![image.png](attachment:image.png)
Imagine a convolutional neural network taking an image and analyzing its content. If the filter size is 3x3 pixels, the nine pixels it covers are reduced to 1 pixel in the output layer. Naturally, as the stride, or movement, increases, the resulting output becomes smaller. Stride works in conjunction with padding, which adds blank (zero) pixels around the frame of the image to limit the reduction of size in the output layer. Roughly, padding increases the size of the input to counteract the shrinking caused by the filter and the stride. Padding and stride are foundational parameters of a convolutional layer.
![image-2.png](attachment:image-2.png)
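
A minimal sketch of how stride changes the output size, assuming the usual formula floor((n + 2p - f) / s) + 1 for an n x n input, f x f filter, padding p, and stride s. The function names are hypothetical.

```python
def output_size(n, f, p=0, s=1):
    """Output side length: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

def window_starts(n, f, s):
    """Positions along one axis where the filter's first pixel lands (no padding)."""
    return list(range(0, n - f + 1, s))

print(window_starts(7, 3, 1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, 2))  # [0, 2, 4]
print(output_size(7, 3, s=1))  # 5
print(output_size(7, 3, s=2))  # 3
```

The number of window positions along an axis equals the output size along that axis, which is why a larger stride produces a smaller output.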

### 7. What are the shapes of PyTorch's 2D convolution's input and weight parameters?



* Shape:

Input: $(N, C_{in}, H_{in}, W_{in})$

Weight: $(C_{out}, \frac{C_{in}}{\text{groups}}, \text{kernel\_size}[0], \text{kernel\_size}[1])$

Output: $(N, C_{out}, H_{out}, W_{out})$, where

$$H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor$$

$$W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor$$

 ![image.png](attachment:image.png)
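
The output-size formula can be mirrored in plain Python as a sanity check. This is a sketch, not PyTorch code: the function names are hypothetical, and the weight layout `(out_channels, in_channels / groups, kernel_size[0], kernel_size[1])` is the one documented for `torch.nn.Conv2d`.

```python
def conv2d_output_shape(n, c_out, h_in, w_in, kernel_size,
                        stride=(1, 1), padding=(0, 0), dilation=(1, 1)):
    """Apply the H_out / W_out formula to each spatial dimension."""
    def one_dim(size, k, s, p, d):
        # floor(x / s + 1) equals x // s + 1 for integer x.
        return (size + 2 * p - d * (k - 1) - 1) // s + 1

    h_out = one_dim(h_in, kernel_size[0], stride[0], padding[0], dilation[0])
    w_out = one_dim(w_in, kernel_size[1], stride[1], padding[1], dilation[1])
    return (n, c_out, h_out, w_out)

def conv2d_weight_shape(c_in, c_out, kernel_size, groups=1):
    """Weight layout: (out_channels, in_channels / groups, kH, kW)."""
    return (c_out, c_in // groups, kernel_size[0], kernel_size[1])

# Hypothetical layer: 3 input channels, 16 filters, 3x3 kernel, stride 2, padding 1.
out_shape = conv2d_output_shape(64, 16, 28, 28, (3, 3), stride=(2, 2), padding=(1, 1))
weight_shape = conv2d_weight_shape(3, 16, (3, 3))
print(out_shape, weight_shape)
```

For a batch of 64 images of size 28 x 28, this gives an output of shape (64, 16, 14, 14) and a weight of shape (16, 3, 3, 3).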

### 8. What exactly is a channel?



A channel is one component of the values stored at each pixel; for a color image, the channels correspond to colors. For example, an RGB image has three channels: the red channel, the green channel, and the blue channel. Each channel value of a pixel represents the intensity of one of the colors that make up that pixel.
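
As an illustration, a hypothetical 2x2 RGB image can be stored as a height x width x channels array and split into its three channels. The channels-first layout at the end matches the (C, H, W) ordering used in the PyTorch shapes in question 7.

```python
import numpy as np

# Hypothetical 2x2 RGB image: height x width x channels, 8-bit intensities.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

# Each channel is a 2x2 map of intensities for one color.
red, green, blue = rgb[..., 0], rgb[..., 1], rgb[..., 2]

# Move the channel axis to the front: (H, W, C) -> (C, H, W).
channels_first = rgb.transpose(2, 0, 1)

print(rgb.shape, channels_first.shape)
print(red)
```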



### 9. Explain the relationship between matrix multiplication and a convolution.


Convolution is the process of adding each element of the image to its local neighbors, weighted by the kernel. This is related to a form of mathematical convolution. The matrix operation being performed, convolution, is not traditional matrix multiplication, despite being similarly denoted by *.

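In the frequency domain, convolution becomes multiplication: the transform of the convolution of two signals X and Y is the product of their transforms. Note that this is an elementwise product, not a matrix product. In the spatial domain, a convolution can still be written as a matrix multiplication by arranging one operand into a suitable matrix, as with the circularly shifted matrix used for circular convolution in question 2.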
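
A minimal sketch of the spatial-domain relationship, assuming the sliding-window operation used throughout this notebook (no kernel flip, no padding, stride 1). Each 3x3 patch is flattened into a row, so one matrix-vector product computes every output pixel at once. The function names and the toy image are hypothetical.

```python
import numpy as np

def sliding_window(image, kernel):
    """Direct approach: centre-by-centre multiply and sum."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def as_matrix_multiply(image, kernel):
    """Same result as one matrix-vector product over flattened patches."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    # Each row of `patches` is one flattened kh x kw window of the image.
    patches = np.array([image[i:i + kh, j:j + kw].ravel()
                        for i in range(out_h) for j in range(out_w)])
    return (patches @ kernel.ravel()).reshape(out_h, out_w)

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[-1, -1, -1],
                   [ 0,  0,  0],
                   [ 1,  1,  1]], dtype=float)

direct = sliding_window(image, kernel)
via_matmul = as_matrix_multiply(image, kernel)
print(np.allclose(direct, via_matmul))
```

The two results agree, which shows that the convolution is a matrix multiplication in which the same kernel weights are reused for every patch.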