# Understanding Convolutions
## This notebook outlines the concepts of Convolutions

### Convolutions are one of the most critical, fundamental building blocks in computer vision and image processing

### Key Questions
- What are Image Convolutions?
- What do they do?
- Why do we use them?
- How do we apply them?

**Convolution** is simply an element-wise multiplication of two matrices followed by a sum
- Take two matrices
- Multiply them element-by-element (not a dot product)
- Sum the elements together
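
The three bullets above can be sketched in NumPy as follows. The patch and kernel values are made up purely for illustration; they are not taken from the notebook's figures.

```python
import numpy as np

# A 3 x 3 image patch and a 3 x 3 kernel (illustrative values only)
patch = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype="float32")
kernel = np.full((3, 3), 1.0 / 9.0, dtype="float32")  # average (box) kernel

# Element-wise product, NOT a matrix (dot) product
product = patch * kernel

# Summing the products gives the single output value
value = product.sum()
print(value)  # approximately 5.0, the mean of the patch
```

Note that `patch * kernel` multiplies matching entries, whereas `patch @ kernel` would compute a matrix product, which is not what a convolution step uses.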

**Uses**
- blurring (average smoothing, Gaussian smoothing, median smoothing, etc.)
- edge detection (Laplacian, Sobel, Scharr, Prewitt, etc.)


   
### **Convolution**

Image: multi-dimensional matrix
   - width (number of columns)
   - height (number of rows)
   - depth (number of channels, e.g., 3 for a color image and 1 for grayscale)
   
Kernel or Convolutional matrix
   - Tiny matrix
   - Usually a square matrix

A tiny kernel **sits** on top of the big image and **slides** from left-to-right and top-to-bottom, applying a mathematical operation (i.e., a convolution) at each (x, y)-coordinate of the original image.

It is common to **hand-define kernels** to obtain various image processing effects.
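
As a rough illustration of hand-defined kernels, here are two widely used 3 x 3 edge-detection kernels written as NumPy arrays. The variable names `sobel_x`, `laplacian` and `flat_patch` are chosen for this sketch and do not appear elsewhere in the notebook.

```python
import numpy as np

# Sobel kernel that responds to horizontal intensity changes (i.e., vertical edges)
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype="int32")

# Laplacian kernel that responds to intensity changes in all directions
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype="int32")

# Both kernels sum to zero, so a perfectly flat region produces no response
flat_patch = np.full((3, 3), 100, dtype="int32")
print((flat_patch * sobel_x).sum(), (flat_patch * laplacian).sum())
```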

![Convolution](https://raw.githubusercontent.com/subashgandyer/datasets/main/images/Conv-1.png)

We are sliding the kernel from left-to-right and top-to-bottom along the original image.

At each (x, y)-coordinate of the original image, we stop and examine the neighborhood of pixels that lies under the kernel, centered on that coordinate. We then convolve this neighborhood with the kernel to obtain a single output value, which is stored in the output image at the same (x, y)-coordinate as the center of the kernel.

#### Square 3 x 3 kernel
![Convolution](https://raw.githubusercontent.com/subashgandyer/datasets/main/images/Conv-2.png)

Use an **odd kernel** size to ensure there is a valid integer (x, y)-coordinate at the center of the kernel

![Convolution](https://raw.githubusercontent.com/subashgandyer/datasets/main/images/Conv-3.png)

With a 3 x 3 matrix, the center of the matrix is obviously located at **x=1, y=1** where the top-left corner of the matrix is used as the origin and our coordinates are zero-indexed.

With a 2 x 2 matrix, the center of this matrix would be located at **x=0.5, y=0.5** 

There is no such thing as pixel location (0.5, 0.5) — our pixel coordinates must be **integers**! 

This reasoning is exactly why we use **odd kernel sizes** — to always ensure there is a valid (x, y)-coordinate at the center of the kernel
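
The odd-versus-even argument can be checked with a small sketch. The helper name `kernel_center` is hypothetical and is introduced only for this example.

```python
def kernel_center(size):
    """Return the zero-indexed center coordinate of a size x size kernel."""
    return (size - 1) / 2


for size in (2, 3, 5, 7):
    center = kernel_center(size)
    # Only odd sizes give a whole-number center that maps to a real pixel
    print(size, center, float(center).is_integer())
```

Odd sizes give whole-number centers (1 for 3 x 3, 2 for 5 x 5, 3 for 7 x 7), while the 2 x 2 case gives 0.5.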

A convolution requires three components:
- An input image
- A kernel matrix that we are going to apply and slide on the input image
- An output image to store the output of the input image convolved with the kernel

**Convolution Steps**

- Select an (x, y)-coordinate from the original image
- Place the **center** of the kernel at this (x, y)-coordinate
- Take the element-wise multiplication of the input image region and the kernel, then sum up the values of these multiplication operations into a single value. The sum of these multiplications is called the **kernel output**
- Store the kernel output in the output image at the same (x, y)-coordinate that was selected in the first step
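
The four steps can be sketched for a single coordinate as follows. The 5 x 5 image and the averaging kernel are illustrative values, and the coordinate is deliberately kept away from the border so that no padding is needed.

```python
import numpy as np

# Small grayscale image and a 3 x 3 averaging kernel (illustrative values)
image = np.arange(25, dtype="float32").reshape(5, 5)
kernel = np.ones((3, 3), dtype="float32") / 9.0
output = np.zeros_like(image)

# Step 1: select an (x, y)-coordinate from the original image
x, y = 2, 2

# Step 2: place the kernel center at (x, y) and take the neighborhood under it
pad = (kernel.shape[0] - 1) // 2
roi = image[y - pad:y + pad + 1, x - pad:x + pad + 1]

# Step 3: multiply element-wise and sum to get the kernel output
kernel_output = (roi * kernel).sum()

# Step 4: store the kernel output at the same (x, y)-location of the output image
output[y, x] = kernel_output
print(kernel_output)  # approximately 12.0, the mean of the 3 x 3 neighborhood
```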

Convolution Output = sum of (Kernel Matrix (3 x 3) * Image Region (3 x 3)), where * is element-wise multiplication
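
One way to write this out in symbols is given below. The symbols $I$ (input image), $K$ (kernel), $O$ (output image) and $p$ (number of pixels from the kernel center to its edge) are notation assumed for this sketch, and the kernel is taken to be square, of size $(2p + 1) \times (2p + 1)$, with zero-indexed row $i$ and column $j$:

$$
O(x, y) = \sum_{i=0}^{2p} \sum_{j=0}^{2p} K(i, j) \, I(x - p + j,\; y - p + i)
$$

For a 3 x 3 kernel, $p = 1$ and the sum has nine terms, one per kernel entry.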

![Convolution](https://raw.githubusercontent.com/subashgandyer/datasets/main/images/Conv-4.png)

![Convolution](https://raw.githubusercontent.com/subashgandyer/datasets/main/images/Conv-5.png)

In [2]:
import numpy as np
import cv2

#### Install scikit-image

In [8]:
! pip install scikit-image



In [10]:
from skimage.exposure import rescale_intensity

#### Convolve function (Custom helper function)

In [11]:
def convolve(image, kernel):
    # grab the spatial dimensions of the image and of the kernel
    # (this helper assumes a single-channel, i.e. grayscale, image)
    (iH, iW) = image.shape[:2]
    (kH, kW) = kernel.shape[:2]

    # "pad" the borders of the input image by replicating its edge pixels so the spatial
    # size (i.e., width and height) is not reduced; assumes a square kernel with an odd size
    pad = (kW - 1) // 2
    image = cv2.copyMakeBorder(image, pad, pad, pad, pad, cv2.BORDER_REPLICATE)
    # allocate memory for the output image
    output = np.zeros((iH, iW), dtype="float32")

    # loop over the padded image, "sliding" the kernel across each (x, y)-coordinate
    # from left-to-right and top-to-bottom
    for y in range(pad, iH + pad):
        for x in range(pad, iW + pad):
            # extract the ROI: the kernel-sized neighborhood centered on the current (x, y)-coordinate
            roi = image[y - pad:y + pad + 1, x - pad:x + pad + 1]

            # perform the actual convolution: multiply the ROI and the kernel element-wise, then sum the result
            k = (roi * kernel).sum()

            # store the convolved value at the matching (x, y)-coordinate of the (unpadded) output image
            output[y - pad, x - pad] = k

    # clip the output to the range [0, 255] and rescale it to [0, 1]
    output = rescale_intensity(output, in_range=(0, 255))
    # scale back up to [0, 255] and convert to 8-bit unsigned integers
    output = (output * 255).astype("uint8")

    return output
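
To see what the padding step does, here is a small NumPy-only sketch. It uses `np.pad` with `mode="edge"` as a stand-in for `cv2.copyMakeBorder` with `cv2.BORDER_REPLICATE`. That substitution is an assumption made so the example needs only NumPy; both repeat the outermost pixels.

```python
import numpy as np

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype="uint8")

kernel_width = 3
pad = (kernel_width - 1) // 2  # 1 pixel of padding for a 3 x 3 kernel

# mode="edge" repeats the outermost pixels on every side
padded = np.pad(image, pad, mode="edge")
print(padded.shape)  # (5, 5)
print(padded)
```

With the padding in place, the kernel can be centered on every pixel of the original 3 x 3 image, including the corners, so the output keeps the input's width and height.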

#### Construct average blurring kernels used to smooth an image
- Create smallBlur as 7 x 7 filter
- Create largeBlur as 21 x 21 filter
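
A minimal sketch of the two averaging kernels, assuming the usual definition of a box blur in which every entry equals 1 divided by the number of entries:

```python
import numpy as np

# Every entry is 1 / (number of entries), so each output pixel is the mean of its neighborhood
smallBlur = np.ones((7, 7), dtype="float") * (1.0 / (7 * 7))
largeBlur = np.ones((21, 21), dtype="float") * (1.0 / (21 * 21))

print(smallBlur.shape, largeBlur.shape)
```

Because each kernel sums to 1, the overall brightness of the image is preserved. The larger kernel averages over a wider neighborhood and therefore blurs more strongly.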

#### Construct a Sharpening filter
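
One commonly used 3 x 3 sharpening kernel is sketched below. The notebook does not specify which kernel it intends, so treat this particular choice as an assumption.

```python
import numpy as np

# The center pixel is boosted while its four direct neighbors are subtracted
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype="int")

print(sharpen.sum())  # 1
```

The entries sum to 1, so flat regions are left unchanged while differences between a pixel and its neighbors are amplified.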

#### Load the image

### filter2D( )

### Other Uses of Convolution in Image Processing
- Blurring
- Edge Detection
- Classification of Images using Convolutional Neural Network

So far, we have designed our own hand-engineered filters in the form of smallBlur, largeBlur and so on

### **What if there was a way to learn these filters instead?**
Is it possible to define a machine learning algorithm that can look at images and eventually learn these types of operators?

In fact, there is — these types of algorithms are a sub-type of Neural Networks called **Convolutional Neural Networks (CNNs)** 

CNNs are able to **learn filters** that detect edges and blob-like structures in the lower-level layers of the network, and then use those edges and structures as building blocks to detect higher-level objects (e.g., faces, cats, dogs, cups) in the deeper layers of the network. They do so by applying the following:
- convolutional filters
- nonlinear activation functions
- pooling 
- backpropagation
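
To give a feel for what "learning a filter" means, here is a deliberately simplified NumPy sketch, not a CNN. It recovers a hidden 3 x 3 kernel from input/output examples using plain gradient descent on a mean squared error. The random image, the choice of a Sobel-style target kernel, the learning rate and the iteration count are all assumptions made for illustration. A real CNN stacks many such filters with nonlinear activations and pooling, and trains them by backpropagation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

# Random zero-mean "image" and a hidden target kernel (Sobel-style, chosen for illustration)
image = rng.standard_normal((32, 32))
target_kernel = np.array([[-1, 0, 1],
                          [-2, 0, 2],
                          [-1, 0, 1]], dtype="float64")

# Every 3 x 3 neighborhood flattened into one row: shape (900, 9)
patches = sliding_window_view(image, (3, 3)).reshape(-1, 9)

# The "ground truth" outputs produced by the hidden kernel (multiply and sum per patch)
target_output = patches @ target_kernel.ravel()

# Start from an all-zero kernel and minimize the mean squared error by gradient descent
learned = np.zeros(9)
learning_rate = 0.1
for _ in range(500):
    error = patches @ learned - target_output
    gradient = 2.0 * patches.T @ error / len(patches)
    learned -= learning_rate * gradient

print(np.round(learned.reshape(3, 3), 3))
```

After training, the learned kernel should closely match the hidden one, even though it was never given the kernel values directly, only examples of inputs and outputs.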