# Image Processing



In [0]:
# All tasks and exercises are done in Tools.ipynb

There are four aspects of image processing in any computer vision pipeline:
- **Acquisition**

  Digitally encoded representation of the visual characteristics of a real world object. 
  
  A low-level digital representation of the world scene.
   
- **Processing**
  - Noise removal
  - Smoothing
  - Sharpening 
  - Contrast enhancement
  
  Altering the appearance of the image or enhancing it.

- **Compression**

  Efficient storage of the image to save space, and efficient communication by saving network bandwidth.

- **Display**

  Rendering the image on reproduction media.

### What is an Image?

A digital image is composed of picture elements (pixels) arranged in a grid of `M x N pixels`.

The number of pixels, which gives the image its rectangular shape, together with the intensity values of the pixels, defines an image.

### **Greyscale images**
In an 8-bit image the pixels take values ranging from $0$ to $(2^8 - 1)$. In general the range depends on the bit depth of the image: a $k$-bit image has values from $0$ to $(2^k - 1)$.

The value represents the grey level of the pixel.

### **RGB Images**
Has three colour channels: Red, Green and Blue.

Each pixel is represented by $8 * 3 = 24$ bits, that is, 8 bits per channel.
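The 24-bit figure can be checked with a small sketch that packs three 8-bit channel values into a single integer. The function names `pack_rgb` and `unpack_rgb` are hypothetical, and the channel order (red in the highest byte) is an assumption made only for illustration.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit channel values into one 24-bit integer."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("each channel must be in [0, 255]")
    # Assumed layout: red in bits 16-23, green in bits 8-15, blue in bits 0-7.
    return (r << 16) | (g << 8) | b


def unpack_rgb(pixel):
    """Split a 24-bit integer back into its (r, g, b) channels."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


white = pack_rgb(255, 255, 255)
print(white == 2**24 - 1)                # True
print(unpack_rgb(pack_rgb(10, 20, 30)))  # (10, 20, 30)
```

The largest packed value is $2^{24} - 1$, which is why an RGB image with 8 bits per channel can represent about 16.7 million colours.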

### **Binary Images**
Each pixel is either 0 or 1, where 1 represents white and 0 represents black.

### **Multispectral Images**

Capture image data within specific wavelength ranges across the EM spectrum.

Not just visible light, but many bands such as IR and UV.

### **Stereo Images**

Carry more information than a single image, because depth can be recovered from the two views. This is the same idea as humans having two eyes for depth perception.

### **Multiview Images**

Side, top, and angled views of an object for 3D reconstruction or any other application.


In [0]:
# Task: To read and display images using OpenCV or any other Python libraries and view dimensions

### Spatial Domain Processing

The spatial domain is the image plane itself. Processing is done directly on the image plane, where intensity values are manipulated as needed.

- **Directly on the plane**

  Go to (x, y) and change the value.

- **Point To Point**

  Map the value at (x, y) to a new value at (x, y) in another image or matrix.

- **Neighbourhood To Point (Local)**

  The neighbourhood of the point (a window around the point), along with the point itself, is used to generate a value at (x, y).

- **Global To Point**

  The whole image is used to generate a value at (x, y).
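The three kinds of mapping can be contrasted with a minimal sketch. Plain nested lists stand in for image arrays so that the example has no dependencies, and the function names and the specific operations chosen (negative, window mean, comparison with the global mean) are assumptions picked only to illustrate each category.

```python
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]


def point_to_point(img, x, y):
    """Output depends only on the input value at (x, y): here, the negative."""
    return 255 - img[y][x]


def neighbourhood_to_point(img, x, y):
    """Output depends on a 3x3 window around an interior (x, y): the window mean."""
    window = [img[j][i]
              for j in range(y - 1, y + 2)
              for i in range(x - 1, x + 2)]
    return sum(window) / len(window)


def global_to_point(img, x, y):
    """Output depends on the whole image: compare (x, y) with the global mean."""
    values = [v for row in img for v in row]
    mean = sum(values) / len(values)
    return 1 if img[y][x] >= mean else 0


print(point_to_point(image, 1, 1))          # 205
print(neighbourhood_to_point(image, 1, 1))  # 50.0
print(global_to_point(image, 2, 2))         # 1
```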

### Intensity Transformations

Mapping the intensity value $z$ at pixel $(x, y)$ to a new value $z'$ using function $g$:

\begin{equation}
z' = g(z)
\end{equation}

The function $g$ at the most basic level can be of three types:

- **Linear**
- **Logarithmic**
- **Power Law / Gamma Transform**

Plotting input grey level $r$ against output grey level $s$ shows the following trends:
- **Identity function**: No change
- **Negation**: Inverts the intensities, so dark becomes bright and vice versa
- **Log**: Brightens the image; the curve lies above the $n^{th}$ root curve
- **Inverse Log**: Darkens the image; the curve lies below the $n^{th}$ power curve


### Linear Intensity Transformation

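The negative transformation, $s = (L - 1) - r$ for an image with $L$ grey levels, is a linear transformation that flips the intensities: all blacks go to white and vice versa. It is used to bring out white or grey detail that is embedded in dark regions of an image.

*Example:* A mammogram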
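A minimal sketch of the negative transformation on a tiny greyscale image, again using nested lists in place of an image array. The function name `negative` and the default of 256 grey levels are assumptions.

```python
def negative(img, levels=256):
    """Apply s = (L - 1) - r to every pixel of a greyscale image."""
    return [[(levels - 1) - r for r in row] for row in img]


image = [[0, 64], [128, 255]]
print(negative(image))  # [[255, 191], [127, 0]]
```

Applying the function twice returns the original image, which is a quick sanity check for any implementation.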

### Log Transform

\begin{equation}
s = c \cdot \log(1 + r)
\end{equation}

*Example:* Fourier Spectrum
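The log transform can be sketched as follows. The choice $c = (L - 1) / \log(L)$ is an assumption, made so that the largest input value maps to the largest output value; the notes themselves do not fix $c$. The function name is hypothetical.

```python
import math


def log_transform(img, levels=256):
    """Apply s = c * log(1 + r), with c chosen so that L - 1 maps to L - 1."""
    c = (levels - 1) / math.log(levels)  # log(1 + (L - 1)) = log(L)
    return [[round(c * math.log(1 + r)) for r in row] for row in img]


image = [[0, 3], [63, 255]]
print(log_transform(image))  # [[0, 64], [191, 255]]
```

The dark input value 3 is lifted to 64 while 255 stays at 255, which is why this transform is useful for data with a very large dynamic range, such as a Fourier spectrum.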




In [0]:
# Task: Perform intensity transformations

### Power Law Transform / Gamma

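\begin{equation}
s = c \cdot r^{\gamma}
\end{equation}

\begin{equation}
\gamma \in (0, \infty)
\end{equation}

$\gamma$ is a parameter that controls how strongly the intensities are transformed. With $r$ normalised to $[0, 1]$ and $c = 1$, the output also stays in $[0, 1]$.

When $\gamma$ is greater than 1, the dark intensities are compressed and the bright intensities are spread over a larger output range. This gives lower output intensities, that is, a darker image, and helps distinguish objects in bright regions.

When $\gamma$ is less than 1, the dark intensities are spread over a larger output range and the bright intensities are compressed. This gives higher output intensities, that is, a brighter, more washed-out image.

$\gamma = 1$ with $c = 1$ is the identity function.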
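A small sketch of the power law transform on 8-bit values. Intensities are first normalised to $[0, 1]$ so that $r^{\gamma}$ cannot leave the valid range; that normalisation, the function name and the default $c = 1$ are assumptions of this sketch.

```python
def gamma_transform(img, gamma, c=1.0, levels=256):
    """Apply s = c * r**gamma on intensities normalised to [0, 1]."""
    top = levels - 1
    out = []
    for row in img:
        out_row = []
        for value in row:
            r = value / top                            # normalise to [0, 1]
            s = c * r ** gamma
            out_row.append(min(top, round(s * top)))   # back to [0, L - 1]
        out.append(out_row)
    return out


image = [[0, 64, 128, 255]]
print(gamma_transform(image, 2.0))  # [[0, 16, 64, 255]]   darker mid-tones
print(gamma_transform(image, 0.5))  # [[0, 128, 181, 255]] brighter mid-tones
```

The end points 0 and 255 are unchanged in both cases; only the values in between move.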

In [0]:
# Exercise: Perform Power Law transformation for different values of gamma

### Histogram Grayscale

Frequencies of intensity values are shown on histograms. Remember the bins?

In [0]:
# Task: Represent an image using its intensity histogram 

The histogram doesn't show the spatial distribution of intensities, that is, how the intensities are spread across the image. This is because we do not record the pixel locations, we're throwing away that information :(

Hence two different images can have the same statistics and the same histogram! We therefore cannot reconstruct an image from its histogram.
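A minimal sketch of an intensity histogram, with one bin per grey level, together with two different images that share the same histogram. The function name and the two tiny test images are made up for illustration.

```python
def histogram(img, levels=256):
    """Count how many pixels take each intensity value."""
    counts = [0] * levels
    for row in img:
        for value in row:
            counts[value] += 1
    return counts


a = [[0, 0], [255, 255]]  # dark top row, bright bottom row
b = [[0, 255], [255, 0]]  # checkerboard

print(histogram(a) == histogram(b))  # True
print(a == b)                        # False
```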


### Histogram Of Colour Intensities

Plot three histograms, one each for R, G and B, and combine them to obtain more comprehensive statistics.
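The per-channel counts behind such a plot could be computed as in the sketch below. It assumes that each pixel is stored as an `(r, g, b)` tuple inside nested lists; the function name is hypothetical and plotting is left out.

```python
def colour_histograms(img, levels=256):
    """Return one histogram per channel for an image of (r, g, b) pixels."""
    hists = {"r": [0] * levels, "g": [0] * levels, "b": [0] * levels}
    for row in img:
        for r, g, b in row:
            hists["r"][r] += 1
            hists["g"][g] += 1
            hists["b"][b] += 1
    return hists


image = [
    [(255, 0, 0), (255, 128, 0)],
    [(0, 128, 0), (0, 0, 255)],
]
h = colour_histograms(image)
print(h["r"][255], h["g"][128], h["b"][0])  # 2 2 3
```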

In [0]:
# Exercise: Show colour histograms

### Contrast

Contrast, in a naive sense, is the difference between the minimum and maximum intensity values.

When the spread of intensities is large the image has high contrast, and when the spread is small the image has low contrast.

### Contrast Stretching

Low contrast images are in general of little use. Features are seen better at high contrast, and hence the image is put through contrast stretching, where the range of intensities is increased.

For every pixel $a$,
\begin{equation}
f(a) = a_{min} + (a - a_{low}) \frac{a_{max} - a_{min}}{a_{high} - a_{low}}
\end{equation}

The range is expanded from $[a_{low}, a_{high}]$ to $[a_{min}, a_{max}]$
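The stretching formula can be sketched as follows. Taking $a_{low}$ and $a_{high}$ as the minimum and maximum values present in the image is an assumption of this sketch (other choices, such as percentiles, are possible), and the function name is hypothetical.

```python
def stretch_contrast(img, a_min=0, a_max=255):
    """Linearly map [a_low, a_high] of the image onto [a_min, a_max]."""
    values = [v for row in img for v in row]
    a_low, a_high = min(values), max(values)
    if a_high == a_low:  # flat image: nothing to stretch
        return [row[:] for row in img]
    scale = (a_max - a_min) / (a_high - a_low)
    return [[round(a_min + (a - a_low) * scale) for a in row] for row in img]


image = [[100, 110], [120, 150]]
print(stretch_contrast(image))  # [[0, 51], [102, 255]]
```

The input range $[100, 150]$ is expanded to the full range $[0, 255]$, while the ordering of the pixel values is preserved.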

I am not sure why the division by 255 is done everywhere, why can't we do the math with [0, 255]? 


In [0]:
# Exercise: Do contrast stretching

### Thresholding

- Used to generate a mask: *Object Segmentation*.
- Used to perform morphological operations as well.

Global to point transformation, as we look at the entire image and decide the threshold!

Binarization is done most of the time, where for every pixel $a$,

$$
f_{threshold}(a) = \left\{
        \begin{array}{ll}
            a_0 & \quad a < a_{threshold} \\
            a_1 & \quad a \geq a_{threshold}
        \end{array}
    \right.
$$

In binarization $a_0 = 0$, and $a_1 = 1$.

Histograms are helpful in choosing the threshold value, for example at a valley between two peaks. If the histogram has no significant drop, a different form of thresholding is preferable.
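A minimal sketch of the thresholding function defined above, with the same symbols used as parameter names. The function name and the example threshold of 128 are assumptions.

```python
def threshold(img, a_threshold, a_0=0, a_1=1):
    """Map pixels below a_threshold to a_0 and all others to a_1."""
    return [[a_0 if a < a_threshold else a_1 for a in row] for row in img]


image = [[12, 200], [127, 128]]
print(threshold(image, 128))  # [[0, 1], [0, 1]]
```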

### Adaptive Thresholding

Neighbourhood to point transformation, as we look at a window in the image and decide the threshold for a point in the final image.

It does not always work, and is an area of research.
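One common variant, sketched here under stated assumptions, compares each pixel with the mean of its surrounding window. The notes do not say which local statistic is used, so the mean, the `offset` parameter, the strict comparison and the function name are all assumptions of this sketch.

```python
def adaptive_threshold(img, radius=1, offset=0):
    """Binarise each pixel against the mean of its (2 * radius + 1)^2 window.

    Windows are clipped at the image border, so edge pixels use fewer
    neighbours. A strict comparison keeps perfectly flat regions at 0.
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            local_mean = sum(window) / len(window)
            out[y][x] = 1 if img[y][x] > local_mean - offset else 0
    return out


# Dim left half and bright right half, each with one brighter spot.
image = [
    [10, 10, 10, 200, 200, 200],
    [10, 60, 10, 200, 250, 200],
    [10, 10, 10, 200, 200, 200],
]
local_mask = adaptive_threshold(image)
global_mask = [[1 if a >= 128 else 0 for a in row] for row in image]

print(local_mask[1][1], global_mask[1][1])  # 1 0
```

The spot in the dim half is found by the local threshold but lost by a single global threshold of 128. The local result also marks some pixels along the boundary between the two halves, which is one reason the method does not always work.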

In [0]:
# Exercise: Simple thresholding
# Read: Adaptive thresholding and the maths behind it

### Filtering

Neighbourhood to point transformation.

Modifying or enhancing an image, to remove features or highlight them.
$$
  g(x, y) = T[f(x, y)]
$$

$T$ is an operator that acts on a window of pixels around $(x, y)$.

### Convolution

Neighbourhood to point operation where each output pixel is the weighted sum of the neighbourhood pixels.

The values of the filter mask $H$ are multiplied with the image pixel values that the mask covers, and the products are summed.

$$
I'(u, v) = \sum_{(i,j) \in R_\mu} I(u + i, v + j) \cdot H(i, j)
$$

$R_\mu$ is the set of all offsets $(i, j)$ covered by the filter.

To apply convolution at the edge pixels, we must pad the image with zero-filled rows and columns of appropriate thickness around the image (for a $3 \times 3$ filter, one pixel on each side).
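A generic filter function following the sum above can be sketched like this. Treating pixels outside the image as 0 has the same effect as zero padding. The sketch applies the mask without mirroring it; strictly, convolution mirrors the mask, and the two agree for symmetric masks. The function name and the assumption of odd mask sizes are choices of this sketch.

```python
def apply_filter(img, kernel):
    """Weighted sum of each pixel's neighbourhood, with zero padding.

    Computes I'(u, v) = sum over (i, j) of I(u + i, v + j) * H(i, j),
    where (i, j) runs over the mask offsets around its centre.
    """
    height, width = len(img), len(img[0])
    r_v, r_u = len(kernel) // 2, len(kernel[0]) // 2  # radii, odd sizes assumed
    out = [[0.0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            total = 0.0
            for j in range(-r_v, r_v + 1):
                for i in range(-r_u, r_u + 1):
                    y, x = v + j, u + i
                    if 0 <= y < height and 0 <= x < width:  # outside counts as 0
                        total += img[y][x] * kernel[j + r_v][i + r_u]
            out[v][u] = total
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
ones = [[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]]

print(apply_filter(image, identity) == image)  # True
print(apply_filter(image, ones)[1][1])         # 45.0
print(apply_filter(image, ones)[0][0])         # 12.0
```

The corner value 12.0 is the sum 1 + 2 + 4 + 5; the five mask positions that fall outside the image contribute nothing.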
 

In [0]:
# Exercise: Write a generic convolution function

### Smoothing / Low Pass Values

Reducing noise and eliminating small details. 

Example,

$$
H = \frac{1}{9}
\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}
$$
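The effect of this mask can be seen on a single noisy pixel. The sketch hard-codes the $3 \times 3$ mean (summing the window and dividing by 9 is the same as using weights of $\frac{1}{9}$) and treats pixels outside the image as 0; the function name and test image are made up.

```python
def box_smooth(img):
    """3x3 mean filter: every weight is 1/9, pixels outside the image count as 0."""
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            total = 0
            for j in (-1, 0, 1):
                for i in (-1, 0, 1):
                    y, x = v + j, u + i
                    if 0 <= y < height and 0 <= x < width:
                        total += img[y][x]
            out[v][u] = total / 9
    return out


# A single bright noise pixel on a dark background.
noisy = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 90, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
smooth = box_smooth(noisy)
print(smooth[2][2])  # 10.0
print(smooth[1][1])  # 10.0
print(smooth[0][0])  # 0.0
```

The spike of 90 is spread out into a $3 \times 3$ patch of value 10, which shows both the noise reduction and the loss of small details.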

### Sharpening / High Pass Values

Highlight fine details and enhance blurry images.

Generally $3 \times 3$, as a larger filter means more influence from neighbouring pixels.

The large positive centre weight emphasises the centre pixel relative to its neighbours.

Example,

$$
H = 
\begin{bmatrix}
0 & -1 & 0  \\
-1 & 5 & -1 \\
0 & -1 & 0
\end{bmatrix}
$$

The corners are 0, which means only the directly adjacent pixels (up, down, left, right) are considered.

Example,

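$$
H = 
\begin{bmatrix}
-1 & -1 & -1  \\
-1 & 9 & -1 \\
-1 & -1 & -1
\end{bmatrix}
$$

Here the diagonal neighbours are also considered. The centre weight is 9 so that the weights still sum to 1, as in the previous example, which keeps flat regions at their original brightness.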

The desired image after filtering is the original plus an appropriately scaled high pass image.

$$
f_s = f + \lambda f_h
$$

A larger $\lambda$ means a stronger sharpening effect.
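The sum $f_s = f + \lambda f_h$ can be sketched with a 4-neighbour high pass mask whose weights sum to 0. With $\lambda = 1$ this is equivalent to the first sharpening mask above (centre weight 5). The mask choice, the function names, clamping to 8 bits and copying the border pixels unchanged are assumptions of this sketch.

```python
HIGH_PASS = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def high_pass(img, u, v):
    """Response of HIGH_PASS centred on interior pixel (u, v)."""
    return sum(img[v + j][u + i] * HIGH_PASS[j + 1][i + 1]
               for j in (-1, 0, 1) for i in (-1, 0, 1))


def sharpen(img, lam=1.0):
    """f_s = f + lam * f_h on interior pixels; border pixels are copied."""
    height, width = len(img), len(img[0])
    out = [list(row) for row in img]
    for v in range(1, height - 1):
        for u in range(1, width - 1):
            value = img[v][u] + lam * high_pass(img, u, v)
            out[v][u] = max(0, min(255, round(value)))  # clamp to 8 bits
    return out


# A soft vertical edge between 50 and 100.
image = [[50, 50, 100, 100] for _ in range(3)]
print(sharpen(image)[1])           # [50, 0, 150, 100]
print(sharpen(image, lam=0.5)[1])  # [50, 25, 125, 100]
```

The step from 50 to 100 becomes a step from 0 to 150 with $\lambda = 1$, and a milder step from 25 to 125 with $\lambda = 0.5$.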

### Edge Enhancement

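#### Prewitt Operator

Horizontal and vertical edges are detected: $H^P_x$ responds to intensity changes along $x$ (vertical edges) and $H^P_y$ to intensity changes along $y$ (horizontal edges).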

$$
H^P_x = 
\begin{bmatrix}
-1 & 0 & 1  \\
-1 & 0 & 1 \\
-1 & 0 & 1
\end{bmatrix}
$$

$$
H^P_y = 
\begin{bmatrix}
-1 & -1 & -1\\
0 & 0 & 0\\
1 & 1 & 1
\end{bmatrix}
$$

#### Sobel Operator 

$$
H^S_x = 
\begin{bmatrix}
-1 & 0 & 1  \\
-2 & 0 & 2 \\
-1 & 0 & 1
\end{bmatrix}
$$

$$
H^S_y = 
\begin{bmatrix}
-1 & -2 & -1\\
0 & 0 & 0\\
1 & 2 & 1
\end{bmatrix}
$$

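The two matrices are applied to the image separately, giving a gradient image $D_x$ along $x$ and a gradient image $D_y$ along $y$. The two results are then combined using the magnitude.

$$
magnitude = \sqrt{D_x^2 + D_y^2}
$$

The magnitude is added to the original image to emphasise the edges. This is done in the same manner as a linear shift: a scaled version is added to the original, as in $f_s = f + \lambda f_h$.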
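A minimal sketch of the Sobel gradient magnitude on a tiny image with one vertical edge. Only interior pixels are computed and the border is left at 0, which is a simplification of this sketch; the function names are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def filter_at(img, kernel, u, v):
    """Weighted 3x3 sum of `kernel` centred on interior pixel (u, v)."""
    return sum(img[v + j][u + i] * kernel[j + 1][i + 1]
               for j in (-1, 0, 1) for i in (-1, 0, 1))


def sobel_magnitude(img):
    """Gradient magnitude at interior pixels; border pixels are left at 0."""
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for v in range(1, height - 1):
        for u in range(1, width - 1):
            d_x = filter_at(img, SOBEL_X, u, v)
            d_y = filter_at(img, SOBEL_Y, u, v)
            out[v][u] = math.hypot(d_x, d_y)  # sqrt(d_x**2 + d_y**2)
    return out


# Vertical edge: intensity jumps from 0 to 10 between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
print(sobel_magnitude(image)[1])  # [0.0, 40.0, 40.0, 0.0]
```

For this image the response comes entirely from the $x$ mask, since the intensity does not change along $y$.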



In [0]:
# Task: Calculate number of windows using edge enhancement