# Week 1B Image Processing

## Introduction - Image Analysis

Image analysis rests on a few key ideas:
* Manipulation of image data: extract the information necessary for solving an image problem
* Preprocessing, Data Reduction and Feature Analysis
    * Preprocessing removes noise and eliminates irrelevant information
    * Data Reduction extracts features for the analysis process
    * During feature analysis, the extracted features are examined and evaluated for their use in the application

## Image Preprocessing

In image preprocessing, both the input and the output are intensity images, that is, an intensity value is stored for each pixel of the image. The aim is to **improve the image** by *suppressing distortions and enhancing image features* so that the result is more suitable for a specific application.

There are two types of image processing/transformation:
* Spatial domain
* Transform domain

### Spatial Domain Techniques

Two principal categories in spatial processing:
* Intensity transformation (on **single** pixel)
* Spatial filtering (on **pixel and its neighbors**)

The general form of an operation applied directly to image pixels is:
$$ g(x,y) = T[f(x,y)] $$
where,
* $f(x,y)$ is the input image
* $g(x,y)$ is the processed image
* $T$ is an operator on $f$, over a pixel at $(x,y)$, or a neighborhood of $(x,y)$

When the neighborhood is of size $1 \times 1$ (a single pixel), $T$ becomes a gray-level transformation function: $s = T(r)$,
where,
* $s$ is the gray level of the pixel in the output image
* $r$ is the gray level of the pixel in the input image

> Example applications include *contrast stretching and thresholding*

![](img/2.png)

![](img/3.png)

## Basic Gray-level Transformations

### Image thresholding

**Thresholding is a simple method of image segmentation that converts a gray-scale image into a binary (black-and-white) image**.

* Replace each pixel with a black pixel if $r<t$ and with a white pixel if $r>t$, where $t$ is a fixed cut-off value (pixels with $r=t$ are assigned to one of the two sides by convention)
* One of the simplest transformations, used to segment the image into two classes: background and foreground
* Advantage: useful for segmentation, since it simplifies the representation of an image into something easier to analyze
* Disadvantage: if the histogram is not bimodal, or has no sharp valley between its two peaks, the chosen threshold may not be very useful
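
The rule above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `threshold` is hypothetical, an 8-bit image is assumed, and pixels with $r = t$ are (by assumption) sent to white.

```python
import numpy as np

def threshold(image: np.ndarray, t: int) -> np.ndarray:
    """Return a binary image: 0 (black) where r < t, 255 (white) otherwise.

    Assumption: pixels exactly equal to t are treated as white.
    """
    return np.where(image < t, 0, 255).astype(np.uint8)

# Tiny 2x2 example image with a fixed cut-off of 128.
image = np.array([[10, 200], [127, 128]], dtype=np.uint8)
binary = threshold(image, t=128)
```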

### Image thresholding algorithm - Otsu's method

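* Output is a binary image (two classes)
* The algorithm exhaustively searches for the threshold $t$ that minimizes the intra-class (within-class) variance,
$$ \sigma_w^2(t) = w_0(t)\sigma_0^2(t) + w_1(t)\sigma_1^2(t), \qquad w_0(t) + w_1(t) = 1 $$
where $w_0(t)$ and $w_1(t)$ are the fractions of pixels in the two classes separated by $t$, and $\sigma_0^2(t)$, $\sigma_1^2(t)$ are the variances of those classes
* Equivalently, and much faster to compute, it maximizes the inter-class (between-class) variance,
$$ \sigma_b^2(t) = \sigma^2-\sigma_w^2(t) = w_0(t)w_1(t)[\mu_0(t)-\mu_1(t)]^2 $$
where $\mu_0(t)$ and $\mu_1(t)$ are the class means and $\sigma^2$ is the total variance of the image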
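
As a rough sketch of the exhaustive search, the snippet below evaluates $w_0 w_1 (\mu_0 - \mu_1)^2$ for every candidate threshold and keeps the best one. The function name `otsu_threshold` is hypothetical, and the sketch assumes an integer-valued image in which class 0 holds the pixels with $r < t$ and class 1 the pixels with $r \ge t$.

```python
import numpy as np

def otsu_threshold(image: np.ndarray, levels: int = 256) -> int:
    """Return the threshold t that maximizes the inter-class variance.

    Assumption: class 0 = pixels with r < t, class 1 = pixels with r >= t.
    """
    hist = np.bincount(image.ravel(), minlength=levels).astype(np.float64)
    prob = hist / hist.sum()          # normalized histogram
    grey = np.arange(levels)

    best_t, best_var = 0, -1.0
    for t in range(1, levels):
        w0 = prob[:t].sum()           # weight of class 0
        w1 = 1.0 - w0                 # weight of class 1
        if w0 == 0.0 or w1 == 0.0:
            continue                  # one class is empty, skip
        mu0 = (grey[:t] * prob[:t]).sum() / w0   # mean of class 0
        mu1 = (grey[t:] * prob[t:]).sum() / w1   # mean of class 1
        var_b = w0 * w1 * (mu0 - mu1) ** 2       # inter-class variance
        if var_b > best_var:
            best_t, best_var = t, var_b
    return best_t
```

For a clearly bimodal image, any threshold between the two modes gives the same inter-class variance, so the sketch simply returns the first such value it encounters.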

### Image thresholding algorithm - Balanced histogram thresholding

This is also an automatic image thresholding algorithm, and its output is a binary image, just as with `Otsu's method`. The difference is that balanced histogram thresholding searches for the threshold that has equal weight on both sides of the histogram (the balance point, which corresponds to the median intensity).
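
The "equal weight on both sides" description can be illustrated directly with a cumulative histogram. Note the assumption: this sketch takes the balance point to be the first gray level at which the cumulative pixel count reaches half of the total, which is a simplification of the description above and not necessarily the iterative procedure found in other treatments. The name `balanced_threshold` is hypothetical.

```python
import numpy as np

def balanced_threshold(image: np.ndarray, levels: int = 256) -> int:
    """Return the first gray level where the cumulative count reaches half the pixels."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cumulative = np.cumsum(hist)          # number of pixels with level <= k
    half = cumulative[-1] / 2.0           # half of the total weight
    # Index of the first entry with cumulative[k] >= half.
    return int(np.searchsorted(cumulative, half))
```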

### Multi-band thresholding

Color images can also be thresholded. One simple approach is to designate a separate threshold for each of the R, G and B components of the image and then combine the results with an `AND` operation. This reflects how the camera works and how the data is stored in the computer, but it does not correspond to the way people perceive color.
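
A minimal sketch of the per-channel approach follows. It assumes an image array of shape `(height, width, 3)` in R, G, B order and treats a pixel as foreground when every channel is at or above its threshold; the function name `multiband_threshold` and that comparison direction are illustrative assumptions.

```python
import numpy as np

def multiband_threshold(rgb: np.ndarray, thresholds) -> np.ndarray:
    """Threshold each of the R, G, B channels, then AND the three masks."""
    t = np.asarray(thresholds).reshape(1, 1, 3)   # one threshold per channel
    per_channel = rgb >= t                        # boolean mask per channel
    return np.all(per_channel, axis=-1)           # logical AND across channels

# One row, two pixels: the second pixel fails the green threshold.
rgb = np.array([[[200, 200, 200], [200, 50, 200]]], dtype=np.uint8)
mask = multiband_threshold(rgb, (100, 100, 100))
```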

## Image Negatives (inverse)

The *negative transformation* is a linear transformation defined by,
$$ s = L - 1 - r $$
for an input image with gray levels in the range $[0, L-1]$, where,
* $L$ is the number of gray levels of the image (e.g., an 8-bit image has $2^8 = 256$ levels)

The image negative is useful for enhancing white or gray detail embedded in dark regions of an image, especially when black areas are dominant.

![](img/4.png)
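
The negative can be computed in one line. The sketch below assumes an unsigned integer image with `bits` bits per pixel; the function name `negative` is hypothetical.

```python
import numpy as np

def negative(image: np.ndarray, bits: int = 8) -> np.ndarray:
    """Apply s = L - 1 - r, with L = 2**bits gray levels."""
    L = 2 ** bits
    # Compute in a wider integer type, then cast back to the input dtype.
    return (L - 1 - image.astype(np.int64)).astype(image.dtype)

inverted = negative(np.array([0, 100, 255], dtype=np.uint8))
```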

## Log Transformations

Log transformation is defined by,
$$ s = c \log(1 + r) $$
where $c$ is a constant and $r \ge 0$.

It maps a narrow range of low gray-level values into a wider range of output values, and conversely compresses the higher values. This makes it useful for compressing the dynamic range of images with large variations in pixel values.

![](img/6.png)
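
A small sketch of the log transformation for 8-bit images is shown below. The choice of $c = (L-1)/\log L$ is an assumption made here so that the input range $[0, L-1]$ maps back onto $[0, L-1]$; the notes only say that $c$ is a constant. The function name `log_transform` is hypothetical.

```python
import numpy as np

def log_transform(image: np.ndarray, L: int = 256) -> np.ndarray:
    """Apply s = c * log(1 + r) to an 8-bit image.

    Assumption: c = (L - 1) / log(L), so that r = L - 1 maps to s = L - 1.
    """
    c = (L - 1) / np.log(L)
    s = c * np.log1p(image.astype(np.float64))   # log1p(r) = log(1 + r)
    return np.round(s).astype(np.uint8)

out = log_transform(np.array([0, 10, 255], dtype=np.uint8))
```

Low input values are pushed up strongly (here the input level 10 lands well above 10), while the top of the range is left in place.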

### Power-Law Transformations (improve contrast)

Power-Law transformation is defined by,
$$ s = cr^{\gamma} $$
where $c$ and $\gamma$ are positive constants.

* Varying $\gamma$ gives a family of possible transformation curves
* Useful for displaying an image accurately on a computer screen, by pre-processing the image appropriately before display (gamma correction)
* Also useful for general-purpose contrast manipulation

![](img/7.png)
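
The power-law curve can be sketched as follows. One assumption to note: the intensities are normalized to $[0, 1]$ before applying $s = c r^{\gamma}$ and rescaled afterwards, which keeps the output in the 8-bit range for $c = 1$. The function name `gamma_transform` is hypothetical.

```python
import numpy as np

def gamma_transform(image: np.ndarray, gamma: float, c: float = 1.0, L: int = 256) -> np.ndarray:
    """Apply s = c * r**gamma on intensities normalized to [0, 1] (an assumption)."""
    r = image.astype(np.float64) / (L - 1)        # normalize to [0, 1]
    s = c * np.power(r, gamma)                    # power-law curve
    return np.clip(np.round(s * (L - 1)), 0, L - 1).astype(np.uint8)

img = np.array([0, 64, 255], dtype=np.uint8)
brighter = gamma_transform(img, gamma=0.5)        # gamma < 1 lifts dark values
unchanged = gamma_transform(img, gamma=1.0)       # gamma = 1 is the identity
```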

### Summary of some basic gray-level transformations

![](img/5.png)

## Piecewise-linear Transformations

There are three types of such transformations:
1. Contrast stretching
2. Gray-level slicing / intensity-level slicing
3. Bit-plane slicing

### Contrast Stretching

The main idea is to increase the dynamic range of the gray levels in the image so that they span the full intensity range of the display device or recording medium.

![](img/8.png)
> comparison between contrast stretching and thresholding
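
One way to sketch contrast stretching is as a piecewise-linear curve through two control points $(r_1, s_1)$ and $(r_2, s_2)$, with the end points $(0, 0)$ and $(L-1, L-1)$ fixed. The control-point parameterization and the function name `contrast_stretch` are assumptions for illustration, and the sketch requires $0 < r_1 < r_2 < L-1$.

```python
import numpy as np

def contrast_stretch(image: np.ndarray, r1: int, s1: int, r2: int, s2: int, L: int = 256) -> np.ndarray:
    """Piecewise-linear mapping through (0,0), (r1,s1), (r2,s2), (L-1,L-1).

    Assumption: 0 < r1 < r2 < L - 1, so the breakpoints are strictly increasing.
    """
    xp = [0, r1, r2, L - 1]       # input breakpoints
    fp = [0, s1, s2, L - 1]       # output values at the breakpoints
    s = np.interp(image.astype(np.float64), xp, fp)
    return np.round(s).astype(np.uint8)

img = np.array([0, 100, 110, 150, 255], dtype=np.uint8)
stretched = contrast_stretch(img, r1=100, s1=0, r2=150, s2=255)
```

As $r_1$ and $r_2$ move closer together the middle segment gets steeper, and in the limit the curve approaches the thresholding function.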

### Gray-level Slicing

* Highlights a specific range of gray levels. Two common approaches:
    * Display a high value for all gray levels in the range of interest and a low value for all others, which produces a binary image
    * Brighten the desired range of gray levels while preserving the background and other gray-scale tones of the image
    
![](img/9.png)
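
The two slicing approaches can be sketched side by side. The function names `slice_binary` and `slice_preserve` are hypothetical, and the range of interest is assumed to be the closed interval `[low, high]`.

```python
import numpy as np

def slice_binary(image: np.ndarray, low: int, high: int, L: int = 256) -> np.ndarray:
    """High value inside [low, high], low value everywhere else (binary result)."""
    in_range = (image >= low) & (image <= high)
    return np.where(in_range, L - 1, 0).astype(np.uint8)

def slice_preserve(image: np.ndarray, low: int, high: int, L: int = 256) -> np.ndarray:
    """Brighten [low, high] but leave all other gray levels unchanged."""
    in_range = (image >= low) & (image <= high)
    return np.where(in_range, L - 1, image).astype(np.uint8)

img = np.array([10, 120, 200], dtype=np.uint8)
```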

### Bit-plane Slicing

A **bit plane** of a digital discrete signal (such as an image) is the set of bits corresponding to a given bit position in each of the binary numbers representing the signal.

* Highlights contribution made to total image appearance by specific bits
* With bit planes numbered starting from the most significant bit (plane 1 = MSB), the higher the number of the bit plane, the smaller its contribution to the final image
* Useful in compression and feature extraction

![](img/10.png)
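
Extracting a bit plane is a shift followed by a mask. Note the indexing assumption in this sketch: `k = 0` is the least significant bit and `k = 7` the most significant bit of an 8-bit image, so here a larger `k` means a larger contribution, which is the opposite of a numbering that starts at the most significant bit. The function name `bit_plane` is hypothetical.

```python
import numpy as np

def bit_plane(image: np.ndarray, k: int) -> np.ndarray:
    """Return bit plane k as an array of 0s and 1s.

    Assumption: k = 0 is the least significant bit, k = 7 the most significant.
    """
    return (image >> k) & 1

img = np.array([0b10000001, 0b01000000], dtype=np.uint8)

# Summing the planes weighted by 2**k rebuilds the original image.
reconstructed = sum(bit_plane(img, k).astype(np.int64) << k for k in range(8))
```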

## Histogram Processing

### Histogram Equalization

Aim: To get an image with equally distributed brightness levels over the whole brightness scale.

Results: Enhances contrast for brightness values near histogram maxima, decreases contrast near minima.

The histogram is defined as $H(r) = n$, where $r$ is the gray level and $n$ is the number of pixels with that gray level.

Let $r$ represent the gray levels of the image, taking values in $[0, L-1]$, where
* $r = 0$ represents black
* $r = L-1$ represents white

Consider transformations of the form:
$$ s = T(r),\ 0 \le r \le L-1 $$

Also assume that $T(r)$ satisfies:

* $T(r)$ is single-valued and monotonically increasing in $0 \le r \le L-1$, so that the inverse transformation exists and the ordering of pixel intensities is preserved
* $0 \le T(r) \le L-1$ for $0 \le r \le L-1$, so that the output gray levels lie in the same range as the input levels

For discrete values, we get probabilities and summations instead of PDFs and integrals:
$$ p_r(r_k) = n_k / MN,\ k = 0, 1, ..., L-1 $$
where,
* $MN$ is the total number of pixels in the image
* $n_k$ is the number of pixels with gray level $r_k$
* $L$ is the total number of gray levels

So, we have
$$ s_k = T(r_k) = (L-1)\sum_{j=0}^k p_r(r_j) = \frac{L-1}{MN}\sum_{j=0}^kn_j $$
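
The discrete formula translates almost directly into a lookup table: count the pixels per level ($n_k$), accumulate, scale by $(L-1)/MN$, and round. The sketch below assumes an 8-bit integer image; rounding $s_k$ to the nearest integer and the function name `equalize` are choices made here for illustration.

```python
import numpy as np

def equalize(image: np.ndarray, L: int = 256) -> np.ndarray:
    """Histogram equalization via s_k = (L - 1) / MN * sum_{j<=k} n_j."""
    hist = np.bincount(image.ravel(), minlength=L)     # n_k for each gray level
    cdf = np.cumsum(hist) / image.size                 # cumulative sum of p_r(r_j)
    lut = np.round((L - 1) * cdf).astype(np.uint8)     # s_k, rounded to integers
    return lut[image]                                  # map every pixel r_k -> s_k

img = np.array([[50, 50], [50, 200]], dtype=np.uint8)
equalized = equalize(img)
```

In this tiny example three of the four pixels have level 50, so the cumulative probability at 50 is 0.75 and that level is mapped to $255 \times 0.75 = 191.25$, rounded to 191.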

### Histogram Matching

Aim: To get an image with a specified histogram (brightness distribution)
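
The notes do not spell out a procedure for matching, so the following is only a sketch of one common approach: compute the cumulative distributions of the source image and of a reference image that has the desired histogram, then map each source level to the smallest reference level whose cumulative value reaches it. The function name `match_histogram` and the use of a reference image (rather than an explicit target histogram) are assumptions.

```python
import numpy as np

def match_histogram(source: np.ndarray, reference: np.ndarray, L: int = 256) -> np.ndarray:
    """Map source gray levels so that its histogram approximates the reference's."""
    src_cdf = np.cumsum(np.bincount(source.ravel(), minlength=L)) / source.size
    ref_cdf = np.cumsum(np.bincount(reference.ravel(), minlength=L)) / reference.size
    # For each source level, find the smallest reference level z with
    # ref_cdf[z] >= src_cdf[level].
    lut = np.searchsorted(ref_cdf, src_cdf, side="left")
    lut = np.clip(lut, 0, L - 1).astype(np.uint8)
    return lut[source]

source = np.array([[10, 10], [20, 20]], dtype=np.uint8)
reference = np.array([[100, 100], [200, 200]], dtype=np.uint8)
matched = match_histogram(source, reference)
```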