# Image processing
## Feature extraction

## JPEG

1. Convert RGB $\rightarrow$ YCbCr and subsample the chroma channels (4:4:4, 4:2:2 or 4:2:0)
2. Divide into 8x8 blocks
3. DCT of each block
4. Quantization
5. Lossless compression (RLE, Huffman)
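
Steps 3 and 4 can be sketched for a single 8x8 block as follows. This is a minimal illustration, not a full codec: the function names are hypothetical, and the table `Q50` is the widely reproduced luminance table from Annex K of the JPEG standard, assumed here to correspond to the q=50 case.

```python
import numpy as np

# Luminance quantization table from Annex K of the JPEG standard
# (assumption: this is the table meant by "q=50").
Q50 = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])


def dct_matrix(n=8):
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def encode_block(block, q=Q50):
    """Level-shift by 128, apply the 2-D DCT, then quantize (the lossy step)."""
    c = dct_matrix(block.shape[0])
    coeffs = c @ (block.astype(float) - 128.0) @ c.T
    return np.round(coeffs / q).astype(int)


def decode_block(quantized, q=Q50):
    """Dequantize and apply the inverse 2-D DCT."""
    c = dct_matrix(quantized.shape[0])
    return c.T @ (quantized * q) @ c + 128.0


# A flat block keeps only its DC coefficient after quantization.
flat = np.full((8, 8), 200)
print(encode_block(flat)[0, 0], np.count_nonzero(encode_block(flat)))
```

Larger table entries at high frequencies push most of those coefficients to zero, which is what makes the lossless stage in step 5 effective.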

#### Quantization matrix for q=50

![](images/jpg_quant.svg)

#### Example result of quantization of DCT of an image

![](images/jpg_quant2.svg)

#### "Zig-zag" scheme used to order values before lossless compression

<img src="images/JPEG_ZigZag.svg" style="width:300px">
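
One possible way to generate the zig-zag order is to sort the cells by anti-diagonal and alternate the direction on each one. The function names below are illustrative only.

```python
def zigzag_indices(n=8):
    """Return the (row, col) pairs of an n x n block in zig-zag order."""
    # Cells on one anti-diagonal share row + col. Odd diagonals are walked
    # with increasing row, even diagonals with increasing column.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square block (nested lists or array) in zig-zag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


print(zigzag_indices(8)[:6])
```

Because quantization zeroes mostly high-frequency coefficients, this ordering tends to place the zeros in one long run at the end, which suits run-length encoding.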

## Convolution filters

#### Salt-and-pepper noise

![](images/pepper_salt.png)

#### Mean filter

![](images/pepper_mean.png)

#### Gaussian filter

![](images/pepper_gauss.png)
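
Both the mean and the Gaussian filter are weighted sums over a sliding window and differ only in the kernel. A minimal NumPy sketch, assuming odd kernel sizes and edge replication at the borders (both choices, and the function names, are assumptions of this example):

```python
import numpy as np


def mean_kernel(size=3):
    """Uniform kernel: every pixel in the window has the same weight."""
    return np.full((size, size), 1.0 / size**2)


def gaussian_kernel(size=5, sigma=1.0):
    """Sampled 2-D Gaussian, normalized so the weights sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()


def filter2d(image, kernel):
    """Weighted sum over a sliding window (borders replicate the edge pixels).

    For symmetric kernels such as the two above, this equals convolution.
    """
    kh, kw = kernel.shape
    h, w = image.shape
    padded = np.pad(image.astype(float),
                    ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


# A single bright pixel is spread over its neighbourhood, not removed.
impulse = np.zeros((5, 5))
impulse[2, 2] = 9.0
print(filter2d(impulse, mean_kernel(3)))
```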

#### Median filter

![](images/pepper_median.png)
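
Unlike the mean and Gaussian filters, the median filter is not a convolution: it replaces each pixel with the median of its neighbourhood, so isolated outliers are discarded instead of averaged in. A small sketch, assuming a square window and edge replication (the function name is illustrative):

```python
import numpy as np


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood."""
    r = size // 2
    padded = np.pad(image, r, mode="edge")
    h, w = image.shape
    # Stack every shifted view of the image, then take the per-pixel median.
    windows = [padded[dy:dy + h, dx:dx + w]
               for dy in range(size) for dx in range(size)]
    return np.median(np.stack(windows), axis=0)


noisy = np.full((7, 7), 100.0)
noisy[3, 3] = 255.0   # "salt"
noisy[1, 5] = 0.0     # "pepper"
print(median_filter(noisy))
```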

## Morphological operations

#### Dilation

![](images/dilation.png)

#### Erosion

![](images/erosion.png)

#### Dilation of "salt" image

![](images/astronaut_dilate.png)

#### Erosion of "salt" image

![](images/astronaut_erode.png)

#### Opening

![](images/astronaut_open.png)

#### Closing

![](images/astronaut_close.png)
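
The four operations can be sketched with a square structuring element: dilation takes the neighbourhood maximum, erosion the minimum, opening is erosion followed by dilation, and closing is the reverse. The function names and the border handling are assumptions of this example.

```python
import numpy as np


def _shifted_views(image, size, pad_value):
    """Stack all size x size shifts of the image (square structuring element)."""
    r = size // 2
    padded = np.pad(image, r, mode="constant", constant_values=pad_value)
    h, w = image.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])


def dilate(image, size=3):
    # Pad with the minimum so the border never adds bright pixels.
    return _shifted_views(image, size, image.min()).max(axis=0)


def erode(image, size=3):
    # Pad with the maximum so the border never removes bright pixels.
    return _shifted_views(image, size, image.max()).min(axis=0)


def opening(image, size=3):
    """Erosion then dilation: removes small bright specks ("salt")."""
    return dilate(erode(image, size), size)


def closing(image, size=3):
    """Dilation then erosion: fills small dark holes ("pepper")."""
    return erode(dilate(image, size), size)


img = np.zeros((9, 9), dtype=np.uint8)
img[3:6, 3:6] = 1   # a 3x3 object
img[1, 7] = 1       # an isolated "salt" pixel
print(opening(img))
```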

## Hough transform

![](images/hough_simple.png?1)
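
A minimal sketch of the voting step, assuming the usual line parametrization $\rho = x\cos\theta + y\sin\theta$ and a resolution of one pixel in $\rho$. The function name and the accumulator layout are choices of this example.

```python
import numpy as np


def hough_lines(binary, n_theta=180):
    """Accumulate votes in (rho, theta) space for every non-zero pixel."""
    h, w = binary.shape
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    max_rho = int(np.ceil(np.hypot(h, w)))
    # Row index = rho + max_rho, so negative rho values fit in the array.
    accumulator = np.zeros((2 * max_rho + 1, n_theta), dtype=int)
    ys, xs = np.nonzero(binary)
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + max_rho, np.arange(n_theta)] += 1
    return accumulator, thetas, max_rho


# A horizontal line y = 5 collects all its votes at theta = 90 deg, rho = 5.
img = np.zeros((20, 20), dtype=bool)
img[5, :] = True
acc, thetas, max_rho = hough_lines(img)
print(acc[5 + max_rho, 90])
```

Each edge pixel traces a sinusoid in the accumulator; pixels on a common line produce sinusoids that cross in one cell, which shows up as a maximum.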

#### Hough transform of a natural image

![](images/hough_camera.png)

#### Maxima of the Hough transform displayed as lines

![](images/hough_lines.png)

## Harris corner detection

Harris and Stephens (1988)

\begin{equation}
E(u,v)=\sum_{x,y} \underbrace{w(x,y)}_{\text{window}} [\underbrace{I(x+u,y+v)}_{\text{shifted intensity}} - \underbrace{I(x,y)}_{\text{intensity}}]^2
\end{equation}

#### Formula intuition

<img src="images/harris.png" style="width:500px">

#### Using Taylor series expansion

\begin{equation}
E(u,v)\cong
\begin{bmatrix}
u & v
\end{bmatrix}
M
\begin{bmatrix}
u \\ v
\end{bmatrix}
\end{equation}

\begin{equation}
M=\sum_{x,y} w(x,y) 
\begin{bmatrix}
I^2_x & I_xI_y \\
I_xI_y & I^2_y
\end{bmatrix}
\end{equation}

#### Eigenvalues

<img src="images/harris2.png?1" style="width:500px">

#### Corner response value

\begin{equation}
R=\det M - k (\text{trace } M)^2
\end{equation}

\begin{equation}
\begin{aligned}
\det M &= \lambda_1\lambda_2 \\
\text{trace } M &= \lambda_1 + \lambda_2
\end{aligned}
\end{equation}
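
The response $R$ can be computed per pixel without explicit eigenvalues. The sketch below assumes a uniform 3x3 window for $w(x,y)$, central-difference gradients and $k=0.05$; none of these choices come from the formulas above, and the function names are illustrative.

```python
import numpy as np


def box_sum(a, size=3):
    """Sum of a over a size x size window (uniform window w(x, y))."""
    r = size // 2
    p = np.pad(a, r, mode="constant")
    h, w = a.shape
    return sum(p[dy:dy + h, dx:dx + w]
               for dy in range(size) for dx in range(size))


def harris_response(image, k=0.05, window=3):
    """R = det(M) - k * trace(M)^2 for every pixel."""
    iy, ix = np.gradient(image.astype(float))   # derivatives along rows, columns
    sxx = box_sum(ix * ix, window)               # entries of M
    syy = box_sum(iy * iy, window)
    sxy = box_sum(ix * iy, window)
    det = sxx * syy - sxy**2
    trace = sxx + syy
    return det - k * trace**2


# One bright quadrant: a single corner at (10, 10), edges along row/column 10.
img = np.zeros((20, 20))
img[10:, 10:] = 1.0
r = harris_response(img)
print(r[10, 10] > 0, r[15, 10] < 0, r[3, 3] == 0)
```

The sign pattern matches the eigenvalue picture: $R>0$ at a corner (both eigenvalues large), $R<0$ on an edge (one large eigenvalue) and $R\approx 0$ in flat regions.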


#### Example of response on an image

![](images/checker_harris.png)

## Laplace operator

\begin{equation}
L(x,y)=\frac{\partial^2I}{\partial x^2} + \frac{\partial^2I}{\partial y^2}
\end{equation}
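
On a pixel grid the second derivatives are usually replaced by finite differences. A minimal sketch using the five-point stencil, with edge replication at the borders (the function name and border handling are assumptions of this example):

```python
import numpy as np


def laplacian(image):
    """Five-point finite-difference approximation of the Laplace operator."""
    p = np.pad(image.astype(float), 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1]      # neighbours above and below
            + p[1:-1, :-2] + p[1:-1, 2:]    # neighbours left and right
            - 4.0 * p[1:-1, 1:-1])          # centre pixel


# For I = x^2 + y^2 the exact Laplacian is 4 everywhere.
yy, xx = np.mgrid[0:10, 0:10]
print(laplacian(xx**2 + yy**2)[1:-1, 1:-1])
```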

![](images/derivs.png)

#### Laplacian of an image

![](images/laplacian.png)

#### Thresholding of Laplacian

![](images/laplacian2.png)

### Laplacian of Gaussian

\begin{equation}
LoG(x,y)=-\frac{1}{\pi\sigma^4}\left[1-\frac{x^2+y^2}{2\sigma^2}\right]e^{\left[-\frac{x^2+y^2}{2\sigma^2}\right]}
\end{equation}
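
The formula can be sampled directly to obtain a convolution kernel. In the sketch below the default kernel size of roughly $\pm 3\sigma$ is an assumption, and the function name is illustrative.

```python
import numpy as np


def log_kernel(sigma=1.0, size=None):
    """Sample the LoG formula on an odd-sized square grid."""
    if size is None:
        size = 2 * int(np.ceil(3 * sigma)) + 1   # cover about +/- 3 sigma
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = (xx**2 + yy**2) / (2.0 * sigma**2)
    return -1.0 / (np.pi * sigma**4) * (1.0 - r2) * np.exp(-r2)


k = log_kernel(1.0)
print(k.shape, k[3, 3])
```

The kernel is negative at the centre, crosses zero on the circle $x^2+y^2=2\sigma^2$ and is positive outside it.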

#### Different $\sigma$ values


![](images/log1.png)

#### Approximations

![](images/log_dog.png)
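
One common approximation is the difference of Gaussians (DoG). Since $\partial G/\partial\sigma = \sigma\nabla^2 G$, to first order $G(k\sigma) - G(\sigma) \approx (k-1)\,\sigma^2\,LoG$. The sketch below checks this numerically; the function names and the default $k=1.6$ are assumptions of this example.

```python
import numpy as np


def gaussian(xx, yy, sigma):
    """Normalized 2-D Gaussian."""
    return np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)


def dog(xx, yy, sigma, k=1.6):
    """Difference of two Gaussians with scales k * sigma and sigma."""
    return gaussian(xx, yy, k * sigma) - gaussian(xx, yy, sigma)


def laplacian_of_gaussian(xx, yy, sigma):
    """The LoG formula, evaluated on a coordinate grid."""
    r2 = (xx**2 + yy**2) / (2.0 * sigma**2)
    return -1.0 / (np.pi * sigma**4) * (1.0 - r2) * np.exp(-r2)


ax = np.linspace(-4, 4, 41)
xx, yy = np.meshgrid(ax, ax)
sigma, k = 1.0, 1.01
approx = dog(xx, yy, sigma, k)
scaled_log = (k - 1) * sigma**2 * laplacian_of_gaussian(xx, yy, sigma)
print(np.abs(approx - scaled_log).max())
```

The match is first-order only: it tightens as $k \to 1$, while larger ratios trade accuracy for a stronger response.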

## SIFT

Scale-Invariant Feature Transform (2004)

1. Detect extrema in the DoG pyramid for different $\sigma$ values  
2. Keypoint location fine-tuning 
3. Determine orientation using gradient
4. Compute a descriptor
    - 16x16 macro-block divided into 16 4x4 blocks, each represented by an 8-bin orientation histogram
    - this gives 16 × 8 = 128 values per descriptor
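
The layout of the descriptor in step 4 can be sketched as below. This is a heavily simplified illustration, not the SIFT descriptor itself: it omits the Gaussian weighting, the rotation to the keypoint orientation, the interpolation between bins and the clipping of large values. The function name is hypothetical.

```python
import numpy as np


def simple_descriptor(patch, cells=4, bins=8):
    """Histogram of gradient orientations per cell, concatenated and normalized."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % (2.0 * np.pi)          # in [0, 2*pi)
    bin_index = np.floor(orientation / (2.0 * np.pi) * bins).astype(int) % bins
    step = patch.shape[0] // cells                             # 4 for a 16x16 patch
    descriptor = np.zeros((cells, cells, bins))
    for i in range(cells):
        for j in range(cells):
            rows = slice(i * step, (i + 1) * step)
            cols = slice(j * step, (j + 1) * step)
            # Each pixel votes for its orientation bin with its gradient magnitude.
            descriptor[i, j] = np.bincount(
                bin_index[rows, cols].ravel(),
                weights=magnitude[rows, cols].ravel(),
                minlength=bins,
            )
    descriptor = descriptor.ravel()
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor


patch = np.tile(np.arange(16), (16, 1))   # horizontal intensity ramp
print(simple_descriptor(patch).shape)
```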

#### Calculating DoG

![](images/sift_dog.jpg)

#### Keypoints on a sample image

![](images/sift_keypoints.jpg)

### Other methods

- SURF - Speeded-Up Robust Features (2006)
- FAST - Features from Accelerated Segment Test (2006, 2010)
- CenSuRe/STAR - Center Surround Extremas (2008)
- BRIEF - Binary Robust Independent Elementary Features (2010)
- ORB - Oriented FAST and Rotated BRIEF (2011)

## Deep neural networks

![](images/hierarchical_features.png)

#### Autoencoders

![](images/autoencoder.png)
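
The idea can be illustrated with the smallest possible case: a linear encoder and decoder trained by gradient descent to reconstruct the input through a narrow code. This is a toy sketch under stated assumptions (no nonlinearities, synthetic data, hypothetical function name and hyperparameters), not the network shown in the figure.

```python
import numpy as np


def train_linear_autoencoder(x, code_size=2, lr=0.05, steps=2000, seed=0):
    """Minimize the mean squared reconstruction error of x -> code -> x."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = rng.normal(scale=0.1, size=(d, code_size))   # encoder weights
    w_dec = rng.normal(scale=0.1, size=(code_size, d))   # decoder weights
    losses = []
    for _ in range(steps):
        code = x @ w_enc                      # compressed representation
        residual = code @ w_dec - x           # reconstruction error
        losses.append(float(np.mean(residual**2)))
        grad_out = 2.0 * residual / residual.size
        grad_dec = code.T @ grad_out
        grad_enc = x.T @ (grad_out @ w_dec.T)
        w_enc -= lr * grad_enc
        w_dec -= lr * grad_dec
    return w_enc, w_dec, losses


# Synthetic 8-dimensional data that really lives on a 2-dimensional subspace.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))
w_enc, w_dec, losses = train_linear_autoencoder(x)
print(losses[0], losses[-1])
```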

#### Autoencoder trained on the MNIST database

![](images/ae_minst.jpg)

[YouTube](https://www.youtube.com/watch?v=pM68et1o3Zk)

### Transfer learning

![](images/imagenet_vgg16.png)