In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Geometric Transformations of Images
## Examples of geometric transformations
* Original
* Scaling
* Rotation


## Representing the geometric transformations
* Given $(x,y)$ in input, where is the corresponding pixel in output?
* Translation:
    * $x' = x + t_x$
    * $y' = y + t_y$
* Scaling:
    * $x' = cx$
    * $y' = dy$
* More general transformations: affine
    * $x' = ax + by + t_x$
    * $y' = cx + dy + t_y$
    * also covers rotations



* Affine transformations can be represented with matrices
    * $x' = ax + by + t_x$
    * $y' = cx + dy + t_y$
    * $\begin{bmatrix} x' \\ y' \end{bmatrix} = 
    \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}
    + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$
        * $\vec p' = M \vec p + \vec t$

* For the moment, consider this very general form:
    * $\begin{bmatrix} x' \\ y' \end{bmatrix} = T \bigg( \begin{bmatrix} x \\ y \end{bmatrix} \bigg)$
        * $\vec p' = T(\vec p)$
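
As a rough illustration of the matrix form $\vec p' = M \vec p + \vec t$, the sketch below applies it to a few points with NumPy. The helper name `apply_affine` and the sample values are made up for this example; translation, scaling and rotation fall out as special choices of $M$ and $\vec t$.

```python
import numpy as np

def apply_affine(M, t, points):
    """Apply p' = M p + t to an (N, 2) array of (x, y) points (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    return points @ np.asarray(M, dtype=float).T + np.asarray(t, dtype=float)

pts = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])

# Translation: M is the identity, t = (t_x, t_y).
translated = apply_affine(np.eye(2), [5.0, -2.0], pts)

# Scaling: M = diag(c, d), t = 0.
scaled = apply_affine(np.diag([2.0, 0.5]), [0.0, 0.0], pts)

# Rotation by theta: M = [[cos, -sin], [sin, cos]], t = 0.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = apply_affine(R, [0.0, 0.0], pts)

print(translated)
print(scaled)
print(np.round(rotated, 6))
```

Storing points as rows and multiplying by `M.T` is only a convenience for batching; it computes the same $M \vec p$ for each point.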
    
    
   

## Geometric manipulations in principle
* Where does pixel $(x,y)$ in the original image go to in the output image?
* If $(x,y)$ goes to $(x',y')$, then:
    * $f'(x',y') = f(x,y)$
    * Where $f$ is the input and $f'$ is the output
* e.g. $x' = 0.6x + 0.8y$ and $y' = 0.8x - 0.6y$
* **However**, image pixels have *integer* coordinates
    * So $x,y,x',y'$ must be integers
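
To see why integer coordinates are a problem, the short sketch below pushes a few integer pixel positions through the example transform $x' = 0.6x + 0.8y$, $y' = 0.8x - 0.6y$. The chosen pixel positions are arbitrary; most of them land between output pixels.

```python
import numpy as np

# The example transform from the text: x' = 0.6x + 0.8y, y' = 0.8x - 0.6y.
M = np.array([[0.6,  0.8],
              [0.8, -0.6]])

# A handful of integer pixel coordinates (x, y) in the input image.
pixels = np.array([[0, 0], [1, 0], [0, 1], [5, 0], [1, 2]])

# Forward-map every pixel, then check which targets are (numerically) integers.
mapped = pixels @ M.T
is_integer = np.all(np.isclose(mapped, np.round(mapped)), axis=1)

for (x, y), (xp, yp), ok in zip(pixels, mapped, is_integer):
    print(f"({x}, {y}) -> ({xp:.1f}, {yp:.1f})  integer target: {ok}")
```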

## Forward and inverse transformations
* Forward transformation:
    * Given $(x,y)$ in original image, where does it go to in the output?
    * $\begin{bmatrix} x' \\ y' \end{bmatrix} = T \bigg( \begin{bmatrix} x \\ y \end{bmatrix} \bigg)$
* Inverse transformation
    * Given $(x',y')$ in the output image, where does it come from in the input?
    * $\begin{bmatrix} x \\ y \end{bmatrix} = T^{-1} \bigg( \begin{bmatrix} x' \\ y' \end{bmatrix} \bigg)$
    * But what if $x$ and $y$ are not integers?
        * $\rightarrow$ *Interpolation*
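
The inverse-mapping idea can be sketched as follows for an affine $T$, assuming $M$ is invertible and using the simplest possible interpolation (round to the nearest input sample). The function name `warp_inverse_nearest`, the `fill` value for pixels that come from outside the input, and the `image[y, x]` indexing are choices made for this example.

```python
import numpy as np

def warp_inverse_nearest(image, M, t, out_shape, fill=0):
    """Warp `image` by p' = M p + t using inverse mapping (hypothetical helper).

    For every output pixel (x', y') compute p = M^{-1} (p' - t) and copy the
    nearest input pixel. Pixels that map outside the input get `fill`.
    """
    M_inv = np.linalg.inv(np.asarray(M, dtype=float))
    t = np.asarray(t, dtype=float)
    h_out, w_out = out_shape
    h_in, w_in = image.shape

    # One (x', y') pair per output pixel.
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    out_pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)

    # Inverse transform, then round to the nearest input sample.
    # (np.rint rounds exact halves to the nearest even integer.)
    in_pts = (out_pts - t) @ M_inv.T
    xi = np.rint(in_pts[:, 0]).astype(int)
    yi = np.rint(in_pts[:, 1]).astype(int)

    inside = (xi >= 0) & (xi < w_in) & (yi >= 0) & (yi < h_in)
    out = np.full(h_out * w_out, fill, dtype=image.dtype)
    out[inside] = image[yi[inside], xi[inside]]
    return out.reshape(h_out, w_out)

image = np.arange(12).reshape(3, 4)
# Translate by t_x = 1: output column x' is read from input column x' - 1.
shifted = warp_inverse_nearest(image, np.eye(2), [1, 0], image.shape, fill=-1)
print(shifted)
```

Looping over output pixels guarantees that every output pixel receives exactly one value, which a forward loop over input pixels does not.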

## Interpolation
* Our view of images: a 2D array of values


* Another view: image pixels are discrete *samples* from an underlying *continuous function* of $x$ and $y$


* How do we reconstruct the underlying *continuous function* from *discrete samples*?
* Need to make some assumption
* Typically: underlying continuous function is *smooth* in some way
* So $f(x+\Delta_x, y+\Delta_y)$ and $f(x,y)$ are similar for small $\Delta_x$ and $\Delta_y$

### Nearest Neighbor Interpolation
* The value is the same as that of the nearest sample
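
A minimal sketch of nearest neighbor sampling at a real-valued point, assuming the image is a 2D NumPy array indexed as `image[y, x]` and that points outside the image are clamped to the border. The name `sample_nearest` is made up for this example.

```python
import numpy as np

def sample_nearest(image, x, y):
    """Return the value of the sample closest to the real-valued point (x, y)."""
    h, w = image.shape
    # Round to the nearest integer coordinate, then clamp to the image bounds.
    xi = min(max(int(np.floor(x + 0.5)), 0), w - 1)
    yi = min(max(int(np.floor(y + 0.5)), 0), h - 1)
    return image[yi, xi]

image = np.array([[10, 20],
                  [30, 40]])
print(sample_nearest(image, 0.4, 0.0))  # nearest sample is (x, y) = (0, 0)
print(sample_nearest(image, 0.6, 0.9))  # nearest sample is (x, y) = (1, 1)
```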

### Bilinear Interpolation
* Find the four nearest neighbors
    * $(x_l,y_l), (x_l,y_h), (x_h,y_h), (x_h,y_l)$
* Compute the weighted average of the four
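    * $g(x,y) = C f(x_l,y_l) + D f(x_l,y_h) + A f(x_h,y_h) + B f(x_h,y_l)$
    * The weights $A, B, C, D$ sum to 1, and a neighbor's weight grows as $(x,y)$ moves closer to it; with unit pixel spacing, $C = (x_h - x)(y_h - y)$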
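
The weighted average can be written out as below, assuming unit pixel spacing, `image[y, x]` indexing, and clamping at the border. The name `sample_bilinear` and the variables `ax`, `ay` (fractional offsets from the low corner) are introduced for this sketch.

```python
import numpy as np

def sample_bilinear(image, x, y):
    """Bilinearly interpolate `image` (indexed image[y, x]) at real (x, y)."""
    h, w = image.shape
    # Clamp so that the four neighbors always lie inside the image.
    x = min(max(float(x), 0.0), w - 1.0)
    y = min(max(float(y), 0.0), h - 1.0)
    x_l, y_l = int(np.floor(x)), int(np.floor(y))
    x_h, y_h = min(x_l + 1, w - 1), min(y_l + 1, h - 1)

    # Fractional offsets from the low corner, each in [0, 1).
    ax, ay = x - x_l, y - y_l

    # Each weight is the area of the rectangle diagonally opposite
    # the neighbor it multiplies, so nearer neighbors count more.
    return ((1 - ax) * (1 - ay) * image[y_l, x_l]
            + (1 - ax) * ay * image[y_h, x_l]
            + ax * ay * image[y_h, x_h]
            + ax * (1 - ay) * image[y_l, x_h])

image = np.array([[10.0, 20.0],
                  [30.0, 40.0]])
print(sample_bilinear(image, 0.5, 0.5))   # center of the four samples
print(sample_bilinear(image, 0.25, 0.0))  # a quarter of the way from 10 to 20
```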

### Other forms of interpolation
* Gaussian interpolation
    * Set new pixels to be a weighted combination of known pixels, with **weight = exp(-squared distance)**
    * $g(x,y) = C \sum_{x'} \sum_{y'} e^{-\frac{(x-x')^2+(y-y')^2}{2\sigma^2}} f(x',y')$
* Other weights
    * $g(x,y) = C \sum_{x'} \sum_{y'} w(x,x',y,y') f(x',y')$
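
A direct, unoptimized sketch of the Gaussian-weighted sum. The notes do not say how $C$ is chosen; this example assumes it normalizes the weights to sum to 1, so that a constant image stays constant. The name `sample_gaussian` and the default `sigma` are also choices made here.

```python
import numpy as np

def sample_gaussian(image, x, y, sigma=1.0):
    """Gaussian-weighted average of all samples, evaluated at real (x, y)."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # weight = exp(-squared distance / (2 sigma^2)) for every known pixel.
    weights = np.exp(-((x - xs) ** 2 + (y - ys) ** 2) / (2.0 * sigma ** 2))
    # Assumption: C normalizes the weights so that they sum to 1.
    C = 1.0 / weights.sum()
    return C * np.sum(weights * image)

image = np.array([[10.0, 20.0],
                  [30.0, 40.0]])
print(sample_gaussian(image, 0.5, 0.5))             # all four weights are equal
print(sample_gaussian(image, 0.0, 0.0, sigma=0.1))  # dominated by the sample at (0, 0)
```

Swapping the `weights` expression for another function of the distances gives the "other weights" variant with the same structure.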

## Upsampling
* Upsampling by a factor of 2
    * Where does output pixel $(x',y')$ come from in the input?
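
One way to answer this is sketched below: output pixel $(x',y')$ is read from $(x'/2, y'/2)$ in the input, and bilinear interpolation fills in the non-integer positions. The mapping $x = x'/2$ (no half-pixel offset) and the clamping at the border are assumptions of this sketch; libraries may align pixels differently. The name `upsample2_bilinear` is made up.

```python
import numpy as np

def upsample2_bilinear(image):
    """Upsample a 2D image by 2 using inverse mapping x = x'/2, y = y'/2."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    ys_out, xs_out = np.mgrid[0:2 * h, 0:2 * w]

    # Where each output pixel comes from in the input (often non-integer),
    # clamped so that the neighbors stay inside the image.
    x = np.minimum(xs_out / 2.0, w - 1)
    y = np.minimum(ys_out / 2.0, h - 1)

    x_l, y_l = np.floor(x).astype(int), np.floor(y).astype(int)
    x_h, y_h = np.minimum(x_l + 1, w - 1), np.minimum(y_l + 1, h - 1)
    ax, ay = x - x_l, y - y_l

    # Bilinear weights for the four neighbors of every output pixel.
    return ((1 - ax) * (1 - ay) * image[y_l, x_l]
            + (1 - ax) * ay * image[y_h, x_l]
            + ax * ay * image[y_h, x_h]
            + ax * (1 - ay) * image[y_l, x_h])

small = np.array([[0.0, 2.0],
                  [4.0, 6.0]])
big = upsample2_bilinear(small)
print(big)
```

Output pixels with even coordinates land exactly on input samples, so `big[::2, ::2]` reproduces `small`; the odd ones are interpolated.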

## The limits of interpolation

## Some geometric transformations should be easy
* Consider *downsampling* by a factor of 2
* Should require no interpolation: simply drop every other pixel?
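
A small experiment suggests why dropping pixels is not always safe. The test image here (one-pixel-wide vertical stripes) is a deliberately extreme choice for this sketch: its average brightness is 0.5, yet keeping every other pixel removes the stripes entirely.

```python
import numpy as np

# Vertical stripes one pixel wide: columns alternate 0, 1, 0, 1, ...
stripes = np.tile(np.array([0, 1]), (4, 4))   # shape (4, 8)

# Downsample by 2 by keeping every other pixel in each direction.
dropped = stripes[::2, ::2]

print(stripes.mean())   # average brightness of the original
print(dropped)          # only the dark columns survive
```

Starting from the other column (`stripes[::2, 1::2]`) would keep only the bright columns instead, so the result depends on an arbitrary offset rather than on the image content.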


## The Fourier Transform
* Every well-behaved periodic 1D function $f(x)$ can be decomposed into a set of sines and cosines
    * $f(x) = a_1 \sin 1x + a_2 \sin 2x + ... + b_0 \cos 0x + b_1 \cos 1x + b_2 \cos 2x + ...$
* Things are simpler if we consider complex waves
    * instead of $\cos x$ or $\sin x$, use: $e^{ix} = \cos x + i \sin x$
    * $e^{iwx}$ has frequency $\frac{w}{2\pi}$
    * Then every function $f(x)$ can be written as a sum of such functions:
        * $f(x) = a_0e^{i0x} + a_1e^{i1x} + a_2e^{i2x} + ...$
    * The Fourier coefficients of $f(x)$ are (*in increasing order of frequency*):
        * $a_0, a_1, a_2, a_3, a_4, ...$
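
For sampled signals, the coefficients can be recovered numerically with the discrete Fourier transform. The sketch below builds a signal from two complex waves with known coefficients (the values 3 and 0.5 and the sample count `N` are arbitrary) and reads them back with `np.fft.fft`.

```python
import numpy as np

N = 64
x = 2 * np.pi * np.arange(N) / N          # N samples over one period [0, 2*pi)

# A signal built from two complex waves: 3 e^{i 2x} + 0.5 e^{i 5x}.
f = 3.0 * np.exp(1j * 2 * x) + 0.5 * np.exp(1j * 5 * x)

# np.fft.fft computes sum_n f[n] e^{-2 pi i k n / N}; dividing by N gives a_k.
a = np.fft.fft(f) / N

# Magnitudes of a_0 .. a_7: only a_2 and a_5 are non-zero.
print(np.round(np.abs(a[:8]), 6))
```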

### The Fourier Transform in 2D
* Images are 2D functions of $x$ and $y$
* A similar fact holds:
    * $f(x,y) = \sum_k \sum_l a_{kl} e^{i(kx+ly)}$ for some coefficients $a_{kl}$
    * If $a_{kl}$ is non-zero for high $k$ and $l$, we say the image has high frequency components
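
The same check can be run on images with `np.fft.fft2`. In this sketch, two synthetic images (a slow and a fast horizontal cosine, chosen only for illustration) are transformed, and the hypothetical helper `dominant_frequency` reports which $(k, l)$ carries the largest coefficient.

```python
import numpy as np

N = 32
ys, xs = np.mgrid[0:N, 0:N]

# A smooth image (one slow horizontal wave) and a busy one (a fast wave).
smooth = np.cos(2 * np.pi * 1 * xs / N)
busy = np.cos(2 * np.pi * 12 * xs / N)

def dominant_frequency(image):
    """Return (k, l) of the largest-magnitude coefficient, with k, l >= 0."""
    h, w = image.shape
    a = np.fft.fft2(image) / image.size
    # For image[y, x] the result is indexed a[l, k]; look only at the
    # non-negative frequencies in the first half of each axis.
    half = np.abs(a[: h // 2, : w // 2])
    l, k = np.unravel_index(np.argmax(half), half.shape)
    return int(k), int(l)

print(dominant_frequency(smooth))
print(dominant_frequency(busy))
```

The busy image puts its energy at a much higher $k$ than the smooth one, which is what "high frequency components" means in practice.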