In [1]:
import cv2
import numpy as np

# Transformation

Transformations are geometric distortions applied to an image. They can be as simple as rotation, scaling, or cropping. Transformations are often used to remove perspective issues or to correct distortions in an image.

Types of transformations are:
1. Affine Transformation<br>
2. Non-affine Transformation

## Affine transformation

An affine transformation is any transformation that preserves collinearity (i.e., all points lying on a line initially still lie on a line after transformation) and ratios of distances (e.g., the midpoint of a line segment remains the midpoint after transformation).

While an affine transformation preserves proportions on lines, it does not necessarily preserve angles or lengths. Any triangle can be transformed into any other by an affine transformation, so all triangles are affinely equivalent; in this sense, affine equivalence generalizes congruence and similarity.

Under an affine transformation, all parallel lines in the original image remain parallel in the output image.
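The properties above can be checked numerically. The following is a minimal sketch using only NumPy; the matrix `A`, the offset `b`, and the helper `affine` are hypothetical values chosen for illustration and are not part of OpenCV.

```python
import numpy as np

# Hypothetical affine map p -> A @ p + b (values chosen only for illustration)
A = np.array([[2.0, 1.0],
              [0.5, 3.0]])
b = np.array([4.0, -1.0])

def affine(p):
    """Apply the affine map to a 2D point."""
    return np.asarray(p, dtype=float) @ A.T + b

# The midpoint of a segment is mapped to the midpoint of the mapped segment
p0, p1 = np.array([0.0, 0.0]), np.array([4.0, 2.0])
mid = (p0 + p1) / 2
midpoint_preserved = np.allclose(affine(mid), (affine(p0) + affine(p1)) / 2)

# Two parallel segments (same direction vector d) remain parallel
d = np.array([1.0, 2.0])
q0, r0 = np.array([0.0, 0.0]), np.array([3.0, 1.0])
d1 = affine(q0 + d) - affine(q0)
d2 = affine(r0 + d) - affine(r0)
cross = d1[0] * d2[1] - d1[1] * d2[0]  # zero when d1 and d2 are parallel
parallel_preserved = bool(np.isclose(cross, 0.0))

# Lengths are generally not preserved
length_before = np.linalg.norm(d)
length_after = np.linalg.norm(d1)

print(midpoint_preserved, parallel_preserved, length_before, length_after)
```

Midpoints and parallelism survive the map, while the length of the direction vector changes.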

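The basic affine transformations and their matrices are listed below. The matrices are written in the row-vector convention, $[x' \; y' \; 1] = [x \; y \; 1] \cdot A$. The 2×3 matrices that OpenCV uses later in this notebook follow the column-vector convention and correspond to the first two rows of the transpose of $A$.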
- Translation

$$\begin{bmatrix}1 & 0 & 0\\0 & 1 & 0\\ t_x & t_y & 1\end{bmatrix}$$

$t_x$ specifies the displacement along the x axis.

$t_y$ specifies the displacement along the y axis.

- Scale

$$\begin{bmatrix}s_x & 0 & 0\\0 & s_y & 0\\ 0 & 0 & 1\end{bmatrix}$$

$s_x$ specifies the scale factor along the x axis.

$s_y$ specifies the scale factor along the y axis.

- Shear

$$\begin{bmatrix}1 & sh_y & 0\\sh_x & 1 & 0\\ 0 & 0 & 1\end{bmatrix}$$

$sh_x$ specifies the shear factor along the x axis.

$sh_y$ specifies the shear factor along the y axis.

- Rotation

$$\begin{bmatrix}\cos(q) & \sin(q) & 0\\-\sin(q) & \cos(q) & 0\\ 0 & 0 & 1\end{bmatrix}$$

$q$ specifies the angle of rotation.
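As a rough illustration of how these matrices act on a point, the sketch below builds each one with NumPy and applies it to a homogeneous row vector `[x, y, 1]`. The helper names (`translation`, `scale`, `shear`, `rotation`) are hypothetical and exist only for this example.

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1, 0, 0], [0, 1, 0], [tx, ty, 1]], dtype=float)

def scale(sx, sy):
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def shear(shx, shy):
    return np.array([[1, shy, 0], [shx, 1, 0], [0, 0, 1]], dtype=float)

def rotation(q):
    c, s = np.cos(q), np.sin(q)
    return np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]], dtype=float)

point = np.array([2.0, 3.0, 1.0])  # homogeneous row vector [x, y, 1]

moved = point @ translation(5, -1)     # x' = x + tx, y' = y + ty
scaled = point @ scale(2, 0.5)         # x' = sx * x, y' = sy * y
sheared = point @ shear(1, 0)          # x' = x + shx * y, y' = y + shy * x
rotated = point @ rotation(np.pi / 2)  # quarter turn: (x, y) -> (-y, x)

# Row vectors compose left to right: scale first, then translate
combined = point @ (scale(2, 2) @ translation(1, 1))

# Column-vector (OpenCV-style) 2x3 form: first two rows of the transpose
M_cv = translation(5, -1).T[:2]

print(moved, scaled, sheared, rotated, combined)
print(M_cv)
```

Because the vectors are rows, the transformation applied first appears leftmost in the matrix product.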

## Translations: moving images up, down, left & right

### cv2.warpAffine(src, M, dsize)

- src : source image.
- M : 2×3 transformation matrix (here, the translation matrix).
- dsize : size of the output image, given as a tuple (width, height).

To apply any type of transformation to an image, we need to define that transformation as a mathematical formula:

$$T = M \cdot \begin{bmatrix} x & y & 1\end{bmatrix}^{T}$$

where $T$ is the position of the pixel in the output image, $M$ is the translation matrix, and $(x, y)$ are the pixel coordinates in the input image.

$$M = \begin{bmatrix} 1 & 0 & T_x \\ 0 & 1 & T_y \end{bmatrix}$$

$T_{x}$ = shift along x-axis (horizontal)<br>$T_{y}$ = shift along y-axis (vertical)

In [2]:
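img = cv2.imread('image.jpg')  # returns None if the file cannot be read
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

# img.shape is (rows, columns, channels), i.e. (height, width, channels)
height, width = img.shape[:2]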

In [3]:
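# Shift by a tenth of the width along x and a tenth of the height along y
M = np.float32([[1, 0, width/10], [0, 1, height/10]])

# dsize is (width, height)
img_translation = cv2.warpAffine(img, M, (width, height))
cv2.imshow('Translated image', img_translation)
cv2.waitKey(0)
cv2.destroyAllWindows()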

## Rotations

### cv2.warpAffine(src, M, dsize)

- src : source image.
- M : 2×3 transformation matrix (here, the rotation matrix).
- dsize : size of the output image, given as a tuple (width, height).

A rotation by an angle $\theta$ about the origin is described by

$$M = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}$$

The 2×3 rotation matrix expected by cv2.warpAffine() can be generated with the function

### cv2.getRotationMatrix2D(center, angle, scale)

- center : a tuple (x, y) denoting the center of the rotation in the source image.
- angle : Rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner).
- scale : Isotropic scale factor.

cv2.getRotationMatrix2D() performs scaling and rotation at the same time.
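The sketch below reproduces, in plain NumPy, the 2×3 matrix that the OpenCV documentation gives for cv2.getRotationMatrix2D(). It shows how rotation, scaling, and the chosen center combine into one matrix. The function names `rotation_matrix_2d` and `apply` are hypothetical helpers, not OpenCV APIs.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """NumPy sketch of the 2x3 matrix documented for cv2.getRotationMatrix2D."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha,  beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

def apply(M, x, y):
    """Map the point (x, y) through the 2x3 matrix M."""
    return M @ np.array([x, y, 1.0])

center = (50.0, 30.0)  # (x, y) order

M_rot = rotation_matrix_2d(center, 90, 1.0)
fixed = apply(M_rot, *center)     # the center maps to itself
moved = apply(M_rot, 51.0, 30.0)  # one pixel right of center moves one pixel up

M_zoom = rotation_matrix_2d(center, 0, 2.0)
zoomed = apply(M_zoom, 51.0, 30.0)  # distance from the center doubles

print(fixed, moved, zoomed)
```

Because the image y axis points down, a counter-clockwise rotation moves the point to the right of the center upward, to a smaller y value.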

In [4]:
# center is an (x, y) point, so width comes first; dsize is (width, height)
M = cv2.getRotationMatrix2D((width/2, height/2), 90, 1)
rotated_img = cv2.warpAffine(img, M, (width, height))

cv2.imshow('Rotated image', rotated_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

## Resizing

Image resizing refers to the scaling of images. Reducing the number of pixels in an image has several advantages. For example, it can reduce the training time of a neural network: the more pixels an image has, the more input nodes are needed, which in turn increases the complexity of the model.

Resizing also helps when zooming into images. We often need to resize an image, i.e. either shrink it or scale it up, to meet size requirements. OpenCV provides several interpolation methods for resizing an image. There are two methods for resizing:

### Method 1 : cv2.resize(src, dsize, fx, fy, interpolation)

- src : source/input image.
- dsize : a tuple (width, height) denoting the custom size of the output image.
- fx : scale factor along the horizontal axis.
- fy : scale factor along the vertical axis.
- interpolation : sets the type of interpolation technique to be used while resizing.

Choice of interpolation method for resizing:

- cv2.INTER_NEAREST: nearest-neighbor interpolation.
- cv2.INTER_LINEAR: bilinear interpolation. This is the default interpolation technique in cv2.resize() and is commonly used when zooming.
- cv2.INTER_CUBIC: bicubic interpolation over a 4×4 pixel neighborhood. It is slower than cv2.INTER_LINEAR but usually gives better-looking results when enlarging.
- cv2.INTER_AREA: resampling using pixel area relation. It is the preferred method for shrinking an image (image decimation), as it gives moiré-free results. When the image is enlarged, it behaves similarly to cv2.INTER_NEAREST.
- cv2.INTER_LANCZOS4: Lanczos interpolation over an 8×8 pixel neighborhood.

Image interpolation occurs when you resize or distort your image from one pixel grid to another. Image resizing is necessary when you need to increase or decrease the total number of pixels, whereas remapping can occur when you are correcting for lens distortion or rotating an image. Zooming increases the number of pixels so that the image can be displayed larger; the new pixel values are estimated from the existing ones, so zooming does not add real detail.

Interpolation works by using known data to estimate values at unknown points. Image interpolation works in two directions, and tries to achieve a best approximation of a pixel's intensity based on the values at surrounding pixels.
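As a rough sketch of the idea, the function below estimates the intensity at a fractional position from its four surrounding pixels (bilinear interpolation). It illustrates the principle only; `bilinear_sample` is a hypothetical helper and is not how OpenCV implements its interpolation internally.

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Estimate the intensity at fractional position (x, y) from its 4 neighbors."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, img.shape[1] - 1)  # clamp at the right border
    y1 = min(y0 + 1, img.shape[0] - 1)  # clamp at the bottom border
    wx, wy = x - x0, y - y0             # fractional offsets in [0, 1)
    # Interpolate along x on the two rows, then along y between them
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bottom

patch = np.array([[0.0, 10.0],
                  [20.0, 30.0]])

center_value = bilinear_sample(patch, 0.5, 0.5)   # average of all four pixels
quarter_value = bilinear_sample(patch, 0.25, 0.0) # a quarter of the way along row 0
exact_value = bilinear_sample(patch, 1.0, 0.0)    # lands exactly on a known pixel

print(center_value, quarter_value, exact_value)
```

The estimate equals the known value when the position falls exactly on a pixel, and blends the neighbors in proportion to their distance otherwise.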

**Note:** When using cv2.resize(), giving fx and fy the same value preserves the aspect ratio of the image; different values change it. Likewise, a dsize whose width-to-height ratio differs from that of the input changes the aspect ratio.

**Note:** When both dsize and fx, fy are given, dsize takes priority.

In [5]:
resize_img = cv2.resize(img, None, fx = 0.5, fy = 0.5)
cv2.imshow('Resized image: fx, fy', resize_img)
cv2.waitKey(0)

resize_img_2 = cv2.resize(img, (500,500))
cv2.imshow('Resized image: dsize', resize_img_2)
cv2.waitKey(0)

resize_img_3 = cv2.resize(img, (500,500), fx = 0.5, fy = 0.5)
cv2.imshow('Priority is given to dsize', resize_img_3)
cv2.waitKey(0)

resize_img = cv2.resize(img, None, fx = 1.25, fy = 1.25, interpolation = cv2.INTER_CUBIC)
cv2.imshow('Resized image: giving cv2.INTER_CUBIC interpolation', resize_img)
cv2.waitKey(0)

cv2.destroyAllWindows()

### Method 2 : Image pyramid

Image pyramids are one of the most beautiful concepts in image processing. Normally we work with images at their default resolution, but we often need to lower the resolution or resize the original image, and this is where image pyramids come in handy.

We use two functions, cv2.pyrUp() and cv2.pyrDown().

cv2.pyrUp() doubles the width and height of the image, and cv2.pyrDown() halves them.

### cv2.pyrUp(src)

### cv2.pyrDown(src)

- src : input image.

**Note:** If we scale an image down and then scale it back up to the original size, some information is lost: the result has the original dimensions but less detail than the original image.

If we keep the original image as a base image and go on applying pyrDown function on it and keep the images in a vertical stack, it will look like a pyramid. The same is true for upscaling the original image by pyrUp function.

<img src='https://media.geeksforgeeks.org/wp-content/uploads/20190516134057/Pyramids.png'>

In [6]:
down_scale = cv2.pyrDown(img)
cv2.imshow('Scaled down image', down_scale)
cv2.waitKey(0)

up_scale = cv2.pyrUp(img)
cv2.imshow('Scaled up image', up_scale)
cv2.waitKey(0)

cv2.destroyAllWindows()

## Cropping 

cv2 does not provide a direct method for cropping images, but we can crop them using NumPy index slicing.

Make sure that the slicing indices are integers.
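The sketch below crops the central region of a synthetic image. The first index selects rows (y, bounded by the height) and the second selects columns (x, bounded by the width). The 8×12 image size is an assumption made for illustration.

```python
import numpy as np

img = np.zeros((8, 12, 3), dtype=np.uint8)  # height=8, width=12
height, width = img.shape[:2]

# Rows are limited by the height, columns by the width; int() keeps indices integral
start_row, end_row = int(height * 0.25), int(height * 0.75)
start_col, end_col = int(width * 0.25), int(width * 0.75)

view = img[start_row:end_row, start_col:end_col]  # a view into img, no data copied
crop = view.copy()                                # independent copy of the region

crop[:] = 255  # modifying the copy leaves the original image untouched

print(view.shape, crop.shape, img.max())
```

Slicing returns a view that shares memory with the original array, so calling `.copy()` is worthwhile when the cropped region will be modified.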

In [7]:
# Rows run along the height, columns along the width
start_row = int(height * 0.25)
end_row = int(height * 0.75)

start_col = int(width * 0.25)
end_col = int(width * 0.75)

cropped_img = img[start_row:end_row, start_col:end_col]
cv2.imshow('Cropped image', cropped_img)
cv2.waitKey(0)

cv2.destroyAllWindows()