The general form of the affine transformation matrix is:

$$
M = 
\begin{bmatrix}
a & b & t_x \\
c & d & t_y 
\end{bmatrix}
$$
Here $a, b, c, d$ control rotation, scaling, and shearing, while $t_x, t_y$ control translation (the shift in the $X$ and $Y$ directions). Each basic transformation uses these entries as follows:

- <b>Translation</b> $(t_x, t_y)$ shifts the image $t_x$ pixels horizontally and $t_y$ pixels vertically
- <b>Scaling</b> $(a, d)$ resizes the image by a factor of $a$ horizontally and a factor of $d$ vertically
- <b>Rotation</b> $(a, b, c, d)$ where $a = \cos\theta$, $b = -\sin\theta$, $c = \sin\theta$ and $d = \cos\theta$ rotates the image by an angle $\theta$
- <b>Shearing (Skewing)</b> $(b, c)$ tilts the image along $X$ or $Y$ axis, also referred to as $X$-shearing ($sh_x$) and $Y$-shearing ($sh_y$), respectively.
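
To make the role of each entry concrete, the sketch below applies a few $2 \times 3$ matrices to a single point using plain NumPy. The helper name `apply_affine` and the sample values are illustrative assumptions, not part of OpenCV; the point is only to show how $(x, y)$ moves under each matrix.

```python
import numpy as np

def apply_affine(M, points):
    """Apply a 2x3 affine matrix M to an (N, 2) array of (x, y) points."""
    pts = np.asarray(points, dtype=np.float64)
    # Append a column of ones so the translation column (t_x, t_y) takes effect.
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return homogeneous @ np.asarray(M, dtype=np.float64).T

point = [[10, 20]]  # a single (x, y) point

translate = [[1, 0, 100], [0, 1, 50]]   # t_x = 100, t_y = 50
scale = [[2, 0, 0], [0, 0.5, 0]]        # a = 2, d = 0.5
shear_x = [[1, 0.5, 0], [0, 1, 0]]      # b = 0.5 (X-shearing)

translated = apply_affine(translate, point)
scaled = apply_affine(scale, point)
sheared = apply_affine(shear_x, point)

print("translated:", translated)
print("scaled:    ", scaled)
print("sheared:   ", sheared)
```

Translation adds the offsets, scaling multiplies each coordinate by its own factor, and X-shearing adds a multiple of $y$ to $x$ while leaving $y$ unchanged.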

In [1]:
import numpy as np
import cv2 as cv

In [2]:
img = cv.imread("road.jpg")
h0, w0, c = img.shape
print(f"Height: {h0}, width: {w0} and channels: {c}")

Height: 735, width: 1100 and channels: 3


<h3>Translation</h3>

<p>Image translation is a geometric transformation that shifts an image in the X and/or Y direction (rectilinear shift of an image from one location to another). It moves every pixel of the image by a specified number of pixels horizontally and vertically. This is useful for tasks like data augmentation, image alignment, or creating panning effects.</p>

In [3]:
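tx, ty = 100, 50  # shift in pixels along X and Y

# Keep the original canvas size: content shifted past the right/bottom edge is clipped.
# Use (w0 + tx, h0 + ty) instead to keep the whole image visible.
new_w = w0
new_h = h0

# 2x3 translation matrix: identity plus the (tx, ty) offset
M = np.float32([[1, 0, tx], [0, 1, ty]])
dst = cv.warpAffine(img, M, (new_w, new_h))
cv.imshow('img', dst)
cv.waitKey(0)
cv.destroyAllWindows()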

In the code above, a $2 \times 3$ affine transformation matrix of type float32 is defined and passed to cv.warpAffine() together with the output size, given as (width, height).

The matrix is structured as:
$$
\begin{bmatrix}
1 & 0 & t_x \\
0 & 1 & t_y 
\end{bmatrix}
$$
where $t_x = 100$ and $t_y = 50$.
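
Because the output canvas keeps the original size, the shifted image is clipped at the right and bottom edges. The sketch below, which uses only NumPy and assumes non-negative shifts and the image size printed earlier (735 by 1100), maps the four image corners through the translation matrix to estimate the canvas size that would keep the whole image visible.

```python
import numpy as np

h0, w0 = 735, 1100   # image height and width, as printed earlier
tx, ty = 100, 50     # non-negative shifts (assumption of this sketch)

M = np.float32([[1, 0, tx], [0, 1, ty]])

# Image corners as homogeneous columns (x, y, 1)
corners = np.array([[0, w0 - 1, 0,      w0 - 1],
                    [0, 0,      h0 - 1, h0 - 1],
                    [1, 1,      1,      1]], dtype=np.float32)

moved = M @ corners  # 2x4 array: row 0 holds x', row 1 holds y'

# Smallest canvas that still contains every shifted corner
new_w = int(moved[0].max()) + 1
new_h = int(moved[1].max()) + 1
print(f"Canvas needed: width {new_w}, height {new_h}")
```

Passing `(new_w, new_h)` as the size argument of the warp would then keep the full shifted image, with the uncovered strip on the left and top filled by the border value.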

<h3> Rotation</h3>

<p>Image rotation is a geometric transformation that turns an image around a specified point (called the rotation center) by a given angle. This is useful for tasks like data augmentation (generate rotated versions of training images), image alignment, document alignment (correct skewed scanned documents) and correcting orientation/object detection preprocessing (handle rotated objects).</p>

<p>
The rotation is achieved by defining a rotation matrix from the rotation center, the rotation angle and the scaling factor:

$$
\begin{bmatrix}
\alpha & \beta & (1 - \alpha) \cdot \mathrm{center}_x - \beta \cdot \mathrm{center}_y \\
-\beta & \alpha & \beta \cdot \mathrm{center}_x + (1 - \alpha) \cdot \mathrm{center}_y
\end{bmatrix}
$$

where $\alpha = \mathrm{scale}\cdot\cos\theta$ and $\beta = \mathrm{scale}\cdot\sin\theta$

The functions used are as follows:

- <b>cv2.getRotationMatrix2D()</b> : takes the rotation center, the angle in degrees and the scaling factor as parameters. It generates a $2 \times 3$ affine transformation matrix for rotation. A positive angle rotates the image counter-clockwise, a negative angle rotates it clockwise. The scale factor adjusts the image size; a factor of 1.0 means no scaling.
- <b>cv2.warpAffine()</b> : applies the rotation transformation.
</p>
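
As a cross-check of the formula, the sketch below builds the same $2 \times 3$ matrix by hand with NumPy. The function name `rotation_matrix` is a hypothetical helper written for this illustration, not an OpenCV call. It also verifies the defining property of the translation column: the rotation center is mapped onto itself.

```python
import numpy as np

def rotation_matrix(center, angle_deg, scale):
    """Build the 2x3 rotation matrix from the formula above (illustrative helper)."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha,  beta,  (1 - alpha) * cx - beta * cy],
        [-beta,  alpha, beta * cx + (1 - alpha) * cy],
    ])

center = (550.0, 367.5)  # center of a 1100 x 735 image
M = rotation_matrix(center, 30, 0.6)

# The center, written in homogeneous form, should not move.
center_mapped = M @ np.array([center[0], center[1], 1.0])
print(M)
print("center maps to:", center_mapped)
```

The translation column exists precisely to compensate for rotating and scaling about the origin, so that the chosen center stays fixed.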

In [5]:
h, w = img.shape[:2]
img_rotation = cv.warpAffine(img,
                             cv.getRotationMatrix2D((w/2, h/2),
                                                    30, 0.6),
                             (w, h))
cv.imshow('img', img_rotation)
cv.waitKey(0)
cv.destroyAllWindows()

In the code above, the parameters are:

- (w/2, h/2) --> Center of rotation (here, the image center).
- 30 --> Rotation angle in degrees (positive, so counter-clockwise).
- 0.6 --> Scaling factor (shrinks the image to 60% of its original size).

<h3> Cropping</h3>

In [None]:
cropped_img = img[100:300, 100:300]
cv.imshow('img', cropped_img)
cv.waitKey(0)
cv.destroyAllWindows()
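
Cropping relies on NumPy slicing, where rows (the $Y$ range) come first and columns (the $X$ range) second. The sketch below uses a blank array as a stand-in for the image, with hypothetical crop bounds, to show the resulting shape and the fact that a slice is a view of the original data rather than a copy.

```python
import numpy as np

# Stand-in for a 735 x 1100 BGR image (assumption: same shape as the image loaded earlier)
image = np.zeros((735, 1100, 3), dtype=np.uint8)

y1, y2 = 100, 300   # row range (vertical extent)
x1, x2 = 100, 400   # column range (horizontal extent)

crop = image[y1:y2, x1:x2]   # rows first, then columns
print(crop.shape)            # (height, width, channels) of the crop

# A slice shares memory with the source array; copy it to edit independently.
crop_copy = crop.copy()
print(np.shares_memory(crop, image), np.shares_memory(crop_copy, image))
```

Taking a copy is worthwhile when the crop will be drawn on or modified, since changes to a view also appear in the original image.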

<h3>Shearing</h3>

<p>Shearing (or skewing) is a geometric transformation that tilts an image along the X-axis (horizontal) or Y-axis (vertical). This creates a slanting effect, similar to pushing the top or side of the image while keeping the opposite side fixed.
This is useful for document correction (fixing skewed scanned documents), data augmentation (generating tilted training images) and computer graphics (creating perspective-like effects).</p>

In [4]:
img = cv.imread("road.jpg")
h, w = img.shape[:2]

shear_x = 0.5  # horizontal shear factor: x' = x + shear_x * y
shear_y = 0    # vertical shear factor:   y' = y + shear_y * x
M_shear = np.float32([[1, shear_x, 0], [shear_y, 1, 0]])
sheared_img = cv.warpAffine(img, M_shear, (w, h))

cv.imshow('img', sheared_img)
cv.waitKey(0)
cv.destroyAllWindows()
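
With the output size left at (w, h), the part of the sheared image pushed past the right edge is cut off. The sketch below, which uses only NumPy and assumes non-negative shear factors and the 735 by 1100 image size, maps the image corners through the shear matrix to estimate a canvas large enough to hold the whole result.

```python
import numpy as np

h, w = 735, 1100              # image height and width (assumed, as printed earlier)
shear_x, shear_y = 0.5, 0.0   # non-negative shear factors (assumption of this sketch)

M_shear = np.array([[1, shear_x, 0],
                    [shear_y, 1, 0]], dtype=np.float64)

# Image corners as homogeneous rows (x, y, 1)
corners = np.array([[0,     0,     1],
                    [w - 1, 0,     1],
                    [0,     h - 1, 1],
                    [w - 1, h - 1, 1]], dtype=np.float64)

mapped = corners @ M_shear.T  # (4, 2) array of sheared (x', y') corners

# Smallest canvas that contains all sheared corners
new_w = int(np.ceil(mapped[:, 0].max())) + 1
new_h = int(np.ceil(mapped[:, 1].max())) + 1
print(f"Canvas needed: width {new_w}, height {new_h}")
```

The bottom row of the image is shifted right by roughly `shear_x * h` pixels, which is why the required width grows with the image height.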