# OpenCV Hands-On

In [1]:
try:
    import cv2 as cv
except ImportError:
    %pip install opencv-python
    import cv2 as cv

## Overview of the Mat Structure

`Mat` is the fundamental data structure in OpenCV's C++ API: a dense n-dimensional array that stores images, matrices, and vectors, with the element type and the number of channels recorded at run time. In the Python bindings, functions such as `cv.imread` return NumPy arrays, so the `cv.Mat` annotation used below refers to an array with a shape and a `dtype`.
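Because images are handled as NumPy arrays in Python, the usual array attributes describe them. The sketch below uses synthetic arrays instead of a file on disk; the names `gray` and `color` are illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

# Synthetic stand-ins for loaded images (hypothetical data, not read from disk).
gray = np.zeros((256, 256), dtype=np.uint8)      # one channel, like IMREAD_GRAYSCALE
color = np.zeros((256, 256, 3), dtype=np.uint8)  # three channels (OpenCV orders them B, G, R)

print(gray.shape, gray.dtype)    # (rows, cols) and element type
print(color.shape, color.dtype)  # (rows, cols, channels)

# Pixels are indexed as [row, column], i.e. [y, x].
gray[10, 20] = 255
print(gray[10, 20])
```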

In [2]:
# Load an image
im: cv.Mat = cv.imread("./images/monarch.png", cv.IMREAD_GRAYSCALE)

# Check if image is loaded fine
if im is None:
    print("Error opening image")
    exit(1)

In [3]:
# Display the image
cv.imshow("Image", im)
cv.waitKey(0)  # Wait for a key press indefinitely
cv.destroyAllWindows()  # Close the window

![image1.png](./attachments/image1.png)

## Image resizing

In this part, we create an empty 64x64 image and fill it by downsampling the original 256x256 image: we keep every fourth pixel in each direction and then display both images.



In [4]:
import numpy as np

# Load an image
image: cv.Mat = cv.imread("./images/peppers-256.png", cv.IMREAD_GRAYSCALE)

# Check if image is loaded fine
if image is None:
    print("Error opening image")
    exit(1)

# Create an empty image
resized_image = np.zeros((64, 64), dtype=image.dtype)

# Downsample with for loops by keeping every 4th pixel in each direction
for i in range(64):
    for j in range(64):
        resized_image[i, j] = image[i * 4, j * 4]

# Display the images
cv.imshow("Image", image)
cv.imshow("Resized image", resized_image)
cv.waitKey(0)  
cv.destroyAllWindows()

![image2.png](./attachments/image2.png)

The resized image is still recognizable, but its quality is lower than that of the original. Because 15 out of every 16 pixels are discarded without any prior smoothing, the result looks pixelated and edges appear jagged.
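The double loop can also be written as a NumPy slice, and averaging each 4x4 block is one simple way to reduce the pixelation. This is a minimal sketch on a synthetic random image (an assumption, standing in for `peppers-256.png`); the names `subsampled` and `averaged` are illustrative.

```python
import numpy as np

# Synthetic 256x256 test image (assumption: stands in for the real picture).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)

# Same result as the double loop: keep every 4th row and every 4th column.
subsampled = image[::4, ::4]

# Alternative: average each 4x4 block, which smooths before reducing the size.
blocks = image.reshape(64, 4, 64, 4).astype(np.float64)
averaged = blocks.mean(axis=(1, 3)).round().astype(np.uint8)

print(subsampled.shape, averaged.shape)
```

Block averaging uses all 16 source pixels for each output pixel, so fine detail is blended rather than dropped.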

## Applying convolutional filters

We're going to apply filters using the `filter2D` function. This function takes the input image, the desired depth of the output (`-1` keeps the depth of the input), and the kernel as arguments.
We'll use the `peppers-512.png` image as input and apply different filters to it.
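To make the operation concrete, here is a small NumPy sketch of the sliding-window sum that `filter2D` computes (a correlation: the kernel is not flipped). It assumes an odd-sized kernel and mirrors the image at the borders without repeating the edge pixel, which is meant to correspond to OpenCV's default border mode. The function name `correlate2d` is illustrative, not an OpenCV API.

```python
import numpy as np

def correlate2d(image, kernel):
    """Slide `kernel` over `image` and sum the element-wise products."""
    kh, kw = kernel.shape
    py, px = kh // 2, kw // 2
    # mode="reflect" mirrors the image without repeating the edge pixel.
    padded = np.pad(image.astype(np.float64), ((py, py), (px, px)), mode="reflect")
    out = np.zeros(image.shape, dtype=np.float64)
    for m in range(kh):
        for n in range(kw):
            # Add the contribution of kernel entry (m, n) for all pixels at once.
            out += kernel[m, n] * padded[m:m + image.shape[0], n:n + image.shape[1]]
    return out

# A single bright pixel reproduces the kernel flipped in both axes.
impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
kernel = np.arange(1, 10, dtype=np.float64).reshape(3, 3)
response = correlate2d(impulse, kernel)
print(response[1:4, 1:4])
```

For kernels that are symmetric under a 180-degree rotation, such as the two blur kernels below, correlation and convolution give the same result.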

In [5]:
imk = cv.imread("./images/peppers-512.png", cv.IMREAD_GRAYSCALE)

### Kernel 1

The first kernel is defined as:
$$
\frac{1}{16} \begin{pmatrix}
    1 & 2 & 1 \\
    2 & 4 & 2 \\
    1 & 2 & 1
\end{pmatrix}
$$

In [6]:
# Define the kernel
kernel1 = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16

# Apply the filter using filter2D
imk1_filtered = cv.filter2D(imk, -1, kernel1)

# Display the image
cv.imshow("Image", imk)
cv.imshow("Filtered Image", imk1_filtered)
cv.waitKey(0)
cv.destroyAllWindows()

![image3.png](./attachments/image3.png)

This kernel is a 3×3 approximation of a Gaussian blur. It smooths the image slightly: fine texture is attenuated and the edges are a little softer than in the original image.
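The structure of this kernel can be checked in a few lines: it is the outer product of the 1-D weights `[1, 2, 1] / 4` with themselves, and its entries sum to 1. The name `binomial` is illustrative.

```python
import numpy as np

binomial = np.array([1, 2, 1]) / 4      # 1-D smoothing weights
kernel1 = np.outer(binomial, binomial)  # 3x3 kernel from the formula above

print(kernel1 * 16)   # integer weights of the kernel
print(kernel1.sum())  # the weights sum to 1, so mean brightness is preserved
```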

### Kernel 2

The second kernel is defined as:

$$
\frac{1}{9} \begin{pmatrix}
    1 & 1 & 1 \\
    1 & 1 & 1 \\
    1 & 1 & 1
\end{pmatrix}
$$

In [7]:
# Define the kernel
kernel2 = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]]) / 9

# Apply the filter using filter2D
imk2_filtered = cv.filter2D(imk, -1, kernel2)

# Display the image
cv.imshow("Image", imk)
cv.imshow("Filtered Image", imk2_filtered)
cv.waitKey(0)
cv.destroyAllWindows()

![image4.png](./attachments/image4.png)

This is a box (mean) filter: each output pixel is the unweighted average of its 3×3 neighborhood. It also blurs the image slightly and softens the edges compared with the original image.

### Kernel 3

The third kernel is defined as:
$$
\begin{pmatrix}
    1 & -3 & 1 \\
    -3 & 9 & -3 \\
    1 & -3 & 1
\end{pmatrix}
$$

In [8]:
# Define the kernel
kernel3 = np.array([[1, -3, 1], [-3, 9, -3], [1, -3, 1]])

# Apply the filter using filter2D
imk3_filtered = cv.filter2D(imk, -1, kernel3)

# Display the image
cv.imshow("Image", imk)
cv.imshow("Filtered Image", imk3_filtered)
cv.waitKey(0)
cv.destroyAllWindows()

![image5.png](./attachments/image5.png)

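In this case, the filter is a sharpening filter. Its coefficients sum to 1, so flat regions keep their brightness, while differences between neighboring pixels are amplified. This enhances the edges of the image and makes them more pronounced.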
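A quick check of this kernel's properties, as a sketch: the coefficients sum to 1, and subtracting the identity kernel leaves a part whose coefficients sum to 0, which therefore only reacts to intensity changes. The names `identity`, `high_pass`, and `flat` are illustrative.

```python
import numpy as np

kernel3 = np.array([[1, -3, 1], [-3, 9, -3], [1, -3, 1]])

identity = np.zeros((3, 3), dtype=int)
identity[1, 1] = 1                # kernel that leaves the image unchanged
high_pass = kernel3 - identity    # the part that reacts only to intensity changes

print(kernel3.sum(), high_pass.sum())

# On a flat 3x3 patch the response equals the patch value: nothing is sharpened.
flat = np.full((3, 3), 120)
print((flat * kernel3).sum())
```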

### Kernel 4

The fourth kernel is defined as:
$$
\begin{pmatrix}
    -1 & 0 & 1 \\
    -2 & 0 & 2 \\
    -1 & 0 & 1
\end{pmatrix}
$$

In [9]:
# Define the kernel
kernel4 = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])

# Apply the filter using filter2D
imk4_filtered = cv.filter2D(imk, -1, kernel4)

# Display the image
cv.imshow("Image", imk)
cv.imshow("Filtered Image", imk4_filtered)
cv.waitKey(0)
cv.destroyAllWindows()

![image6.png](./attachments/image6.png)

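This is the Sobel kernel for the horizontal derivative. It responds to intensity changes along the x direction, so it highlights vertical edges. Because the output keeps the 8-bit depth of the input, negative responses are clipped to 0 and only dark-to-bright transitions (from left to right) remain visible.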
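The effect of an unsigned 8-bit output on a derivative kernel can be illustrated with NumPy alone. The sketch below applies the kernel to a synthetic dark-bright-dark profile and compares the signed response with what survives clipping to the range 0 to 255. With `filter2D`, a signed or floating-point output depth such as `cv.CV_64F` can be requested instead of `-1` to keep the negative values.

```python
import numpy as np

sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)

# Synthetic image: every row goes dark -> bright -> dark.
image = np.zeros((5, 9), dtype=np.float64)
image[:, 3:6] = 100

# Correlate the interior only (no border handling, to keep the sketch short).
response = np.zeros((3, 7))
for m in range(3):
    for n in range(3):
        response += sobel_x[m, n] * image[m:m + 3, n:n + 7]

row = response[0]
clipped = np.clip(row, 0, 255).astype(np.uint8)            # what an 8-bit output keeps
magnitude = np.clip(np.abs(row), 0, 255).astype(np.uint8)  # absolute value keeps both edges

print(row)
print(clipped)
print(magnitude)
```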

### Kernel 5

The fifth kernel is defined as:
$$
\begin{pmatrix}
    0 & -1 & -1 \\
    1 & 0 & -1 \\
    1 & 1 & 0
\end{pmatrix}
$$

In [10]:
# Define the kernel
kernel5 = np.array([[0, -1, -1], [1, 0, -1], [1, 1, 0]])

# Apply the filter using filter2D
imk5_filtered = cv.filter2D(imk, -1, kernel5)

# Display the image
cv.imshow("Image", imk)
cv.imshow("Filtered Image", imk5_filtered)
cv.waitKey(0)
cv.destroyAllWindows()

![image7.png](./attachments/image7.png)

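This is a compass-type gradient kernel for one diagonal direction. Its coefficients (0 and ±1) correspond to a Prewitt-style compass kernel rather than a Kirsch kernel, whose coefficients are 5 and -3. The kernel compares the bottom-left side of each neighborhood with the top-right side, so it highlights edges that run from top-left to bottom-right and that are brighter on their bottom-left side. In the filtered image, edges with that orientation are the ones that stand out.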
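The direction selectivity of this kernel can be verified on two tiny synthetic 3x3 patches, one with an edge along the main diagonal and one with an edge along the anti-diagonal. This is a minimal sketch; `center_response` is a hypothetical helper that evaluates the filter at the patch center only.

```python
import numpy as np

kernel5 = np.array([[0, -1, -1], [1, 0, -1], [1, 1, 0]])

def center_response(patch, kernel):
    """Value of the filter at the center of a 3x3 patch."""
    return int((patch * kernel).sum())

rows, cols = np.indices((3, 3))

# Edge along the main diagonal, bright side at the bottom-left.
main_diag_edge = (rows > cols).astype(int)
# Edge along the anti-diagonal, bright side at the bottom-right.
anti_diag_edge = (rows + cols > 2).astype(int)

print(center_response(main_diag_edge, kernel5))
print(center_response(anti_diag_edge, kernel5))
```

The first patch gives a strong response and the second gives none, because the kernel is antisymmetric about the main diagonal.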

### Kernel 6

The sixth kernel is defined as:
$$
\begin{pmatrix}
    0 & 1 & 0 \\
    1 & -4 & 1 \\
    0 & 1 & 0
\end{pmatrix}
$$

In [11]:
# Define the kernel
kernel6 = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])

# Apply the filter using filter2D
imk6_filtered = cv.filter2D(imk, -1, kernel6)

# Display the image
cv.imshow("Image", imk)
cv.imshow("Filtered Image", imk6_filtered)
cv.waitKey(0)
cv.destroyAllWindows()

![image8.png](./attachments/image8.png)

This is the Laplacian filter, commonly used for edge detection. It highlights regions of rapid intensity change by approximating the second derivative of the image, detecting edges regardless of direction.
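The link to the second derivative can be made explicit: the kernel is the sum of a horizontal and a vertical second-difference kernel. The sketch below also checks it on the sampled function f(x, y) = x² + y², whose Laplacian is 4 everywhere. The names `d2x`, `d2y`, and `interior` are illustrative.

```python
import numpy as np

# Second difference along x (horizontal) and along y (vertical).
d2x = np.array([[0, 0, 0], [1, -2, 1], [0, 0, 0]])
d2y = d2x.T
laplacian = d2x + d2y
print(laplacian)

# Sample f(x, y) = x^2 + y^2 on a small grid.
y, x = np.mgrid[0:7, 0:7]
f = x**2 + y**2

# Apply the kernel at every interior pixel: four neighbours minus 4x the center.
interior = (f[1:-1, :-2] + f[1:-1, 2:] + f[:-2, 1:-1] + f[2:, 1:-1]
            - 4 * f[1:-1, 1:-1])
print(np.unique(interior))
```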

## Mathematical morphology

The goal of this part is to manipulate the images pixel by pixel and not to use OpenCV functions. We'll therefore implement the dilation and erosion operations by hand.

We use three structuring elements (kernels): a rectangle, a cross, and an ellipse.


Rectangle (kernel1):

$$
\begin{pmatrix}
    1 & 1 & 1 \\
    1 & 1 & 1 \\
    1 & 1 & 1
\end{pmatrix}
$$

Cross (kernel2):

$$
\begin{pmatrix}
    0 & 1 & 0 \\
    1 & 1 & 1 \\
    0 & 1 & 0
\end{pmatrix}
$$

Ellipse (kernel3):

$$
\begin{pmatrix}
    0 & 1 & 1 & 1 & 0\\
    1 & 1 & 1 & 1 & 1\\
    0 & 1 & 1 & 1 & 0
\end{pmatrix}
$$

### Erosion

In [12]:
img = cv.imread("./images/peppers-512.png", cv.IMREAD_GRAYSCALE)

# Check if image is loaded fine
if img is None:
    print("Error opening image")
    exit(1)
 
# Create a binary image: pixels greater than the threshold (127) are set to 255,
# all other pixels are set to 0
_, binary_img = cv.threshold(img, 127, 255, cv.THRESH_BINARY)

# Define the structuring elements (kernels)
kernel1 = np.array([[1, 1, 1],
                    [1, 1, 1],
                    [1, 1, 1]], dtype=np.uint8)  # Rectangle

kernel2 = np.array([[0, 1, 0],
                    [1, 1, 1],
                    [0, 1, 0]], dtype=np.uint8)  # Cross

kernel3 = np.array([[0, 1, 1, 1, 0],
                    [1, 1, 1, 1, 1],
                    [0, 1, 1, 1, 0]], dtype=np.uint8)  # Ellipse

# Function to apply erosion manually
def apply_erosion(image, kernel):
    eroded_img = np.zeros_like(image) # Create an empty image to store the eroded image
    img_height, img_width = image.shape # Get the image dimensions
    k_height, k_width = kernel.shape # Get the kernel dimensions
    offset_y = k_height // 2 # Calculate the offset in the y direction in order to avoid the borders
    offset_x = k_width // 2 # Calculate the offset in the x direction in order to avoid the borders

    # Iterate through each pixel in the image, avoiding borders
    for i in range(offset_y, img_height - offset_y):
        for j in range(offset_x, img_width - offset_x):
            match = True  # Flag to check if the kernel fits
            # Iterate through the kernel
            for m in range(k_height):
                for n in range(k_width):
                    # Only consider the kernel positions where the kernel has a '1' 
                    if kernel[m, n]:
                        if image[i + m - offset_y, j + n - offset_x] == 0:
                            match = False
                            break  # No need to check further
                if not match:
                    break
            if match:
                eroded_img[i, j] = 255  # Set pixel to white if all positions match
            else:
                eroded_img[i, j] = 0    # Set pixel to black otherwise
    return eroded_img

# Apply erosion with each kernel
eroded1_img = apply_erosion(binary_img, kernel1)
eroded2_img = apply_erosion(binary_img, kernel2)
eroded3_img = apply_erosion(binary_img, kernel3)

# Display the results
cv.imshow("Original Binary Image", binary_img)
cv.imshow("Eroded Image (Rectangle Kernel)", eroded1_img)
cv.imshow("Eroded Image (Cross Kernel)", eroded2_img)
cv.imshow("Eroded Image (Ellipse Kernel)", eroded3_img)
cv.waitKey(0)
cv.destroyAllWindows()


![image9.png](./attachments/image9.png)

#### Rectangle kernel

The 3×3 rectangular kernel treats all pixels in the neighborhood equally. This results in a uniform erosion effect, shrinking white regions and expanding black regions.

#### Cross kernel

The cross-shaped kernel checks only the center pixel and its four horizontal and vertical neighbors. Because it is a subset of the rectangle, it never removes more than the rectangle does: a pixel whose only black neighbors lie on the diagonals survives the cross but not the rectangle, so diagonal boundaries are eroded less.

#### Ellipse kernel

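The ellipse kernel used in the code is 3×5. A binary structuring element has no weights; only its shape matters. This one contains the whole 3×3 rectangle plus two extra pixels at the ends of the middle row, so it erodes at least as much as the rectangle and removes more in the horizontal direction (up to two pixels on each side) than in the vertical direction (one pixel).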
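The pixel-by-pixel loop is slow in pure Python. As a sketch of an alternative, the same rule (a pixel stays white only if every position selected by the kernel is white) can be evaluated for all pixels at once with array slices. The function name `erode_vectorized` is illustrative, and the border band is left black to mirror the behavior of `apply_erosion`.

```python
import numpy as np

def erode_vectorized(image, kernel):
    """Binary erosion of a 0/255 image; border pixels are left black."""
    h, w = image.shape
    kh, kw = kernel.shape
    oy, ox = kh // 2, kw // 2
    inner_h, inner_w = h - 2 * oy, w - 2 * ox
    fits = np.ones((inner_h, inner_w), dtype=bool)
    for m in range(kh):
        for n in range(kw):
            if kernel[m, n]:
                # Neighbour at offset (m - oy, n - ox) of every interior pixel.
                fits &= image[m:m + inner_h, n:n + inner_w] != 0
    eroded = np.zeros_like(image)
    eroded[oy:h - oy, ox:w - ox] = np.where(fits, 255, 0).astype(image.dtype)
    return eroded

# Small synthetic example: a 5x5 white square eroded with the cross.
binary = np.zeros((7, 7), dtype=np.uint8)
binary[1:6, 1:6] = 255
cross = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=np.uint8)
eroded = erode_vectorized(binary, cross)
print(eroded // 255)
```

Only the loop over the kernel positions remains, so the cost no longer depends on a Python loop over every image pixel.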

### Dilation

In [13]:
# Create a binary image
_, binary_img = cv.threshold(img, 127, 255, cv.THRESH_BINARY)

# Define the structuring elements (kernels)
kernel1 = np.array([[1, 1, 1],
                    [1, 1, 1],
                    [1, 1, 1]], dtype=np.uint8)  # Rectangle

kernel2 = np.array([[0, 1, 0],
                    [1, 1, 1],
                    [0, 1, 0]], dtype=np.uint8)  # Cross

kernel3 = np.array([[0, 1, 1, 1, 0],
                    [1, 1, 1, 1, 1],
                    [0, 1, 1, 1, 0]], dtype=np.uint8)  # Ellipse-like

# Function to apply dilation manually
def apply_dilation(image, kernel):
    dilated_img = np.zeros_like(image)
    img_height, img_width = image.shape
    k_height, k_width = kernel.shape
    offset_y = k_height // 2
    offset_x = k_width // 2

    # Iterate through each pixel in the image, avoiding borders
    for i in range(offset_y, img_height - offset_y):
        for j in range(offset_x, img_width - offset_x):
            found = False  # Flag to check if any kernel position matches
            # Iterate through the kernel
            for m in range(k_height):
                for n in range(k_width):
                    # Only consider the kernel positions where the kernel has a '1'
                    if kernel[m, n]:
                        if image[i + m - offset_y, j + n - offset_x] == 255:
                            found = True
                            break  # No need to check further
                if found:
                    break
            if found:
                dilated_img[i, j] = 255  # Set pixel to white if any position matches
            else:
                dilated_img[i, j] = 0    # Set pixel to black otherwise
    return dilated_img

# Apply dilation with each kernel
dilated1_img = apply_dilation(binary_img, kernel1)
dilated2_img = apply_dilation(binary_img, kernel2)
dilated3_img = apply_dilation(binary_img, kernel3)

# Display the results
cv.imshow("Original Binary Image", binary_img)
cv.imshow("Dilated Image (Rectangle Kernel)", dilated1_img)
cv.imshow("Dilated Image (Cross Kernel)", dilated2_img)
cv.imshow("Dilated Image (Ellipse Kernel)", dilated3_img)
cv.waitKey(0)
cv.destroyAllWindows()

![image10.png](./attachments/image10.png)

#### Rectangle kernel

Using the 3×3 rectangular kernel, dilation adds pixels uniformly around white regions in all directions. This makes objects thicker, fills small gaps, and connects nearby white regions.

#### Cross kernel

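The cross-shaped kernel dilates only in the vertical and horizontal directions: a black pixel turns white if one of its four direct neighbors is white, but not if only a diagonal neighbor is white. The expansion is therefore more restrained than with the rectangular kernel, and the corners of the grown regions are cut off instead of being square.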

#### Ellipse kernel

The ellipse-like kernel is 3×5, so it reaches two pixels to the left and right but only one pixel up and down. White regions therefore grow more horizontally than vertically, and the trimmed corners of the kernel give the grown regions slightly rounded corners.

### Opening

Opening is an erosion followed by a dilation with the same kernel.

In [14]:
# Function to apply opening (erosion followed by dilation)
def apply_opening(image, kernel):
    eroded_img = apply_erosion(image, kernel)
    opened_img = apply_dilation(eroded_img, kernel)
    return opened_img

# Create a binary image
_, binary_img = cv.threshold(img, 127, 255, cv.THRESH_BINARY)

# Apply opening with the kernel
opened_img = apply_opening(binary_img, kernel1)

# Display the results
cv.imshow("Original Binary Image", binary_img)
cv.imshow("Opened Image", opened_img)
cv.waitKey(0)
cv.destroyAllWindows()

![image11.png](./attachments/image11.png)

After opening, the small white noise regions are gone: erosion removes every white region that the 3×3 kernel does not fit inside, and the following dilation restores the remaining regions to roughly their original size. The boundaries of the white objects are also smoothed, while the overall shape of the larger objects is preserved.

### Closing

Closing is a dilation followed by an erosion with the same kernel.

In [15]:
# Function to apply closing (dilation followed by erosion)
def apply_closing(image, kernel):
    dilated_img = apply_dilation(image, kernel)
    closed_img = apply_erosion(dilated_img, kernel)
    return closed_img

# Create a binary image
_, binary_img = cv.threshold(img, 127, 255, cv.THRESH_BINARY)

# Apply closing with the kernel
closed_img = apply_closing(binary_img, kernel1)

# Display the results
cv.imshow("Original Binary Image", binary_img)
cv.imshow("Closed Image", closed_img)
cv.waitKey(0)
cv.destroyAllWindows()

![image12.png](./attachments/image12.png)

After closing, the small black noise regions inside white areas are gone: dilation fills every black hole or gap that is narrower than the 3×3 kernel, and the following erosion shrinks the white regions back to roughly their original size. The boundaries of the white objects are also smoothed, while the overall shape of the larger objects is preserved.
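Both composite operations can be tried on a small synthetic mask that contains one white speck and one black hole. This is a sketch under stated assumptions, not the notebook's implementation: `erode` and `dilate` are compact illustrative helpers built on NumPy's `sliding_window_view`, they expect odd-sized kernels, and they treat the area outside the image as white for erosion and black for dilation, whereas `apply_erosion` and `apply_dilation` leave the border band black.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _windows(mask, kernel, fill):
    """Values under the kernel's 1-entries for every pixel, shape (H, W, K)."""
    kh, kw = kernel.shape
    padded = np.pad(mask, ((kh // 2, kh // 2), (kw // 2, kw // 2)),
                    constant_values=fill)
    return sliding_window_view(padded, kernel.shape)[..., kernel.astype(bool)]

def erode(mask, kernel):
    # Outside the image counts as white, so borders are not eroded artificially.
    return _windows(mask, kernel, True).all(axis=-1)

def dilate(mask, kernel):
    # Outside the image counts as black.
    return _windows(mask, kernel, False).any(axis=-1)

rect = np.ones((3, 3), dtype=np.uint8)

mask = np.zeros((11, 11), dtype=bool)
mask[2:9, 2:9] = True    # 7x7 white object
mask[5, 5] = False       # one-pixel black hole inside the object
mask[0, 10] = True       # isolated white speck

opened = dilate(erode(mask, rect), rect)   # removes the speck, keeps the hole
closed = erode(dilate(mask, rect), rect)   # fills the hole, keeps the speck

print(opened.astype(int))
print(closed.astype(int))
```

On this example, opening deletes the speck and leaves the hole untouched, while closing fills the hole and leaves the speck in place.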