<img src="https://robomous.ai/images/layout/robomous-banner.svg" alt="Robomous.ai" width=300 />

<a href="https://colab.research.google.com/github/Robomous/notebooks/blob/main/2024/Getting_started_with_OpenCV_in_Python.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

This document was created by [robomous.ai](https://robomous.ai) as support material for the article titled "*Getting started with OpenCV in Python*" and can be accessed at the following link. The content of this notebook can be used, copied, or modified according to your convenience. Robomous authorizes the free use of its educational content shared with the public from its platform.

# Starting with OpenCV in Python

If you are working in Google Colab, OpenCV is already installed, so you don't need to install it. On a local machine, it can usually be installed with pip install opencv-python.

Importing OpenCV in Python is simple:

In [None]:
import cv2

We can review the current version of the library.

In [None]:
print(cv2.__version__)

# 1.- Opening the first image

OpenCV includes the function cv2.imread to load images in different formats (JPG, PNG, BMP, ...). Usually, we work with images in JPG format. The function takes two parameters: the path to the image file and a flag that sets the mode in which the image is opened. The flag can be one of the following options:

*   cv2.IMREAD_COLOR
*   cv2.IMREAD_GRAYSCALE
*   cv2.IMREAD_UNCHANGED

The first option loads the image in color mode, the second in grayscale, and the third loads the image as it is (including the alpha channel).

OpenCV loads color images with three channels, but it stores them in BGR order (Blue, Green, Red) rather than the more common RGB order. This is for historical reasons, and you can change the order of the channels using the function cv2.cvtColor().

Download an example image from the internet to start working with OpenCV.

In [None]:
!wget -q --show-progress https://img.freepik.com/free-photo/selective-focus-shot-adorable-german-shepherd_181624-30217.jpg -O dog.jpg

In [None]:
image = cv2.imread('dog.jpg', cv2.IMREAD_COLOR)

We can try to show the image in this notebook by using the function imshow from matplotlib and see the result.

First, import the necessary library:

In [None]:
import matplotlib.pyplot as plt

# Define a function to display the image, avoiding the need to duplicate the code each time.
# This function will display the image in the notebook in the actual size of the image.
def display_image(image):
    dpi = 80
    height, width, _ = image.shape

    # What size does the figure need to be in inches to fit the image?
    figsize = width / float(dpi), height / float(dpi)

    # Create a figure of the right size with one axes that takes up the full figure
    fig = plt.figure(figsize=figsize, dpi=dpi)
    ax = fig.add_axes([0, 0, 1, 1])

    # Hide spines, ticks, etc.
    ax.axis('off')

    # Display the image.
    ax.imshow(image, interpolation='nearest')
    plt.show()

Show the image in the notebook:

In [None]:
display_image(image)

You can see that the colors are wrong: red and blue appear swapped. This is because the image is loaded in BGR order, as mentioned before, while matplotlib expects RGB. We can convert the image to RGB using the cvtColor function.

In [None]:
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

display_image(image_rgb)

Now, the image is displayed correctly and it is using the RGB mode.

# 2.- Manipulating the image

Some of the most common operations that we can do with images are:

- Resize the image
- Rotate the image
- Flip the image
- Crop the image
- Change the color space of the image
- Add text to the image
- Draw shapes on the image

We will perform some of these operations in the next sections.

## 2.1 - Resize the image

Resizing an image is a common operation when working with images. We can resize the image using the resize function.

In [None]:
# Reduce the image to a specific size; note that the size is given as (width, height)
image_resized = cv2.resize(image_rgb, (240, 240))

display_image(image_resized)

In [None]:
# Image resized to 50% of the original size
small_image = cv2.resize(image_rgb, (0,0), fx=0.5, fy=0.5)

display_image(small_image)

In [None]:
# Increase the size of the small image to 200% of its current size
big_image = cv2.resize(small_image, (0,0), fx=2, fy=2)

display_image(big_image)

You may notice that the enlarged image looks less sharp than the original. Reducing the image discarded detail, and when enlarging it the function can only estimate the new pixel values by interpolation. By default, the interpolation method used is cv2.INTER_LINEAR, which is a linear interpolation method.

You can choose a different method with the interpolation parameter.

Let's resize the image using the cv2.INTER_NEAREST and cv2.INTER_CUBIC interpolation methods.

In [None]:
big_image_2 = cv2.resize(small_image, (0,0), fx=2, fy=2, interpolation=cv2.INTER_NEAREST)

display_image(big_image_2)

In [None]:
big_image_3 = cv2.resize(small_image, (0,0), fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

display_image(big_image_3)

## 2.2 - Rotate the image

We can start with the simplest method, the cv2.rotate function, to rotate an image. This function rotates the image in multiples of 90 degrees.

The function has two parameters: the image and the rotation mode. The rotation mode could be one of the following options:

- cv2.ROTATE_90_CLOCKWISE
- cv2.ROTATE_180
- cv2.ROTATE_90_COUNTERCLOCKWISE


In [None]:
rotated_image = cv2.rotate(image_rgb, cv2.ROTATE_90_CLOCKWISE)

display_image(rotated_image)

In [None]:
rotated_image_2 = cv2.rotate(image_rgb, cv2.ROTATE_90_COUNTERCLOCKWISE)

display_image(rotated_image_2)

In [None]:
rotated_image_3 = cv2.rotate(image_rgb, cv2.ROTATE_180)

display_image(rotated_image_3)

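To rotate the image by any other angle, we can use the cv2.getRotationMatrix2D function to create a rotation matrix and the cv2.warpAffine function to apply the rotation matrix to the image. cv2.getRotationMatrix2D takes the center of rotation, the angle in degrees (positive values rotate counterclockwise), and a scale factor.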

In [None]:
# Rotate the image 45 degrees around its center
rotation_matrix = cv2.getRotationMatrix2D((image_rgb.shape[1] / 2, image_rgb.shape[0] / 2), 45, 1)
rotated_image_4 = cv2.warpAffine(image_rgb, rotation_matrix, (image_rgb.shape[1], image_rgb.shape[0]))

display_image(rotated_image_4)

## 2.3 - Flip the image

Flipping an image is a simple operation that can be done using the cv2.flip function. The function has two parameters: the image and the flip code. The flip code could be one of the following options:

- 0: flip the image vertically
- 1: flip the image horizontally
- -1: flip the image both vertically and horizontally

In [None]:
# Flip the image horizontally
image_flipped_horizontal = cv2.flip(image_rgb, 1)

display_image(image_flipped_horizontal)

In [None]:
# Flip the image vertically
image_flipped_vertical = cv2.flip(image_rgb, 0)

display_image(image_flipped_vertical)

## 2.4 - Crop the image

Before cropping the image, we need to know that an image in OpenCV is represented as a NumPy array of 3 dimensions (height, width, channels). To crop the image, we select the region of interest (ROI) that we want to keep using the NumPy slicing operation.

The second thing to understand is the slicing operation itself, which extracts a part of the array. The syntax is array[start:end, start:end]. The first range selects the rows (the y-axis) and the second range selects the columns (the x-axis); the end index is not included.

In [None]:
cropped_image = image_rgb[100:300, 200:400]

display_image(cropped_image)
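
One detail worth knowing is that a NumPy slice is a view of the original array, not a copy. The sketch below uses a synthetic black image (an assumption, so the example runs without `dog.jpg`) to show the row/column order and why `.copy()` is useful when the crop will be modified.

```python
import numpy as np

# Synthetic black image: 400 pixels high, 600 pixels wide, 3 channels.
image = np.zeros((400, 600, 3), dtype=np.uint8)

# Rows (y) come first and columns (x) second: image[y1:y2, x1:x2].
view = image[100:300, 200:400]
print(view.shape)  # (200, 200, 3)

# The slice is a view: painting it white also changes the original image.
view[:] = 255
print(image[100, 200].tolist())  # [255, 255, 255]

# With .copy() the crop is independent of the original image.
independent = image[100:300, 200:400].copy()
independent[:] = 0
print(image[100, 200].tolist())  # [255, 255, 255]
```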

## 2.5 - Change the color space of the image

Changing the image's color space is possible using the cv2.cvtColor function. The function has two parameters: the image and the color conversion code. Several color conversion codes are available in OpenCV; you can see the complete list in the OpenCV documentation. One of the most common color conversion codes is cv2.COLOR_BGR2GRAY, which converts the image from BGR to grayscale. Because our image is already in RGB order, we use cv2.COLOR_RGB2GRAY here.

In [None]:
gray_image = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2GRAY)

# Displaying a grayscale image with matplotlib requires the cmap parameter set to 'gray'
plt.imshow(gray_image, cmap='gray')
plt.axis('off')
plt.show()

## 2.6 - Add text to the image

Adding text to an image is a common operation when working with images. We can use the cv2.putText function to add text to the image. The function has several parameters: the image, the text to add, the position of the text, the font, the font scale, the color, the thickness, and the line type.

Let's add some text to the image.

In [None]:
# Copy the image first, because putText draws directly on the array
# it receives and would otherwise modify image_rgb
image_with_text = image_rgb.copy()

# Select the font to use
font = cv2.FONT_HERSHEY_SIMPLEX

# Add the text to the image
cv2.putText(image_with_text, "Hello, I'm a dog! Woof!", (10, 50), font, 1, (255, 255, 255), 2, cv2.LINE_AA)

display_image(image_with_text)

As you can see, adding text to the image is a simple operation. You can try to change the font, the font scale, the color, the thickness, and the line type to see the result.

## 2.7 - Draw shapes on the image

Drawing shapes on an image is another common operation when working with images in OpenCV. For example, when working with Deep Learning algorithms for object detection, such as YOLO, we need to draw bounding boxes around the detected objects. This operation can be done using the cv2.rectangle function.

Let's draw a rectangle around the dog in the image. The bounding box should be from the point (218, 90) to the point (404, 412).

In [None]:
image_with_rectangle = image_rgb.copy()

# Draw a rectangle on the image
cv2.rectangle(image_with_rectangle, (218, 90), (404, 412), (255, 0, 0), 2)

display_image(image_with_rectangle)

Now, let's add a text label to the bounding box, as in the prediction of an object detection algorithm.

In [None]:
image_with_rectangle = image_rgb.copy()

# Define the color red in RGB order
color_red = (255, 0, 0)

# Draw a rectangle on the image
cv2.rectangle(image_with_rectangle, (218, 90), (404, 412), color_red, 2)

# Add the label "Dog" to the bounding box
font_size = 0.7
font = cv2.FONT_HERSHEY_SIMPLEX
cv2.putText(image_with_rectangle, "Dog", (218, 80), font, font_size, color_red, 2, cv2.LINE_AA)

display_image(image_with_rectangle)

# 3.- Displaying the image in a window

**NOTE:** The following code creates a window and displays the image in the window. This operation can't be done in Google Colab, but you can run this code on your local machine and see the result.

Sometimes, you must display the image in a window to interact with the image using the mouse or the keyboard; these operations are common when working with video. To display the image in a window, we can use the cv2.imshow function.

Try to run the following code on your local machine. It will open a window with the image and wait until you press any key to close it.

```python
import cv2

def main():
    # Load the image, we will work with the image in BGR format this time
    image_bgr = cv2.imread("dog.jpg", cv2.IMREAD_COLOR)

    image_with_rectangle = image_bgr.copy()

    # Define the color red in BGR order, because the image is in BGR format
    color_red = (0, 0, 255)

    # Draw a rectangle on the image
    cv2.rectangle(image_with_rectangle, (218, 90), (404, 412), color_red, 2)

    # Add the label "Dog" to the bounding box
    font_size = 0.7
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(image_with_rectangle, "Dog", (218, 80), font, font_size, color_red, 2, cv2.LINE_AA)

    # Display the annotated image in the window until any key is pressed
    cv2.imshow('Dog', image_with_rectangle)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```

Working with images in OpenCV is simple and powerful. You can perform several operations with images, like resizing, rotating, flipping, cropping, changing the color space, adding text, and drawing shapes. These operations are essential when working with images in computer vision projects.

In future posts, we will see how to work with videos and webcams and apply image processing techniques to images.

This document is distributed under the Apache License, Version 2.0, available in this link: https://www.apache.org/licenses/LICENSE-2.0