<a href="https://colab.research.google.com/github/dajopr/lectures/blob/main/image_processing/lecture_01_image_processing_basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Image Processing in Python with Colab

This exercise sheet provides a hands-on introduction to image processing using Python, focusing on the core libraries:

- OpenCV (cv2): For image and video manipulation.

- NumPy: For efficient numerical operations on images (represented as arrays).

- Matplotlib: For visualizing images and plotting results.

Before you begin:

Let's first get familiar with Colab:
1. Create a new code cell: Click the "+ Code" button above.
2. In the code cell you just created, type print("Hello, Colab!") and press Control+Enter to run it. The output appears below the cell. Shift+Enter also runs the current cell and then moves on to the next cell.
3. Upload an image.
    - Open the Files tab in the sidebar.
    - Select "Upload to session storage".
    - Select a file from your computer to upload to the session. The file is discarded when the session is closed.
    - You can also drag and drop files into the highlighted area at the bottom.

Make sure you have a Google Colab environment set up.

Install the necessary libraries (they should already be installed in Colab):

In [None]:
!pip install opencv-python
!pip install numpy
!pip install matplotlib



Import the necessary libraries:

In [31]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from skimage import io

## Exercise 1: Image fundamentals

### Exercise 1a – Loading an image

1. Use cv2.imread("path/to/image") to load an image from a file. Assign the image to a variable named image_original.

You can upload a new image to your Colab environment or use the image added in the previous steps.

Ignore any coloring issues. These will be addressed in the upcoming exercises.




### Exercise 1b – Image display

1. Use plt.imshow(image_name) to display the image in the Colab notebook.
2. Assign the current axes (the area the image is drawn into) to a variable named **ax** with plt.gca() (get current axes)
3. Turn off the x axis with ax.get_xaxis().set_visible(False)
4. Turn off the y axis as well
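
The first three steps could look roughly like the sketch below. It uses a synthetic gray ramp (`demo_image`, a made-up stand-in) instead of an uploaded photo and deliberately hides only the x axis.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for an uploaded image: a horizontal gray ramp.
demo_image = np.tile(np.linspace(0, 255, 128).astype(np.uint8), (64, 1))

plt.imshow(demo_image, cmap="gray")
ax = plt.gca()  # the Axes object that imshow just drew into
ax.get_xaxis().set_visible(False)  # hides the x axis ticks and labels
```

If the surrounding frame should disappear as well, `ax.axis("off")` is a shortcut that hides both axes at once.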


### Exercise 1c – Color spaces

OpenCV loads color images in BGR channel order by default, while Matplotlib expects RGB. To display the image correctly, the channel order needs to be changed.

1. Convert the image to RGB and assign it to variable image_rgb
    - Use OpenCV's "convert color" function cv2.cvtColor
    - It takes two arguments:
        1. The image to be converted (image_original)
        2. An integer denoting the color space conversion
    - Color space conversion codes are available as constants in the cv2 module
    - For BGR to RGB use cv2.COLOR_BGR2RGB
2. Display the converted image natively in Colab by ending a code cell with the name of the variable

### Exercise 1d – Displaying multiple images

Display the original image in BGR (which will look incorrect) and the RGB image side by side.

1. Access the first subplot with plt.subplot
    - The first two arguments define the number of rows and columns respectively.
    - To display the images next to each other use plt.subplot(1, 2, ...)
    - The third argument selects the current subplot. In this case either 1 or 2. The indices start at 1 in the top left and increase first to the right and then down. So plt.subplot(2,3,4) would be the first subplot in the second row.
2. Display the image in BGR using plt.imshow
3. Set the title of the subplot with plt.title
4. Repeat for the image in RGB
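
The subplot numbering can be checked with a small sketch that writes each index into its own panel of a 2 x 3 grid. The dictionary `axes_by_index` is only an illustrative helper.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 4))
axes_by_index = {}
for index in range(1, 7):
    ax = plt.subplot(2, 3, index)  # 2 rows, 3 columns, 1-based position
    ax.text(0.5, 0.5, str(index), ha="center", va="center", fontsize=20)
    ax.set_xticks([])
    ax.set_yticks([])
    axes_by_index[index] = ax
```

Panels 1 to 3 fill the top row from left to right, and panel 4 starts the second row.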

### Exercise 1e – Resizing an image

1. Resize the image to size 224 px by 224 px
    - Use cv2.resize() to change the dimensions of the image.
2. Resize once with area interpolation (cv2.INTER_AREA) and once with linear interpolation (cv2.INTER_LINEAR), and display the results side by side
    - For available interpolation methods see https://docs.opencv.org/3.4/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121

## Exercise 2: Images as arrays

**Images as NumPy ndarrays**

In Python, images are commonly represented as multi-dimensional NumPy arrays (ndarrays). This representation allows for efficient storage and manipulation of image data.

Understanding the Structure

- Grayscale Images: A grayscale image is typically represented as a 2D array. Each element in the array corresponds to the intensity of a pixel; for 8-bit images, values range from 0 (black) to 255 (white). The shape of the array is (height, width).

- Color Images: A color image, such as an RGB image, is represented as a 3D array. The first two dimensions represent the height and width of the image, while the third dimension represents the color channels. For RGB images, the third dimension has a size of 3, corresponding to the red, green, and blue color intensities. The shape of the array is (height, width, 3).
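
As a quick illustration of these shapes, the sketch below builds one empty grayscale and one color array by hand (the 4 x 6 size is arbitrary):

```python
import numpy as np

height, width = 4, 6

gray = np.zeros((height, width), dtype=np.uint8)      # 2D: one intensity per pixel
color = np.zeros((height, width, 3), dtype=np.uint8)  # 3D: three channels per pixel
color[:, :, 0] = 255  # fill the first channel of every pixel

print(gray.shape, gray.ndim)    # (4, 6) 2
print(color.shape, color.ndim)  # (4, 6, 3) 3
```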

### Exercise 2a – Accessing pixel values
1. Print the shape of the image
    - Use image.shape to access the shape
2. Use NumPy array indexing to access the pixel value at a specific location (x, y), e.g. (50, 50).
    - Values are accessed via square brackets, e.g. image[y, x] accesses the pixel at (x, y)
    - Remember that in NumPy arrays, the first index is the row (y-coordinate), and the second index is the column (x-coordinate).
3. Access a region of interest
    - Multiple pixel values can be accessed by specifying a range: image[y_start:y_end, x_start:x_end]
    - Omitting the start or the end of a range extends it to the beginning or the end of that axis, e.g. image[:, :10] selects the first 10 columns.
4. Set the first 10 rows to 0
    - Use image[y_start:y_end, x_start:x_end] = value
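
The indexing rules can be tried on a small synthetic array in which every pixel stores 100 * row + column, so the value itself reveals which index was hit. The array and its size are assumptions made for illustration only.

```python
import numpy as np

# Synthetic 6 x 8 "image": the pixel in row y and column x holds 100 * y + x.
image = np.array([[100 * y + x for x in range(8)] for y in range(6)])

x, y = 5, 2
pixel = image[y, x]  # row index first, then column index
print(pixel)         # 205

roi = image[1:3, 2:5]  # rows 1-2, columns 2-4; a view into the same data
print(roi.shape)       # (2, 3)

image[:2, :] = 0  # set the first two rows to 0
```

Because a slice is a view and not a copy, the first row of `roi` is also 0 after the last line.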

### Exercise 2b – Color channels
The color channels of the image can be accessed individually.

1. Display the red, green and blue channels next to each other
2. Display the channels with a different colormap
    - In plt.imshow the colormap can be set by setting the keyword argument cmap
    - For available default colormaps see https://matplotlib.org/stable/users/explain/colors/colormaps.html
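
One possible way to approach this, sketched on a synthetic RGB image (the ramps and the colormap choices "Reds", "Greens" and "Blues" are assumptions, not requirements of the exercise):

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic RGB image: red ramps left to right, green ramps top to bottom,
# blue is constant.
height, width = 64, 96
image_rgb = np.zeros((height, width, 3), dtype=np.uint8)
image_rgb[:, :, 0] = np.linspace(0, 255, width).astype(np.uint8)
image_rgb[:, :, 1] = np.linspace(0, 255, height).astype(np.uint8)[:, None]
image_rgb[:, :, 2] = 128

channel_names = ["red", "green", "blue"]
colormaps = ["Reds", "Greens", "Blues"]

fig, axes = plt.subplots(1, 3, figsize=(9, 3))
for i, ax in enumerate(axes):
    # A single channel is a 2D array, so imshow applies the given colormap.
    ax.imshow(image_rgb[:, :, i], cmap=colormaps[i], vmin=0, vmax=255)
    ax.set_title(channel_names[i])
    ax.axis("off")
```

Fixing `vmin` and `vmax` keeps the three panels on the same intensity scale; without them, imshow stretches each channel to its own minimum and maximum.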


### Exercise 2c – Image arithmetic

The NumPy arrays that make up an image also support arithmetic operations.

1. Convert the image to float
    - Use the array's astype method, e.g. image.astype(np.float32)
2. Calculate the mean value of the green channel
3. Subtract the mean value from the green channel in the image
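
The conversion to float matters because uint8 arithmetic wraps around modulo 256. The sketch below first shows the wrap-around on two tiny arrays and then runs the three steps on a synthetic random image (an assumption so that the snippet is self-contained).

```python
import numpy as np

# uint8 arithmetic wraps around: 10 - 20 becomes 246 instead of -10.
a = np.array([10, 200], dtype=np.uint8)
b = np.array([20, 100], dtype=np.uint8)
wrapped = a - b
as_float = a.astype(np.float32) - b.astype(np.float32)
print(wrapped.tolist(), as_float.tolist())  # [246, 100] [-10.0, 100.0]

# Synthetic stand-in for a loaded color image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

image_float = image.astype(np.float32)
green_mean = image_float[:, :, 1].mean()  # index 1 is green in both RGB and BGR
image_float[:, :, 1] -= green_mean        # green channel is now centered on 0
```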


## Exercise 3: Histogram equalization

### Exercise 3a – Histogram equalization with NumPy

Objective: Implement and understand the histogram equalization algorithm using only NumPy for calculations and Matplotlib for visualization.

Motivation: Histogram equalization is a fundamental technique for improving contrast in images, especially those where pixel values are clustered in a narrow range. This exercise will help you understand how it works step-by-step by manipulating histograms and cumulative distribution functions (CDFs).

Tasks:

1. Load image: Start with a sample grayscale image represented as a NumPy array (a simple low-contrast example is provided in the code).
    - Use io.imread("https://upload.wikimedia.org/wikipedia/commons/0/08/Unequalized_Hawkes_Bay_NZ.jpg") to load the low-contrast image
2. Convert the image to the 0–1 range (e.g. by dividing the 8-bit values by 255)
3. Calculate the original histogram
    - Use np.histogram to get the intensity distribution of the original image.
4. Plot this histogram using plt.plot
5. Calculate the Original CDF
    - Hint: The histogram values need to be cumulatively summed and then normalized.
6. Plot the CDF

7. Perform the equalization
    - You can use np.interp (see https://numpy.org/doc/stable/reference/generated/numpy.interp.html) to map the image intensities through the CDF
    - Hint: np.histogram returns one more bin edge than it returns counts. If the shapes of the CDF and the bin edges don't match, look at the first and the last edge returned by np.histogram: which of them is a valid value for an image?

8. Visualize Results: Display the original image and the equalized image side-by-side.
9. Visualize the histogram and CDF of the equalized image
10. Analyze: Compare the original and equalized histograms and images. How has the intensity distribution changed? How has the visual appearance improved? Observe the shape of the CDF.
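
The core of tasks 3 to 7 can be sketched on synthetic data as follows. This is one possible approach, not a reference solution: the random low-contrast image, the choice of 256 bins over the range 0 to 1, and the decision to pair each CDF value with the left edge of its bin are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic low-contrast image in the 0-1 range: all values lie in 0.4-0.6.
low_contrast = rng.uniform(0.4, 0.6, size=(64, 64))

# Histogram over the full 0-1 range: 256 counts, but 257 bin edges.
hist, bin_edges = np.histogram(low_contrast, bins=256, range=(0.0, 1.0))

# Cumulative sum, normalized so that the last CDF value is 1.
cdf = hist.cumsum() / hist.sum()

# Map every pixel through the CDF. The last edge is dropped so that the
# x-coordinates (left bin edges) and the CDF have the same length;
# bin_edges[1:] (right edges) would be the alternative pairing.
equalized = np.interp(low_contrast.ravel(), bin_edges[:-1], cdf)
equalized = equalized.reshape(low_contrast.shape)

print(low_contrast.min(), low_contrast.max())
print(equalized.min(), equalized.max())
```

After the mapping, the intensities are spread over almost the whole 0 to 1 range instead of the narrow band they started in.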

In [None]:
image = io.imread("https://upload.wikimedia.org/wikipedia/commons/0/08/Unequalized_Hawkes_Bay_NZ.jpg")

### Exercise 3b – Histogram equalization with OpenCV

Objective: Enhance the contrast of a color image using histogram equalization applied selectively to the luminance channel, preserving the original color information as much as possible.

Motivation: Applying histogram equalization directly to each channel (R, G, B) of a color image often leads to unnatural color shifts and artifacts. By converting the image to a color space like YCbCr, which separates intensity (luminance – Y) from color information (chrominance – Cb, Cr), we can apply equalization only to the Y channel. This enhances the contrast based on brightness levels while keeping the color components (Cb, Cr) unchanged, resulting in a more natural-looking enhancement.

Tasks:

1. Load Image: Load the specified low-contrast color image from the provided URL (http://photography.bastardsbook.com/assets/content/images/large/florence-afternoon-_-7267492940.jpg)
    - Note: io.imread loads images as RGB, unlike OpenCV, which defaults to BGR
2. Equalize the luminance of the image
    - Apply OpenCV's cv2.equalizeHist function only to the Y channel in YCbCr color space. Note that cv2.equalizeHist works on single-channel 8-bit images. OpenCV names this color space YCrCb (e.g. cv2.COLOR_RGB2YCrCb) and stores the channels in the order Y, Cr, Cb.

3. Visualize Results: Display the original RGB image and the final equalized RGB image side-by-side using Matplotlib.

4. Visualize Histograms (Optional but Recommended):
    - Calculate and plot the histogram of the original Y channel.
    - Calculate and plot the histogram of the equalized Y channel. This helps visualize the effect of the equalization step.

5. Analyze:

    - Compare the original and equalized images visually. How has the contrast changed?
    - Do the colors in the equalized image look natural, or are there significant shifts?
    - Compare the histograms of the original and equalized Y channels. How does the distribution change?
    - Why is this YCbCr approach generally preferred for equalizing color images compared to applying equalizeHist to each R, G, and B channel independently?

In [None]:
image = io.imread("http://photography.bastardsbook.com/assets/content/images/large/florence-afternoon-_-7267492940.jpg")

## Exercise 4: Image morphing (optional)

Objective: Explore the effect of linear combination (cross-dissolving) as a simple image transition technique and understand the conditions under which it can produce a visually smooth effect, approximating a morph without complex warping.

Motivation: While true image morphing often requires complex steps like feature matching and warping, a simple cross-dissolve can sometimes suffice if the source and destination images are very similar and perfectly aligned. This exercise demonstrates this basic technique and highlights the critical importance of image alignment and structural similarity for any morphing or transition effect. It serves as a contrast to more complex feature-based morphing methods.

Key Concept:

- Linear Combination / Cross-Dissolving: Creating an intermediate image by taking a weighted average of two input images. The formula is:
    output_image = (1 - alpha) * image1 + alpha * image2
    where alpha is a value between 0.0 and 1.0.

    When alpha = 0.0, the output is identical to image1.

    When alpha = 1.0, the output is identical to image2.

    Values between 0 and 1 create a blend, effectively "fading" from image1 to image2 as alpha increases.
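
The formula can be checked by hand on two constant images, here with the arbitrary example values 100 and 200:

```python
import numpy as np

image1 = np.full((2, 2), 100.0)
image2 = np.full((2, 2), 200.0)

blended_values = {}
for alpha in (0.0, 0.25, 0.5, 1.0):
    output_image = (1 - alpha) * image1 + alpha * image2
    blended_values[alpha] = float(output_image[0, 0])

print(blended_values)  # {0.0: 100.0, 0.25: 125.0, 0.5: 150.0, 1.0: 200.0}
```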

Tasks:

1. Find Suitable Image Pairs: This is the most crucial step for this specific exercise! Search for or create pairs of images that meet these criteria:

    - Identical Subject & Viewpoint: The images should show the same scene or subject photographed from the exact same camera position. Using a tripod is ideal.

    - Structural Alignment: Major shapes, lines, and features should be in the same location within both image frames.

    - Allowed Differences: The images can differ in aspects like:

        - Lighting (e.g., day vs. night, different artificial light).

        - Color palette (e.g., different color grading or filters applied).

        - Minor details (e.g., a person making a slightly different facial expression, clouds moving slightly, seasonal changes in foliage if the camera position is fixed).

        - Texture or artistic style applied digitally.

    - Same Dimensions: Both images must have the exact same height and width.

    - Examples: Two photos from a fixed tripod at different times of day, two identical photos with different color filters applied, two frames from a static video where only lighting or a minor element changes.

2. Load Images: Load your chosen pair of images (image1, image2) using a library like imageio. Double-check that their dimensions match.

3. Implement Cross-Dissolve Function:

    - Create a simple Python function, e.g., cross_dissolve(img1, img2, alpha).

    - Inputs: The two images (as NumPy arrays) and the alpha value (between 0 and 1).

    - Inside the function:

        - Ensure images are converted to a floating-point type (e.g., float32) before multiplication to avoid overflow/clipping issues with uint8.

        - Calculate output_image.

        - Clip values to the valid range if necessary (though less likely if inputs are 0-255 and alpha is 0-1).

        - Convert the resulting floating-point image back to the original data type (e.g., uint8) for display.

        - Return the output_image.

4. Generate and Visualize Intermediate Frames:

    - Call your cross_dissolve function with several different alpha values (e.g., 0.25, 0.5, 0.75).

    - Display the original image1, image2, and the generated intermediate frames using Matplotlib. Arrange them logically to show the transition.

5. Analyze:

    - Evaluate the smoothness of the transition for your chosen image pair. Why did the simple cross-dissolve work reasonably well (or not)? Relate this to the image selection criteria.

    - What visual artifacts, if any, are still present (e.g., "ghosting" where elements aren't perfectly aligned)?

    - Under what specific conditions is this simple linear combination technique sufficient for creating a transition effect?

    - How does this compare to the feature-based morphing technique involving triangulation and warping? When would the more complex method be absolutely necessary?
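
The function described in task 3 might be sketched as follows. This is one possible implementation under stated assumptions: both inputs are uint8 arrays in the 0 to 255 range, and the result is rounded before the conversion back, which the task description does not prescribe. The two constant images are synthetic stand-ins for a real image pair.

```python
import numpy as np

def cross_dissolve(img1, img2, alpha):
    """Blend two equally sized images: (1 - alpha) * img1 + alpha * img2."""
    if img1.shape != img2.shape:
        raise ValueError("Both images must have the same shape")
    # Work in float32 so that uint8 inputs cannot overflow or wrap around.
    blended = (1.0 - alpha) * img1.astype(np.float32) + alpha * img2.astype(np.float32)
    # Round, clip to the 8-bit range (assumed), and restore the input dtype.
    return np.clip(np.rint(blended), 0, 255).astype(img1.dtype)

# Synthetic stand-ins for an aligned image pair.
image1 = np.full((4, 4, 3), 40, dtype=np.uint8)
image2 = np.full((4, 4, 3), 240, dtype=np.uint8)

frames = [cross_dissolve(image1, image2, alpha) for alpha in (0.25, 0.5, 0.75)]
print([int(frame[0, 0, 0]) for frame in frames])  # [90, 140, 190]
```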


This exercise emphasizes the importance of input data quality (alignment and similarity) when choosing image processing techniques.