# **Preprocessing**

<div style="color:#777777;margin-top: -15px;">
<b>Author</b>: Norman Juchler |
<b>Course</b>: MSLS CO4 |
<b>Version</b>: v1.1 <br><br>
<!-- Date: 05.03.2025 -->
<!-- Comments: ... -->
</div>

Image preprocessing is a crucial step in medical image analysis, involving techniques such as resizing, cropping (region of interest selection), denoising, and intensity or color adjustments. The goal is to enhance image quality and optimize it for further processing, such as segmentation or classification. In this tutorial, we will use the OpenCV library to apply some of these preprocessing techniques.

We will work with hematological images displaying red and white blood cells from a blood smear. In these images, red blood cells appear red and white blood cells are blue/purple. The background is white/gray. We will leverage color channel information to segment different cell types in later steps.

![Hematology data](../data/images/hematology-collage.svg?9)

This dataset has several imperfections that can affect image quality and analysis: low resolution limits the detail visible in cell structures; noise introduces unwanted pixel intensity variations; JPEG compression artifacts cause blockiness; and cell overlap makes segmentation more challenging, as blood cells often touch or obscure each other.

To address these challenges and improve image quality, we will apply the following techniques:
- **Cropping**: Select a region of interest (ROI)
- **Resizing**: Standardize image dimensions
- **Masking**: Isolate specific areas
- **Denoising**: Reduce image noise
- **Enhancing contrast**: Improve visibility of features
- **Sharpening**: Highlight edges and fine details
- (**Artifact removal**: Mitigate compression effects)
- **Color conversion**: Convert images between color spaces (e.g., RGB to grayscale)
- **Color correction / white balancing**: Correct color imbalances
- **Background removal**: Eliminate unwanted background elements

### **Further reading**

For a deeper understanding of image preprocessing, check out:
- Geeks for Geeks: Image Enhancement Techniques using OpenCV. [Link](https://www.geeksforgeeks.org/image-enhancement-techniques-using-opencv-python/)



---

## **Preparations**

Let's begin with some preparatory steps...

In [None]:
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

# Jupyter / IPython configuration:
# Automatically reload modules when modified
%load_ext autoreload
%autoreload 2

# Enable vectorized output (for nicer plots)
%config InlineBackend.figure_formats = ["svg"]

# Inline backend configuration
%matplotlib inline

# Enable this line if you want to use the interactive widgets
# It requires the ipympl package to be installed.
#%matplotlib widget

import sys
sys.path.insert(0, "../")
import tools

In [None]:
# Read in the data
img1 = cv.imread("../data/images/hematology-baso1.jpg", cv.IMREAD_COLOR)
img2 = cv.imread("../data/images/hematology-baso2.jpg", cv.IMREAD_COLOR)
img3 = cv.imread("../data/images/hematology-blast1.jpg", cv.IMREAD_COLOR)

plt.imshow(img1)
plt.axis("off");

**Note:** OpenCV uses the BGR color space by default, while Matplotlib uses RGB. To ensure consistency, we convert images to RGB before displaying them.

In [None]:
img1 = cv.cvtColor(img1, cv.COLOR_BGR2RGB)
img2 = cv.cvtColor(img2, cv.COLOR_BGR2RGB)
img3 = cv.cvtColor(img3, cv.COLOR_BGR2RGB)

images = [img1, img2, img3]

# Let's display the images
plt.imshow(img1)
plt.axis("off");

In the following sections, we will use various functions to display images. Before proceeding, here is a brief overview of the available options.

In [None]:
# Plain matplotlib
plt.imshow(img1)
plt.axis("off");   # Hide axes.

# Based on Jupyter's display() function.
tools.display_image(img1)

# Show pairs of images
tools.show_image_pair(img1, img2, title1="Baso 1", title2="Baso 2")

# Show axes coordinates
tools.show_image_pair(img1, img2, title1="Baso 1", title2="Baso 2", show_axes=True)

# Show a chain of images
tools.show_image_chain([img1, img2, img3], titles=["Baso 1", "Baso 2", "Blast 1"])

# Show a grid of images
images_tmp = images*2
titles = ["Baso 1", "Baso 2", "Blast 1"]
titles *= 2
tools.show_image_grid(images_tmp, titles=titles)

---

## **Cropping**

Cropping an image involves selecting a rectangular region of interest (ROI) within the image. In Python, this can be achieved using the slicing operator to extract the desired region.
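
The following sketch illustrates slicing on a small synthetic array rather than on the course images. The bounding-box values are hypothetical. The main point is the index order: NumPy arrays are indexed as `[row, column]`, that is `[y, x]`.

```python
import numpy as np

# Synthetic stand-in for an RGB image: 6 rows (height) x 8 columns (width).
img = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)

# Hypothetical bounding box: top-left corner (x0, y0), width w, height h.
x0, y0 = 2, 1
w, h = 4, 3

# Rows (y) come first, columns (x) second.
crop = img[y0:y0 + h, x0:x0 + w]

# Slicing returns a view on the original data.
# Copy it if the crop should be modified independently.
crop_copy = crop.copy()

print(crop.shape)  # (height, width, channels) of the cropped region
```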


In [None]:
######################
###    EXERCISE    ###
######################

# Crop the image such that only the purple white blood cell is visible.
# Use the slicing operator to crop the image. 

# Specify the coordinates of the bounding box
xs, ys = 250, 160
h, w = 100, 100

# Crop the image with the slicing operator
img1_crop = ...

# Display the image
tools.show_image_pair(img1, img1_crop, title1="Original", title2="Cropped", 
                      shape=None, box_aspect=1)

---

## **Resizing**

We can use OpenCV’s [`resize()`](https://docs.opencv.org/4.x/da/d54/group__imgproc__transform.html#ga47a974309e9102f5f08231edc7e7529d) function to resample an image to a specified size. This allows for both downsampling (reducing the image size) and upsampling (increasing the image size). The `interpolation` parameter controls the method used for resizing.

In [None]:
######################
###    EXERCISE    ###
######################

# Resize the image to half and double its size.
img1_half = ...
img2_double = ...

tools.show_image_chain(images=[img1_half, img1, img2_double], 
                       titles=["Half", "Original", "Double"])

---

## **Masking**

Masking allows us to isolate a cell from the background, remove artifacts, or mark specific regions as unimportant by setting their pixel values to zero.  

A mask is a binary image (with values `True` or `False`). When a mask has the same shape as the input image, it can be used to selectively modify pixel values, enabling targeted processing or segmentation.
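
As an illustration, the sketch below builds a circular boolean mask with plain NumPy on a synthetic image and uses it to black out everything outside the circle. The center and radius are hypothetical values chosen for this toy example. A similar mask could be drawn with `cv.circle()` on a `uint8` array and then converted to boolean.

```python
import numpy as np

# Synthetic gray RGB image: 100 rows x 120 columns.
img = np.full((100, 120, 3), 180, dtype=np.uint8)

# Hypothetical circle parameters (x = column, y = row).
cx, cy, r = 60, 50, 20

# Coordinate grids that broadcast to the image's height x width.
yy, xx = np.ogrid[:img.shape[0], :img.shape[1]]

# 2D boolean mask: True inside the circle, False outside.
mask = (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2

# A 2D mask indexes whole pixels of a 3-channel image.
masked = img.copy()
masked[~mask] = 0   # set everything outside the circle to black

print(mask.shape, mask.dtype, int(mask.sum()))
```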

In [None]:
######################
###    EXERCISE    ###
######################

# 1. Create a circular mask with the same size as the image.
#    The center and the radius of the circle are provided below.
#    You can use cv.circle() or define your own function.
# 2. Apply the mask to the image img2.

# Center and radius of the circle
cx, cy = 220, 300
r = 44

# We use here a 2D binary mask
mask = ...

# Apply the mask to the RGB image: set the masked-out pixels to black
img2_masked = ...

# Visualize
tools.show_image_pair(img2, img2_masked, title1="Original", title2="Masked")

---

## **Denoising**

Denoising is essential for improving image quality, but it requires balancing noise reduction with detail preservation. OpenCV provides several methods for this purpose, including:  

- **Gaussian blur**: Smooths the image by averaging pixel values with a weighted kernel.  
- **Median blur**: Replaces each pixel value with the median of its neighbors, effective for salt-and-pepper noise.  
- **Bilateral filter**: Preserves edges while reducing noise by considering both spatial distance and intensity differences.  


Before proceeding, review these OpenCV tutorials on [Smoothing](https://docs.opencv.org/4.x/d4/d13/tutorial_py_filtering.html) and [Image denoising](https://docs.opencv.org/4.x/d5/d69/tutorial_py_non_local_means.html).


In [None]:
######################
###    EXERCISE    ###
######################

# 1. Apply Gaussian blur
# 2. Apply median blur
# 3. Apply bilateral filter (edge preserving)
# 4. Apply a non-local means denoising filter (see the second link above)

dst = ...
tools.show_image_chain(images=[img1, dst], titles=["Original", "Denoised"])

---

## **Sharpening**

Sharpening enhances image edges, making details more distinct. Two commonly used methods are **unsharp masking** and **Laplacian filtering**.  

### **Unsharp masking**

This method enhances details by subtracting a blurred version of the image from the original:  

1. Apply a Gaussian blur to the image.  
2. Subtract the blurred image from the original.  
3. Add the result back to the original image to enhance edges. 

### **Laplacian filter**

This method applies a second-order derivative filter to emphasize edges:  

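- The Laplacian is a second-order derivative operator, applied as a filter mask (kernel).
- Common kernels are `[0 1 0; 1 -4 1; 0 1 0]` (4-neighborhood) and `[-1 -1 -1; -1 8 -1; -1 -1 -1]` (8-neighborhood, with the opposite sign).
- The kernel values sum to 0, so flat regions produce a zero response.
- Apply the kernel with `cv.filter2D(img, ddepth, kernel)`. To sharpen, subtract the response from the image if the kernel center is negative, or add it if the center is positive.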

Further reading:
- [Stackoverflow](https://stackoverflow.com/questions/4993082)
- [Geeks for Geeks](https://www.geeksforgeeks.org/image-sharpening-using-laplacian-filter-and-high-boost-filtering-in-matlab/)



In [None]:
######################
###    EXERCISE    ###
######################

# Implement one of the above methods to sharpen an image.
kernel = ...
img1_sharp = ...

tools.show_image_pair(img1, img1_sharp, 
                      title1= "Original", 
                      title2="Sharpened")

---

## **Color / intensity enhancements**

In the previous exercise, we explored several histogram-based techniques for enhancing image contrast, including histogram stretching, histogram equalization, and histogram matching.  

For specific adjustments, converting to an alternative color space can be beneficial. 
The **HSV color space** is commonly used to modify saturation and brightness independently.
Other color spaces, such as **YCrCb, L\*a\*b, and Luv**, separate luminance (intensity) from chrominance (color), allowing for more precise intensity adjustments without affecting color balance.  



In [None]:
######################
###    EXERCISE    ###
######################

# Using the previous notebook 01-image-processing, 
# - increase the contrast in the image
# - increase the saturation
# - try to whiten the background, without losing the cells

---

## **Background removal**

Background removal can be achieved through thresholding (detecting background color), identifying background structure, or applying segmentation techniques to isolate the foreground.  
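
As a rough sketch of the thresholding approach, the snippet below treats every pixel that is bright in all three channels as background and sets it to pure white. The synthetic image and the threshold of 200 are assumptions for illustration. Real blood smear images contain noise and uneven illumination, so a fixed threshold will usually need tuning or an automatic method.

```python
import numpy as np

# Synthetic RGB image: light-gray background with two colored "cells".
img = np.full((50, 50, 3), 230, dtype=np.uint8)
img[10:25, 10:25] = (190, 90, 100)   # reddish region
img[30:45, 28:43] = (110, 80, 170)   # purple region

# Background = pixels that are bright in every channel.
threshold = 200  # hypothetical value, chosen for this toy image
background = np.all(img > threshold, axis=-1)

# Whiten the background and keep the foreground untouched.
result = img.copy()
result[background] = 255

foreground_fraction = 1.0 - background.mean()
print(f"Foreground fraction: {foreground_fraction:.2f}")
```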

In the next notebook, we may explore segmentation-based background removal. For now, we can experiment with a pre-trained model using the [RemBG package](https://github.com/danielgatis/rembg). This tool leverages a convolutional neural network available on Hugging Face ([see here](https://huggingface.co/spaces/KenjieDec/RemBG)) to automatically remove backgrounds from images.  

While the model does not perform well on hematological images, it may work for other types of images. Give it a try and see how it performs on your dataset!


In [None]:
# If rembg is not installed yet, uncomment the following lines:
#%pip install rembg
#%pip install onnxruntime
from rembg import remove 
from PIL import Image 

# # RemBG requires a Pillow image as input. Let's
# # convert the NumPy array into a Pillow image.
# img_pil = Image.fromarray(img2) 
# # RemBG does not work well on our hematology images (here: img2).
# img_nobg = remove(img_pil) 
# tools.show_image_pair(np.asarray(img_pil), np.asarray(img_nobg), background_color="pink")

# But it works for other datasets.
files = [ "hematology-blast1.jpg",
          "hematology-baso2.jpg",
          "hematology-baso1.jpg",
          "kingfisher.jpg", 
          "kingfisher-gray.jpg", 
          "histology-image.jpg", 
          "ct-brain-slices.jpg" ]

for f in files:
    img_pil = Image.open("../data/images/"+f)
    img_nobg = remove(img_pil) 
    img_nobg.save("oink.png")
    tools.show_image_pair(np.asarray(img_pil), np.asarray(img_nobg), 
                          title1=f, title2="No background",
                          background_color="pink")  