# Lab 04 : Image Data Handling


#### Lab Overview

This lab covers how to load, organize, and prepare a dataset that consists of images.

---

#### Objective

By the end of this lab, you will be able to:

1.
2.
3.

---


#### What is Image Processing?

- **Definition**: Manipulating pixel-based (raster) images to enhance them, extract information, or transform them.
- **Domains**:
  - **Low-level processing**: Noise removal, contrast adjustment, filtering.
  - **Mid-level processing**: Segmentation, feature extraction.
  - **High-level processing**: Interpretation, object recognition, scene understanding.


#### Data loading

| Concept                           | Description                                                        | Syntax (Library)                                                                                                                                                                                                                                                     |
| --------------------------------- | ------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Directory Traversal**           | Recursively gather all image file paths in a folder                | `from pathlib import Path`<br>`paths = list(Path('data/images').rglob('*.jpg'))  # pathlib`                                                                                                                                                                          |
| **Batch Loading & Preprocessing** | Loop over paths to read, convert color, resize, etc.               | `import cv2`<br>`for path in paths:`<br>`    img = cv2.imread(str(path))               # BGR image`<br>`    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # convert to RGB`<br>`    img = cv2.resize(img, (224,224))  # resize`<br>`    label = path.stem.split('_')[0]` |
| **Saving Processed Images**       | Write your processed arrays back to disk, keeping label subfolders | `from pathlib import Path`<br>`out_dir = Path('processed')/label`<br>`out_dir.mkdir(parents=True, exist_ok=True)`<br>`cv2.imwrite(str(out_dir/path.name), processed_img)  # cv2 + pathlib`                                                                           |
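
The traversal and label-parsing rows above can be tried without OpenCV. The sketch below assumes filenames of the form `<label>_<id>.jpg`, which is what `path.stem.split('_')[0]` implies; the helper name `collect_labeled_paths` and the throwaway demo folder are hypothetical and only serve the illustration.

```python
from pathlib import Path
import tempfile


def collect_labeled_paths(root, pattern="*.jpg"):
    """Return sorted (path, label) pairs for every matching file under root.

    Assumes the label is the part of the file stem before the first
    underscore, e.g. 'cat_001.jpg' -> 'cat' (an assumed naming scheme).
    """
    pairs = []
    for path in sorted(Path(root).rglob(pattern)):
        label = path.stem.split("_")[0]
        pairs.append((path, label))
    return pairs


# Demo on a temporary directory filled with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "images"
    (demo_root / "nested").mkdir(parents=True)
    for name in ["cat_001.jpg", "nested/dog_002.jpg", "notes.txt"]:
        (demo_root / name).touch()

    found = collect_labeled_paths(demo_root)
    labels = [label for _, label in found]
    print(labels)  # the .txt file is ignored by the '*.jpg' pattern
```

Returning the pairs as a list keeps the traversal separate from the image decoding, so the same list can later feed the `cv2.imread` loop.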


#### Digital Image Representation

| Concept                              | Description                                        | Syntax (Library)                                                                                                                                                                        |
| ------------------------------------ | -------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Grayscale Loading**                | Load image as single-channel (0–255)               | `gray = cv2.imread('path/to/image.jpg', cv2.IMREAD_GRAYSCALE)` (cv2)                                                                                                                    |
| **Grayscale Conversion & Normalize** | Decode JPEG → to gray → normalize to [0,1] float32 | `raw = tf.io.read_file('path'); img = tf.image.decode_jpeg(raw, channels=3); gray = tf.image.rgb_to_grayscale(img); gray = tf.image.convert_image_dtype(gray, tf.float32)` (TensorFlow) |
| **Display Grayscale**                | Show gray image with colormap                      | `plt.imshow(gray, cmap='gray'); plt.axis('off')` (Matplotlib)                                                                                                                           |
| **Color Loading & BGR→RGB**          | Read BGR image and convert to RGB                  | `img_bgr = cv2.imread('path'); img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)` (cv2)                                                                                                |
| **Display Color**                    | Show RGB image                                     | `plt.imshow(img_rgb); plt.axis('off')` (Matplotlib)                                                                                                                                     |
| **Resize**                           | Resize image to 512×512 px                         | `resized = cv2.resize(img, (512, 512), interpolation=cv2.INTER_LINEAR)` (cv2)                                                                                                           |
| **Resize (TensorFlow)**              | Resize tensor image to 512×512                     | `resized = tf.image.resize(img, [512, 512], method='bilinear')` (TensorFlow)                                                                                                            |
| **Figure Scaling**                   | Control display size (inches)                      | `plt.figure(figsize=(6, 6))` (Matplotlib)                                                                                                                                               |
| **Load High Bit-Depth**              | Read image preserving original bit depth           | `img16 = cv2.imread('path', cv2.IMREAD_UNCHANGED)` (cv2)                                                                                                                                |
| **Scale uint16→uint8**               | Convert 16-bit image to 8-bit                      | `img8 = cv2.convertScaleAbs(img16, alpha=255/65535)` (cv2)                                                                                                                              |
| **Normalize dtype**                  | Cast to float32 & scale pixel values to [0,1]      | `img_f32 = tf.image.convert_image_dtype(img, tf.float32)` (TensorFlow)                                                                                                                  |
| **BGR→HSV**                          | Convert BGR image to HSV                           | `hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)` (cv2)                                                                                                                                  |
| **RGB→HSV (TensorFlow)**             | Convert RGB float image to HSV                     | `hsv_tf = tf.image.rgb_to_hsv(tf.image.convert_image_dtype(img_rgb, tf.float32))` (TensorFlow)                                                                                          |
| **BGR→Lab**                          | Convert BGR image to CIELab                        | `lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2Lab)` (cv2)                                                                                                                                  |
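
The bit-depth and normalization rows reduce to simple array arithmetic. The NumPy sketch below shows the linear rescaling behind the uint16→uint8 row and the [0, 1] cast; the function names `uint16_to_uint8` and `to_unit_float` are hypothetical, and the result is meant to illustrate the idea rather than to reproduce the OpenCV or TensorFlow calls bit for bit.

```python
import numpy as np


def uint16_to_uint8(img16):
    """Linearly rescale a uint16 image (0-65535) to uint8 (0-255)."""
    scaled = img16.astype(np.float32) * (255.0 / 65535.0)
    # Round to the nearest integer and clip before the narrowing cast.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


def to_unit_float(img8):
    """Cast a uint8 image to float32 with values in [0, 1]."""
    return img8.astype(np.float32) / 255.0


img16 = np.array([[0, 32768, 65535]], dtype=np.uint16)
img8 = uint16_to_uint8(img16)
img_f32 = to_unit_float(img8)
print(img8, img_f32.dtype)
```

Converting to float before scaling avoids integer overflow, and clipping before the cast avoids wrap-around in the narrower type.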


#### Fundamental Operations

| Concept                            | Description                                                  | Syntax (Library)                                                                                                                                                                                         |
| ---------------------------------- | ------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Brightness / Contrast**          | Pixel-wise linear scaling & shift                            | `adj = cv2.convertScaleAbs(img, alpha=α, beta=β)` (cv2)<br>`bright = tf.image.adjust_brightness(img_tf, delta)` (TensorFlow)<br>`contr = tf.image.adjust_contrast(img_tf, contrast_factor)` (TensorFlow) |
| **Gamma Correction**               | Non-linear mapping: Iout = 255·(Iin/255)ᵞ                    | `gamma_np = np.power(img/255.0, γ) * 255` (NumPy)<br>`img_gc = tf.image.adjust_gamma(img_tf, gamma=γ)` (TensorFlow)                                                                                      |
| **Global Thresholding**            | Binary conversion using a fixed threshold                    | `_, th = cv2.threshold(gray, t, 255, cv2.THRESH_BINARY)` (cv2)<br>`binary = tf.where(gray_tf > t, 1, 0)` (TensorFlow)                                                                                    |
| **Adaptive Thresholding**          | Local binary conversion per neighborhood                     | `th = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, blockSize, C)` (cv2)                                                                                           |
| **Mean Filter**                    | Linear neighborhood averaging                                | `mean = cv2.blur(img, (k, k))` (cv2)<br>`mean_tf = tf.nn.avg_pool2d(img_batch, ksize=k, strides=1, padding='SAME')` (TensorFlow)                                                                         |
| **Gaussian Blur**                  | Weighted linear smoothing                                    | `gblur = cv2.GaussianBlur(img, (k, k), σ)` (cv2)                                                                                                                                                         |
| **Sharpening Filter**              | Edge enhancement via convolution                             | `kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])` (NumPy)<br>`sharp = cv2.filter2D(img, -1, kernel)` (cv2)                                                                                      |
| **Median Filter**                  | Non-linear neighborhood filter for impulse noise             | `med = cv2.medianBlur(img, k)` (cv2)                                                                                                                                                                     |
| **Morphological Opening**          | Erosion → dilation to remove small objects                   | `opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)` (cv2)                                                                                                                                           |
| **Morphological Closing**          | Dilation → erosion to fill small holes                       | `closed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)` (cv2)                                                                                                                                          |
| **Histogram Equalization**         | Redistribute intensities for global contrast enhancement     | `eq = cv2.equalizeHist(gray)` (cv2)                                                                                                                                                                      |
| **DFT / FFT**                      | Convert to frequency domain                                  | `dft = cv2.dft(np.float32(gray), flags=cv2.DFT_COMPLEX_OUTPUT)` (cv2)<br>`fft = tf.signal.fft2d(tf.cast(gray_tf, tf.complex64))` (TensorFlow)                                                            |
| **Low-Pass Filtering (Freq-dom)**  | Suppress high-frequency components via frequency-domain mask | `fshift = np.fft.fftshift(dft)` (NumPy)<br>`mask[...] = 1  # central low-pass mask` (NumPy)<br>`filt = fshift * mask[:, :, None]` (NumPy)<br>`img_back = cv2.idft(np.fft.ifftshift(filt))` (cv2)         |
| **High-Pass Filtering (Freq-dom)** | Suppress low-frequency components via inverted mask          | Same as low-pass but invert mask (cv2/NumPy)                                                                                                                                                             |
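
The low-pass row leaves the mask construction open (`mask[...] = 1`). One way to fill it in is sketched below with NumPy's FFT instead of `cv2.dft`, so the spectrum is a single complex array and the mask needs no extra channel axis. The function name `lowpass_fft`, the circular mask shape, and the `radius` parameter are assumptions made for this example.

```python
import numpy as np


def lowpass_fft(gray, radius):
    """Keep only the frequencies within `radius` of the spectrum centre."""
    rows, cols = gray.shape
    # fftshift moves the zero-frequency (DC) term to the centre of the array.
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))

    # Circular mask: True inside the radius around the centre, False outside.
    y, x = np.ogrid[:rows, :cols]
    mask = (y - rows // 2) ** 2 + (x - cols // 2) ** 2 <= radius ** 2

    filtered = spectrum * mask
    # Undo the shift, invert, and drop the (numerically tiny) imaginary part.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))


# Synthetic image: flat background of 100 plus a +/-50 pixel checkerboard,
# which is the highest spatial frequency the grid can hold.
yy, xx = np.indices((32, 32))
gray = 100.0 + 50.0 * ((yy + xx) % 2 * 2 - 1)
smooth = lowpass_fft(gray, radius=4)
print(smooth.min(), smooth.max())
```

The checkerboard is removed and only the flat background remains. Inverting the mask (`~mask`) in the same function gives the high-pass variant described in the last row.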


#### Geometric Transformations

| Concept                  | Description                                                                                                  | Syntax (Library)                                                                                                                                                                                                           |
| ------------------------ | ------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Scaling**              | Enlarge or shrink an image by scaling factors along X/Y axes                                                 | `rescaled = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_LINEAR)`                                                                                                                                         |
| **Translation**          | Shift an image by (tx, ty) pixels                                                                            | `M = np.float32([[1, 0, tx], [0, 1, ty]])`<br>`translated = cv2.warpAffine(img, M, (width, height))`                                                                                                                       |
| **Rotation**             | Rotate an image by θ degrees around its center                                                               | `center = (w/2, h/2)`<br>`M = cv2.getRotationMatrix2D(center, angle, scale=1.0)`<br>`rotated = cv2.warpAffine(img, M, (w, h))` (cv2)<br>`rotated_tf = tfa.image.rotate(img_tf, angles_in_radians)` (TensorFlow Addons) |
| **Affine Transform**     | 6-parameter linear transform (preserves parallelism) mapping three source points to three destinations       | `pts1 = np.float32([[x1,y1],[x2,y2],[x3,y3]])`<br>`pts2 = np.float32([[x1',y1'],[x2',y2'],[x3',y3']])`<br>`M = cv2.getAffineTransform(pts1, pts2)`<br>`affine = cv2.warpAffine(img, M, (w, h))`                            |
| **Projective Transform** | 8-parameter perspective transform (handles vanishing points), mapping four source to four destination points | `src = np.float32([[x1,y1],[x2,y2],[x3,y3],[x4,y4]])`<br>`dst = np.float32([[x1',y1'],[x2',y2'],[x3',y3'],[x4',y4']])`<br>`M = cv2.getPerspectiveTransform(src, dst)`<br>`persp = cv2.warpPerspective(img, M, (w, h))`     |
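
To see what the 2×3 matrices in the table do to individual coordinates, the sketch below builds a rotation matrix in plain Python using the formula documented for `cv2.getRotationMatrix2D` (α = scale·cos θ, β = scale·sin θ) and applies it to two points. The helper names `rotation_matrix_2d` and `apply_affine` are hypothetical.

```python
import math


def rotation_matrix_2d(center, angle_deg, scale=1.0):
    """Build the 2x3 affine matrix for a rotation about `center`."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply_affine(M, point):
    """Map a source point (x, y) through a 2x3 affine matrix."""
    x, y = point
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


M = rotation_matrix_2d(center=(50, 50), angle_deg=90)
center_after = apply_affine(M, (50, 50))   # the centre stays fixed
corner_after = apply_affine(M, (100, 50))  # a point to the right of the centre
print(center_after, corner_after)
```

With the image origin at the top-left, the point to the right of the centre ends up above it, so a positive angle rotates counter-clockwise on screen. The translation row is the special case α = 1, β = 0 with the last column set to (tx, ty).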


#### Image Segmentation

| Concept                                        | Description                                                              | Syntax (Library)                                                                                                                                                      |
| ---------------------------------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Image Segmentation**                         | Dividing an image into meaningful regions or objects.                    | —                                                                                                                                                                     |
| **Global Thresholding**                        | Segment image by applying a single intensity cutoff.                     | `_, th = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)` (cv2)<br>`th_tf = tf.where(gray_tf > 0.5, 1.0, 0.0)` (TensorFlow)                                          |
| **Adaptive Thresholding**                      | Local threshold based on neighborhood statistics (handles uneven light). | `th_adapt = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)` (cv2)<br>`# TF: use sliding window & tf.where for custom local thresholds` |
| **Edge-based Segmentation (Canny + Contours)** | Detect edges, then trace contours to outline objects.                    | `edges = cv2.Canny(gray, 100, 200)`<br>`contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)` (cv2)                                      |
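
The adaptive thresholding row compares each pixel with a statistic of its own neighborhood. The NumPy sketch below uses the local mean (rather than a Gaussian-weighted mean) minus a constant `C`, computed with an integral image. The function name `adaptive_mean_threshold` and the edge-replicating padding are choices made for this example, and the output is not guaranteed to match `cv2.adaptiveThreshold` pixel for pixel.

```python
import numpy as np


def adaptive_mean_threshold(gray, block_size=11, C=2):
    """Binarize: a pixel becomes 255 if it exceeds (local mean - C), else 0."""
    if block_size % 2 == 0:
        raise ValueError("block_size must be odd")
    pad = block_size // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")

    # Integral image with a leading row/column of zeros, so every block sum
    # is a combination of four corner values.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = gray.shape
    k = block_size
    block_sum = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
                 - integral[k:k + h, :w] + integral[:h, :w])
    local_mean = block_sum / (k * k)
    return np.where(gray > local_mean - C, 255, 0).astype(np.uint8)


gray = np.full((7, 7), 100, dtype=np.uint8)
gray[3, 3] = 0  # one dark pixel on a flat background
binary = adaptive_mean_threshold(gray, block_size=3, C=2)
print(binary)
```

Only the dark pixel falls below its neighborhood mean, so it is the only pixel set to 0. Because the threshold follows the local mean, a slow brightness gradient across the image would not change this result the way it would with a single global cutoff.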


#### Typical Workflow

1. **Acquire & load** images (DICOM/PNG/JPEG).
2. **Preprocess**: resize, normalize, denoise, correct illumination.
3. **Transform**: apply filters, histogram equalization.
4. **Segment** regions of interest.
5. **Extract** features or feed into a deep model.
6. **Post-process**: morphological cleanup, threshold refinement.
7. **Analyze** or visualize results.

#### Practical Tips

- Always inspect a few raw images to understand noise patterns and artifacts.
- Start with simple filters (median, Gaussian) before jumping to complex methods.
- Normalize pixel values (e.g., scale to [0, 1]) when using neural networks.
- Use cross-validation and pay attention to data leakage, especially in medical imaging.
- Leverage pre-trained models and transfer learning to save time and data.


---

#### Hands-on Activity

For the assigned dataset, perform the following tasks: <br>
**Task 1** : Load all the image data, their bounding box coordinates, and filenames. <br>
**Task 2** : Create a dataframe that stores, for each image, the original filename, bounding box coordinates, class, and modified filename. <br>
**Task 3** : Create a new directory containing 4 subdirectories (one for each class). <br>
**Task 4** : For each loaded image, extract the class from the filename, store the image in the matching new subdirectory, and rename it to `img_[i].jpg`. <br>
**Task 5** : Add the necessary data about the image to the created dataframe. <br>
**Task 6** : Load and display 5 images for each class to test image processing on.

- For each image processing step, display the output to show the change made to the image.
- Display images of the same class on the same row. <br>

**Task 7** : Apply at least 5 suitable image processing techniques from those mentioned above. Explain why you selected each method and how it affected the image. (Put each image processing technique in a separate code block.)

##### Note the following:

- Where necessary, briefly explain the logic or reasoning behind a data procedure.
- Write clean code and allocate at least 1 code block for each task.


In [37]:
#library

from pathlib import Path
import shutil
import os
import pandas as pd

In [None]:
#task 1

path_to_data = Path('banana_dataset')
splits = ['train', 'valid', 'test']
image_data = []

# Go through each split (train, valid, test)
for split in splits:
    images_dir = path_to_data / split / 'images'
    labels_dir = path_to_data / split / 'labels'

    image_paths = list(images_dir.glob('*.jpg'))

    for img_path in image_paths:
        label_path = labels_dir / img_path.with_suffix('.txt').name

        # Only the first line is read, so an image with several boxes keeps just the first one
        with open(label_path, 'r') as label_file:
            line = label_file.readline()

        parts = line.strip().split()
        if len(parts) != 5:  # malformed label: report the image/label pair and skip it
            print(f"{label_path}")
            image_path_to_drop = img_path  # the image lives in images_dir, not labels_dir
            # Uncomment to delete the bad pair from disk:
            # os.remove(label_path)
            # os.remove(image_path_to_drop)
            print(image_path_to_drop)
            continue


        class_id = int(parts[0])
        x_center, y_center, width, height = map(float, parts[1:])

        image_data.append({
            'image_path': str(img_path),
            'original_filename': img_path.name,
            'class_id': class_id,
            'x_center': x_center,
            'y_center': y_center,
            'width': width,
            'height': height
        })
        


banana_dataset\train\labels\musa-acuminata-unripe-627d985c-2653-11ec-a294-d8c4975e38aa_jpg.rf.82f61c4245036cc83baeef3cd34c57d6.txt



In [44]:
#task 1 _ test
print(f"Number of images: {len(image_data)}")

Number of images: 799
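
The five values parsed per label row in Task 1 (class id, then `x_center`, `y_center`, `width`, `height` between 0 and 1) look like YOLO-style annotations; the source does not name the format, so that reading is an assumption. Under it, a box can be converted to pixel corners for cropping or drawing as sketched below. The helper name `yolo_to_corners` and the 480×360 image size are hypothetical.

```python
def yolo_to_corners(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalized centre/size box to integer pixel corners."""
    x_min = (x_center - width / 2) * img_w
    y_min = (y_center - height / 2) * img_h
    x_max = (x_center + width / 2) * img_w
    y_max = (y_center + height / 2) * img_h
    # Round, then clamp to the image so boxes touching the border stay valid.
    x_min, x_max = max(0, round(x_min)), min(img_w, round(x_max))
    y_min, y_max = max(0, round(y_min)), min(img_h, round(y_max))
    return x_min, y_min, x_max, y_max


# Assumed image size of 480x360 pixels, purely for illustration.
box = yolo_to_corners(0.5, 0.5, 0.5, 0.25, img_w=480, img_h=360)
print(box)
```

With a real image, `img_w` and `img_h` would come from the loaded array's shape, and the corners could be used directly as slice bounds: `img[y_min:y_max, x_min:x_max]`.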


In [None]:
#task 2

df = pd.DataFrame(image_data)

#df.to_csv('data/lab_04_A.csv', index=False)  # forward slashes avoid backslash-escape issues

                                   original_filename  class_id  x_center  \
0  musa-acuminata-overripe-9d459010-1d0a-11ec-89c...         0  0.498958   
1  musa-acuminata-overripe-9d6229db-1d0a-11ec-90a...         0  0.502083   
2  musa-acuminata-overripe-9d648d3b-1d0a-11ec-838...         0  0.495833   
3  musa-acuminata-overripe-9d9b5da1-1d0a-11ec-a47...         0  0.498958   
4  musa-acuminata-overripe-9dac0cb7-1d0a-11ec-83d...         0  0.498958   

   y_center     width    height splitted_to  
0  0.623611  0.997917  0.652778       train  
1  0.581944  0.895833  0.558333       train  
2  0.409722  0.895833  0.563889       train  
3  0.500000  0.997917  0.611111       train  
4  0.479167  0.997917  0.958333       train  


In [33]:
#task 2 _ display
df = pd.read_csv("data/lab_04_A.csv")  # forward slashes work on every OS and avoid escape warnings
display(df)
df.info()

Unnamed: 0,image_path,original_filename,class_id,x_center,y_center,width,height
0,banana_dataset\train\images\musa-acuminata-ove...,musa-acuminata-overripe-9d459010-1d0a-11ec-89c...,0,0.498958,0.623611,0.997917,0.652778
1,banana_dataset\train\images\musa-acuminata-ove...,musa-acuminata-overripe-9d6229db-1d0a-11ec-90a...,0,0.502083,0.581944,0.895833,0.558333
2,banana_dataset\train\images\musa-acuminata-ove...,musa-acuminata-overripe-9d648d3b-1d0a-11ec-838...,0,0.495833,0.409722,0.895833,0.563889
3,banana_dataset\train\images\musa-acuminata-ove...,musa-acuminata-overripe-9d9b5da1-1d0a-11ec-a47...,0,0.498958,0.500000,0.997917,0.611111
4,banana_dataset\train\images\musa-acuminata-ove...,musa-acuminata-overripe-9dac0cb7-1d0a-11ec-83d...,0,0.498958,0.479167,0.997917,0.958333
...,...,...,...,...,...,...,...
794,banana_dataset\test\images\musa-acuminata-over...,musa-acuminata-overripe-a00bd29a-1d0a-11ec-b8c...,0,0.498958,0.423611,0.997917,0.558333
795,banana_dataset\test\images\musa-acuminata-over...,musa-acuminata-overripe-a01c81c2-1d0a-11ec-96d...,0,0.536458,0.541667,0.906250,0.761111
796,banana_dataset\test\images\musa-acuminata-over...,musa-acuminata-overripe-a0260a6b-1d0a-11ec-888...,0,0.498958,0.427778,0.997917,0.855556
797,banana_dataset\test\images\musa-acuminata-over...,musa-acuminata-overripe-a04e8f14-1d0a-11ec-a09...,0,0.415625,0.498611,0.493750,0.997222


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 799 entries, 0 to 798
Data columns (total 7 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   image_path         799 non-null    object 
 1   original_filename  799 non-null    object 
 2   class_id           799 non-null    int64  
 3   x_center           799 non-null    float64
 4   y_center           799 non-null    float64
 5   width              799 non-null    float64
 6   height             799 non-null    float64
dtypes: float64(4), int64(1), object(2)
memory usage: 43.8+ KB


In [None]:
#task 3

processed_directory_path = 'processed_images'
os.makedirs(processed_directory_path, exist_ok=True)


class_dirs = {} #to be used in task 4 

for class_id in range(4):
    class_path = os.path.join(processed_directory_path, f'class_{class_id}')
    os.makedirs(class_path, exist_ok=True)
    class_dirs[class_id] = class_path

print("Subdirectories created:")
print(class_dirs)

Subdirectories created:
{0: 'processed_images\\class_0', 1: 'processed_images\\class_1', 2: 'processed_images\\class_2', 3: 'processed_images\\class_3'}


In [None]:
#task 4

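# Task 4 naming scheme: img_[i].jpg, where i is the row position.
# Assigned here so the column exists even if df was reloaded from the CSV.
df['modified_filename'] = [f'img_{i}.jpg' for i in range(len(df))]

# Track new paths
new_paths = []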

# Go through each row in the DataFrame
for i, row in df.iterrows():
    class_id = row['class_id']
    src_path = row['image_path']
    dst_filename = row['modified_filename']
    dst_path = os.path.join(class_dirs[class_id], dst_filename)

    # Copy the file to the new location
    shutil.copy(src_path, dst_path)

    # Store new path
    new_paths.append(dst_path)

# Add the new image path to the DataFrame
df['new_path'] = new_paths

# Preview updated DataFrame
print(df[['original_filename', 'class_id', 'modified_filename', 'new_path']].head())
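
Task 4 asks for the class to be extracted from the filename, while the cell above relies on the numeric `class_id` from the label file. A minimal sketch of the filename route follows. It assumes every file follows the `musa-acuminata-<class>-<uuid>...` pattern seen in the outputs above (only `overripe` and `unripe` are visible there), and the helper name `class_from_filename` is hypothetical.

```python
def class_from_filename(filename):
    """Return the class token from 'musa-acuminata-<class>-<uuid>...'."""
    parts = filename.split("-")
    # Fail loudly on names that do not follow the assumed pattern.
    if len(parts) < 4 or parts[:2] != ["musa", "acuminata"]:
        raise ValueError(f"Unexpected filename pattern: {filename}")
    return parts[2]


# Image name matching the label file printed in Task 1 (suffix .jpg instead of .txt).
name = ("musa-acuminata-unripe-627d985c-2653-11ec-a294-d8c4975e38aa"
        "_jpg.rf.82f61c4245036cc83baeef3cd34c57d6.jpg")
ripeness = class_from_filename(name)
print(ripeness)
```

Comparing this token with `class_id` for each row is a quick consistency check between the filenames and the label files.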

