
We start our dataset preprocessing by extracting and cropping hand gestures from the dataset explored in the previous notebook.

In [None]:
from utils.mediapipe_cropper.cropper import process_images
process_images('shared_artifacts/images/hagrid_30k', 'shared_artifacts/images/hagrid_30k_cropped')
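
The implementation of `process_images` in `utils.mediapipe_cropper` is not shown in this notebook. As a rough sketch of the cropping step only, assuming the detector returns hand landmarks as normalized `(x, y)` pairs in the range 0 to 1 (as MediaPipe Hands does), a bounding box with a fixed pixel margin could be computed as follows. The function name `landmarks_to_crop_box` and the sample landmark values are hypothetical and are not taken from the project's code.

```python
def landmarks_to_crop_box(landmarks, img_w, img_h, margin=20):
    """Return (x1, y1, x2, y2) around normalized landmarks, expanded by `margin` px.

    `landmarks` is an iterable of (x, y) pairs normalized to [0, 1].
    The box is clamped to the image borders.
    """
    # Convert normalized coordinates to pixel coordinates.
    xs = [round(x * img_w) for x, _ in landmarks]
    ys = [round(y * img_h) for _, y in landmarks]

    # Expand the tight box by the margin and clamp it to the image.
    x1 = max(0, min(xs) - margin)
    y1 = max(0, min(ys) - margin)
    x2 = min(img_w, max(xs) + margin)
    y2 = min(img_h, max(ys) + margin)
    return x1, y1, x2, y2


# Hypothetical landmarks on a 640x480 image.
box = landmarks_to_crop_box([(0.40, 0.30), (0.55, 0.60), (0.50, 0.45)], 640, 480)
print(box)  # (236, 124, 372, 308)
```

With an image loaded by OpenCV, the crop itself would then be `img[y1:y2, x1:x2]`.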

The cropped hand gesture images all have different sizes. We want to resize them to a common size that is:
1. Big enough to capture important hand-shape details
2. Small enough to train fast
3. Consistent across the dataset
    
We therefore look at the distribution of image dimensions across the dataset, pick a size close to the *median* or *mean*, and round it to a CNN-friendly size (such as 64, 96, or 128).
- *Motivation:* these sizes are either powers of 2 (64, 128, 256) or at least divisible by 32 (96, 224). This matters because standard CNN architectures stack several pooling operations, each of which typically halves the image dimensions. A size divisible by 32 can be halved five times without a remainder.
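
To make the divisibility argument concrete, here is a minimal sketch. It assumes 2×2 pooling with stride 2 and no padding, so each stage floor-divides the spatial size by 2; the helper names `pooled_sizes` and `nearest_friendly_size` are ours and only serve this illustration.

```python
def pooled_sizes(size, stages):
    """Spatial size after each 2x2, stride-2 pooling stage (floor division)."""
    sizes = [size]
    for _ in range(stages):
        size //= 2
        sizes.append(size)
    return sizes


def nearest_friendly_size(value, candidates=(64, 96, 128)):
    """Pick the candidate size closest to a measured dimension."""
    return min(candidates, key=lambda c: abs(c - value))


print(pooled_sizes(96, 5))         # [96, 48, 24, 12, 6, 3]
print(pooled_sizes(100, 5))        # [100, 50, 25, 12, 6, 3]
print(nearest_friendly_size(100))  # 96
```

With 96 px every stage halves cleanly, whereas with 100 px the step from 25 to 12 silently drops a row and a column of activations.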

In [11]:
import os
import cv2
import numpy as np

input_folder = "shared_artifacts/images/hagrid_30k_cropped"

widths = []
heights = []

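for label in os.listdir(input_folder):
    label_folder = os.path.join(input_folder, label)
    if not os.path.isdir(label_folder):
        continue  # skip stray files that are not label folders

    for f in os.listdir(label_folder):
        img = cv2.imread(os.path.join(label_folder, f))
        if img is None:
            continue  # cv2.imread returns None for unreadable or non-image files
        h, w = img.shape[:2]
        widths.append(w)
        heights.append(h)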

print(f"Mean width: {np.mean(widths):.0f} px")
print(f"Mean height: {np.mean(heights):.0f} px")
print(f"Median width: {np.median(widths)} px")
print(f"Median height: {np.median(heights)} px")

Mean width: 88 px
Mean height: 105 px
Median width: 84.0 px
Median height: 100.0 px


#### So the dataset (cropped images of "swipe" hand gestures with a 20 px margin) has:
- Width ~ 84 px (median; mean 88 px)
- Height ~ 100 px (median; mean 105 px)

This indicates that:
- The images are not square
- The aspect ratio is roughly 5:6 (84:100 = 0.84)
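
Dividing one summary width by one summary height gives only a rough ratio. As a hedged sketch, the ratio can also be computed per image and then summarized, which is less sensitive to outliers. The function name `aspect_ratio_summary` and the sample values below are made up for illustration; in the notebook the inputs would be the `widths` and `heights` lists collected above.

```python
from statistics import median


def aspect_ratio_summary(widths, heights):
    """Return (median width/height ratio, fraction of portrait images)."""
    ratios = [w / h for w, h in zip(widths, heights)]
    portrait = sum(1 for w, h in zip(widths, heights) if h > w)
    return median(ratios), portrait / len(ratios)


# Made-up sample values, not measurements from the dataset.
ratio, portrait_share = aspect_ratio_summary([80, 84, 90], [100, 100, 104])
print(f"Median aspect ratio: {ratio:.2f}")        # Median aspect ratio: 0.84
print(f"Portrait images: {portrait_share:.0%}")   # Portrait images: 100%
```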

Because the cropped images are naturally rectangular, resizing them directly to a square distorts the hands:
- 96×96 -> hands will get squashed
- 128×128 -> same distortion problem

Therefore, we decided to resize while preserving the aspect ratio, then pad to a square with black pixels.
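
As a rough illustration of this resize-then-pad step (often called letterboxing), the sketch below first computes the scaled size and the padding, then applies them with OpenCV. It is not the code in `utils.resizer.resizer`, which is not shown here. The function names `letterbox_geometry` and `resize_and_pad`, the centered placement, and the `INTER_AREA` interpolation are assumptions made for this example.

```python
def letterbox_geometry(width, height, target):
    """Return (new_w, new_h, left, top, right, bottom) for a square target.

    The longer side is scaled to `target`; the shorter side keeps the
    aspect ratio and is padded symmetrically (extra pixel goes right/bottom).
    """
    longest = max(width, height)
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_w, pad_h = target - new_w, target - new_h
    left, top = pad_w // 2, pad_h // 2
    return new_w, new_h, left, top, pad_w - left, pad_h - top


def resize_and_pad(img, target):
    """Resize an OpenCV image to fit a target x target square, padding with black."""
    import cv2  # imported here so the geometry helper has no OpenCV dependency

    h, w = img.shape[:2]
    new_w, new_h, left, top, right, bottom = letterbox_geometry(w, h, target)
    resized = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)
    # BORDER_CONSTANT with value 0 adds black pixels around the resized image.
    return cv2.copyMakeBorder(resized, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=(0, 0, 0))


# An 84x100 crop scaled into a 96x96 square.
print(letterbox_geometry(84, 100, 96))  # (81, 96, 7, 0, 8, 0)
```

In this example the hand keeps its proportions: the crop becomes 81×96 px and receives 7 and 8 black columns on the left and right.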

In [None]:
from utils.resizer.resizer import process_images

input_path = "shared_artifacts/images/hagrid_30k_cropped"
output_path = "shared_artifacts/images/hagrid_30k_resized" 

TARGET_SIZE = 96  # 96 = 3 * 32, the CNN-friendly size closest to the median dimensions

process_images(input_path, TARGET_SIZE, TARGET_SIZE, output_path)