The cropped hand gesture images we saved all have different sizes. We want to resize them to a common size that is:
1. Big enough to capture important hand-shape details
2. Small enough to train fast
3. Consistent across the dataset
    
So we looked at the distribution of image dimensions across the dataset and chose a size close to the *median* or *mean*, rounded to a CNN-friendly size (such as 64, 96, or 128).
- *Motivation:* powers of 2 and divisibility. Common input sizes are either powers of 2 (64, 128, 256) or divisible by 32 (96, 224). This matters because standard CNN architectures stack several pooling or strided layers, each of which typically halves the image dimensions, so a side length that divides evenly stays a whole number at every stage.
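
To make the divisibility argument concrete, here is a minimal sketch that halves a candidate side length repeatedly, the way a stack of 2x pooling layers would. The helper name `halving_stages` and the choice of five stages are assumptions made for illustration; real architectures differ in how many downsampling stages they use and in how they handle odd sizes.

```python
def halving_stages(size, stages=5):
    """Return the side length after each 2x downsampling.

    Stops early as soon as the current size is odd, because the next
    halving would no longer produce a whole number of pixels.
    """
    sizes = [size]
    for _ in range(stages):
        if sizes[-1] % 2 != 0:
            break
        sizes.append(sizes[-1] // 2)
    return sizes


# Compare a few candidate input sizes (100 is included as a counter-example)
for candidate in (64, 96, 100, 128, 224):
    print(candidate, "->", halving_stages(candidate))
```

Sizes such as 96 or 128 survive all five halvings, while 100 already becomes odd after two.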

In [None]:
import os
import cv2
import numpy as np

input_folder = "./mediapipe_cropper/output"

widths = []
heights = []

for filename in os.listdir(input_folder):
    img = cv2.imread(os.path.join(input_folder, filename))
    if img is None:  # cv2.imread returns None for non-image or unreadable files
        continue
    h, w = img.shape[:2]
    widths.append(w)
    heights.append(h)

print("Mean width: ", np.mean(widths), "px")
print("Mean height: ", np.mean(heights), "px")
print("Median width: ", np.median(widths), "px")
print("Median height: ", np.median(heights), "px")

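# Note: the extremes are taken per axis, so they may come from different images
print("Min width / height:", min(widths), "X", min(heights), "px")
print("Max width / height:", max(widths), "X", max(heights), "px")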

#### So the dataset (cropped images of "swipe" hand gestures with a 20 px margin) has:
- Width ~ 75 px
- Height ~ 125 px

This indicates that:
- The images are not square
- The aspect ratio is roughly 3:5 (75/125 = 0.6)

Since the cropped images are naturally rectangular, resizing them directly to square dimensions distorts the hand:
- 96×96 -> hands will get squashed
- 128×128 -> same distortion problem
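
As a rough illustration of how strong this distortion is, the sketch below computes the horizontal and vertical scale factors for a direct resize of a typical 75×125 crop. The helper name `stretch_factors` is hypothetical, and the 75×125 input is simply the approximate dataset size reported above.

```python
def stretch_factors(w, h, size):
    """Return (horizontal, vertical) scale factors for a direct resize to size x size."""
    return size / w, size / h


for size in (96, 128):
    sx, sy = stretch_factors(75, 125, size)
    # sx / sy is how much wider the hand becomes relative to its height
    print(f"{size}x{size}: width x{sx:.2f}, height x{sy:.2f}, relative stretch x{sx / sy:.2f}")
```

The relative stretch equals 125/75 regardless of the target size, so picking a different square size does not reduce the distortion.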

Therefore, we decided to resize while preserving the aspect ratio, then pad to a square with black pixels.
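
Before touching any pixels, the geometry of this resize-then-pad step can be worked out with plain arithmetic. The sketch below is an illustration only: the helper name `letterbox_geometry` is hypothetical, and it uses integer floor division, which matches truncating the scaled size apart from floating-point rounding.

```python
def letterbox_geometry(w, h, size):
    """Compute the resized dimensions and the padding for a size x size canvas.

    Returns (new_w, new_h, (top, bottom, left, right)).
    """
    longest = max(w, h)
    # Scale so the longer side becomes `size`; keep at least 1 px on the short side
    new_w = max(1, w * size // longest)
    new_h = max(1, h * size // longest)

    # Split the leftover space evenly; any odd pixel goes to the bottom/right
    top = (size - new_h) // 2
    bottom = size - new_h - top
    left = (size - new_w) // 2
    right = size - new_w - left
    return new_w, new_h, (top, bottom, left, right)


# A typical 75x125 crop placed on a 128x128 canvas
print(letterbox_geometry(75, 125, 128))  # (76, 128, (0, 0, 26, 26))
```

For a typical crop, the hand fills the full height and gets 26 px of black padding on each side.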

In [None]:
import argparse
import os
import sys

import cv2

# Simulate command line arguments (needed when running inside a notebook)
sys.argv = ['script.py', '-i', './mediapipe_cropper/output/', '-o', './resizer/output/']
parser = argparse.ArgumentParser()

# Usage: script.py -i INPUT -o OUTPUT
parser.add_argument("-i", "--input", dest="input", default="./input/", help="Path to input folder")
parser.add_argument("-o", "--output", dest="output", default="./output/", help="Path to output folder")

args = parser.parse_args()

input_path = args.input
output_path = args.output
TARGET_SIZE = 128

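def resize_and_save(img, save_file_path, size):
    """Resize img so its longer side equals size, pad to size x size with black, and save."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    # Keep at least 1 px per side so very elongated images do not break cv2.resize
    new_w, new_h = max(1, int(w * scale)), max(1, int(h * scale))
    resized = cv2.resize(img, (new_w, new_h))

    # Compute padding so the resized image is centered in the square
    top = (size - new_h) // 2
    bottom = size - new_h - top
    left = (size - new_w) // 2
    right = size - new_w - left

    padded = cv2.copyMakeBorder(
        resized, top, bottom, left, right,
        cv2.BORDER_CONSTANT, value=[0, 0, 0]
    )

    # Save the padded image
    cv2.imwrite(save_file_path, padded)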


for subdir, _dirs, files in os.walk(input_path):
    for name in files:
        # Skip anything that is not an image file
        if not name.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        # Path relative to input_path, so the folder layout is mirrored in output_path
        rel_path = os.path.relpath(os.path.join(subdir, name), input_path)

        # Load the input image
        input_file_path = os.path.join(input_path, rel_path)
        image = cv2.imread(input_file_path)
        if image is None:  # unreadable or corrupt file
            print("skipping unreadable file", rel_path)
            continue

        # Make sure the destination folder exists, then resize and save the image
        output_file_path = os.path.join(output_path, rel_path)
        os.makedirs(os.path.dirname(output_file_path), exist_ok=True)
        print("processing ", rel_path, "...")
        resize_and_save(image, output_file_path, TARGET_SIZE)
        
print("Done resizing")