Once the cropped hand gesture images are saved, they all have different sizes. We want a single target size that is:
1. Big enough to capture important hand-shape details
2. Small enough to train fast
3. Consistent across the dataset
    
So we chose to look at the distribution of all our image dimensions, pick a size close to the *median* or *mean*, and then round it to a CNN-friendly size (like 64, 96, 128).
- *Motivation:* powers of 2 and divisibility. Many common input sizes are powers of 2 (64, 128, 256) or at least divisible by 32 (96, 224). This matters because standard CNN architectures stack several pooling stages that typically halve the image dimensions at each stage, so a size divisible by 32 can be halved five times without dropping a pixel.
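
The halving argument can be checked with a few lines of plain Python. This is only a sketch: it assumes 2×2 pooling with stride 2 and no padding, five pooling stages, and a hypothetical candidate list of sizes. The helper names (`halvings`, `halves_cleanly`, `nearest_friendly`) are made up for illustration and are not part of the notebook's pipeline.

```python
def halvings(size, stages=5):
    """Side length after each 2x2, stride-2 pooling stage (floor division)."""
    sizes = [size]
    for _ in range(stages):
        sizes.append(sizes[-1] // 2)
    return sizes


def halves_cleanly(size, stages=5):
    """True if size can be halved `stages` times without dropping a pixel."""
    return size % (2 ** stages) == 0


def nearest_friendly(value, candidates=(64, 96, 128, 160, 224, 256)):
    """Pick the candidate size closest to a measured dimension.

    The candidate list is an assumption; every entry is divisible by 32.
    """
    return min(candidates, key=lambda c: abs(c - value))


if __name__ == "__main__":
    for s in (96, 100, 128):
        print(s, halvings(s), halves_cleanly(s))
    print(nearest_friendly(124))
```

A size such as 100 passes through an odd intermediate value (25), where floor division silently discards a row and a column, while 96 and 128 halve evenly at every stage.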

In [6]:
import os
import cv2
import numpy as np

input_folder = "./mediapipe_cropper/gestures_processed/train_point_up/"

widths = []
heights = []

for filename in os.listdir(input_folder):
    img = cv2.imread(os.path.join(input_folder, filename))
    if img is None:  # cv2.imread returns None for files it cannot decode
        continue
    h, w = img.shape[:2]
    widths.append(w)
    heights.append(h)

print("Mean width: ", np.mean(widths), "px")
print("Mean height: ", np.mean(heights), "px")
print("Median width: ", np.median(widths), "px")
print("Median height: ", np.median(heights), "px")

# Note: min/max are taken per axis, so each pair need not come from a single image
print("Min size:", min(widths), "X", min(heights), "px")
print("Max size:", max(widths), "X", max(heights), "px")

Mean width:  75.06502242152466 px
Mean height:  126.26083707025411 px
Median width:  74.0 px
Median height:  124.0 px
Min size: 44 X 60 px
Max size: 161 X 304 px


#### So the dataset has:
- Width ~ 75 px
- Height ~ 125 px

Which indicates that:
- The images are not square
- The aspect ratio (width:height) is roughly 3:5 (75/125 = 0.6)

Since the cropped images are naturally rectangular, resizing them directly to square dimensions would distort them:
- 96×96 -> hands will get squashed
- 128×128 -> same distortion problem

Therefore, we decided to resize while preserving the aspect ratio, then pad to a square (adding black pixels).
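
To make the resize-then-pad arithmetic concrete, here is a small sketch that computes only the geometry, without touching any image. The function names `letterbox_geometry` and `square_resize_distortion` are hypothetical, and rounding to the nearest pixel is a choice made for this sketch.

```python
def letterbox_geometry(w, h, size):
    """Return (new_w, new_h, top, bottom, left, right).

    The longer side is scaled to `size`, the shorter side keeps the
    aspect ratio, and the remaining space is split into centered padding.
    """
    longest = max(w, h)
    new_w = max(1, round(w * size / longest))
    new_h = max(1, round(h * size / longest))
    top = (size - new_h) // 2
    bottom = size - new_h - top
    left = (size - new_w) // 2
    right = size - new_w - left
    return new_w, new_h, top, bottom, left, right


def square_resize_distortion(w, h):
    """Horizontal stretch relative to vertical stretch when an image of
    w x h is resized directly to a square (1.0 means no distortion)."""
    return h / w


if __name__ == "__main__":
    # A typical 75 x 125 crop placed on a 128 x 128 canvas
    print(letterbox_geometry(75, 125, 128))
    print(square_resize_distortion(75, 125))
```

For a typical 75×125 crop, the hand is scaled to 77×128 and receives 25 and 26 columns of padding on the left and right. Resizing the same crop directly to a square would instead stretch it horizontally about 1.67 times more than vertically.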

In [11]:
import argparse 
import sys

parser = argparse.ArgumentParser()

# -i INPUT -o OUTPUT
parser.add_argument("-i", "--input", dest="input", default="./input/", help="Path to input folder")
parser.add_argument("-o", "--output", dest="output", default="./output/", help="Path to output folder")

# A notebook has no real command line, so pass the arguments explicitly
args = parser.parse_args([
    "-i", "./mediapipe_cropper/gestures_processed/train_point_up/",
    "-o", "./resizer/output1/",
])

input_path = args.input
output_path = args.output
TARGET_SIZE = 128

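def resize_and_save(img, save_file_path, size):
    """Scale img so its longer side equals size, pad to size x size with black, and save."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    # round() avoids losing a pixel to float truncation; clamp so no side becomes 0
    new_w = max(1, round(w * scale))
    new_h = max(1, round(h * scale))
    resized = cv2.resize(img, (new_w, new_h))

    # compute padding that centers the resized image on the square canvas
    top = (size - new_h) // 2
    bottom = size - new_h - top
    left = (size - new_w) // 2
    right = size - new_w - left

    padded = cv2.copyMakeBorder(
        resized, top, bottom, left, right,
        cv2.BORDER_CONSTANT, value=[0, 0, 0]
    )

    # Save the padded image
    cv2.imwrite(save_file_path, padded)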


for subdir, dirs, files in os.walk(input_path, topdown=True):
    # Mirror the current input directory under the output folder
    rel_dir = os.path.relpath(subdir, input_path)
    os.makedirs(os.path.normpath(os.path.join(output_path, rel_dir)), exist_ok=True)

    for name in files:
        if not name.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        file = os.path.relpath(os.path.join(subdir, name), input_path)

        # Load the input image
        input_file_path = os.path.join(input_path, file)
        image = cv2.imread(input_file_path)
        if image is None:  # unreadable or corrupt file
            print("skipping ", file, "(could not be read)")
            continue

        # Resize and save the image
        output_file_path = os.path.join(output_path, file)
        print("processing ", file, "...")
        resize_and_save(image, output_file_path, TARGET_SIZE)

print("Done resizing")

processing  002328c4-90bc-4720-b0ea-af4e06656073_left.jpg ...
processing  003e5a9c-d03a-4ff7-8ad6-645ce884af12_right.jpg ...
processing  006ef058-c8a8-41d5-9c44-cdb0cd47f0e4_left.jpg ...
processing  009b6f29-321b-4285-ba5a-c4a1d46876c3_right.jpg ...
processing  0120eeb1-619a-4fcc-857c-8efb2eb932ce_right.jpg ...
processing  014a3470-5f58-4f9c-98f6-2ab2536ddb2d_right.jpg ...
processing  0171c662-c519-4f90-9bae-2a3f4f9dcbe1_right.jpg ...
processing  019b2626-25cc-4b72-851b-f32f95e70624_right.jpg ...
processing  01b0be8e-5de0-4f10-9192-67431e91ffad_left.jpg ...
processing  01d1b305-46cf-44d0-8d78-fffffb90c515_left.jpg ...
processing  024d4b1d-6ffa-4306-befb-41be9bc09d18_right.jpg ...
processing  028c1802-9366-44ca-9e97-8cd222aa278b_right.jpg ...
processing  02ce1c69-4687-4fe1-984a-47d7a9fd5841_right.jpg ...
processing  02f22fe3-a7e1-4668-a316-f6bde6b9ca9d_right.jpg ...
processing  032214b5-ce3e-43fa-be03-0eaa6bcee7f1_left.jpg ...
processing  0349068e-7917-4ef2-9d15-9007388ec99b_right.jpg .