# ASL Fingerspelling Recognition

In this notebook a brief setup guide along with the processes on how data was collected, preprocesses and trained will be discussed

## Setup (Windows Only)

For running and testing, please install the below libraries:

1. For data collection and preprocessing:
(Open-CV, cvzone, NumPy, MediaPipe, Pillow)

In [None]:
%pip install opencv-python cvzone numpy mediapipe Pillow 

2. For ML models and Training: (Tensorflow, Tensorflow-Hub, Matplotlib, Seaborn)

Please install on the same enviroment. (Required for training)

In [3]:
%pip install tensorflow tensorflow-hub matplotlib seaborn scikit-learn

Collecting tensorflow-hub
  Using cached tensorflow_hub-0.16.1-py2.py3-none-any.whl.metadata (1.3 kB)
Collecting seaborn
  Using cached seaborn-0.13.2-py3-none-any.whl.metadata (5.4 kB)
Collecting libclang>=13.0.0 (from tensorflow)
  Using cached libclang-18.1.1-py2.py3-none-win_amd64.whl.metadata (5.3 kB)
Collecting protobuf<3.20,>=3.9.2 (from tensorflow)
  Using cached protobuf-3.19.6-cp39-cp39-win_amd64.whl.metadata (807 bytes)
Collecting tensorflow-io-gcs-filesystem>=0.23.1 (from tensorflow)
  Downloading tensorflow_io_gcs_filesystem-0.31.0-cp39-cp39-win_amd64.whl.metadata (14 kB)
Collecting tf-keras>=2.14.1 (from tensorflow-hub)
  Using cached tf_keras-2.18.0-py3-none-any.whl.metadata (1.6 kB)
Collecting tensorflow
  Downloading tensorflow-2.18.0-cp39-cp39-win_amd64.whl.metadata (3.3 kB)
Collecting tensorflow-intel==2.18.0 (from tensorflow)
  Downloading tensorflow_intel-2.18.0-cp39-cp39-win_amd64.whl.metadata (4.9 kB)
INFO: pip is looking at multiple versions of tensorflow-intel 

  You can safely remove it manually.


3. For Real-Time Recognition: [Same as collection and preprocessing] (Open-CV, cvzone, NumPy, MediaPipe, Pillow, pyttsx3)

In [None]:
# !! Skip if installed using step-1
%pip install opencv-python cvzone numpy mediapipe Pillow pyttsx3

In [None]:
%pip install pyttsx3

NOTE:

1. There may protobuf based errors when trying to run.
Please install/re-install lower version of protobuf if faced (3.20.x or lower)

2. There may be mediapipe based errors, please reinstall medipipe suitable for above protobuf version.

## Custom Collection Pipeline

For the process of collecting images of hand along with basic preprocessing while collection.

Run the below cell after installing relevant dependancies

### Note:

IMPORTANT: WebCam required

1. Below script collects 600, 500x500 images for each alphabets/class (A to Y, no Z) 
2. To start collection press S on the opencv frame (webcam frame)
3. Fill in the baseFolder variable with the repository location to save the classes and images.
4. After collecting images for the class, ENTER KEY prompt will pop up in VS-code search-bar/top-panel. Press enter there which will start process for next class
5. Repeat steps 2 and 3 for subsequent collections
6. After the collection of final class the program will exit/halt.
7. The images can be found in the baseFolder repository

### Preprocessings:

1. Cropping hands from complete frame
2. Adding hand landmarks to cropped frame
3. Converting cropped frame to binary image (black and white pixels only)

In [None]:
import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import os
import mediapipe as mp

#--------------------
baseFolder = "C:/path/to/your/output_folder"  # <-- change this
#--------------------


# Initialize webcam
cap = cv2.VideoCapture(0)

# Initialize hand detector
detector = HandDetector(maxHands=1)

# Constants
imgSize = 500
# List of letters A-Y (adjust as needed)
letters = [chr(i) for i in range(ord('A'), ord('Y') + 1)]
maxImages = 600          # Total images to capture per class
paddingFactor = 0.45     # Padding percentage

mp_hands = mp.solutions.hands

# Function to process and resize images (canvas will match input channels)
def process_and_resize(imgCrop, aspectRatio, imgSize):
    channels = 1 if len(imgCrop.shape) == 2 else imgCrop.shape[2]
    imgWhite = np.ones((imgSize, imgSize, channels), np.uint8) * 0
    try:
        if aspectRatio > 1:
            # If height > width:
            k = imgSize / imgCrop.shape[0]
            wCal = math.ceil(k * imgCrop.shape[1])
            imgResize = cv2.resize(imgCrop, (wCal, imgSize))
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
        else:
            # If width >= height:
            k = imgSize / imgCrop.shape[1]
            hCal = math.ceil(k * imgCrop.shape[0])
            imgResize = cv2.resize(imgCrop, (imgSize, hCal))
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize
    except Exception as e:
        print(f"Error during image processing: {e}")
        return None
    return imgWhite

# Function to detect skin using YCrCb thresholds (returns a binary mask)
def detect_skin(frame):
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    lower_skin = np.array([0, 133, 77], dtype=np.uint8)
    upper_skin = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower_skin, upper_skin)
    
    # Clean up noise
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    
    # (Optional) clean-up by drawing filled contours
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour_mask = np.zeros_like(mask)
    cv2.drawContours(contour_mask, contours, -1, 255, thickness=cv2.FILLED)
    mask = cv2.bitwise_and(mask, contour_mask)
    
    return mask

# Main loop for each letter
for className in letters:
    print(f"Starting collection for: {className}")
    folder = os.path.join(baseFolder, className)
    os.makedirs(folder, exist_ok=True)

    counter, collecting = 0, False

    while counter < maxImages:
        success, img = cap.read()
        if not success:
            print("Camera access failed.")
            break

        # Detect hand on the full image (only once)
        hands, _ = detector.findHands(img, draw=False)
        if hands:
            # Use the first detected hand
            hand = hands[0]
            bbox = hand['bbox']       # [x, y, w, h]
            lm_list = hand['lmList']    # list of landmarks in full-image coordinates
            x, y, w, h = bbox

            # Calculate padding (based on hand size)
            xPad = int(w * paddingFactor)
            yPad = int(h * paddingFactor)

            # Compute crop boundaries (make sure they’re within image bounds)
            crop_x1 = max(0, x - xPad)
            crop_y1 = max(0, y - yPad)
            crop_x2 = min(x + w + xPad, img.shape[1])
            crop_y2 = min(y + h + yPad, img.shape[0])
            imgCrop = img[crop_y1:crop_y2, crop_x1:crop_x2]

            if imgCrop.size > 0:
                # ----- STEP 1: Draw landmarks on the cropped image using adjusted coordinates -----
                # Create a copy of the crop on which we will draw the landmarks
                imgCrop_landmarked = imgCrop.copy()
                # Adjust each landmark from full-image coordinates to crop coordinates
                for lm in lm_list:
                    adj_x = lm[0] - crop_x1
                    adj_y = lm[1] - crop_y1
                    cv2.circle(imgCrop_landmarked, (adj_x, adj_y), 4, (0, 0, 255), -1)
                # Optionally, also draw the connections between landmarks:
                for connection in mp.solutions.hands.HAND_CONNECTIONS:
                    pt1 = lm_list[connection[0]]
                    pt2 = lm_list[connection[1]]
                    pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
                    pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
                    cv2.line(imgCrop_landmarked, pt1_adjusted, pt2_adjusted, (0, 0, 255), 2)

                # ----- STEP 2: Convert the landmarked crop to a binary image -----
                # First, get a binary skin mask from the original crop (without landmarks)
                binaryMask = detect_skin(imgCrop)
                # Create a blank (black) image and fill white where skin is detected
                binary_result = np.zeros_like(imgCrop)
                binary_result[binaryMask > 0] = [255, 255, 255]
                # Now overlay the landmarks (drawn in black) onto the binary image.
                # (Using the same adjusted coordinates from above)
                for lm in lm_list:
                    adj_x = lm[0] - crop_x1
                    adj_y = lm[1] - crop_y1
                    cv2.circle(binary_result, (adj_x, adj_y), 4, (0, 0, 0), -1)
                for connection in mp.solutions.hands.HAND_CONNECTIONS:
                    pt1 = lm_list[connection[0]]
                    pt2 = lm_list[connection[1]]
                    pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
                    pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
                    cv2.line(binary_result, pt1_adjusted, pt2_adjusted, (0, 0, 0), 2)

                # ----- STEP 3: Resize for saving/visualization -----
                # Use the crop’s aspect ratio for correct resizing. (You can also use h/w from bbox.)
                aspectRatio = (crop_y2 - crop_y1) / (crop_x2 - crop_x1)
                imgWhite = process_and_resize(binary_result, aspectRatio, imgSize)
                if imgWhite is not None:
                    cv2.imshow("Processed Binary Image", imgWhite)
                    if collecting:
                        counter += 1
                        savePath = os.path.join(folder, f"{className.lower()}_{counter}.jpg")
                        cv2.imwrite(savePath, imgWhite)
                        print(f"Saved {counter}/{maxImages} images for {className}")

        # Show the original live feed
        cv2.imshow("Live Feed with Landmarks", img)
        key = cv2.waitKey(1)
        if key == ord('s'):
            collecting = True
        if key == ord('p'):
            collecting = False

    print(f"Completed collection for {className}")
    input("Press Enter for next class.")

cap.release()
cv2.destroyAllWindows()


## Other Dataset Basic Preprocessing Pipeline

For the basic preprocessing of hand images.

Run the below cell after installing relevant dependancies

### Note:

1. Enter input folder 
2. Enter outPut folder for saving in the repository

### Preprocessings:

1. Adding hand landmarks to cropped frame
2. Converting cropped frame to binary image (black and white pixels only)

In [None]:
import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import os
import mediapipe as mp

# ----------------------------
# Folder containing subfolders for each class (e.g., A, B, C, …)
inputFolder = "C:/path/to/your/input_folder"    # <-- change this
# Folder where the processed images will be saved
outputFolder = "C:/path/to/your/output_folder"  # <-- change this
# ----------------------------


# ----------------------------
# Configuration and Constants
# ----------------------------
imgSize = 500  # final output image will be imgSize x imgSize
os.makedirs(outputFolder, exist_ok=True)
# Initialize the hand detector (using CVZone)
detector = HandDetector(maxHands=1)
# For drawing hand connections
mp_hands = mp.solutions.hands

# ----------------------------
# Utility Functions
# ----------------------------
def process_and_resize(imgInput, aspectRatio, imgSize):
    """
    Resize an image to fit inside a square canvas while preserving aspect ratio.
    """
    channels = 1 if len(imgInput.shape) == 2 else imgInput.shape[2]
    # Create a blank (black) square image
    imgWhite = np.ones((imgSize, imgSize, channels), np.uint8) * 0
    try:
        if aspectRatio > 1:
            # If the image is taller than wide:
            k = imgSize / imgInput.shape[0]
            wCal = math.ceil(k * imgInput.shape[1])
            imgResize = cv2.resize(imgInput, (wCal, imgSize))
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
        else:
            # If the image is wider than tall:
            k = imgSize / imgInput.shape[1]
            hCal = math.ceil(k * imgInput.shape[0])
            imgResize = cv2.resize(imgInput, (imgSize, hCal))
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize
    except Exception as e:
        print(f"Error during image processing: {e}")
        return None
    return imgWhite

def detect_skin(frame):
    """
    Detect skin regions using YCrCb color space thresholds and return a binary mask.
    """
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    lower_skin = np.array([0, 133, 77], dtype=np.uint8)
    upper_skin = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower_skin, upper_skin)
    
    # Clean up noise using morphological operations
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    
    # (Optional) Draw filled contours to further clean up the mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour_mask = np.zeros_like(mask)
    cv2.drawContours(contour_mask, contours, -1, 255, thickness=cv2.FILLED)
    mask = cv2.bitwise_and(mask, contour_mask)
    
    return mask

# ----------------------------
# Main Processing Loop
# ----------------------------
# Iterate over each class folder in the input folder.
for className in os.listdir(inputFolder):
    classPath = os.path.join(inputFolder, className)
    if not os.path.isdir(classPath):
        continue

    print(f"Processing class: {className}")
    # Create corresponding output folder for this class
    outClassFolder = os.path.join(outputFolder, className)
    os.makedirs(outClassFolder, exist_ok=True)

    # Process each image file in the class folder
    for imgFile in os.listdir(classPath):
        imgPath = os.path.join(classPath, imgFile)
        img = cv2.imread(imgPath)
        if img is None:
            print(f"Failed to read image: {imgPath}")
            continue

        # --- Hand Landmark Detection ---
        hands, _ = detector.findHands(img, draw=False)
        if hands:
            # Use the first detected hand
            hand = hands[0]
            lm_list = hand['lmList']

            # --- Step 1: Draw Landmarks on a Copy of the Full Image ---
            # (Since the images are provided already, we work on the full image.)
            img_landmarked = img.copy()
            for lm in lm_list:
                cv2.circle(img_landmarked, (lm[0], lm[1]), 4, (0, 0, 255), -1)
            for connection in mp.solutions.hands.HAND_CONNECTIONS:
                pt1 = lm_list[connection[0]]
                pt2 = lm_list[connection[1]]
                cv2.line(img_landmarked, (pt1[0], pt1[1]), (pt2[0], pt2[1]), (0, 0, 255), 2)

            # --- Step 2: Create a Binary Image Using Skin Detection ---
            binaryMask = detect_skin(img)
            # Create a blank image and fill white where skin is detected
            binary_result = np.zeros_like(img)
            binary_result[binaryMask > 0] = [255, 255, 255]
            # Overlay the landmarks on the binary image (drawn in black)
            for lm in lm_list:
                cv2.circle(binary_result, (lm[0], lm[1]), 4, (0, 0, 0), -1)
            for connection in mp.solutions.hands.HAND_CONNECTIONS:
                pt1 = lm_list[connection[0]]
                pt2 = lm_list[connection[1]]
                cv2.line(binary_result, (pt1[0], pt1[1]), (pt2[0], pt2[1]), (0, 0, 0), 2)

            # --- Step 3: Resize for Consistent Output ---
            aspectRatio = img.shape[0] / img.shape[1]
            imgWhite = process_and_resize(binary_result, aspectRatio, imgSize)
            if imgWhite is not None:
                # Save the processed image to the output folder.
                outPath = os.path.join(outClassFolder, imgFile)
                cv2.imwrite(outPath, imgWhite)
                print(f"Processed and saved: {outPath}")
                # (Optional) Display the processed image.
                cv2.imshow("Processed Image", imgWhite)
                cv2.waitKey(1)
        else:
            print(f"No hand detected in image: {imgPath}")

cv2.destroyAllWindows()


## Data Augmentaition Pipeline

We further preprocess the images by augmenting them based on rotations and horizontal flips

Run the below cell after installing relevant dependancies

### Note:

1. Enter the folder location for the preprocessed dataset
2. Enter the output location to save the augmented dataset
3. There are more options for augmentation, comment or add more as needed

In [None]:
input_folder = "C:/path/to/your/input_folder"  
output_folder = "C:/path/to/your/output_folder"  
os.makedirs(output_folder, exist_ok=True)

#### Augmentation Functions

In [None]:
import os
import random
from PIL import Image, ImageEnhance, ImageOps
import numpy as np

# Define augmentation functions

# Rotate image by a random angle between -30 to 30 degrees
def random_rotation(image):
    angle = random.uniform(-30, 30)  # Rotate between -30 to 30 degrees
    return image.rotate(angle)

# Flip image horizontally with a 50% chance
def random_flip(image):
    if random.choice([True, False]):
        return ImageOps.mirror(image)
    return image

# Adjust brightness by a random factor between 0.7 and 1.3
def random_brightness(image):
    enhancer = ImageEnhance.Brightness(image)
    factor = random.uniform(0.7, 1.3)  # Brightness factor
    return enhancer.enhance(factor)

# Adjust contrast by a random factor between 0.7 and 1.3
def random_contrast(image):
    enhancer = ImageEnhance.Contrast(image)
    factor = random.uniform(0.7, 1.3)  # Contrast factor
    return enhancer.enhance(factor)

# Add random noise to the image
def add_random_noise(image):
    np_image = np.array(image)
    noise = np.random.normal(0, 25, np_image.shape).astype(np.int16)
    noisy_image = np.clip(np_image + noise, 0, 255).astype(np.uint8)
    return Image.fromarray(noisy_image)

# Augment an image using a combination of random transformations
# Comment out or add more functions as needed
def augment_image(image):
    image = random_rotation(image)
    image = random_flip(image)
    # image = random_brightness(image)
    # image = random_contrast(image)
    # image = add_random_noise(image)
    return image


#### Applying Augmentations

In [None]:
# To apply the augmentation to all images in the input folder and save them to the output folder
# Iterate over all subfolders and images
for subdir, _, files in os.walk(input_folder):
    relative_path = os.path.relpath(subdir, input_folder)
    output_subdir = os.path.join(output_folder, relative_path)
    os.makedirs(output_subdir, exist_ok=True)

    for file in files:
        if file.lower().endswith(('png', 'jpg', 'jpeg', 'bmp', 'tiff')):
            input_path = os.path.join(subdir, file)
            output_path = os.path.join(output_subdir, file)

            try:
                with Image.open(input_path) as img:
                    img = img.convert("L")  # Ensure greyscale (black and white)
                    augmented_img = augment_image(img)
                    augmented_img.save(output_path)
            except Exception as e:
                print(f"Error processing {input_path}: {e}")

print("Data augmentation completed!")

### Combining Augmented Data With Preprocessed Dataset

Creating a new folder for combined data

### Note:

1. Enter the preprocessed dataset location
2. Enter the augmented dataset location
3. Enter the new combined datasets save location

In [None]:
dataset1 = "C:/path/to/your/input_folder"  
dataset2 = "C:/path/to/your/input_folder"  
output_dataset = "C:/path/to/your/output_folder"  

In [None]:
import os
import shutil

# Create the output directory if it doesn't exist
os.makedirs(output_dataset, exist_ok=True)

# Function to merge datasets with renaming
def merge_datasets(source_dir, target_dir, suffix=""):
    for class_name in os.listdir(source_dir):
        source_class_path = os.path.join(source_dir, class_name)
        target_class_path = os.path.join(target_dir, class_name)
        
        if os.path.isdir(source_class_path):
            # Create the class folder in the target if it doesn't exist
            if not os.path.exists(target_class_path):
                os.makedirs(target_class_path)
            
            for file_name in os.listdir(source_class_path):
                source_file_path = os.path.join(source_class_path, file_name)
                # Add the specified suffix to the file name
                base_name, ext = os.path.splitext(file_name)
                file_name = f"{base_name}{suffix}{ext}"
                target_file_path = os.path.join(target_class_path, file_name)
                
                # Copy the file to the target directory
                shutil.copy2(source_file_path, target_file_path)

# Merge the main dataset
merge_datasets(dataset1, output_dataset, suffix="_black")

# Merge the augmented dataset with "_AUG" renaming
merge_datasets(dataset2, output_dataset, suffix="_white")


print(f"Datasets merged into: {output_dataset}")

## Training using ML Models

In this pipeline the combined dataset will be used to train the ML models

### Note:

Run the below cell which will lead to a UI for training models. 
1. Enter your dataset location
2. Selct the model to train
3. Fill in or keep the default trining parameters
4. Check or uncheck cross validation and write number of folds based on requirements
5. Each epoch will show the training status in UI
6. At end the results will be saved in modelName_RESULTS FOLDER and Models in TrainedBinary2Model

### Warning: 

- Training is resource costly! 
- Ensure that you have right setup before training. 
- Process can take more than a day if trained on CPU (varies by model). 
- Highly recommended to train using GPU if Available
- Recommend to use the already trained models in the repository

#### Run Below Code to Open Training Panel:

In [4]:
%run TrainerV3_2.py

Available GPU devices: []


## Real-Time Recognition

In this section real-time recognition will be tested out based on the predictions from trained model. 

### Note:

IMPORTANT: WebCam required

1. Enter the models location before running
2. The Ensure that the lighting conditions are ideal. 
3. Try to perform in front of dark background if possible, or near non reflective walls
4. Be thourough with ASL fingerspelling signs in order for proper classification

In [None]:
from cvzone.ClassificationModule import Classifier

classifier = Classifier("C:/Users/User/OneDrive/Documents/SignLanguageApp/TrainedBinary2Model/MobileNetV2_model.h5")

#### WebCam access and Processing

In [None]:
import cv2
import mediapipe as mp
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math


def detect_skin(frame):
    # Convert to YCrCb and equalize the luminance channel
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    y_channel = ycrcb[:, :, 0]
    y_eq = cv2.equalizeHist(y_channel)
    ycrcb[:, :, 0] = y_eq

    # Adjusted thresholds might be needed after equalization.
    lower_skin = np.array([0, 133, 77], dtype=np.uint8)
    upper_skin = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower_skin, upper_skin)
    
    # Noise reduction using morphological operations
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    
    # Optionally, keep only the largest contour (if needed)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour_mask = np.zeros_like(mask)
    if contours:
        cv2.drawContours(contour_mask, contours, -1, 255, thickness=cv2.FILLED)
    mask = cv2.bitwise_and(mask, contour_mask)
    
    return mask


# Initialize camera, detector
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)


offset = 45
imgSize = 250
labels = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y"]

# Use Mediapipe’s hand connections to draw lines between landmarks.
mp_hands = mp.solutions.hands
hand_connections = mp_hands.HAND_CONNECTIONS


while True:
    success, img = cap.read()
    if not success:
        break
    imgOutput = img.copy()
    hands, img = detector.findHands(img, draw=False)
    
    if hands:
        hand = hands[0]
        x, y, w, h = hand['bbox']
        y1, y2 = max(0, y - offset), min(img.shape[0], y + h + offset)
        x1, x2 = max(0, x - offset), min(img.shape[1], x + w + offset)
        imgCrop = img[y1:y2, x1:x2]
        
        if imgCrop.shape[0] > 0 and imgCrop.shape[1] > 0:
            # Draw landmarks on a copy of the cropped image (for visualization)
            imgCrop_landmarked = imgCrop.copy()
            if 'lmList' in hand:
                lm_list = hand['lmList']
                for lm in lm_list:
                    cv2.circle(imgCrop_landmarked, (lm[0] - x1, lm[1] - y1), 4, (0, 0, 255), -1)
                for connection in mp_hands.HAND_CONNECTIONS:
                    if connection[0] < len(lm_list) and connection[1] < len(lm_list):
                        pt1 = (lm_list[connection[0]][0] - x1, lm_list[connection[0]][1] - y1)
                        pt2 = (lm_list[connection[1]][0] - x1, lm_list[connection[1]][1] - y1)
                        cv2.line(imgCrop_landmarked, pt1, pt2, (0, 0, 255), 2)
            
            # Process the cropped image to create a binary image
            binaryMask = detect_skin(imgCrop)
            # Create a black background and set the hand area to white:
            binary_result = np.zeros_like(imgCrop)
            binary_result[binaryMask > 0] = [255, 255, 255]
            
            # Overlay the landmarks (black) on the binary image:
            if 'lmList' in hand:
                for lm in lm_list:
                    cv2.circle(binary_result, (lm[0] - x1, lm[1] - y1), 4, (0, 0, 0), -1)
                for connection in mp_hands.HAND_CONNECTIONS:
                    pt1 = (lm_list[connection[0]][0] - x1, lm_list[connection[0]][1] - y1)
                    pt2 = (lm_list[connection[1]][0] - x1, lm_list[connection[1]][1] - y1)
                    cv2.line(binary_result, pt1, pt2, (0, 0, 0), 2)
            
            # Resize the binary_result to a fixed size (e.g., 250x250) while preserving the aspect ratio
            aspectRatio = h / w
            imgWhite = np.ones((imgSize, imgSize), np.uint8) * 0
            if aspectRatio > 1:
                k = imgSize / h
                wCal = math.ceil(k * w)
                imgResize = cv2.resize(binary_result, (wCal, imgSize))
                wGap = math.ceil((imgSize - wCal) / 2)
                imgWhite[:, wGap:wCal + wGap] = cv2.cvtColor(imgResize, cv2.COLOR_BGR2GRAY)
            else:
                k = imgSize / w
                hCal = math.ceil(k * h)
                imgResize = cv2.resize(binary_result, (imgSize, hCal))
                hGap = math.ceil((imgSize - hCal) / 2)
                imgWhite[hGap:hCal + hGap, :] = cv2.cvtColor(imgResize, cv2.COLOR_BGR2GRAY)
            
            # You can now pass imgWhite (or its RGB version) to your classifier.
            imgWhiteRGB = cv2.cvtColor(imgWhite, cv2.COLOR_GRAY2BGR)
            prediction, index = classifier.getPrediction(imgWhiteRGB, draw=False)
            
            # Display classification results if the prediction confidence is high enough.
            if prediction[index] > 0.75 and 0 <= index < len(labels):
                cv2.rectangle(imgOutput, (x - offset, y - offset - 50),
                              (x - offset + 90, y - offset - 50 + 50), (255, 0, 255), cv2.FILLED)
                cv2.putText(imgOutput, labels[index], (x, y - 26),
                            cv2.FONT_HERSHEY_COMPLEX, 1.7, (255, 255, 255), 2)
                cv2.rectangle(imgOutput, (x - offset, y - offset),
                              (x + w + offset, y + h + offset), (255, 0, 255), 4)
            
            cv2.imshow("Processed Binary Image", imgWhite)
            cv2.imshow("Hand Landmarks", imgCrop_landmarked)
    
    cv2.imshow("Image", imgOutput)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break


cap.release()
cv2.destroyAllWindows()


### Real-Time recognition UI/Tkinter App

#### Features:

1. Panel displaying main frame with prediction label, binary frame, landmark frame to understand workings of the process.
2. Word typing. Use space bar to save the predicted letter. For white_space remove hand from frame and press space bar.
3. --More to be added--


### Note:
 
IMPORTANT: WebCam required

1. Change the model name in "ASLRecogAPP.py" before running the app.
2. The Ensure that the lighting conditions are ideal. 
3. Try to perform in front of dark background if possible, or near non reflective walls
4. Be thourough with ASL fingerspelling signs in order for proper classification

Run Below Cell to open the app:

In [5]:
%run ASLRecogAPP.py

No Labels Found


: 

In [None]:
import tensorflow as tf

print("Num GPUs Available:", len(tf.config.experimental.list_physical_devices('GPU')))
print("GPU being used:", tf.test.gpu_device_name())

tf.debugging.set_log_device_placement(True)  #

In [1]:
import cv2
import mediapipe as mp
import numpy as np

# Initialize MediaPipe Selfie Segmentation
mp_selfie_segmentation = mp.solutions.selfie_segmentation
segmentation = mp_selfie_segmentation.SelfieSegmentation(model_selection=1)

# Start video capture
cap = cv2.VideoCapture(0)  # Webcam input

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Flip the frame for a mirror effect
    frame = cv2.flip(frame, 1)

    # Convert to RGB for MediaPipe
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame
    results = segmentation.process(rgb_frame)

    # Create a mask where the hand is detected (white = hand, black = background)
    mask = results.segmentation_mask
    threshold = 0.5  # Adjust threshold for better accuracy
    mask_binary = (mask > threshold).astype(np.uint8) * 255

    # Convert mask to 3 channels
    mask_binary = cv2.cvtColor(mask_binary, cv2.COLOR_GRAY2BGR)

    # Apply the mask to the original frame (keep hand, remove background)
    segmented_hand = cv2.bitwise_and(frame, mask_binary)

    # Show results
    cv2.imshow("Original", frame)
    cv2.imshow("Segmented Hand", segmented_hand)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()


In [None]:
# import cv2
# from cvzone.HandTrackingModule import HandDetector
# import numpy as np
# import math
# import os
# import mediapipe as mp

# # ----------------------------
# # Initialize MediaPipe Selfie Segmentation (from 1st code)
# # ----------------------------
# mp_selfie_segmentation = mp.solutions.selfie_segmentation
# segmentation = mp_selfie_segmentation.SelfieSegmentation(model_selection=1)

# # ----------------------------
# # Initialize Webcam and Hand Detector (from 2nd code)
# # ----------------------------
# cap = cv2.VideoCapture(0)
# detector = HandDetector(maxHands=1)

# # Constants and folders for saving images
# imgSize = 500
# baseFolder = "/SLangDataset/new_Blmark_data"
# # List of letters A-Y (adjust as needed)
# letters = [chr(i) for i in range(ord('A'), ord('Y') + 1)]
# maxImages = 2000          # Total images to capture per class
# paddingFactor = 0.45      # Padding percentage

# mp_hands = mp.solutions.hands

# # ----------------------------
# # Utility function: Process and resize image for saving
# # ----------------------------
# def process_and_resize(imgCrop, aspectRatio, imgSize):
#     channels = 1 if len(imgCrop.shape) == 2 else imgCrop.shape[2]
#     imgWhite = np.ones((imgSize, imgSize, channels), np.uint8) * 0
#     try:
#         if aspectRatio > 1:
#             # If height > width:
#             k = imgSize / imgCrop.shape[0]
#             wCal = math.ceil(k * imgCrop.shape[1])
#             imgResize = cv2.resize(imgCrop, (wCal, imgSize))
#             wGap = math.ceil((imgSize - wCal) / 2)
#             imgWhite[:, wGap:wCal + wGap] = imgResize
#         else:
#             # If width >= height:
#             k = imgSize / imgCrop.shape[1]
#             hCal = math.ceil(k * imgCrop.shape[0])
#             imgResize = cv2.resize(imgCrop, (imgSize, hCal))
#             hGap = math.ceil((imgSize - hCal) / 2)
#             imgWhite[hGap:hCal + hGap, :] = imgResize
#     except Exception as e:
#         print(f"Error during image processing: {e}")
#         return None
#     return imgWhite

# # ----------------------------
# # Utility function: Detect skin using YCrCb thresholds (binarization)
# # ----------------------------
# def detect_skin(frame):
#     ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
#     lower_skin = np.array([0, 133, 77], dtype=np.uint8)
#     upper_skin = np.array([255, 173, 127], dtype=np.uint8)
#     mask = cv2.inRange(ycrcb, lower_skin, upper_skin)
    
#     # Clean up noise
#     kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
#     mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
#     mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
#     mask = cv2.GaussianBlur(mask, (5, 5), 0)
    
#     # (Optional) Fill in contours to clean up the mask
#     contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
#     contour_mask = np.zeros_like(mask)
#     cv2.drawContours(contour_mask, contours, -1, 255, thickness=cv2.FILLED)
#     mask = cv2.bitwise_and(mask, contour_mask)
    
#     return mask

# # ----------------------------
# # Main Loop: Process each class (letter)
# # ----------------------------
# for className in letters:
#     print(f"Starting collection for: {className}")
#     folder = os.path.join(baseFolder, className)
#     os.makedirs(folder, exist_ok=True)

#     counter, collecting = 0, False

#     while counter < maxImages:
#         success, img = cap.read()
#         if not success:
#             print("Camera access failed.")
#             break

#         # Detect hand in the full image
#         hands, _ = detector.findHands(img, draw=False)
#         if hands:
#             # Use the first detected hand
#             hand = hands[0]
#             bbox = hand['bbox']       # [x, y, w, h]
#             lm_list = hand['lmList']    # list of landmarks in full-image coordinates
#             x, y, w, h = bbox

#             # Calculate padding based on hand size
#             xPad = int(w * paddingFactor)
#             yPad = int(h * paddingFactor)

#             # Compute crop boundaries (making sure they stay within image bounds)
#             crop_x1 = max(0, x - xPad)
#             crop_y1 = max(0, y - yPad)
#             crop_x2 = min(x + w + xPad, img.shape[1])
#             crop_y2 = min(y + h + yPad, img.shape[0])
#             imgCrop = img[crop_y1:crop_y2, crop_x1:crop_x2]

#             if imgCrop.size > 0:
#                 # ============================================================
#                 # STEP A: Apply segmentation to remove background (from 1st code)
#                 # ============================================================
#                 # Convert the cropped image to RGB as required by MediaPipe segmentation
#                 rgb_crop = cv2.cvtColor(imgCrop, cv2.COLOR_BGR2RGB)
#                 results_seg = segmentation.process(rgb_crop)
#                 mask_seg = results_seg.segmentation_mask
#                 seg_threshold = 0.5  # Adjust threshold if necessary
#                 mask_binary_seg = (mask_seg > seg_threshold).astype(np.uint8) * 255
#                 mask_binary_seg = cv2.cvtColor(mask_binary_seg, cv2.COLOR_GRAY2BGR)
#                 segmented_crop = cv2.bitwise_and(imgCrop, mask_binary_seg)
                
#                 # ============================================================
#                 # STEP 1: Draw landmarks on the segmented crop (adjust coordinates)
#                 # ============================================================
#                 imgCrop_landmarked = segmented_crop.copy()
#                 for lm in lm_list:
#                     adj_x = lm[0] - crop_x1
#                     adj_y = lm[1] - crop_y1
#                     cv2.circle(imgCrop_landmarked, (adj_x, adj_y), 4, (0, 0, 255), -1)
#                 for connection in mp.solutions.hands.HAND_CONNECTIONS:
#                     pt1 = lm_list[connection[0]]
#                     pt2 = lm_list[connection[1]]
#                     pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
#                     pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
#                     cv2.line(imgCrop_landmarked, pt1_adjusted, pt2_adjusted, (0, 0, 255), 2)

#                 # ============================================================
#                 # STEP 2: Binarize the segmented crop with landmarks overlaid
#                 # ============================================================
#                 # Use the segmented crop (without landmarks) to generate a binary skin mask
#                 binaryMask = detect_skin(segmented_crop)
#                 binary_result = np.zeros_like(segmented_crop)
#                 binary_result[binaryMask > 0] = [255, 255, 255]
#                 # Then overlay the landmarks (drawn in black) onto the binary image.
#                 for lm in lm_list:
#                     adj_x = lm[0] - crop_x1
#                     adj_y = lm[1] - crop_y1
#                     cv2.circle(binary_result, (adj_x, adj_y), 4, (0, 0, 0), -1)
#                 for connection in mp.solutions.hands.HAND_CONNECTIONS:
#                     pt1 = lm_list[connection[0]]
#                     pt2 = lm_list[connection[1]]
#                     pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
#                     pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
#                     cv2.line(binary_result, pt1_adjusted, pt2_adjusted, (0, 0, 0), 2)

#                 # ============================================================
#                 # STEP 3: Resize the binary image for saving/visualization
#                 # ============================================================
#                 aspectRatio = (crop_y2 - crop_y1) / (crop_x2 - crop_x1)
#                 imgWhite = process_and_resize(binary_result, aspectRatio, imgSize)
#                 if imgWhite is not None:
#                     cv2.imshow("Processed Binary Image", imgWhite)
#                     if collecting:
#                         counter += 1
#                         savePath = os.path.join(folder, f"{className.lower()}_{counter}.jpg")
#                         cv2.imwrite(savePath, imgWhite)
#                         print(f"Saved {counter}/{maxImages} images for {className}")

#         # Show the original live feed (for reference)
#         cv2.imshow("Live Feed with Landmarks", img)
#         key = cv2.waitKey(1)
#         if key == ord('s'):
#             collecting = True
#         if key == ord('p'):
#             collecting = False

#     print(f"Completed collection for {className}")
#     input("Press Enter for next class.")

# cap.release()
# cv2.destroyAllWindows()

import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import os
import mediapipe as mp

# ----------------------------
# Initialize MediaPipe Selfie Segmentation (from 1st code)
# ----------------------------
mp_selfie_segmentation = mp.solutions.selfie_segmentation
segmentation = mp_selfie_segmentation.SelfieSegmentation(model_selection=1)

# ----------------------------
# Initialize Webcam and Hand Detector (from 2nd code)
# ----------------------------
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)

# Constants and folders for saving images
imgSize = 500
baseFolder = "/SLangDataset/new_Blmark_data"
# List of letters A-Y (adjust as needed)
letters = [chr(i) for i in range(ord('A'), ord('Y') + 1)]
maxImages = 2000          # Total images to capture per class
paddingFactor = 0.45      # Padding percentage

mp_hands = mp.solutions.hands

# ----------------------------
# Utility function: Process and resize image for saving
# ----------------------------
def process_and_resize(imgCrop, aspectRatio, imgSize):
    channels = 1 if len(imgCrop.shape) == 2 else imgCrop.shape[2]
    imgWhite = np.ones((imgSize, imgSize, channels), np.uint8) * 0
    try:
        if aspectRatio > 1:
            # If height > width:
            k = imgSize / imgCrop.shape[0]
            wCal = math.ceil(k * imgCrop.shape[1])
            imgResize = cv2.resize(imgCrop, (wCal, imgSize))
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
        else:
            # If width >= height:
            k = imgSize / imgCrop.shape[1]
            hCal = math.ceil(k * imgCrop.shape[0])
            imgResize = cv2.resize(imgCrop, (imgSize, hCal))
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize
    except Exception as e:
        print(f"Error during image processing: {e}")
        return None
    return imgWhite

# ----------------------------
# Utility function: Detect skin using YCrCb thresholds (binarization)
# ----------------------------
def detect_skin(frame):
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    lower_skin = np.array([0, 133, 77], dtype=np.uint8)
    upper_skin = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower_skin, upper_skin)
    
    # Clean up noise
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    
    # (Optional) Fill in contours to clean up the mask
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour_mask = np.zeros_like(mask)
    cv2.drawContours(contour_mask, contours, -1, 255, thickness=cv2.FILLED)
    mask = cv2.bitwise_and(mask, contour_mask)
    
    return mask

# ----------------------------
# Main Loop: Process each class (letter)
# ----------------------------
for className in letters:
    print(f"Starting collection for: {className}")
    folder = os.path.join(baseFolder, className)
    os.makedirs(folder, exist_ok=True)

    counter, collecting = 0, False

    while counter < maxImages:
        success, img = cap.read()
        if not success:
            print("Camera access failed.")
            break

        # Detect hand in the full image
        hands, _ = detector.findHands(img, draw=False)
        if hands:
            # Use the first detected hand
            hand = hands[0]
            bbox = hand['bbox']       # [x, y, w, h]
            lm_list = hand['lmList']    # list of landmarks in full-image coordinates
            x, y, w, h = bbox

            # Calculate padding based on hand size
            xPad = int(w * paddingFactor)
            yPad = int(h * paddingFactor)

            # Compute crop boundaries (making sure they stay within image bounds)
            crop_x1 = max(0, x - xPad)
            crop_y1 = max(0, y - yPad)
            crop_x2 = min(x + w + xPad, img.shape[1])
            crop_y2 = min(y + h + yPad, img.shape[0])
            imgCrop = img[crop_y1:crop_y2, crop_x1:crop_x2]

            if imgCrop.size > 0:
                # ============================================================
                # STEP A: Apply segmentation to remove background (from 1st code)
                # ============================================================
                # Convert the cropped image to RGB as required by MediaPipe segmentation
                rgb_crop = cv2.cvtColor(imgCrop, cv2.COLOR_BGR2RGB)
                results_seg = segmentation.process(rgb_crop)
                mask_seg = results_seg.segmentation_mask
                seg_threshold = 0.5  # Adjust threshold if necessary
                mask_binary_seg = (mask_seg > seg_threshold).astype(np.uint8) * 255
                mask_binary_seg = cv2.cvtColor(mask_binary_seg, cv2.COLOR_GRAY2BGR)
                segmented_crop = cv2.bitwise_and(imgCrop, mask_binary_seg)
                
                # ---------------------------
                # Show the segmented crop (background removed)
                # ---------------------------
                cv2.imshow("Segmented Crop", segmented_crop)
                
                # ============================================================
                # STEP 1: Draw landmarks on the segmented crop (adjust coordinates)
                # ============================================================
                imgCrop_landmarked = segmented_crop.copy()
                for lm in lm_list:
                    adj_x = lm[0] - crop_x1
                    adj_y = lm[1] - crop_y1
                    cv2.circle(imgCrop_landmarked, (adj_x, adj_y), 4, (0, 0, 255), -1)
                for connection in mp.solutions.hands.HAND_CONNECTIONS:
                    pt1 = lm_list[connection[0]]
                    pt2 = lm_list[connection[1]]
                    pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
                    pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
                    cv2.line(imgCrop_landmarked, pt1_adjusted, pt2_adjusted, (0, 0, 255), 2)

                # ============================================================
                # STEP 2: Binarize the segmented crop with landmarks overlaid
                # ============================================================
                # Use the segmented crop (without landmarks) to generate a binary skin mask
                binaryMask = detect_skin(segmented_crop)
                binary_result = np.zeros_like(segmented_crop)
                binary_result[binaryMask > 0] = [255, 255, 255]
                # Then overlay the landmarks (drawn in black) onto the binary image.
                for lm in lm_list:
                    adj_x = lm[0] - crop_x1
                    adj_y = lm[1] - crop_y1
                    cv2.circle(binary_result, (adj_x, adj_y), 4, (0, 0, 0), -1)
                for connection in mp.solutions.hands.HAND_CONNECTIONS:
                    pt1 = lm_list[connection[0]]
                    pt2 = lm_list[connection[1]]
                    pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
                    pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
                    cv2.line(binary_result, pt1_adjusted, pt2_adjusted, (0, 0, 0), 2)

                # ============================================================
                # STEP 3: Resize the binary image for saving/visualization
                # ============================================================
                aspectRatio = (crop_y2 - crop_y1) / (crop_x2 - crop_x1)
                imgWhite = process_and_resize(binary_result, aspectRatio, imgSize)
                if imgWhite is not None:
                    cv2.imshow("Processed Binary Image", imgWhite)
                    if collecting:
                        counter += 1
                        savePath = os.path.join(folder, f"{className.lower()}_{counter}.jpg")
                        cv2.imwrite(savePath, imgWhite)
                        print(f"Saved {counter}/{maxImages} images for {className}")

        # Show the original live feed (for reference)
        cv2.imshow("Live Feed with Landmarks", img)
        key = cv2.waitKey(1)
        if key == ord('s'):
            collecting = True
        if key == ord('p'):
            collecting = False

    print(f"Completed collection for {className}")
    input("Press Enter for next class.")

cap.release()
cv2.destroyAllWindows()


Starting collection for: A
Error during image processing: could not broadcast input array from shape (501,500,3) into shape (500,500,3)
Error during image processing: could not broadcast input array from shape (501,500,3) into shape (500,500,3)


In [None]:
import cv2
from cvzone.HandTrackingModule import HandDetector
import numpy as np
import math
import os
import mediapipe as mp

# ----------------------------
# Initialize MediaPipe Selfie Segmentation
# ----------------------------
mp_selfie_segmentation = mp.solutions.selfie_segmentation
segmentation = mp_selfie_segmentation.SelfieSegmentation(model_selection=1)

# ----------------------------
# Initialize Webcam and Hand Detector
# ----------------------------
cap = cv2.VideoCapture(0)
detector = HandDetector(maxHands=1)

# Constants and folders for saving images
imgSize = 500
baseFolder = "/SignLanguageApp/SLangDataset/new_Blmark_data"
letters = [chr(i) for i in range(ord('A'), ord('Y') + 1)]  # List of letters A-Y
maxImages = 2000         # Total images to capture per class
paddingFactor = 0.45     # Padding percentage

mp_hands = mp.solutions.hands

# ----------------------------
# Utility function: Process and resize image for saving
# ----------------------------
def process_and_resize(imgCrop, aspectRatio, imgSize):
    channels = 1 if len(imgCrop.shape) == 2 else imgCrop.shape[2]
    imgWhite = np.ones((imgSize, imgSize, channels), np.uint8) * 0
    try:
        if aspectRatio > 1:
            # Height > width:
            k = imgSize / imgCrop.shape[0]
            wCal = math.ceil(k * imgCrop.shape[1])
            imgResize = cv2.resize(imgCrop, (wCal, imgSize))
            wGap = math.ceil((imgSize - wCal) / 2)
            imgWhite[:, wGap:wCal + wGap] = imgResize
        else:
            # Width >= height:
            k = imgSize / imgCrop.shape[1]
            hCal = math.ceil(k * imgCrop.shape[0])
            imgResize = cv2.resize(imgCrop, (imgSize, hCal))
            hGap = math.ceil((imgSize - hCal) / 2)
            imgWhite[hGap:hCal + hGap, :] = imgResize
    except Exception as e:
        print(f"Error during image processing: {e}")
        return None
    return imgWhite

# ----------------------------
# (Optional) Utility function: Detect skin using YCrCb thresholds
# (Not used in the updated processing)
# ----------------------------
def detect_skin(frame):
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    lower_skin = np.array([0, 133, 77], dtype=np.uint8)
    upper_skin = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower_skin, upper_skin)
    
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour_mask = np.zeros_like(mask)
    cv2.drawContours(contour_mask, contours, -1, 255, thickness=cv2.FILLED)
    mask = cv2.bitwise_and(mask, contour_mask)
    
    return mask

# ----------------------------
# Main Loop: Process each class (letter)
# ----------------------------
for className in letters:
    print(f"Starting collection for: {className}")
    folder = os.path.join(baseFolder, className)
    os.makedirs(folder, exist_ok=True)

    counter, collecting = 0, False

    while counter < maxImages:
        success, img = cap.read()
        if not success:
            print("Camera access failed.")
            break

        # Detect hand in the full image
        hands, _ = detector.findHands(img, draw=False)
        if hands:
            # Use the first detected hand
            hand = hands[0]
            bbox = hand['bbox']       # [x, y, w, h]
            lm_list = hand['lmList']    # List of landmarks in full-image coordinates
            x, y, w, h = bbox

            # Calculate padding based on hand size
            xPad = int(w * paddingFactor)
            yPad = int(h * paddingFactor)

            # Compute crop boundaries (ensure they stay within image bounds)
            crop_x1 = max(0, x - xPad)
            crop_y1 = max(0, y - yPad)
            crop_x2 = min(x + w + xPad, img.shape[1])
            crop_y2 = min(y + h + yPad, img.shape[0])
            imgCrop = img[crop_y1:crop_y2, crop_x1:crop_x2]

            if imgCrop.size > 0:
                # -----------------------------------------------------
                # STEP A: Apply segmentation to remove background
                # -----------------------------------------------------
                rgb_crop = cv2.cvtColor(imgCrop, cv2.COLOR_BGR2RGB)
                results_seg = segmentation.process(rgb_crop)
                mask_seg = results_seg.segmentation_mask
                seg_threshold = 0.5  # Adjust threshold if necessary
                mask_binary_seg = (mask_seg > seg_threshold).astype(np.uint8) * 255
                mask_binary_seg = cv2.cvtColor(mask_binary_seg, cv2.COLOR_GRAY2BGR)
                segmented_crop = cv2.bitwise_and(imgCrop, mask_binary_seg)
                
                # Show the segmented crop (background removed)
                cv2.imshow("Segmented Crop", segmented_crop)
                
                # -----------------------------------------------------
                # STEP 1: Draw landmarks on the segmented crop
                # -----------------------------------------------------
                imgCrop_landmarked = segmented_crop.copy()
                for lm in lm_list:
                    adj_x = lm[0] - crop_x1
                    adj_y = lm[1] - crop_y1
                    cv2.circle(imgCrop_landmarked, (adj_x, adj_y), 4, (0, 0, 255), -1)
                for connection in mp.solutions.hands.HAND_CONNECTIONS:
                    pt1 = lm_list[connection[0]]
                    pt2 = lm_list[connection[1]]
                    pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
                    pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
                    cv2.line(imgCrop_landmarked, pt1_adjusted, pt2_adjusted, (0, 0, 255), 2)
                
                # -----------------------------------------------------
                # STEP 2: Convert the segmented crop directly to a binary image
                # (Skipping additional skin range checking)
                # -----------------------------------------------------
                # Create a blank image for the binary result
                binary_result = np.zeros_like(segmented_crop)
                # Convert the segmented crop to grayscale and threshold it
                gray = cv2.cvtColor(segmented_crop, cv2.COLOR_BGR2GRAY)
                _, binary_from_seg = cv2.threshold(gray, 1, 255, cv2.THRESH_BINARY)
                binary_result[binary_from_seg > 0] = [255, 255, 255]
                # Overlay landmarks (drawn in black) on the binary image
                for lm in lm_list:
                    adj_x = lm[0] - crop_x1
                    adj_y = lm[1] - crop_y1
                    cv2.circle(binary_result, (adj_x, adj_y), 4, (0, 0, 0), -1)
                for connection in mp.solutions.hands.HAND_CONNECTIONS:
                    pt1 = lm_list[connection[0]]
                    pt2 = lm_list[connection[1]]
                    pt1_adjusted = (pt1[0] - crop_x1, pt1[1] - crop_y1)
                    pt2_adjusted = (pt2[0] - crop_x1, pt2[1] - crop_y1)
                    cv2.line(binary_result, pt1_adjusted, pt2_adjusted, (0, 0, 0), 2)
                
                # -----------------------------------------------------
                # STEP 3: Resize the binary image for saving/visualization
                # -----------------------------------------------------
                aspectRatio = (crop_y2 - crop_y1) / (crop_x2 - crop_x1)
                imgWhite = process_and_resize(binary_result, aspectRatio, imgSize)
                if imgWhite is not None:
                    cv2.imshow("Processed Binary Image", imgWhite)
                    if collecting:
                        counter += 1
                        savePath = os.path.join(folder, f"{className.lower()}_{counter}.jpg")
                        cv2.imwrite(savePath, imgWhite)
                        print(f"Saved {counter}/{maxImages} images for {className}")

        # Show the original live feed (for reference)
        cv2.imshow("Live Feed with Landmarks", img)
        key = cv2.waitKey(1)
        if key == ord('s'):
            collecting = True
        if key == ord('p'):
            collecting = False

    print(f"Completed collection for {className}")
    input("Press Enter for next class.")

cap.release()
cv2.destroyAllWindows()


Starting collection for: A
Error during image processing: could not broadcast input array from shape (501,500,3) into shape (500,500,3)
Error during image processing: could not broadcast input array from shape (501,500,3) into shape (500,500,3)
