
# ASL Letter Recognition
### Authors: Ant and Craig
A collaborative project aimed at developing a CNN model for recognizing ASL letters, enhancing communication for the deaf and hard of hearing.


## Project Objective
The goal is to develop a Convolutional Neural Network (CNN) model for recognizing American Sign Language (ASL) alphabet signs from images. This notebook outlines the entire process, including dataset preparation, model development, training, and evaluation, culminating in a Gradio-based interactive interface for real-time testing.

In [2]:

import numpy as np
import os
from PIL import Image
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Define the directory for ASL images
image_dir = '/content/drive/MyDrive/asl_alphabet_train/asl_alphabet_train'


## Data Loading and Preprocessing
This section loads and preprocesses the ASL image dataset. The images are read from a directory with one subfolder per letter, resized to 64×64 pixels, converted to arrays, and split 80/20 into training and test sets. The folder name serves as the class label.

In [4]:
# Load and preprocess the dataset
import os
from PIL import Image
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import img_to_array

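# Function to load images from a directory and preprocess them
def load_images(image_dir, target_size=(64, 64)):
    """Return (images, labels) arrays; each subfolder name is used as the class label."""
    images = []
    labels = []

    # Sort directory listings so the load order is reproducible across runs
    for folder in sorted(os.listdir(image_dir)):
        folder_path = os.path.join(image_dir, folder)
        if os.path.isdir(folder_path):
            for file in sorted(os.listdir(folder_path)):
                # Match extensions case-insensitively (.JPG, .jpeg, .png, ...)
                if file.lower().endswith(('.jpg', '.jpeg', '.png')):
                    img_path = os.path.join(folder_path, file)
                    # Force 3 channels so grayscale or RGBA files do not break np.array()
                    with Image.open(img_path) as img:
                        img = img.convert('RGB').resize(target_size)
                        img_array = img_to_array(img)
                    images.append(img_array)
                    labels.append(folder)  # Using folder name as label

    return np.array(images), np.array(labels)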

# Load images
image_dir = '/content/drive/MyDrive/asl_alphabet_train/asl_alphabet_train'  # Update with the correct path
images, labels = load_images(image_dir)

# Split dataset into training and test sets
train_images, test_images, train_labels, test_labels = train_test_split(
    images, labels, test_size=0.2, random_state=42
)
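
As an illustration of the hold-out step, the sketch below scales synthetic pixel data to [0, 1] and performs a seeded 80/20 split with plain NumPy. The helper names `scale_pixels` and `split_indices` and the synthetic array are assumptions for illustration; the notebook itself relies on `train_test_split`, and the loader as written keeps raw 0-255 pixel values.

```python
import numpy as np

def scale_pixels(images):
    """Map 8-bit pixel values from [0, 255] to [0.0, 1.0] as float32."""
    return images.astype(np.float32) / 255.0

def split_indices(n_samples, test_size=0.2, seed=42):
    """Return shuffled (train_idx, test_idx) arrays for a seeded hold-out split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_size))
    return order[n_test:], order[:n_test]

# Synthetic stand-in for the loaded dataset: 10 RGB images of 64x64 pixels
fake_images = np.random.default_rng(0).integers(0, 256, size=(10, 64, 64, 3))
scaled = scale_pixels(fake_images)

train_idx, test_idx = split_indices(len(scaled))
train_x, test_x = scaled[train_idx], scaled[test_idx]
print(train_x.shape, test_x.shape)
```

Fixing the seed plays the same role as `random_state=42` in the cell above: the same images land in the test set on every run.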

## Model Architecture
Here we define the model architecture. ResNet50, pre-trained on ImageNet and loaded without its classification head, serves as the feature extractor. On top of it we add global average pooling, a 1024-unit ReLU layer, and a softmax output layer with one unit per ASL class.

In [5]:
# Model development
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# Base model with pre-trained weights
base_model = ResNet50(weights='imagenet', include_top=False)

# Add new layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(len(np.unique(train_labels)), activation='softmax')(x)

# Final model
model = Model(inputs=base_model.input, outputs=predictions)

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
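
To make the added head concrete, here is a NumPy sketch of what global average pooling, a ReLU dense layer, and a softmax output compute on a single feature map. The shapes and weights are assumptions for illustration only: a 2×2×2048 feature map (what ResNet50 produces for a 64×64 input), a hypothetical 29 classes, and random placeholder weights rather than trained ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 29  # assumed class count, for illustration only

# Stand-in for the ResNet50 output on one 64x64 image: (batch, height, width, channels)
features = rng.standard_normal((1, 2, 2, 2048)).astype(np.float32)

# GlobalAveragePooling2D: average over the two spatial axes -> (batch, channels)
pooled = features.mean(axis=(1, 2))

# Dense(1024, relu) with random placeholder weights (bias omitted for brevity)
w1 = rng.standard_normal((2048, 1024)).astype(np.float32) * 0.01
hidden = np.maximum(pooled @ w1, 0.0)

# Dense(n_classes, softmax): subtract the max logit for numerical stability
w2 = rng.standard_normal((1024, n_classes)).astype(np.float32) * 0.01
logits = hidden @ w2
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

print(pooled.shape, hidden.shape, probs.shape)
```

Each row of `probs` sums to 1, which is what lets the output be read as a probability per letter.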


## Model Summary
`model.summary()` lists every layer with its output shape, parameter count, and input connections. The output shown here is truncated to the first few layers.

In [6]:
# Model Summary
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
 input_1 (InputLayer)        [(None, None, None, 3)]      0         []                            
                                                                                                  
 conv1_pad (ZeroPadding2D)   (None, None, None, 3)        0         ['input_1[0][0]']             
                                                                                                  
 conv1_conv (Conv2D)         (None, None, None, 64)       9472      ['conv1_pad[0][0]']           
                                                                                                  
 conv1_bn (BatchNormalizati  (None, None, None, 64)       256       ['conv1_conv[0][0]']          
 on)                                                                                          
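
The parameter counts in the summary can be checked by hand. The sketch below reproduces the two non-zero values shown above, assuming the standard ResNet50 stem (a 7×7 convolution with 64 filters and a bias term, applied to 3 input channels). The helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Weights of a 2D convolution: one kernel per (input channel, filter) pair, plus biases."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)

def batchnorm_params(channels):
    """Batch normalization keeps gamma, beta, moving mean, and moving variance per channel."""
    return 4 * channels

print(conv2d_params(7, 7, 3, 64))  # conv1_conv: 9472
print(batchnorm_params(64))        # conv1_bn: 256
```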

## Model Training and Evaluation
This section encodes the folder-name labels as integers, converts them to one-hot vectors, trains the model for 10 epochs with a batch size of 32, and evaluates it on the test set. The test set also serves as the validation data during training.

In [None]:
# Model training and evaluation
# Convert labels to one-hot encoding
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder

# Encode labels
label_encoder = LabelEncoder()
train_labels_encoded = label_encoder.fit_transform(train_labels)
test_labels_encoded = label_encoder.transform(test_labels)

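# Convert to one-hot; fix num_classes so both arrays match the model's output width
num_classes = len(label_encoder.classes_)
train_labels_one_hot = to_categorical(train_labels_encoded, num_classes=num_classes)
test_labels_one_hot = to_categorical(test_labels_encoded, num_classes=num_classes)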

# Training the model
model.fit(train_images,
            train_labels_one_hot,
            batch_size=32,
            epochs=10,
            validation_data=(test_images, test_labels_one_hot)
            )

# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(test_images, test_labels_one_hot)
print("Test accuracy:", test_accuracy)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
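
The label-encoding and one-hot steps in the training cell can be illustrated on a tiny example. This is a NumPy sketch of the same idea, not the `LabelEncoder` or `to_categorical` implementations; the helper names and the four sample labels are assumptions for illustration.

```python
import numpy as np

def encode_labels(labels):
    """Map string labels to integer indices into the sorted unique classes."""
    classes = np.unique(labels)                 # sorted, like LabelEncoder.classes_
    indices = np.searchsorted(classes, labels)  # position of each label in classes
    return classes, indices

def one_hot(indices, num_classes):
    """Turn integer class indices into one-hot rows."""
    out = np.zeros((len(indices), num_classes), dtype=np.float32)
    out[np.arange(len(indices)), indices] = 1.0
    return out

sample_labels = np.array(['B', 'A', 'C', 'A'])
classes, idx = encode_labels(sample_labels)
encoded = one_hot(idx, len(classes))

print(classes, idx)
print(encoded)
```

Because the classes are sorted, column `i` of the one-hot matrix always corresponds to `classes[i]`, and the same ordering applies to the model's softmax outputs.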

## Gradio Interface for Inference
This final section wraps the model in a Gradio interface for interactive testing. Users upload an ASL image and see the three most likely letters with their predicted probabilities.

In [None]:
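# Gradio interface for model inference
# Uses gr.Image / gr.Label; the older gr.inputs / gr.outputs namespaces were removed in Gradio 4
import gradio as gr

def classify_image(inp):
    # Match the training input: 3 channels, 64x64 pixels, plus a batch dimension
    img = inp.convert('RGB').resize((64, 64))
    batch = np.expand_dims(img_to_array(img), axis=0)
    prediction = model.predict(batch).flatten()
    # label_encoder.classes_ is sorted, matching the order of the softmax outputs;
    # cast to built-in str/float so the result serializes cleanly
    return {str(label): float(prediction[i]) for i, label in enumerate(label_encoder.classes_)}

iface = gr.Interface(fn=classify_image, inputs=gr.Image(type='pil'), outputs=gr.Label(num_top_classes=3))
iface.launch()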
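
The `num_top_classes=3` setting asks Gradio to display only the highest-confidence entries of the returned dictionary. A minimal sketch of that selection, assuming a probability vector aligned with a list of class names, is shown here. The function `top_k_labels` and the sample values are hypothetical and not part of Gradio.

```python
import numpy as np

def top_k_labels(probabilities, class_names, k=3):
    """Return the k most probable classes as a {label: probability} dict, highest first."""
    probabilities = np.asarray(probabilities, dtype=float)
    order = np.argsort(probabilities)[::-1][:k]  # indices sorted by descending probability
    return {str(class_names[i]): float(probabilities[i]) for i in order}

example = top_k_labels([0.1, 0.6, 0.05, 0.25], ['A', 'B', 'C', 'D'])
print(example)
```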