## **Introduction**

In this notebook, Arabic sign language gestures will be recognized from images using a CNN model. Arabic Sign Language (ArSL) is a visual-gestural language used by deaf and hard-of-hearing individuals in Arab countries for communication.

This project focuses on developing a CNN model to accurately recognize Arabic sign language gestures from images. By leveraging deep learning techniques, the model will be trained to identify various signs with high precision.


### **Notebook Overview:**

* Data Preparation: The dataset comprising hand gesture images representing different signs in Arabic Sign Language is prepared. Images are loaded and preprocessed to ensure suitability for training the CNN model.
* Model Building: The CNN architecture is designed and implemented using TensorFlow and Keras. The model is trained on preprocessed images to learn patterns and features associated with different sign gestures.
* Model Training: The CNN model is trained on the prepared dataset, with performance monitored over multiple epochs. Techniques such as data augmentation and regularization are utilized to improve generalization.
* Model Evaluation & Prediction: The trained model's performance is evaluated on a separate test dataset. Metrics including accuracy, precision, recall, and F1-score are computed to assess the model's effectiveness.

## Libraries

In [1]:
import os
import shutil
import cv2
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm

In [2]:
import os
import numpy as np
import pandas as pd

import random
from random import randint

from sklearn.utils import shuffle # Shuffle arrays or sparse matrices in a consistent way
from sklearn.model_selection import train_test_split # Split arrays or matrices into random train and test subsets
from sklearn.metrics import classification_report, confusion_matrix
import sklearn

import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec # Specifies the geometry of the grid that a subplot can be placed in.

import keras
from keras import models as Models
from keras import layers as Layers
from keras.preprocessing import image
from keras.models import Sequential,Model
from keras.layers import Input,InputLayer, Dense, Activation, ZeroPadding2D, BatchNormalization
from keras.layers import Flatten, Conv2D, AveragePooling2D, MaxPooling2D, Dropout
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint,EarlyStopping
from keras import utils as Utils
from keras.utils import to_categorical # Converts a class vector (integers) to binary class matrix.
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

import seaborn as sns
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Split data into training and testing sets
from sklearn.model_selection import train_test_split

## CNN Model

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
import os
import numpy as np

# Load data from numpy arrays
destination_folder = "/content/drive/Shared drives/Computer Vision/Data/Resized_RGB_ArSL_dataset_numpy"

image_extensions = ['.jpg.npy', '.png.npy', '.jpeg.npy', '.JPEG.npy', '.JPG.npy', '.PNG.npy']
files = [file for file in os.listdir(destination_folder) if any(file.endswith(ext) for ext in image_extensions)]

# Check if there are any image files
if not files:
    print("No image files found in the directory.")
else:
    # Initialize lists to store image arrays and labels
    image_arrays = []
    labels = []

    # Read each image file and append its numpy array and label to the lists
    for file in files:
        # Extract label from file name (assuming label is before the first underscore)
        label = file.split('_')[0]
        # Load numpy array
        array = np.load(os.path.join(destination_folder, file))
        # Append to lists
        image_arrays.append(array)
        labels.append(label)

    # Convert lists to numpy arrays
    image_arrays = np.array(image_arrays)
    labels = np.array(labels)

In [5]:
# Convert labels to numeric values
from sklearn.preprocessing import LabelEncoder

label_encoder = LabelEncoder()
labels_encoded = label_encoder.fit_transform(labels)

# Split data into 70% training, 15% validation, and 15% testing
X_train_temp, X_test, y_train_temp, y_test = train_test_split(image_arrays, labels_encoded, test_size=0.15, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train_temp, y_train_temp, test_size=0.1765, random_state=42)  # 0.1765 is approximately 15% of the remaining data

In [6]:
# Preprocess data (normalize pixel values)
X_train = X_train / 255.0
X_val = X_val / 255.0
X_test = X_test / 255.0

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(image_arrays.shape[1:])),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(8, activation='softmax')  # 8 classes, so softmax activation
])

model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 254, 254, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 127, 127, 32)      0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 125, 125, 64)      18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 62, 62, 64)        0         
 g2D)                                                            
                                                                 
 conv2d_2 (Conv2D)           (None, 60, 60, 64)        36928     
                                                                 
 flatten (Flatten)           (None, 230400)            0

In [7]:
# Check unique labels
print("Unique labels:", np.unique(labels))

Unique labels: ['Lam' 'Meem' 'Reh' 'Seen' 'Sheen' 'Waw' 'Yeh' 'Zain']


In [8]:
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=30, validation_data=(X_val, y_val))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [9]:
# Evaluate the model on the testing data
test_loss, test_acc = model.evaluate(X_test, y_test)
print("Test Accuracy:", test_acc)

Test Accuracy: 0.43853819370269775


In [29]:
# Function to load and preprocess image
def load_and_preprocess_image(image_path, target_size=(256, 256)):
    try:
        # Load image
        image = cv2.imread(image_path)
        if image is None:
            raise Exception("Failed to load image. Please check the image file path:", image_path)
        # Resize image to match target size
        image = cv2.resize(image, target_size)
        # Normalize pixel values
        image = image / 255.0
        # Return preprocessed image
        return image
    except Exception as e:
        print("Error occurred while loading the image:", e)
        return None

# Function to predict label of an image
def predict_image_label(image_path, model):
    try:
        # Load and preprocess image
        image = load_and_preprocess_image(image_path)
        if image is None:
            return None
        # Reshape image to match model input shape
        image = image.reshape(1, *image.shape)
        # Predict label
        prediction = model.predict(image)
        # Decode prediction to get label
        predicted_label = label_encoder.inverse_transform([prediction.argmax()])[0]
        # Return predicted label
        return predicted_label
    except Exception as e:
        print("Error occurred while predicting:", e)
        return None

def main():
    try:
        # Path to the image
        image_path = "/content/5.jpg"

        # Predict label of the image
        predicted_label = predict_image_label(image_path, model)

        # Print predicted label
        if predicted_label:
            print("Predicted Label:", predicted_label)
    except KeyboardInterrupt:
        print("\nExiting...")
    except Exception as e:
        print("An error occurred:", e)

if __name__ == "__main__":
    main()

Predicted Label: Meem


## ResNet Model

## **Introduction**

In this section, Arabic sign language gestures will be recognized from images using a ResNet model.

### **Section Overview:**
* Data Preparation: The dataset comprising hand gesture images representing different signs in Arabic Sign Language is prepared. Images are loaded and preprocessed to ensure suitability for training the ResNet model.
* Model Building: The ResNet architecture, pretrained on ImageNet, is utilized as the base model. Additional layers are added for classification purposes. The model is trained on preprocessed images to learn patterns and features associated with different sign gestures.
* Model Training: The ResNet model is trained on the prepared dataset, with performance monitored over multiple epochs. Techniques such as fine-tuning and regularization are applied to improve generalization.

In [20]:
X_train, X_test, y_train, y_test = train_test_split(image_arrays, labels_encoded, test_size=0.2, random_state=42)

In [21]:
# Load ResNet50 model without the top layer
resnet = ResNet50(weights='imagenet', include_top=False, input_shape=(256, 256, 3))

# Freeze the layers in the base ResNet model
for layer in resnet.layers:
    layer.trainable = False

# Build the model
model = Sequential([
    resnet,
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(8, activation='softmax')  # 8 classes for Arabic Sign Language
])

In [23]:
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Define callbacks
checkpoint = ModelCheckpoint("best_model.h5", monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
early_stopping = EarlyStopping(monitor='val_loss', patience=5, verbose=1)

# Train the model
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test), callbacks=[checkpoint, early_stopping])

Epoch 1/20
Epoch 1: val_accuracy improved from -inf to 0.69327, saving model to best_model.h5
Epoch 2/20
Epoch 2: val_accuracy did not improve from 0.69327
Epoch 3/20
Epoch 3: val_accuracy improved from 0.69327 to 0.75810, saving model to best_model.h5
Epoch 4/20
Epoch 4: val_accuracy did not improve from 0.75810
Epoch 5/20
Epoch 5: val_accuracy improved from 0.75810 to 0.77805, saving model to best_model.h5
Epoch 6/20
Epoch 6: val_accuracy improved from 0.77805 to 0.78554, saving model to best_model.h5
Epoch 7/20
Epoch 7: val_accuracy did not improve from 0.78554
Epoch 8/20
Epoch 8: val_accuracy did not improve from 0.78554
Epoch 9/20
Epoch 9: val_accuracy did not improve from 0.78554
Epoch 10/20
Epoch 10: val_accuracy improved from 0.78554 to 0.82294, saving model to best_model.h5
Epoch 11/20
Epoch 11: val_accuracy did not improve from 0.82294
Epoch 12/20
Epoch 12: val_accuracy did not improve from 0.82294
Epoch 13/20
Epoch 13: val_accuracy did not improve from 0.82294
Epoch 14/20
Ep

In [27]:
# Load the image
image_path = "/content/5.jpg"  # Replace with the path to your image

# Attempt to read the image
image = cv2.imread(image_path)

# Check if the image was loaded successfully
if image is None:
    print("Error: Failed to load image.")
else:
    # Resize the image to match the model input shape
    resized_image = cv2.resize(image, (256, 256))

    # Normalize pixel values
    normalized_image = resized_image / 255.0

    # Perform prediction
    predicted_label = model.predict(np.expand_dims(normalized_image, axis=0))

    # Display the predicted label
    print("Predicted Label:", predicted_label)


Predicted Label: [[6.7920346e-06 7.1729606e-01 4.5406669e-06 8.9878783e-02 4.5235618e-05
  3.6623948e-03 1.6555762e-01 2.3548687e-02]]


In [28]:
# Decode the predicted label
decoded_label = label_encoder.inverse_transform(np.argmax(predicted_label, axis=1))

# Display the predicted label
print("Predicted Label:", decoded_label)

Predicted Label: ['Meem']
