<h1>Shallow Neural Network for ASL Alphabet Recognition</h1>

<p>This notebook trains a shallow neural network to recognize the American Sign Language (ASL) alphabet. The ASL alphabet lets deaf and hard-of-hearing people fingerspell words and proper nouns that have no dedicated sign in ASL, so they can still communicate when no sign exists for a particular word or concept.</p>

<h2>Import Dependencies</h2>

In [40]:
import pandas as pd
import numpy as np
import os
import datetime

from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import TensorBoard

from sklearn.model_selection import train_test_split

import mlflow
import mlflow.keras

current_directory = os.getcwd()
print(current_directory)


c:\Users\User\OneDrive\Desktop\BigData Sem2\Final Project


In [41]:
landmarks = pd.read_csv("dataset/landmarks.csv", header=None, on_bad_lines='skip')
X = landmarks.values[:,0:-1].astype(float)
y = landmarks.values[:,-1]
X

array([[0.34267747, 0.76672232, 0.41138849, ..., 0.62721717, 0.32061642,
        0.64012504],
       [0.36988956, 0.71172482, 0.44346225, ..., 0.56345606, 0.36001027,
        0.57881904],
       [0.3350803 , 0.62300825, 0.41358197, ..., 0.46337739, 0.32813659,
        0.47998089],
       ...,
       [0.40622181, 0.92157954, 0.49825594, ..., 0.46062481, 0.22831513,
        0.37963077],
       [0.42869595, 0.90653968, 0.5214389 , ..., 0.41193819, 0.24294531,
        0.32783997],
       [0.42630112, 0.91243881, 0.52119589, ..., 0.42063075, 0.23506358,
        0.33766457]])
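<p>Each row of the CSV holds the landmark coordinates followed by the class label in the last column. The sketch below is one way to sanity-check that layout on a single row. It assumes the features divide evenly among 21 hand landmarks (the count used by the test-time feature extraction later in this notebook); the helper name <code>split_row</code> and the two-coordinates-per-landmark toy row are illustrative assumptions, not part of the project.</p>

```python
import numpy as np

N_LANDMARKS = 21  # one entry per hand landmark, as in the test-time extraction loop

def split_row(row):
    """Split one CSV row into (landmark coordinates, label)."""
    features, label = row[:-1], row[-1]
    if features.size % N_LANDMARKS != 0:
        raise ValueError(
            f"{features.size} features cannot be divided among {N_LANDMARKS} landmarks"
        )
    # One row per landmark, one column per coordinate.
    return features.reshape(N_LANDMARKS, -1), label

# Synthetic row: 21 landmarks x 2 coordinates (an assumption), then the class label.
toy_row = np.concatenate([np.linspace(0.0, 1.0, 42), [3.0]])
coords, label = split_row(toy_row)
print(coords.shape, label)
```

<p>On the real data, something like <code>split_row(landmarks.values[0].astype(float))</code> would apply the same check to the first sample.</p>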

<h2>Data Preprocessing Pipeline</h2>

In [42]:
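def preprocessing(landmarks):
    """Split the landmark table into train/validation sets and one-hot encode the labels."""
    # Every column except the last holds landmark coordinates; the last holds the class label.
    X = landmarks.values[:, 0:-1].astype(float)
    y = landmarks.values[:, -1]

    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

    # One-hot encode the labels for the softmax output layer.
    y_train = to_categorical(y_train)
    y_val = to_categorical(y_val)

    return X_train, X_val, y_train, y_val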

<h3>Train-Test Split</h3>

In [43]:
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42) 
X_train

array([[0.51376808, 0.94498158, 0.58532572, ..., 0.48341447, 0.48762906,
        0.41418147],
       [0.09327012, 0.74772549, 0.12167979, ..., 0.76382637, 0.23676576,
        0.75605839],
       [0.48219216, 0.48154902, 0.46560657, ..., 0.40733135, 0.37800711,
        0.39006814],
       ...,
       [0.46902052, 0.71303272, 0.52855122, ..., 0.39231864, 0.37466004,
        0.45015761],
       [0.17080863, 0.61506081, 0.19445346, ..., 0.46023408, 0.10221684,
        0.48458272],
       [0.14098312, 0.36921191, 0.17889328, ..., 0.25791916, 0.10955216,
        0.26961192]])
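<p>A random 80/20 split does not guarantee that every letter is equally represented in both parts. The following sketch, using only the standard library, shows one way to compare class proportions between the two splits. The helper name <code>class_fractions</code> and the toy labels are hypothetical stand-ins for <code>y_train</code> and <code>y_val</code>.</p>

```python
from collections import Counter

def class_fractions(labels):
    """Return {label: fraction of samples} for a 1-D sequence of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}

# Toy labels standing in for y_train / y_val (hypothetical values).
toy_train = [0, 0, 1, 1, 2, 2, 2, 2]
toy_val = [0, 1, 2, 2]

print(class_fractions(toy_train))
print(class_fractions(toy_val))
```

<p>If the proportions differ noticeably on the real labels, passing <code>stratify=y</code> to <code>train_test_split</code> is one option for keeping them aligned.</p>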

<h3>Label Encoding for Softmax Layer</h3>

In [44]:
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)
y_train

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
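<p>To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding does to integer labels. It is an illustration, not the Keras implementation, and the helper name <code>one_hot</code> is hypothetical.</p>

```python
import numpy as np

def one_hot(labels, num_classes):
    """One-hot encode integer class labels into a (n_samples, num_classes) array."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=float)
    # Set a single 1.0 per row, in the column given by that row's label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([0, 2, 1], num_classes=3)
print(example)
```

<p>Note that <code>to_categorical</code> infers the number of columns from the largest label it sees when <code>num_classes</code> is not given, so encoding the two splits separately can yield different widths if one split happens to lack the highest class. Passing <code>num_classes=26</code> explicitly would rule that out.</p>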

<h2>Experiment 1 - Baseline Model</h2>

<h3>Define Hyperparameters</h3>

In [57]:
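learning_rate = 0.01  # not applied below: compile() uses optimizer='adam' with Keras's default learning rate (0.001)
num_epochs = 300
batch_size = 32  # equals the Keras default; fit() below does not pass it explicitly
input_shape = X_train.shape[1]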

In [58]:
baseline_model = Sequential([
    Input(shape=(input_shape,)), 
    Dense(8, activation='relu'),
    Dense(64, activation='relu'),
    Dense(26, activation='softmax') 
])

baseline_model.compile(optimizer = 'adam', 
                       loss = 'categorical_crossentropy', 
                       metrics = ['accuracy', 'Precision', 'Recall'])
                       
baseline_model.summary()
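<p>The summary output is not shown here, but the size of the network can be estimated by hand: a <code>Dense</code> layer has one weight per input per unit plus one bias per unit. The sketch below applies that rule to the three layers above. The input width of 42 (21 landmarks with 2 coordinates each) is an assumption; in the notebook the real value is <code>X_train.shape[1]</code>.</p>

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters in a Dense layer: weights (inputs x units) plus biases (units)."""
    return n_inputs * n_units + n_units

n_features = 42  # assumption: 21 landmarks x 2 coordinates; use X_train.shape[1] in practice
layer_units = [8, 64, 26]  # the three Dense layers of the baseline model

total = 0
n_in = n_features
for units in layer_units:
    count = dense_params(n_in, units)
    print(f"Dense({units}): {count} parameters")
    total += count
    n_in = units  # this layer's output feeds the next layer
print(f"Total: {total} parameters")
```

<p>Under that assumption the model has only a few thousand parameters, which is in keeping with the "shallow" framing of this notebook.</p>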

<h3>(Optional) Log Experiments with MLflow</h3>

In [59]:
experiment_name = "asl-alphabet-classifier"
run_name = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

In [60]:
logdir = os.path.join("logs", experiment_name, run_name)
tb_callback = TensorBoard(log_dir=logdir, write_graph=True, histogram_freq=1)

In [61]:
mlflow.set_tracking_uri('mlruns/')

mlflow.set_experiment(experiment_name)

#mlflow.tensorflow.autolog()

with mlflow.start_run(run_name = run_name) as run:
    mlflow.log_param("input_dim", X_train.shape[1])
    mlflow.log_param("num_epochs", num_epochs)
    mlflow.log_param("optimizer", "adam")

    history = baseline_model.fit(X_train, y_train, 
        epochs=num_epochs, verbose = 2, 
        validation_data = (X_val, y_val),
        callbacks = [tb_callback]
        )
    
    baseline_model.save("asl_model/baseline_model_2.h5")
    
    mlflow.log_metric("train_loss", history.history["loss"][-1])
    mlflow.log_metric("train_acc", history.history["accuracy"][-1])
    mlflow.log_metric("train_precision", history.history["Precision"][-1])
    mlflow.log_metric("train_recall", history.history["Recall"][-1])
    mlflow.log_metric("val_loss", history.history["val_loss"][-1])
    mlflow.log_metric("val_acc", history.history["val_accuracy"][-1])
    mlflow.log_metric("val_precision", history.history["val_Precision"][-1])
    mlflow.log_metric("val_recall", history.history["val_Recall"][-1])

    mlflow.keras.log_model(baseline_model, "baseline_model")


print(f"Run ID: {run.info.run_id}")

Epoch 1/300
221/221 - 5s - 23ms/step - Precision: 0.0000e+00 - Recall: 0.0000e+00 - accuracy: 0.0724 - loss: 3.1427 - val_Precision: 0.0000e+00 - val_Recall: 0.0000e+00 - val_accuracy: 0.0963 - val_loss: 3.0369
Epoch 2/300
221/221 - 1s - 6ms/step - Precision: 0.0000e+00 - Recall: 0.0000e+00 - accuracy: 0.1655 - loss: 2.8283 - val_Precision: 0.0000e+00 - val_Recall: 0.0000e+00 - val_accuracy: 0.2193 - val_loss: 2.6431
Epoch 3/300
221/221 - 1s - 6ms/step - Precision: 0.9667 - Recall: 0.0041 - accuracy: 0.2656 - loss: 2.4059 - val_Precision: 0.8261 - val_Recall: 0.0108 - val_accuracy: 0.2856 - val_loss: 2.2500
Epoch 4/300
221/221 - 1s - 6ms/step - Precision: 0.7981 - Recall: 0.0118 - accuracy: 0.3397 - loss: 2.0993 - val_Precision: 0.8571 - val_Recall: 0.0102 - val_accuracy: 0.3921 - val_loss: 1.9907
Epoch 5/300
221/221 - 1s - 6ms/step - Precision: 0.9011 - Recall: 0.0336 - accuracy: 0.4289 - loss: 1.8418 - val_Precision: 0.9846 - val_Recall: 0.0363 - val_accuracy: 0.4606 - val_loss: 1.72



Run ID: ab53b2060a3d48c7b1907f0c99a2d4d8
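<p>In the log above, precision and recall are exactly 0 during the first two epochs even though accuracy is already above 7%. A likely explanation is that the Keras <code>Precision</code> and <code>Recall</code> metrics, with default settings, only count a class as predicted when its probability exceeds 0.5, whereas accuracy looks at the most probable class. Early in training, a softmax over 26 classes rarely puts more than 0.5 on any single class. The plain-Python sketch below illustrates the difference on a made-up probability vector; the helper names are hypothetical.</p>

```python
def argmax_correct(probs, true_index):
    """Accuracy-style check: is the most probable class the true one?"""
    return max(range(len(probs)), key=probs.__getitem__) == true_index

def thresholded_counts(probs, true_index, threshold=0.5):
    """Count true positives, false positives and false negatives at a fixed threshold."""
    tp = fp = fn = 0
    for index, p in enumerate(probs):
        predicted = p > threshold
        actual = index == true_index
        tp += predicted and actual
        fp += predicted and not actual
        fn += (not predicted) and actual
    return tp, fp, fn

# Hypothetical softmax output early in training: class 2 is most likely, but only at 0.30.
early = [0.20, 0.25, 0.30, 0.25]
print(argmax_correct(early, true_index=2))      # True
print(thresholded_counts(early, true_index=2))  # (0, 0, 1)
```

<p>The sample counts as correct for accuracy, yet contributes no positive prediction at all, so recall is 0 and precision has nothing to measure. This also fits the pattern in later epochs, where precision is high while recall rises slowly.</p>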


<h2>Testing with <a href = 'https://www.kaggle.com/datasets/grassknoted/asl-alphabet'>Kaggle</a> data</h2>


In [65]:
from keras.models import load_model
import cv2
import os
import numpy as np
import string
from src.track import HandTracking
from sklearn.metrics import accuracy_score, precision_score, recall_score

baseline_model = load_model("asl_model/baseline_model.h5", compile=False)
HT = HandTracking()

classes = {i: key for i, key in enumerate(string.ascii_lowercase)}

def mediapipe_output(frame):
    """Process frame to extract hand landmarks and convert to features."""
    image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    image.flags.writeable = False
    hands_results = HT.track(image)
    HT.read_results(image, hands_results)

    if hands_results.multi_hand_landmarks:
        hand = hands_results.multi_hand_landmarks[0]
        coords = []
        for i in range(21):
            coords += HT.get_moy_coords(hand, i)
        return np.array([coords])
    return []

image_folder = 'dataset/test/'

true_labels = []
predictions = []

for folder_name in os.listdir(image_folder):
    subfolder_path = os.path.join(image_folder, folder_name)
    if os.path.isdir(subfolder_path):
        image_count = 0
        for filename in os.listdir(subfolder_path):
            path = os.path.join(subfolder_path, filename)
            image = cv2.imread(path)
            if image is None:
                continue
            
            features = mediapipe_output(image)
            if len(features) > 0:
                pred = baseline_model.predict(features)
                predicted_class_index = np.argmax(pred, axis=1)
                predictions.append(classes[predicted_class_index[0]])
                true_label = folder_name.lower()
                true_labels.append(true_label)
            
            image_count += 1
            if image_count >= 50:
                break

true_labels = np.array(true_labels)
predictions = np.array(predictions)

accuracy = accuracy_score(true_labels, predictions)
precision = precision_score(true_labels, predictions, average='macro')
recall = recall_score(true_labels, predictions, average='macro')

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")

cv2.destroyAllWindows()




[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 108ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 42ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 39ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 38ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 37ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 40ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 55ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 45ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 39ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 56ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 42ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 49ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 53ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5

  _warn_prf(average, modifier, msg_start, len(result))
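<p>The <code>_warn_prf</code> line above is the tail of a scikit-learn warning that appears when a metric is undefined for some class, for example when a letter is never predicted and its precision would be 0/0. With <code>average='macro'</code>, each class's score is computed separately and then averaged with equal weight, and such undefined classes count as 0. The sketch below reimplements that idea in plain Python for illustration; the function name <code>macro_precision</code> and the toy labels are hypothetical.</p>

```python
def macro_precision(true_labels, predicted_labels):
    """Unweighted mean of per-class precision; classes never predicted count as 0."""
    classes = sorted(set(true_labels) | set(predicted_labels))
    per_class = []
    for cls in classes:
        # True labels of all samples that were predicted as this class.
        predicted_as_cls = [t for t, p in zip(true_labels, predicted_labels) if p == cls]
        if not predicted_as_cls:
            per_class.append(0.0)  # precision is undefined here; treat it as 0
            continue
        correct = sum(t == cls for t in predicted_as_cls)
        per_class.append(correct / len(predicted_as_cls))
    return sum(per_class) / len(per_class)

truth = ["a", "a", "b", "c"]
preds = ["a", "a", "b", "b"]
print(macro_precision(truth, preds))
```

<p>Here "c" is never predicted, so it pulls the macro average down even though "a" is classified perfectly. In scikit-learn, the <code>zero_division</code> argument of <code>precision_score</code> and <code>recall_score</code> controls this case explicitly.</p>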


In [66]:
baseline_model.evaluate(X_val,y_val)

[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.9300 - loss: 0.2874 - precision_30: 0.9528 - recall_30: 0.9131


[0.3342765271663666,
 0.9223796129226685,
 0.9409332275390625,
 0.9025495648384094]
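<p>The bare list returned by <code>evaluate</code> is easier to read with names attached. The sketch below pairs the four values above with metric names and derives an F1 score from precision and recall. The order (loss first, then the metrics in the order given to <code>compile</code>) is an assumption based on the compile call earlier in this notebook.</p>

```python
# Assumed order: loss, then the metrics passed to compile().
metric_names = ["loss", "accuracy", "precision", "recall"]
values = [0.3342765271663666, 0.9223796129226685, 0.9409332275390625, 0.9025495648384094]

results = dict(zip(metric_names, values))

# F1 is the harmonic mean of precision and recall.
f1 = 2 * results["precision"] * results["recall"] / (results["precision"] + results["recall"])

for name, value in results.items():
    print(f"{name}: {value:.4f}")
print(f"f1: {f1:.4f}")
```

<p>Alternatively, calling <code>evaluate</code> with <code>return_dict=True</code> returns the values already keyed by metric name, which avoids relying on the order.</p>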

In [56]:
%load_ext tensorboard
%tensorboard --logdir logs/asl-alphabet-classifier

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


Reusing TensorBoard on port 6006 (pid 19180), started 12:04:09 ago. (Use '!kill 19180' to kill it.)

<h2>Utilities</h2>

In [68]:
# import os
# import shutil

# def limit_images_in_folder(folder_path, limit=30):
#     files = sorted([file for file in os.listdir(folder_path) if file.lower().endswith(('.png', '.jpg', '.jpeg'))])
    
#     if len(files) > limit:

#         files_to_remove = files[limit:]
        
        
#         for file in files_to_remove:
#             os.remove(os.path.join(folder_path, file))
#             print(f"Removed: {file}")

# main_directory = 'dataset/test/'

# for folder_name in os.listdir(main_directory):
#     subfolder_path = os.path.join(main_directory, folder_name)
#     if os.path.isdir(subfolder_path):
#         limit_images_in_folder(subfolder_path)
#         print(f"Processed folder: {folder_name}")


Removed: 126.jpg
Removed: 127.jpg
Removed: 128.jpg
Removed: 129.jpg
Removed: 13.jpg
Removed: 130.jpg
Removed: 131.jpg
Removed: 132.jpg
Removed: 133.jpg
Removed: 134.jpg
Removed: 135.jpg
Removed: 136.jpg
Removed: 137.jpg
Removed: 138.jpg
Removed: 139.jpg
Removed: 14.jpg
Removed: 140.jpg
Removed: 141.jpg
Removed: 142.jpg
Removed: 143.jpg
Removed: 144.jpg
Removed: 145.jpg
Removed: 146.jpg
Removed: 147.jpg
Removed: 148.jpg
Removed: 149.jpg
Removed: 15.jpg
Removed: 150.jpg
Removed: 151.jpg
Removed: 152.jpg
Removed: 153.jpg
Removed: 154.jpg
Removed: 155.jpg
Removed: 156.jpg
Removed: 157.jpg
Removed: 158.jpg
Removed: 159.jpg
Removed: 16.jpg
Removed: 160.jpg
Removed: 161.jpg
Removed: 162.jpg
Removed: 163.jpg
Removed: 164.jpg
Removed: 165.jpg
Removed: 166.jpg
Removed: 167.jpg
Removed: 168.jpg
Removed: 169.jpg
Removed: 17.jpg
Removed: 170.jpg
Removed: 171.jpg
Removed: 172.jpg
Removed: 173.jpg
Removed: 174.jpg
Removed: 175.jpg
Removed: 176.jpg
Removed: 177.jpg
Removed: 178.jpg
Removed: 179.jpg
Re