# Introduction
This notebook investigates how well our sign language translation model generalizes across signers. We train the model on sensor data recorded by some team members and then evaluate it on data from members it has never seen. This shows how the model handles variation between individuals and how reliable it is likely to be in real-world use.
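
The train-on-some, test-on-others protocol described above can be sketched with scikit-learn's `LeaveOneGroupOut`, which holds out one signer at a time. This is only an illustration on made-up data: the arrays and the `signers` labels are hypothetical stand-ins, and the notebook itself selects signers manually by passing CSV paths.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

# Hypothetical stand-in data: 12 samples, 4 features, 3 signers (4 samples each).
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
y = rng.integers(0, 3, size=12)
signers = np.array(["badr"] * 4 + ["ismail"] * 4 + ["mouad"] * 4)

splits = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=signers):
    held_out = signers[test_idx][0]                   # the single signer in the test fold
    train_signers = sorted(set(signers[train_idx]))   # everyone else
    splits.append((held_out, train_signers))
    print(f"held out: {held_out}, trained on: {train_signers}")
```

Each fold keeps every sample of one signer out of training, so the score on that fold measures generalization to a person the model has not seen.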


# Model Training and Preprocessing Functions
We define three helper functions: `train_rnn_model` trains the RNN model on the selected datasets, `preprocess_test_data` prepares a test dataset with the scaler and encoder fitted during training, and `predict` reports accuracy, precision, recall, and F1-score. Together they make it easy to retrain the model on different combinations of datasets and to evaluate it on various test datasets.


In [1]:
import numpy as np
import pandas as pd
import joblib
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from keras.models import Sequential
from keras.layers import SimpleRNN, Bidirectional, BatchNormalization, Dense, Dropout
from keras.callbacks import EarlyStopping

def train_rnn_model(*dataset_paths):
    """Train the RNN on the concatenated CSV datasets and return (model, scaler, encoder)."""
    # Load and concatenate the datasets
    dfs = [pd.read_csv(path) for path in dataset_paths]
    df = pd.concat(dfs, ignore_index=True)

    # number of rows and columns
    print(df.shape)

    # Convert all feature columns to numeric and set non-convertible values to NaN
    for col in df.columns[:-1]:  # Excluding the last column
        df[col] = pd.to_numeric(df[col], errors='coerce')

    # Removing rows with NaN values
    df.dropna(inplace=True)

    # Separate features and labels
    X = df.iloc[:, :-1].values  # All columns except the last one
    y = df.iloc[:, -1].values   # Only the last column

    # Scale the features
    scaler = MinMaxScaler()
    X = scaler.fit_transform(X)

    # Reshape X to fit the RNN model (samples, time steps, features)
    X = X.reshape((X.shape[0], 1, X.shape[1]))

    # Encode the labels (on scikit-learn < 1.2, use sparse=False instead)
    encoder = OneHotEncoder(sparse_output=False)
    y_encoded = encoder.fit_transform(y.reshape(-1, 1))

    # Define the RNN model
    model_rnn = Sequential()
    model_rnn.add(Bidirectional(SimpleRNN(30, activation='relu', return_sequences=True), input_shape=(X.shape[1], X.shape[2])))
    model_rnn.add(BatchNormalization())
    model_rnn.add(SimpleRNN(32, activation='relu'))
    model_rnn.add(Dropout(0.3))
    model_rnn.add(Dense(16, activation='relu'))
    model_rnn.add(Dense(y_encoded.shape[1], activation='softmax'))

    # Compile the model with categorical_crossentropy loss function
    model_rnn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Add EarlyStopping as a callback
    early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

    # Hold out 20% of the data as a validation set for early stopping
    X_train, X_val, y_train, y_val = train_test_split(X, y_encoded, test_size=0.2, random_state=42)

    # Train the model
    model_rnn.fit(X_train, y_train, epochs=100, validation_data=(X_val, y_val), callbacks=[early_stopping])

    # Save the model
    model_rnn.save('rnn_model.h5')
    # Save the scaler to use it in predict.py and scale the realtime data
    joblib.dump(scaler, 'rnn_scaler.joblib')

    return model_rnn, scaler, encoder




In [2]:
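def preprocess_test_data(df_test, scaler, encoder):
    # Apply the same cleaning as train_rnn_model, on a copy so the caller's DataFrame is untouched
    df_test = df_test.copy()
    for col in df_test.columns[:-1]:
        df_test[col] = pd.to_numeric(df_test[col], errors='coerce')
    df_test = df_test.dropna()
    X_test = df_test.iloc[:, :-1].values
    y_test = df_test.iloc[:, -1].values
    # Reuse the scaler and encoder fitted on the training data (transform only, no refit)
    X_test = scaler.transform(X_test)
    X_test = X_test.reshape((X_test.shape[0], 1, X_test.shape[1]))
    y_test_encoded = encoder.transform(y_test.reshape(-1, 1))
    return X_test, y_test_encoded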
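
Because the test data is transformed with a `MinMaxScaler` fitted on other signers, scaled values can fall outside [0, 1] when a new signer produces sensor readings beyond the training range. A minimal sketch of a check for this, assuming a hypothetical helper `out_of_range_fraction` and toy arrays that are not part of the notebook:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def out_of_range_fraction(scaler, X_new):
    """Fraction of scaled values outside [0, 1], i.e. outside the range seen during fit."""
    X_scaled = scaler.transform(X_new)
    return float(np.mean((X_scaled < 0.0) | (X_scaled > 1.0)))

# Toy data (hypothetical): two features, fitted on three training rows.
X_train = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
X_new = np.array([[1.0, 15.0], [3.0, 5.0]])  # second row lies outside the training range

scaler = MinMaxScaler().fit(X_train)
frac = out_of_range_fraction(scaler, X_new)
print(f"{frac:.0%} of scaled test values fall outside [0, 1]")
```

A high fraction on a new signer's data would suggest that the inputs differ from anything seen in training, which is one possible reason for a drop in accuracy.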

In [3]:
def predict(model_rnn, X, y_test):
    """Print accuracy and weighted precision, recall, and F1-score for one-hot encoded labels."""
    y_pred = model_rnn.predict(X)
    # Convert predictions to classes
    y_pred_classes = np.argmax(y_pred, axis=1)
    y_test_classes = np.argmax(y_test, axis=1)
    # Calculate the accuracy
    accuracy = accuracy_score(y_test_classes, y_pred_classes)
    print(f"Accuracy on the test set: {accuracy * 100:.2f}%")

    # Calculate precision, recall, and F1-score
    precision = precision_score(y_test_classes, y_pred_classes, average='weighted')
    recall = recall_score(y_test_classes, y_pred_classes, average='weighted')
    f1 = f1_score(y_test_classes, y_pred_classes, average='weighted')

    print(f"Precision: {precision:.2f}")
    print(f"Recall: {recall:.2f}")
    print(f"F1-score: {f1:.2f}")
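
When reading the output of `predict`, it helps to know that recall with `average='weighted'` always equals accuracy, while weighted precision can differ from both. The sketch below illustrates this on made-up labels (the arrays are hypothetical, not results from the notebook):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical labels: class 1 is over-predicted.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 1, 1, 2])

accuracy = accuracy_score(y_true, y_pred)
weighted_recall = recall_score(y_true, y_pred, average="weighted")
weighted_precision = precision_score(y_true, y_pred, average="weighted")

# Weighted recall = sum over classes of (support / n) * (correct / support) = correct / n = accuracy
print(f"accuracy={accuracy:.3f} recall={weighted_recall:.3f} precision={weighted_precision:.3f}")
```

A weighted precision noticeably higher than the accuracy therefore indicates that errors are concentrated in a few over-predicted classes rather than spread evenly.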

# Model Evaluation on Unseen Data
## First Test
In this section, we train the model on data from Badr and Ismail and evaluate it on data from Mouad, who is not part of the training set. This test assesses the model's ability to generalize from the training data to a new, unseen signer.


In [4]:
model_rnn, scaler, encoder = train_rnn_model('../dataset/sensor_data_badr.csv', '../dataset/sensor_data_ismail.csv')

(1191, 441)

Epoch 1/100
...
Epoch 37/100



In [5]:
df_test = pd.read_csv('../dataset/sensor_data_mouad.csv')
X_test_mouad, y_test_mouad_encoded = preprocess_test_data(df_test, scaler, encoder)

In [6]:
predict(model_rnn, X_test_mouad, y_test_mouad_encoded)

Accuracy on the test set: 99.33%
Precision: 0.99
Recall: 0.99
F1-score: 0.99


## Second Test
Following the first test, we retrain the model on a larger dataset that adds Mouad's data to Badr's and Ismail's, and then test it on data from Kamal, a member not included in any training set. This test shows how well the model adapts to a new individual and how dataset quality may affect its performance.


In [7]:
model_rnn, scaler, encoder = train_rnn_model('../dataset/sensor_data_badr.csv', '../dataset/sensor_data_mouad.csv', '../dataset/sensor_data_ismail.csv')

(1791, 441)

Epoch 1/100
...
Epoch 40/100



In [8]:
df_test = pd.read_csv('../dataset/sensor_data_kamal.csv')
X_test_kamal, y_test_kamal_encoded = preprocess_test_data(df_test, scaler, encoder)

In [9]:
predict(model_rnn, X_test_kamal, y_test_kamal_encoded)

Accuracy on the test set: 71.94%
Precision: 0.83
Recall: 0.72
F1-score: 0.68




# Conclusion
Testing on unseen signers shows both the capabilities and the limits of the model's generalization. The model reached 99.33% accuracy on Mouad's data but only 71.94% accuracy (weighted F1-score 0.68) on Kamal's, even though the second model was trained on more data. This drop underscores the importance of dataset quality and diversity in training: comprehensive and varied training data is needed for the model to work across different individuals and sign language variations.
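
To find out which signs account for a drop like the one on Kamal's data, a per-class breakdown is a natural next step. The following is a rough sketch on made-up labels: `per_class_recall` is a hypothetical helper, and in the notebook the inputs would be the `np.argmax` of the one-hot test labels and of `model_rnn.predict(...)`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def per_class_recall(y_true_classes, y_pred_classes, n_classes):
    """Return the confusion matrix and the recall of each class."""
    cm = confusion_matrix(y_true_classes, y_pred_classes, labels=np.arange(n_classes))
    support = cm.sum(axis=1)  # number of true samples per class
    # Classes absent from the test set keep a recall of 0 instead of dividing by zero
    recall = np.divide(np.diag(cm), support, out=np.zeros(n_classes), where=support > 0)
    return cm, recall

# Hypothetical labels for three signs.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 1, 1])

cm, recall = per_class_recall(y_true, y_pred, n_classes=3)
worst = int(np.argmin(recall))
print(f"lowest recall: class {worst} ({recall[worst]:.2f})")
print(cm)
```

Rows of the confusion matrix are true classes and columns are predicted classes, so an off-diagonal entry shows which sign a poorly recognized class is being mistaken for.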
