# Data Preprocessing

The dataset we will use is available from https://github.com/sergiuoprea/Hand-Gesture-Recognition-Datasets. It contains 7 gestures, each performed 10 times by 5 different people, for a total of 50 samples per gesture. The repository includes both the raw dataset of 2D coordinates and a motion-normalized dataset, which is the one downloaded below.

In [9]:
# Download the dataset
!wget https://raw.githubusercontent.com/sergiuoprea/Hand-Gesture-Recognition-Datasets/master/Motion_Normalized_Dataset.csv

--2023-07-11 19:05:37--  https://raw.githubusercontent.com/sergiuoprea/Hand-Gesture-Recognition-Datasets/master/Motion_Normalized_Dataset.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 611023 (597K) [text/plain]
Saving to: ‘Motion_Normalized_Dataset.csv.1’


2023-07-11 19:05:37 (10.6 MB/s) - ‘Motion_Normalized_Dataset.csv.1’ saved [611023/611023]



In [9]:
import numpy as np
import pandas as pd

# Read the dataset
df = pd.read_csv('Motion_Normalized_Dataset.csv')

# Select only samples where the gesture is move left, move right, move up, or move down
df = df[df['Gesture'].isin(['MoveLeft', 'MoveRight', 'MoveUp', 'MoveDown'])]

# Select all features that start with LC_X, LC_Y, and Gesture
X = df.filter(regex='^(LC_X|LC_Y|Gesture)')

# Create a dictionary of labels and their corresponding integer values
label_to_int = {'MoveLeft': 0, 'MoveRight': 1, 'MoveUp': 2, 'MoveDown': 3}

# Create a dictionary of integer values and their corresponding labels
int_to_label = {0: 'MoveLeft', 1: 'MoveRight', 2: 'MoveUp', 3: 'MoveDown'}

# Convert the labels from strings to integers starting at index 0
y = X['Gesture'].map(label_to_int).values

# Remove the gesture column from the dataset
X = X.drop(columns=['Gesture'])

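# The LC_X columns come first, followed by the LC_Y columns,
# so each time step i has an x component at i and a y component at i + n_steps
n_steps = X.shape[1] // 2
X_new = np.zeros((X.shape[0], n_steps))

for i in range(n_steps):
    # Combine each (x, y) pair into a single symbol in 0..8
    # (this assumes both components take values in {-1, 0, 1})
    X_new[:, i] = (X.iloc[:, i] + 1) * 3 + (X.iloc[:, i + n_steps] + 1)

# Convert to integer pandas dataframe
X = pd.DataFrame(X_new, dtype=int)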

# Split the dataset into training and testing sets
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y)

X_train.head()

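|     | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 |
|-----|---|---|---|---|---|---|---|---|---|---|-----|----|----|----|----|----|----|----|----|----|----|
| 162 | 4 | 5 | 3 | 1 | 7 | 4 | 4 | 5 | 1 | 4 | ... | 3 | 0 | 3 | 3 | 3 | 3 | 6 | 6 | 6 | 6 |
| 132 | 4 | 2 | 6 | 2 | 6 | 5 | 5 | 5 | 1 | 3 | ... | 5 | 5 | 4 | 5 | 5 | 2 | 2 | 2 | 2 | 5 |
| 179 | 4 | 3 | 3 | 3 | 6 | 0 | 8 | 3 | 4 | 6 | ... | 6 | 6 | 0 | 3 | 7 | 6 | 6 | 6 | 6 | 3 |
| 10 | 4 | 0 | 3 | 2 | 0 | 2 | 1 | 0 | 0 | 2 | ... | 2 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 98 | 4 | 3 | 5 | 6 | 5 | 4 | 3 | 4 | 5 | 3 | ... | 8 | 6 | 7 | 7 | 5 | 8 | 7 | 8 | 8 | 7 |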
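
The symbol encoding used in the preprocessing cell can be checked in isolation. The sketch below assumes that each motion-normalized component takes a value in {-1, 0, 1}, which is consistent with the 9 symbols seen in the table but is not stated by the dataset description. The helper names `encode_direction` and `decode_direction` are hypothetical and are not part of the notebook.

```python
import numpy as np

def encode_direction(dx, dy):
    """Map direction components in {-1, 0, 1} to a single symbol in 0..8."""
    return (np.asarray(dx) + 1) * 3 + (np.asarray(dy) + 1)

def decode_direction(symbol):
    """Invert encode_direction: recover (dx, dy) from a symbol in 0..8."""
    symbol = np.asarray(symbol)
    return symbol // 3 - 1, symbol % 3 - 1

# Enumerate all nine (dx, dy) combinations (assumed value range)
components = np.array([-1, 0, 1])
dx, dy = np.meshgrid(components, components, indexing="ij")
symbols = encode_direction(dx, dy)
print(symbols)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]
```

Each row of the printed grid corresponds to one value of the x component and each column to one value of the y component, so symbol 4 represents no movement in either direction.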


# Fitting the Model to the Training Set

In [19]:
# Fit the model using hmmlearn
from hmmlearn import hmm

# Fit a model for each gesture type
models = []
for gesture in label_to_int.keys():
    # Get all samples that belong to the current gesture
    X_gesture = X_train[y_train == label_to_int[gesture]]
    
    # Create a HMM model with 2 hidden states
    model = hmm.CategoricalHMM(n_components=2)
    
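    # Fit the model to the current gesture. hmmlearn expects all sequences
    # stacked into a single column of symbols, plus the length of each sequence
    lengths = [X_gesture.shape[1]] * X_gesture.shape[0]
    model.fit(X_gesture.values.reshape(-1, 1), lengths)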

    # Add the model to the list of models
    models.append(model)
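
hmmlearn's `fit` and `score` accept several observation sequences at once as one concatenated array together with a `lengths` argument giving the length of each sequence. For a categorical HMM, the concatenated array is a single column of integer symbols. The following is a minimal sketch of that layout using NumPy only; the helper name `stack_sequences` and the toy data are hypothetical.

```python
import numpy as np

def stack_sequences(sequences):
    """Concatenate symbol sequences into the (total_steps, 1) layout.

    Returns the stacked column array and the list of sequence lengths.
    """
    lengths = [len(seq) for seq in sequences]
    stacked = np.concatenate([np.asarray(seq, dtype=int) for seq in sequences])
    return stacked.reshape(-1, 1), lengths

# Two made-up sequences of different lengths (illustrative values only)
toy_sequences = [[4, 5, 3], [4, 2, 6, 2]]
X_stacked, lengths = stack_sequences(toy_sequences)

print(X_stacked.shape)  # (7, 1)
print(lengths)          # [3, 4]

# With hmmlearn installed, a model could then be trained with:
# model.fit(X_stacked, lengths)
```

Passing `lengths` tells the model where one sequence ends and the next begins, so that no transition is learned across the boundary between two separate recordings.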

# Evaluating the Model on the Test Set

For gesture recognition, one model is trained per gesture. Each test sample is then scored against every model, and the gesture whose model gives the highest score is the prediction.

The score is computed using the forward algorithm. hmmlearn's `score` method returns the log-likelihood of the test sample given the model, log P(sample | model), so the model with the highest score is the one that best explains the test sample.
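
To make the decision rule concrete, here is a minimal NumPy sketch of maximum-likelihood classification with a scaled forward pass. It is an illustration under stated assumptions, not hmmlearn's implementation: the function name `forward_log_likelihood` and the two toy models (with hand-picked parameters over 3 symbols) are hypothetical.

```python
import numpy as np

def forward_log_likelihood(obs, startprob, transmat, emissionprob):
    """Return log P(obs | model) for a discrete HMM via the scaled forward pass."""
    log_lik = 0.0
    alpha = startprob * emissionprob[:, obs[0]]
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by emission
            alpha = (alpha @ transmat) * emissionprob[:, obs[t]]
        # Rescale to avoid underflow and accumulate the log of the scale factor
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale
    return log_lik

# Two toy 2-state models sharing start and transition probabilities.
# Model 0 favours symbol 0, model 1 favours symbol 2 (made-up numbers).
start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1], [0.1, 0.9]])
toy_models = [
    (start, trans, np.array([[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]])),
    (start, trans, np.array([[0.1, 0.1, 0.8], [0.1, 0.3, 0.6]])),
]

sequence = np.array([0, 0, 1, 0])
scores = [forward_log_likelihood(sequence, *params) for params in toy_models]
predicted = int(np.argmax(scores))
print(predicted)  # 0, since the sequence is mostly symbol 0
```

Because the logarithm is monotonic, taking the argmax over log-likelihoods selects the same model as taking it over the likelihoods themselves, while avoiding numerical underflow on long sequences.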

In [21]:
# Classify a sample as one of the four gestures
def classify_sample(sample):
    # Compute the log-likelihood of the sample under each model
    scores = [model.score(sample) for model in models]
    
    # Get the index of the best scoring model
    index_best_model = np.argmax(scores)
    
    # Return the integer label of the corresponding gesture
    return index_best_model

# Classify all samples and count the number of misclassifications.
# Each sample is passed as a single sequence with shape (n_steps, 1)
preds = np.array([classify_sample(sample.reshape(-1, 1)) for sample in X_test.values])
num_misclassifications = int(np.sum(preds != y_test))

# Print the number of misclassifications
print('Number of misclassifications:', num_misclassifications)

# Print the accuracy (fraction of test samples classified correctly)
print('Accuracy:', 1 - num_misclassifications / len(y_test))

# Get the parameters of the first model
A = models[0].transmat_
B = models[0].emissionprob_
pi = models[0].startprob_

# Print the transition matrix
print('Transition matrix:')
print(A)

# Print the emission matrix
print('Emission matrix:')
print(B)

# Print the initial probabilities
print('Initial probabilities:')
print(pi)

Number of misclassifications: 0
Accuracy: 1.0
Transition matrix:
[[0.12606244 0.87393756]
 [0.95223898 0.04776102]]
Emission matrix:
[[0.21024223 0.2578936  0.23249039 0.07266194 0.11676943 0.05101107
  0.02273743 0.0010716  0.03512231]
 [0.3318889  0.21020856 0.15721939 0.04178505 0.12389108 0.0305557
  0.05039491 0.03184535 0.02221106]]
Initial probabilities:
[9.99999081e-01 9.19226837e-07]


# The Forward Algorithm