# Squat Exercise Classification

In this notebook, we will collect data, preprocess it, and train two models (a simple feedforward neural network and a convolutional neural network) to classify squat exercises. We will then compare the performance of both models.

## 1. Set up important landmarks and functions

Based on my research, correct squat form is judged by the position of the:

- Back
- Hip
- Legs

Therefore, 9 keypoints are extracted with MediaPipe to train a model that detects correct squat form:

- `NOSE`
- `LEFT_SHOULDER`
- `RIGHT_SHOULDER`
- `LEFT_HIP`
- `RIGHT_HIP`
- `LEFT_KNEE`
- `RIGHT_KNEE`
- `LEFT_ANKLE`
- `RIGHT_ANKLE`

The extracted data is saved to a .csv file and loaded as a data frame.

Each row has a `label` column holding the class of that data point, followed by 9 x 4 = 36 feature columns: the `x`, `y`, `z`, and `visibility` values of each of the 9 landmarks, flattened into a single row.

According to the MediaPipe documentation, each landmark consists of the following:

- `x` and `y`: Landmark coordinates normalized to [0.0, 1.0] by the image width and height respectively.
- `z`: Represents the landmark depth with the depth at the midpoint of hips being the origin, and the smaller the value the closer the landmark is to the camera. The magnitude of `z` uses roughly the same scale as `x`.
- `visibility`: A value in [0.0, 1.0] indicating the likelihood of the landmark being visible (present and not occluded) in the image.
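
As a minimal sketch of the flattening described above, the block below turns one pose into a 36-value feature row. The `Landmark` dataclass and the `flatten_landmarks` helper are hypothetical stand-ins used only for illustration; with real MediaPipe output, the same four fields would be read from the detected landmark objects.

```python
from dataclasses import dataclass

IMPORTANT_LANDMARKS = [
    "NOSE", "LEFT_SHOULDER", "RIGHT_SHOULDER", "LEFT_HIP", "RIGHT_HIP",
    "LEFT_KNEE", "RIGHT_KNEE", "LEFT_ANKLE", "RIGHT_ANKLE",
]


@dataclass
class Landmark:
    """Hypothetical stand-in for one detected landmark."""
    x: float
    y: float
    z: float
    visibility: float


def flatten_landmarks(pose):
    """Flatten a {name: Landmark} mapping into [x, y, z, visibility, ...]."""
    row = []
    for name in IMPORTANT_LANDMARKS:
        lm = pose[name]
        row.extend([lm.x, lm.y, lm.z, lm.visibility])
    return row


# Dummy pose: every landmark at the image centre and fully visible
pose = {name: Landmark(0.5, 0.5, 0.0, 1.0) for name in IMPORTANT_LANDMARKS}
row = flatten_landmarks(pose)
print(len(row))  # 9 landmarks x 4 values = 36
```

The order of `IMPORTANT_LANDMARKS` fixes the column order, so the same list should drive both the CSV header and every row.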

In [6]:
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import cv2
import mediapipe as mp
import time
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [7]:
# Define the landmarks
imp_landmarks = ["NOSE", "LEFT_SHOULDER", "RIGHT_SHOULDER", "LEFT_HIP", "RIGHT_HIP", "LEFT_KNEE", "RIGHT_KNEE", "LEFT_ANKLE", "RIGHT_ANKLE"]

In [8]:
landmarks = ["label"]

for landmark in imp_landmarks:
    landmarks += [f"{landmark.lower()}_x", f"{landmark.lower()}_y", f"{landmark.lower()}_z", f"{landmark.lower()}_v"]

## 2. Data Collection and Preprocessing

We will use the `saving_squat.py` script to gather landmarks for squat exercises using MediaPipe. The script displays the video with the landmarks overlaid and waits for a keystroke to classify it. The classified data is saved to a CSV file for training and testing.
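
The script itself is not shown in this notebook, so the following is only a sketch of how a labelled pose might be written as one CSV row. The helpers `build_header` and `append_sample`, the two-landmark subset, and the label value `0` are all assumptions made for illustration. An in-memory buffer stands in for the real file.

```python
import csv
import io

# Hypothetical two-landmark subset, to keep the demo short
demo_landmarks = ["NOSE", "LEFT_HIP"]


def build_header(landmark_names):
    """Return the CSV header: 'label' plus x, y, z, v columns per landmark."""
    header = ["label"]
    for name in landmark_names:
        prefix = name.lower()
        header += [f"{prefix}_x", f"{prefix}_y", f"{prefix}_z", f"{prefix}_v"]
    return header


def append_sample(writer, label, features, header):
    """Write one labelled pose as a CSV row after checking the feature count."""
    expected = len(header) - 1  # every column except 'label'
    if len(features) != expected:
        raise ValueError(f"expected {expected} features, got {len(features)}")
    writer.writerow([label, *features])


buffer = io.StringIO()  # stands in for an open file such as squat_data.csv
writer = csv.writer(buffer)
header = build_header(demo_landmarks)
writer.writerow(header)
append_sample(writer, 0, [0.51, 0.20, -0.30, 0.99, 0.48, 0.55, 0.01, 0.97], header)

# Read the buffer back to inspect what was written
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows[0])
print(rows[1])
```

Checking the feature count before writing keeps a partially detected pose from producing a misaligned row.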

In [9]:
# Load the data
data = pd.read_csv('squat_data.csv')
data.head()

In [10]:
# Split the data into features and labels
X = data.drop(columns=['label'])
y = data['label']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
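
If the classes are imbalanced, a plain random split can leave a rare class under-represented in the test set. A sketch of a stratified variant, using synthetic data in place of `squat_data.csv` (the `*_demo` names and the 60/30/10 class mix are assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 samples, 36 features, imbalanced 3-class labels
rng = np.random.default_rng(0)
X_demo = rng.random((100, 36))
y_demo = np.array([0] * 60 + [1] * 30 + [2] * 10)

# stratify=y_demo keeps the class proportions the same in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("train class counts:", np.bincount(y_tr))
print("test class counts: ", np.bincount(y_te))
```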

## 3. Implementing a Simple Feedforward Neural Network

We will define a simple feedforward neural network using `torch.nn` and train it on the training data.

In [11]:
class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

# Hyperparameters
input_size = X_train.shape[1]
hidden_size = 64
num_classes = 3
num_epochs = 100
learning_rate = 0.001

# Model, loss function, and optimizer
model = SimpleNN(input_size, hidden_size, num_classes)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Convert data to PyTorch tensors
X_train_tensor = torch.tensor(X_train.values, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.long)
X_test_tensor = torch.tensor(X_test.values, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test.values, dtype=torch.long)

# Training loop
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(X_train_tensor)
    loss = criterion(outputs, y_train_tensor)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Evaluate the model (eval mode, no gradient tracking needed for inference)
model.eval()
with torch.no_grad():
    # The predicted class is the index of the largest logit
    train_predicted = model(X_train_tensor).argmax(dim=1)
    train_accuracy = accuracy_score(y_train_tensor, train_predicted)

    test_predicted = model(X_test_tensor).argmax(dim=1)
    test_accuracy = accuracy_score(y_test_tensor, test_predicted)

print(f'Training Accuracy: {train_accuracy:.4f}')
print(f'Testing Accuracy: {test_accuracy:.4f}')
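
The loop above updates the weights once per epoch on the full training set. For larger datasets, a mini-batch variant may be preferable. The sketch below shows one way to do that with `DataLoader`, using random tensors in place of the real data; the `demo_*` names, the batch size of 32, and the 5 epochs are arbitrary assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data: 90 samples, 36 features, 3 classes
X_demo = torch.randn(90, 36)
y_demo = torch.randint(0, 3, (90,))

demo_loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=32, shuffle=True)
demo_model = nn.Sequential(nn.Linear(36, 64), nn.ReLU(), nn.Linear(64, 3))
demo_criterion = nn.CrossEntropyLoss()
demo_optimizer = torch.optim.Adam(demo_model.parameters(), lr=0.001)

demo_model.train()
for epoch in range(5):
    running_loss = 0.0
    for xb, yb in demo_loader:
        demo_optimizer.zero_grad()
        loss = demo_criterion(demo_model(xb), yb)
        loss.backward()
        demo_optimizer.step()
        # Weight each batch loss by its size so the epoch average is exact
        running_loss += loss.item() * xb.size(0)
    print(f"Epoch {epoch + 1}: loss {running_loss / len(demo_loader.dataset):.4f}")

# Inference on the whole set in eval mode
demo_model.eval()
with torch.no_grad():
    demo_logits = demo_model(X_demo)
print(demo_logits.shape)
```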

## 4. Implementing a Convolutional Neural Network

We will define a convolutional neural network using `torch.nn` and train it on the training data.
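
The input size of the final linear layer depends on how each convolution and pooling step changes the sequence length. A minimal sketch of that arithmetic, assuming 36 input features (9 landmarks x 4 values), dilation 1, and the kernel sizes, padding, and strides of the `ConvNet` defined in this section; the two helper functions are hypothetical:

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0):
    """Output length of a 1D convolution with dilation 1."""
    return (length + 2 * padding - kernel_size) // stride + 1


def pool1d_out_len(length, kernel_size, stride=None):
    """Output length of 1D max pooling; stride defaults to kernel_size."""
    stride = kernel_size if stride is None else stride
    return (length - kernel_size) // stride + 1


length = 36                                                   # input features
length = conv1d_out_len(length, kernel_size=3, padding=1)     # layer1 conv -> 36
length = pool1d_out_len(length, kernel_size=2, stride=2)      # layer1 pool -> 18
length = conv1d_out_len(length, kernel_size=3)                # layer2 conv -> 16
length = pool1d_out_len(length, kernel_size=2)                # layer2 pool -> 8

flattened = 32 * length  # 32 output channels x remaining length
print(flattened)  # 256
```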

In [12]:
class ConvNet(nn.Module):
    def __init__(self, num_classes):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm1d(16),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv1d(16, 32, kernel_size=3),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(2))
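        # With 36 input features the length shrinks 36 -> 18 -> 16 -> 8,
        # so the flattened size is 32 channels * 8 = 256
        self.fc1 = nn.Linear(32 * 8, num_classes)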

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        return out

# Hyperparameters
num_classes = 3
num_epochs = 100
learning_rate = 0.001

# Model, loss function, and optimizer
# (CNN-specific names keep the feedforward model and its results intact for Section 5)
cnn_model = ConvNet(num_classes)
cnn_criterion = nn.CrossEntropyLoss()
cnn_optimizer = torch.optim.Adam(cnn_model.parameters(), lr=learning_rate)

# Reshape data for ConvNet: (batch, features) -> (batch, channels=1, features)
X_train_cnn = X_train_tensor.unsqueeze(1)
X_test_cnn = X_test_tensor.unsqueeze(1)

# Training loop
cnn_model.train()
for epoch in range(num_epochs):
    # Forward pass
    outputs = cnn_model(X_train_cnn)
    loss = cnn_criterion(outputs, y_train_tensor)

    # Backward pass and optimization
    cnn_optimizer.zero_grad()
    loss.backward()
    cnn_optimizer.step()

    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Evaluate the model
cnn_model.eval()
with torch.no_grad():
    cnn_train_predicted = cnn_model(X_train_cnn).argmax(dim=1)
    cnn_train_accuracy = accuracy_score(y_train_tensor, cnn_train_predicted)

    cnn_test_predicted = cnn_model(X_test_cnn).argmax(dim=1)
    cnn_test_accuracy = accuracy_score(y_test_tensor, cnn_test_predicted)

print(f'Training Accuracy: {cnn_train_accuracy:.4f}')
print(f'Testing Accuracy: {cnn_test_accuracy:.4f}')

## 5. Comparing the Results

We will compare the accuracy, precision, recall, and F1-score of both the NN and CNN models on the test set. We will also visualize the results in a grouped bar chart.

In [13]:
from sklearn.metrics import precision_score, recall_score, f1_score

# Feedforward NN results from Section 3 (not overwritten, since the CNN uses its own names)
nn_test_predicted = test_predicted
nn_test_accuracy = test_accuracy

# Calculate precision, recall, and F1-score for NN
nn_precision = precision_score(y_test_tensor, nn_test_predicted, average='weighted')
nn_recall = recall_score(y_test_tensor, nn_test_predicted, average='weighted')
nn_f1 = f1_score(y_test_tensor, nn_test_predicted, average='weighted')

# Calculate precision, recall, and F1-score for CNN
cnn_precision = precision_score(y_test_tensor, cnn_test_predicted, average='weighted')
cnn_recall = recall_score(y_test_tensor, cnn_test_predicted, average='weighted')
cnn_f1 = f1_score(y_test_tensor, cnn_test_predicted, average='weighted')

# Print the results
print(f'NN Precision: {nn_precision:.4f}')
print(f'NN Recall: {nn_recall:.4f}')
print(f'NN F1-score: {nn_f1:.4f}')

print(f'CNN Precision: {cnn_precision:.4f}')
print(f'CNN Recall: {cnn_recall:.4f}')
print(f'CNN F1-score: {cnn_f1:.4f}')

# Visualize the results (one value per model, all measured on the test set)
labels = ['NN', 'CNN']
accuracy = [nn_test_accuracy, cnn_test_accuracy]
precision = [nn_precision, cnn_precision]
recall = [nn_recall, cnn_recall]
f1 = [nn_f1, cnn_f1]

x = np.arange(len(labels))
width = 0.2

fig, ax = plt.subplots()
# Offsets of -1.5, -0.5, +0.5, +1.5 widths center each group of four bars on its tick
rects1 = ax.bar(x - 1.5*width, accuracy, width, label='Accuracy')
rects2 = ax.bar(x - 0.5*width, precision, width, label='Precision')
rects3 = ax.bar(x + 0.5*width, recall, width, label='Recall')
rects4 = ax.bar(x + 1.5*width, f1, width, label='F1-score')

ax.set_xlabel('Model')
ax.set_ylabel('Scores')
ax.set_title('Comparison of NN and CNN Models')
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()

fig.tight_layout()

plt.show()
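
Weighted averages can hide which class a model confuses with which. A per-class view may help here. The sketch below uses scikit-learn's `confusion_matrix` and `classification_report` on a small hand-made set of labels; the `y_true_demo` and `y_pred_demo` arrays are invented for illustration and are not results from either model.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Invented labels for three classes, not output from the NN or CNN
y_true_demo = np.array([0, 0, 1, 1, 2, 2])
y_pred_demo = np.array([0, 1, 1, 1, 2, 0])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2])
print(cm)

# Per-class precision, recall, and F1-score
print(classification_report(y_true_demo, y_pred_demo, labels=[0, 1, 2], zero_division=0))
```

To apply this to the notebook's models, the demo arrays would be replaced with the test labels and each model's test predictions.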