1. Set up important landmarks and functions
Generate Data Frame
According to my research, correct squat form is assessed from the position of the:

Back
Hip
Legs
Therefore, 9 keypoints will be extracted from MediaPipe in order to train a model that detects correct squat form:

"NOSE",
"LEFT_SHOULDER",
"RIGHT_SHOULDER",
"LEFT_HIP",
"RIGHT_HIP",
"LEFT_KNEE",
"RIGHT_KNEE",
"LEFT_ANKLE",
"RIGHT_ANKLE"
The data frame will be saved in a .csv file.

The data frame contains a "label" column, which holds the label of each data point.

Another 9 x 4 columns represent the 9 pose landmarks that are important for a squat; each landmark's information is flattened into four values (x, y, z, and visibility).

According to the MediaPipe documentation, each landmark consists of the following:

x and y: Landmark coordinates normalized to [0.0, 1.0] by the image width and height respectively.
z: Represents the landmark depth with the depth at the midpoint of hips being the origin, and the smaller the value the closer the landmark is to the camera. The magnitude of z uses roughly the same scale as x.
visibility: A value in [0.0, 1.0] indicating the likelihood of the landmark being visible (present and not occluded) in the image.
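
The flattening described above can be sketched as follows. This is a minimal illustration, not the notebook's extraction code: `flatten_landmarks` is a hypothetical helper, and the `SimpleNamespace` objects stand in for MediaPipe landmark objects, assuming only that each one exposes `x`, `y`, `z`, and `visibility` attributes. The numeric values are made up.

```python
from types import SimpleNamespace

IMP_LANDMARKS = [
    "NOSE", "LEFT_SHOULDER", "RIGHT_SHOULDER", "LEFT_HIP", "RIGHT_HIP",
    "LEFT_KNEE", "RIGHT_KNEE", "LEFT_ANKLE", "RIGHT_ANKLE",
]

def flatten_landmarks(landmarks_by_name, names=IMP_LANDMARKS):
    """Flatten {name: landmark} into [x, y, z, visibility, x, y, ...]."""
    row = []
    for name in names:
        lm = landmarks_by_name[name]
        row.extend([lm.x, lm.y, lm.z, lm.visibility])
    return row

# Stand-in landmarks with placeholder values (not real pose data)
fake_pose = {
    name: SimpleNamespace(x=0.5, y=0.25, z=-0.1, visibility=0.9)
    for name in IMP_LANDMARKS
}

row = flatten_landmarks(fake_pose)
print(len(row))  # 9 landmarks x 4 values = 36
```

Prepending the label to such a row would give one line of the data frame.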

In [6]:
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import cv2
import mediapipe as mp
import time

In [7]:
# Define the landmarks
imp_landmarks = ["NOSE", "LEFT_SHOULDER", "RIGHT_SHOULDER", "LEFT_HIP", "RIGHT_HIP", "LEFT_KNEE", "RIGHT_KNEE", "LEFT_ANKLE", "RIGHT_ANKLE"]

In [8]:
landmarks = ["label"]

for landmark in imp_landmarks:
    landmarks += [f"{landmark.lower()}_x", f"{landmark.lower()}_y", f"{landmark.lower()}_z", f"{landmark.lower()}_v"]
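
Since the data frame is saved to a .csv file, one way to use this header list is with Python's `csv` module. The sketch below is an assumption about how rows might be written, not the notebook's own saving code: `rows_to_csv` is a hypothetical helper, the row values are placeholders, and an in-memory buffer is used instead of a real file.

```python
import csv
import io

imp_landmarks = [
    "NOSE", "LEFT_SHOULDER", "RIGHT_SHOULDER", "LEFT_HIP", "RIGHT_HIP",
    "LEFT_KNEE", "RIGHT_KNEE", "LEFT_ANKLE", "RIGHT_ANKLE",
]

# Same header layout as above: label first, then x, y, z, v per landmark
header = ["label"]
for name in imp_landmarks:
    key = name.lower()
    header += [f"{key}_x", f"{key}_y", f"{key}_z", f"{key}_v"]

def rows_to_csv(header, rows):
    """Write a header and rows to CSV text, rejecting rows of the wrong width."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} values, got {len(row)}")
        writer.writerow(row)
    return buffer.getvalue()

# Placeholder row: label 0 followed by zeros for every feature
placeholder_row = [0] + [0.0] * (len(header) - 1)
csv_text = rows_to_csv(header, [placeholder_row])
print(len(header))  # 1 label column + 36 feature columns = 37
```

Checking the row width before writing keeps a missing landmark from silently shifting the columns.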

# Model Training based on the data

In [9]:
# Load the data
data = pd.read_csv("/Users/defeee/Documents/GitHub/FormAI-ML/Computer_Vision/squat_data.csv")
data.head()

   nose_x  nose_y  left_shoulder_x  left_shoulder_y  right_shoulder_x  \
0  0.0000  0.0000         0.000000         0.000000          0.000000   
1  0.0000  0.0000         0.000000         0.000000          0.000000   
2  0.0000  0.0000         0.000000         0.000000          0.000000   
3  0.0000  0.0000         0.000000         0.000000          0.000000   
4  0.0000  0.0000         0.000000         0.000000          0.000000   

   right_shoulder_y  left_hip_x  left_hip_y  right_hip_x  right_hip_y  \
0          0.000000    0.000000    0.000000     0.000000     0.000000   
1          0.000000    0.000000    0.000000     0.000000     0.000000   
2          0.000000    0.000000    0.000000     0.000000     0.000000   
3          0.000000    0.000000    0.000000     0.000000     0.000000   
4          0.000000    0.000000    0.000000     0.000000     0.000000   

   ...  right_knee_y  left_ankle_x  left_ankle_y  right_ankle_x  \
0  ...      0.000000      0.000000      0.000000       

In [None]:
# Split the data
from sklearn.model_selection import train_test_split
X = data.drop("label", axis=1)
y = data["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
# Import the PyTorch modules used to build and train the NN model
import torch
import torch.nn as nn
import torch.nn.functional as F

In [None]:
# Define the model
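class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # 18 input features: 9 landmarks x (x, y), the columns shown by data.head()
        self.fc1 = nn.Linear(18, 64)
        self.fc2 = nn.Linear(64, 64)
        # 3 outputs: one raw score (logit) per class
        self.fc3 = nn.Linear(64, 3)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        # No softmax here: nn.CrossEntropyLoss expects raw logits
        x = self.fc3(x)
        return x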
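
The input size of `fc1` has to equal the number of feature columns in the CSV. The helpers below are a hypothetical sketch of that arithmetic: with 9 landmarks, keeping only x and y gives 18 features (the size used in `fc1`), while keeping all four values per landmark would give 36. The column names in the example are a shortened, made-up list.

```python
def expected_input_size(n_landmarks, values_per_landmark):
    """Number of input features for a flattened pose."""
    return n_landmarks * values_per_landmark

def count_feature_columns(columns, label_column="label"):
    """Count every column except the label."""
    return sum(1 for c in columns if c != label_column)

xy_only = expected_input_size(9, 2)   # x and y per landmark
full = expected_input_size(9, 4)      # x, y, z, visibility per landmark
print(xy_only, full)  # 18 36

# Shortened example column list (illustrative only)
example_columns = ["label", "nose_x", "nose_y", "left_shoulder_x", "left_shoulder_y"]
n_features = count_feature_columns(example_columns)
print(n_features)  # 4
```

In the notebook, comparing `X.shape[1]` against 18 before building `Net` would catch a mismatch early.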

In [None]:
# Convert the data to tensor
X_train = torch.tensor(X_train.values).float()
X_test = torch.tensor(X_test.values).float()
y_train = torch.tensor(y_train.values).long()
y_test = torch.tensor(y_test.values).long()

In [None]:
# Create the model, loss function, and optimizer
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)
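
`nn.CrossEntropyLoss` takes the raw outputs of `fc3` (logits) together with integer class labels, which is why `forward` applies no softmax. As a plain-Python illustration of the per-sample computation (softmax followed by negative log-likelihood), using made-up logits; `softmax` and `cross_entropy` here are illustrative helpers, not the PyTorch implementation:

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log of the probability assigned to the true class."""
    return -math.log(softmax(logits)[target])

logits = [2.0, 0.5, -1.0]   # made-up scores for 3 classes
probs = softmax(logits)

loss_right = cross_entropy(logits, 0)  # true class has the highest score
loss_wrong = cross_entropy(logits, 2)  # true class has the lowest score
print(loss_right < loss_wrong)  # True
```

The loss is small when the true class gets the highest score and grows as the model puts probability on the wrong classes.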

In [None]:
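# Train the model
from torch.utils.data import DataLoader, TensorDataset

# Wrap the tensors in loaders (the batch size of 32 is an assumed value)
trainloader = DataLoader(TensorDataset(X_train, y_train), batch_size=32, shuffle=True)
testloader = DataLoader(TensorDataset(X_test, y_test), batch_size=32)

n_epochs = 100
train_losses = np.zeros(n_epochs)
test_losses = np.zeros(n_epochs)

for it in range(n_epochs):
    net.train()
    batch_losses = []
    for inputs, targets in trainloader:
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        batch_losses.append(loss.item())
    # Record the mean batch loss for the epoch, not only the last batch
    train_losses[it] = np.mean(batch_losses)

    net.eval()
    batch_losses = []
    with torch.no_grad():  # gradients are not needed for evaluation
        for inputs, targets in testloader:
            outputs = net(inputs)
            loss = criterion(outputs, targets)
            batch_losses.append(loss.item())
    test_losses[it] = np.mean(batch_losses)

    print(f'Epoch {it+1}/{n_epochs}, Train Loss: {train_losses[it]:.4f}, Test Loss: {test_losses[it]:.4f}')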

In [None]:
# Evaluate the model
from sklearn.metrics import accuracy_score
net.eval()
with torch.no_grad():
    y_pred = net(X_test)
    # Pick the class with the highest score for each sample
    y_pred = torch.argmax(y_pred, dim=1)
    accuracy = accuracy_score(y_test.numpy(), y_pred.numpy())
    print(f"Accuracy: {accuracy}")
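
The evaluation step picks the highest-scoring class for each row and compares it with the true label. A plain-Python sketch of the same idea, using made-up scores and labels (the helper names are illustrative and not part of the notebook):

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(score_rows, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    preds = [argmax(row) for row in score_rows]
    correct = sum(p == t for p, t in zip(preds, labels))
    return correct / len(labels)

# Made-up scores for 4 samples and 3 classes
scores = [
    [2.0, 0.1, -1.0],  # predicted class 0
    [0.2, 1.5, 0.3],   # predicted class 1
    [0.0, 0.4, 0.9],   # predicted class 2
    [1.1, 1.0, 0.2],   # predicted class 0
]
labels = [0, 1, 2, 1]

acc = accuracy(scores, labels)
print(acc)  # 3 of 4 correct: 0.75
```

Accuracy alone can hide a class that is always misclassified, so a per-class breakdown may be worth checking when the labels are imbalanced.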

In [None]:
# Plot the results
plt.figure(figsize=(12, 6))
plt.plot(train_losses, label='Train Loss')
plt.plot(test_losses, label='Test Loss')
plt.legend()
plt.show()

In [None]:
# Save the model
torch.save(net.state_dict(), "/Users/defeee/Documents/GitHub/FormAI-ML/Models/Core/Squat/model.pth")