<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Capstone Project - Zoom Ally: A Running Form Classifier (Overstriding)

> Author: Lee Hongwei
---

## Problem Statement:

Our challenge is to devise a solution that empowers individuals to **detect one of the most prevalent running form errors: overstriding**. The aim is to strengthen Nike Run Club's position as a trusted virtual running coach and, in turn, the app's ability to attract and retain users.

---
The notebooks for this project are:  
 1. `01_Data_Collection_Video_Dl.ipynb`   
 2. `02_Data_Collection_Pose_Estimation.ipynb`   
 3. `03_Feature_Engineering_and_EDA.ipynb`
 4. `04_Data_Preprocessing_and_Modelling.ipynb`
 
**This Notebook:**
- Extract pose landmark data from keypoints using mediapipe's pose estimation model

# 1. Import Libraries

In [1]:
# Import libraries for computer vision
import cv2
import mediapipe as mp

In [2]:
# Import libraries for basic necessities
import numpy as np
import os
import time
import pandas as pd
import matplotlib.pyplot as plt

# 2. Defining File Paths

In [3]:
# Defining the filepaths for videos

SOURCE_PATH_G = '../datasets/good_running' # Source path for no_overstride videos
SOURCE_PATH_B = '../datasets/bad_running' # Source path for overstride videos

# Get a list of all video files in the source folder
video_files_g = [f for f in os.listdir(SOURCE_PATH_G) if f.endswith('.mp4')]
video_files_b = [f for f in os.listdir(SOURCE_PATH_B) if f.endswith('.mp4')]

# 3. Pose Detection

<img src='https://i.imgur.com/3j8BPdc.png' style = 'height:300px'>

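Mediapipe's pose model returns 33 keypoints for the person detected in a video, with the index of each keypoint marked as above. Each keypoint contains x, y, z and visibility values. x and y are normalised to between 0 and 1 by the frame width and height, with the origin at the top left corner of the frame. z is the keypoint's depth relative to the midpoint of the hips (so it can be negative, as in the output further below), and visibility is the likelihood, between 0 and 1, that the keypoint is visible in the frame.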
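
As a minimal sketch of what the normalised coordinates mean, the snippet below converts x and y into pixel positions. The helper name `to_pixel_coords`, the 1280x720 frame size and the example values are assumptions made for illustration only; they are not part of this project's pipeline.

```python
import numpy as np

def to_pixel_coords(landmarks, frame_width, frame_height):
    """Convert normalised (x, y) landmark columns to pixel coordinates.

    `landmarks` is a (33, 4) array of [x, y, z, visibility] rows.
    """
    pixels = np.empty((landmarks.shape[0], 2))
    pixels[:, 0] = landmarks[:, 0] * frame_width   # x grows to the right
    pixels[:, 1] = landmarks[:, 1] * frame_height  # y grows downwards from the top edge
    return pixels

# Made-up example: keypoint 0 sits at the centre of a 1280x720 frame
example = np.zeros((33, 4))
example[0] = [0.5, 0.5, -0.6, 0.99]
centre = to_pixel_coords(example, 1280, 720)[0]
print(centre)
```

Because the origin is the top left corner, a larger y value means the keypoint is lower in the frame.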

## 3.1 Extracting and Drawing Keypoints

### 3.1.1 Instantiate Model and Define Functions

In [4]:
# Instantiate holistic detection model
mp_holistic = mp.solutions.holistic # Makes our keypoint detections
mp_drawing = mp.solutions.drawing_utils # Draws the points and lines between points

# Set the detection and tracking confidence thresholds to 0.5 to allow easier tracking and fewer missing values
holistic = mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5)


In [5]:
# Define a function that takes an image and the detection model to run on it

def mediapipe_detection(image, model):
    # Convert colour from BGR to RGB, as opencv reads frames in BGR but mediapipe expects RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Mark the image as non-writeable so it can be passed by reference, which improves performance
    image.flags.writeable = False
    # Run the model on the image to get the keypoint detections
    results = model.process(image)
    # Make image writeable again so that landmarks can be drawn on it
    image.flags.writeable = True
    # Convert colour back to BGR so that opencv displays it correctly
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Return 2 values: the image and the detection results
    return image, results
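
For a 3-channel image, the BGR to RGB conversion used above amounts to reversing the order of the colour channels. The NumPy-only sketch below shows this on a made-up two-pixel image; it is an illustration rather than a replacement for `cv2.cvtColor`, and the variable names are hypothetical.

```python
import numpy as np

# A made-up 1x2 image in BGR order: one pure-blue pixel and one pure-red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels (BGR <-> RGB)
rgb = bgr[..., ::-1]

# Applying the same reversal again restores the original ordering
restored = rgb[..., ::-1]
```

This is why the function converts twice: once so the model sees RGB, and once more so opencv receives BGR again.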

In [6]:
# Define a function that draws the landmarks stored in the 'pose_landmarks' attribute of the results

def draw_landmarks(image, results):
    # Note: Mediapipe has many other landmark markers such as face and hands which I will not be using
    mp_drawing.draw_landmarks(image,
                              results.pose_landmarks,
                              mp_holistic.POSE_CONNECTIONS)

## 3.2 Testing with webcam

In [7]:
# Create an opencv capture object for the webcam
cap = cv2.VideoCapture(0)  # Passing 0 into VideoCapture == Use WebCam footage (MacOS)

while cap.isOpened():
    # Read the next frame from the feed
    ret, frame = cap.read()
    # If no frame could be read, ret is False, so stop the loop
    if not ret:
        break

    # Make detections using previously defined function to detect keypoints
    image, results = mediapipe_detection(frame, holistic)
    print(results)

    # Show to screen with defined function to draw points and lines
    draw_landmarks(image, results)
    cv2.imshow('OpenCV Feed', image)

    # Break loop if 'q' is pressed
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

# Release video capture and destroy OpenCV windows
cap.release()
if cv2.getWindowProperty('OpenCV Feed', cv2.WND_PROP_VISIBLE) >= 1:
    cv2.destroyAllWindows()
cv2.waitKey(1)

<class 'mediapipe.python.solution_base.SolutionOutputs'>

-1

At this stage, we can opt between extracting the data into a numpy array or a pandas dataframe/csv file. 

We opted to save each frame's data in numpy arrays for the following reasons:
1. Lighter file size
2. Ease of tracking skipped frames (mediapipe sometimes skips the frame if the pose estimation was not able to extract the data fast enough)
3. Ease of conversion back into a pandas dataframe
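
To illustrate point 2, here is a minimal sketch of how gaps could be found when each frame is stored as `<timestep>.npy`. It assumes that a timestep with no saved file indicates a skipped frame; the helper name `missing_timesteps` and the temporary folder of empty placeholder files are hypothetical.

```python
import os
import tempfile

def missing_timesteps(video_dir, expected):
    """Return the timesteps in `expected` that have no '<timestep>.npy' file."""
    saved = {
        int(os.path.splitext(f)[0])
        for f in os.listdir(video_dir)
        if f.endswith(".npy")
    }
    return sorted(set(expected) - saved)

# Synthetic check: create placeholder files for timesteps 0, 1, 3 and 4 only
with tempfile.TemporaryDirectory() as tmp:
    for t in (0, 1, 3, 4):
        open(os.path.join(tmp, f"{t}.npy"), "wb").close()
    gaps = missing_timesteps(tmp, range(5))

print(gaps)
```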

Cons of doing so:
Numpy arrays carry no column names, so at this stage the data is harder to inspect than it would be in a pandas dataframe. However, since we are only extracting data here, as long as the pose estimation model runs consistently, every array has the same layout (33 keypoints by 4 values) and can be converted accurately into a pandas dataframe later.
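
A minimal sketch of that later conversion is shown below: one frame's (33, 4) array is flattened into a dictionary keyed by column name. The `kp<index>_<coordinate>` naming scheme and the helper names are assumptions made for illustration, and the frame values are synthetic.

```python
import numpy as np

COORDS = ("x", "y", "z", "visibility")

def keypoint_columns(n_keypoints=33):
    """Build column names such as 'kp0_x', 'kp0_y', ... (hypothetical naming scheme)."""
    return [f"kp{i}_{c}" for i in range(n_keypoints) for c in COORDS]

def frame_to_row(landmarks):
    """Flatten one (33, 4) frame array into a {column_name: value} dict."""
    # Row-major flattening: kp0's x, y, z, visibility, then kp1's, and so on
    values = landmarks.reshape(-1)
    return dict(zip(keypoint_columns(landmarks.shape[0]), values.tolist()))

# Synthetic frame whose values simply count upwards from 0
frame = np.arange(33 * 4, dtype=float).reshape(33, 4)
row = frame_to_row(frame)
```

A list of such dictionaries, one per frame, can be passed to `pd.DataFrame` to obtain one row per frame with named columns.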

In [8]:
# Extracting webcam results into a numpy array

landmarks = np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark])
print(landmarks[:5])

[[ 0.50638884  0.49915484 -0.61960572  0.99748868]
 [ 0.52479327  0.43191147 -0.56306452  0.99831372]
 [ 0.53690493  0.43197474 -0.56319326  0.9976967 ]
 [ 0.54576963  0.43299702 -0.56347585  0.99803662]
 [ 0.47238716  0.43674192 -0.56225872  0.99813634]]


In [9]:
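# Defining data extraction from keypoints as a function

def extract_keypoints(results):
    # If no pose was detected, pose_landmarks is None; return a NaN-filled (33, 4) placeholder (assumed convention)
    if results.pose_landmarks is None:
        return np.full((33, 4), np.nan)
    return np.array([[res.x, res.y, res.z, res.visibility] for res in results.pose_landmarks.landmark])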

## 3.3 Applying Pose Detection on Exported Videos

### 3.3.1 No Overstride

In [10]:
# Define the path for exported data (numpy arrays)
EXPORT_PATH = '../datasets/no_overstride_npy'

In [11]:
# Checking files
print(video_files_g)
print('\n')
print(len(video_files_g))

['F_Running_3.mp4', 'F_Running_2.mp4', 'F_Running_6.mp4', 'F_Running_7.mp4', 'F_Running_5.mp4', 'F_Running_4.mp4', 'M_Running_8.mp4', 'M_Running_3.mp4', 'M_Running_2.mp4', 'M_Running_1.mp4', 'M_Running_5.mp4', 'M_Running_4.mp4', 'M_Running_6.mp4', 'M_Running_7.mp4', 'F_Running_8.mp4']


15


In [15]:
# loop through video files for each .mp4 file
for videomp4 in video_files_g:
    # Construct the full path to the video file
    vid_file = os.path.join(SOURCE_PATH_G, videomp4)
    
    # Create the directory to store the numpy arrays for this video
    video_dir = os.path.join(EXPORT_PATH, videomp4[:-4])  # Remove the '.mp4' extension from the video file name
    os.makedirs(video_dir, exist_ok=True)
    
    # Create an opencv object from our video
    cap = cv2.VideoCapture(vid_file)
    
    # Get the total number of frames in the video
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

    # Timestep
    timestep = 0

    # Reuse the `holistic` model instantiated in section 3.1.1

    while cap.isOpened():
        # Read feed
        ret, frame = cap.read()
        # Once feed is at its last frame, there will be no ret value, break loop and move to next video
        if not ret:
            break

        # Make detections using previously defined function to detect keypoints
        image, results = mediapipe_detection(frame, holistic)

        # Export Keypoints
        keypoints = extract_keypoints(results) # Extract keypoints from the results object

        # Save numpy in new folder
        npy_path = os.path.join(video_dir, str(timestep) + '.npy') # Create a path to save the numpy array to be generated below
        np.save(npy_path, keypoints) # Save the keypoints data in a numpy array onto the path defined
        
        # Show to screen with defined function to draw points and lines
        draw_landmarks(image, results)
        cv2.imshow('OpenCV Feed', image)

        # Increase timestep by 1 for next frame
        timestep += 1

        # Break loop if 'q' is pressed
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

    # Release this video's capture before moving on to the next file
    cap.release()

# Destroy OpenCV windows once all videos have been processed
if cv2.getWindowProperty('OpenCV Feed', cv2.WND_PROP_VISIBLE) >= 1:
    cv2.destroyAllWindows()
cv2.waitKey(1)  # Let the GUI process pending events so the window closes

# Close the shared model to free its resources
holistic.close()

### 3.3.2 Overstriding

In [16]:
# Define the path for exported data (numpy arrays)
DATA_PATH = '../datasets/overstride_npy'

# Create video extraction source path 
source_path = '../datasets/bad_running/'

# Get a list of all video files in the source folder
video_files = [f for f in os.listdir(source_path) if f.endswith('.mp4')]

In [17]:
video_files

['Overstride_16.mp4',
 'Overstride_17.mp4',
 'Overstride_15.mp4',
 'Overstride_14.mp4',
 'Overstride_10.mp4',
 'Overstride_11.mp4',
 'Overstride_13.mp4',
 'Overstride_12.mp4',
 'Overstride_7.mp4',
 'Overstride_6.mp4',
 'Overstride_4.mp4',
 'Overstride_5.mp4',
 'Overstride_2.mp4',
 'Overstride_3.mp4',
 'Overstride_8.mp4',
 'Overstride_9.mp4']

In [18]:
# loop through video files for each .mp4 file
for videomp4 in video_files:
    # Construct the full path to the video file
    vid_file = os.path.join(source_path, videomp4)
    
    # Create the directory to store the numpy arrays for this video
    video_dir = os.path.join(DATA_PATH, videomp4[:-4])  # Remove the '.mp4' extension from the video file name
    os.makedirs(video_dir, exist_ok=True)
    
    # Create an opencv object from our video
    cap = cv2.VideoCapture(vid_file)
    
    # Get the total number of frames in the video
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

    # Timestep
    timestep = 0

    # Set mediapipe model
    with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
        while cap.isOpened():
            # Read feed
            ret, frame = cap.read()
            # Once feed is at its last frame, there will be no ret value, break loop and move to next video
            if not ret:
                break

            # Make detections using previously defined function to detect keypoints
            image, results = mediapipe_detection(frame, holistic)

            # Export Keypoints
            keypoints = extract_keypoints(results) # Extract keypoints from the results object

            # Save numpy in new folder
            timestep+=1
            npy_path = os.path.join(video_dir, str(timestep) + '.npy') # Create a path to save the numpy array to be generated below
            np.save(npy_path, keypoints) # Save the keypoints data in a numpy array onto the path defined
            
            # Show to screen with defined function to draw points and lines
            draw_landmarks(image, results)
            cv2.imshow('OpenCV Feed', image)

            # Break loop if 'q' is pressed
            if cv2.waitKey(10) & 0xFF == ord('q'):
                break

    # Release this video's capture and destroy OpenCV windows
    cap.release()
    if cv2.getWindowProperty('OpenCV Feed', cv2.WND_PROP_VISIBLE) >= 1:
        cv2.destroyAllWindows()
    cv2.waitKey(1)  # Let the GUI process pending events so the window closes

# No explicit holistic.close() is needed: the `with` block closes the model for each video

### 3.3.3 Overstride Flipped

In [19]:
# Define the path for exported data (numpy arrays)
DATA_PATH = '../datasets/overstride_npy'

# Create video extraction source path 
source_path = '../datasets/overstride_augment/'

# Get a list of all video files in the source folder
video_files = [f for f in os.listdir(source_path) if f.endswith('.mp4')]

In [20]:
# loop through video files for each .mp4 file
for videomp4 in video_files:
    # Construct the full path to the video file
    vid_file = os.path.join(source_path, videomp4)
    
    # Create the directory to store the numpy arrays for this video
    video_dir = os.path.join(DATA_PATH, videomp4[:-4])  # Remove the '.mp4' extension from the video file name
    os.makedirs(video_dir, exist_ok=True)
    
    # Create an opencv object from our video
    cap = cv2.VideoCapture(vid_file)
    
    # Get the total number of frames in the video
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

    # Timestep
    timestep = 0

    # Set mediapipe model
    with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
        while cap.isOpened():
            # Read feed
            ret, frame = cap.read()
            # Once feed is at its last frame, there will be no ret value, break loop and move to next video
            if not ret:
                break

            # Make detections using previously defined function to detect keypoints
            image, results = mediapipe_detection(frame, holistic)

            # Export Keypoints
            keypoints = extract_keypoints(results) # Extract keypoints from the results object

            # Save numpy in new folder
            timestep+=1
            npy_path = os.path.join(video_dir, str(timestep) + '.npy') # Create a path to save the numpy array to be generated below
            np.save(npy_path, keypoints) # Save the keypoints data in a numpy array onto the path defined
            
            # Show to screen with defined function to draw points and lines
            draw_landmarks(image, results)
            cv2.imshow('OpenCV Feed', image)

            # Break loop if 'q' is pressed
            if cv2.waitKey(10) & 0xFF == ord('q'):
                break

    # Release this video's capture and destroy OpenCV windows
    cap.release()
    if cv2.getWindowProperty('OpenCV Feed', cv2.WND_PROP_VISIBLE) >= 1:
        cv2.destroyAllWindows()
    cv2.waitKey(1)  # Let the GUI process pending events so the window closes

# No explicit holistic.close() is needed: the `with` block closes the model for each video

### 3.3.4 No Overstride Flipped

In [21]:
# Define the path for exported data (numpy arrays)
DATA_PATH = '../datasets/no_overstride_npy'

# Create video extraction source path 
source_path = '../datasets/no_overstride_augment/'

# Get a list of all video files in the source folder
video_files = [f for f in os.listdir(source_path) if f.endswith('.mp4')]

In [22]:
# loop through video files for each .mp4 file
for videomp4 in video_files:
    # Construct the full path to the video file
    vid_file = os.path.join(source_path, videomp4)
    
    # Create the directory to store the numpy arrays for this video
    video_dir = os.path.join(DATA_PATH, videomp4[:-4])  # Remove the '.mp4' extension from the video file name
    os.makedirs(video_dir, exist_ok=True)
    
    # Create an opencv object from our video
    cap = cv2.VideoCapture(vid_file)
    
    # Get the total number of frames in the video
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

    # Timestep
    timestep = 0

    # Set mediapipe model
    with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
        while cap.isOpened():
            # Read feed
            ret, frame = cap.read()
            # Once feed is at its last frame, there will be no ret value, break loop and move to next video
            if not ret:
                break

            # Make detections using previously defined function to detect keypoints
            image, results = mediapipe_detection(frame, holistic)

            # Export Keypoints
            keypoints = extract_keypoints(results) # Extract keypoints from the results object

            # Save numpy in new folder
            timestep+=1
            npy_path = os.path.join(video_dir, str(timestep) + '.npy') # Create a path to save the numpy array to be generated below
            np.save(npy_path, keypoints) # Save the keypoints data in a numpy array onto the path defined
            
            # Show to screen with defined function to draw points and lines
            draw_landmarks(image, results)
            cv2.imshow('OpenCV Feed', image)

            # Break loop if 'q' is pressed
            if cv2.waitKey(10) & 0xFF == ord('q'):
                break

    # Release this video's capture and destroy OpenCV windows
    cap.release()
    if cv2.getWindowProperty('OpenCV Feed', cv2.WND_PROP_VISIBLE) >= 1:
        cv2.destroyAllWindows()
    cv2.waitKey(1)  # Let the GUI process pending events so the window closes

# No explicit holistic.close() is needed: the `with` block closes the model for each video

---

Each frame's keypoint coordinates have been extracted into its own numpy array, and all the frames from a video are saved together in that video's folder.
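
As a rough sketch of how one video's folder could be read back, the snippet below stacks the per-frame files into a single array of shape (n_frames, 33, 4). The helper name `load_video_keypoints` is hypothetical, and the check uses synthetic arrays in a temporary folder rather than the project's data. Files are sorted by their integer timestep, so it does not matter whether numbering starts at 0 or 1.

```python
import os
import tempfile
import numpy as np

def load_video_keypoints(video_dir):
    """Stack a folder of '<timestep>.npy' files into one (n_frames, 33, 4) array."""
    names = [f for f in os.listdir(video_dir) if f.endswith(".npy")]
    # Sort numerically: a plain string sort would place '10.npy' before '2.npy'
    names.sort(key=lambda f: int(os.path.splitext(f)[0]))
    frames = [np.load(os.path.join(video_dir, f)) for f in names]
    return np.stack(frames)

# Round-trip check on synthetic data: frame t is filled with the value t
with tempfile.TemporaryDirectory() as tmp:
    for t in range(12):
        np.save(os.path.join(tmp, f"{t}.npy"), np.full((33, 4), float(t)))
    sequence = load_video_keypoints(tmp)

print(sequence.shape)
```

Sorting by the integer in the file name keeps the frames in chronological order, which matters for any feature that depends on movement across frames.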

In the next notebook, we will observe the differences between angles and positions of different keypoints across runners in both classes. This will help us identify key features that will be included in the model.