# Libraries

In [1]:
import os
import pandas as pd
import numpy as np
from tslearn.generators import random_walks
from tslearn.clustering import TimeSeriesKMeans

# Select activity

In this part, only the squat recordings are selected from the training, validation and testing datasets.

In [2]:
squat = ['./CSV_thunder/' + i for i in os.listdir('./CSV_thunder') if 'Squat' in i] + ['./CSV2_thunder/' + i for i in os.listdir('./CSV2_thunder') if 'Squat' in i]

In [3]:
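# One entry per video: 150 frames x 34 joint-coordinate values
x_dataset = np.zeros((len(squat), 150, 34))
for ind, file in enumerate(squat):
    df = pd.read_csv(file, index_col=0)
    # Remove the 'Frame_number' row so that only joint coordinates remain
    df.drop(index='Frame_number', inplace=True)
    # Transpose so that rows are frames and columns are coordinates
    dfT = df.transpose()
    x_dataset[ind, :, :] = dfT.to_numpy()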

# Cluster

The goal here is to group all the squat videos into 3 clusters with a `TimeSeriesKMeans` model. Ideally, the clusters would reflect fitness accuracy: "correct realisation", "medium realisation" and "incorrect realisation".
<br>
We could expect the clusters to reflect fitness accuracy because the features we provide (joint coordinates) are relevant for evaluating it. Other features, such as joint angles, would probably make it more likely that the clusters match the ones we want.
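
As an illustration of the joint-angle feature mentioned above, here is a minimal sketch that computes the angle at a joint from three 2D keypoints, frame by frame. The function name `joint_angle` and the toy hip, knee and ankle trajectories are hypothetical; which columns of the CSV files correspond to which joint is not stated in this notebook and would have to be checked first.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at joint b between segments b->a and b->c.

    a, b, c: arrays of shape (n_frames, 2) holding (x, y) per frame.
    """
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    v1 = a - b
    v2 = c - b
    # 2D cross and dot products; arctan2 avoids dividing by segment lengths
    cross = v1[..., 0] * v2[..., 1] - v1[..., 1] * v2[..., 0]
    dot = np.sum(v1 * v2, axis=-1)
    return np.degrees(np.arctan2(np.abs(cross), dot))

# Hypothetical trajectories over 2 frames: a straight leg, then a bent knee
hip = np.array([[0.0, 1.0], [0.0, 1.0]])
knee = np.array([[0.0, 0.0], [0.0, 0.0]])
ankle = np.array([[0.0, -1.0], [1.0, 0.0]])
knee_angles = joint_angle(hip, knee, ankle)
print(knee_angles)
```

Because an angle depends only on the relative position of three joints, such a feature is unaffected by where the subject stands in the image or how large they appear, although it can still change with the viewing angle.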

In [4]:
# fit_predict returns one cluster label (0, 1 or 2) per video, not the fitted model
km_dba = TimeSeriesKMeans(n_clusters=3, metric="dtw", max_iter=5, max_iter_barycenter=5, random_state=0).fit_predict(x_dataset)
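
To make the `metric="dtw"` choice more concrete, the sketch below is a plain NumPy version of dynamic time warping between two series. It is only meant to show the idea (two squats performed at different speeds can still be close); the helper name `dtw_distance` is hypothetical, and tslearn's own implementation is optimised and may differ in details.

```python
import numpy as np

def dtw_distance(s, t):
    """Dynamic time warping distance between two series.

    s, t: arrays of shape (length,) or (length, n_features).
    Local cost is the squared Euclidean distance between frames;
    the result is the square root of the cheapest alignment cost.
    """
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    if s.ndim == 1:
        s = s[:, None]
    if t.ndim == 1:
        t = t[:, None]
    n, m = len(s), len(t)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.sum((s[i - 1] - t[j - 1]) ** 2)
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))

# Toy example: the same movement performed at two different speeds
fast = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
slow = np.array([0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0])
print(dtw_distance(fast, slow))  # 0.0: the slow series is the fast one with repeated frames
```

Note that this kind of alignment compensates for differences in timing only. Differences in where the subject appears in the frame still contribute fully to the distance.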

In [5]:
# Only the './CSV_thunder/' prefix is stripped here, so files coming from
# './CSV2_thunder/' keep their full path in the lists below
cluster_0 = [squat[ind].replace('./CSV_thunder/', '') for ind, i in enumerate(km_dba) if i==0]
cluster_1 = [squat[ind].replace('./CSV_thunder/', '') for ind, i in enumerate(km_dba) if i==1]
cluster_2 = [squat[ind].replace('./CSV_thunder/', '') for ind, i in enumerate(km_dba) if i==2]

In [6]:
cluster_0

["Squats_210_25 - I tried Inger Houghton's 7 Minute Workout Tabata Songs.csv",
 'Squats_296_30 - 7 Minute Workout Full Video.csv',
 'Squats_308_4.csv',
 'Squats_48_12.csv',
 'Squats_77_14.csv',
 'Squats_92_15.csv',
 './CSV2_thunder/Squats_268_55 - The 7 Minute WorkoutFact or Fiction_480p.csv',
 './CSV2_thunder/Squats_52_35 -7 Minute workout song   wtimer  tabata song_480p.csv']

In [7]:
cluster_1

['Squats_122_19 - 7 Minute Workout Song wtimer  Tabata Songs_480p.csv',
 'Squats_136_20 - 7 minute workout Full bodySONG_480p.csv',
 'Squats_152_21 - Tabata 7 minute workout_480p.csv',
 'Squats_167_22 - 7 Minute Workout Song (Tabata Songs).csv',
 'Squats_181_23 -7 Minute Workout Song KIDS  7.csv',
 'Squats_195_24 - 7 Minute Workout  Tabata Songs wtimer_480p.csv',
 'Squats_21_10.csv',
 'Squats_253_28 - Mitray from RusUkraine  7 minute workout  Tabata Song_480p.csv',
 "Squats_268_29 - Herbalife's 7 Minute WorkOut.csv",
 'Squats_283_3.csv',
 'Squats_323_5.csv',
 'Squats_33_11.csv',
 'Squats_356_7.csv',
 'Squats_368_8.csv',
 'Squats_64_13.csv',
 './CSV2_thunder/Squats_121_42  - The Perfect 7Minute Workout_480p.csv',
 './CSV2_thunder/Squats_135_43 - The Scientific 7minute Workout.csv',
 './CSV2_thunder/Squats_15_32 - 7 minute Workout TABATA song_360p.csv',
 './CSV2_thunder/Squats_172_46 -Scientific 7Minute or 7Minute Scientific  Workout_480p.csv',
 './CSV2_thunder/Squats_208_49 - 7Minute Wo

In [8]:
cluster_2

['Squats_109_17.csv',
 'Squats_225_26 - 7 Minute Workout Song wtimer TABATA SONGS_480p.csv',
 'Squats_240_27 -Scientific 7 Minute Workout_480p.csv',
 'Squats_340_6.csv',
 'Squats_7_1.csv',
 './CSV2_thunder/Squats_109_41 - The Scientific 7Minute Workout_1080p.csv',
 './CSV2_thunder/Squats_147_44- The 7 Minute Workout.csv',
 './CSV2_thunder/Squats_159_45 - THE SCIENTIFIC 7 MINUTE WORKOUT 2.csv',
 './CSV2_thunder/Squats_186_47 - The 7 Minute Scientific Workout_1080p.csv',
 './CSV2_thunder/Squats_197_48 - 7 Minute Workout Challenge FULL Workout.csv',
 './CSV2_thunder/Squats_256_54 - 7minute workout  Prospect Park Loop Brooklyn NYC  Absolute WIN_480p.csv',
 './CSV2_thunder/Squats_28_33 - Trying The 7 Minute Workout App.csv',
 './CSV2_thunder/Squats_293_59 -  The Scientific 7minute Workout_1080p.csv',
 './CSV2_thunder/Squats_306_60 -7minute urban workout  Prospect Place Brooklyn NYC  Absolute WIN_1080p.csv',
 './CSV2_thunder/Squats_4_31 - Interval Training ft 7 Mins Workout Song_1080p.csv',


### Results
After checking the videos in each cluster, it seems that the clusters do not represent fitness accuracy but rather the position of the subject relative to the camera. Indeed, one cluster seems to contain all the videos where the camera rotates around the subject, and the other two differ in how the subject is positioned in the image.
<br>
Our goal is therefore not reached, which is probably due to the wide diversity of the dataset. We could expect better results on a dataset where all the videos are standardized.
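
One possible way to standardize the existing data, rather than the videos themselves, would be to express each pose relative to the subject's body. The sketch below is only an assumption-laden illustration: it supposes the 34 values per frame can be reshaped into 17 keypoints with (x, y) each, and that keypoints 5 and 6 are the shoulders and 11 and 12 the hips. Neither the layout nor the indices are confirmed by this notebook, and the helper name `normalize_pose` is hypothetical.

```python
import numpy as np

def normalize_pose(seq, hip_idx=(11, 12), shoulder_idx=(5, 6)):
    """Center a pose sequence on the hips and scale it by the torso length.

    seq: array of shape (n_frames, n_keypoints, 2).
    hip_idx, shoulder_idx: ASSUMED keypoint indices, to be adapted to the data.
    """
    seq = np.asarray(seq, dtype=float)
    # Mid-hip and mid-shoulder points per frame, shape (n_frames, 1, 2)
    hips = seq[:, list(hip_idx), :].mean(axis=1, keepdims=True)
    shoulders = seq[:, list(shoulder_idx), :].mean(axis=1, keepdims=True)
    # One scale per clip (mean torso length) so the movement itself is not distorted
    torso = np.linalg.norm(shoulders - hips, axis=-1)
    scale = max(float(torso.mean()), 1e-8)
    return (seq - hips) / scale

# Hypothetical clip: 150 frames, 17 keypoints, random coordinates
rng = np.random.default_rng(0)
clip = rng.random((150, 17, 2))
# The same clip seen by a camera that is closer and shifted
moved = 3.0 * clip + 5.0
print(np.allclose(normalize_pose(clip), normalize_pose(moved)))  # True
```

Under the same layout assumption, a clip from `x_dataset` could be prepared with something like `x_dataset[0].reshape(150, 17, 2)`. This removes translation and scale differences only; it does not compensate for a camera that rotates around the subject.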