Real-Time Squat Classifying Service built with React, TensorFlow.js, MediaPipe, and Firebase. Check it out: https://real-time-squat-classifier.web.app/
Recently, AI technology has been conquering video beyond still images. Our team developed an AI service that classifies users' squat videos, captured through the webcam, into seven labels and shows the results in real time. The frontend of the website was developed with React, and the service was designed to run inference in the browser using TensorFlow.js, so no separate server or cloud is required. Through the MediaPipe framework, the user's pose is detected in real time and the results are stored in local storage. Our main reference paper was [1].
This project used the dataset from https://hi.cs.waseda.ac.jp/~ogata/Dataset.html, which consists of squat videos annotated with 7 pose labels.
The model architectures are as follows. In the first model, called CID, we replaced the average pooling of the original paper with an FC layer, which led to better results. In the second, since temporal distances can be interpreted as sinusoidal waves, we applied a Fourier transform to the data and developed a model that operates on the transformed representation.
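The two input representations above can be sketched as follows. This is a minimal NumPy illustration, not the project's actual code: the function names and the `(frames, joints, 2)` keypoint layout are our assumptions. The first function builds the per-joint temporal distance matrices used in [1]; the second shows one plausible way the FFT variant could transform the raw joint trajectories.

```python
import numpy as np

def temporal_distance_matrices(keypoints):
    """keypoints: (T, J, 2) array of 2D joint positions over T frames.
    Returns a (J, T, T) stack: for each joint, the distance between its
    positions at every pair of frames, as in the reference paper [1]."""
    diffs = keypoints[:, None, :, :] - keypoints[None, :, :, :]  # (T, T, J, 2)
    dists = np.linalg.norm(diffs, axis=-1)                       # (T, T, J)
    return np.transpose(dists, (2, 0, 1))                        # (J, T, T)

def fft_features(keypoints):
    """Magnitude spectrum of each joint trajectory along the temporal axis.
    A hypothetical formulation of the FFT-based variant, for illustration."""
    return np.abs(np.fft.rfft(keypoints, axis=0))  # (T//2 + 1, J, 2)

# Example: 30 frames, 17 joints
kps = np.random.rand(30, 17, 2).astype(np.float32)
tdm = temporal_distance_matrices(kps)  # symmetric in time, zero diagonal
spec = fft_features(kps)
```

Each distance matrix is symmetric with a zero diagonal, so the CID model effectively sees one `T x T` "image" channel per joint.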
We conducted an experiment comparing the reference model from the paper with our models.
| | Paper | Experiment | CID | FFT |
|---|---|---|---|---|
| Parameters | 3.1M | 3.1M | 6M | 0.3-2.5M |
| Accuracy | 0.75-0.89 | 0.50-0.65 | 0.71-0.82 | 0.62-0.70 |
The Grad-CAM results are shown below.
[1] R. Ogata, E. Simo-Serra, S. Iizuka and H. Ishikawa, "Temporal Distance Matrices for Squat Classification," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 2019, pp. 2533-2542, doi: 10.1109/CVPRW.2019.00309.