This work-in-progress repository tackles action recognition (exercise activity in particular) on the MM-Fit dataset, adapting several existing models and resources (listed below).
The MM-Fit dataset contains various forms of data, covering 10 exercise activity classes (plus one non-activity class). So far we have worked only with the 2D skeleton data for visual action recognition; more modalities will follow.
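As a quick way to get hands on the skeleton modality, here is a minimal sketch of loading one workout's 2D pose stream. The path and file name follow MM-Fit's per-workout `.npy` convention (e.g. `w00/w00_pose_2d.npy`); this is an assumption for illustration, so adjust it to wherever the data is extracted:

```python
import numpy as np

# Hypothetical path: point this at your local MM-Fit extraction.
# MM-Fit ships one 2D pose file per workout, e.g. w00/w00_pose_2d.npy.
pose_2d = np.load('mm-fit/w00/w00_pose_2d.npy')

# Print shape/dtype to inspect the array layout before building
# any model or sampling logic around it.
print(pose_2d.shape, pose_2d.dtype)
```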
The following resources were used for this repository:
- The MM-Fit paper
- The baseline autoencoder-decoder from MM-Fit
- A PyTorch implementation of ViT (instantiation sketched after this list)
- A PyTorch implementation of MLP-Mixer
- A PyTorch implementation of ViViT (model 2 and model 3)
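For orientation, here is a minimal sketch of instantiating such a ViT for the 11-class problem, assuming an API in the style of the lucidrains `vit-pytorch` package (an assumption; the repository's actual model code differs, and treating the input as a 3-channel 224x224 tensor is illustrative only). The hyperparameters shown are placeholders, not the ones from training_scenarios.txt:

```python
import torch
from vit_pytorch import ViT  # assumed ViT implementation in the lucidrains style

# Illustrative hyperparameters only; see training_scenarios.txt
# for the parameters actually used in this repository.
model = ViT(
    image_size=224,   # assumed input resolution for rendered/reshaped skeleton frames
    patch_size=16,
    num_classes=11,   # 10 exercise classes + 1 non-activity class
    dim=256,
    depth=6,
    heads=8,
    mlp_dim=512,
)

dummy = torch.randn(1, 3, 224, 224)  # one fake 3-channel input
logits = model(dummy)                # shape: (1, 11)
print(logits.shape)
```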
Details of the MM-Fit dataset are provided in EDA.ipynb.
Please look at training_scenarios.txt for suggested training parameters.
Two files, sampling_image.py and sampling_video.py, report the distribution of the MM-Fit dataset over the train/val/test splits across the 11 classes (illustrated below).
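As an illustration of the kind of summary those scripts produce, a per-split class distribution can be computed from (class, split) pairs like this. The sample data below is entirely made up, and this sketch is not the scripts' actual logic:

```python
from collections import Counter

# Hypothetical (class, split) pairs: replace with the labels and split
# assignments produced for the real MM-Fit dataset.
samples = [
    ('squats', 'train'), ('pushups', 'train'), ('squats', 'val'),
    ('non_activity', 'test'), ('pushups', 'train'),
]

# Count how many samples of each class fall into each split.
dist = {split: Counter() for split in ('train', 'val', 'test')}
for label, split in samples:
    dist[split][label] += 1

for split, counts in dist.items():
    print(split, dict(counts))
```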
To set up the environment:

```
conda env create -f environment.yml
conda activate mm-fit
conda install -c conda-forge einops
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
```
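After installation, the pinned versions can be sanity-checked with a short snippet (adjust expectations if your CUDA setup differs):

```python
import torch
import torchvision
import einops  # import succeeding confirms einops is installed

# Quick sanity check that the pinned versions installed and CUDA is visible.
print(torch.__version__)        # expected: 1.8.0
print(torchvision.__version__)  # expected: 0.9.0
print(torch.cuda.is_available())
```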
Results so far (as of Sep 9th, 2021):
- ViT: 56.32% accuracy
- MLP-Mixer: 74.44% accuracy
- ViViT: 79.69% accuracy