# Learning to Estimate Critical Gait Parameters from Single-View RGB Videos with Transformer-Based Attention Network
This repository is the official implementation of *Learning to Estimate Critical Gait Parameters from Single-View RGB Videos with Transformer-Based Attention Network*. In this paper, we propose a novel spatio-temporal Transformer network to estimate critical gait parameters (walking speed, cadence, Gait Deviation Index, and knee flexion angle at maximum extension) from RGB videos captured by a single-view camera.
The figure below illustrates the overall architecture of our proposed method for gait parameter estimation.
The dataset used in this study is available here. The pre-processed data, prepared following the official code of mobile-gaitlab, is available here. Download our pre-processed data and place it in a new `data` directory to start training.
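As a minimal sketch of this setup step (the `downloads` folder is an assumption, not a name from this repo; adjust it to wherever you saved the files), the `data` directory can be populated like this:

```python
from pathlib import Path
import shutil

# Hypothetical setup: copy the downloaded pre-processed files into
# a fresh `data` directory next to main.py. "downloads" is an
# assumed location, not part of the official repo layout.
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

for f in Path("downloads").glob("*"):
    if f.is_file():
        shutil.copy(f, data_dir / f.name)
```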
To train the model, run:

- For GDI:

  ```bash
  python main.py --epoch_num 200 --batch_size 128 --lr 3e-4 --eta_min 8e-5 --target_metric GDI --sides 2
  ```

- For Knee Flexion Angle at Maximum Extension:

  ```bash
  python main.py --epoch_num 200 --batch_size 128 --lr 3e-4 --eta_min 8e-5 --target_metric KFME --sides 2
  ```

- For Speed:

  ```bash
  python main.py --epoch_num 200 --batch_size 128 --lr 6e-4 --eta_min 1e-4 --target_metric speed --sides 1
  ```

- For Cadence:

  ```bash
  python main.py --epoch_num 200 --batch_size 128 --lr 6e-4 --eta_min 1e-4 --target_metric cadence --sides 1
  ```
`--sides` controls the number of sides the model is trained on. Double-sided parameters, such as GDI or Knee Flexion Angle at Maximum Extension, require the model to output one value for each side; see the sketch below.
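For intuition, here is a minimal sketch of what a two-sided regression head could look like; this is an illustrative assumption, not the actual head used in `main.py`, and the feature dimension is an arbitrary placeholder:

```python
import torch
import torch.nn as nn

class GaitHead(nn.Module):
    """Illustrative regression head that outputs one value per side.

    sides=2 for double-sided parameters (GDI, knee flexion angle),
    sides=1 for single-sided ones (speed, cadence).
    """

    def __init__(self, feat_dim: int = 256, sides: int = 2):
        super().__init__()
        self.fc = nn.Linear(feat_dim, sides)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, feat_dim) pooled spatio-temporal features
        return self.fc(x)  # (batch, sides)
```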
`--eta_min` sets the minimum learning rate for the CosineAnnealingWarmRestarts scheduler, as proposed in SGDR: Stochastic Gradient Descent with Warm Restarts; a sketch follows.
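A minimal sketch of how this scheduler pairs with the flags above, using the GDI values (`--lr 3e-4`, `--eta_min 8e-5`); the restart period `T_0` and the stand-in model are assumptions, not values from `main.py`:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = torch.nn.Linear(256, 2)  # stand-in for the actual network
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)  # --lr
scheduler = CosineAnnealingWarmRestarts(
    optimizer,
    T_0=10,        # epochs until the first warm restart (assumed value)
    eta_min=8e-5,  # --eta_min: floor of the cosine-annealed learning rate
)

for epoch in range(200):  # --epoch_num
    # ... training loop over batches goes here ...
    scheduler.step()  # anneal the LR, restarting every T_0 epochs
```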
We would like to thank the authors of PoseFormer and mobile-gaitlab. Part of our implementation is adapted from their open-source code.