If you find this code useful for your research, consider citing:
@misc{https://doi.org/10.48550/arxiv.2212.10746,
  doi       = {10.48550/ARXIV.2212.10746},
  url       = {https://arxiv.org/abs/2212.10746},
  author    = {Song, Neil},
  keywords  = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title     = {SLGTformer: An Attention-Based Approach to Sign Language Recognition},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
To set up the conda environment, simply run:
conda env create -f environment.yml
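Once the environment is created, activate it before running any of the scripts below. The environment name used here is only an assumption; check the name: field in environment.yml for the actual value.
# "slgtformer" is an assumed environment name — use whatever name: environment.yml declares.
conda activate slgtformer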
Please download and use the preprocessed skeleton data for WLASL provided by Skeleton Aware Multi-modal Sign Language Recognition (SAM-SLR). Be sure to follow their rules and agreements when using the preprocessed data. To fetch the data, run:
./download.sh
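If the script fails with a permission error, or you want to confirm the data landed where expected, the sketch below may help; the data/ directory is an assumption, since the actual output path is determined inside download.sh.
# Make the download script executable if your checkout did not preserve the permission bit.
chmod +x download.sh
# Quick sanity check on the downloaded skeleton data; "data/" is an assumed output path —
# confirm the real location inside download.sh.
ls -R data/ | head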
Pretrained models are provided here.
To train a model, run:
./train.sh
To evaluate a trained model, run:
./test.sh
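Both scripts should work as-is; if you need to pin them to a particular GPU, the standard CUDA_VISIBLE_DEVICES variable can be set first. The device index below is only an example.
# Restrict training and evaluation to GPU 0 (the index is an example; adjust for your machine).
CUDA_VISIBLE_DEVICES=0 ./train.sh
CUDA_VISIBLE_DEVICES=0 ./test.sh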
This code is based on SAM-SLR-v2. A huge thank-you to the authors for open-sourcing their code.
Thank you to @yuxng for his advice and guidance throughout this project. Shout-out to his lab @IRVL for the RTX A5000s and all the fun conversations while models were training.