SPOBET

Pose-based word-level sign language recognition with BERT-style transformer in Keras

About The Project

This repository implements a pose-based, word-level sign language recognition model with a BERT-style transformer, built in Keras.

The model is trained on the WLASL 2D pose data (https://github.com/dxli94/WLASL), using the ASL100 split.
The model is built with Keras layers, which makes it highly transferable and configurable.
Accuracy on the ASL100 split is comparable* to that of other pose-based, word-level sign language recognition models.

Further details on the implementation and a discussion of the results can be found at https://medium.com/@kennethong.ai/spobet-d9d952836c48


Using the Repo

Getting Started

Clone this repository.
Install the required packages listed in requirements.txt.
Download the dataset from the WLASL website. Only the keypoint files and the split files are needed.
Place the keypoint folders in dataset/annotations and the split files in dataset.
The model, dataset and training parameters are controlled by the config files found in the configs folder.
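The folder layout described in the steps above can be checked before training. The following is a minimal sketch, not part of the repository: the function name check_dataset_layout is hypothetical, and it assumes that the split files are JSON files placed directly in the dataset folder.

```python
from pathlib import Path


def check_dataset_layout(root="dataset"):
    """Return a list of problems found; an empty list means the layout looks fine."""
    root = Path(root)
    if not root.is_dir():
        return [f"missing folder: {root}"]

    problems = []

    # The keypoint folders are expected under dataset/annotations.
    annotations = root / "annotations"
    if not annotations.is_dir():
        problems.append(f"missing folder: {annotations}")
    elif not any(p.is_dir() for p in annotations.iterdir()):
        problems.append(f"no keypoint folders found in: {annotations}")

    # Assumption: split files are *.json files directly inside the dataset folder.
    if not any(root.glob("*.json")):
        problems.append(f"no split (*.json) files found in: {root}")

    return problems
```

Calling check_dataset_layout() from the root folder and printing each returned entry gives a quick overview of what is still missing.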

Training

In the root folder, run

python main.py --run train

TensorBoard logs will be saved in the logs directory.
Masked encoder weights will be saved as weights/pretrain.
Model weights will be saved as weights/spobet.

Evaluation

In the root folder, run

python main.py --run evaluation

The accuracy scores for the Top 1, Top 5 and Top 10 will be printed at the end.
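For readers unfamiliar with the metric, Top-k accuracy is the fraction of test samples whose true label is among the k classes with the highest predicted scores. The sketch below illustrates the definition in plain Python on toy data; it is not the repository's evaluation code, and the function name top_k_accuracy is hypothetical.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for sample_scores, true_label in zip(scores, labels):
        # Class indices sorted from the highest score to the lowest.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data: 3 samples, 4 classes (made-up scores, for illustration only).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true label 1 is ranked first
    [0.5, 0.1, 0.3, 0.1],  # true label 2 is ranked second
    [0.4, 0.3, 0.2, 0.1],  # true label 3 is ranked last
]
labels = [1, 2, 3]

for k in (1, 2, 4):
    print(f"Top {k}: {top_k_accuracy(scores, labels, k):.2%}")
```

On this toy data one of three samples is a Top 1 hit and two of three are Top 2 hits, which shows why Top 5 and Top 10 scores are always at least as high as Top 1.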

Inferencing

This repo does not include an implementation of OpenPose to retrieve the keypoints needed for inferencing. To run inference, you will need to:

Retrieve the keypoints using OpenPose.
Format the results in the same way as those in WLASL.
Create a "split" file with the necessary information, i.e. video_id (the annotation folder must have the same name), start_frame and end_frame (each annotation file is named according to its frame number), and the train/test split set to "test".
In dataconfig.cfg, set SHOW_RES = 1. This will print out the inference results at the end of evaluation. The result (res) is shown as a list of predicted labels, ordered from the lowest probability to the highest, i.e. the label with the highest probability is res[-1].
Run evaluation as per normal.
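Because res is ordered from the lowest probability to the highest, the most likely labels sit at the end of the list. The following minimal sketch shows how such a list could be read; the helper top_predictions and the example labels are hypothetical and only illustrate the ordering described above.

```python
def top_predictions(res, k=5):
    """Return the k most probable labels, most probable first."""
    if k <= 0:
        return []
    # res runs from lowest to highest probability, so take the tail and reverse it.
    return list(reversed(res[-k:]))


# Hypothetical printed result for one video, lowest probability first.
res = ["book", "drink", "computer", "before", "chair"]

best = res[-1]                        # label with the highest probability
top3 = top_predictions(res, k=3)      # three most probable labels, best first
print(best)
print(top3)
```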


Trained Weights

Model	Top 1	Top 5	Top 10
SPOBET (ASL100), BERT encoder	63.95%	87.98%	91.86%


License

The code is published under the Apache License 2.0.

The WLASL dataset used for training and experiments, however, allows only non-commercial usage. The terms of non-commercial usage therefore extend to the uploaded model weights and their derivatives.

