This project focuses on word-level video classification for Korean Sign Language (KSL). Given a word-level sign language video as input, the model processes approximately 64 frames and classifies the sign into one of 60 words. Explore our repository to delve into the implementation and contribute to its development.
☝️ Sohyun Yoo (YAI 12th) - Video data preprocessing / Model Experiments / Lead
✌️ Meedeum Cho (YAI 11th) - Segmentation / Real-time Demo
👌 Hyunjin Park (YAI 12th) - Keypoint data analysis & preprocessing
👊 Jeongmin Seo (YAI 12th) - Model Experiments / Data collecting
🖐️ Hyemi Yoo (YAI 12th) - Model Experiments / Real-time Demo
We used the following dataset for our Korean Sign Language Translation project:
- AIhub Sign Language Video Dataset
- After downloading the dataset, you will need to convert the videos into frame images. We used the generation code `./utils/generate_data.sh` from here; the environment setup for this preprocessing is also included in that repository's README. A frame-sampling sketch is also shown after this list.
- For the AIhub dataset, we also provide a script for easier preprocessing, described below:
`preprocessing/rearrange_videos.py`: extracts only the videos listed in `target_words.txt` from all of the zip files and rearranges them into per-class (per-word) directories. This is useful if you only need a subset rather than extracting every zip file. A sketch of the idea follows this list.
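For reference, here is a minimal sketch of what the rearranging step does. This is not the actual `preprocessing/rearrange_videos.py`; the directory layout, the `.mp4` extension, and the assumption that each file name contains its word label are all hypothetical:

```python
# Hypothetical sketch: pull only the videos whose word label appears in
# target_words.txt out of the downloaded zip archives and sort them into
# one directory per class (word).
import zipfile
from pathlib import Path

def rearrange(zip_dir: str, target_words_file: str, out_dir: str) -> None:
    targets = set(Path(target_words_file).read_text(encoding="utf-8").split())
    for zip_path in Path(zip_dir).glob("*.zip"):
        with zipfile.ZipFile(zip_path) as zf:
            for name in zf.namelist():
                # Skip anything that is not a target-word video
                word = next((w for w in targets if w in Path(name).stem), None)
                if word is None or not name.endswith(".mp4"):
                    continue
                class_dir = Path(out_dir) / word
                class_dir.mkdir(parents=True, exist_ok=True)
                with zf.open(name) as src, open(class_dir / Path(name).name, "wb") as dst:
                    dst.write(src.read())

if __name__ == "__main__":
    rearrange("zips", "target_words.txt", "data/videos")  # example paths only
```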
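Similarly, the frame-conversion step handled by `./utils/generate_data.sh` boils down to sampling a fixed number of frames per video. A minimal OpenCV sketch, assuming uniform sampling and JPEG output; the 64-frame count follows the model input described above, and the file naming is an assumption:

```python
# Hypothetical sketch of converting one video into a fixed number of frame images.
import os
import cv2

def video_to_frames(video_path: str, out_dir: str, num_frames: int = 64) -> None:
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Sample frame indices uniformly across the clip
    indices = [int(i * max(total - 1, 0) / (num_frames - 1)) for i in range(num_frames)]
    for i, idx in enumerate(indices):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{i:03d}.jpg"), frame)
    cap.release()
```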
To set up the project, follow these steps:
git clone https://github.com/cygbbhx/kslr.git
cd kslr
Create and activate a virtual environment using Anaconda:
conda create --name kslr-env python=3.7
conda activate kslr-env
Install the required packages:
pip install -r requirements.txt
- We used CUDA 11.6 and Python 3.7.
To train the model, run the following command:
python train.py -c config/path_to_your_config.yaml
- We have provided some example configs in `main/config`. You will need to modify arguments such as `data_dir` within the config to use them (see the sketch below).
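If you prefer to patch a config from code rather than edit it by hand, here is a minimal sketch assuming the configs are plain YAML; the file names and any key other than `data_dir` are assumptions:

```python
# Hypothetical helper: load an example config, point data_dir at your local
# data, and write out a copy to pass to train.py with -c.
import yaml

with open("main/config/example.yaml") as f:      # assumed file name
    cfg = yaml.safe_load(f)

cfg["data_dir"] = "/path/to/your/frames"         # the argument mentioned above

with open("my_config.yaml", "w") as f:
    yaml.safe_dump(cfg, f, allow_unicode=True)
```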
TBA
To run inference on other videos, use the following command:
python inference.py --input_path path/to/input/video.mp4
- You will need to modify arguments such as `--config` or `--resume` to point to the correct model and weights (an illustrative sketch of the inference step follows).
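For intuition only, this is the general shape of what word-level inference does once the config and weights are in place. It is not the repo's `inference.py`; the model class, checkpoint keys, and tensor layout are assumptions:

```python
# Hypothetical sketch: load a trained checkpoint, run a ~64-frame clip through
# the model, and take the argmax over the 60 word classes.
import torch

def predict(model: torch.nn.Module, clip: torch.Tensor, ckpt_path: str) -> int:
    state = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(state.get("state_dict", state))
    model.eval()
    with torch.no_grad():
        logits = model(clip.unsqueeze(0))  # assumed layout (1, C, T, H, W) -> (1, 60)
    return int(logits.argmax(dim=1).item())
```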
To try the demo with real-world input, run `demo.py` under the `main/demo` directory:
cd demo
python demo.py -w path_to_model_weight.pt
- You can adjust the arguments to switch the input between RGB (pixel values) and keypoints. In our implementation, the input type is selected automatically according to the chosen model. A sketch of the real-time loop idea follows.
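For a rough picture of how a real-time demo loop works (this is not the repo's `demo.py`; the webcam index, resize resolution, and buffer handling are assumptions):

```python
# Hypothetical sketch: keep a sliding buffer of ~64 webcam frames and hand it
# to the classifier whenever the buffer is full.
from collections import deque
import cv2

buffer = deque(maxlen=64)              # matches the ~64-frame model input
cap = cv2.VideoCapture(0)              # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    buffer.append(cv2.resize(frame, (224, 224)))
    if len(buffer) == buffer.maxlen:
        pass                           # run the sign classifier on the buffered clip here
    cv2.imshow("kslr demo (sketch)", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```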