Cross-layer Dynamic Fusion for Continuous Sign Language Recognition (CDF–CSLR)

Overview

This repository implements a deep learning framework for continuous sign language recognition (CSLR). It integrates spatio-temporal feature extraction, cross-layer dynamic fusion, temporal convolution, bidirectional recurrent modeling, and sequence-level knowledge distillation.
**Notice:** This is experimental code intended to demonstrate the overall structure and feasibility of the framework. Because the training data is protected, the trained model will be released after the paper is accepted.

Prerequisites

This project is implemented in PyTorch, so please install PyTorch first. Version 1.13 or later is recommended for compatibility with ctcdecode; other versions may cause errors.
ctcdecode==0.4 [parlance/ctcdecode], for beam search decoding. (ctcdecode is only supported on Linux.)
[Optional] sclite [kaldi-asr/kaldi]: install the Kaldi toolkit to get sclite for evaluation. After installation, create a soft link to sclite: `mkdir ./software`, then `ln -s PATH_TO_KALDI/tools/sctk-2.4.10/bin/sclite ./software/sclite`

For convenience, you may use the Python evaluation tool instead (by setting 'evaluate_tool' to 'python' on line 16 of ./configs/baseline.yaml), but sclite provides more detailed statistics.
You can install the other required modules by running `pip install -r requirements.txt`.
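
The choice between the two evaluation tools can be automated. The following is a minimal sketch, not part of this repository: `pick_evaluate_tool` is a hypothetical helper, and it assumes that `'sclite'` is the other accepted value of 'evaluate_tool' (only `'python'` is confirmed above).

```python
import os


def pick_evaluate_tool(sclite_path="./software/sclite"):
    """Return "sclite" if an executable exists at sclite_path, else "python".

    Hypothetical helper: the value "sclite" is an assumption about what the
    config accepts; "python" is the value named in this README.
    """
    # os.path.isfile follows symlinks, so the soft link created above is resolved.
    if os.path.isfile(sclite_path) and os.access(sclite_path, os.X_OK):
        return "sclite"
    return "python"


if __name__ == "__main__":
    print("evaluate_tool:", pick_evaluate_tool())
```

The returned string can then be written to the 'evaluate_tool' entry of ./configs/baseline.yaml by hand.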

Data Preparation

You can choose any one of the following datasets to verify the effectiveness of CDF.

PHOENIX2014 dataset

Download the RWTH-PHOENIX-Weather 2014 Dataset [download link]. Our experiments are based on phoenix-2014.v3.tar.gz.
After the download finishes, extract the archive. We suggest creating a soft link to the downloaded dataset: `ln -s PATH_TO_DATASET/phoenix2014-release ./dataset/phoenix2014`
The original image sequences are 210x260; we resize them to 256x256 for augmentation. Run the following command to generate the gloss dictionary and resize the image sequences.
```
cd ./preprocess
python dataset_preprocess.py --process-image --multiprocessing
```
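
To illustrate what a gloss dictionary is, here is a minimal sketch of building one from annotation strings. It is an assumption for illustration only: the function names, the sample glosses, and the choice of reserving index 0 for the CTC blank are hypothetical, and the format actually written by dataset_preprocess.py may differ.

```python
def build_gloss_dict(annotations):
    """Map each distinct gloss to a 1-based index, in sorted order.

    Index 0 is left unused here so it can serve as the CTC blank
    (an assumption, not confirmed by this repository).
    """
    vocab = sorted({gloss for line in annotations for gloss in line.split()})
    return {gloss: idx for idx, gloss in enumerate(vocab, start=1)}


def encode(annotation, gloss_dict):
    """Convert a whitespace-separated gloss sequence into a list of indices."""
    return [gloss_dict[gloss] for gloss in annotation.split()]


if __name__ == "__main__":
    sample = ["MORGEN REGEN NORD", "REGEN SUED"]  # made-up gloss sequences
    gloss_dict = build_gloss_dict(sample)
    print(gloss_dict)  # {'MORGEN': 1, 'NORD': 2, 'REGEN': 3, 'SUED': 4}
    print(encode("REGEN NORD", gloss_dict))  # [3, 2]
```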

PHOENIX2014-T dataset

Download the RWTH-PHOENIX-Weather 2014 T Dataset [download link].
After the download finishes, extract the archive. We suggest creating a soft link to the downloaded dataset: `ln -s PATH_TO_DATASET/PHOENIX-2014-T-release-v3/PHOENIX-2014-T ./dataset/phoenix2014-T`
The original image sequences are 210x260; we resize them to 256x256 for augmentation. Run the following command to generate the gloss dictionary and resize the image sequences.
```
cd ./preprocess
python dataset_preprocess-T.py --process-image --multiprocessing
```
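
If you prefer to script the soft-link step rather than run `ln -s` by hand, the following sketch shows one way to do it idempotently. `link_dataset` is a hypothetical helper, not part of this repository; it only wraps the standard `os.symlink` call and refuses to overwrite anything that already exists.

```python
import os


def link_dataset(source_dir, link_path):
    """Create link_path -> source_dir; do nothing if that link already exists."""
    source_dir = os.path.abspath(source_dir)
    if not os.path.isdir(source_dir):
        raise FileNotFoundError(f"dataset directory not found: {source_dir}")
    if os.path.islink(link_path):
        if os.path.realpath(link_path) == os.path.realpath(source_dir):
            return link_path  # already linked to the right place
        raise FileExistsError(f"{link_path} already points somewhere else")
    if os.path.exists(link_path):
        raise FileExistsError(f"{link_path} exists and is not a symlink")
    # Make sure the parent folder (e.g. ./dataset) exists before linking.
    os.makedirs(os.path.dirname(os.path.abspath(link_path)), exist_ok=True)
    os.symlink(source_dir, link_path, target_is_directory=True)
    return link_path


# Example call (PATH_TO_DATASET is a placeholder, as in the commands above):
# link_dataset("PATH_TO_DATASET/PHOENIX-2014-T-release-v3/PHOENIX-2014-T",
#              "./dataset/phoenix2014-T")
```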

If you get an error like `IndexError: list index out of range` on the PHOENIX2014-T dataset, you may refer to this issue to tackle the problem.

CSL-Daily dataset

Request the CSL-Daily Dataset from this website [download link].
After the download finishes, extract the archive. We suggest creating a soft link to the downloaded dataset: `ln -s PATH_TO_DATASET ./dataset/CSL-Daily`
The original image sequences are 1280x720; we resize them to 256x256 for augmentation. Run the following command to generate the gloss dictionary and resize the image sequences.
```
cd ./preprocess
python dataset_preprocess-CSL-Daily.py --process-image --multiprocessing
```
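
Resizing to a square changes the aspect ratio of the frames. The short sketch below works out the per-axis scale factors for the resolutions quoted in this README; it assumes the sizes are given as width x height, and `resize_scale` is a hypothetical helper, not a function of the preprocessing scripts.

```python
def resize_scale(width, height, target=256):
    """Return (x_scale, y_scale) for resizing width x height to target x target."""
    return target / width, target / height


if __name__ == "__main__":
    # Resolutions as quoted in this README, assumed to be width x height.
    sizes = {"PHOENIX2014": (210, 260), "CSL-Daily": (1280, 720)}
    for name, (w, h) in sizes.items():
        sx, sy = resize_scale(w, h)
        print(f"{name}: x{sx:.3f} horizontally, x{sy:.3f} vertically")
    # PHOENIX2014: x1.219 horizontally, x0.985 vertically
    # CSL-Daily: x0.200 horizontally, x0.356 vertically
```

Because the two factors differ, the frames are stretched rather than uniformly scaled, which is worth keeping in mind when comparing results across datasets.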

