PranLiu/CDF

Cross-layer Dynamic Fusion for Continuous Sign Language Recognition (CDF–CSLR)

Overview

  1. This repository implements a deep learning framework for continuous sign language recognition (CSLR). It integrates spatio-temporal feature extraction, cross-layer dynamic fusion, temporal convolution, bidirectional recurrent modeling, and sequence-level knowledge distillation.
  2. **Notice:** This is experimental code intended to demonstrate the overall structure and practicality of the framework. Because the training data is protected, the trained model will be published after the paper is accepted.

Prerequisites

  1. This project is implemented in PyTorch. Version 1.13 or later is recommended for compatibility with ctcdecode; older versions may produce errors. Please install PyTorch first.

  2. ctcdecode==0.4 [parlance/ctcdecode], for beam search decoding. (ctcdecode is only supported on Linux.)

  3. [Optional] sclite [kaldi-asr/kaldi]: install the Kaldi toolkit to get sclite for evaluation. After installation, create a soft link to sclite:

    mkdir ./software
    ln -s PATH_TO_KALDI/tools/sctk-2.4.10/bin/sclite ./software/sclite

    You may use the Python evaluation tool for convenience (by setting 'evaluate_tool' to 'python' in line 16 of ./configs/baseline.yaml), but sclite provides more detailed statistics.

  4. You can install the other required modules by running pip install -r requirements.txt
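
The version and platform constraints above can be checked before installing ctcdecode. The following is a minimal sketch, not part of this repository: the helper names `parse_version` and `check_environment` are hypothetical, and the 1.13 threshold is taken from the recommendation in item 1.

```python
import platform
import re


def parse_version(version: str) -> tuple:
    """Extract the leading (major, minor) pair from a string such as '1.13.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment(torch_version: str, system: str) -> list:
    """Return a list of warnings; an empty list means no problem was detected."""
    warnings = []
    # Recommended minimum from the prerequisites (assumption: compared on major.minor only).
    if parse_version(torch_version) < (1, 13):
        warnings.append(f"PyTorch {torch_version} is older than the recommended 1.13")
    # ctcdecode is only supported on Linux.
    if system != "Linux":
        warnings.append(f"ctcdecode is only supported on Linux, found {system}")
    return warnings


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        problems = check_environment(str(torch.__version__), platform.system())
        for line in problems or ["environment looks fine"]:
            print(line)
```

Running the script prints one line per detected problem, so it can be used as a quick sanity check before building ctcdecode.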

Data Preparation

You can choose any one of the following datasets to verify the effectiveness of CDF.

PHOENIX2014 dataset

  1. Download the RWTH-PHOENIX-Weather 2014 Dataset [download link]. Our experiments are based on phoenix-2014.v3.tar.gz.

  2. After the download finishes, extract the archive. We suggest creating a soft link to the downloaded dataset:

    ln -s PATH_TO_DATASET/phoenix2014-release ./dataset/phoenix2014

  3. The original image sequences are 210x260; we resize them to 256x256 for augmentation. Run the following commands to generate the gloss dict and resize the image sequences.

    cd ./preprocess
    python dataset_preprocess.py --process-image --multiprocessing
    
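To illustrate what generating a gloss dict involves, here is a minimal sketch that maps each gloss in a set of annotation sentences to an integer index and an occurrence count. It is an assumption-laden illustration, not the logic of dataset_preprocess.py: the function names, the `[index, count]` value layout, and reserving index 0 (for a CTC blank) are all hypothetical choices.

```python
from collections import Counter


def build_gloss_dict(annotations):
    """Map each gloss to [index, count] (hypothetical layout, index 0 left unused)."""
    counts = Counter()
    for sentence in annotations:
        # Annotations are assumed to be whitespace-separated gloss sequences.
        counts.update(sentence.split())
    # Sort alphabetically so the mapping is reproducible across runs.
    return {
        gloss: [index, counts[gloss]]
        for index, gloss in enumerate(sorted(counts), start=1)
    }


def encode(sentence, gloss_dict):
    """Convert a gloss sentence into the list of integer labels used for training."""
    return [gloss_dict[gloss][0] for gloss in sentence.split()]


if __name__ == "__main__":
    gloss_dict = build_gloss_dict(["MORGEN REGEN NORD", "REGEN SUED"])
    print(gloss_dict)
    print(encode("REGEN NORD", gloss_dict))
```

Sorting the vocabulary before assigning indices keeps the label mapping stable when the preprocessing is rerun on the same annotations.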

PHOENIX2014-T dataset

  1. Download the RWTH-PHOENIX-Weather 2014 T Dataset [download link].

  2. After the download finishes, extract the archive. We suggest creating a soft link to the downloaded dataset:

    ln -s PATH_TO_DATASET/PHOENIX-2014-T-release-v3/PHOENIX-2014-T ./dataset/phoenix2014-T

  3. The original image sequences are 210x260; we resize them to 256x256 for augmentation. Run the following commands to generate the gloss dict and resize the image sequences.

    cd ./preprocess
    python dataset_preprocess-T.py --process-image --multiprocessing
    

If you get an error like "IndexError: list index out of range" on the PHOENIX2014-T dataset, you may refer to this issue to resolve the problem.

CSL-Daily dataset

  1. Request the CSL-Daily Dataset from this website [download link].

  2. After the download finishes, extract the archive. We suggest creating a soft link to the downloaded dataset:

    ln -s PATH_TO_DATASET ./dataset/CSL-Daily

  3. The original image sequences are 1280x720; we resize them to 256x256 for augmentation. Run the following commands to generate the gloss dict and resize the image sequences.

    cd ./preprocess
    python dataset_preprocess-CSL-Daily.py --process-image --multiprocessing
    
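Since each dataset is expected under ./dataset through a soft link, a quick check that the links resolve can save a failed preprocessing run. The sketch below is a hypothetical helper, not part of this repository; it only assumes the three link names used in the steps above.

```python
from pathlib import Path

# Link names taken from the ln -s commands in this README.
EXPECTED_LINKS = ("phoenix2014", "phoenix2014-T", "CSL-Daily")


def find_dataset_links(dataset_root="./dataset"):
    """Return {name: True/False} telling whether each dataset directory resolves."""
    root = Path(dataset_root)
    status = {}
    for name in EXPECTED_LINKS:
        # is_dir() follows symlinks, so a dangling link is reported as False.
        status[name] = (root / name).is_dir()
    return status


if __name__ == "__main__":
    for name, ok in find_dataset_links().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

Only one dataset needs to be present, so a "missing" entry is a problem only for the dataset you intend to preprocess.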
