SLNSpeech contains three temporally aligned modalities: audio, visual, and sign language. To the best of the authors' knowledge, SLNSpeech is the first dataset in which these three modalities coexist, and it can be used to explore their characteristics with self-supervised learning methods.
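As a rough illustration of how the alignment can be consumed, the sketch below pairs files from the three modalities by a shared clip identifier. The directory names (`audio/`, `video/`, `sign/`), file extensions, and naming convention are assumptions for illustration only, not the dataset's actual layout; consult the repository README for the real structure.

```python
# Minimal sketch: iterate over temporally aligned (audio, video, sign) triplets.
# The layout below (audio/*.wav, video/*.mp4, sign/*.mp4 sharing a clip ID)
# is hypothetical; see the repository README for the actual directory structure.
from pathlib import Path

ROOT = Path("SLNSpeech")  # hypothetical dataset root


def aligned_triplets(root: Path):
    """Yield (audio, video, sign) file paths that share the same clip ID."""
    for audio in sorted((root / "audio").glob("*.wav")):
        clip_id = audio.stem
        video = root / "video" / f"{clip_id}.mp4"
        sign = root / "sign" / f"{clip_id}.mp4"
        # Keep only clips for which all three modalities are present.
        if video.exists() and sign.exists():
            yield audio, video, sign


if __name__ == "__main__":
    for audio, video, sign in aligned_triplets(ROOT):
        print(audio.name, video.name, sign.name)
```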
Please clone our repository to run the detailed procedures. We suggest reading the README before using the dataset. We also provide a processed version of the dataset; a download link can be found in the README. If you do not want to extend the dataset, simply download the processed version.
All SLNSpeech data is intended for academic and computational use only. No commercial usage is allowed. We highly respect copyright and privacy. If you find that SLNSpeech violates your rights, please contact us.
Licensed under the Computational Use of Data Agreement (C-UDA). Please refer to C-UDA-1.0.pdf for more information.
Please cite the SLNSpeech paper if it helps your research:
Jiasong Wu, Taotao Li, Youyong Kong, Guanyu Yang, Lotfi Senhadji, and Huazhong Shu, "SLNSpeech: solving extended speech separation problem by the help of sign language."