Welcome to Prompt2Sign!
This repository stores the preprocessed data for the paper:
SignLLM: Sign Languages Production Large Language Models.
Prompt2Sign is the first multilingual sign language dataset. It uses tools to automate the acquisition and processing of sign language videos from the web, and it is an evolving dataset designed to be efficient and lightweight, addressing shortcomings of previous datasets. The details of the dataset are available at https://signllm.github.io/Prompt2Sign/.
Current languages include: American Sign Language (ASL), German Sign Language (GSL, also known as DGS), Swiss German Sign Language (DSGS), French Sign Language of Switzerland (LSF-CH), Italian Sign Language of Switzerland (LIS-CH), Argentine Sign Language (Lengua de Señas Argentina, LSA), Korean Sign Language (KSL), and Turkish Sign Language (TSL).
Dataset Summary
| Name | Language | Vocab. | Duration (h) | Signers | Multiview | Transcription | Gloss | Pose | Depth | Speech | Prompt | Compress |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Video-Based CSL | CSL | 178 | 100 | 50 | ❌ | ✔️ | ❌ | ✔️ | ✔️ | ❌ | ❌ | ❌ |
| SIGNUM | GSL | 450 | 55 | 25 | ❌ | ✔️ | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ |
| RWTH-Phoenix-2014T | GSL | 3k | 11 | 9 | ❌ | ✔️ | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Public DGS Corpus | GSL | -- | 50 | 327 | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ❌ | ❌ |
| BSL Corpus | BSL | 5k | -- | 249 | ❌ | ✔️ | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ |
| NCSLGR | ASL | 1.8k | 5.3 | 4 | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ |
| How2Sign | ASL | 16k | 79 | 11 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ |
| Prompt2Sign (ours) | Multilingual | 40k | 200 | 40 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
Please cite the following paper when using Prompt2Sign in your research:
@inproceedings{prompt2sign2023,
title={Prompt2Sign: A Multilingual Dataset for Sign Language Production},
author={XXXX},
booktitle={xxxx},
year={2023}
}