Indian Sign Language (ISL) is the sign language most commonly used in India. It is an independent language with its own grammar, syntax, and vocabulary; it is not a signed mapping of a spoken language such as Hindi. With at least 6 million users, ISL is one of the most widely used sign languages in the world. Despite this, there are very few resources for automated processing of ISL.
In this project, we investigate different methods to train models for segmenting signs in Indian Sign Language.
For the training set, we used YouTube videos from the Deaf Enabled Foundation, which releases a daily "Word of the Day" video that also includes continuously signed sentences describing the meaning of the chosen word. We scraped 722 videos in total. The links to our dataset are given in /train_data/readme.MD.
Deaf Enabled Foundation YouTube Channel: https://www.youtube.com/@deafenabledfoundation1117.
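To rebuild a download queue from a link list like the one in /train_data/readme.MD, the YouTube URLs can be extracted from the file's free-form text. A minimal sketch (the function name and the assumption that links appear in standard watch/short-link form are ours; the actual readme layout is not specified here):

```python
import re

# Matches standard YouTube watch URLs and youtu.be short links
# followed by an 11-character video ID.
YT_URL = re.compile(
    r"https?://(?:www\.)?(?:youtube\.com/watch\?v=|youtu\.be/)[\w-]{11}"
)

def extract_links(text):
    """Pull YouTube video URLs out of free-form text such as a link readme."""
    return YT_URL.findall(text)
```

The resulting list can then be fed to any downloader (e.g. yt-dlp) to fetch the videos locally.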
For the test set, we used the ISL CSLTR Dataset, which contains multiple videos of 100 sentences in ISL. We manually annotated two videos for each sentence using the VIA Annotation Tool. The annotated dataset is available in /test_data/VIA_annotations.csv.
ISL CSLTR Dataset: https://data.mendeley.com/datasets/kcmpdxky7p/1
VIA Annotation Tool: https://www.robots.ox.ac.uk/~vgg/software/via/
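The VIA annotations can be converted into per-frame labels for segmentation. A hedged sketch, assuming the VIA3 temporal CSV export format (comment lines starting with `#`, a JSON `file_list` in column 2, and a `[start_sec, end_sec]` JSON list in the `temporal_coordinates` column); the function names, fps value, and binary sign/no-sign labelling scheme here are illustrative, not the exact pipeline in our notebooks:

```python
import csv
import io
import json

def load_segments(csv_text):
    """Parse a VIA3 temporal-annotation CSV into {filename: [(start, end), ...]}.

    Assumes the VIA3 export column order:
    metadata_id, file_list, flags, temporal_coordinates, spatial_coordinates, metadata
    """
    per_file = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row[0].startswith("#"):
            continue  # skip VIA's comment/header lines
        fname = json.loads(row[1])[0]   # file_list is a JSON list of filenames
        coords = json.loads(row[3])     # temporal_coordinates: [start_sec, end_sec]
        if len(coords) == 2:
            per_file.setdefault(fname, []).append(tuple(coords))
    return per_file

def frame_labels(segments, n_frames, fps=25):
    """Return a per-frame list: 1 inside any annotated segment, else 0."""
    labels = [0] * n_frames
    for start, end in segments:
        for f in range(int(start * fps), min(int(end * fps) + 1, n_frames)):
            labels[f] = 1
    return labels
```

These frame-level labels are the form of ground truth a temporal segmentation model is evaluated against.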
We used an MS-TCN model on I3D features, and trained it on ISL data using changepoint-modulated pseudo-labelling. The details of the models are given in the following research papers:
- Katrin Renz, Nicolaj C. Stache, Samuel Albanie and Gül Varol, Sign language segmentation with temporal convolutional networks, ICASSP 2021. [arXiv]
- Katrin Renz, Nicolaj C. Stache, Neil Fox, Gül Varol and Samuel Albanie, Sign Segmentation with Changepoint-Modulated Pseudo-Labelling, CVPRW 2021. [arXiv]
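The core idea of the second paper is to refine noisy pseudo-label boundaries using changepoints detected in the feature stream. A toy sketch of that idea (the simple feature-jump detector and the snapping rule below are our own stand-ins, much simpler than the method in the paper):

```python
import numpy as np

def changepoints(feats, threshold=1.0):
    """Return frame indices where consecutive feature vectors jump
    by more than `threshold` (a crude stand-in for a changepoint detector)."""
    diffs = np.linalg.norm(np.diff(feats, axis=0), axis=1)
    return np.where(diffs > threshold)[0] + 1

def snap_boundaries(pred_boundaries, cps, max_shift=5):
    """Move each pseudo-label boundary to the nearest detected changepoint
    if one lies within max_shift frames; otherwise keep the boundary."""
    out = []
    for b in pred_boundaries:
        if len(cps) == 0:
            out.append(b)
            continue
        nearest = int(cps[np.argmin(np.abs(cps - b))])
        out.append(nearest if abs(nearest - b) <= max_shift else b)
    return out
```

The snapped boundaries then serve as cleaner pseudo-labels when retraining the segmentation model on the unlabelled target-domain (ISL) data.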
The trained models are stored in /models/mstcn_ISL_CMPL.zip, and the Jupyter notebooks for our code are stored in /notebooks.