LID-Code-Switching-Interspeech2023

Code repository for the Interspeech 2023 paper "Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech".

Paper Abstract

This work improves the Spoken Language Identification (LangId) system for a challenge on developing robust language identification systems that remain reliable for non-standard, accented (Singaporean accent), spontaneous code-switched, and child-directed speech collected via Zoom. We propose a two-stage Encoder-Decoder-based E2E model. The encoder module consists of 1D depth-wise separable convolutions with Squeeze-and-Excitation (SE) layers with a global context. The decoder module uses an attentive temporal pooling mechanism to obtain a fixed-length, time-independent feature representation. The model has around 22.1 M parameters in total, which is relatively light compared to large-scale pre-trained speech models. We achieved an EER of 15.6% in the closed track and 11.1% in the open track (the baseline system achieves 22.1%). We also curated additional LangId data from YouTube videos (having Singaporean speakers), which will be released for public use.
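To make the pooling step in the abstract concrete, here is a minimal sketch of attentive temporal pooling in plain Python. It is not the code from this repository: the function names, the single scoring vector `attn_vector`, and the choice to concatenate a weighted mean with a weighted standard deviation are all assumptions made for illustration. The point it shows is that clips with different numbers of frames are reduced to a vector of the same length.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(frames: Sequence[Sequence[float]],
                   attn_vector: Sequence[float]) -> List[float]:
    """Pool a (T x D) sequence of frame features into a vector of length 2*D.

    Each frame gets a scalar score (its dot product with the hypothetical
    learned vector `attn_vector`). A softmax over time turns the scores into
    weights, and the output is the weighted mean of every feature dimension
    followed by its weighted standard deviation.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(attn_vector)
    scores = [sum(w * x for w, x in zip(attn_vector, frame)) for frame in frames]
    weights = softmax(scores)
    mean = [sum(a * frame[d] for a, frame in zip(weights, frames))
            for d in range(dim)]
    var = [sum(a * (frame[d] - mean[d]) ** 2 for a, frame in zip(weights, frames))
           for d in range(dim)]
    std = [math.sqrt(v) for v in var]
    return mean + std


# Two toy "clips" with 2-dimensional frame features and different durations.
short_clip = [[1.0, 0.0], [3.0, 2.0]]
long_clip = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
attn = [0.5, -0.25]  # made-up values standing in for learned parameters

# Both calls return a vector of length 2 * D, regardless of clip length.
print(len(attentive_pool(short_clip, attn)))
print(len(attentive_pool(long_clip, attn)))
```

In a trained model the scoring function would be learned jointly with the encoder and is usually a small network rather than a single vector; the sketch keeps only the softmax-over-time weighting that makes the output independent of clip duration.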

Folders and files

configs
youtube_data_links
.gitignore
README.md
finetune.py
label_models.py
train_from_scratch.py

Singaporean Dialect English-Mandarin YouTube Data

shashikg/LID-Code-Switching
