The official repository for "AnKaS", [SPECOM 2024] https://specom2024.ftn.uns.ac.rs/ (submitted)
The corpus includes transcripts from radio broadcasts, featuring samples from 17 speakers (7 males and 10 females). Covering about 4.5 hours of audio recordings, it contains 32037 words, thus being a valuable tool for linguistic research. Among the peculiarities of the presented corpus are instances of code-switching between Livvi-Karelian and Russian. The baseline experiments were carried out with the Kaldi toolkit. Hybrid DNN/HMMs with factorized time-delay neural networks were utilized for acoustic modeling, while trigram and LSTM-based models were used for language modeling. The proposed model allowed achieving the Word Error Rate (WER) of 26%.
Parts of this project page were adopted from the Nerfies page.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.