@inproceedings{huu23_interspeech,
author={Tuong Tu Huu and Viet Thanh Pham and Thi Thu Trang Nguyen and Thai Lai Dao},
title={{Mispronunciation detection and diagnosis model for tonal language, applied to Vietnamese}},
year=2023,
booktitle={Proc. INTERSPEECH 2023},
pages={1014--1018},
doi={10.21437/Interspeech.2023-364}
}
Link to the audio files: https://drive.google.com/drive/folders/1TjTluTxEB99QhGFTYFWb-vEdWXM-lyKJ?usp=sharing
Each CSV file contains 3 attributes:
Path: path to the audio file (should be changed to follow the format)
Canonical: canonical phonemes
Transcript: human-annotated phonemes
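A minimal sketch of reading one of the CSV files and rewriting the Path column. It assumes the header row uses exactly the names `Path`, `Canonical` and `Transcript`, and that phonemes are separated by whitespace; neither is confirmed here. The helper `load_rows` and the `audio_root` argument are hypothetical names used for illustration. The required path format is not specified above, so the sketch only keeps the file name and places it under a local folder.

```python
import csv
import io
from pathlib import PurePosixPath


def load_rows(csv_file, audio_root):
    """Read rows from an open CSV file and re-root each Path under audio_root."""
    rows = []
    for row in csv.DictReader(csv_file):
        # Assumption: the original Path ends with the audio file name.
        file_name = PurePosixPath(row["Path"].replace("\\", "/")).name
        rows.append({
            "Path": str(PurePosixPath(audio_root) / file_name),
            # Assumption: phonemes are whitespace-separated.
            "Canonical": row["Canonical"].split(),
            "Transcript": row["Transcript"].split(),
        })
    return rows


# Tiny in-memory example with made-up values, not taken from the dataset.
sample = io.StringIO(
    "Path,Canonical,Transcript\n"
    "old/dir/utt1.wav,a b c,a b d\n"
)
rows = load_rows(sample, "data/audio")
print(rows[0]["Path"], rows[0]["Canonical"], rows[0]["Transcript"])
```

With the real data, pass a file opened as `open("test.csv", newline="", encoding="utf-8")` instead of the in-memory sample.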
Total: 4266 utterances from 84 speakers (4.89 hours).
Added test_fix.csv, which fixes some canonical entries. The original paper uses test.csv.
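To see which canonical entries differ between the two test files, something like the following could be used. It is a sketch only: it assumes both files share the same `Path` values and the column names `Path` and `Canonical`, and `changed_canonical` is a hypothetical helper, not part of this repository.

```python
import csv
import io


def changed_canonical(original_file, fixed_file):
    """Return (path, old, new) for rows whose Canonical differs between two CSVs."""
    # Assumption: Path uniquely identifies an utterance in both files.
    original = {row["Path"]: row["Canonical"] for row in csv.DictReader(original_file)}
    changes = []
    for row in csv.DictReader(fixed_file):
        old = original.get(row["Path"])
        if old is not None and old != row["Canonical"]:
            changes.append((row["Path"], old, row["Canonical"]))
    return changes


# Made-up rows for illustration; real use would open test.csv and test_fix.csv.
before = io.StringIO("Path,Canonical,Transcript\nu1.wav,a b,a b\nu2.wav,c d,c e\n")
after = io.StringIO("Path,Canonical,Transcript\nu1.wav,a b,a b\nu2.wav,c e,c e\n")
for path, old, new in changed_canonical(before, after):
    print(path, "|", old, "->", new)
```

Results reported in the original paper correspond to test.csv, so numbers obtained on test_fix.csv may not be directly comparable.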