This repository provides pre-trained acoustic models of Hong Kong Cantonese and resources for Cantonese forced alignment using the Montreal Forced Aligner (MFA).
The pre-trained acoustic models of Hong Kong Cantonese are available in pretrained_models/:
- acoustic_model_cv15_train.zip: model trained using the train set (~10 hrs) from the Common Voice Hong Kong Chinese Corpus (Common Voice Corpus 15.0, updated on 9/14/2023).
- acoustic_model_cv15_validated.zip: model trained using the validated set (~106.5 hrs, 2325 speakers) from the Common Voice Hong Kong Chinese Corpus (Common Voice Corpus 15.0, updated on 9/14/2023).
cv15_validated_lexicon.txt and cv15_validated_lexicon.dict contain the lexicon of the Common Voice Hong Kong Chinese Corpus 15.0, which has over 4800 entries. The former is in non-probabilistic format; the latter includes pronunciation and silence probabilities.
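As a rough illustration of the non-probabilistic format, the Python sketch below assumes each line holds a word followed by its whitespace-separated phones, which is the usual layout for MFA pronunciation dictionaries. The function name `parse_lexicon` and the sample entries are placeholders for illustration and are not taken from the actual lexicon.

```python
from collections import defaultdict


def parse_lexicon(lines):
    """Parse non-probabilistic lexicon lines into {word: [phone lists]}.

    Assumes each line is: a word, then its phones, separated by whitespace.
    A word may appear on several lines (one line per pronunciation).
    """
    lexicon = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and lines without any phones
        word, phones = fields[0], fields[1:]
        lexicon[word].append(phones)
    return dict(lexicon)


# Placeholder entries (hypothetical, not real lexicon content).
sample_lines = [
    "word1 p1 p2 p3",
    "word1 p1 p4 p3",  # second pronunciation of the same word
    "",
    "word2 p5 p6",
]
lexicon = parse_lexicon(sample_lines)
```

To read the real file, the same function can be given an open file handle, for example `parse_lexicon(open("cv15_validated_lexicon.txt", encoding="utf-8"))`, assuming the file is UTF-8 encoded.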
An example of using the pre-trained acoustic model is as follows:
mfa align [OPTIONS] corpus_directory dictionary acoustic_model_cv15_validated.zip output_directory
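For batch use, the same alignment call can be assembled from Python. The sketch below only builds the argument list in the order shown in the usage line; the helper name `build_align_command` and the directory names `my_corpus` and `aligned_output` are hypothetical, and the file paths should be adjusted to wherever the model and lexicon are stored locally.

```python
import shlex


def build_align_command(corpus_dir, dictionary, acoustic_model, output_dir, options=()):
    """Return the argv list for `mfa align`, with any options placed first."""
    return ["mfa", "align", *options, corpus_dir, dictionary, acoustic_model, output_dir]


cmd = build_align_command(
    "my_corpus",                          # hypothetical corpus directory
    "cv15_validated_lexicon.dict",        # lexicon from this repository
    "acoustic_model_cv15_validated.zip",  # pre-trained model from this repository
    "aligned_output",                     # hypothetical output directory
)
print(shlex.join(cmd))

# To actually run the alignment (requires MFA to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when directory names contain spaces.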
- Training acoustic models using the Kaldi recipe
  The relevant scripts are available in kaldi_tutorial_scripts/.
- Training acoustic models with MFA (Kaldi) implementation
  The relevant scripts are available in mfa_tutorial_scripts/.