cantonese_ASR

This project is a modified version of ASR for Chinese (https://github.com/CynthiaSuwi/ASR-for-Chinese-Pipeline). That project mainly targets Mandarin. This project reuses its pipeline with Mozilla's Common Voice Hong Kong Cantonese dataset (https://commonvoice.mozilla.org/en/datasets, zh-HK_100h_2020-12-11) and the corpus information from pycantonese (https://pycantonese.org/searches.html), so training is based on a Cantonese corpus and dataset.

Follow the steps below to set up the environment and run training or testing.

GPU environment (from nvidia-smi): NVIDIA-SMI 460.80, Driver Version: 460.80, CUDA Version: 11.2

Python 3.6: install it with "sudo apt-get install python3.6"
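
The prerequisites above can be checked from Python before setting anything up. The following is a minimal sketch and not part of the repository: the function names are hypothetical, and it assumes that nvidia-smi is on the PATH whenever an NVIDIA driver is installed.

```python
import shutil
import subprocess
import sys


def python_is_36(version_info=sys.version_info):
    """Return True when the interpreter version is 3.6.x."""
    return tuple(version_info[:2]) == (3, 6)


def query_driver_version():
    """Return the NVIDIA driver version string, or None if it cannot be read."""
    # Assumption: nvidia-smi is installed together with the driver.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # With several GPUs there is one line per GPU; report the first one.
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print("Python 3.6:", python_is_36())
    print("NVIDIA driver:", query_driver_version())
```

A driver version other than 460.80 is not necessarily a problem. The values listed above are simply the ones this README reports.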

Clone the source code: "git clone https://github.com/kathykyt/cantonese_ASR.git"
Create a Python virtual environment: "cd cantonese_ASR", then run "virtualenv -p /usr/bin/python3.6 venv"
Activate the Python virtual environment: "source venv/bin/activate"
Install required packages: "pip install -r requirements.txt"
Visit https://commonvoice.mozilla.org/en/datasets, select the Cantonese dataset zh-HK_100h_2020-12-11 and download it. The downloaded file is zh-HK.tar.gz. Copy it into cantonese_ASR/dataset/ with "cp zh-HK.tar.gz {your top directory}/cantonese_ASR/dataset/"
Extract the file with "tar xvf zh-HK.tar.gz"
Prepare the WAV files for training and testing. The Common Voice data is in MP3 format, so it has to be converted to .wav files. Under cantonese_ASR/dataset/, run "./convert_to_mp3.py", then run "./convert_to_mp3_test.py".
The trained model files are saved under model_speech, so create the directory m251 under model_speech/ with "mkdir m251"
To start training, cd into cantonese_ASR and run "python train_mspeech.py". Remember to activate the Python virtual environment before issuing the command.
Please be patient: training is very slow, even with a GPU.
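
The data preparation steps above (extract the archive, convert MP3 to WAV, create the model directory) can be sketched in Python as follows. This is an illustration only and not the code of the repository's conversion scripts. It assumes that ffmpeg is installed, that 16 kHz mono WAV is a suitable target format, and that the clips sit in a single directory. All function names and the directory arguments are hypothetical.

```python
import subprocess
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir, like "tar xvf"."""
    with tarfile.open(str(archive_path), "r:gz") as archive:
        archive.extractall(str(dest_dir))


def wav_target(mp3_path, wav_dir):
    """Return the .wav path in wav_dir that corresponds to one .mp3 clip."""
    return Path(wav_dir) / (Path(mp3_path).stem + ".wav")


def ffmpeg_command(mp3_path, wav_path, sample_rate=16000):
    """Build the ffmpeg command line that converts one clip to mono WAV."""
    return [
        "ffmpeg", "-y",            # overwrite existing output files
        "-i", str(mp3_path),
        "-ar", str(sample_rate),   # assumed sample rate, adjust if needed
        "-ac", "1",                # mono
        str(wav_path),
    ]


def convert_all(clips_dir, wav_dir):
    """Convert every .mp3 file in clips_dir to a .wav file in wav_dir."""
    Path(wav_dir).mkdir(parents=True, exist_ok=True)
    for mp3_path in sorted(Path(clips_dir).glob("*.mp3")):
        command = ffmpeg_command(mp3_path, wav_target(mp3_path, wav_dir))
        subprocess.run(command, check=True)


def ensure_model_dir(repo_root):
    """Create model_speech/m251 under repo_root, like "mkdir m251"."""
    model_dir = Path(repo_root) / "model_speech" / "m251"
    model_dir.mkdir(parents=True, exist_ok=True)
    return model_dir


# Example calls (the clips and wav directories are placeholders):
# extract_archive("dataset/zh-HK.tar.gz", "dataset")
# convert_all("<extracted clips directory>", "<wav output directory>")
# ensure_model_dir(".")
```

Building the command as a list and passing it to subprocess.run avoids shell quoting problems with unusual file names, and check=True stops the run at the first clip that ffmpeg fails to convert.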
