AuxiliaryASR

This repo contains the training code for the phoneme-level ASR model used for voice conversion (VC) and TTS (text-mel alignment) in StarGANv2-VC and StyleTTS.

Pre-requisites

Python >= 3.7
Clone this repository:

git clone https://github.com/yl4579/AuxiliaryASR.git
cd AuxiliaryASR

Install Python requirements:

pip install SoundFile torchaudio torch jiwer pyyaml click matplotlib g2p_en librosa

Prepare your own dataset and put the train_list.txt and val_list.txt in the Data folder (see Training section for more details).

Training

python train.py --config_path ./Configs/config.yml

Specify the training and validation data in the config.yml file. Each line of a data list must have the format filename.wav|label|speaker_number; see train_list.txt for an example (a subset of LJSpeech). Note that speaker_number can simply be 0 for ASR, but it is useful to set a meaningful number for TTS training (if you need to use this repo for StyleTTS).
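
As a rough illustration of the list format described above, the following sketch parses and validates entries before training. The helper name parse_data_list and the sample entries are hypothetical and not part of this repo, and the sketch assumes that labels never contain the | character.

```python
from typing import List, Tuple


def parse_data_list(lines: List[str]) -> List[Tuple[str, str, int]]:
    """Parse 'filename.wav|label|speaker_number' lines (hypothetical helper)."""
    entries = []
    for line_no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split("|")
        if len(parts) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 fields, got {len(parts)}"
            )
        path, label, speaker = parts
        # int() raises ValueError if speaker_number is not an integer.
        entries.append((path, label, int(speaker)))
    return entries


# Made-up sample entries, not taken from the real train_list.txt.
sample = [
    "wavs/sample_0001.wav|hello world|0",
    "wavs/sample_0002.wav|good morning|0",
]
entries = parse_data_list(sample)
print(entries[0])
```

Running a check like this over train_list.txt and val_list.txt before starting train.py can surface malformed lines early instead of partway through an epoch.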

Checkpoints and TensorBoard logs will be saved in log_dir. To speed up training, you may want to make batch_size as large as your GPU memory allows. As a reference point, batch_size = 64 takes around 10 GB of GPU memory.

Languages

This repo is set up for English with the g2p_en package, but it can be trained on other languages. To train on a dataset in a different language, modify meldataset.py (L86-93) to use your own phonemizer. You also need to update the vocabulary file (word_index_dict.txt) and set n_token in config.yml to the number of tokens it contains. A recommended phonemizer for other languages is the phonemizer package.
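
When swapping in another phonemizer, the vocabulary file and n_token have to stay in sync. The sketch below builds a token-to-index mapping from already phonemized transcripts and derives the token count from it. Everything in it is an assumption made for illustration: build_vocab is a hypothetical helper, the sample token sequences are made up, reserving index 0 for a padding token is an assumed convention, and the CSV-style token,index layout should be checked against the existing word_index_dict.txt before use.

```python
import csv
import io
from typing import Dict, Iterable, List


def build_vocab(
    token_sequences: Iterable[List[str]], pad_token: str = "<pad>"
) -> Dict[str, int]:
    """Assign an index to each distinct token (hypothetical helper).

    Index 0 is reserved for pad_token, which is an assumed convention.
    """
    vocab = {pad_token: 0}
    for sequence in token_sequences:
        for token in sequence:
            if token not in vocab:
                vocab[token] = len(vocab)  # next free index
    return vocab


# Made-up phonemizer output for two short utterances.
phonemized = [
    ["b", "o", "n", "ʒ", "u", "ʁ"],
    ["m", "o", "n", "d"],
]
vocab = build_vocab(phonemized)
n_token = len(vocab)  # candidate value for n_token in config.yml

# Serialize as token,index rows (assumed layout, written to memory here).
buffer = io.StringIO()
writer = csv.writer(buffer)
for token, index in vocab.items():
    writer.writerow([token, index])

print(buffer.getvalue())
print("n_token =", n_token)
```

Whether n_token should also count special symbols such as padding or unknown tokens depends on how the model and text_utils.py use the vocabulary, so treat the count above as a starting point to verify.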

References

Acknowledgement

The author would like to thank @tosaka-m for his great repository and valuable discussions.
