Controllable TTS - Grader (Github)
Controllable TTS - Grader (Gitee)
Start the Gradio GUI with:
python .\gui.py
You can install the Python dependencies with
pip3 install -r requirements.txt
Attention: gradio's matplotlib version requirement conflicts with that of FastSpeech2. Install gradio first, then roll matplotlib back to 3.2.2.
Arrange the config files in the following structure:
.
├── config
│   ├── DATASET_NAME
│   │   ├── model.yaml
│   │   ├── preprocess.yaml
│   │   └── train.yaml
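Before running any of the scripts, it can help to confirm that the three config files are where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_configs` and its `root` parameter are hypothetical, and only the file names and the `config/DATASET_NAME/` location come from the structure above.

```python
from pathlib import Path
from typing import List

# File names taken from the directory layout above.
REQUIRED_CONFIGS = ("model.yaml", "preprocess.yaml", "train.yaml")


def missing_configs(dataset_name: str, root: str = ".") -> List[str]:
    """Return the required config files that are absent for one dataset.

    Hypothetical helper: looks in <root>/config/<dataset_name>/.
    """
    config_dir = Path(root) / "config" / dataset_name
    return [name for name in REQUIRED_CONFIGS if not (config_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_configs("ESD_en")
    if missing:
        print("Missing config files:", ", ".join(missing))
    else:
        print("All config files found.")
```

An empty list means the dataset's config directory is complete.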
The model uses the ESD_en dataset by default. For English single-speaker TTS, run:
python .\synthesize.py -t "YOUR_CONTENT"
Optional parameters are also available. To see the details, use --help or read the code in synthesize.py:
python .\synthesize.py --help
Some commonly used parameters:
-m or --model: name of the model to use.
-s or --speaker_id: the speaker ID in multi-speaker datasets.
-e or --emotion_id: the emotion ID in multi-emotion datasets.
-r or --restore_step: load the model from a particular checkpoint.
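To illustrate how these flags fit together, here is a minimal argparse sketch. It is not the actual code of synthesize.py: only the flag names come from the list above, while the types, defaults (including `ESD_en` as the default model), and help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Synthesize speech from text")
    # -t appears in the usage example above; its long form is not documented.
    parser.add_argument("-t", dest="text", required=True, help="text to synthesize")
    # Assumed default: the README states ESD_en is used by default.
    parser.add_argument("-m", "--model", default="ESD_en", help="name of the model to use")
    # Integer IDs and zero defaults are assumptions.
    parser.add_argument("-s", "--speaker_id", type=int, default=0,
                        help="speaker ID in multi-speaker datasets")
    parser.add_argument("-e", "--emotion_id", type=int, default=0,
                        help="emotion ID in multi-emotion datasets")
    parser.add_argument("-r", "--restore_step", type=int, default=None,
                        help="checkpoint step to load")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-t", "Hello world", "-s", "0", "-e", "2"])
print(args.text, args.model, args.speaker_id, args.emotion_id, args.restore_step)
```

For the real option set and defaults, rely on the output of --help.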
The generated utterances are saved in output/result/.
Preprocess the dataset with the following command:
python .\preprocess.py -m ESD_en
The TextGrid files generated by MFA (Montreal Forced Aligner) should be placed in ./preprocessed_data/DATASET_NAME/
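A quick way to confirm that the alignment files are in place is to list them. This is a sketch under assumptions: the helper `find_textgrids` is hypothetical, and because the exact subfolder layout inside the dataset directory is not specified here, it searches recursively for files with the `.TextGrid` extension.

```python
from pathlib import Path
from typing import List


def find_textgrids(dataset_name: str, root: str = "./preprocessed_data") -> List[Path]:
    """Recursively collect *.TextGrid files for one dataset, sorted by path.

    Hypothetical helper: returns an empty list if the dataset directory is missing.
    """
    dataset_dir = Path(root) / dataset_name
    if not dataset_dir.is_dir():
        return []
    return sorted(dataset_dir.rglob("*.TextGrid"))


if __name__ == "__main__":
    files = find_textgrids("ESD_en")
    print("Found", len(files), "TextGrid files")
```

A count of zero suggests the MFA output has not been copied into the dataset directory yet.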
Train the model with the following command:
python .\train.py -m ESD_en
Configure the pretrain path with the parameter -pp or -pretrain_path.