
CTTS: Controllable Text-To-Speech

Links

Controllable TTS - Grader (Github)

Controllable TTS - Grader (Gitee)

Controllable TTS (Gitee)

Quickstart

GUI

Start the Gradio GUI with

python .\gui.py

Dependencies

You can install the Python dependencies with

pip3 install -r requirements.txt

Attention: Gradio's matplotlib version requirement conflicts with FastSpeech2's. Install Gradio first, then roll matplotlib back to 3.2.2.
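One possible install order following the note above (the 3.2.2 pin comes from the README; run these in the repository root so requirements.txt is found):

```shell
# Install the project dependencies, then Gradio, then pin matplotlib back
pip3 install -r requirements.txt
pip3 install gradio
pip3 install matplotlib==3.2.2
```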

Inference

Arrange the config files in the following structure:

  .
  └── config
      └── DATASET_NAME
          ├── model.yaml
          ├── preprocess.yaml
          └── train.yaml
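The layout above can be scaffolded from the shell; MyDataset stands in for your dataset name, and the empty YAML files are placeholders to be filled in:

```shell
# Create the config directory for a dataset and stub out the three config files
mkdir -p config/MyDataset
touch config/MyDataset/model.yaml config/MyDataset/preprocess.yaml config/MyDataset/train.yaml
ls config/MyDataset
```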

The model uses the ESD_en dataset by default. For English single-speaker TTS, run

python .\synthesize.py -t "YOUR_CONTENT"

There are optional parameters; you can check out the details with --help or read the code in synthesize.py:

python .\synthesize.py --help

Here are some commonly used parameters:

-m or --model: name of the model to use.

-s or --speaker_id: specify the speaker id in multi-speaker datasets.

-e or --emotion_id: specify the emotion id in multi-emotion datasets.

-r or --restore_step: load the model from a particular checkpoint.
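Putting the flags together, a multi-speaker, multi-emotion synthesis call might look like this (the ids and checkpoint step are illustrative values, not defaults from the repository):

```shell
python .\synthesize.py -t "YOUR_CONTENT" -m ESD_en -s 1 -e 2 -r 100000
```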

The generated utterances will be saved to output/result/.

Training

Preprocess

Preprocess the dataset with the following command:

python .\preprocess.py -m ESD_en

The TextGrid files generated by MFA (Montreal Forced Aligner) should be put in ./preprocessed_data/DATASET_NAME/

Train

Train the model with the following command:

python .\train.py -m ESD_en

Configure the pretrained model path with the -pp or -pretrain_path parameter.
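For example, resuming training from a pretrained checkpoint might look like this (the checkpoint path is hypothetical):

```shell
python .\train.py -m ESD_en -pp output\ckpt\ESD_en
```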

About

Controllable Text-to-speech system, based on FastSpeech2
