This repository has been archived by the owner on Oct 1, 2021. It is now read-only.

Pretrained models

Kramarenko Vladislav edited this page Aug 27, 2019 · 2 revisions

Pretrained models come as an archive that contains all four models (g2p, speaker encoder, synthesizer, vocoder). The archive has the same directory structure as the repo, and you're expected to merge its contents with the root of the repository. For reference, the models were trained on GTX 1060 GPUs.
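Because the archive mirrors the repo layout, extracting it directly over the repository root merges the model files into place. A minimal sketch of that step (the archive filename and `zip` format are assumptions; use the actual file you downloaded):

```python
import zipfile


def merge_pretrained(archive_path: str, repo_root: str = ".") -> list[str]:
    """Extract the pretrained-models archive over the repository root.

    The archive mirrors the repo's directory structure, so extracting it
    into the root drops each model into its expected location.
    Returns the list of extracted member names.
    """
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(repo_root)
        return zf.namelist()


# Hypothetical filename; substitute the archive from the Google Drive link.
# merge_pretrained("pretrained.zip")
```

Equivalently, `unzip -o pretrained.zip -d .` from the repository root does the same merge.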

Initial commit (latest release): Google Drive

- G2P: trained 100k steps (0.5 days) with a batch size of 512
- Encoder: trained 1.7M steps (24 days) with a batch size of 64
- Synthesizer: trained 332k steps (10 days) with a batch size of 32
- Vocoder: trained 1.4M steps (14 days) with a batch size of 100
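The figures above imply rough per-model training throughput on that hardware. A small sketch that derives steps/day from the quoted numbers (the dictionary below just restates the figures from this page):

```python
# Training figures quoted above: (steps, days, batch size).
runs = {
    "g2p":         (100_000,  0.5, 512),
    "encoder":     (1_700_000, 24,   64),
    "synthesizer": (332_000,   10,   32),
    "vocoder":     (1_400_000, 14,  100),
}


def steps_per_day(steps: int, days: float) -> float:
    """Average optimizer steps completed per day of training."""
    return steps / days


for name, (steps, days, batch) in runs.items():
    print(f"{name}: ~{steps_per_day(steps, days):,.0f} steps/day "
          f"at batch size {batch}")
```

For example, the encoder averaged roughly 71k steps/day, so retraining it from scratch at a similar pace would again take on the order of weeks.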