Controllable Speakers

Flux9665 released this 25 Oct 15:16

This release extends the toolkit's functionality and provides new checkpoints.

self-contained embeddings: we no longer use an external embedding model for TTS conditioning. Instead, we train one that is specifically tailored to this use.
new vocoder: Avocodo replaces HiFi-GAN
new controllability options through artificial speaker generation
quality-of-life changes, such as Weights & Biases integration, a graphical demo script, and automated model downloading
various bug fixes and speed improvements

This release breaks backwards compatibility. Please download the new models, or stick to a prior release if you rely on your old models.
