PyTorch implementation of Eden-TTS: A Simple and Efficient Parallel Text-to-speech Architecture with Collaborative Duration-alignment Learning

We propose Eden-TTS, a simple and efficient parallel TTS architecture that jointly learns duration prediction, text-speech alignment, and speech generation in a single fully differentiable model. The alignment is learned implicitly within the architecture. A novel energy-modulated attention mechanism provides alignment guidance, which leads to fast and stable convergence. The model is easy to implement and train.
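To make the relationship between durations and alignment concrete, here is a minimal sketch in plain Python. It is not the energy-modulated attention used by Eden-TTS and is not taken from this repository; it only illustrates the general idea that per-token durations determine a hard monotonic text-to-frame alignment, and that the durations can be read back from such an alignment. The function names `durations_to_alignment` and `alignment_to_durations` are hypothetical.

```python
# Illustrative only: NOT the Eden-TTS energy-modulated attention.
# Hypothetical helpers showing how per-token durations imply a hard,
# monotonic text-to-frame alignment, and how to invert that mapping.


def durations_to_alignment(durations):
    """Return a (frames x tokens) 0/1 matrix; each frame maps to one token."""
    total_frames = sum(durations)
    alignment = [[0] * len(durations) for _ in range(total_frames)]
    frame = 0
    for token_index, duration in enumerate(durations):
        # Assign the next `duration` frames to this token.
        for _ in range(duration):
            alignment[frame][token_index] = 1
            frame += 1
    return alignment


def alignment_to_durations(alignment):
    """Recover durations as column sums (frames assigned to each token)."""
    if not alignment:
        return []
    return [sum(column) for column in zip(*alignment)]


if __name__ == "__main__":
    example = durations_to_alignment([2, 1, 3])
    for row in example:
        print(row)
    print(alignment_to_durations(example))
```

In a learned model the alignment is soft and differentiable rather than a 0/1 matrix; the sketch only shows why duration prediction and alignment are two views of the same quantity.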

Listen to the audio samples.

architecture

train the model using LJSpeech

Download the LJSpeech dataset and extract it.
Clone this repo: git clone https://github.com/edenynm/eden-tts.git
Run python preprocess_ljs.py -p path/to/ljspeech to prepare the training data.
Run python train.py to start training. You may want to check hparams.py for the experiment settings before running.
Download a pretrained vocoder from the HiFi-GAN pretrained models, and set voc_path in hparams.py to the path of the downloaded HiFi-GAN vocoder.
When training finishes, run python inference.py -t "input text" to generate speech.
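The script-based steps above can also be chained from a small driver. The following is a sketch, not part of this repository: the helper names `build_commands` and `run_pipeline` are hypothetical, and only the script names and the `-p` and `-t` flags come from the steps above. It assumes the dataset is already extracted, the repository is already cloned, and `voc_path` in hparams.py has been set by hand before inference runs.

```python
import subprocess
import sys


def build_commands(ljspeech_dir, text):
    """Return the three documented commands as argument lists, in order."""
    python = sys.executable  # use the interpreter running this script
    return [
        [python, "preprocess_ljs.py", "-p", str(ljspeech_dir)],
        [python, "train.py"],
        [python, "inference.py", "-t", text],
    ]


def run_pipeline(ljspeech_dir, text, repo_dir="."):
    """Run each step from the repository root, stopping at the first failure."""
    for command in build_commands(ljspeech_dir, text):
        # check=True raises CalledProcessError if a step exits non-zero.
        subprocess.run(command, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Print the commands only; call run_pipeline(...) to actually execute them.
    for command in build_commands("path/to/ljspeech", "input text"):
        print(" ".join(command))
```

Passing each command as an argument list avoids shell quoting problems when the input text contains spaces or quotes.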

reference

git repository

cite our article

If you find the method helpful, you may cite the following article.

@inproceedings{ma23c_interspeech,
  author={Youneng Ma and Junyi He and Meimei Wu and Guangyue Hu and Haojun Fei},
  title={{EdenTTS: A Simple and Efficient Parallel Text-to-speech Architecture with Collaborative Duration-alignment Learning}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={4449--4453},
  doi={10.21437/Interspeech.2023-700}
}

