Mean Opinion Score Study

Eren Gölge edited this page Jan 12, 2021 · 1 revision

Based on our latest MOS study, TTS can achieve performance on par with other solutions. The study also shows that dataset quality is an important factor in TTS output quality. It is therefore worth finding a good resource, or even recording your own dataset, if you want to reach peak TTS performance.

You can find some pointers under Dataset.

Models

Judy Wave1: Tacotron + WaveRNN
Judy Wave2: Tacotron2 + WaveRNN
Judy GL1: Tacotron + Griffin-Lim
Judy GL2: Tacotron2 + Griffin-Lim
MozillaTTS Nancy: Tacotron + Griffin-Lim
MozillaTTS Nancy2: Tacotron2 + WaveRNN
MozillaTTS LJSpeech: Tacotron + Griffin-Lim

(Judy is a 25-hour internal dataset.)

(.Jofish, .Abe and .Janice are real human voices.)
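
For readers new to the metric: a Mean Opinion Score is usually reported as the average of listener ratings for a system, often with a confidence interval. The snippet below is a minimal sketch of that calculation, not the procedure used in this study. It assumes a 1 to 5 rating scale and a normal-approximation 95% interval; the function name `mean_opinion_score` and the example ratings are hypothetical and are not data from the study.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mos, ci95_half_width) for a list of listener ratings.

    Assumption: ratings are on a 1-5 scale and the 95% interval uses
    the normal approximation, 1.96 * sample std / sqrt(n).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings for one system; NOT results from this study.
example_ratings = [4, 5, 3, 4, 4, 5, 4, 3]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Computing the score this way for each entry in the model list, and for the human voices, would let the systems be compared on the same scale.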