Tacotron: Train TWEB dataset #22

erogol · 2018-04-23T11:59:10Z

Dataset: https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset

erogol · 2018-04-23T12:01:04Z

Training on the master branch performed very poorly because the sequences in this dataset are very long. To alleviate the problem, I am trying Truncated Backpropagation Through Time (TBPTT).
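For readers unfamiliar with the idea, here is a minimal sketch of the windowing schedule behind truncated backpropagation through time. It is not the code used in this repository. The names `tbptt_windows` and `truncated_pass` and the toy recurrence are assumptions made up for illustration. The recurrent state is carried across windows, while the span that backpropagation would cover is cut at each window boundary.

```python
from typing import Iterator, List, Sequence, Tuple


def tbptt_windows(num_frames: int, window: int) -> Iterator[Tuple[int, int]]:
    """Yield consecutive (start, end) frame ranges covering num_frames."""
    if window <= 0:
        raise ValueError("window must be positive")
    for start in range(0, num_frames, window):
        yield start, min(start + window, num_frames)


def truncated_pass(frames: Sequence[float], window: int) -> Tuple[float, List[int]]:
    """Run a toy recurrence window by window.

    Returns the final state and, per window, the number of steps that
    backpropagation would span if the graph were cut at window boundaries.
    """
    state = 0.0  # stands in for the decoder's recurrent state (hypothetical)
    graph_lengths: List[int] = []
    for start, end in tbptt_windows(len(frames), window):
        steps = 0
        for x in frames[start:end]:
            state = 0.5 * state + x  # toy recurrence, not Tacotron's decoder
            steps += 1
        # A real training loop would compute the loss on this window here,
        # backpropagate, update the weights, and then detach `state` so the
        # next window does not backpropagate into this one.
        graph_lengths.append(steps)
    return state, graph_lengths


if __name__ == "__main__":
    print(list(tbptt_windows(10, 4)))
    print(truncated_pass([1.0] * 10, 4))
```

The forward state is identical to an untruncated pass. Only the backward span shrinks, which is what keeps memory and gradient path length bounded on long utterances.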

erogol · 2018-04-23T12:02:58Z

The dataset has an interesting frequency distribution; it seems to have been post-processed after recording.
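One way to put a number on an observation like this is to measure how much spectral energy lies above a chosen cutoff. The sketch below uses a naive DFT from the standard library so it runs without extra dependencies. The function names, the 3000 Hz cutoff, and the synthetic test tones are assumptions for illustration, not measurements from TWEB.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(signal: Sequence[float]) -> List[float]:
    """Naive DFT power for bins 0..N/2 (adequate for short excerpts)."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(signal):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        powers.append(abs(acc) ** 2)
    return powers


def energy_fraction_above(
    signal: Sequence[float], sample_rate: int, cutoff_hz: float
) -> float:
    """Fraction of spectral energy at or above cutoff_hz."""
    powers = power_spectrum(signal)
    total = sum(powers)
    if total == 0.0:
        return 0.0
    n = len(signal)
    high = sum(p for k, p in enumerate(powers) if k * sample_rate / n >= cutoff_hz)
    return high / total


def tone(freq_hz: float, sample_rate: int, n: int) -> List[float]:
    """Synthetic sine tone used only to exercise the functions above."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    sr, n = 12000, 240
    low_only = tone(500.0, sr, n)
    mixed = [a + b for a, b in zip(low_only, tone(5000.0, sr, n))]
    print(energy_fraction_above(low_only, sr, 3000.0))
    print(energy_fraction_above(mixed, sr, 3000.0))
```

A recording whose fraction stays near zero for every cutoff above some frequency is consistent with low-pass filtering, although this check alone does not prove it.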

erogol · 2019-01-06T17:09:37Z

This dataset is of very low quality. It appears to have been low-pass filtered, which causes poor stop-token prediction and pronunciation errors, especially for novel words. Training with phonemes might improve the results.

I also replaced ReLU with RReLU and removed Dropout in the prenet. These changes improved the results, but they have yet to be tested on other datasets.
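For context, this is a small pure-Python sketch of what a randomized leaky ReLU computes. It is not the prenet code from this repository. The bounds 1/8 and 1/3 are assumed here because they match the defaults of PyTorch's `torch.nn.RReLU` as I understand them. During training, each negative input gets a slope drawn uniformly from the range. At evaluation time the slope is fixed to the midpoint.

```python
import random
from typing import List, Optional, Sequence


def rrelu(
    values: Sequence[float],
    lower: float = 1.0 / 8.0,
    upper: float = 1.0 / 3.0,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Randomized leaky ReLU applied element-wise to a list of floats."""
    if training:
        rng = rng or random.Random()
        # Each negative element gets its own randomly sampled slope.
        return [v if v >= 0 else rng.uniform(lower, upper) * v for v in values]
    # Evaluation mode: deterministic slope at the midpoint of the range.
    slope = (lower + upper) / 2.0
    return [v if v >= 0 else slope * v for v in values]


if __name__ == "__main__":
    xs = [-2.0, 0.0, 3.0]
    print(rrelu(xs, training=True, rng=random.Random(0)))
    print(rrelu(xs, training=False))
```

Because the negative slope is random during training, RReLU still injects some noise into the prenet even with Dropout removed. Whether that explains the improvement reported here is not established in this thread.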

Sound example: https://soundcloud.com/user-565970875/tweb-example-108k-iters-2810d57
Model : https://drive.google.com/open?id=1deQ2akq9cuyreda0DgZOiBdydkbgseWP

erogol · 2019-01-07T14:14:15Z

As I just discovered, I trained the model with a sampling rate of 22050, which is the default for LJSpeech, but TWEB has a sampling rate of 12000. That might be an important bug.
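A cheap guard against this kind of mismatch is to compare the rate stored in the WAV header with the configured rate before training. The sketch below uses only the standard-library `wave` module. The helper names and the synthetic tone file are assumptions for illustration. How this repository's loader actually treats a mismatched file (resampling or reading the samples as-is) is not stated in this thread.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path


def write_tone(path: Path, sample_rate: int, seconds: float = 0.1) -> None:
    """Write a short 16-bit mono sine tone, used here only as test data."""
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440.0 * i / sample_rate)))
        for i in range(n)
    )
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(frames)


def file_sample_rate(path: Path) -> int:
    """Return the sampling rate recorded in the WAV header."""
    with wave.open(str(path), "rb") as wf:
        return wf.getframerate()


def check_sample_rate(path: Path, configured_rate: int) -> int:
    """Raise if the file's rate differs from the configured rate."""
    actual = file_sample_rate(path)
    if actual != configured_rate:
        raise ValueError(
            f"{path.name}: file is {actual} Hz but config says {configured_rate} Hz"
        )
    return actual


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wav = Path(tmp) / "tone.wav"
        write_tone(wav, 12000)
        print(check_sample_rate(wav, 12000))
        try:
            check_sample_rate(wav, 22050)
        except ValueError as err:
            print(err)
```

Running a check like this over a handful of files from a new dataset takes seconds and catches the case where a config is carried over from another corpus.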

viveknad94 · 2019-11-05T01:07:03Z

@erogol Is there any pretrained TTS model that has been trained with a male voice? I have been trying to synthesize audio with the pretrained model (Tacotron-iter-108K on the TWEB dataset), but the commit (2810d57) is no longer present. I get this error: error: pathspec '2810d57' did not match any file(s) known to git.

erogol added this to In Progress in v0.0.1 Apr 25, 2018

erogol moved this from In Progress to Done in v0.0.1 May 28, 2018

erogol closed this as completed Jan 6, 2019
