Abrupt noise #68

WendongGan · 2019-01-07T02:44:20Z

Has anybody else run into this problem? After training for 1000k steps on LJSpeech, an "abrupt noise" appears in the output. For example:

The audio file is:
LJ001-0007.wav_synthesis_01.zip

My config.json file is:

I used a single GPU.

I look forward to your help!

WendongGan · 2019-01-07T02:48:40Z

Some friends think the cause is that the dataset is too small and the model is overfitting.

WendongGan · 2019-01-07T02:54:43Z

My code is from commit f4c04e2, which was committed on Nov 10, 2018. Training takes so long that I have not yet used the latest code. Does the latest code have this problem?

Yeongtae · 2019-01-07T07:45:16Z

Did you generate the sample audio from a mel-spectrogram or from text?

WendongGan · 2019-01-07T07:54:45Z

The "abrupt noise" appears both when the audio is generated from a mel-spectrogram and when it is generated from text. Both conditions give the same noise.

WendongGan · 2019-01-07T07:58:43Z

I'm trying the latest code, and I want to know whether the latest commits solve the problem. For example,

Yeongtae · 2019-01-14T09:44:12Z

@UESTCgan Is it solved? My model has similar noise.
8.zip

WendongGan · 2019-01-15T08:00:33Z

I listened to your sample. How many steps did you train for? How many hours of audio does your training dataset contain? Do you mean that your noise is this one:

I also have this noise, but the "abrupt noise" is more serious. This is the noise:

I'm trying the author's latest code. Training has only reached 100k steps, which is not enough, so I'm not sure yet whether it solves the problem. (f4c04e2).

Yeongtae · 2019-01-15T08:12:40Z

My model was trained for 1100 epochs.
But it has a reverb effect.

WendongGan · 2019-01-15T08:14:16Z

How many hours of audio does your training dataset contain?

Yeongtae · 2019-01-15T08:17:04Z

With 8 V100 GPUs on a GCP VM, it took 5 days.
My experiment settings are as follows:
Num channels: 8bit
Batch size: 80 (10 per GPU)
The other parameters are default.

WendongGan · 2019-01-15T08:20:54Z

What is your sigma? I set it to 1.0 for both training and inference.

Yeongtae · 2019-01-15T08:25:19Z

Sigma is sqrt(0.5) ≈ 0.7071 for training.
That is the default in the WaveGlow paper.

Sigma is 0.66 for inference. That is the default in the demo.
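
To make the role of sigma concrete: in a flow-based vocoder such as WaveGlow, inference starts from zero-mean Gaussian noise, and sigma is the standard deviation of that noise. The sketch below uses only the Python standard library and just draws such noise at the two values quoted above; `sample_latent` is a hypothetical helper written for this illustration, not a function from the repository.

```python
import math
import random
import statistics


def sample_latent(n, sigma, seed=0):
    """Draw n samples from a zero-mean Gaussian with standard deviation sigma."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


TRAIN_SIGMA = math.sqrt(0.5)  # training value quoted in this thread
INFER_SIGMA = 0.66            # inference value quoted in this thread

z_train = sample_latent(20000, TRAIN_SIGMA)
z_infer = sample_latent(20000, INFER_SIGMA)

# The empirical spread of the noise tracks sigma.
print(round(statistics.pstdev(z_train), 2))
print(round(statistics.pstdev(z_infer), 2))
```

A smaller sigma at inference means the model is fed noise that is slightly narrower than what it saw during training, which is the knob being discussed here.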

WendongGan · 2019-01-15T08:28:24Z

If you increase sigma at inference, the background noise decreases.

Yeongtae · 2019-01-15T08:30:03Z

But a large sigma produces more of a reverb effect.

WendongGan · 2019-01-15T08:43:11Z

I see, thank you!

Yeongtae · 2019-01-16T00:42:01Z

My dataset consists of 13,000 sentences, 10 hours in total.
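
Because this thread mixes step counts (1000k, 100k) with an epoch count (1100), a rough conversion can help when comparing runs. The sketch below assumes that each sentence contributes one training sample per epoch and that the last partial batch is kept; both are assumptions, not details confirmed in this thread. `total_steps` is a hypothetical helper.

```python
import math


def total_steps(num_samples, global_batch_size, epochs):
    """Approximate optimizer steps, assuming one sample per sentence per epoch."""
    steps_per_epoch = math.ceil(num_samples / global_batch_size)
    return steps_per_epoch * epochs


per_gpu_batch = 10
num_gpus = 8
global_batch = per_gpu_batch * num_gpus  # 80, as reported above

steps = total_steps(num_samples=13000, global_batch_size=global_batch, epochs=1100)
print(steps)  # 179300
```

Under these assumptions, 1100 epochs on this dataset is roughly 179k steps at batch size 80. That is not directly comparable to a single-GPU run, whose batch size is not stated in this thread.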

yxt132 · 2019-01-28T12:21:06Z

I saw that you used a 16k sampling rate. Isn't the sampling rate of the LJSpeech dataset 22050? Does it matter? What does the segment length do? Does it have to be consistent with the sampling rate?

rafaelvalle · 2019-02-10T00:34:57Z

Segment length is independent of sampling rate.
It is OK to convert LJS to 16 kHz. Note that if you are training Tacotron in parallel, it must use the same audio specifications.
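
As a small illustration of the first point: segment length is a number of samples, so the same setting works at any sampling rate, and only the duration of audio covered by each training segment changes. The value 16000 below is an assumed example, not a value taken from the config in this thread, and `segment_seconds` is a hypothetical helper.

```python
def segment_seconds(segment_length, sampling_rate):
    """Duration in seconds of a training segment of segment_length samples."""
    return segment_length / sampling_rate


SEGMENT_LENGTH = 16000  # assumed example value, in samples

for rate in (22050, 16000):
    # Same number of samples, different duration per segment.
    print(rate, round(segment_seconds(SEGMENT_LENGTH, rate), 3))
```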

rafaelvalle · 2019-02-26T21:33:51Z

We've shared a quick hack to decrease the fixed noise that comes from the model's bias in WaveGlow:
NVIDIA/tacotron2#142 (comment)
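
The linked comment is not reproduced in this thread. Purely as an illustration of how a fixed, input-independent noise component could be reduced, the sketch below applies plain spectral subtraction to a single frame: it lowers each frequency bin's magnitude by a scaled estimate of the bias magnitude and keeps the original phase. This is an assumption about the general idea, not the code shared in NVIDIA/tacotron2#142. `dft`, `idft`, and `subtract_bias` are hypothetical helpers, written with a naive DFT to stay dependency-free.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def subtract_bias(frame, bias_frame, strength=0.1):
    """Reduce each bin's magnitude by strength * |bias|, keeping the phase."""
    spec = dft(frame)
    bias_mag = [abs(b) for b in dft(bias_frame)]
    cleaned = []
    for s, b in zip(spec, bias_mag):
        mag = max(abs(s) - strength * b, 0.0)  # clamp so magnitudes stay non-negative
        cleaned.append(cmath.rect(mag, cmath.phase(s)))
    return idft(cleaned)


# Toy example: a low-frequency "speech" tone plus a fixed higher-frequency "bias" tone.
n = 64
speech = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
bias = [0.2 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]
noisy = [s + b for s, b in zip(speech, bias)]

denoised = subtract_bias(noisy, bias, strength=1.0)
print(max(abs(d - s) for d, s in zip(denoised, speech)) < 1e-6)
```

In this toy case the bias occupies its own frequency bins, so it is removed cleanly. With real audio the bias overlaps the speech spectrum, so the strength becomes a trade-off between removing noise and damaging the signal.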

rafaelvalle · 2019-03-16T01:00:20Z

Closing due to inactivity.

rafaelvalle closed this as completed Mar 16, 2019
