How to fix the noise during inference time? #23

Closed
xinghua-qu opened this issue Jul 27, 2021 · 5 comments

Comments

@xinghua-qu

xinghua-qu commented Jul 27, 2021

Hi Jaehyeon,

May I ask how to fix the stochastic noise during inference time? I want the generated audio to be reproducible, so I need to fix the random noise.
Currently it seems I can only control the noise scale.

sid = torch.LongTensor([1]) # speaker identity
stn_tst = get_text("Tell me the answer please", hps_ms)

with torch.no_grad():
    x_tst = stn_tst.unsqueeze(0)
    x_tst_lengths = torch.LongTensor([stn_tst.size(0)])
    audio = net_g_ms.infer(x_tst, x_tst_lengths, sid = sid, noise_scale=1, noise_scale_w=2, length_scale=1)[0][0,0].data.float().numpy()
ipd.display(ipd.Audio(audio, rate=hps_ms.data.sampling_rate))
@BridgetteSong

Always use the same input in the infer function in models.py:
z_p = m_p + torch.randn_like(m_p) * torch.exp(logs_p) * noise_scale
You can replace torch.randn_like with the same fixed input:

  1. Print torch.randn_like(m_p) once and save it.
  2. Always use the result of step 1 as the input tensor (see the sketch below).
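
A minimal sketch of that idea, assuming the input text (and therefore the shape of m_p) stays the same between runs; the file name fixed_noise.pt is just an example:

import torch

# First run: sample the noise once and keep it on disk.
noise = torch.randn_like(m_p)
torch.save(noise, "fixed_noise.pt")

# Later runs: load the saved tensor and reuse it instead of resampling,
# so z_p (and therefore the generated audio) is identical every time.
noise = torch.load("fixed_noise.pt")
z_p = m_p + noise * torch.exp(logs_p) * noise_scale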

@xinghua-qu
Author

xinghua-qu commented Jul 27, 2021

Always use the same input in the infer function in models.py:
z_p = m_p + torch.randn_like(m_p) * torch.exp(logs_p) * noise_scale
You can replace torch.randn_like with the same fixed input:

  1. Print torch.randn_like(m_p) once and save it.
  2. Always use the result of step 1 as the input tensor.

Thanks for the reply.
But the dimensions of m_p vary between runs, so it cannot simply be replaced with a constant tensor.
Here are the shapes of m_p printed across several runs:

torch.Size([1, 192, 124])
torch.Size([1, 192, 126])
torch.Size([1, 192, 123])
torch.Size([1, 192, 125])
torch.Size([1, 192, 124])

To my understanding, setting the hyperparameter noise_scale_w to zero makes the predicted durations deterministic, so the dimensions of m_p stay constant. But if noise_scale_w is not equal to 0, is there any way to reproduce the same generated audio?

@BridgetteSong

Because you use a StochasticDurationPredictor for duration prediction, it also contains a sampling step like this:
e_q = torch.randn(w.size(0), 2, w.size(2)).to(device=x.device, dtype=x.dtype) * x_mask
You should fix this noise in the same way as above (a seeded alternative is sketched below).
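
One way to pin that sampling down without saving tensors to disk is to draw it from a dedicated, seeded torch.Generator. This is only a sketch of the idea, not code from the repository; dp_generator and the seed value are made-up names:

import torch

# A private generator, re-seeded at the start of every inference call so
# the duration predictor draws exactly the same noise sequence each run.
dp_generator = torch.Generator()
dp_generator.manual_seed(1234)

# Replacement for the sampling line inside StochasticDurationPredictor:
e_q = torch.randn(w.size(0), 2, w.size(2), generator=dp_generator)
e_q = e_q.to(device=x.device, dtype=x.dtype) * x_mask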

@CookiePPP

Just set the seed in torch before you run inference. Use torch.random.fork_rng if you want the other components to be unaffected by the seed.
It will always give you the same output for the same input.

@CookiePPP

CookiePPP commented Jul 27, 2021

with torch.random.fork_rng():

https://pytorch.org/docs/stable/random.html

And

torch.manual_seed(0)

https://pytorch.org/docs/stable/notes/randomness.html

applied when you call the model forward should be enough for reproduction.
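
For example, applied to the snippet from the original post (same variable names), the whole inference call can be wrapped like this; the seed value 0 is arbitrary:

import torch

with torch.no_grad(), torch.random.fork_rng():
    # manual_seed inside fork_rng fixes the RNG state only for this block,
    # so randomness elsewhere in the program is left untouched.
    torch.manual_seed(0)
    x_tst = stn_tst.unsqueeze(0)
    x_tst_lengths = torch.LongTensor([stn_tst.size(0)])
    audio = net_g_ms.infer(x_tst, x_tst_lengths, sid=sid,
                           noise_scale=1, noise_scale_w=2,
                           length_scale=1)[0][0, 0].data.float().numpy()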
