Out of Memory on Synthesis #37

Closed
dyelax opened this issue Mar 13, 2018 · 14 comments

@dyelax
Contributor

dyelax commented Mar 13, 2018

When running python synthesis.py <model_checkpoint_path> <output_dir> --conditional <mel_path> , I consistently run out of GPU memory about 4 minutes into synthesis. I have a GTX 1080Ti (11GB memory), and when I watch nvidia-smi while synthesis is running, the memory usage continually increases until it runs out. How much GPU memory is generally required to synthesize a clip?

For reference, here is the progress on ljspeech-mel-00001.npy before it failed most recently:
33249/195328 [03:57<19:18, 139.90it/s]

@imdatceleste

imdatceleste commented Mar 14, 2018

@dyelax: are you getting the OOM from CUDA, or are you running out of main RAM? I can't see anything in r9y9's code that would run out of memory on the GPU, but there is a place where it could run out of main RAM if the audio to be generated is too long.

I have just generated an audio of 240,000 frames (yours is 195,328) and I had no problems with GTX 1080 Ti (11GB). BUT: I have 128GB of RAM...

Also: what sample rate are you using? You should not go beyond 22-24 kHz.

@dyelax
Contributor Author

dyelax commented Mar 14, 2018

It's definitely a CUDA memory issue. Here's the error I'm getting:

THCudaCheck FAIL file=/tmp/pip-yxt749na-build/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "synthesis.py", line 182, in <module>
    waveform = wavegen(model, length, c=c, g=speaker_id, initial_value=initial_value, fast=True)
  File "synthesis.py", line 124, in wavegen
    log_scale_min=hparams.log_scale_min)
  File "/workspace/wavenet_vocoder/wavenet_vocoder/wavenet.py", line 335, in incremental_forward
    x, h = f.incremental_forward(x, ct, gt)
  File "/workspace/wavenet_vocoder/wavenet_vocoder/modules.py", line 125, in incremental_forward
    return self._forward(x, c, g, True)
  File "/workspace/wavenet_vocoder/wavenet_vocoder/modules.py", line 143, in _forward
    x = self.conv.incremental_forward(x)
  File "/opt/conda/lib/python3.6/site-packages/deepvoice3_pytorch/conv.py", line 40, in incremental_forward
    self.input_buffer[:, :-1, :] = self.input_buffer[:, 1:, :].clone()
RuntimeError: cuda runtime error (2) : out of memory at /tmp/pip-yxt749na-build/aten/src/THC/generic/THCStorage.cu:58

I have 16GB RAM, but again, this definitely seems like a GPU memory problem, especially since I can see the GPU memory climbing and hitting the max in nvidia-smi. I'm using --preset presets/ljspeech_mixture.json, which should have a sample rate of 22050.

I'm running inside a Docker container (built off the PyTorch image), in case that could be an issue.

@imdatceleste

Hmm, we are using Python 3.5 and no Docker images. The problem is that .clone() actually copies data, so you might be hitting a limit. I never had the problem, but you are right, you might be running OOM in CUDA because the data is cloned too many times...
Try reducing the sample rate to 16000 or even 12000 Hz, pre-process again, and try with that.

Sorry that I can't help more...
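
For reference, one way to follow that suggestion is to write a copy of the preset with a lower sample rate before re-running preprocessing. This is only a minimal sketch; it assumes the preset JSON exposes a "sample_rate" field, so check the actual preset file for the exact key name:

import json

# Load the existing preset, lower the (assumed) "sample_rate" field, and
# save it under a new name so the original preset stays untouched.
with open("presets/ljspeech_mixture.json") as f:
    preset = json.load(f)

preset["sample_rate"] = 16000  # or 12000, as suggested above

with open("presets/ljspeech_mixture_16k.json", "w") as f:
    json.dump(preset, f, indent=2)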

@npuichigo

@dyelax What about synthesizing on the CPU? I think it may run faster.

@neverjoe

neverjoe commented Mar 15, 2018

I ran into this problem a few days ago and fixed it; I will send a PR later.
@dyelax @r9y9 @imdatsolak

@r9y9
Owner

r9y9 commented Mar 17, 2018

I'm not sure I could write PyTorch code that triggers an OOM without accessing low-level CUDA APIs. Isn't it a GPU driver bug or a PyTorch bug?

@r9y9
Owner

r9y9 commented Mar 17, 2018

I'm curious to see a fix by @neverjoe.

@aleksas
Contributor

aleksas commented Apr 16, 2018

@dyelax Also try restarting the server/computer, running the same command again, and seeing if the problem persists. Sometimes I run into similar problems after killing a training process, which seems to cause memory allocation issues later.

@azraelkuan
Contributor

azraelkuan commented Apr 18, 2018

I have checked the synthesis process in wavenet.py.

The x saved in the outputs list is a Variable, which causes memory to grow during the sampling process.
So we should change this to
outputs += [x.cpu().data.numpy()]
and change
current_input = outputs[-1]

to
current_input = Variable(torch.from_numpy(current_input))
if next(self.parameters()).is_cuda:
    current_input = current_input.cuda()
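
Put together, the suggested pattern looks roughly like the sketch below. It is not the exact diff against wavenet.py: the loop, the model call, and names such as incremental_sample and num_steps are simplified stand-ins, and it uses the Variable API of that PyTorch era. The point is that each step's output is detached and moved to the CPU before being stored, so the growing outputs list does not hold GPU memory:

import numpy as np
import torch
from torch.autograd import Variable

def incremental_sample(model, initial_input, num_steps):
    # Autoregressive loop that stores per-step outputs as CPU numpy arrays,
    # so the growing list does not keep GPU Variables (and their autograd
    # history) alive for every generated sample.
    outputs = []
    current_input = initial_input
    for _ in range(num_steps):
        x = model(current_input)           # one generation step (stand-in)
        outputs += [x.cpu().data.numpy()]  # detach + move off the GPU
        # Rebuild the next input from the stored array and move it back to
        # the GPU only for the single step that needs it.
        current_input = Variable(torch.from_numpy(outputs[-1]))
        if next(model.parameters()).is_cuda:
            current_input = current_input.cuda()
    return np.concatenate(outputs, axis=0)

# Toy usage: a linear layer stands in for one WaveNet incremental step.
model = torch.nn.Linear(4, 4)
waveform = incremental_sample(model, Variable(torch.zeros(1, 4)), num_steps=10)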

@butterl

butterl commented Apr 26, 2018

@azraelkuan did you hit this error when you changed to
current_input = Variable(torch.from_numpy(current_input))

Traceback (most recent call last):
  File "synthesis.py", line 187, in <module>
    waveform = wavegen(model, length, c=c, g=speaker_id, initial_value=initial_value, fast=True)
  File "synthesis.py", line 125, in wavegen
    log_scale_min=hparams.log_scale_min)
  File "D:\code\wavenet_vocoder-master\wavenet_vocoder\wavenet.py", line 326, in incremental_forward
    current_input = Variable(torch.from_numpy(current_input))
TypeError: expected np.ndarray (got Tensor)

@azraelkuan
Contributor

@butterl I guess you forgot to convert the tensor to numpy: https://github.com/azraelkuan/wavenet_vocoder/blob/828da55c4e5dd29f05413b4ec7b9afa04bfe39a3/wavenet_vocoder/wavenet.py#L359
You can compare your incremental_forward code with mine.
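
For anyone hitting the same TypeError: torch.from_numpy() only accepts numpy arrays, so it fails when current_input is still a Tensor (for example, the very first input before any output has been stored). A minimal, hypothetical helper that handles both cases might look like this (as_model_input is not a function from the repo):

import numpy as np
import torch
from torch.autograd import Variable

def as_model_input(x, model):
    # Accept either a numpy array (a stored CPU output) or a Tensor/Variable
    # (e.g. the initial input) and return a Variable on the model's device.
    # Passing a Tensor straight to torch.from_numpy() raises the
    # "expected np.ndarray (got Tensor)" error shown above.
    if isinstance(x, np.ndarray):
        x = torch.from_numpy(x)
    elif isinstance(x, Variable):
        x = x.data  # drop autograd history, keep the raw tensor
    x = Variable(x)
    return x.cuda() if next(model.parameters()).is_cuda else x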

@butterl

butterl commented Apr 27, 2018

@azraelkuan Thanks Kuan, I checked your repo; after merging all the related code, it's OK now 😄

But on my server, evaluation after the patch (50+ it/s) is much faster than before the patch (10-13 it/s). Any pointers on which modification causes the speed-up?

@r9y9
Owner

r9y9 commented Apr 27, 2018

@azraelkuan

The x saved in the outputs list is a Variable, which causes memory to grow during the sampling process.
So we should change this to
outputs += [x.cpu().data.numpy()]
and change

Sorry for chiming in late. I understand the memory usage increases in the sampling process, but I don't think it triggers an OOM unless you are trying to synthesize very long audio. I'm wondering whether the per-sample CPU<->GPU data transfer is inefficient, though I don't care about the speed so much since it's already super slow.
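
If that per-sample CPU<->GPU transfer turns out to matter, one possible alternative (a sketch only, not necessarily what #55 does) is to keep the detached outputs on the GPU and copy them to the CPU once at the end; storing x.data instead of x already avoids holding the autograd history that makes memory grow:

import torch
from torch.autograd import Variable

def incremental_sample_gpu(model, initial_input, num_steps):
    # Store only detached tensors (x.data) so no autograd graph accumulates,
    # keep them on the GPU, and do a single device-to-host copy at the end.
    outputs = []
    current_input = initial_input
    for _ in range(num_steps):
        x = model(current_input)
        outputs.append(x.data)            # detached, stays on the GPU
        current_input = Variable(x.data)  # feed back without graph history
    return torch.cat(outputs, dim=0).cpu().numpy()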

@r9y9
Owner

r9y9 commented Apr 27, 2018

This should be fixed by #55. Feel free to reopen if the issue persists.

@r9y9 r9y9 closed this as completed Apr 27, 2018