
OOM on GTX 1080 #10

Closed

igorbarinov opened this issue Sep 14, 2016 · 1 comment

igorbarinov commented Sep 14, 2016

Hi Igor, I'm getting OOM on a GTX 1080. I reduced the sample set to a single directory (175 files) and am still getting this error. Do you have any ideas on how to fit the tensors in memory on an 8 GB card?

I tensorflow/core/common_runtime/bfc_allocator.cc:698] Stats: 
Limit:                  7690878976
InUse:                  7325787648
MaxInUse:               7465885184
NumAllocs:                    9793
MaxAllocSize:           2779725056

W tensorflow/core/common_runtime/bfc_allocator.cc:270] ****_********************_*****************************************************************xxxxxxxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 928.75MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:940] Resource exhausted: OOM when allocating tensor with shape[256,256,1,3715]
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.33GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
Traceback (most recent call last):
  File "train.py", line 151, in <module>
    main()
  File "train.py", line 136, in main
    summary, loss_value, _ = sess.run([summaries, loss, optim])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 710, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 908, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 958, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 978, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shape[256,256,1,3715]
     [[Node: gradients/dilated_stack/layer4/conv_f_grad/Conv2DBackpropInput = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](gradients/dilated_stack/layer4/conv_f_grad/Shape, dilated_stack/layer4/Variable/read, gradients/dilated_stack/layer4/conv_f/BatchToSpace_grad/SpaceToBatch)]]
Caused by op u'gradients/dilated_stack/layer4/conv_f_grad/Conv2DBackpropInput', defined at:

Thank you in advance.
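A minimal sketch of one thing worth checking (not from the original report; it assumes the stock TensorFlow 0.x Session API that train.py uses): letting the GPU allocator grow on demand, or capping its share of the card, instead of pre-reserving nearly the full 8 GB at startup. This only changes how memory is claimed, not how much the graph actually needs, so it may not be enough to resolve the OOM on its own.

import tensorflow as tf

# Sketch: configure the session so the BFC allocator requests GPU memory on
# demand instead of grabbing (almost) all of it up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# Alternatively, cap the fraction of device memory this process may claim:
# config.gpu_options.per_process_gpu_memory_fraction = 0.9

sess = tf.Session(config=config)
sess.run(tf.initialize_all_variables())
# ... run the training loop from train.py against this session ...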

ibab (Owner) commented Sep 14, 2016

Hi, I've opened an issue on this in #4.
The problem is that the dilated convolution used by WaveNet isn't as efficiently implemented in TensorFlow as regular convolution, so it's easy to run out of memory.

Note that reducing the number of files won't help, as we only load a single file at a time by default.
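For illustration (shapes are made up, and this is not code from the repository): the node names in the traceback — SpaceToBatch, Conv2D, BatchToSpace — show how TensorFlow expressed dilated convolution at this time, e.g. via tf.nn.atrous_conv2d. Each dilated layer therefore materialises extra full-length intermediate tensors, and their gradients, on top of the regular convolution's activations, which is why the dilated stack exhausts 8 GB so quickly.

import tensorflow as tf

# Hypothetical shapes, only to show the pattern named in the OOM traceback:
# a long 1-D signal laid out as [batch, 1, time, channels], convolved with a
# width-2 filter at a large dilation rate. TensorFlow rewrites this as
# SpaceToBatch -> Conv2D -> BatchToSpace, so each dilated layer keeps extra
# full-length intermediates (and their gradients) alive during training.
audio = tf.placeholder(tf.float32, [1, 1, 100000, 256])       # [batch, 1, time, channels]
filters = tf.Variable(tf.truncated_normal([1, 2, 256, 256]))  # [height, width, in_ch, out_ch]

out = tf.nn.atrous_conv2d(audio, filters, rate=512, padding='VALID')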
