I get OOM errors when an input image is larger than about 1200 pixels per side (the exact threshold varies by image for some reason). Can you help me understand why the model fails on these inputs? Is it caused by the shape of the model or by something else, and is there a way to configure around it?
Thanks!
This is the error:
2019-12-03 10:39:21.705187: W tensorflow/core/common_runtime/bfc_allocator.cc:424] *_________________****************************__________***************************_________________
2019-12-03 10:39:21.705596: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at transpose_op.cc:198 : Resource exhausted: OOM when allocating tensor with shape[1,4800,4800,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "infer.py", line 50, in <module>
main()
File "infer.py", line 37, in main
sr = model.predict(np.expand_dims(low_res, axis=0))[0]
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py", line 908, in predict
use_multiprocessing=use_multiprocessing)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 723, in predict
callbacks=callbacks)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 394, in model_iteration
batch_outs = f(ins_batch)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/backend.py", line 3476, in __call__
run_metadata=self.run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
run_metadata_ptr)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[1,4800,4800,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model_2/p_re_lu_2/Relu_1-0-0-TransposeNCHWToNHWC-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[model_2/conv2d_13/Tanh/_743]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[1,4800,4800,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model_2/p_re_lu_2/Relu_1-0-0-TransposeNCHWToNHWC-LayoutOptimizer}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
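Your GPU does not have enough memory to hold the model, the feature maps, and the output. The log shows the OOM happening near the end of the network, when a feature map of shape 1x4800x4800x32 is allocated. As float32, that single tensor needs about 2.95 GB (1 x 4800 x 4800 x 32 x 4 bytes), on top of everything already allocated. You can try running inference on smaller images, or try splitting the model over multiple GPUs.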
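One way to "do inference on smaller images" without losing the full picture is to upscale the image tile by tile and stitch the results together. The following is only a minimal sketch, not something from this repository: `predict_tiled`, the tile size, and the stand-in `nearest_neighbour_4x` function are hypothetical, and the scale factor of 4 is an assumption inferred from the 1200 to 4800 sizes in the log. Tiles are not overlapped here, so a real convolutional model may show visible seams at tile borders.

```python
import numpy as np


def predict_tiled(upscale, image, tile=256, scale=4):
    """Upscale `image` (H, W, C) tile by tile and stitch the results.

    `upscale` is any callable that maps an (h, w, C) array to an
    (h * scale, w * scale, C) array. Only one tile's activations
    need to fit in memory at a time.
    """
    height, width = image.shape[:2]
    output = None
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles are smaller when the size is not a multiple of `tile`.
            patch = image[top:top + tile, left:left + tile]
            result = upscale(patch)
            if output is None:
                # Allocate the full output once channel count and dtype are known.
                output = np.zeros(
                    (height * scale, width * scale, result.shape[2]),
                    dtype=result.dtype,
                )
            output[
                top * scale:top * scale + result.shape[0],
                left * scale:left * scale + result.shape[1],
            ] = result
    return output


def nearest_neighbour_4x(patch):
    # Stand-in for the real model, used only to make the sketch runnable.
    return patch.repeat(4, axis=0).repeat(4, axis=1)


if __name__ == "__main__":
    low_res = np.random.rand(300, 500, 3).astype(np.float32)
    sr = predict_tiled(nearest_neighbour_4x, low_res, tile=128, scale=4)
    print(sr.shape)
```

With the `infer.py` from the traceback, the callable could be something like `lambda t: model.predict(np.expand_dims(t, axis=0))[0]`, assuming the model accepts variable input sizes.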
If you have more system RAM than GPU memory, you can try running inference on the CPU instead, so that tensors are allocated in RAM. It will be slow, but it should at least work. You can see here how to do this: link
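A common way to force CPU execution (not necessarily the one the link above describes) is to hide the CUDA devices from the process before TensorFlow is imported. This is a minimal sketch; the `disable_gpu` helper is a hypothetical name, and the commented lines only echo the call seen in the traceback.

```python
import os


def disable_gpu():
    """Hide all CUDA devices from this process so TensorFlow falls back to CPU."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


disable_gpu()

# TensorFlow must be imported only after the variable is set, for example:
# import tensorflow as tf
# ... load the model as infer.py already does ...
# sr = model.predict(np.expand_dims(low_res, axis=0))[0]
```

The same effect can be had without touching the script by setting the variable in the shell, for example `CUDA_VISIBLE_DEVICES=-1 python infer.py`.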