tftrt_example.py exception #2

Open
Airyzf opened this issue Oct 19, 2018 · 5 comments

Airyzf commented Oct 19, 2018

I got this error.

2018-10-19 17:53:13.158279: W tensorflow/core/common_runtime/bfc_allocator.cc:275] _______________________***************************************************************____________
2018-10-19 17:53:13.158377: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_ops.cc:693 : Resource exhausted: OOM when allocating tensor with shape[10000,64,24,24] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1292, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1277, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1367, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[10000,64,24,24] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node import/sequential_1/conv2d_2/convolution}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](import/sequential_1/conv2d_1/Relu, import/conv2d_2/kernel)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node import/sequential_1/dense_2/Softmax/_7}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_54_import/sequential_1/dense_2/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tftrt_example.py", line 138, in
main()
File "tftrt_example.py", line 118, in main
y_tf = tf_engine.infer(x_test)
File "tftrt_example.py", line 50, in infer
feed_dict={self.x_tensor: x})
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 887, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1110, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1286, in _do_run
run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1308, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[10000,64,24,24] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node import/sequential_1/conv2d_2/convolution}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](import/sequential_1/conv2d_1/Relu, import/conv2d_2/kernel)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node import/sequential_1/dense_2/Softmax/_7}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_54_import/sequential_1/dense_2/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'import/sequential_1/conv2d_2/convolution', defined at:
File "tftrt_example.py", line 138, in
main()
File "tftrt_example.py", line 116, in main
tf_engine = TfEngine(frozen_graph)
File "tftrt_example.py", line 38, in init
graph_def=graph.frozen, return_elements=graph.x_name + graph.y_name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3426, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3426, in
for c_op in c_api_util.new_tf_operations(self)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3285, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1748, in init
self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[10000,64,24,24] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node import/sequential_1/conv2d_2/convolution}} = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](import/sequential_1/conv2d_1/Relu, import/conv2d_2/kernel)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node import/sequential_1/dense_2/Softmax/_7}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_54_import/sequential_1/dense_2/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
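
(For context: the hint in the log refers to TensorFlow's RunOptions. A minimal, self-contained sketch of enabling it, not taken from tftrt_example.py:)

import tensorflow as tf

# Sketch only: any sess.run call can be given RunOptions with
# report_tensor_allocations_upon_oom=True so that, if an OOM occurs,
# TensorFlow lists the tensors currently allocated on the device.
x = tf.placeholder(tf.float32, shape=[None, 4], name="x")
y = tf.layers.dense(x, 2)

run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]}, options=run_options)
    print(out.shape)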

jeng1220 (Owner) commented Oct 19, 2018

You can try to reduce batch size. It is OOM (out of memory).
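
For reference, a minimal sketch of running inference in smaller chunks instead of feeding all 10000 test images in a single sess.run; the function and tensor names below are assumptions, not the exact ones used in tftrt_example.py:

import numpy as np

def infer_in_batches(sess, y_tensor, x_tensor, x, batch_size=128):
    # Run the graph on slices of the input so a single sess.run never has
    # to allocate activations (e.g. shape [10000, 64, 24, 24]) for the whole set.
    outputs = []
    for start in range(0, len(x), batch_size):
        batch = x[start:start + batch_size]
        outputs.append(sess.run(y_tensor, feed_dict={x_tensor: batch}))
    return np.concatenate(outputs, axis=0)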

Airyzf commented Oct 19, 2018

> You can try to reduce batch size. It is OOM (out of memory).

I commented out lines 112, 114, and 117 and got this error:

Traceback (most recent call last):
File "tftrt_example.py", line 138, in
main()
File "tftrt_example.py", line 123, in main
tftrt_engine = TftrtEngine(frozen_graph, batch_size, 'FP32')#,frozen_graph.x_tensor,frozen_graph.y_tensor
File "tftrt_example.py", line 63, in init
opt_graph = copy.deepcopy(graph)
File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
y = _reconstruct(x, rv, 1, memo)
File "/usr/lib/python3.5/copy.py", line 297, in _reconstruct
state = deepcopy(state, memo)
File "/usr/lib/python3.5/copy.py", line 155, in deepcopy
y = copier(x, memo)
File "/usr/lib/python3.5/copy.py", line 243, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
y = _reconstruct(x, rv, 1, memo)
File "/usr/lib/python3.5/copy.py", line 297, in _reconstruct
state = deepcopy(state, memo)
File "/usr/lib/python3.5/copy.py", line 155, in deepcopy
y = copier(x, memo)
File "/usr/lib/python3.5/copy.py", line 243, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
y = _reconstruct(x, rv, 1, memo)
File "/usr/lib/python3.5/copy.py", line 297, in _reconstruct
state = deepcopy(state, memo)
File "/usr/lib/python3.5/copy.py", line 155, in deepcopy
y = copier(x, memo)
File "/usr/lib/python3.5/copy.py", line 243, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/usr/lib/python3.5/copy.py", line 174, in deepcopy
rv = reductor(4)
TypeError: can't pickle SwigPyObject objects
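
(For context: copy.deepcopy fails here because the object being copied holds a SWIG-wrapped TensorFlow C handle, which cannot be pickled. One possible workaround, sketched below and not necessarily what the script intends, is to copy only the GraphDef protobuf, which can be deep-copied:)

import copy
import tensorflow as tf

# Sketch only: protobuf messages such as GraphDef can be copied safely,
# unlike objects that wrap SWIG/C handles (e.g. a live Session or Graph).
graph_def = tf.GraphDef()             # normally parsed from the frozen model file
graph_def_copy = tf.GraphDef()
graph_def_copy.CopyFrom(graph_def)    # protobuf deep copy

graph_def_copy2 = copy.deepcopy(graph_def)  # also works on a plain GraphDef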

jeng1220 (Owner) commented:

Cannot reproduce on my side. Does your environment meet the requirements?

Airyzf commented Oct 20, 2018

> Cannot reproduce on my side. Does your environment meet the requirements?

Python 3.5
Maybe TensorRT 3.0.4; how do I check the TensorRT version?
TensorFlow 1.10.0
Keras 2.2.4
PyCUDA 2018.1.1

Ekta246 commented Apr 16, 2020

> Cannot reproduce on my side. Does your environment meet the requirements?
>
> Python 3.5
> Maybe TensorRT 3.0.4; how do I check the TensorRT version?
> TensorFlow 1.10.0
> Keras 2.2.4
> PyCUDA 2018.1.1

Checking the TensorRT version:

import tensorrt as trt
print(trt.__version__)
