
Error loading a TensorRT optimised graph #28854

Closed
satyajithj opened this issue May 20, 2019 · 16 comments
Labels: comp:gpu:tensorrt (Issues specific to TensorRT), comp:gpu (GPU related issues), stat:awaiting response (Status - Awaiting response from author), type:support (Support issues)

@satyajithj

I was able to convert a frozen model using the TensorRT API on an Nvidia Tesla P100 on Debian 9 with the following command:

trt_graph = trt.create_inference_graph(
    input_graph_def=saved_graph,
    outputs=output_names[0:1],
    max_batch_size=1,
    max_workspace_size_bytes=5000000000,
    precision_mode='FP16',
    is_dynamic_op=True
)

I am able to load the graph on the same system. However, when I try to load the graph on my local system, which has an Nvidia GeForce GTX 1050M, I get the following error:

  File "/home/fuzzybatman/.local/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/fuzzybatman/.local/lib/python3.7/site-packages/tensorflow/python/framework/importer.py", line 426, in import_graph_def
    graph._c_graph, serialized, options)  # pylint: disable=protected-access

tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'TRTEngineOp' in binary running on ceph. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.

Is it because my GPU lacks support for TensorRT?
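
One way to check whether TRTEngineOp can be registered at all on a given install is the sketch below; it assumes a TF 1.x build and leans on the internal op_def_registry module, so treat it as a diagnostic rather than a supported API.

import tensorflow as tf
from tensorflow.python.framework import op_def_registry  # internal TF 1.x module

try:
    # Importing the contrib module is what lazily registers TRTEngineOp, and the
    # import only succeeds if this TensorFlow build can load the TensorRT libraries.
    import tensorflow.contrib.tensorrt  # registers TF-TRT ops as a side effect
except ImportError as err:
    print('TF-TRT is not usable with this TensorFlow install:', err)

print('TRTEngineOp registered:',
      'TRTEngineOp' in op_def_registry.get_registered_ops())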

@achandraa achandraa self-assigned this May 21, 2019
@achandraa

Just to verify, did you get a chance to have a look at #22360? Which TensorFlow version are you using?

@achandraa achandraa added the stat:awaiting response Status - Awaiting response from author label May 21, 2019
@satyajithj
Author

Thank you for the response. I am using TF r1.13.
I am loading the graph in Python, and I did try adding import tensorflow.contrib.tensorrt.

On a side note, I posted this question on the Nvidia DevTalk forum and they answered:

A generated TensorRT PLAN is valid for a specific GPU — more precisely, a specific CUDA Compute Capability. For example, if you generate a PLAN for an NVIDIA P4 (compute capability 6.1) you can’t use that PLAN on an NVIDIA Tesla V100 (compute capability 7.0).

This is quite confusing because there are articles online on optimising on one GPU and running on another.

@satyajithj
Author

Just noticed that there is a TF version mismatch between the one on my system (1.13) and the one on the GCP VM (1.12). Does this affect the result?

@satyajithj
Author

Tried again with a new model. Same error.

@achandraa

achandraa commented May 21, 2019

Which CUDA/cuDNN versions are you using?

@satyajithj
Author

On my local system it is CUDA 10.1 and cuDNN 7.4.2
On the VM it is CUDA 10.0 and cuDNN 7.4.1

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label May 22, 2019
@achandraa

Please help us with some more info: are you getting this error with the TensorFlow installed on your GCP VM or on your local system? Which operating system are you using, and did you install TensorFlow from source or from a binary? If you are unclear about the template, you can refer to this link. Also, kindly verify that you have followed the instructions from the TensorFlow website based on the information provided in the template. Thanks!

@achandraa achandraa added comp:gpu GPU related issues type:support Support issues stat:awaiting response Status - Awaiting response from author labels May 23, 2019
@satyajithj
Author

satyajithj commented May 23, 2019

Hi. I run the create_inference_graph method on the VM:

  • Debian 9
  • CUDA 10.0
  • cuDNN 7.4.1
  • TF 1.13
  • Nvidia Tesla P100 [16GB] (compute capability 6.0)

I try loading the graph for inference on the VM and it works fine.

I try loading the graph on my local system

  • Fedora 30
  • CUDA 10.1
  • cuDNN 7.4.2
  • TF 1.13
  • Nvidia GeForce GTX1050M [4GB] (compute capability 6.1)

and get the error

tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'TRTEngineOp' in binary running on ceph. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.

I did not build TF from source. I installed it using pip3 in the terminal.

pip3 install tensorflow-gpu --user

According to a moderator on the Nvidia DevTalk forum:

A generated TensorRT PLAN is valid for a specific GPU — more precisely, a specific CUDA Compute Capability. For example, if you generate a PLAN for an NVIDIA P4 (compute capability 6.1) you can’t use that PLAN on an NVIDIA Tesla V100 (compute capability 7.0).
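
Since the PLAN/compute-capability point keeps coming up: the compute capability TensorFlow actually sees on each machine can be printed with the small sketch below. It uses device_lib, an internal TF 1.x helper, so this is a convenience check rather than a supported API.

from tensorflow.python.client import device_lib

# Print each visible GPU and its physical description; the description string
# includes the compute capability, e.g. "... compute capability: 6.0".
for dev in device_lib.list_local_devices():
    if dev.device_type == 'GPU':
        print(dev.name, '->', dev.physical_device_desc)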

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label May 23, 2019
@achandraa achandraa assigned ymodak and unassigned achandraa May 24, 2019
@ymodak ymodak assigned aaroey and unassigned ymodak May 24, 2019
@ymodak ymodak added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label May 24, 2019
@aaroey
Member

aaroey commented May 25, 2019

Hi @fuzzyBatman, could you try adding:

from tensorflow.contrib.tensorrt.python.ops import trt_engine_op

to your loading script to see if it works?
Thanks.

@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label May 25, 2019
@satyajithj
Author

@aaroey Same error.

Does the GPU choice not affect this?

@aaroey
Member

aaroey commented May 29, 2019

@fuzzyBatman, could you share your full script? I'll try it and let you know.

@satyajithj
Author

I have a TF frozen graph (.pb extension). I load it and run the create_inference_graph method on the GCP VM, which has an Nvidia Tesla P100 (16 GB) GPU:

import tensorflow as tf
from tensorflow.python.framework import graph_io
from tensorflow.contrib import tensorrt as trt

def get_frozen_graph(graph_file):
    """Read Frozen Graph file from disk."""

    with tf.gfile.GFile(graph_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    tf.import_graph_def(graph_def, name='')
    return graph_def

sess = tf.Session(config=tf.ConfigProto(gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.6)))
sess.run(tf.global_variables_initializer())

saved_graph = get_frozen_graph(pbfile)  # pbfile and output_names are defined elsewhere in the full script

# Comment the following when loading the TensorRT graph.
print('Creating trt inference graph')

trt_graph = trt.create_inference_graph(
    input_graph_def=saved_graph,
    outputs=output_names[0:1],
    max_batch_size=1,
    max_workspace_size_bytes=4000000000,
    precision_mode='FP16',
    minimum_segment_size=2
)

graph_io.write_graph(trt_graph, "./train_log/faster_rcnn_fpn/", "frcnn_trt.pb", as_text=False)

The above script produces the file frcnn_trt.pb. Now I use the same get_frozen_graph procedure as above, using frcnn_trt.pb, with the rest of the code commented out. This works on the same VM but fails on my local system, which has an Nvidia GeForce GTX1050M (4 GB) GPU.
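
For reference, a minimal sketch of that loading step (with the trt_engine_op import suggested earlier; the path to frcnn_trt.pb is assumed from the write_graph call above) would be:

import tensorflow as tf
# Register the TF-TRT op before importing the graph; without this (and without a
# TensorFlow build that actually ships TensorRT support) import_graph_def fails
# with the NotFoundError shown above.
from tensorflow.contrib.tensorrt.python.ops import trt_engine_op

with tf.gfile.GFile('./train_log/faster_rcnn_fpn/frcnn_trt.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

sess = tf.Session(graph=graph)
# ... run inference with sess.run(...) on tensors from the imported graph ...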

@aaroey
Member

aaroey commented Sep 17, 2019

@fuzzyBatman sorry I was not able to get to this. Thanks for the scripts; they look legit to me. By "This works on the same VM" did you mean that on your VM you can run the TRT-converted graph frcnn_trt.pb? By "but fails on my local system" did you mean it failed with the TRTEngineOp-not-found error? I can imagine it will fail with some error, because TRT engines are not portable, meaning you'd better run the converted graph on a machine that has the same type of GPU as the one on which you ran the conversion.

Also, 1.15.0rc1 is out and 1.15.0 will be out soon; you may want to try with that. Feel free to provide the pbfile and I'll try your script with it. Thanks.
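
Since the engines baked into the .pb are tied to the GPU they were built on, one option worth noting (an aside, not something confirmed to fix this particular error) is to keep is_dynamic_op=True, as in the very first conversion command in this thread, so that TensorRT engines are built at runtime on whatever GPU actually runs inference. The target machine still needs a TensorRT-enabled TensorFlow build for TRTEngineOp to resolve.

# Sketch: same conversion call as in the script above, but with is_dynamic_op=True
# so engine building is deferred to inference time on the target GPU.
trt_graph = trt.create_inference_graph(
    input_graph_def=saved_graph,
    outputs=output_names[0:1],
    max_batch_size=1,
    max_workspace_size_bytes=4000000000,
    precision_mode='FP16',
    minimum_segment_size=2,
    is_dynamic_op=True,
)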

@sanjoy sanjoy added the comp:gpu:tensorrt Issues specific to TensorRT label Dec 26, 2019
@kumariko kumariko self-assigned this Sep 1, 2021
@kumariko

kumariko commented Sep 2, 2021

@fuzzyBatman We are checking to see if you still need help on this issue, as you are using an older version of TensorFlow (1.x), which is officially considered end of life. We recommend that you upgrade to 2.6, which is the latest stable version of TF, and let us know if the issue still persists in newer versions. We will get you the right help. Thanks!
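
For anyone retrying this on TF 2.x, the TF-TRT entry point is TrtGraphConverterV2, which operates on a SavedModel rather than a frozen .pb. A rough sketch (directory names are placeholders, and a TensorRT-enabled TensorFlow build is assumed) looks like this:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert a SavedModel to a TF-TRT SavedModel with FP16 precision.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode='FP16',
    max_workspace_size_bytes=1 << 32,
)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='saved_model_dir',       # placeholder path
    conversion_params=params,
)
converter.convert()   # engines can still be built lazily at runtime on the target GPU
converter.save('trt_saved_model_dir')              # placeholder path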

@kumariko kumariko added the stat:awaiting response Status - Awaiting response from author label Sep 2, 2021
@satyajithj
Author

Hi! I stopped working on that project a year ago.
