Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op #1

Closed
zwhinmedia opened this issue Dec 6, 2017 · 17 comments

@zwhinmedia

My system environment is:
Python: Anaconda3
Tensorflow-gpu : 1.4.0
GPU: nvidia gtx 1070

When I run the project, I get this error:

====== loading HAND frozen graph into memory
2017-12-06 11:59:38.773044: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feat
ure_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2017-12-06 11:59:39.066517: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gp
u\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7715
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.64GiB
2017-12-06 11:59:39.066767: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gp
u\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000
:01:00.0, compute capability: 6.1)
====== Hand Inference graph loaded.
2017-12-06 11:59:41.530536: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\ex
ecutor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op<name=Where; signa
ture=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppr
ession/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/
BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether you
r GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/
Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSupp
ression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]
Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
return fn(*args)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1302, in _run_fn
status, run_metadata)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=
input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppressio
n/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/Batch
MultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your Gra
phDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/
Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSupp
ression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "detect_single_threaded.py", line 52, in <module>
image_np, detection_graph, sess)
File "D:\PythonProjects\handtracking\utils\detector_utils.py", line 90, in detect_objects
feed_dict={image_tensor: image_np_expanded})
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
run_metadata_ptr)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
options, run_metadata)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=
input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppressio
n/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/Batch
MultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your Gra
phDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/
Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSupp
ression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

Caused by op 'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Whe
re', defined at:
File "detect_single_threaded.py", line 8, in <module>
detection_graph, sess = detector_utils.load_inference_graph()
File "D:\PythonProjects\handtracking\utils\detector_utils.py", line 45, in load_inference_graph
tf.import_graph_def(od_graph_def, name='')
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\importer.py", line 313, in import_graph_def
op_def=op_def)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2956, in create_op
op_def=op_def)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool ->
index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreate
rThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonM
axSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpre
ting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/
Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSupp
ression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

Could anyone suggest a solution? Thanks!

@victordibia
Owner

victordibia commented Dec 6, 2017

Hi @zwhinmedia

The code in this repo was written using TensorFlow version 1.4.0-rc0.
The error you have looks like the version mismatch error described here.

Can you confirm your exact TensorFlow version?

python -c 'import tensorflow as tf; print(tf.__version__)'  # for Python 2
python3 -c 'import tensorflow as tf; print(tf.__version__)'  # for Python 3

Better still, are you able to run the object detection demo successfully on your machine?

-V.
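
For comparing the reported version against the one the repo was built with, a minimal standard-library sketch follows. The helper names `parse_version` and `same_release` are hypothetical and not part of TensorFlow; the point is only that a release-candidate suffix makes two otherwise identical version strings differ.

```python
import re


def parse_version(version):
    """Split a string such as '1.4.0-rc0' into (major, minor, patch, pre_release)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-?(\w+))?", version.strip())
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor, patch, pre_release = match.groups()
    # A final release has no suffix; represent that as an empty string.
    return int(major), int(minor), int(patch), pre_release or ""


def same_release(installed, expected):
    """True only if both strings name exactly the same release, suffix included."""
    return parse_version(installed) == parse_version(expected)


# Values taken from this thread: the repo targets 1.4.0-rc0.
print(parse_version("1.4.0-rc0"))
print(same_release("1.4.0", "1.4.0-rc0"))
```

In practice you would pass `tf.__version__` as the `installed` argument.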

@zwhinmedia
Author

@victordibia

$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
1.4.0

I am able to run the object detection demo successfully on my machine, and I have used the object detection API with transfer learning to create my own demo.

@victordibia
Owner

Please see the discussion here: tensorflow/tensorflow#1528.
It appears TensorFlow is really sensitive to versioning, i.e. 1.4.0 is not the same as 1.4.0-rc0, so a frozen graph exported with one version may fail to run with the other.
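
The error message says the `Where` nodes in the frozen graph carry an attribute `T` that the installed op definition does not declare. One possible workaround, which nobody in this thread has confirmed, is to drop that attribute from the graph before importing it. The sketch below assumes only that the object has a `node` list whose items expose `op`, `name` and a dict-like `attr`, which is the shape of a `GraphDef`; the function name `strip_unknown_attr` is hypothetical, and stand-in objects are used so the example runs without TensorFlow.

```python
from types import SimpleNamespace


def strip_unknown_attr(graph_def, op_name="Where", attr_name="T"):
    """Delete attr_name from every node whose op is op_name.

    Returns the names of the nodes that were edited.
    """
    edited = []
    for node in graph_def.node:
        if node.op == op_name and attr_name in node.attr:
            del node.attr[attr_name]
            edited.append(node.name)
    return edited


# Stand-in for a parsed graph (assumption: a real GraphDef behaves the same way here).
fake_graph = SimpleNamespace(node=[
    SimpleNamespace(name="FilterGreaterThan/Where", op="Where", attr={"T": "DT_BOOL"}),
    SimpleNamespace(name="FilterGreaterThan/Where/Cast", op="Cast", attr={"SrcT": "DT_FLOAT"}),
])

edited_nodes = strip_unknown_attr(fake_graph)
print(edited_nodes)
```

With a real graph, the call would go between parsing `od_graph_def` and the `tf.import_graph_def(od_graph_def, name='')` call shown in the traceback. Re-exporting the frozen graph with your own TensorFlow version is the safer route if this does not help.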

@zhao-haha

zhao-haha commented Dec 11, 2017

@zwhinmedia Have you solved the problem? I get the same error when trying to run this demo.

@zhao-haha

zhao-haha commented Dec 11, 2017

Here is the related discussion in the TensorFlow repo:

Object detection works on Linux but not Mac #14884
tensorflow/tensorflow#14884

@zwhinmedia
Author

@ZhaoWangFu
I used the EgoHands dataset and the TensorFlow Object Detection API to retrain my own model.
After training, I am able to detect 'hand' with my own model.

@zhao-haha

@zwhinmedia, that's really nice. How long did it take to retrain the model on your PC? I am still downloading the EgoHands dataset because it's a really big file. Could you please share your model with me? I want to check the performance of this method as quickly as possible, and then I plan to retrain the model with YOLOv2.

@drewgillson

I also had this "InvalidArgumentError: NodeDef mentions attr 'T'" issue and for the life of me haven't been able to solve it, using TF 1.4.0-rc0. I've started to train my own model but after 60,000 steps and nearly 24 hours my loss is still ~5, which doesn't seem right. Victor, is there a chance you can re-export your frozen inference graph using the TF 1.4.1 release that came out on Friday? I'm sure there are others who would appreciate that too. Thanks!

@victordibia
Owner

@drewgillson That makes sense.
I'll work on that later today and post an update within the next 24hrs.
-V.

@drewgillson

@victordibia Thanks, that would be wonderful. I'll let my model continue to train for now, but didn't you say you achieved a loss <2 after only 5 hours of training on a GPU? I'm using a p2.xlarge instance on AWS. Did you make any particular changes to the SSD config file?

@victordibia
Owner

victordibia commented Dec 11, 2017

Loss of ~2.5 after 5hrs.

I did not change much in the SSD config file, just the usual.

num_classes: 1
batch_size: 6

I had some memory issues with larger batch sizes.

eval_config: {
  num_examples: 960
  # Note: The below line limits the evaluation process to 960 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 960
}

Also changed eval config to fit my train/test split.

@victordibia
Owner

victordibia commented Dec 11, 2017

I just added a version generated using TensorFlow 1.4.1.

I tested it and found some differences in the size of the bounding boxes I get, and in the confidence levels too. (This may be due to changes in the TF code over the last few weeks.) I'm not sure at the moment.

Below is an image that shows results of the different models on same camera feed.

-V.

@EvanMu96

@zwhinmedia Could you please share a copy of your retrained model with me? I would really appreciate it.

@panyan928

Hi everyone. I have the same problem: "Invalid argument: NodeDef mentions attr 'T' not in Op".
The error is as follows:
====== loading HAND frozen graph into memory
2017-12-30 11:03:07.075731: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

====== Hand Inference graph loaded.
example.mkv
opended
2017-12-30 11:03:08.269669: E C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\common_runtime\executor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]
Traceback (most recent call last):
File "C:\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
return fn(*args)
File "C:\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1302, in _run_fn
status, run_metadata)
File "C:\Python36\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "detect_single_threaded.py", line 56, in <module>
    image_np, detection_graph, sess)
  File "E:\HandDetection\handtracking\utils\detector_utils.py", line 90, in detect_objects
    feed_dict={image_tensor: image_np_expanded})
  File "C:\Python36\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
    run_metadata_ptr)
  File "C:\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
    options, run_metadata)
  File "C:\Python36\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
         [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

Caused by op 'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where', defined at:
  File "detect_single_threaded.py", line 8, in <module>
    detection_graph, sess = detector_utils.load_inference_graph()
  File "E:\HandDetection\handtracking\utils\detector_utils.py", line 45, in load_inference_graph
    tf.import_graph_def(od_graph_def, name='')
  File "C:\Python36\lib\site-packages\tensorflow\python\framework\importer.py", line 313, in import_graph_def
    op_def=op_def)
  File "C:\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 2956, in create_op
    op_def=op_def)
  File "C:\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
         [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

I can run the object detection demo successfully, and my TensorFlow version is 1.4.0-rc0.
I just want to run the detection:
python detect_single_threaded.py --source example.mkv
Can anyone help me, please?

@victordibia
Owner

victordibia commented Dec 30, 2017

Hi,

I also exported the model checkpoint:
https://github.com/victordibia/handtracking/tree/master/model-checkpoint

You can use this to export a frozen model that matches your own TensorFlow installation. Some directions can be found here.

Please give this a try.

-V.
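
As a rough sketch of the export step, the snippet below assembles a command line for the Object Detection API's `export_inference_graph.py` script. The flag names are the ones I recall from that API's exporting instructions around this release, so please confirm them with `--help` in your checkout. All paths, the checkpoint prefix `model.ckpt`, and the helper name `build_export_command` are hypothetical placeholders, not values from this repo.

```python
def build_export_command(pipeline_config, checkpoint_prefix, output_dir,
                         script="object_detection/export_inference_graph.py"):
    """Return the argument list for re-exporting a frozen inference graph."""
    return [
        "python", script,
        "--input_type", "image_tensor",
        "--pipeline_config_path", pipeline_config,
        "--trained_checkpoint_prefix", checkpoint_prefix,
        "--output_directory", output_dir,
    ]


# Hypothetical paths: substitute the config and checkpoint files you actually have.
command = build_export_command(
    pipeline_config="model-checkpoint/pipeline.config",
    checkpoint_prefix="model-checkpoint/model.ckpt",
    output_dir="exported_hand_model",
)
print(" ".join(command))
```

The list can be handed to `subprocess.run(command, check=True)` from the directory that contains the `object_detection` package, so the export runs under the same TensorFlow installation you will use for inference.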

@panyan928

@victordibia Thank you very much! I have run it successfully.

@victordibia
Owner

Perfect!
