Tensorflow 1.3 with Python 3.6.2 under Windows 10 64 Bit OS has issue when run tensorflow/tensorflow/examples/image_retraining/label_image.py #12736

Closed
strategist922 opened this Issue Sep 1, 2017 · 14 comments

strategist922 commented Sep 1, 2017

Please go to Stack Overflow for help and support:

https://stackoverflow.com/questions/tagged/tensorflow

If you open a GitHub issue, here is our policy:

  1. It must be a bug or a feature request.
  2. The form below must be filled out.
  3. It shouldn't be a TensorBoard issue. Those go here.

Here's why we have that policy: TensorFlow developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 X64 Enterprise Edition
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.3
  • Python version: Anaconda 4.4.0 Python 3.6.2
  • Bazel version (if compiling from source): no
  • CUDA/cuDNN version: No
  • GPU model and memory: No
  • Exact command to reproduce:
    (tensorflow13) C:\Users\James\Tensorflow\model-retrain\tensorflow-for-poets-2\scripts>python .\label_image.py --image c:\Users\James\Tensorflow\sample_img\Panda001.jpg --graph c:\Users\James\Tensorflow\model-retrain\tensorflow-for-poets-2\scripts\retrained_graph.pb --labels C:\Users\James\Tensorflow\model-retrain\tensorflow-for-poets-2\scripts\retrained_labels.txt

You can collect some of this information using our environment capture script:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

You can obtain the TensorFlow version with

python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the problem

Describe the problem clearly here. Be sure to convey here why it's a bug in TensorFlow or a feature request.

Source code / logs

Error Log:
2017-09-01 09:27:46.902115: I C:\tf_jenkins\home\workspace\nightly-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Traceback (most recent call last):
  File ".\label_image.py", line 120, in <module>
    input_operation = graph.get_operation_by_name(input_name);
  File "C:\Users\James\AppData\Local\conda\conda\envs\tensorflow13\lib\site-packages\tensorflow\python\framework\ops.py", line 3225, in get_operation_by_name
    return self.as_graph_element(name, allow_tensor=False, allow_operation=True)
  File "C:\Users\James\AppData\Local\conda\conda\envs\tensorflow13\lib\site-packages\tensorflow\python\framework\ops.py", line 3097, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "C:\Users\James\AppData\Local\conda\conda\envs\tensorflow13\lib\site-packages\tensorflow\python\framework\ops.py", line 3157, in _as_graph_element_locked
    "graph." % repr(name))
KeyError: "The name 'import/input' refers to an Operation not in the graph."

astewartau commented Sep 1, 2017

I am having this issue as well. I am using WinPython rather than Anaconda. My TensorFlow version is also 1.3, with Python 3.6.2.

UPDATE: My implementation is now working after changing lines 78-79 in label_image.py from:

input_layer = "input"
output_layer = "InceptionV3/Predictions/Reshape_1"

to:

input_layer = "Mul"
output_layer = "final_result"

I am not sure why they were set to those other values to begin with - as far as I can tell, they are not valid operations.
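
For readers hitting the same KeyError: it is raised when the requested name is not among the operations of the imported graph. Below is a minimal sketch of checking a layer name up front and reporting near matches. The helper `resolve_operation_name` and the sample op names are hypothetical and not part of label_image.py; with a real graph, the list of names would come from `[op.name for op in graph.get_operations()]`.

```python
import difflib


def resolve_operation_name(available_names, layer, prefix="import"):
    """Return the full op name for `layer`, or raise with suggestions.

    Hypothetical helper, not part of label_image.py. `available_names` is
    expected to look like [op.name for op in graph.get_operations()].
    """
    full_name = prefix + "/" + layer
    names = list(available_names)
    if full_name in names:
        return full_name
    # Offer the closest existing names instead of a bare KeyError.
    suggestions = difflib.get_close_matches(full_name, names, n=3, cutoff=0.4)
    raise ValueError(
        "%r is not in the graph; closest names: %s" % (full_name, suggestions)
    )


# Made-up op names standing in for a retrained graph.
ops = ["import/DecodeJpeg", "import/Mul", "import/final_result"]
print(resolve_operation_name(ops, "Mul"))
```

Calling the helper with a name that is absent, such as "input" here, raises a ValueError that lists the nearest existing names.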

Member

drpngx commented Sep 7, 2017

Also see resolution in #12815 (comment).

aureosun commented Sep 19, 2017

Hello everyone,
Has anybody run label_image.py from tensorflow/tensorflow/examples/label_image/label_image.py?
I have modified it to run on a dataset, reading and classifying images one by one. As the number of processed images grows, it gets slower and slower: at first it handles about ten images per second, but by the time the count reaches 1000, the time is about 7 s. Incredible! I guess the cause is a memory leak?
I found that the problem is in the function read_tensor_from_image_file in label_image.py, which is the part that reads and preprocesses the images. What is going wrong, and how can I speed it up? Also, how can I modify the code so that it runs on batches? @drpngx @strategist922 @astewartau
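
One likely cause, assuming the function matches the stock script: read_tensor_from_image_file adds new decode and resize ops to the default graph and opens a new tf.Session on every call, so the graph keeps growing when it is called in a loop. The sketch below builds those ops once, feeds the file name through a placeholder, and reuses a single session. The names `chunked` and `build_jpeg_preprocessor` are hypothetical, the code assumes the TensorFlow 1.x API and JPEG input only, and it has not been run against the poster's setup.

```python
def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_jpeg_preprocessor(input_height=299, input_width=299,
                            input_mean=0, input_std=255):
    """Build the decode/resize/normalize ops once and return a callable.

    Hypothetical replacement for calling read_tensor_from_image_file in a
    loop. Assumes the TensorFlow 1.x API used elsewhere in this thread.
    """
    import tensorflow as tf  # imported here so chunked() works without TF

    graph = tf.Graph()
    with graph.as_default():
        # The file name is fed at run time, so no new ops are added per image.
        file_name = tf.placeholder(tf.string, shape=[], name="file_name")
        raw = tf.read_file(file_name)
        image = tf.image.decode_jpeg(raw, channels=3)
        image = tf.expand_dims(tf.cast(image, tf.float32), 0)
        image = tf.image.resize_bilinear(image, [input_height, input_width])
        normalized = tf.divide(tf.subtract(image, [input_mean]), [input_std])

    sess = tf.Session(graph=graph)  # one session, reused for every image

    def preprocess(path):
        return sess.run(normalized, feed_dict={file_name: path})

    return preprocess, sess


# Usage sketch (needs TensorFlow 1.x and real JPEG files):
#   preprocess, sess = build_jpeg_preprocessor()
#   for group in chunked(image_paths, 32):
#       tensors = [preprocess(p) for p in group]
#   sess.close()
print(list(chunked(["a.jpg", "b.jpg", "c.jpg"], 2)))
```

Whether the classification graph itself accepts more than one image per run depends on the shape of its input tensor, so the grouping above only organizes the work; it does not by itself guarantee batched inference.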

tohnperfect commented Oct 30, 2017

I have the same issue as @aureosun.
Any suggestions would be appreciated.

Thanks

Adriabs commented Dec 4, 2017

@astewartau You are my hero! I've been trying to crack this mystery for way too long now and this fixed all my sorrows!

Contributor

MarkDaoust commented Dec 6, 2017

Hi,

I've synced the examples/label_image script into the tensorflow-for-poets-2 version and deleted the duplicate that was in examples/image_retraining.

Currently the only difference is the default input_layer and output_layer names. The examples version is set for the Inception v3 checkpoint and the codelab version is set for the retrained MobileNet.

Some of the confusion here is probably caused by slippage between the main version of the tutorial (1.4) and people using the master branch of the git clone. This is fixed in the master version of the tutorial. A fix is in flight to add a versioned link to the tutorial to help people use the matching version.

gunan closed this in 7ac7aa8 Dec 7, 2017

jamesdeep commented Dec 23, 2017

@astewartau Thank you very much for helping me solve the problem! But I am still confused about the reason behind it.

ribonucleic commented Jan 31, 2018

Thanks for the input_layer="Mul" hint, it saved me.

I tried to display the node names to find precisely that. I only got some nodes for the picture feeding and similar, nothing for the Inception model. Does anyone know how to find them, so I am not reliant on random web searches?
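
One way to look for candidate names without TensorBoard is to read the frozen GraphDef directly and list its nodes. The sketch below is a heuristic under stated assumptions: `nodes_from_pb` and `find_endpoints` are hypothetical helpers, the TensorFlow 1.x API (`tf.GraphDef`) is assumed, and the sample nodes at the bottom are made up. It reports placeholder nodes (possible inputs) and nodes whose output no other node consumes (possible outputs). It will not necessarily single out a layer such as "Mul", so printing every name and op type is still worthwhile.

```python
def strip_input_ref(ref):
    """Map '^node' (control dependency) or 'node:1' (output slot) to 'node'."""
    return ref.lstrip("^").split(":")[0]


def find_endpoints(nodes):
    """Return (placeholders, unconsumed) for (name, op, inputs) tuples.

    Hypothetical heuristic: placeholders are possible inputs, and nodes
    that no other node reads from are possible outputs.
    """
    nodes = list(nodes)
    consumed = {strip_input_ref(r) for _, _, inputs in nodes for r in inputs}
    placeholders = [name for name, op, _ in nodes if op.startswith("Placeholder")]
    unconsumed = [name for name, _, _ in nodes if name not in consumed]
    return placeholders, unconsumed


def nodes_from_pb(pb_path):
    """Read a frozen graph file into (name, op, inputs) tuples (TF 1.x assumed)."""
    import tensorflow as tf  # imported here so find_endpoints() works without TF

    graph_def = tf.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    # Names here carry no "import/" prefix; label_image.py adds that prefix
    # because tf.import_graph_def uses name="import" by default.
    return [(n.name, n.op, list(n.input)) for n in graph_def.node]


# Made-up nodes standing in for nodes_from_pb("retrained_graph.pb").
sample = [
    ("input", "Placeholder", []),
    ("w", "Const", []),
    ("conv", "Conv2D", ["input", "w"]),
    ("final_result", "Softmax", ["conv:0"]),
]
print(find_endpoints(sample))
```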

Contributor

MarkDaoust commented Jan 31, 2018

In TensorBoard? Double-click on the boxes to see inside.

EE-shawn commented Feb 8, 2018

Hey guys, I retrained Inception-V3 using my own data. When I used label_image.py to test it, my input layer and output layer turned out not to be right, so I changed them.
But then another problem came up. Does anybody know what this is?

InvalidArgumentError:
NodeDef mentions attr 'dilations' not in Op<name=Conv2D; signature=input:T, filter:T -> output:T;
attr=T:type,allowed=[DT_HALF, DT_FLOAT]; attr=strides:list(int); attr=use_cudnn_on_gpu:bool,default=true;
attr=padding:string,allowed=["SAME", "VALID"];
attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]>;

NodeDef: import/conv/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_import/Mul_0_0/_1, import/conv/conv2d_params).
(Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).

[[Node: import/conv/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_import/Mul_0_0/_1, import/conv/conv2d_params)]]

Thanks for helping!!!

Member

gunan commented Feb 8, 2018

Please try using the code from the branch your TF is built from.
For example, if you are using 1.5, use the example code from the r1.5 branch.
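
As an illustration of that mapping, here is a small sketch that turns a version string into the matching release branch name. The helper `release_branch` is hypothetical; the version string itself would come from `tf.VERSION`, as in the command shown in the issue template above.

```python
def release_branch(version):
    """Map a TensorFlow version string such as '1.5.0' to a branch name 'r1.5'."""
    major, minor = version.split(".")[:2]
    return "r%s.%s" % (major, minor)


print(release_branch("1.5.0"))  # r1.5
```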

walton-wang929 commented Feb 8, 2018

@gunan Thank you bro, you were right. My model was trained on another server machine running TensorFlow 1.5, but I tested it on my own computer with an older TensorFlow, so it went wrong. After upgrading TensorFlow to 1.5, the problem was solved.

sheerun commented Jun 5, 2018

For me, input layer "Placeholder" works as well. It would be good to know what the officially recommended layer is, though. @drpngx @gunan?

SriSk87 commented Jul 26, 2018

This did not work for me. Is there any other solution for this?
I changed my code as follows:

  input_height = 299
  input_width = 299
  input_mean = 0
  input_std = 299
  input_layer = "Mul"
  output_layer = "final_result"

  input_name = "import/" + input_layer
  output_name = "import/" + output_layer
  input_operation = graph.get_operation_by_name(input_name)
  output_operation = graph.get_operation_by_name(output_name)

Please suggest a solution.
Thank you