image_retraining example does not work #17423

tushuhei · 2018-03-05T05:09:12Z

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 10.13.3
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): v1.5.0-0-g37aa430d84 1.5.0
Python version: 3.6.4
Exact command to reproduce:
python3 tensorflow/examples/image_retraining/retrain.py \
  --image_dir ~/Resources/tf-retrain-images/ \
  --learning_rate=0.0001 \
  --testing_percentage=10 \
  --validation_percentage=10 \
  --train_batch_size=32 \
  --validation_batch_size=-1 \
  --flip_left_right True \
  --random_scale=30 \
  --random_brightness=30 \
  --eval_step_interval=100 \
  --how_many_training_steps=500 \
  --architecture mobilenet_0.25_224

Describe the problem

retrain.py fails with the error below when it starts creating bottleneck files for the testing dataset after training is done. It looks like something is wrong with how bottleneck files are made for test images.
FYI, it works when I check out retrain.py at commit dce9a49.

Source code / logs

...
INFO:tensorflow:2018-03-05 13:31:17.056049: Step 90: Validation accuracy = 89.0% (N=73)
INFO:tensorflow:2018-03-05 13:31:25.350794: Step 99: Train accuracy = 96.9%
INFO:tensorflow:2018-03-05 13:31:25.350940: Step 99: Cross entropy = 0.198750
INFO:tensorflow:2018-03-05 13:31:25.398267: Step 99: Validation accuracy = 89.0% (N=73)
Model path: /tmp/imagenet/mobilenet_v1_0.25_224_frozen.pb
INFO:tensorflow:Restoring parameters from /tmp/_retrain_checkpoint
INFO:tensorflow:Creating bottleneck at /tmp/bottleneck/cat/cat.1.jpg_mobilenet_0.25_224.txt
Traceback (most recent call last):
File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1070, in _run
allow_operation=False)
File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3323, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3402, in _as_graph_element_locked
raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tensorflow/examples/image_retraining/retrain.py", line 394, in create_bottleneck_file
resized_input_tensor, bottleneck_tensor)
File "tensorflow/examples/image_retraining/retrain.py", line 326, in run_bottleneck_on_image
{image_data_tensor: image_data})
File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1073, in _run
+ e.args[0])
TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tensorflow/examples/image_retraining/retrain.py", line 1486, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "tensorflow/examples/image_retraining/retrain.py", line 1286, in main
bottleneck_tensor)
File "tensorflow/examples/image_retraining/retrain.py", line 881, in run_final_eval
bottleneck_tensor, FLAGS.architecture))
File "tensorflow/examples/image_retraining/retrain.py", line 567, in get_random_cached_bottlenecks
resized_input_tensor, bottleneck_tensor, architecture)
File "tensorflow/examples/image_retraining/retrain.py", line 442, in get_or_create_bottleneck
bottleneck_tensor)
File "tensorflow/examples/image_retraining/retrain.py", line 397, in create_bottleneck_file
str(e)))
RuntimeError: Error during processing file /Users/tushuhei/Resources/tf-retrain-images/cat/cat.1.jpg (Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.)


lcycoding · 2018-03-05T07:08:15Z

Hi,

I think I found a way to avoid the error (or it may just mess up the graph).

I opened an issue a few days ago, but it was closed by the maintainers shortly afterwards.

I'm not sure this is the correct solution. Refer to #17370.

In the run_final_eval function, change
(sess, bottleneck_input, ground_truth_input, evaluation_step, prediction) = build_eval_session(model_info, class_count)
into
(eval_sess, bottleneck_input, ground_truth_input, evaluation_step, prediction) = build_eval_session(model_info, class_count)

and
test_accuracy, predictions = sess.run( [evaluation_step, prediction], feed_dict={ bottleneck_input: test_bottlenecks, ground_truth_input: test_ground_truth })
into
test_accuracy, predictions = eval_sess.run( [evaluation_step, prediction], feed_dict={ bottleneck_input: test_bottlenecks, ground_truth_input: test_ground_truth })

I think the problem is that the graph of the newly built eval session does not include the DecodeJPGInput tensor.
It's just a guess.
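
To illustrate why the rename above could help, here is a minimal sketch in plain Python. It does not use TensorFlow: `ToyGraph`, `ToyTensor`, `ToySession`, and this `build_eval_session` are hypothetical stand-ins that only mimic the "feed key must belong to the session's graph" check. The assumption, based on the change above, is that assigning the result of build_eval_session to `sess` hides the training session, so a tensor from the training graph ends up being fed to the eval session.

```python
# Toy stand-ins for a TF 1.x graph/session pair. These classes are
# hypothetical and only mimic the graph-membership check on feed_dict keys.

class ToyGraph:
    def __init__(self, name):
        self.name = name


class ToyTensor:
    def __init__(self, name, graph):
        self.name = name
        self.graph = graph


class ToySession:
    def __init__(self, graph):
        self.graph = graph

    def run(self, fetch, feed_dict):
        # Reject any feed key that lives in a different graph.
        for tensor in feed_dict:
            if tensor.graph is not self.graph:
                raise TypeError(
                    "Cannot interpret feed_dict key as Tensor: Tensor %s "
                    "is not an element of this graph." % tensor.name)
        return "ran %s on %s" % (fetch, self.graph.name)


def build_eval_session():
    # Hypothetical: returns a session bound to a brand-new eval graph.
    return ToySession(ToyGraph("eval"))


def final_eval_buggy(sess, jpeg_data_tensor):
    # The training session in `sess` is overwritten by the eval session,
    # so the training-graph tensor is fed to the wrong session.
    sess = build_eval_session()
    return sess.run("bottleneck", {jpeg_data_tensor: b"jpeg bytes"})


def final_eval_fixed(sess, jpeg_data_tensor):
    # A distinct name keeps the training session available in `sess`.
    eval_sess = build_eval_session()
    bottleneck = sess.run("bottleneck", {jpeg_data_tensor: b"jpeg bytes"})
    return bottleneck, eval_sess.run("accuracy", {})


train_graph = ToyGraph("train")
train_sess = ToySession(train_graph)
jpeg_data_tensor = ToyTensor("DecodeJPGInput:0", train_graph)

try:
    final_eval_buggy(train_sess, jpeg_data_tensor)
    buggy_error = None
except TypeError as e:
    buggy_error = str(e)

fixed_result = final_eval_fixed(train_sess, jpeg_data_tensor)
print(buggy_error)
print(fixed_result)
```

In the fixed variant the bottleneck step runs on the training session and only the accuracy step runs on the eval session, which is the split the rename to eval_sess is meant to achieve.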

tensor-flower · 2018-03-08T05:12:41Z

Same here. The latest commit, 7a2ba8e, seems to have broken the evaluation. Before that commit my code worked fine; now it fails with this error:

RuntimeError: Error during processing file
/home/graymatics/wanqi/classifier_data/flower_photos/tulips/8712270243_8512cf4fbd.jpg (Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.)

By the way, the above solution didn't work for me. Still waiting for a fix.

IcyCheetah358 · 2018-03-31T10:49:40Z

I got the same error with:
(tensorflow) mrwelph@mrwelph-810-011eo ~/www/condaTF/tensorflow $ python tensorflow/examples/image_retraining/retrain.py --architecture=mobilenet_0.25_128 --flip_left_right --random_scale=30 --random_crop=30 --how_many_training_steps=500 --image_dir ~/www/takeaway/debugSet
=> strange error
RuntimeError: Error during processing file ... (Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.)

But then I tried several variations with just 3 training steps and noticed that adding random brightness removed the error, so this worked (OK, it's not exactly the same model, but since I am currently running training I won't test whether the error is also gone with that one):

python tensorflow/examples/image_retraining/retrain.py --architecture=mobilenet_0.25_192 --flip_left_right --random_scale=30 --random_crop=30 --how_many_training_steps=3 --image_dir ~/www/takeaway/debugSet --random_brightness=30

poxvoculi · 2018-04-05T01:11:14Z

@suharshs This issue got lost for a while. @tensor-flower claims the problem was introduced by a commit you were involved with.

suharshs · 2018-04-05T01:31:30Z

Sorry we missed this. I think I see the issue and will submit a fix soon. Thanks!

haoxi911 · 2018-04-05T11:59:36Z

Adding --random_brightness=30 didn't work for me. I had to revert the code to the parent of 7a2ba8e; then it works.

The previous code in run_final_eval() did not supply the proper session to get_random_cached_bottlenecks(), namely the training session with jpeg_data_tensor etc. If caching hadn't happened before, this crashed the use of data distortion before saving the trained model.
Issue reported by syed-ahmed in #16. This is analogous to tensorflow/tensorflow#17423 and the fix by suharshs in tensorflow/tensorflow@ccad14e.
PiperOrigin-RevId: 192252572
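
The "if caching hadn't happened before" condition can be sketched with a toy cache in plain Python. This is not the code from retrain.py: the simplified `get_or_create_bottleneck` signature, `broken_compute`, and the file layout are assumptions made for illustration. The point is only that the failing compute path is reached on a cache miss, which would explain why the error depends on whether the test bottlenecks were already cached.

```python
import os
import tempfile


def get_or_create_bottleneck(cache_dir, image_name, compute_fn):
    """Return cached values if present; otherwise compute and cache them."""
    path = os.path.join(cache_dir, image_name + ".txt")
    if os.path.exists(path):
        with open(path) as f:
            return [float(x) for x in f.read().split(",")]
    values = compute_fn(image_name)  # only reached on a cache miss
    with open(path, "w") as f:
        f.write(",".join(str(v) for v in values))
    return values


def broken_compute(image_name):
    # Hypothetical stand-in for running the bottleneck op with the wrong session.
    raise RuntimeError("Error during processing file %s" % image_name)


with tempfile.TemporaryDirectory() as cache_dir:
    # Cold cache: the compute path is reached and the error surfaces.
    try:
        get_or_create_bottleneck(cache_dir, "cat.1.jpg", broken_compute)
        cold_cache_failed = False
    except RuntimeError:
        cold_cache_failed = True

    # Warm the cache with a working compute function (made-up values).
    get_or_create_bottleneck(cache_dir, "cat.1.jpg", lambda name: [0.5, 1.5])

    # Warm cache: the broken compute path is never called.
    warm_values = get_or_create_bottleneck(cache_dir, "cat.1.jpg", broken_compute)

print(cold_cache_failed, warm_values)
```

Under this reading, a bottleneck directory left over from an earlier run could hide the problem, which might be one reason the workarounds reported in this thread behaved differently for different people.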

koala99 · 2018-06-30T17:56:43Z

I use TF 1.7 and also have this problem. I did what lcycoding said, and it's OK.

Ahanmr · 2018-12-20T13:42:26Z

Where can we find the /tmp/ folder that is formed at the end of bottleneck creation? I'm not able to find this directory anywhere on my computer, and I believe it's not generated. How can I save the model at the end of bottleneck creation?

tensorflowbutler assigned poxvoculi Apr 3, 2018

poxvoculi assigned suharshs Apr 5, 2018

poxvoculi removed their assignment Apr 5, 2018

rmlarsen closed this as completed in 8b52120 Apr 6, 2018

arnoegw mentioned this issue Apr 10, 2018

In Hub image retraining, fix session mixup in case of data distortion. tensorflow/hub#20

Merged

amirbernatODT mentioned this issue Apr 16, 2018

Image retraining script memory problem and issue #17370

Closed
