image_retraining example does not work #17423

Closed
tushuhei opened this issue Mar 5, 2018 · 8 comments
@tushuhei

tushuhei commented Mar 5, 2018

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 10.13.3
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v1.5.0-0-g37aa430d84 1.5.0
  • Python version: 3.6.4
  • Exact command to reproduce:
    python3 tensorflow/examples/image_retraining/retrain.py \
      --image_dir ~/Resources/tf-retrain-images/ \
      --learning_rate=0.0001 \
      --testing_percentage=10 \
      --validation_percentage=10 \
      --train_batch_size=32 \
      --validation_batch_size=-1 \
      --flip_left_right True \
      --random_scale=30 \
      --random_brightness=30 \
      --eval_step_interval=100 \
      --how_many_training_steps=500 \
      --architecture mobilenet_0.25_224

Describe the problem

retrain.py fails with the error below when it starts creating bottleneck files for the test set after training has finished. Something seems to be wrong with how bottleneck files are created for the test images.
FYI, it works when I check out retrain.py at commit dce9a49.

Source code / logs

...
INFO:tensorflow:2018-03-05 13:31:17.056049: Step 90: Validation accuracy = 89.0% (N=73)
INFO:tensorflow:2018-03-05 13:31:25.350794: Step 99: Train accuracy = 96.9%
INFO:tensorflow:2018-03-05 13:31:25.350940: Step 99: Cross entropy = 0.198750
INFO:tensorflow:2018-03-05 13:31:25.398267: Step 99: Validation accuracy = 89.0% (N=73)
Model path: /tmp/imagenet/mobilenet_v1_0.25_224_frozen.pb
INFO:tensorflow:Restoring parameters from /tmp/_retrain_checkpoint
INFO:tensorflow:Creating bottleneck at /tmp/bottleneck/cat/cat.1.jpg_mobilenet_0.25_224.txt
Traceback (most recent call last):
  File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1070, in _run
    allow_operation=False)
  File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3323, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3402, in _as_graph_element_locked
    raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tensorflow/examples/image_retraining/retrain.py", line 394, in create_bottleneck_file
    resized_input_tensor, bottleneck_tensor)
  File "tensorflow/examples/image_retraining/retrain.py", line 326, in run_bottleneck_on_image
    {image_data_tensor: image_data})
  File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1073, in _run
    + e.args[0])
TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tensorflow/examples/image_retraining/retrain.py", line 1486, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/Users/tushuhei/py3env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "tensorflow/examples/image_retraining/retrain.py", line 1286, in main
    bottleneck_tensor)
  File "tensorflow/examples/image_retraining/retrain.py", line 881, in run_final_eval
    bottleneck_tensor, FLAGS.architecture))
  File "tensorflow/examples/image_retraining/retrain.py", line 567, in get_random_cached_bottlenecks
    resized_input_tensor, bottleneck_tensor, architecture)
  File "tensorflow/examples/image_retraining/retrain.py", line 442, in get_or_create_bottleneck
    bottleneck_tensor)
  File "tensorflow/examples/image_retraining/retrain.py", line 397, in create_bottleneck_file
    str(e)))
RuntimeError: Error during processing file /Users/tushuhei/Resources/tf-retrain-images/cat/cat.1.jpg (Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.)
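
The ValueError at the bottom of the first traceback comes from a membership check: a session is bound to one graph and rejects feed_dict keys that were created in another. The following is a minimal pure-Python sketch of that check, not TensorFlow code. ToyTensor, ToyGraph, and ToySession are hypothetical stand-ins invented for illustration.

```python
class ToyTensor:
    """Hypothetical stand-in for a tensor; remembers the graph that owns it."""

    def __init__(self, name, graph):
        self.name = name
        self.graph = graph


class ToyGraph:
    """Hypothetical stand-in for a graph; creates tensors that it owns."""

    def placeholder(self, name):
        return ToyTensor(name, self)


class ToySession:
    """Hypothetical stand-in for a session bound to exactly one graph."""

    def __init__(self, graph):
        self.graph = graph

    def run(self, feed_dict):
        # Reject any feed key that was created in a different graph.
        for tensor in feed_dict:
            if tensor.graph is not self.graph:
                raise ValueError(
                    "Tensor %s is not an element of this graph." % tensor.name)
        return "ok"


train_graph, eval_graph = ToyGraph(), ToyGraph()
jpeg_input = train_graph.placeholder("DecodeJPGInput:0")

train_sess = ToySession(train_graph)
eval_sess = ToySession(eval_graph)

# Same graph: the feed key is accepted.
print(train_sess.run({jpeg_input: b"jpeg bytes"}))

# Different graph: the feed key is rejected.
try:
    eval_sess.run({jpeg_input: b"jpeg bytes"})
except ValueError as err:
    print(err)
```

In the toy model, as in the log above, the tensor itself is fine; the failure depends only on which session it is fed to.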

@lcycoding

lcycoding commented Mar 5, 2018

hi,

I think I found a workaround that avoids the error (or it may just be messing up the graph).

I opened an issue a few days ago, but it was closed by the maintainers shortly afterwards.

I'm not sure this is the correct solution. See #17370.

In the run_final_eval function,
change
(sess, bottleneck_input, ground_truth_input, evaluation_step, prediction) = build_eval_session(model_info, class_count)
into
(eval_sess, bottleneck_input, ground_truth_input, evaluation_step, prediction) = build_eval_session(model_info, class_count)

and
test_accuracy, predictions = sess.run( [evaluation_step, prediction], feed_dict={ bottleneck_input: test_bottlenecks, ground_truth_input: test_ground_truth })
into
test_accuracy, predictions = eval_sess.run( [evaluation_step, prediction], feed_dict={ bottleneck_input: test_bottlenecks, ground_truth_input: test_ground_truth })

I think the problem is that the newly built eval session does not include the DecodeJPGInput tensor.
It's just a guess.
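
The rename matters because the tuple assignment in the original line rebinds sess. Assuming run_final_eval also receives the training session in a parameter named sess (an assumption here, though the later commit message about "the proper session" points the same way), every later use of sess silently refers to the new evaluation session. A minimal sketch of that shadowing in plain Python, with hypothetical helper names and strings standing in for sessions:

```python
def build_eval_session():
    """Hypothetical stand-in: returns a new session plus one other value."""
    return "eval session", "evaluation_step"


def make_bottlenecks(sess):
    """Hypothetical stand-in for the bottleneck step; reports the session used."""
    return "bottlenecks made with " + sess


def final_eval_shadowed(sess):
    # The tuple assignment rebinds `sess`; the caller's session is lost.
    sess, evaluation_step = build_eval_session()
    return make_bottlenecks(sess)


def final_eval_renamed(sess):
    # A distinct name keeps the caller's session available as `sess`.
    eval_sess, evaluation_step = build_eval_session()
    return make_bottlenecks(sess)


print(final_eval_shadowed("training session"))
print(final_eval_renamed("training session"))
```

Only the renamed version hands the caller's session to the bottleneck step.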

@tensor-flower

Same here. The latest commit, 7a2ba8e, seems to have broken evaluation. Before that commit my code worked fine; now it fails with this error:

RuntimeError: Error during processing file
/home/graymatics/wanqi/classifier_data/flower_photos/tulips/8712270243_8512cf4fbd.jpg (Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.)

By the way, the solution above didn't work for me; I'm still waiting for a fix.

@IcyCheetah358

I got the same error with:
(tensorflow) mrwelph@mrwelph-810-011eo ~/www/condaTF/tensorflow $ python tensorflow/examples/image_retraining/retrain.py --architecture=mobilenet_0.25_128 --flip_left_right --random_scale=30 --random_crop=30 --how_many_training_steps=500 --image_dir ~/www/takeaway/debugSet
=> strange error
RuntimeError: Error during processing file ... (Cannot interpret feed_dict key as Tensor: Tensor Tensor("DecodeJPGInput:0", dtype=string) is not an element of this graph.)

I then tried several variations with just 3 training steps and noticed that adding random brightness removed the error, so the command below worked. (It is not exactly the same model, but since I'm currently running training I won't check whether the error also disappears with the original one.)

python tensorflow/examples/image_retraining/retrain.py --architecture=mobilenet_0.25_192 --flip_left_right --random_scale=30 --random_crop=30 --how_many_training_steps=3 --image_dir ~/www/takeaway/debugSet --random_brightness=30

@poxvoculi
Contributor

@suharshs This issue got lost for a while. @tensor-flower claims the problem was introduced by a commit you were involved with.

@poxvoculi poxvoculi removed their assignment Apr 5, 2018
@suharshs

suharshs commented Apr 5, 2018

Sorry we missed this. I think I see the issue and will submit a fix soon. Thanks!

@haoxi911

haoxi911 commented Apr 5, 2018

Adding --random_brightness=30 didn't work for me; I had to revert the code to the parent of 7a2ba8e, and then it worked.

arnoegw pushed a commit to tensorflow/hub that referenced this issue Apr 10, 2018
The previous code in run_final_eval() did not supply the proper session
to get_random_cached_bottlenecks(), namely the training session with
jpeg_data_tensor etc.  If caching hadn't happened before, this crashed
the use of data distortion before saving the trained model.

Issue reported by syed-ahmed in #16

This is analogous to tensorflow/tensorflow#17423
and the fix by suharshs in
tensorflow/tensorflow@ccad14e

PiperOrigin-RevId: 192252572
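
The remark "If caching hadn't happened before" suggests why the crash depended on the run: a cached bottleneck can be read back without touching the session, so a wrong session only fails on a cache miss. A toy sketch of that behaviour, assuming a dict in place of the bottleneck files. get_or_create and the two session functions are hypothetical and are not the functions in retrain.py.

```python
def training_session(image):
    """Hypothetical session that knows the JPEG input and can compute values."""
    return [float(len(image))]


def eval_session(image):
    """Hypothetical session whose graph lacks the JPEG input tensor."""
    raise ValueError("DecodeJPGInput:0 is not an element of this graph.")


def get_or_create(cache, sess, image):
    # Cache hit: the session is never used, so a wrong session goes unnoticed.
    if image in cache:
        return cache[image]
    # Cache miss: the value must be computed, which needs the right session.
    cache[image] = sess(image)
    return cache[image]


cache = {"cat.1.jpg": [9.0]}

# Cached image: works even with the wrong session.
print(get_or_create(cache, eval_session, "cat.1.jpg"))

# Uncached image: the wrong session fails, the training session succeeds.
try:
    get_or_create(cache, eval_session, "cat.2.jpg")
except ValueError as err:
    print("failed:", err)
print(get_or_create(cache, training_session, "cat.2.jpg"))
```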
andresusanopinto pushed a commit to tensorflow/hub that referenced this issue Apr 10, 2018
@koala99

koala99 commented Jun 30, 2018

I use TF 1.7 and also had this problem. I did what lcycoding suggested, and it works.

@Ahanmr

Ahanmr commented Dec 20, 2018

Where can we find the /tmp/ folder that is created at the end of the bottleneck step? I can't find this directory anywhere on my computer, and I believe it isn't being generated. How can I save the model once the bottlenecks are done?
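
One way to check whether a run wrote anything is to test the directories that appear in the log earlier in this thread (/tmp/imagenet and /tmp/bottleneck). A small sketch using only the Python standard library follows. The helper name report_paths is made up for illustration, and on systems without a Unix-style /tmp these paths may resolve somewhere unexpected.

```python
import os


def report_paths(paths):
    """Return a dict mapping each path to whether it exists on this machine."""
    return {path: os.path.exists(path) for path in paths}


# Directories copied from the log output earlier in this thread.
LOG_PATHS = [
    "/tmp/imagenet",
    "/tmp/bottleneck",
]

for path, exists in report_paths(LOG_PATHS).items():
    print(path, "exists" if exists else "missing")
```

As far as I know, retrain.py also takes --bottleneck_dir and --output_graph flags, which can point the cache and the saved model at a location that is easier to find.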
