Train.py in object_detection crash. AttributeError: module 'tensorflow.contrib.slim.python.slim.data.tfexample_decoder' has no attribute 'BackupHandler' #2653
Comments
I have the same error. I searched the TensorFlow source code from r1.0 to r1.4 and cannot find it.
|
I have the same error too. Traceback (most recent call last): I ran this many times before and it worked well. But today I started from a new virtual machine with the same CUDA 8.0, cuDNN 6.0 and other dependencies. TensorFlow was installed with the command "pip install tensorflow-gpu", version 1.3.0, under Python 2.7.12. The error happens in object_detection/data_decoders/tf_example_decoder.py, line 128: slim_example_decoder = tf.contrib.slim.tfexample_decoder, label_handler = slim_example_decoder.BackupHandler( .......... I checked all versions of TensorFlow from r0.7 to r1.4, and none of them has the attribute 'BackupHandler' in tensorflow/contrib/slim/python/slim/data/tfexample_decoder.py. Only the master branch has this attribute. |
me too :
|
I am facing the same issue today. Yesterday I was able to run on the personal machine. But today I am trying on AWS. My command is
I get this error.
|
BUT tfexample_decoder.py has the class BackupHandler. |
@chihyuwang When you do pip install tensorflow-gpu, the 'BackupHandler' class is not there in the installed TensorFlow code. Maybe we need to build TensorFlow from source, I guess. I'm trying that now and will update here accordingly. |
I changed to the previous commit code, now it works. |
Temporary solution (on Ubuntu 16.04 with TensorFlow r1.4): 1: download the new tfexample_decoder.py: 2: replace the installed copy. OK! |
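For anyone trying the file-replacement workaround above, the file to replace is the one inside the installed TensorFlow package, not a copy in the models repo. A minimal sketch for finding that path follows; `locate_module_file` is a hypothetical helper name, and the dotted module path is taken from the error message in this issue. Note that looking up a dotted name imports its parent packages.

```python
import importlib.util


def locate_module_file(module_name):
    """Return the file path of an importable module, or None if not found."""
    try:
        # For dotted names, find_spec imports the parent packages first.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Module path copied from the AttributeError in this issue.
    name = "tensorflow.contrib.slim.python.slim.data.tfexample_decoder"
    print(locate_module_file(name))
```

Back up the original file before overwriting it, since a later pip upgrade or reinstall will replace it again.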
File "/home/wpq/models-master/research/object_detection/utils/variables_helper.py", line 122, in get_variables_available_in_checkpoint
|
The class in tfexample_decoder.py is named TFExampleDecoder, while the constructor call uses the name TfExampleDecoder. |
Hi all, Thanks |
The new ones have models/research/object_detection. They are still being updated. I am at least able to train with this repo: https://github.com/tensorflow/models/tree/0375c800c767db2ef070cee1529d8a50f42d1042 The older version seems to work. |
|
wow, |
To resolve this issue you'll have to update your TensorFlow version. You can install a nightly build here for GPU, or here for CPU. Any TensorFlow version after October 24th 2017 should work (which is when this method was added). |
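A quick way to tell whether an installed TensorFlow is new enough is to test for the attribute directly instead of comparing version numbers. This is only a sketch: `module_has_attr` is a hypothetical helper, and the module path and attribute name are the ones from the error message in this issue.

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Return True if the module imports and exposes attr_name, else False."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing TensorFlow install or a missing submodule.
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    decoder = "tensorflow.contrib.slim.python.slim.data.tfexample_decoder"
    if module_has_attr(decoder, "BackupHandler"):
        print("BackupHandler is available")
    else:
        print("BackupHandler is missing; a newer TensorFlow build may be needed")
```

Run it with the same interpreter and virtual environment used for train.py, since a different environment may hold a different TensorFlow build.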
Hello AlgorithmR: it works! The nightly build works with object_detection/train.py. |
@radzfoto I used this nightly build and this error is gone, but I get another one. From the message, I can tell TensorFlow can't restore a variable from the checkpoint. I use the SSD MobileNet v1 model, which I downloaded from the model zoo. Which one do you use?
|
hi @scotthuang1989: I use faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017. I re-downloaded it just in case, but it looks to be the same as before. I did not have problems with this frozen proto. |
@radzfoto Tensorboard is not included in the nightly builds of TensorFlow. You can install it manually with pip like so: pip install tensorflow-tensorboard. |
@radzfoto What is your TensorFlow version? My faster_rcnn_inception_resnet_v2_atrous_pets.config: model { |
I have the same issue. Has anyone got a solution yet? I tried going back to previous commits etc., but to no avail. |
@mpeniak You'll have to update TensorFlow using their nightly builds as specified here: #2653 (comment). Alternatively you could revert to an earlier version of object_detection, but I'm not entirely sure when this was added. Best guess would be that anything before October 27th is compatible with the main version of TF. |
@wpq3142 , my pipeline config is moderately complicated and very specific to my project, so sharing it wouldn't help you. As per the instructions from @AlgorithmR above, I installed the nightly build two nights ago. That solved the problem for me. |
Based on AlgorithmR's comment above, I downloaded one of the linux wheel files from today (11/02). But, pip install tf_nightly_gpu-1.5.0.dev20171102-cp36-cp36m-manylinux1_x86_64.whl failed with: tf_nightly_gpu-1.5.0.dev20171102-cp36-cp36m-manylinux1_x86_64.whl is not a supported wheel on this platform. I am on an x86 machine (not amd) running Ubuntu 16.04. Is this to be done differently? |
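pip reports "not a supported wheel on this platform" when the tags in a wheel's filename do not match the running interpreter or OS. The sketch below splits a wheel filename into its Python, ABI and platform tags so they can be compared by eye. `wheel_tags` and `running_python_tag` are hypothetical helpers; the second assumes CPython, and the comparison is simplified because real wheels may carry compound tags such as `py2.py3`.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three dash-separated fields are the compatibility tags.
    return tuple(parts[-3:])


def running_python_tag():
    """Return a CPython-style tag for this interpreter, e.g. 'cp36'."""
    return "cp{}{}".format(sys.version_info[0], sys.version_info[1])


if __name__ == "__main__":
    name = "tf_nightly_gpu-1.5.0.dev20171102-cp36-cp36m-manylinux1_x86_64.whl"
    python_tag, abi_tag, platform_tag = wheel_tags(name)
    print(python_tag, abi_tag, platform_tag)
    print("matches this interpreter:", python_tag == running_python_tag())
```

In the filename above, `cp36` means the wheel targets CPython 3.6, so it would be rejected by, for example, a Python 2.7 or 3.5 environment.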
Oops! I hadn't paid attention to the fact that there are several wheel files for Linux that differ very little in their naming. This one worked for Ubuntu: |
If you need to run object detection on Google ML Engine, where only TF 1.2 is available, I have made a fork of this project and copied the missing handler code over from current master. mrfortynine@6b940fb |
#2692 should fix this. |
Hi, what do I need to do to solve this issue on a Windows machine? I get a similar error: I am running the program in an Anaconda environment. |
Still getting this error with the latest version of the master branch. |
I'm getting the same error
Error message:
Please help. Many thanks in advance. |
Seems like commit 7a9934d broke PR #2692, which was referenced earlier here by @tombstone, after which the issue was closed by @jch1. Should this issue be re-opened, or should a new issue be created? Reproducible with master of this repo on Google Cloud ML with TensorFlow 1.4. Compiled with TensorFlow 1.5 locally, but failing on Google Cloud ML, which I assume is stuck on TensorFlow 1.4. Is this essentially because the underlying TensorFlow issue was only recently fixed? If so, do Google Cloud ML users need to keep using the workaround provided by @mrfortynine? |
@caroline6927 I can also confirm the same issue from my side. I have the latest tensorflow/model repo cloned and I am using Tensorflow 1.4.0 in the installation. UPDATE: the OS is Ubuntu 16.04 |
Getting the following error on GCP. It works locally. Any ideas?
The replica master 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 167, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 163, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 235, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 120, in get_next
dataset_builder.build(config)).get_next()
File "/root/.local/lib/python2.7/site-packages/object_detection/builders/dataset_builder.py", line 138, in build
label_map_proto_file=label_map_proto_file)
File "/root/.local/lib/python2.7/site-packages/object_detection/data_decoders/tf_example_decoder.py", line 162, in __init__
label_handler = slim_example_decoder.BackupHandler(
AttributeError: 'module' object has no attribute 'BackupHandler'
The replica workers 0 through 6 each exited with a non-zero status of 1 (Termination reason: Error) with the identical traceback.
To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=826960440479&resource=ml_job%2Fjob_id%2Fobject_detection_1520535841&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22object_detection_1520535841%22 |
In my case there were no files in the /usr/local/lib/python3.5/dist-packages directory.
The problem was solved! |
@CHAMOD I am facing the problem on Google Cloud Platform. No issues when I run locally. |
@yogeshl18 I have no experience yet with running it on Google Cloud Platform. |
I also have this kind of error: C:\tensorflow1\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config AttributeError: module 'tensorflow' has no attribute 'contrib' |
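Note that this is a different error from the one in the issue title. The `tf.contrib` namespace was removed in TensorFlow 2.0, so "module 'tensorflow' has no attribute 'contrib'" most likely means a 2.x install is being used with a script that still relies on `tf.contrib.slim`. A minimal sketch of a version check follows; `contrib_expected` is a hypothetical helper that only looks at the major version number of a string such as `tf.__version__`.

```python
def contrib_expected(version):
    """Return True if the version string belongs to a pre-2.0 TensorFlow."""
    # Accept strings such as "1.4.0", "2.0.0" or "v1.3.0-rc2".
    major = int(version.lstrip("v").split(".")[0])
    return major < 2


if __name__ == "__main__":
    for version in ("1.4.0", "2.0.0"):
        print(version, "tf.contrib expected:", contrib_expected(version))
```

If the check returns False for the installed version, installing a 1.x release in a separate environment is one option for running this train.py.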
Command: python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config Error: Traceback (most recent call last): I am getting this error. Can you all please help me with this? |
System information
What is the top-level directory of the model you are using: ~/tensorflow/models/research/object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No custom code, and using a neural network supplied in the object_detection folders. The dataset for retraining is my own.
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): v1.3.0-rc2-20-g0787eee 1.3.0
Bazel version (if compiling from source):
Python version 3.6.3, using GCC 5.4.0 20160609
CUDA/cuDNN version: CUDA 8.0, cudnn 6.0, nVidia driver: 384.90
GPU: nVidia TitanXp 12GB memory
CPU: Intel x86-64 Intel Core i7-7700K @ 4.20GHz x 8, 32GB memory
Exact command to reproduce:
python train.py --logtostderr --train_dir=/media/raul/RD750G/highwai/tfrecords/checkpoint --pipeline_config_path=/media/raul/RD750G/highwai/tfrecords/models/model/highwai_data.config
Describe the problem
I run the object_detection train.py script, which I have done many times successfully. However, after a git pull today, I get the error below. I have not changed any code in object_detection at all.
Source code / logs
python train.py --logtostderr --train_dir=/media/raul/RD750G/highwai/tfrecords/checkpoint --pipeline_config_path=/media/raul/RD750G/highwai/tfrecords/models/model/highwai_data.config
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
Traceback (most recent call last):
File "train.py", line 163, in <module>
tf.app.run()
File "/home/raul/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 159, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/raul/tensorflow/models/research/object_detection/trainer.py", line 217, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/home/raul/tensorflow/models/research/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "/home/raul/tensorflow/models/research/object_detection/builders/input_reader_builder.py", line 72, in build
label_map_proto_file=label_map_proto_file)
File "/home/raul/tensorflow/models/research/object_detection/data_decoders/tf_example_decoder.py", line 128, in __init__
label_handler = slim_example_decoder.BackupHandler(
AttributeError: module 'tensorflow.contrib.slim.python.slim.data.tfexample_decoder' has no attribute 'BackupHandler'