Train.py in object_detection crash. AttributeError: module 'tensorflow.contrib.slim.python.slim.data.tfexample_decoder' has no attribute 'BackupHandler' #2653

Closed
radzfoto opened this issue Oct 31, 2017 · 38 comments · Fixed by #2692
Labels
stat:awaiting model gardener Waiting on input from TensorFlow model gardener

Comments

@radzfoto

System information

  • What is the top-level directory of the model you are using: ~/tensorflow/models/research/object_detection

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No custom code, and using a neural network supplied in the object_detection folders. The dataset for retraining is my own.

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04

  • TensorFlow installed from (source or binary): binary

  • TensorFlow version (use command below): v1.3.0-rc2-20-g0787eee 1.3.0

  • Bazel version (if compiling from source):

  • Python version 3.6.3, using GCC 5.4.0 20160609

  • CUDA/cuDNN version: CUDA 8.0, cudnn 6.0, nVidia driver: 384.90

  • GPU: nVidia TitanXp 12GB memory

  • CPU: Intel x86-64 Intel Core i7-7700K @ 4.20GHz x 8, 32GB memory

  • Exact command to reproduce:
    python train.py --logtostderr --train_dir=/media/raul/RD750G/highwai/tfrecords/checkpoint --pipeline_config_path=/media/raul/RD750G/highwai/tfrecords/models/model/highwai_data.config

Describe the problem

I run the object_detection train.py script, which I have done many times successfully. However, after a git pull today, I get the error below. I have not changed any code in object_detection at all.

Source code / logs

python train.py --logtostderr --train_dir=/media/raul/RD750G/highwai/tfrecords/checkpoint --pipeline_config_path=/media/raul/RD750G/highwai/tfrecords/models/model/highwai_data.config
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
Traceback (most recent call last):
File "train.py", line 163, in <module>
tf.app.run()
File "/home/raul/tensorflow/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 159, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/raul/tensorflow/models/research/object_detection/trainer.py", line 217, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/home/raul/tensorflow/models/research/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "/home/raul/tensorflow/models/research/object_detection/builders/input_reader_builder.py", line 72, in build
label_map_proto_file=label_map_proto_file)
File "/home/raul/tensorflow/models/research/object_detection/data_decoders/tf_example_decoder.py", line 128, in __init__
label_handler = slim_example_decoder.BackupHandler(
AttributeError: module 'tensorflow.contrib.slim.python.slim.data.tfexample_decoder' has no attribute 'BackupHandler'

@scotthuang1989

scotthuang1989 commented Oct 31, 2017

I have the same error. I searched for BackupHandler in the TensorFlow source code from r1.0 through r1.4 and could not find it.
I also ran the unit test from the research directory:

python object_detection/data_decoders/tf_example_decoder_test.py

There are 12 errors, all with similar messages. Here is the first one:

======================================================================
ERROR: testDecodeBoundingBox (__main__.TfExampleDecoderTest)

Traceback (most recent call last):
File "object_detection/data_decoders/tf_example_decoder_test.py", line 128, in testDecodeBoundingBox
'image/format': self._BytesFeature('jpeg'),
File "object_detection/data_decoders/tf_example_decoder_test.py", line 57, in _BytesFeature
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
TypeError: 'jpeg' has type str, but expected one of: bytes
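The TypeError in this test output looks separate from the BackupHandler problem: under Python 3, 'jpeg' is a str, while the bytes list in the traceback accepts only bytes. A minimal sketch of a conversion helper follows. The name to_bytes is hypothetical and is not part of the test file.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it first if it is a str."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


# A str is encoded; a bytes object is passed through unchanged.
print(to_bytes("jpeg"))
print(to_bytes(b"jpeg"))
```

Under this assumption, passing to_bytes('jpeg') instead of 'jpeg' to _BytesFeature should avoid the type mismatch on Python 3.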

@tuobay

tuobay commented Oct 31, 2017

I have the same error too.
I ran object_detection/train.py today, and the error output is as follows:

Traceback (most recent call last):
File "object_detection/train.py", line 163, in <module>
tf.app.run()
File "/home/caffe/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 159, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/caffe/models/research/object_detection/trainer.py", line 217, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/home/caffe/models/research/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "/home/caffe/models/research/object_detection/builders/input_reader_builder.py", line 72, in build
label_map_proto_file=label_map_proto_file)
File "/home/caffe/models/research/object_detection/data_decoders/tf_example_decoder.py", line 128, in __init__
label_handler = slim_example_decoder.BackupHandler(
AttributeError: 'module' object has no attribute 'BackupHandler'

I have run this many times before without problems. Today I started from a new virtual machine with the same CUDA 8.0, cuDNN 6.0, and other dependencies. TensorFlow 1.3.0 was installed with "pip install tensorflow-gpu" under Python 2.7.12.

The error occurs in object_detection/data_decoders/tf_example_decoder.py, line 128, where slim_example_decoder = tf.contrib.slim.tfexample_decoder and the failing call is label_handler = slim_example_decoder.BackupHandler( ..........
I checked the TensorFlow 1.3.0 source code: https://github.com/tensorflow/tensorflow/blob/r1.3/tensorflow/contrib/slim/python/slim/data/tfexample_decoder.py
That file, tfexample_decoder.py, does not define BackupHandler, so I don't know why object_detection/data_decoders/tf_example_decoder.py calls slim_example_decoder.BackupHandler.

I checked every TensorFlow release branch from r0.7 to r1.4, and none of them defines BackupHandler in tensorflow/contrib/slim/python/slim/data/tfexample_decoder.py. Only the master branch has it.
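The check described above can also be run against the installed package rather than the GitHub source. The sketch below uses only the standard library. The helper name has_attribute is hypothetical; the module path is the one from the error message.

```python
import importlib


def has_attribute(module_name, attribute):
    """Return True or False if the module imports, None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


# Module path copied from the AttributeError in this thread.
DECODER_MODULE = "tensorflow.contrib.slim.python.slim.data.tfexample_decoder"

if __name__ == "__main__":
    result = has_attribute(DECODER_MODULE, "BackupHandler")
    if result is None:
        print("could not import", DECODER_MODULE)
    else:
        print("BackupHandler present:", result)
```

A result of False would match the reports here: the installed release does not contain the class that the current object_detection code expects.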

@hoangtocdo90

hoangtocdo90 commented Oct 31, 2017

me too :

root@fc086fe4f615:/notebooks/models/research# python object_detection/train.py --logtostderr /notebooks/katakana --pipeline_config_path=/notebooks/katakana/data/ssd_mobilenet_v1_katakana.config --train_dir=/notebooks/katakana/
Traceback (most recent call last):
File "object_detection/train.py", line 163, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 159, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/notebooks/models/research/object_detection/trainer.py", line 217, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/notebooks/models/research/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "/notebooks/models/research/object_detection/builders/input_reader_builder.py", line 72, in build
label_map_proto_file=label_map_proto_file)
File "/notebooks/models/research/object_detection/data_decoders/tf_example_decoder.py", line 128, in __init__
label_handler = slim_example_decoder.BackupHandler(
AttributeError: 'module' object has no attribute 'BackupHandler'

@harshkn

harshkn commented Oct 31, 2017

I am facing the same issue today. Yesterday I was able to run on the personal machine. But today I am trying on AWS.

My command is

python object_detection/train.py --logtostderr --pipeline_config_path=/home/ubuntu/od/models/model/ssd_mobilenet_v1_pets.config --train_dir=/home/ubuntu/od/models/model/train

I get this error.

Traceback (most recent call last):
  File "object_detection/train.py", line 163, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 159, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/ubuntu/models/research/object_detection/trainer.py", line 217, in train
    train_config.prefetch_queue_capacity, data_augmentation_options)
  File "/home/ubuntu/models/research/object_detection/trainer.py", line 59, in create_input_queue
    tensor_dict = create_tensor_dict_fn()
  File "/home/ubuntu/models/research/object_detection/builders/input_reader_builder.py", line 72, in build
    label_map_proto_file=label_map_proto_file)
  File "/home/ubuntu/models/research/object_detection/data_decoders/tf_example_decoder.py", line 128, in __init__
    label_handler = slim_example_decoder.BackupHandler(
AttributeError: 'module' object has no attribute 'BackupHandler'

@chihyuwang

@Sharathnasa

@chihyuwang After pip install tensorflow-gpu, the installed TensorFlow code does not contain the 'BackupHandler' class at all. We may need to build TensorFlow from source. I'm trying that now and will report back here.

@tuobay

tuobay commented Oct 31, 2017

I changed to the previous commit code, now it works.

@wpq3142

wpq3142 commented Oct 31, 2017

Temporary solution (on Ubuntu 16.04 with TensorFlow r1.4):

1: download the new tfexample_decoder.py (use the raw file URL; the /blob/ page URL returns an HTML page rather than the Python source):
$ cd $HOME
$ wget https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/contrib/slim/python/slim/data/tfexample_decoder.py

2: back up the installed file, then replace it:
$ cd /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/slim/python/slim/data
$ sudo cp ./tfexample_decoder.py tfexample_decoder-backup.py
$ sudo rm ./tfexample_decoder.py
$ sudo cp /home/wpq/tfexample_decoder.py ./tfexample_decoder.py

ok!
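The backup-and-replace step can be sketched in Python for anyone who prefers a script to the shell commands. The function name replace_with_backup and the ".bak" suffix are hypothetical. Overwriting files inside an installed package is fragile, so treat this as a stopgap under the same assumptions as the commands above.

```python
import shutil
from pathlib import Path


def replace_with_backup(target, replacement, backup_suffix=".bak"):
    """Copy target to target + backup_suffix, then overwrite target with replacement.

    Returns the path of the backup file.
    """
    target = Path(target)
    backup = target.with_name(target.name + backup_suffix)
    shutil.copy2(str(target), str(backup))       # keep the original file
    shutil.copy2(str(replacement), str(target))  # install the new file
    return backup
```

To undo the change, copy the backup file over the target again.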

@wpq3142

wpq3142 commented Oct 31, 2017

File "/home/wpq/models-master/research/object_detection/utils/variables_helper.py", line 122, in get_variables_available_in_checkpoint
ckpt_reader = tf.train.NewCheckpointReader(checkpoint_path)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 150, in NewCheckpointReader
return CheckpointReader(compat.as_bytes(filepattern), status)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))

Error:
tensorflow.python.framework.errors_impl.DataLossError:
Unable to open table file /home/wpq/data/potato/data/model.ckpt.data-00000-of-00001:
Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

Process finished with exit code 1

Please help: what is the cause, and how can I solve it?
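One likely cause of this "not an sstable (bad magic number)" message is that the checkpoint path points at the shard file (model.ckpt.data-00000-of-00001) instead of the checkpoint prefix (model.ckpt). The thread does not confirm this, so the sketch below is an assumption. The helper name checkpoint_prefix is hypothetical.

```python
import re

# Suffixes of the files that make up one checkpoint on disk.
_CHECKPOINT_SUFFIX = re.compile(r"\.(data-\d{5}-of-\d{5}|index|meta)$")


def checkpoint_prefix(path):
    """Strip a .data-XXXXX-of-XXXXX, .index, or .meta suffix, if present."""
    return _CHECKPOINT_SUFFIX.sub("", path)


print(checkpoint_prefix("/home/wpq/data/potato/data/model.ckpt.data-00000-of-00001"))
```

If the assumption holds, using the stripped path as the fine-tune checkpoint in the pipeline config should let the reader open the checkpoint.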

@Nilavro

Nilavro commented Oct 31, 2017

The class in tfexample_decoder.py is named TFExampleDecoder, while the constructor that calls it uses the name TfExampleDecoder.

@alberto139

Hi all,
I have the same problem. I installed TensorFlow using pip install tensorflow-gpu.
Does installing from source or checking out a previous commit solve the problem?
If so, how do I do that?

Thanks

@Nilavro

Nilavro commented Oct 31, 2017

The new versions have models/research/object_detection. They are still being updated. I am at least able to train with this repo.

https://github.com/tensorflow/models/tree/0375c800c767db2ef070cee1529d8a50f42d1042

The older version seems to work.

@Sharathnasa

Sharathnasa commented Oct 31, 2017

loss
@tombstone I was able to run the latest repo after installing TensorFlow from the source code of the master branch. But the loss looks really bad: it is 10 digits long. I will attach the screenshot.

@mrfortynine

mrfortynine commented Oct 31, 2017

wow, BackupHandler only exists in master, not even in Tensorflow-1.4. I know this is research code, but it would be great if we can stick with an official TF release, given that the latest supported TF version on Google Cloud is only 1.2.

@Adrrei

Adrrei commented Oct 31, 2017

To resolve this issue you'll have to update your TensorFlow version. You can install a nightly build here for GPU, or here for CPU.

Any TensorFlow version after October 24th 2017 should work (which is when this method was added).
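The date rule above can be checked against a nightly version string. This sketch assumes nightly versions end in a .devYYYYMMDD suffix, as in the wheel name tf_nightly_gpu-1.5.0.dev20171102 quoted in this thread. The helper names are hypothetical, and "after October 24th" is read strictly.

```python
import re
from datetime import date

# Date from the comment above; builds strictly after it are assumed to work.
METHOD_ADDED = date(2017, 10, 24)


def nightly_build_date(version):
    """Return the build date encoded in a .devYYYYMMDD suffix, or None."""
    match = re.search(r"\.dev(\d{4})(\d{2})(\d{2})$", version)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def is_new_enough_nightly(version):
    """True only for nightly versions built after METHOD_ADDED."""
    built = nightly_build_date(version)
    return built is not None and built > METHOD_ADDED


print(is_new_enough_nightly("1.5.0.dev20171102"))
print(is_new_enough_nightly("1.3.0"))
```

Release versions such as 1.3.0 carry no build date, so the helper reports False for them rather than guessing.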

@radzfoto
Author

Hello AlgorithmR: it works! The nightly build works with object_detection/train.py.
HOWEVER, I installed this in a clean ubuntu linux env, and now tensorboard is missing. Is there something different in the nightly builds that requires a separate installation for tensorboard? And if so, do I need to build tensorboard from source?
Thank you!

@scotthuang1989

@radzfoto I used this nightly build and that error is gone, but I get another one. From the message, TensorFlow can't restore the variables from the checkpoint. I use the SSD MobileNet v1 model that I downloaded from the model zoo. Which one do you use?

tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /home/scott/github/models/research/object_detection/ssd_mobilenet_v1_coco_11_06_2017/model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

@radzfoto
Author

hi @scotthuang1989: I use faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017. I re-downloaded it just in case, but it looks to be the same as before. I did not have problems with this frozen proto.

@Adrrei

Adrrei commented Nov 1, 2017

@radzfoto Tensorboard is not included in the nightly builds of TensorFlow. You can install it manually with pip like so: pip install tensorflow-tensorboard.

@wpq3142

wpq3142 commented Nov 1, 2017

@radzfoto What is your TensorFlow version?
Can you share the contents of your pipeline_config file?

my faster_rcnn_inception_resnet_v2_atrous_pets.config:

model {
  faster_rcnn {
    num_classes: 37
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_resnet_v2'
      first_stage_features_stride: 8
    }

@mpeniak

mpeniak commented Nov 2, 2017

I have the same issue. Has anyone found a solution yet? I tried going back to previous commits, but to no avail.

@Adrrei

Adrrei commented Nov 2, 2017

@mpeniak You'll have to update TensorFlow using their nightly builds as specified here: #2653 (comment).

Alternatively you could revert to an earlier version of object_detection, but I'm not entirely sure when this was added. Best guess would be that anything before October 27th is compatible with the main version of TF.

@radzfoto
Author

radzfoto commented Nov 2, 2017

@wpq3142 , my pipeline config is moderately complicated and very specific to my project, so sharing it wouldn't help you. As per the instructions from @AlgorithmR above, I installed the nightly build two nights ago. That solved the problem for me.

@gyan777

gyan777 commented Nov 2, 2017

Based on AlgorithmR's comment above, I downloaded one of the linux wheel files from today (11/02). But,

pip install tf_nightly_gpu-1.5.0.dev20171102-cp36-cp36m-manylinux1_x86_64.whl

failed with:

tf_nightly_gpu-1.5.0.dev20171102-cp36-cp36m-manylinux1_x86_64.whl is not a supported wheel on this platform.

I am on an x86 machine (not amd) running Ubuntu 16.04. Is this to be done differently?

@gyan777

gyan777 commented Nov 2, 2017

Oops! I hadn't paid attention to the fact that there are several wheel files for Linux whose names differ only slightly. This one worked for Ubuntu:
tf_nightly_gpu-1.5.0.dev20171102-cp27-cp27mu-manylinux1_x86_64.whl
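The two wheel names differ in their Python tag (cp36 versus cp27), which has to match the interpreter that runs pip. The sketch below splits a wheel filename into its last three tags and compares the first with the running interpreter. It assumes CPython and checks only the Python tag, not the ABI tag (for example cp27m versus cp27mu). The function names are hypothetical.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return tuple(stem.split("-")[-3:])


def interpreter_python_tag():
    """Return the CPython tag of the running interpreter, such as 'cp36'."""
    return "cp%d%d" % (sys.version_info[0], sys.version_info[1])


def python_tag_matches(filename):
    """True if the wheel's Python tag equals the running interpreter's tag."""
    return wheel_tags(filename)[0] == interpreter_python_tag()


name = "tf_nightly_gpu-1.5.0.dev20171102-cp36-cp36m-manylinux1_x86_64.whl"
print(wheel_tags(name), interpreter_python_tag(), python_tag_matches(name))
```

A mismatch here is consistent with the "is not a supported wheel on this platform" message reported above.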

@mrfortynine

If you need to run object detection on Google ML Engine, where only TF 1.2 is available, I have made a fork of this project and copied the missing handler code over from current master. mrfortynine@6b940fb

@tombstone
Contributor

#2692 should fix this.

@mghildiy

Hi, what do I need to do to solve this issue on a Windows machine?
I ran the command:
python train.py --logtostderr --train_dir=path-to-my-training-directory --pipeline_config_path=path-to-my-ssd_mobilenet_v1_pets.config

I get similar error:
label_handler = slim_example_decoder.BackupHandler(
AttributeError: module 'tensorflow.contrib.slim.python.slim.data.tfexample_decoder' has no attribute 'BackupHandler'

I am running the program in anaconda environment.

@RDaneelOlivav

Still getting this error with the latest version of the master branch.
Is there any other cause? It seems the fix was merged to master, but the error still happens here.

@salomeow

I'm getting the same error

  • windows 7
  • tensorflow 1.4
  • python 3.6
  • just checked out tensorflow/models today (last update Feb 10 2018)

Error message:

$ python train.py --logtostderr --train_dir=./models/train --pipeline_config_path=ssd_mobilenet_v1_coco.config
WARNING:tensorflow:From C:\Users\wyujia\AppData\Local\Continuum\anaconda3\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\trainer.py:228: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
Traceback (most recent call last):
  File "train.py", line 169, in <module>
    tf.app.run()
  File "C:\Users\wyujia\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 165, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "C:\Users\wyujia\AppData\Local\Continuum\anaconda3\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\trainer.py", line 235, in train
    train_config.prefetch_queue_capacity, data_augmentation_options)
  File "C:\Users\wyujia\AppData\Local\Continuum\anaconda3\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\trainer.py", line 59, in create_input_queue
    tensor_dict = create_tensor_dict_fn()
  File "train.py", line 122, in get_next
    worker_index=FLAGS.task)).get_next()
  File "C:\Users\wyujia\AppData\Local\Continuum\anaconda3\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\builders\dataset_builder.py", line 140, in build
    label_map_proto_file=label_map_proto_file)
  File "C:\Users\wyujia\AppData\Local\Continuum\anaconda3\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\data_decoders\tf_example_decoder.py", line 153, in __init__
    label_handler = slim_example_decoder.BackupHandler(
AttributeError: module 'tensorflow.contrib.slim.python.slim.data.tfexample_decoder' has no attribute 'BackupHandler'

Please help. Many thanks in advance.

@jhovell

jhovell commented Feb 16, 2018

Seems like the commit 7a9934d broke PR #2692

which was referenced earlier here by @tombstone and subsequently issue closed by @jch1

Should this issue be re-opened or should a new issue be created?

Reproducible with master of this repo, Google Cloud ML tensorflow 1.4. Compiled with Tensorflow 1.5 locally but failing in Google Cloud ML, which I would assume is stuck with tensorflow 1.4.

Is this essentially because the underlying TensorFlow issue was only recently fixed? If so, do Google Cloud ML users need to keep using the workaround provided by @mrfortynine?

@xiaoyongzhu

xiaoyongzhu commented Feb 23, 2018

@caroline6927 I can also confirm the same issue from my side. I have the latest tensorflow/model repo cloned and I am using Tensorflow 1.4.0 in the installation.

UPDATE: the OS is Ubuntu 16.04

@yogeshl18

Getting following error on GCP. It works locally. Any ideas ?

The replica master 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 167, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 163, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 235, in train
    train_config.prefetch_queue_capacity, data_augmentation_options)
  File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 59, in create_input_queue
    tensor_dict = create_tensor_dict_fn()
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 120, in get_next
    dataset_builder.build(config)).get_next()
  File "/root/.local/lib/python2.7/site-packages/object_detection/builders/dataset_builder.py", line 138, in build
    label_map_proto_file=label_map_proto_file)
  File "/root/.local/lib/python2.7/site-packages/object_detection/data_decoders/tf_example_decoder.py", line 162, in __init__
    label_handler = slim_example_decoder.BackupHandler(
AttributeError: 'module' object has no attribute 'BackupHandler'

The replica workers 0 through 6 each exited with a non-zero status of 1 (Termination reason: Error) and printed the same traceback.

To find out more about why your job exited please check the logs: https://console.cloud.google.com/logs/viewer?project=826960440479&resource=ml_job%2Fjob_id%2Fobject_detection_1520535841&advancedFilter=resource.type%3D%22ml_job%22%0Aresource.labels.job_id%3D%22object_detection_1520535841%22

@CHAMOD

CHAMOD commented Mar 17, 2018

In my case there were no files in the /usr/local/lib/python3.5/dist-packages directory,
so I assumed that TensorFlow had not been installed successfully.
I am working within an Anaconda environment, so I updated TensorFlow.
Here is the command that I used:

conda install -c conda-forge tensorflow

Problem solved!
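Before reinstalling, it can help to see which file the active interpreter would load for a package, since an Anaconda environment does not use /usr/local/lib/python3.5/dist-packages. A minimal sketch with the standard library follows. The helper name module_location is hypothetical.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints a path inside the active environment, or None if TensorFlow is missing.
    print(module_location("tensorflow"))
```

Comparing the printed path with the directory being patched or inspected shows whether both refer to the same installation.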

@yogeshl18

@CHAMOD I am facing problem on google cloud platform. No issues when I run locally.

@CHAMOD

CHAMOD commented Mar 19, 2018

@yogeshl18 I have no experience running it on Google Cloud Platform yet.

@itraaj

itraaj commented Mar 26, 2019

I am also getting this kind of error:

C:\tensorflow1\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
Traceback (most recent call last):
File "train.py", line 49, in <module>
from object_detection.builders import dataset_builder
File "C:\Users\irijdm\AppData\Local\Continuum\anaconda3\envs\tensorflow2\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\dataset_builder.py", line 27, in <module>
from object_detection.data_decoders import tf_example_decoder
File "C:\Users\irijdm\AppData\Local\Continuum\anaconda3\envs\tensorflow2\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\data_decoders\tf_example_decoder.py", line 27, in <module>
slim_example_decoder = tf.contrib.slim.tfexample_decoder

AttributeError: module 'tensorflow' has no attribute 'contrib'
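This last error differs from the original one: the whole tf.contrib module is missing, not just BackupHandler. tf.contrib was removed in TensorFlow 2.0, so one plausible cause is a 2.x install. The comment does not state the installed version, so that is an assumption. The sketch below classifies a version string. The helper name contrib_removed is hypothetical.

```python
def contrib_removed(version):
    """True if the version string belongs to TensorFlow 2.x or later.

    tf.contrib was dropped in the 2.0 release, so code that accesses
    tf.contrib needs a 1.x install.
    """
    major = int(version.split(".")[0])
    return major >= 2


print(contrib_removed("1.13.1"))
print(contrib_removed("2.0.0"))
```

The string to pass in is the value of tf.__version__ from the environment that runs train.py.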

@Innovator07

command- python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config

problem error - Traceback (most recent call last):
File "train.py", line 51, in <module>
from object_detection.builders import model_builder
File "/usr/local/lib/python2.7/dist-packages/object_detection-0.1-py2.7.egg/object_detection/builders/model_builder.py", line 20, in <module>
from object_detection.builders import anchor_generator_builder
File "/usr/local/lib/python2.7/dist-packages/object_detection-0.1-py2.7.egg/object_detection/builders/anchor_generator_builder.py", line 18, in <module>
from object_detection.anchor_generators import flexible_grid_anchor_generator
File "/usr/local/lib/python2.7/dist-packages/object_detection-0.1-py2.7.egg/object_detection/anchor_generators/flexible_grid_anchor_generator.py", line 19, in <module>
from object_detection.anchor_generators import grid_anchor_generator
File "/usr/local/lib/python2.7/dist-packages/object_detection-0.1-py2.7.egg/object_detection/anchor_generators/grid_anchor_generator.py", line 1
/#Copyright 2017 The TensorFlow Authors. All Rights Reserved.
^

I am getting this error. Can you all please help me with it?
