System information
- What is the top-level directory of the model you are using: models/research
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): tf_nightly_gpu-1.5.0.dev20171031-cp35-cp35m-manylinux1_x86_64
- Bazel version (if compiling from source):
- CUDA/cuDNN version: CUDA 8.0
- GPU model and memory: GTX 1060 / 6GB
- Exact command to reproduce:
python object_detection/train.py --logtostderr --pipeline_config_path=/home/scott/github/models/research/object_detection/samples/configs/ssd_mobilenet_v1_pets.config --train_dir=./ssd_mobile_v1_pets_retrain
You can collect some of this information using our environment capture script:
https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh
Describe the problem
When I run the command above, I get the following errors (most of the traceback is omitted for brevity):
  File "/home/scott/github/models/research/object_detection/utils/variables_helper.py", line 122, in get_variables_available_in_checkpoint
    ckpt_reader = tf.train.NewCheckpointReader(checkpoint_path)
  File "/home/scott/anaconda3/envs/tfgpu/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 195, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "/home/scott/anaconda3/envs/tfgpu/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /home/scott/github/models/research/object_detection/ssd_mobilenet_v1_coco_11_06_2017/model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
(tfgpu) scott@scott-b250:~/github/models/research$
From the log, it seems the checkpoint file is not in the expected format, but I downloaded it from the model zoo. I also tried another model (faster_rcnn_resnet101_coco_11_06_2017) and got the same error.
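One thing that may be worth ruling out: the error names model.ckpt.data-00000-of-00001 as the file being opened, which suggests the path handed to tf.train.NewCheckpointReader (presumably from fine_tune_checkpoint in the pipeline config, which is not shown here) ended in that shard file rather than in the checkpoint prefix model.ckpt. For checkpoints stored as .index plus .data-* files, the reader is normally given the prefix. Below is a minimal sketch of a check for this, assuming that is the cause; checkpoint_prefix and has_index_file are hypothetical helper names, not part of TensorFlow or the object detection code.

```python
import os
import re

# Suffixes of the files usually found next to a checkpoint prefix:
# the data shards, the index, and the exported meta graph.
_CKPT_FILE_SUFFIX = re.compile(r"\.(data-\d{5}-of-\d{5}|index|meta)$")


def checkpoint_prefix(path):
    """Hypothetical helper: strip a checkpoint file suffix, leaving the prefix."""
    return _CKPT_FILE_SUFFIX.sub("", path)


def has_index_file(prefix):
    """Return True if '<prefix>.index' exists on disk."""
    return os.path.isfile(prefix + ".index")


if __name__ == "__main__":
    # Path taken from the error message above.
    path = ("/home/scott/github/models/research/object_detection/"
            "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt.data-00000-of-00001")
    prefix = checkpoint_prefix(path)
    print(prefix)                  # path ending in .../model.ckpt
    print(has_index_file(prefix))  # True only if model.ckpt.index is present
    # With TensorFlow 1.x installed, the prefix could then be tried directly:
    # reader = tf.train.NewCheckpointReader(prefix)
```

If the prefix loads but the shard path does not, the downloaded files are probably fine and only the configured path needs changing. If the prefix fails as well, the download itself may be incomplete or corrupted.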