An error occurred while training #9

Closed
HONGJUNRL opened this issue May 29, 2018 · 6 comments

@HONGJUNRL

I used the pre-trained vgg16 model you provided, but an error occurred:
INFO:tensorflow:Fine-tuning from None. Ignoring missing vars: True.

Traceback (most recent call last):
  File "train_ssd.py", line 464, in <module>
    tf.app.run()
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "train_ssd.py", line 460, in main
    hooks=[logging_hook], max_steps=FLAGS.max_number_of_steps)
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 352, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 812, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 793, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/estimator/python/estimator/replicate_model_fn.py", line 220, in single_device_model_fn
    local_ps_devices=ps_devices)[0]  # One device, so one spec is out.
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/estimator/python/estimator/replicate_model_fn.py", line 558, in _get_loss_towers
    **optional_params)
  File "train_ssd.py", line 403, in ssd_model_fn
    scaffold=tf.train.Scaffold(init_fn=get_init_fn()))
  File "train_ssd.py", line 158, in get_init_fn
    name_remap={'/kernel': '/weights', '/bias': '/biases'})
  File "/home/h/Desktop/SSD.TensorFlow-master/utility/scaffolds.py", line 66, in get_init_fn_for_scaffold
    reader = tf.train.NewCheckpointReader(checkpoint_path)
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 254, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "/home/h/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 67, in as_bytes
    (bytes_or_text,))
TypeError: Expected binary or unicode string, got None

Could you help me?

@HiKapok
Owner

HiKapok commented May 29, 2018

@HONGJUNRL please check the older closed issues for a solution. If the problem persists, please let me know.

@HONGJUNRL
Author

I have already checked them, but I still can't solve it.

@HONGJUNRL
Author

I have solved it, but a new error occurred: "NaN loss during training". What could be causing it?

@HiKapok
Owner

HiKapok commented May 29, 2018

@HONGJUNRL sorry for the late reply. How did you solve that? Were you giving just the directory path of the checkpoint? Did you give the full file path of the checkpoint in the beginning? Before we look into the NaN problem, you should first check your batch size. If you have changed your batch size, lower the learning rate and run more steps.
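
One common way to apply the advice above is the linear scaling heuristic: scale the learning rate in proportion to the batch size and stretch the step count so the model sees roughly the same number of samples. The sketch below is only an illustration of that heuristic, not this repository's method; `rescale_schedule` and all the numbers in it are hypothetical.

```python
def rescale_schedule(base_lr, base_batch_size, base_steps, new_batch_size):
    """Linear scaling heuristic (an assumption, not taken from this repository).

    Keeps lr / batch_size constant and keeps the total number of samples
    seen (steps * batch_size) roughly constant.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    ratio = new_batch_size / base_batch_size
    new_lr = base_lr * ratio                    # smaller batch -> smaller learning rate
    new_steps = int(round(base_steps / ratio))  # smaller batch -> more steps
    return new_lr, new_steps


# Hypothetical reference settings; substitute the defaults of your own run.
lr, steps = rescale_schedule(
    base_lr=1e-3, base_batch_size=32, base_steps=120000, new_batch_size=8
)
print(lr, steps)
```

If the loss still becomes NaN after rescaling, lowering the learning rate further is a reasonable next experiment.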

@HONGJUNRL
Author

I changed this line, 'tf.app.flags.DEFINE_string('checkpoint_path', './model', 'The path to a checkpoint from which to fine-tune.')', to 'tf.app.flags.DEFINE_string('checkpoint_path', './model/vgg16.ckpt', 'The path to a checkpoint from which to fine-tune.')'. I also changed the learning rate to 1e-4, and it has worked so far.
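
The fix above suggests the flag has to name the checkpoint file itself rather than the directory that contains it. A minimal sketch of a check that would surface this before training starts is shown below. `resolve_checkpoint_prefix` is a hypothetical helper, not part of this repository or of TensorFlow, and it uses only the standard library. It assumes a checkpoint is either a single file or a prefix with a matching `.index` file.

```python
import os


def resolve_checkpoint_prefix(path):
    """Return a checkpoint path/prefix, or raise a readable error.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    if path is None:
        raise ValueError("checkpoint_path is None")
    # A single-file checkpoint, or a prefix that has a '<prefix>.index' file.
    if os.path.isfile(path) or os.path.isfile(path + ".index"):
        return path
    if os.path.isdir(path):
        candidates = set()
        for name in os.listdir(path):
            if name.endswith(".ckpt"):
                candidates.add(name)
            elif name.endswith(".index"):
                candidates.add(name[: -len(".index")])
        if len(candidates) == 1:
            return os.path.join(path, candidates.pop())
        raise ValueError(
            "%r is a directory with %d checkpoint candidates %s; "
            "pass the full checkpoint path instead"
            % (path, len(candidates), sorted(candidates))
        )
    raise ValueError("no checkpoint found at %r" % (path,))
```

Calling it on the flag value before building the scaffold, for example `resolve_checkpoint_prefix('./model')`, would either return a path such as `./model/vgg16.ckpt` or fail with a message that names the directory, instead of passing `None` further down.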

@HONGJUNRL
Author

Lastly, thanks for helping me.

HiKapok closed this as completed May 29, 2018