How to periodically evaluate the performance of models in TF-Slim? #13769

Closed
ghost opened this issue Oct 17, 2017 · 7 comments

ghost commented Oct 17, 2017

I am trying to use DenseNet for a regression problem with TF-Slim. My data contains 60000 JPEG images with 37 float labels per image. I split the data into three tfrecords files: a train set (60%), a validation set (20%) and a test set (20%).

I need to evaluate the validation set during the training loop and produce a plot like the one in the image.
The TF-Slim documentation only explains the training loop and the evaluation loop separately, so I can evaluate the validation or test set only after the training loop has finished. As I said, I need to evaluate during training.
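One common way to get a validation curve during training is to interleave training and evaluation in a single loop. The sketch below is framework-agnostic and is not part of the TF-Slim API: `train_step` and `evaluate` are hypothetical callables standing in for whatever runs one training step and one pass over the validation set.

```python
def train_with_periodic_eval(train_step, evaluate, num_steps, eval_every):
    """Run `num_steps` training steps, evaluating every `eval_every` steps.

    Returns a list of (step, train_loss, validation_loss) tuples that can be
    plotted once training has finished.
    """
    history = []
    for step in range(1, num_steps + 1):
        train_loss = train_step(step)
        if step % eval_every == 0:
            history.append((step, train_loss, evaluate(step)))
    return history


# Toy stand-ins (hypothetical) so the sketch runs without TensorFlow.
def fake_train_step(step):
    return 1.0 / step


def fake_evaluate(step):
    return 1.0 / step + 0.1


history = train_with_periodic_eval(
    fake_train_step, fake_evaluate, num_steps=6, eval_every=2)
```

The resulting `history` list holds one row per evaluation and can be handed to any plotting library.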

I tried using the slim.evaluation.evaluation_loop function instead of slim.evaluation.evaluate_once, but it did not help.

slim.evaluation.evaluation_loop(
    master=FLAGS.master,
    checkpoint_dir=checkpoint_path,
    logdir=FLAGS.eval_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()) + print_ops,
    variables_to_restore=variables_to_restore,
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)

I tried evaluation.evaluate_repeatedly as well.

from tensorflow.contrib.training.python.training import evaluation

evaluation.evaluate_repeatedly(
    master=FLAGS.master,
    checkpoint_dir=checkpoint_path,
    eval_ops=list(names_to_updates.values()) + print_ops,
    eval_interval_secs=eval_interval_secs)

Both functions read only the latest available checkpoint from checkpoint_dir and then appear to wait for the next one. However, when new checkpoints are generated, no further evaluation runs.

I use Python 2.7.13 and Tensorflow 1.3.0 on CPU.

Any help will be highly appreciated.

Contributor

drpngx commented Oct 17, 2017

@sguada could you take a look?


svarjo commented Oct 19, 2017

rasorensen90 commented

I have just encountered the same problem. In my case the error was caused by the following:
if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
    checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
else:
    checkpoint_path = FLAGS.checkpoint_path

I solved it by simply setting
checkpoint_path = FLAGS.checkpoint_path

It seems that slim.evaluation.evaluation_loop needs the directory that contains the checkpoint files, whereas evaluate_once needs the latest checkpoint file, which makes sense.
I hope it helps.
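The directory-versus-file distinction can be illustrated without TensorFlow. The sketch below is a simplified stand-in, not TF-Slim's actual implementation: it assumes the training directory holds a `checkpoint` state file with a `model_checkpoint_path: "..."` line, and the helper names `latest_checkpoint`, `new_checkpoint_or_none` and `demo` are hypothetical. It suggests why a polling loop that is handed a checkpoint file prefix instead of the directory would never see a new checkpoint.

```python
import os
import re
import tempfile


def latest_checkpoint(checkpoint_dir):
    """Rough stand-in for a latest-checkpoint lookup inside a directory.

    Reads the `checkpoint` state file and returns the newest checkpoint
    prefix, or None if no state file is found under `checkpoint_dir`.
    """
    state_file = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None
    with open(state_file) as f:
        for line in f:
            match = re.match(r'^model_checkpoint_path:\s*"(.*)"\s*$', line)
            if match:
                return os.path.join(checkpoint_dir, match.group(1))
    return None


def new_checkpoint_or_none(checkpoint_dir, last_seen):
    """Return a checkpoint newer than `last_seen`, or None to keep waiting."""
    latest = latest_checkpoint(checkpoint_dir)
    if latest is None or latest == last_seen:
        return None
    return latest


def demo():
    train_dir = tempfile.mkdtemp()

    def write_state(name):
        # Simulate the trainer recording its newest checkpoint.
        with open(os.path.join(train_dir, "checkpoint"), "w") as f:
            f.write('model_checkpoint_path: "%s"\n' % name)

    write_state("model.ckpt-100")
    first = latest_checkpoint(train_dir)

    write_state("model.ckpt-200")
    from_dir = new_checkpoint_or_none(train_dir, first)   # polled the directory
    from_prefix = new_checkpoint_or_none(first, first)    # polled a file prefix
    return from_dir, from_prefix
```

Polling the directory picks up the second checkpoint, while polling the file prefix keeps returning None, which would look like an evaluation loop that waits forever.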

Author

ghost commented Oct 25, 2017

@rasorensen90
Yes, you are right. With this code, when the folder in FLAGS.checkpoint_path is still empty and the evaluation is waiting for new checkpoints, tf.train.latest_checkpoint returns None, so checkpoint_path ends up as None:
if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
    checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
else:
    checkpoint_path = FLAGS.checkpoint_path

I added this check and it works fine:

if tf.gfile.IsDirectory(FLAGS.checkpoint_path):
    if tf.train.latest_checkpoint(FLAGS.checkpoint_path):
        checkpoint_path = tf.train.latest_checkpoint(FLAGS.checkpoint_path)
    else:
        checkpoint_path = FLAGS.checkpoint_path
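The same fallback can be written as a small helper that looks up the latest checkpoint only once. This is a sketch under stated assumptions rather than the code from the original script: `resolve_checkpoint_path` is a hypothetical name, and the two lookups are passed in as callables so the logic can be exercised without TensorFlow. It also returns the path unchanged when it is not a directory, as in the earlier snippet in this thread.

```python
def resolve_checkpoint_path(path, is_directory, latest_checkpoint):
    """Pick the checkpoint path to use for evaluation.

    `is_directory` and `latest_checkpoint` are injected callables
    (hypothetical parameters); in TF 1.x they would typically be
    tf.gfile.IsDirectory and tf.train.latest_checkpoint.
    """
    if not is_directory(path):
        # Already a checkpoint file prefix: use it as given.
        return path
    latest = latest_checkpoint(path)
    # Fall back to the directory itself while no checkpoint exists yet,
    # instead of returning None.
    return latest if latest else path


# Toy checks with stand-in callables (no TensorFlow needed).
empty_dir = resolve_checkpoint_path(
    "/tmp/train", lambda p: True, lambda p: None)
with_ckpt = resolve_checkpoint_path(
    "/tmp/train", lambda p: True, lambda p: p + "/model.ckpt-100")
plain_file = resolve_checkpoint_path(
    "/tmp/train/model.ckpt-100", lambda p: False, lambda p: None)
```

With TensorFlow 1.x available, the call would presumably look like `resolve_checkpoint_path(FLAGS.checkpoint_path, tf.gfile.IsDirectory, tf.train.latest_checkpoint)`.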

Member

tensorflowbutler commented

It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

Contributor

drpngx commented Jan 4, 2018

Looks like @Ellie68 posted a resolution.

drpngx closed this as completed Jan 4, 2018