
Precondition Error when is_training is set to false #17

Closed

TomRoussel opened this issue Aug 3, 2017 · 9 comments

Comments

@TomRoussel

I noticed that when the depth test graph is being built, the is_training argument for disp_net is not set to False. Won't this negatively affect test performance, since batch normalization won't be configured properly?

When setting this argument to False, an exception is raised (related to batch norm):

FailedPreconditionError: Attempting to use uninitialized value depth_net/upcnv3/BatchNorm/moving_mean
	 [[Node: depth_net/upcnv3/BatchNorm/moving_mean/read = Identity[T=DT_FLOAT, _class=["loc:@depth_net/upcnv3/BatchNorm/moving_mean"], _device="/job:localhost/replica:0/task:0/gpu:0"](depth_net/upcnv3/BatchNorm/moving_mean)]]
	 [[Node: depth_prediction/truediv/_131 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_459_depth_prediction/truediv", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

I get this when using the model provided by the "download_model.sh" script.
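For context, a minimal sketch of what the is_training flag controls, using slim's batch_norm directly (TF 1.x assumed; the names here are illustrative, not the repo's code):

```python
import tensorflow as tf
slim = tf.contrib.slim

# is_training=True: normalize with the current batch's statistics and
# update moving_mean/moving_variance. is_training=False: normalize with
# the stored moving statistics, which must exist in the checkpoint.
inputs = tf.placeholder(tf.float32, [None, 128, 416, 3])
train_out = slim.batch_norm(inputs, is_training=True, scope='bn')
test_out = slim.batch_norm(inputs, is_training=False, reuse=True, scope='bn')
```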

@tinghuiz
Owner

tinghuiz commented Aug 7, 2017

This is due to a bug in the training code: the saver is only defined to save the trainable_variables, which do not include the moving mean/variance for batch norm. I am planning to re-train the model with the proper batch norm configuration sometime.
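A minimal sketch of the bug and the fix (TF 1.x / slim assumed; variable names are illustrative):

```python
import tensorflow as tf
slim = tf.contrib.slim

# A small graph with a batch norm layer, just to populate the collections.
x = tf.placeholder(tf.float32, [None, 8, 8, 3])
net = slim.conv2d(x, 16, [3, 3], normalizer_fn=slim.batch_norm)

# Buggy: only trainable variables are saved. Batch norm's moving_mean and
# moving_variance live in GLOBAL_VARIABLES but are not trainable, so they
# are missing from the checkpoint and show up as uninitialized at test time.
saver_buggy = tf.train.Saver(tf.trainable_variables())

# Fixed: save all global variables so the moving statistics are checkpointed.
saver_fixed = tf.train.Saver(tf.global_variables())
```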

@tinghuiz tinghuiz closed this as completed Aug 7, 2017
@zhenheny

Hi, have you tried restoring the moving mean/variance while testing? I tried saving everything during training and restoring the BN parameters at test time, but got much worse results.

@tinghuiz
Owner

What were your batch norm hyperparameters? The TensorFlow default 'decay' for batch_norm (https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm) seems too high based on my preliminary experiments. I will update the code with a proper batch norm configuration soon (most likely within a week).

@zhenheny

@tinghuiz Thank you for the response. I used the default slim parameters (the same as in your code). With the default setting, the decay is 0.999.

@tinghuiz
Owner

From some online discussion of the batch_norm layer, a decay of 0.999 is not desirable for relatively small-scale problems (i.e., problems that don't require millions of training steps). Can you try a smaller decay such as 0.9 or 0.95 and see if that helps?
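For example, a hedged sketch of overriding the decay through normalizer_params (TF 1.x / slim assumed; layer shapes are illustrative):

```python
import tensorflow as tf
slim = tf.contrib.slim

# Lower the batch norm decay from the slim default of 0.999. With fewer
# training steps, the moving averages track the data statistics faster.
batch_norm_params = {'decay': 0.9, 'is_training': True}
x = tf.placeholder(tf.float32, [None, 128, 416, 3])
net = slim.conv2d(x, 32, [7, 7],
                  normalizer_fn=slim.batch_norm,
                  normalizer_params=batch_norm_params)
```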

@zhenheny

I will try. One more question about BN: for train_op, only trainable_variables are fed into the optimizer. My reading of the docs is that the BN parameters are not in the trainable_variables list but in the global_variables list. Do the BN mean and variance change at all if train_op only applies to trainable_vars?

@tinghuiz
Owner

tinghuiz commented Aug 22, 2017

Good point. You should replace it with something like `self.train_op = slim.learning.create_train_op(total_loss, optim)`.
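A minimal self-contained sketch of the suggestion (TF 1.x / slim assumed; the loss and shapes are placeholders): create_train_op attaches the UPDATE_OPS collection, which holds the batch norm moving-average update ops, so the statistics change during training even though they are not trainable.

```python
import tensorflow as tf
slim = tf.contrib.slim

# Toy network with batch norm, plus a placeholder loss.
x = tf.placeholder(tf.float32, [None, 8, 8, 3])
labels = tf.placeholder(tf.float32, [None, 8, 8, 16])
net = slim.conv2d(x, 16, [3, 3], normalizer_fn=slim.batch_norm)
total_loss = tf.reduce_mean(tf.square(net - labels))

optim = tf.train.AdamOptimizer(learning_rate=1e-4)
# Runs the UPDATE_OPS (moving mean/variance updates) alongside the gradient step.
train_op = slim.learning.create_train_op(total_loss, optim)

# Equivalent manual alternative without slim's helper:
# update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
# with tf.control_dependencies(update_ops):
#     train_op = optim.minimize(total_loss)
```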

@tinghuiz
Owner

I have removed batch_norm altogether in the latest update.

@offbye

offbye commented Dec 6, 2018

How do you save moving_mean and moving_variance alongside the trainable_variables? Has this been solved?
