
tf.contrib.estimator.replicate_model_fn fails when a trainable variable doesn't have gradient #16829

Closed
x10000year opened this issue Feb 7, 2018 · 4 comments

@x10000year

tf.contrib.estimator.replicate_model_fn fails when the gradient of a trainable variable is None (for example, when a trainable variable does not contribute to the loss). The error messages are below, followed by a minimal sketch of the failure mode:

  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 302, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 711, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 694, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/estimator/python/estimator/replicate_model_fn.py", line 235, in replicated_model_fn
    local_ps_devices=ps_devices)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/estimator/python/estimator/replicate_model_fn.py", line 558, in _get_loss_towers
    **optional_params)
  File "model-60m-1280-2gpus-16-32-64-128-bn50000/net.py", line 38, in model_fn
    train_op = optimizer.minimize(model.total_loss, global_step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 353, in minimize
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/estimator/python/estimator/replicate_model_fn.py", line 317, in apply_gradients
    with ops_lib.control_dependencies(_extract_tensors(grads_and_vars)):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 4304, in control_dependencies
    return get_default_graph().control_dependencies(control_inputs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 4017, in control_dependencies
    c = self.as_graph_element(c)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3035, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3124, in _as_graph_element_locked
    types_str))
TypeError: Can not convert a NoneType into a Tensor or Operation.
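
For reference, here is a minimal sketch of the failure mode together with a possible (untested) workaround, assuming the TF 1.5+ contrib API with TowerOptimizer. The toy model, variable names, and the filtering step are illustrative, not the original code:

```python
# Minimal sketch (illustrative, not the original model code): a trainable
# variable that never feeds into the loss gets a None gradient, which
# TowerOptimizer.apply_gradients then passes into control_dependencies.
import tensorflow as tf

def model_fn(features, labels, mode):
  # Trainable variable with no path to the loss -> its gradient is None.
  tf.get_variable('unused', shape=[10])

  logits = tf.layers.dense(features['x'], 1)
  loss = tf.losses.mean_squared_error(labels, logits)

  optimizer = tf.contrib.estimator.TowerOptimizer(
      tf.train.GradientDescentOptimizer(0.1))

  # optimizer.minimize(loss, global_step) fails as in the traceback above.
  # Possible (untested) workaround until the fix lands: drop the
  # (None, variable) pairs before calling apply_gradients.
  grads_and_vars = optimizer.compute_gradients(loss)
  grads_and_vars = [(g, v) for g, v in grads_and_vars if g is not None]
  train_op = optimizer.apply_gradients(
      grads_and_vars, global_step=tf.train.get_global_step())

  return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(
    model_fn=tf.contrib.estimator.replicate_model_fn(model_fn))
```

Dropping the None gradients keeps None out of the control_dependencies call, which is where the TypeError above is raised.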
@tensorflowbutler added the stat:awaiting response label Feb 7, 2018
@tensorflowbutler
Member

Thank you for your post. We noticed you have not filled out the following fields in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

@x10000year
Author

@isaprykin

@isaprykin self-assigned this Feb 12, 2018
@isaprykin added the type:bug and stat:awaiting response labels and removed the stat:awaiting response label Feb 12, 2018
@isaprykin
Contributor

@x10000year Hi! I fixed this yesterday and the fix is coming. Thanks for reporting.
I'll link the commit when it's available and then close the issue. I hope you can rebuild or take the nightly build to take advantage of the fix.

@x10000year
Author

Thank you!
