Description
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No (a one-line modification to a stock example)
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 14.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): tensorflow-gpu==2.0.0-alpha0 (it also fails with every other TF 2.0 build I have tried)
- Python version: 3.6
- Bazel version (if compiling from source): N/A
- GCC/Compiler version (if compiling from source): N/A
- CUDA/cuDNN version: Tried multiple
- GPU model and memory: Tried multiple
Describe the current behavior
The program crashes with the following TypeError:
```
TypeError: An op outside of the function building code is being passed a
"Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: encoder/unified_gru/ones_like:0
```
The error occurs when backpropagating gradients through an LSTM/GRU layer that has `recurrent_dropout` enabled.
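For context, the general pattern that triggers it is roughly the following. This is only a minimal sketch (the layer sizes, input shapes, and dropout rate are illustrative, not taken from the tutorial); the full failing pipeline is in the notebook linked below:

```python
import tensorflow as tf

# Minimal sketch: backprop through a GRU with recurrent_dropout inside a
# tf.function-decorated training step. Sizes, shapes, and the dropout rate
# are illustrative.
gru = tf.keras.layers.GRU(8, recurrent_dropout=0.2, return_sequences=True)
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        out = gru(x, training=True)                # recurrent dropout active
        loss = tf.reduce_mean(tf.square(out - y))
    # Backprop through the recurrent-dropout GRU is where the TypeError surfaces.
    grads = tape.gradient(loss, gru.trainable_variables)
    optimizer.apply_gradients(zip(grads, gru.trainable_variables))
    return loss

x = tf.random.normal((4, 10, 3))   # (batch, time, features)
y = tf.random.normal((4, 10, 8))
train_step(x, y)
```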
Describe the expected behavior
No error; the training step should run to completion.
Code to reproduce the issue
Since the problem only shows up during training, the entire training pipeline (dataset, model, etc.) needs to be set up to demonstrate the bug. I therefore took the Neural Machine Translation tutorial from TensorFlow and modified its model to include `recurrent_dropout`. The entire code can be found in this Colab notebook; run the code blocks up to and including the training block to see the bug.
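For reference, the modification is along these lines: a sketch of the tutorial's Encoder with `recurrent_dropout` added to the GRU (the dropout rate shown here is illustrative; the exact value is in the notebook):

```python
import tensorflow as tf

class Encoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):
        super(Encoder, self).__init__()
        self.batch_sz = batch_sz
        self.enc_units = enc_units
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(self.enc_units,
                                       return_sequences=True,
                                       return_state=True,
                                       recurrent_initializer='glorot_uniform',
                                       recurrent_dropout=0.2)  # <- the one-line change

    def call(self, x, hidden):
        x = self.embedding(x)
        output, state = self.gru(x, initial_state=hidden)
        return output, state

    def initialize_hidden_state(self):
        return tf.zeros((self.batch_sz, self.enc_units))
```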
Other info / logs
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in ()
      8 
      9   for (batch, (inp, targ)) in enumerate(dataset.take(steps_per_epoch)):
---> 10     batch_loss = train_step(inp, targ, enc_hidden)
     11     total_loss += batch_loss
     12 

6 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
    436       # Lifting succeeded, so variables are initialized and we can run the
    437       # stateless function.
--> 438       return self._stateless_fn(*args, **kwds)
    439     else:
    440       canon_args, canon_kwds = self._canonicalize_function_inputs(args, kwds)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
   1286     """Calls a graph function specialized to the inputs."""
   1287     graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
-> 1288     return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
   1289 
   1290   @property

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _filtered_call(self, args, kwargs)
    572     """
    573     return self._call_flat(
--> 574         (t for t in nest.flatten((args, kwargs))
    575          if isinstance(t, (ops.Tensor,
    576                            resource_variable_ops.ResourceVariable))))

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _call_flat(self, args)
    625     # Only need to override the gradient in graph mode and when we have outputs.
    626     if context.executing_eagerly() or not self.outputs:
--> 627       outputs = self._inference_function.call(ctx, args)
    628     else:
    629       self._register_gradient()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in call(self, ctx, args)
    413               attrs=("executor_type", executor_type,
    414                      "config_proto", config),
--> 415               ctx=ctx)
    416       # Replace empty list with None
    417       outputs = outputs or None

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     68         if any(ops._is_keras_symbolic_tensor(x) for x in inputs):
     69           raise core._SymbolicException
---> 70         raise e
     71   # pylint: enable=protected-access
     72   return tensors

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     tensors = pywrap_tensorflow.TFE_Py_Execute(ctx._handle, device_name,
     59                                                op_name, inputs, attrs,
---> 60                                                num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

TypeError: An op outside of the function building code is being passed a
"Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: encoder/unified_gru/ones_like:0
```