TF 2.0: Cannot use recurrent_dropout with LSTMs/GRUs #29187

Closed
sbagroy986 opened this issue May 30, 2019 · 12 comments
Labels: comp:keras, TF 2.0, type:bug

Comments


sbagroy986 commented May 30, 2019

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No (a one-line modification to a stock example script)
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 14.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): tensorflow-gpu==2.0.0-alpha0 (also fails with every other tf 2.0 build I have explored)
  • Python version: 3.6
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: Tried multiple
  • GPU model and memory: Tried multiple

Describe the current behavior
The program crashes with a TypeError as below:

TypeError: An op outside of the function building code is being passed a "Graph" tensor. It is possible to have Graph tensors leak out of the function building context by including a tf.init_scope in your function building code. For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: encoder/unified_gru/ones_like:0

This occurs when trying to backpropagate gradients through an LSTM/GRU layer that has recurrent_dropout enabled.

Describe the expected behavior
No error

Code to reproduce the issue
Since this problem shows up at training time, one needs the entire training pipeline (dataset, model, etc.) set up to demonstrate the bug. I therefore used the Neural Machine Translation tutorial from TensorFlow and modified its model to include recurrent_dropout. The full code can be found in this Colab notebook; run the code blocks up to and including the training block to see the bug.
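For reference, a minimal standalone sketch of the failing pattern (a GRU with recurrent_dropout inside a @tf.function training step; the sizes, toy data, and loss below are illustrative placeholders, not the notebook's code) would look roughly like this:

import tensorflow as tf

# Toy data; shapes are arbitrary placeholders.
x = tf.random.normal((32, 10, 8))   # (batch, time, features)
y = tf.random.normal((32, 16))

gru = tf.keras.layers.GRU(16, recurrent_dropout=0.2)
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(inputs, targets):
    with tf.GradientTape() as tape:
        outputs = gru(inputs, training=True)
        loss = tf.reduce_mean(tf.square(outputs - targets))
    grads = tape.gradient(loss, gru.trainable_variables)
    optimizer.apply_gradients(zip(grads, gru.trainable_variables))
    return loss

train_step(x, y)  # on the affected builds this raises the TypeError above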

Other info / logs

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
      8 
      9   for (batch, (inp, targ)) in enumerate(dataset.take(steps_per_epoch)):
---> 10     batch_loss = train_step(inp, targ, enc_hidden)
     11     total_loss += batch_loss
     12 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
    436         # Lifting succeeded, so variables are initialized and we can run the
    437         # stateless function.
--> 438         return self._stateless_fn(*args, **kwds)
    439     else:
    440       canon_args, canon_kwds = self._canonicalize_function_inputs(args, kwds)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
   1286     """Calls a graph function specialized to the inputs."""
   1287     graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
-> 1288     return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
   1289 
   1290   @property

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _filtered_call(self, args, kwargs)
    572     """
    573     return self._call_flat(
--> 574         (t for t in nest.flatten((args, kwargs))
    575          if isinstance(t, (ops.Tensor,
    576                            resource_variable_ops.ResourceVariable))))

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _call_flat(self, args)
    625     # Only need to override the gradient in graph mode and when we have outputs.
    626     if context.executing_eagerly() or not self.outputs:
--> 627       outputs = self._inference_function.call(ctx, args)
    628     else:
    629       self._register_gradient()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in call(self, ctx, args)
    413             attrs=("executor_type", executor_type,
    414                    "config_proto", config),
--> 415             ctx=ctx)
    416       # Replace empty list with None
    417       outputs = outputs or None

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     68     if any(ops._is_keras_symbolic_tensor(x) for x in inputs):
     69       raise core._SymbolicException
---> 70     raise e
     71   # pylint: enable=protected-access
     72   return tensors

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     tensors = pywrap_tensorflow.TFE_Py_Execute(ctx._handle, device_name,
     59                                                op_name, inputs, attrs,
---> 60                                                num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: encoder/unified_gru/ones_like:0
achandraa self-assigned this May 31, 2019
achandraa added the 2.0.0-alpha0, comp:keras, and type:bug labels May 31, 2019
@achandraa

I have tried with TensorFlow version 2.0.0-alpha and was able to reproduce the issue.

jvishnuvardhan added the stat:awaiting tensorflower label May 31, 2019

qlzh727 commented Jun 3, 2019

Thanks for reporting the issue, let me take a look.

tensorflowbutler removed the stat:awaiting tensorflower label Jun 4, 2019
pull bot pushed a commit to Cache-Cloud/tensorflow that referenced this issue Jun 4, 2019
They were missing for the non-defun branch.

See tensorflow#29187 for more details.

PiperOrigin-RevId: 251442578

qlzh727 commented Jun 4, 2019

Thanks for reporting the issue; it should now be fixed by 180f28a

qlzh727 closed this as completed Jun 4, 2019


qlzh727 commented Jun 4, 2019

Btw, the current Colab might not apply the dropout correctly if you only enable dropout/recurrent_dropout on the GRU layer. Under the hood, the Keras layer checks whether the current context is training or inference, and only applies the dropout during training. If the GRU layer is used by a Keras model together with model.fit/eval/predict, the training context is applied correctly. However, if the user is writing their own custom training loop, the training context needs to be set manually, e.g. by:

tf.keras.backend.set_learning_phase(1)  # training
run_train_step()

tf.keras.backend.set_learning_phase(0)  # inference
run_eval_step()

The other alternative is to make sure the encoder/decoder's call() method is training-state aware, e.g. the method could take a new kwarg training=None and be passed different values during training and inference. The training value needs to be propagated to the GRU's call() method as well; see the sketch below.
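A rough sketch of that second approach (the class, attribute names, and sizes here are illustrative placeholders, not the tutorial's exact code):

import tensorflow as tf

class Encoder(tf.keras.Model):
    # Illustrative encoder; names and sizes are placeholders.
    def __init__(self, vocab_size, embedding_dim, enc_units):
        super(Encoder, self).__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(enc_units,
                                       return_sequences=True,
                                       return_state=True,
                                       recurrent_dropout=0.2)

    def call(self, x, hidden, training=None):
        x = self.embedding(x)
        # Forward the training flag so recurrent_dropout is applied only
        # during training and skipped at inference time.
        output, state = self.gru(x, initial_state=hidden, training=training)
        return output, state

# Inside the custom training loop:
#   enc_output, enc_hidden = encoder(inp, enc_hidden, training=True)
# and during evaluation/inference:
#   enc_output, enc_hidden = encoder(inp, enc_hidden, training=False)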


sbagroy986 commented Jun 6, 2019

@qlzh727: Thanks a ton for your help on this!

Quick follow-up: has this been fixed in the GPU version as well? I tried the (nightly) version from yesterday and it didn't seem to work.

sleighsoft pushed a commit to sleighsoft/tensorflow that referenced this issue Jun 12, 2019

Zoltrix commented Jul 31, 2019

The issue still persists in the beta release


njwfish commented Aug 12, 2019

I still have this issue in beta 2.0.0b1


qlzh727 commented Aug 12, 2019

For anyone still facing the issue, could you provide a snippet that reproduces it?


knobel-dk commented Aug 31, 2019

could you provide a snippet that reproduces it?

I see a similar error even without a GRU.

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.callbacks import ModelCheckpoint

# feature_columns, train_ds and val_ds are defined earlier in the notebook.
cp_callback = ModelCheckpoint(filepath="checkpoints/", save_weights_only=False, verbose=0, save_best_only=True)

feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

model = tf.keras.Sequential([
  feature_layer,
  layers.Dense(128, activation='relu'),
  layers.Dense(128, activation='relu'),
  layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'], run_eagerly=True)
model.fit(train_ds, validation_data=val_ds, epochs=5, callbacks=[cp_callback])


qlzh727 commented Sep 9, 2019

@knobel-dk, I am a bit confused by your message; this issue was about recurrent_dropout for the LSTM/GRU layer, but your code doesn't have any LSTM/GRU layer in it.

Could you be more specific about the error you are facing?

@knobel-dk

@knobel-dk, I am a bit confused by your message; this issue was about recurrent_dropout for the LSTM/GRU layer, but your code doesn't have any LSTM/GRU layer in it.

Could you be more specific about the error you are facing?

Thanks. Yes, I confused myself too (those stateful Jupyter notebooks...). I fixed my problem by updating the TF2 version. Thanks.

lvenugopalan added the TF 2.0 label Apr 29, 2020