
Unable to migrate TF1 code to TF2, tape.gradient returns None #47387

@ghost

Description


System information

  • Have I written custom code: yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Big Sur
  • TensorFlow installed from (source or binary): binary, pip
  • TensorFlow version: v2.4.0-49-g85c8b2a817f (2.4.1)
  • Python version: 3.8.7

Describe the current behavior
I'm migrating code from TF1 that uses tf.gradients() to do a custom gradient calculation. I'm trying to reproduce the TF1 results in TF2 using tf.GradientTape(), but tape.gradient() returns None no matter what I try, including:

  • Using tape.watch() on the tensors involved (in the pattern shown in the toy example after this list); the issue persists.
  • Manually creating a tf.Variable() with trainable=True and watching it; the issue persists.
  • Using tf.gradients() inside a tf.function, as well as tf.compat.v1.gradients(), in case they behave differently; the issue persists.
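To rule out a misunderstanding on my side, here is the tape.watch() pattern I used, reduced to a toy case (x, y, z are made-up names, not from my notebook). It also shows the one way I know of to legitimately get None: leaving the tape by going through .numpy():

    import tensorflow as tf

    x = tf.constant(3.0)
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)                    # constants are not watched automatically
        y = x * x                        # recorded on the tape
        z = tf.constant(x.numpy() ** 2)  # leaves TF via numpy; not recorded

    print(tape.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)
    print(tape.gradient(z, x))  # None: no recorded path from x to z
    del tape  # persistent tapes should be released explicitly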

Here's a Jupyter notebook with the full code to reproduce the issue.

Here's the code I'm migrating; see lines 156-176. Below is the part of interest:

    g = tf.gradients(-loss, f)  # loss is a scalar, f is an (m, n) tensor; note tf.gradients returns a list
    k = -f_pol / (f + eps)  # f_pol is another (m, n) tensor, eps is a float
    k_dot_g = tf.reduce_sum(k * g, axis=-1)
    adj = tf.maximum(
        0.0,
        (k_dot_g - delta) / (tf.reduce_sum(tf.square(k), axis=-1) + eps),
    )
    g = g - tf.reshape(adj, [nenvs * nsteps, 1]) * k
    grads_f = -g / (nenvs * nsteps)
    grads_policy = tf.gradients(f, params, grads_f)  # params are the model parameters; grads_f is passed as grad_ys
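Note that the second tf.gradients() call passes grads_f as the grad_ys argument, i.e. it seeds the backward pass with custom upstream gradients. As far as I can tell, the TF2 counterpart is the output_gradients argument of tape.gradient(); here's a tiny sanity check of that correspondence (toy values and names of my own):

    import tensorflow as tf

    x = tf.Variable([[1.0, 2.0]])
    with tf.GradientTape() as tape:
        y = 3.0 * x                      # dy/dx is elementwise 3
    seed = tf.constant([[10.0, 100.0]])  # plays the role of grad_ys / grads_f
    # TF1 equivalent: tf.gradients(y, x, seed)
    print(tape.gradient(y, x, output_gradients=seed))  # [[ 30. 300.]]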

Here's a simplified version of what I'm trying to do:

    with tf.GradientTape() as tape:
        f = calculate_f()
        f_pol = calculate_f_pol()
        others = do_further_calculations()
        loss = calculate_loss()
    g = tape.gradient(-loss, f)
    print(g)

results in:

    None
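One thing I notice while writing this up: the negation in -loss happens outside the with block, and as far as I understand the tape only records operations executed inside its context. A variant that keeps the negation on the tape (same placeholder calculate_* helpers as above) would be:

    with tf.GradientTape() as tape:
        f = calculate_f()
        f_pol = calculate_f_pol()
        others = do_further_calculations()
        loss = calculate_loss()
        neg_loss = -loss  # negated inside the context so the op is recorded

    g = tape.gradient(neg_loss, f)  # equivalently: -tape.gradient(loss, f)
    print(g)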

Describe the expected behavior

As far as I understand, tf.GradientTape() is the TF2 replacement for tf.gradients(). I'm trying to replicate exactly the same results in TF2 and it doesn't work, which means either something is wrong with my code or this is a bug. I'm not the first person to run into this: I found numerous similar complaints, including closed issues, and none of them contains a solution I haven't already tried.
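For concreteness, here is the TF2 translation I would expect to be equivalent to the TF1 snippet above. This is a sketch, assuming f, f_pol, and loss are all computed under the same tape and that eps, delta, nenvs, nsteps, and params are defined as in the original; the calculate_* helpers are the same placeholders as in the simplified version:

    import tensorflow as tf

    with tf.GradientTape(persistent=True) as tape:  # persistent: gradient() is called twice
        f = calculate_f()        # (m, n) tensor, m = nenvs * nsteps
        f_pol = calculate_f_pol()
        loss = calculate_loss()  # scalar that depends on f
        neg_loss = -loss         # negated inside the context so it is recorded

    g = tape.gradient(neg_loss, f)
    k = -f_pol / (f + eps)
    adj = tf.maximum(
        0.0,
        (tf.reduce_sum(k * g, axis=-1) - delta)
        / (tf.reduce_sum(tf.square(k), axis=-1) + eps),
    )
    g = g - tf.reshape(adj, [nenvs * nsteps, 1]) * k
    grads_f = -g / (nenvs * nsteps)
    # output_gradients is the TF2 counterpart of tf.gradients' grad_ys
    grads_policy = tape.gradient(f, params, output_gradients=grads_f)
    del tape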

Standalone code to reproduce the issue

Jupyter notebook


Labels

TF 2.4 · comp:ops · type:bug
