---
title: Tensorflow GradientTape
tags: 小书匠,Tensorflow,GradientTape
grammar_cjkRuby: true
# renderNumberedHeading: true
---

[toc!]

# TensorFlow GradientTape

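## Basic usage

### With tf.constant

A tensor created with `tf.constant` is not tracked by the tape automatically. To compute a gradient with respect to it, call `tape.watch` on it inside the tape; without the `watch` call, `tape.gradient` returns `None`.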

```py
import tensorflow as tf

x = tf.constant(3.0)

# x is a constant and is not watched, so nothing is recorded for it.
with tf.GradientTape() as tape:
    y = x * x
dx = tape.gradient(y, x)
print(dx)  # None
```


```py
# x is the tf.constant(3.0) defined above.
with tf.GradientTape() as tape:
    tape.watch(x)  # explicitly ask the tape to track the constant
    y = x * x
dx = tape.gradient(y, x)
print(dx)  # tf.Tensor(6.0, shape=(), dtype=float32)
```
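The same `watch` call extends to several tensors at once. The sketch below is illustrative only: the names `a`, `b`, `da`, and `db` are not from the original notebook. It assumes that `tape.watch` accepts a list of tensors and that `tape.gradient` returns one gradient per source when given a list.

```python
import tensorflow as tf

# Two constants; neither is tracked unless the tape is told to watch it.
a = tf.constant(2.0)
b = tf.constant(5.0)

with tf.GradientTape() as tape:
    tape.watch([a, b])  # watch both constants in one call
    y = a * a * b       # y = a^2 * b

# A list of sources yields a list of gradients in the same order.
da, db = tape.gradient(y, [a, b])
print(da)  # dy/da = 2 * a * b = 20.0
print(db)  # dy/db = a^2 = 4.0
```

Requesting both gradients in a single `gradient` call keeps this within the one-call limit of a non-persistent tape.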


### With tf.Variable

A tensor created with `tf.Variable` is added to the tape's list of watched variables automatically, so no `watch` call is needed.

```py
x = tf.Variable(3.0)

# Variables are watched automatically; no tape.watch(x) is required.
with tf.GradientTape() as tape:
    y = x * x
dx = tape.gradient(y, x)
print(dx)  # tf.Tensor(6.0, shape=(), dtype=float32)
```
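One caveat worth checking, as far as I know: automatic watching applies to trainable variables only. The sketch below uses illustrative names (`x_trainable`, `x_frozen`) that are not in the original notebook, and relies on `tape.watched_variables()` to list what the tape tracked.

```python
import tensorflow as tf

x_trainable = tf.Variable(3.0)
x_frozen = tf.Variable(3.0, trainable=False)  # not watched automatically

with tf.GradientTape() as tape:
    y = x_trainable * x_trainable + x_frozen * x_frozen

# Only the trainable variable should appear among the watched variables.
watched = tape.watched_variables()
d_trainable, d_frozen = tape.gradient(y, [x_trainable, x_frozen])

print(len(watched))  # 1
print(d_trainable)   # 2 * 3.0 = 6.0
print(d_frozen)      # None, because x_frozen was never watched
```

If a gradient for a non-trainable variable is needed, calling `tape.watch` on it explicitly should work the same way as for a constant.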


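Note that, apart from creating the variable itself, every operation on the path from the variable to the result must run inside the `GradientTape` context. Operations executed outside the tape are not recorded, so the gradient comes back as `None`.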

```py
x1 = tf.Variable(3.0)
x2 = x1 + 2  # computed outside the tape, so this step is not recorded
with tf.GradientTape() as tape:
    y = x2 * x2
dx = tape.gradient(y, x1)
print(dx)  # None
```
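For contrast, here is a minimal sketch of the same computation with `x2 = x1 + 2` moved inside the tape. The variable names follow the cell above; the expected value comes from differentiating `(x1 + 2)^2` by hand.

```python
import tensorflow as tf

x1 = tf.Variable(3.0)

with tf.GradientTape() as tape:
    x2 = x1 + 2   # now recorded, so the tape can trace y back to x1
    y = x2 * x2
dx = tape.gradient(y, x1)
print(dx)  # dy/dx1 = 2 * (x1 + 2) = 10.0
```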


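## The persistent parameter

By default, a tape can be used to compute gradients only once. A second call to `gradient` on the same tape raises an error.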

```py
x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    y = x * x
dx = tape.gradient(y, x)
# The second call fails with a RuntimeError:
# GradientTape.gradient can only be called once on non-persistent tapes.
new_dx = tape.gradient(y, x)
print(dx)      # never reached
print(new_dx)  # never reached
```

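Passing `persistent=True` makes the tape keep what it recorded, so `gradient` can be called on it multiple times.

Note: a persistent tape is not released after the first `gradient` call, so release it manually with `del tape` once you are done with it.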


```py
x = tf.Variable(3.0)

with tf.GradientTape(persistent=True) as tape:
    y = x * x
dx = tape.gradient(y, x)
new_dx = tape.gradient(y, x)  # a second call is allowed on a persistent tape
print(dx)      # tf.Tensor(6.0, shape=(), dtype=float32)
print(new_dx)  # tf.Tensor(6.0, shape=(), dtype=float32)
del tape  # release the tape manually
```
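A persistent tape is more useful when the repeated calls ask for different gradients. The sketch below is illustrative: the intermediate `z` and the result names are not in the original notebook. It differentiates two targets, and also one target with respect to an intermediate tensor, all from a single recording.

```python
import tensorflow as tf

x = tf.Variable(3.0)

with tf.GradientTape(persistent=True) as tape:
    y = x * x   # y = x^2
    z = y * y   # z = x^4

dy_dx = tape.gradient(y, x)  # 2 * x   = 6.0
dz_dx = tape.gradient(z, x)  # 4 * x^3 = 108.0
dz_dy = tape.gradient(z, y)  # 2 * y   = 18.0
del tape  # drop the reference once all gradients are computed

print(dy_dx, dz_dx, dz_dy)
```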


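### Multiple tapes

When gradients have to be computed several times for different variables, as above, an alternative to `persistent=True` is to create multiple `GradientTape` objects, as in the following code from the TensorFlow Core tutorial [Deep Convolutional Generative Adversarial Network](https://www.tensorflow.org/tutorials/generative/dcgan):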

```py
with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
  generated_images = generator(noise, training=True)

  real_output = discriminator(images, training=True)
  fake_output = discriminator(generated_images, training=True)

  gen_loss = generator_loss(fake_output)
  disc_loss = discriminator_loss(real_output, fake_output)

gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
```
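The tutorial snippet depends on models and optimizers defined elsewhere. As a self-contained sketch of the same pattern, the example below opens two tapes in one `with` statement and uses each exactly once. The variables `a` and `b` and both toy losses are made up for illustration and have nothing to do with the DCGAN code.

```python
import tensorflow as tf

a = tf.Variable(2.0)
b = tf.Variable(5.0)

# Both tapes record the same forward pass.
with tf.GradientTape() as tape_a, tf.GradientTape() as tape_b:
    prod = a * b
    loss_a = prod * prod  # (a * b)^2
    loss_b = prod + b     # a * b + b

# Each non-persistent tape is used for exactly one gradient call.
grad_a = tape_a.gradient(loss_a, a)  # 2 * a * b^2 = 100.0
grad_b = tape_b.gradient(loss_b, b)  # a + 1 = 3.0
print(grad_a, grad_b)
```

Compared with one persistent tape, this avoids the manual `del`, at the cost of recording the forward pass twice.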

## Computing higher-order derivatives

```py
x = tf.Variable(3.0)

with tf.GradientTape() as t1:
    with tf.GradientTape() as t2:
        y = x * x
    # The first derivative is computed inside t1, so t1 records it.
    dx = t2.gradient(y, x)
ddx = t1.gradient(dx, x)

print(dx)   # tf.Tensor(6.0, shape=(), dtype=float32)
print(ddx)  # tf.Tensor(2.0, shape=(), dtype=float32)
```
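With `y = x * x` the second derivative is the constant 2, which hides its dependence on `x`. As a further sketch under the same nesting scheme, the cubic below gives a second derivative that varies with `x`. The tape names `outer` and `inner` are illustrative.

```python
import tensorflow as tf

x = tf.Variable(3.0)

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        y = x * x * x            # y = x^3
    # Computed inside `outer`, so the gradient computation itself is recorded.
    dy = inner.gradient(y, x)    # 3 * x^2 = 27.0
d2y = outer.gradient(dy, x)      # 6 * x   = 18.0

print(dy, d2y)
```

The key detail is that `inner.gradient` runs while `outer` is still recording. If it were called after leaving the `outer` block, `outer` would have nothing connecting `dy` to `x`.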

