# GradientTape

[tf.GradientTape](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape?hl=zh_tw)

Record operations for automatic differentiation.

```python
tf.GradientTape(
    persistent=False,
    watch_accessed_variables=True
)
```

- Arguments:

    - `persistent`: Boolean controlling whether a persistent gradient tape is created. Defaults to `False`, which means `gradient()` can be called at most once on the tape.
    - `watch_accessed_variables`: Boolean controlling whether the tape automatically `watch`es any (trainable) variables accessed while the tape is active. Defaults to `True`.
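
A minimal sketch of what automatic watching means in practice, assuming TensorFlow 2.x in eager mode. The names `v`, `c`, `y_v` and `y_c` are illustrative and do not come from the API documentation. A trainable `tf.Variable` is watched as soon as it is accessed inside the tape, whereas a `tf.constant` is not watched unless `watch` is called on it.

```python
import tensorflow as tf

v = tf.Variable(2.0)  # trainable variable: watched automatically
c = tf.constant(2.0)  # constant tensor: not watched unless tape.watch(c) is called

# persistent=True so that gradient() can be called twice on the same tape
with tf.GradientTape(persistent=True) as tape:
    y_v = v * v
    y_c = c * c

grad_v = tape.gradient(y_v, v)  # d(v*v)/dv = 2*v
grad_c = tape.gradient(y_c, c)  # None, because `c` was never watched
del tape

print(grad_v)
print(grad_c)
```

The persistent tape is used here only so that both gradients can be requested from a single recording.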
    
---
   
[tf.GradientTape.gradient](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape?hl=zh_tw#gradient)

Computes the gradient using operations recorded in context of this tape.

```python
tf.GradientTape.gradient(
    target,
    sources,
    output_gradients=None,
    unconnected_gradients=tf.UnconnectedGradients.NONE
)
```

- Arguments:

    - `target`: Tensor (or list of tensors) to be differentiated.
    - `sources`: a list or nested structure of Tensors or Variables. `target` will be differentiated against the elements in `sources`.
    - `output_gradients`: a list of gradients, one for each element of `target`. Defaults to `None`.
    - `unconnected_gradients`: either `'none'` or `'zero'`; selects the value returned when `target` and `sources` are unconnected. The possible values and their effects are detailed in `UnconnectedGradients`. Defaults to `'none'`.
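
The following sketch illustrates the two optional arguments, assuming TensorFlow 2.x in eager mode. The variable names (`x`, `unused`, `grad_none`, `grad_zero`, `grad_scaled`) are hypothetical. It shows the result for an unconnected source with both settings of `unconnected_gradients`, and how `output_gradients` scales the incoming gradient of the target.

```python
import tensorflow as tf

x = tf.Variable(3.0)
unused = tf.Variable(1.0)  # never used to compute y, so it is unconnected

with tf.GradientTape(persistent=True) as tape:
    y = x * x

# Default behaviour: an unconnected source yields None.
grad_none = tape.gradient(y, unused)

# With ZERO, a zero tensor shaped like the source is returned instead.
grad_zero = tape.gradient(
    y, unused, unconnected_gradients=tf.UnconnectedGradients.ZERO
)

# output_gradients supplies the upstream gradient for each target:
# the result is 10 * dy/dx = 10 * 2 * x.
grad_scaled = tape.gradient(y, x, output_gradients=[tf.constant(10.0)])
del tape

print(grad_none, grad_zero, grad_scaled)
```

Returning zeros rather than `None` can be convenient when the gradients are passed straight into arithmetic that does not accept `None`.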

---

[tf.GradientTape.watch](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape?hl=zh_tw#watch)

Ensures that `tensor` is being traced by this tape.


```python
tf.GradientTape.watch(tensor)
```

- Arguments:

    - `tensor`: a Tensor or list of Tensors.


For example, consider the function `y = x * x`. The gradient at `x = 3.0` can be computed as:

In [1]:
import tensorflow as tf

tf.__version__

'2.0.0'

In [2]:
x = tf.constant(3.0)
x

<tf.Tensor: id=0, shape=(), dtype=float32, numpy=3.0>

In [3]:
with tf.GradientTape() as tape:
    tape.watch(x)
    y = x * x
dy_dx = tape.gradient(y, x) # Will compute to 6.0

print(dy_dx)
print("Value:", dy_dx.numpy())

tf.Tensor(6.0, shape=(), dtype=float32)
Value: 6.0
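
Because `watch` also accepts a list of Tensors, several inputs can be traced with one call. The sketch below assumes TensorFlow 2.x in eager mode; the names `a`, `b`, `da` and `db` are illustrative.

```python
import tensorflow as tf

a = tf.constant(2.0)
b = tf.constant(5.0)

with tf.GradientTape() as tape:
    tape.watch([a, b])  # watch both constants with a single call
    y = a * b

# Passing a list of sources returns a list of gradients in the same order.
da, db = tape.gradient(y, [a, b])

print("dy/da:", da.numpy())  # equals b
print("dy/db:", db.numpy())  # equals a
```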


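GradientTapes can be nested to compute higher-order derivatives. Each inner `gradient()` call must happen inside the outer tape's context so that the outer tape records it. For example: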

In [4]:
with tf.GradientTape() as tape2:
    tape2.watch(x)
    with tf.GradientTape() as tape3:
        tape3.watch(x)
        y = x * x
    dy_dx = tape3.gradient(y, x)
d2y_d2x = tape2.gradient(dy_dx, x)

print("Function: y = x * x, x = 3.0")
print("First Derivative:", dy_dx.numpy())
print("Second Derivative:", d2y_d2x.numpy())

Function: y = x * x, x = 3.0
First Derivative: 6.0
Second Derivative: 2.0
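
The same nesting pattern extends to a third level. As a sketch under the assumption of TensorFlow 2.x in eager mode, the block below differentiates `y = x * x * x` three times; the tape names `t1`, `t2`, `t3` and the result names `d1`, `d2`, `d3` are hypothetical.

```python
import tensorflow as tf

x = tf.constant(3.0)

with tf.GradientTape() as t3:
    t3.watch(x)
    with tf.GradientTape() as t2:
        t2.watch(x)
        with tf.GradientTape() as t1:
            t1.watch(x)
            y = x * x * x
        # Computed inside t2 (and t3), so this gradient is itself recorded.
        d1 = t1.gradient(y, x)   # 3 * x^2
    d2 = t2.gradient(d1, x)      # 6 * x
d3 = t3.gradient(d2, x)          # 6

print("First derivative:", d1.numpy())
print("Second derivative:", d2.numpy())
print("Third derivative:", d3.numpy())
```

Each level of nesting needs its own tape, because a non-persistent tape is consumed by its single `gradient()` call.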


## Compute multiple gradients

By default, the resources held by a `GradientTape` are released as soon as the `GradientTape.gradient()` method is called.

To compute multiple gradients over the same computation, create a persistent gradient tape with `persistent=True`. This allows multiple calls to the `gradient()` method, because the resources are released only when the tape object is garbage collected.

In [5]:
with tf.GradientTape(persistent=True) as persistent_tape:
    persistent_tape.watch(x)
    y = x * x
    z = y * y
dz_dx = persistent_tape.gradient(z, x) # 108.0 (4*x^3 at x = 3)
dy_dx = persistent_tape.gradient(y, x) # 6.0
print("First derivative of function z = x ^ 4 at x = 3 is", dz_dx.numpy())

# Drop the reference to the tape
del persistent_tape
#persistent_tape # NameError: name 'persistent_tape' is not defined

First derivative of function z = x ^ 4 at x = 3 is 108.0
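
For contrast, here is a sketch of the default, non-persistent behaviour, assuming TensorFlow 2.x in eager mode. The flag `second_call_failed` is a hypothetical helper used only to make the outcome visible. The second `gradient()` call is expected to raise a `RuntimeError`, because the tape's resources were released by the first call.

```python
import tensorflow as tf

x = tf.constant(3.0)

with tf.GradientTape() as tape:  # persistent=False by default
    tape.watch(x)
    y = x * x
    z = y * y

dz_dx = tape.gradient(z, x)  # the first call works and releases the tape

try:
    tape.gradient(y, x)      # second call on a non-persistent tape
    second_call_failed = False
except RuntimeError as err:
    second_call_failed = True
    print("RuntimeError:", err)

print("dz/dx:", dz_dx.numpy())
```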


## Disable automatic tracking

By default, GradientTape will automatically watch any trainable variables that are accessed inside the context.

If you want fine-grained control over which variables are watched, you can disable automatic tracking by passing `watch_accessed_variables=False` to the tape constructor.

In [6]:
variable_a = tf.Variable(3.0, name='x1')
variable_b = tf.Variable(2.0, name='x2')

with tf.GradientTape(persistent=True, watch_accessed_variables=False) as disable_tracking_tape:
    disable_tracking_tape.watch(variable_a)
    y = variable_a ** 2 # Gradients will be available for `variable_a`.
    z = variable_b ** 3 # No gradients will be available since `variable_b` is 
                        # not being watched.
gradient_1 = disable_tracking_tape.gradient(y, variable_a) # 6.0
gradient_2 = disable_tracking_tape.gradient(z, variable_b) # None

print(gradient_1)
print(gradient_2)

tf.Tensor(6.0, shape=(), dtype=float32)
None
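
To compare the two modes side by side, the sketch below records the same computation with and without automatic tracking and inspects each tape through `watched_variables()`. It assumes TensorFlow 2.x in eager mode; the names `a`, `b`, `auto_tape` and `manual_tape` are illustrative.

```python
import tensorflow as tf

a = tf.Variable(3.0)
b = tf.Variable(2.0)

# Default: every trainable variable accessed inside the context is watched.
with tf.GradientTape() as auto_tape:
    z = a ** 2 + b ** 3
auto_watched = auto_tape.watched_variables()

# Automatic tracking disabled: only explicitly watched variables are traced.
with tf.GradientTape(watch_accessed_variables=False) as manual_tape:
    manual_tape.watch(a)
    z = a ** 2 + b ** 3
manual_watched = manual_tape.watched_variables()

# `b` was not watched, so its gradient is None.
grad_a, grad_b = manual_tape.gradient(z, [a, b])

print("Watched by default tape:", len(auto_watched))
print("Watched by manual tape:", len(manual_watched))
print(grad_a)
print(grad_b)
```

Note that `watched_variables()` is queried before `gradient()` here, since a non-persistent tape releases its resources once `gradient()` has been called.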
