### We can backpropagate again through a graph that was itself built by backpropagation.
### This is called `double backpropagation`.

Where can we use this? Let's consider the following case.

## Derivative of `z` with respect to `x` at `x = 2.0`

### $y\ =\ x^{2}$
### $z\ =\ \big(\frac{\partial y}{\partial{x}}\big)^{3} + y$

---

### $\frac{\partial y}{\partial{x}} = 2x$
### $z = (2x)^{3} + y = 8x^{3} + y$
### $\frac{\partial z}{\partial{x}} = 24x^{2} + 2x$

**At `x = 2.0` this gives `24 * 4 + 2 * 2`, so the result should be `100`!**
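As a quick sanity check on the derivation, here is a minimal sketch in plain Python that compares the hand-derived formula with a central finite difference. The helper names (`dy_dx`, `dz_dx_analytic`, `central_difference`) are made up for this illustration and are not part of `dezero`.

```python
def y(x):
    return x ** 2


def dy_dx(x):
    # Hand-derived first derivative of y = x^2.
    return 2.0 * x


def z(x):
    # z depends on x through dy/dx and through y.
    return dy_dx(x) ** 3 + y(x)


def dz_dx_analytic(x):
    # Hand-derived derivative of z = 8x^3 + x^2.
    return 24.0 * x ** 2 + 2.0 * x


def central_difference(f, x, eps=1e-4):
    # Numerical estimate of df/dx, used only as an independent check.
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


x0 = 2.0
analytic = dz_dx_analytic(x0)
numeric = central_difference(z, x0)

print(analytic)  # the formula evaluated at x = 2.0
print(numeric)   # should agree with the formula up to a small numerical error
```

Both values should agree to several decimal places, which is the number an automatic differentiation library is expected to reproduce.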

## Use `dezero` to solve this!

In [1]:
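import numpy as np
from dezero import Variable

x = Variable(np.array(2.0))
y = x**2

# First backward pass; create_graph=True also builds a graph for the
# backward computation, so gx can be differentiated again.
y.backward(create_graph=True)
gx = x.grad

# Reset the gradient so the second pass does not add to the first one.
x.cleargrad()

# z uses the gradient gx itself as an input.
z = gx**3 + y
z.backward()

print(x.grad)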

Variable(100.0)


## Where is `double backpropagation` used?

One use is the **`WGAN-GP`** loss function!

### $L = \underset{\tilde{x} \sim \mathbb{P}_{g}}{\mathbb{E}}\big[D(\tilde{x})\big] - \underset{x \sim \mathbb{P}_{r}}{\mathbb{E}}\big[D(x)\big] + \lambda \underset{\hat{x} \sim \mathbb{P}_{\hat{x}}}{\mathbb{E}}\big[\big(\big\Vert \nabla_{\hat{x}} D(\hat{x})\big\Vert_{2} - 1\big)^{2}\big]$ 

## $\nabla_{\hat{x}} D(\hat{x}) \leftarrow$ this gradient is part of the loss, so minimizing $L$ means backpropagating through it, which needs `double backpropagation`
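To make the role of the second backward pass concrete, here is a minimal sketch in plain Python. It assumes a toy critic `D(x) = w * x^2` with a single scalar parameter `w` and a single sample `x_hat`; the critic and all function names are hypothetical and chosen only for illustration. Both derivatives are written out by hand, and the second one is checked with a finite difference.

```python
def critic(w, x):
    # Hypothetical toy critic D(x) = w * x^2 with one scalar parameter w.
    return w * x ** 2


def critic_input_grad(w, x):
    # First backward pass, done by hand: dD/dx = 2 * w * x.
    return 2.0 * w * x


def gradient_penalty(w, x_hat):
    # (|dD/dx| - 1)^2 for a single sample x_hat (lambda is left out).
    return (abs(critic_input_grad(w, x_hat)) - 1.0) ** 2


def penalty_grad_wrt_w(w, x_hat):
    # Second backward pass, done by hand: the penalty contains dD/dx,
    # so its derivative with respect to w is a derivative of a derivative.
    g = critic_input_grad(w, x_hat)
    sign = 1.0 if g >= 0 else -1.0
    return 2.0 * (abs(g) - 1.0) * sign * 2.0 * x_hat


w0, x_hat = 0.8, 1.5
eps = 1e-6

analytic = penalty_grad_wrt_w(w0, x_hat)
numeric = (gradient_penalty(w0 + eps, x_hat)
           - gradient_penalty(w0 - eps, x_hat)) / (2.0 * eps)

print(analytic)  # approximately 8.4
print(numeric)   # should match the analytic value closely
```

In a real critic network the same two steps cannot be written by hand, which is why the framework has to keep the graph of the first backward pass, as `create_graph=True` does in the example above.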

Other methods that use `double backpropagation`:

- MAML
- TRPO