# Calculus: Differentiation


Automatic differentiation with TensorFlow's `GradientTape`.


## Installing TensorFlow


pip
```bash
pip install tensorflow
```


conda
```bash
conda install tensorflow
```

## Importing the TensorFlow module

In [1]:
import tensorflow as tf
import numpy as np

## A simple derivative


The derivative of x<sup>2</sup> is 2x, so at x = 5 the gradient should be 10.

In [2]:
x = tf.Variable(5.0)  # assume x = 5
with tf.GradientTape() as tape:
  y = tf.square(x)
y_grad = tape.gradient(y, x)
print(y, y_grad)

tf.Tensor(25.0, shape=(), dtype=float32) tf.Tensor(10.0, shape=(), dtype=float32)
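
As a sanity check that does not depend on TensorFlow, the same derivative can be approximated with a central difference. This is only a minimal sketch in plain Python; the helper name `central_diff` and the step size `h` are illustrative choices, not part of the original notebook.

```python
def central_diff(f, x, h=1e-5):
    """Approximate f'(x) with the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    return x * x


# Should agree closely with the tape's gradient of x^2 at x = 5.
numeric_grad = central_diff(square, 5.0)
print(round(numeric_grad, 4))
```

If the numeric value and the tape's gradient disagree noticeably, the function inside the `with` block is probably not the one you meant to differentiate.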


## Partial derivatives


Compute the partial derivatives of the following loss with respect to w and b:

L(w, b) = || X·w + b - y ||<sup>2</sup>

In [3]:
x = tf.constant([[1., 2.],[ 3., 4.]])
y = tf.constant([[1.], [2.]])
w = tf.Variable([[1.], [2.]])
b = tf.Variable(1.)
with tf.GradientTape() as tape:
  L = tf.reduce_sum(tf.square(tf.matmul(x, w) + b - y))
w_grad, b_grad = tape.gradient(L, [w, b])
print(L, w_grad, b_grad)

tf.Tensor(125.0, shape=(), dtype=float32) tf.Tensor(
[[ 70.]
 [100.]], shape=(2, 1), dtype=float32) tf.Tensor(30.0, shape=(), dtype=float32)
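
For this particular loss the gradients also have a closed form: with the residual r = X·w + b - y, the partial derivatives are ∂L/∂w = 2·X<sup>T</sup>·r and ∂L/∂b = 2·Σr. The sketch below evaluates those formulas in plain Python for the same values, as a cross-check on the tape's output. The variable names (`residual`, `w_grad`, `b_grad`) are illustrative.

```python
X = [[1.0, 2.0], [3.0, 4.0]]
y = [1.0, 2.0]
w = [1.0, 2.0]
b = 1.0

# Residual r_i = sum_j X[i][j] * w[j] + b - y[i]
residual = [sum(xij * wj for xij, wj in zip(row, w)) + b - yi
            for row, yi in zip(X, y)]

# L = sum_i r_i^2
loss = sum(r * r for r in residual)

# dL/dw_j = 2 * sum_i X[i][j] * r_i
w_grad = [2 * sum(row[j] * r for row, r in zip(X, residual))
          for j in range(len(w))]

# dL/db = 2 * sum_i r_i
b_grad = 2 * sum(residual)

print(loss, w_grad, b_grad)  # 125.0 [70.0, 100.0] 30.0
```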


## Higher-order derivatives


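The derivative of x<sup>3</sup> is 3x<sup>2</sup>; differentiating once more gives 6x. Nesting two tapes computes both: the inner tape records the first derivative, and the outer tape differentiates that result.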

In [4]:
x = tf.Variable(5.0)  # assume x = 5
with tf.GradientTape() as tape1:
  with tf.GradientTape() as tape0:
    y = x ** 3
  y_grad0 = tape0.gradient(y, x)
y_grad1 = tape1.gradient(y_grad0, x)
print(y, y_grad0, y_grad1)

tf.Tensor(125.0, shape=(), dtype=float32) tf.Tensor(75.0, shape=(), dtype=float32) tf.Tensor(30.0, shape=(), dtype=float32)
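
The second derivative can be cross-checked numerically as well. The sketch below uses the standard second-order central difference in plain Python; `second_diff` and the step size `h` are hypothetical names and values chosen for illustration.

```python
def second_diff(f, x, h=1e-3):
    """Approximate f''(x) with the second-order central difference."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def cube(x):
    return x ** 3


# Should be close to 6x = 30 at x = 5, matching the outer tape's result.
numeric_second = second_diff(cube, 5.0)
print(round(numeric_second, 3))
```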


## Persistent tapes


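`GradientTape` behaves differently from the `with` blocks we are used to: by default, the resources it holds are released as soon as `tape.gradient()` has been called once.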


> The following is therefore an error: the second call to `tape.gradient()` fails.

In [5]:
x1 = tf.Variable(2.0) # x1=2
x2 = tf.Variable(3.0) # x2=3
with tf.GradientTape() as tape:
  y1 = tf.square(x1)
  y2 = tf.pow(x2, 3)
y_grad1 = tape.gradient(y1, x1)
y_grad2 = tape.gradient(y2, x2)
print(y1, y_grad1, y2, y_grad2)

RuntimeError: A non-persistent GradientTape can only be used to compute one set of gradients (or jacobians)

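To compute more than one set of gradients from the same tape, create it with `persistent=True`, and delete the tape when you are done so its resources are released: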

In [6]:
x1 = tf.Variable(3.0)
x2 = tf.Variable(4.0)
with tf.GradientTape(persistent=True) as tape:
  y1 = tf.square(x1)
  y2 = tf.pow(x2, 3)
y_grad1 = tape.gradient(y1, x1)
y_grad2 = tape.gradient(y2, x2)
print(y1, y_grad1, y2, y_grad2)
del tape

tf.Tensor(9.0, shape=(), dtype=float32) tf.Tensor(6.0, shape=(), dtype=float32) tf.Tensor(64.0, shape=(), dtype=float32) tf.Tensor(48.0, shape=(), dtype=float32)


## A practical application


Linear regression


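This example uses the Keras optimizer `tf.keras.optimizers.SGD`. `tf.keras` ships with TensorFlow 2.x, so the install below is only needed if you also want the standalone `keras` package: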


```bash
conda install keras
```

In [7]:
x_raw = np.array([2013,2014,2015,2016,2017], dtype=np.float32)
y_raw = np.array([12000,14000,15000,16500,17500], dtype=np.float32)
x = (x_raw - x_raw.min()) / (x_raw.max() - x_raw.min())
y = (y_raw - y_raw.min()) / (y_raw.max() - y_raw.min())
w = tf.Variable(0.)
b = tf.Variable(0.)
variables = [w, b]
num_epoch = 1000
optimizer = tf.keras.optimizers.SGD(1e-3)
for e in range(num_epoch):
  with tf.GradientTape() as tape:
    y_hat = w * x + b
    loss = tf.reduce_sum(tf.square(y_hat-y))
  grads = tape.gradient(loss, variables)
  optimizer.apply_gradients(zip(grads, variables))
print(w, b)

<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=0.7083977> <tf.Variable 'Variable:0' shape=() dtype=float32, numpy=0.20608652>


This gives the fitted line (in the normalized coordinates):


y = 0.7083977x + 0.20608652
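
To see what the tape and the optimizer do on each step, the same loop can be written with hand-derived gradients. The sketch below is plain Python and assumes the same data, learning rate, and epoch count as the cell above; the gradient formulas come from differentiating the sum-of-squares loss, and the names `residual`, `w_grad`, and `b_grad` are illustrative.

```python
x_raw = [2013.0, 2014.0, 2015.0, 2016.0, 2017.0]
y_raw = [12000.0, 14000.0, 15000.0, 16500.0, 17500.0]

# Min-max normalize both axes to [0, 1], as in the notebook cell.
xs = [(v - min(x_raw)) / (max(x_raw) - min(x_raw)) for v in x_raw]
ys = [(v - min(y_raw)) / (max(y_raw) - min(y_raw)) for v in y_raw]

w, b = 0.0, 0.0
learning_rate = 1e-3

for _ in range(1000):
    residual = [w * x + b - y for x, y in zip(xs, ys)]
    # loss = sum(r^2), so dloss/dw = 2 * sum(r * x) and dloss/db = 2 * sum(r)
    w_grad = 2 * sum(r * x for r, x in zip(residual, xs))
    b_grad = 2 * sum(residual)
    # Plain gradient descent update: parameter -= learning_rate * gradient
    w -= learning_rate * w_grad
    b -= learning_rate * b_grad

print(round(w, 3), round(b, 3))
```

This runs in float64 rather than float32, so the last few digits can differ slightly from the TensorFlow output.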


The more epochs you train for, the closer `w` and `b` get to the best fit.


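This result is actually less accurate than the closed-form solution that classical machine learning computes directly. Most deep learning models, however, have no closed-form solution and have to be trained.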
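
For comparison, simple linear regression does have a closed-form least-squares solution. The sketch below computes it in plain Python on the same normalized data; the names `w_exact` and `b_exact` are illustrative and not part of the original notebook.

```python
x_raw = [2013.0, 2014.0, 2015.0, 2016.0, 2017.0]
y_raw = [12000.0, 14000.0, 15000.0, 16500.0, 17500.0]

# Same min-max normalization as the training example.
xs = [(v - min(x_raw)) / (max(x_raw) - min(x_raw)) for v in x_raw]
ys = [(v - min(y_raw)) / (max(y_raw) - min(y_raw)) for v in y_raw]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Least-squares slope = covariance(x, y) / variance(x)
w_exact = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
           / sum((x - x_mean) ** 2 for x in xs))
# The least-squares line passes through the point of means.
b_exact = y_mean - w_exact * x_mean

print(round(w_exact, 4), round(b_exact, 4))
```

On this data the closed form gives roughly w ≈ 0.982 and b ≈ 0.055, noticeably different from the values reached after 1000 epochs at a learning rate of 1e-3, which suggests the training run above has not yet converged.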