<a href="https://colab.research.google.com/github/LjBLincoln/Machine_Learning/blob/master/GradientDescent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **GradientDescent**

### **1 Dimension**

```
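f(x) = x^3 + 2 * x - 3

error(x) = (f(x) - 0)^2

derivative of error(x) = 6 * x^5 + 16 * x^3 - 18 * x^2 + 8 * x - 12

At x = 1, derivative(x) = 0.

When x < 1, the derivative is negative, so the slope is negative;
when x > 1, the derivative is positive, so the slope is positive;
as x approaches 1, the derivative approaches 0 and the slope flattens out.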
```
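
The degree-5 polynomial above can be checked against the chain rule, d/dx (f(x))^2 = 2 * f(x) * f'(x), which makes clear that it is the derivative of the squared error. A minimal sketch of that check; the helper names `f_prime`, `error_prime_chain`, and `error_prime_poly` are illustrative and do not appear in the original notebook:

```python
def f(x):
    return x**3 + 2 * x - 3

def f_prime(x):
    # Derivative of f itself: 3x^2 + 2 (positive for every x)
    return 3 * x**2 + 2

def error_prime_chain(x):
    # Chain rule applied to error(x) = f(x)**2
    return 2 * f(x) * f_prime(x)

def error_prime_poly(x):
    # The expanded polynomial quoted in the text
    return 6 * x**5 + 16 * x**3 - 18 * x**2 + 8 * x - 12

for x in [0.0, 0.5, 1.0, 1.5, 2.0]:
    print(x, error_prime_chain(x), error_prime_poly(x))
```

Because f'(x) is always positive, the sign of the error derivative is the sign of f(x), which is why it is negative below x = 1 and positive above it.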


   
```

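How x moves along the function:

x = x - rate * derivative

When the slope is negative, x increases; when the slope is positive, x decreases.
So x always moves toward the valley, which minimizes error and thereby finds the solution of f(x) = 0.
Here rate scales the step: each update moves x against the derivative by rate * derivative,
so the larger the rate, the farther x moves per step,
and the smaller the rate, the less it moves.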

```
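
The effect of the rate can be seen by running the same update with two different values. This is a sketch under assumptions: it uses the analytic derivative of the squared error instead of a numerical one, and the `run` helper and the smaller rate of 0.001 are chosen here purely for illustration.

```python
def f(x):
    return x**3 + 2 * x - 3

def error_derivative(x):
    # Analytic derivative of error(x) = f(x)**2 via the chain rule
    return 2 * f(x) * (3 * x**2 + 2)

def run(rate, steps=50, x=0.8):
    # Apply x = x - rate * derivative a fixed number of times
    for _ in range(steps):
        x = x - rate * error_derivative(x)
    return x

slow = run(rate=0.001)
fast = run(rate=0.01)
print('rate=0.001 -> x = {:.6f}'.format(slow))
print('rate=0.01  -> x = {:.6f}'.format(fast))
```

After the same 50 steps, the run with the smaller rate is still short of x = 1, while the run with rate 0.01 has essentially arrived. A rate that is too large can overshoot the valley and fail to converge; that case is not shown here.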

In [57]:

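# Find x such that f(x) = 0 by minimizing the squared error.

def f(x):
  return x**3 + 2 * x - 3

def error(x):
  # Squared distance between f(x) and the target value 0
  return (f(x) - 0)**2

def gradient_descent(x):
  # Approximate d(error)/dx with a forward difference
  delta = 0.00000001
  derivative = (error(x + delta) - error(x)) / delta
  rate = 0.01
  # Take one step against the derivative
  return x - rate * derivative

x = 0.8
for i in range(50):
  x = gradient_descent(x)
  if i % 10 == 0:
    print('x = {:.6f}, f(x) = {:.6f}'.format(x, f(x)))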

x = 0.869619, f(x) = -0.603123
x = 0.999795, f(x) = -0.001025
x = 1.000000, f(x) = -0.000001
x = 1.000000, f(x) = -0.000000
x = 1.000000, f(x) = -0.000000


### **n Dimension**

```
f(x) = x[0] + 2 * x[1] + 4

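To find values of x[0] and x[1] where f(x) = 0, we can again minimize the error function, which solves f(x) = 0 indirectly. The only difference from the one-dimensional case is that we differentiate with respect to x[0] and x[1] separately.

Partial derivatives:

Hold x[1] fixed and differentiate with respect to x[0]: this is the partial derivative of error(x) with respect to x[0].
Hold x[0] fixed and differentiate with respect to x[1]: this is the partial derivative of error(x) with respect to x[1].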
```
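
The "hold every other coordinate fixed" rule generalizes to any number of dimensions. A minimal sketch, assuming a forward difference with the same step of 1e-8 used elsewhere in this notebook; `numerical_gradient` is a hypothetical helper, not part of the original code:

```python
def f(x):
    return x[0] + 2 * x[1] + 4

def error(x):
    return (f(x) - 0)**2

def numerical_gradient(func, x, delta=1e-8):
    # One forward difference per coordinate, holding the others fixed
    base = func(x)
    gradient = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += delta
        gradient.append((func(shifted) - base) / delta)
    return gradient

print(numerical_gradient(error, [-0.5, -1.0]))
```

At the point (-0.5, -1.0), f(x) = 1.5, so the analytic gradient 2 * f(x) * (1, 2) is (3, 6); the printed values should agree with that up to finite-difference error.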

In [58]:
def f(x):
  return x[0] + 2 * x[1] + 4

def error(x):
  # Squared distance between f(x) and the target value 0
  return (f(x) - 0)**2

def gradient_descent(x):
  delta = 0.00000001
  # Forward-difference estimates of the two partial derivatives of error
  derivative_x0 = (error([x[0] + delta, x[1]]) - error([x[0], x[1]])) / delta
  derivative_x1 = (error([x[0], x[1] + delta]) - error([x[0], x[1]])) / delta
  rate = 0.02
  # Step each coordinate against its own partial derivative
  return [x[0] - rate * derivative_x0, x[1] - rate * derivative_x1]

x = [-0.5, -1.0]
for i in range(100):
  x = gradient_descent(x)
  if i % 10 == 0:
    print('x = {:.6f},{:.6f}, f(x) = {:.6f}'.format(x[0], x[1], f(x)))


x = -0.560000,-1.120000, f(x) = 1.200000
x = -0.774230,-1.548460, f(x) = 0.128849
x = -0.797233,-1.594466, f(x) = 0.013835
x = -0.799703,-1.599406, f(x) = 0.001486
x = -0.799968,-1.599936, f(x) = 0.000159
x = -0.799997,-1.599993, f(x) = 0.000017
x = -0.800000,-1.599999, f(x) = 0.000002
x = -0.800000,-1.600000, f(x) = 0.000000
x = -0.800000,-1.600000, f(x) = 0.000000
x = -0.800000,-1.600000, f(x) = -0.000000
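
Since f(x) = 0 describes a whole line, x[0] + 2 * x[1] + 4 = 0, the point that gradient descent finds depends on where it starts. The gradient of error is always a multiple of (1, 2), so each run slides along that direction until it reaches the line. The sketch below assumes the analytic gradient 2 * f(x) * (1, 2) in place of finite differences; `solve` is a hypothetical helper and the second starting point is made up for illustration.

```python
def f(x):
    return x[0] + 2 * x[1] + 4

def solve(start, rate=0.02, steps=100):
    x = list(start)
    for _ in range(steps):
        # Gradient of error(x) = f(x)**2 is 2 * f(x) * (1, 2)
        g = 2 * f(x)
        x = [x[0] - rate * g * 1, x[1] - rate * g * 2]
    return x

for start in ([-0.5, -1.0], [3.0, 0.0]):
    x = solve(start)
    print('start = {}, x = {:.6f},{:.6f}, f(x) = {:.6f}'.format(start, x[0], x[1], f(x)))
```

Both runs end on the line f(x) = 0, but at different points: each lands at the point of the line closest to its own starting point.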


### **TensorFlow GradientDescent**

```
tf.train.GradientDescentOptimizer:

Optimizer that implements the gradient descent algorithm.


__init__
__init__(
    learning_rate,
    use_locking=False,
    name='GradientDescent'
)


Class Optimizer

minimize

minimize(
    loss,
    global_step=None,
    var_list=None,
    gate_gradients=GATE_OP,
    aggregation_method=None,
    colocate_gradients_with_ops=False,
    name=None,
    grad_loss=None
)
Add operations to minimize loss by updating var_list.

```
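
To make concrete what each `minimize` step does, here is a pure-Python sketch of the same kind of update for a linear model W * x + b with a sum-of-squares loss. It assumes the starting values, training data, and learning rate of the TensorFlow cell that follows; the hand-derived gradients and the `step` helper are illustrative and are not TensorFlow's implementation.

```python
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

def step(W, b, rate=0.01):
    # Residuals of the linear model on the training data
    residuals = [W * x + b - y for x, y in zip(x_train, y_train)]
    # Gradients of loss = sum(residual**2) with respect to W and b
    grad_W = sum(2 * r * x for r, x in zip(residuals, x_train))
    grad_b = sum(2 * r for r in residuals)
    return W - rate * grad_W, b - rate * grad_b

W, b = 0.3, -0.3
for i in range(1000):
    W, b = step(W, b)
    if i % 250 == 0:
        loss = sum((W * x + b - y)**2 for x, y in zip(x_train, y_train))
        print('step: {} W: {:.6f} b: {:.6f} loss: {:.6f}'.format(i, W, b, loss))
```

The first update moves W from 0.3 to about -0.22 and b from -0.3 to about -0.456, and repeated updates drive W toward -1 and b toward 1, the line that fits the training data exactly.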

In [59]:
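# This cell uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
# On TensorFlow 2.x, replace the import below with:
#   import tensorflow.compat.v1 as tf; tf.disable_v2_behavior()
import tensorflow as tf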
 
# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)

# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W*x + b
y = tf.placeholder(tf.float32)
 
# loss
loss = tf.reduce_sum(tf.square(linear_model - y)) # sum of the squares

# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
 
# training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]

# training loop
init = tf.global_variables_initializer()
sess = tf.Session()

sess.run(init) # set W and b to their (deliberately wrong) initial values

for i in range(1000):
  sess.run(train, {x: x_train, y: y_train})

  # read back the current parameters and the training loss
  curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
  if i % 50 == 0:
    print("step: %s W: %s b: %s loss: %s"%(i, curr_W, curr_b, curr_loss))

step: 0 W: [-0.21999997] b: [-0.456] loss: 4.0181446
step: 50 W: [-0.7127021] b: [0.15530905] loss: 0.47665495
step: 100 W: [-0.84270465] b: [0.5375326] loss: 0.14287975
step: 150 W: [-0.913881] b: [0.74679965] loss: 0.04282894
step: 200 W: [-0.9528499] b: [0.86137295] loss: 0.012838208
step: 250 W: [-0.97418535] b: [0.9241017] loss: 0.0038483376
step: 300 W: [-0.98586655] b: [0.9584459] loss: 0.0011535526
step: 350 W: [-0.992262] b: [0.9772494] loss: 0.00034577597
step: 400 W: [-0.9957634] b: [0.98754394] loss: 0.000103651124
step: 450 W: [-0.9976805] b: [0.99318033] loss: 3.106893e-05
step: 500 W: [-0.99873006] b: [0.9962662] loss: 9.313486e-06
step: 550 W: [-0.99930465] b: [0.9979557] loss: 2.7921515e-06
step: 600 W: [-0.99961936] b: [0.99888086] loss: 8.3671165e-07
step: 650 W: [-0.99979156] b: [0.9993872] loss: 2.5092038e-07
step: 700 W: [-0.9998859] b: [0.9996646] loss: 7.514916e-08
step: 750 W: [-0.99993753] b: [0.9998163] loss: 2.2558808e-08
step: 800 W: [-0.9999658] b: [0.9998