#### Suppose we want to minimize the function $y=x^2$, starting from the initial point $x_0=5$

#### 1. With a learning rate of 1, x oscillates between 5 and -5.
$x_1 = x_0-\lambda \nabla f(x_0) = x_0-1 \cdot 2x_0 = -x_0$
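
The update above can be checked by hand without TensorFlow. The following is a minimal pure-Python sketch, assuming the analytic gradient $2x$; the helper name `gradient_step` is hypothetical and not part of the original notebook. For $y=x^2$ each update multiplies $x$ by $(1-2\lambda)$, so $\lambda=1$ gives a factor of $-1$ and the iterate only flips sign.

```python
def gradient_step(x, learning_rate):
    """One gradient descent update for y = x**2, whose gradient is 2*x."""
    return x - learning_rate * 2 * x


x = 5.0
trajectory = [x]
for _ in range(4):
    x = gradient_step(x, learning_rate=1.0)
    trajectory.append(x)

print(trajectory)  # alternates between 5.0 and -5.0
```

Any learning rate strictly between 0 and 1 makes $|1-2\lambda| < 1$, so the iterate shrinks toward 0 instead of oscillating.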

In [None]:
import tensorflow as tf  # TensorFlow 1.x graph-mode API (tf.Session, tf.train)
TRAINING_STEPS = 10
LEARNING_RATE = 1
x = tf.Variable(tf.constant(5, dtype=tf.float32), name="x")
y = tf.square(x)

train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(y)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer()) # init all Variables.
    for i in range(TRAINING_STEPS):
        sess.run(train_op)
        x_value = sess.run(x)
        print("After %s iteration(s): x%s is %f." % (i+1, i+1, x_value))

#### 2. With a learning rate of 0.001, descent is too slow: after 901 iterations x has only reached 0.823355.
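
The slow rate follows from the same recurrence: each step scales $x$ by $1 - 2 \times 0.001 = 0.998$, so $x_n = 5 \times 0.998^n$. Below is a rough pure-Python check, assuming double-precision arithmetic rather than TensorFlow's float32 (so the last digits may differ slightly from the notebook output). The variable names and the tolerance are illustrative choices, not taken from the original.

```python
import math

x0 = 5.0
learning_rate = 0.001
shrink = 1 - 2 * learning_rate  # per-step factor for y = x**2

# Closed form of the iterate after n updates.
x_901 = x0 * shrink ** 901
print("x after 901 steps: %f" % x_901)

# Number of updates needed before |x| drops below a chosen tolerance.
tolerance = 1e-3  # arbitrary threshold chosen for illustration
steps_needed = int(math.ceil(math.log(tolerance / x0) / math.log(shrink)))
print("steps needed for |x| < %g: %d" % (tolerance, steps_needed))
```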

In [None]:
TRAINING_STEPS = 1000
LEARNING_RATE = 0.001
x = tf.Variable(tf.constant(5, dtype=tf.float32), name="x")
y = tf.square(x)

train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(y)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(TRAINING_STEPS):
        sess.run(train_op)
        if i % 100 == 0:
            x_value = sess.run(x)
            print("After %s iteration(s): x%s is %f." % (i+1, i+1, x_value))

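#### 3. With an exponentially decaying learning rate, descent is fast in the early iterations, so a good degree of convergence is reached within a small number of training steps.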
```python
tf.train.exponential_decay(
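    learning_rate,    # initial learning rate
    global_step,      # current training step, used to compute the decay
    decay_steps,      # number of steps per decay period
    decay_rate,       # decay factor applied once per period
    staircase=False,  # if True, decay the learning rate at discrete intervals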
    name=None
)
```
The decayed learning rate is computed as:
```python
decayed_learning_rate = learning_rate * decay_rate ** (global_step / decay_steps)
```
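
To make the formula concrete, here is a minimal pure-Python sketch of it. This is an illustrative re-implementation, not TensorFlow's own code: the function below only mirrors the parameter names of the signature above, and it assumes `staircase=True` means the exponent uses integer division. The `decay_steps` value of 10 is chosen only to make the staircase visible; with `decay_steps=1` both modes give the same value at every integer step.

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Pure-Python version of the decay formula (illustrative only)."""
    exponent = float(global_step) / decay_steps
    if staircase:
        # Integer division: the rate changes only once every decay_steps steps.
        exponent = global_step // decay_steps
    return learning_rate * decay_rate ** exponent


for step in (0, 5, 10, 15, 20):
    smooth = exponential_decay(0.1, step, 10, 0.96)
    stepped = exponential_decay(0.1, step, 10, 0.96, staircase=True)
    print("step %2d: smooth %.6f, staircase %.6f" % (step, smooth, stepped))
```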

#### For this case:
$\text {decayed_learning_rate} = 0.1 *0.96 ^ {\text {global_step} / 1}$

In [None]:
TRAINING_STEPS = 100
global_step = tf.Variable(0, trainable=False)  # step counter, incremented by minimize()
LEARNING_RATE = tf.train.exponential_decay(0.1, global_step, 1, 0.96, staircase=True)

x = tf.Variable(tf.constant(5, dtype=tf.float32), name="x")
y = tf.square(x)
train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(y, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(TRAINING_STEPS):
        sess.run(train_op)
        if i % 10 == 0:
            LEARNING_RATE_value = sess.run(LEARNING_RATE)
            x_value = sess.run(x)
            print("After %s iteration(s): x%s is %f, learning rate is %f." % (i+1, i+1, x_value, LEARNING_RATE_value))
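
The three regimes can also be compared side by side without TensorFlow. The sketch below is a pure-Python approximation under stated assumptions: the gradient is taken analytically as $2x$, the decayed rate is evaluated at the 0-indexed step count before each update, and arithmetic is double precision, so values may differ slightly from the TensorFlow runs. The helper `run_gradient_descent` and the `schedules` dictionary are hypothetical names introduced for illustration.

```python
def run_gradient_descent(schedule, steps, x0=5.0):
    """Minimize y = x**2; schedule(step) returns the learning rate for that step."""
    x = x0
    for step in range(steps):
        x = x - schedule(step) * 2 * x  # gradient of x**2 is 2*x
    return x


schedules = {
    "constant 1": lambda step: 1.0,
    "constant 0.001": lambda step: 0.001,
    "decay 0.1 * 0.96**step": lambda step: 0.1 * 0.96 ** step,
}

results = {name: run_gradient_descent(fn, steps=100)
           for name, fn in schedules.items()}
for name, x_final in results.items():
    print("%-24s x after 100 steps: %f" % (name, x_final))
```

After the same 100 steps, the constant rate of 1 is back at the starting point, the constant rate of 0.001 has barely moved, and the decaying rate is much closer to the minimum at 0.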