### 4.2.2 Custom loss functions
Here y_ denotes the ground-truth value and y denotes the predicted value.

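The tf.greater and tf.where functions:

tf.greater: takes two tensors, compares them element by element, and returns a boolean tensor holding the results. When the two inputs have different shapes, TensorFlow applies NumPy-style broadcasting.

tf.where: takes three arguments. The first is the selection condition: where it is true, the value is taken from the second argument; otherwise it is taken from the third. Note that both the test and the selection are element-wise.

**Note: the book uses tf.select; it has to be replaced with tf.where here.**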

In [1]:
import tensorflow as tf
v1 = tf.constant([1.0, 2.0, 3.0, 4.0])
v2 = tf.constant([4.0, 3.0, 2.0, 1.0])

sess = tf.InteractiveSession()
print(tf.greater(v1,v2).eval())
print(tf.where(tf.greater(v1, v2), v1, v2).eval())
sess.close()

[False False  True  True]
[4. 3. 3. 4.]
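
The cell above only compares tensors of equal shape. As a rough illustration of the element-wise and broadcasting behaviour described earlier, here is a plain-Python sketch. The helper names `greater` and `where` are stand-ins written for this note, not TensorFlow functions, and only the simplest broadcasting case (a list against a single number) is modelled.

```python
def greater(a, b):
    """Element-wise a > b. `b` may be a list as long as `a`, or a single number."""
    if not isinstance(b, list):
        b = [b] * len(a)  # repeat the scalar so it lines up with every element of a
    return [x > y for x, y in zip(a, b)]


def where(cond, a, b):
    """Pick a[i] where cond[i] is true, otherwise b[i]. All three have equal length."""
    return [x if c else y for c, x, y in zip(cond, a, b)]


v1 = [1.0, 2.0, 3.0, 4.0]
v2 = [4.0, 3.0, 2.0, 1.0]

same_shape = greater(v1, v2)       # list against list
with_scalar = greater(v1, 2.5)     # list against a single number (broadcast)
selected = where(same_shape, v1, v2)

print(same_shape)   # [False, False, True, True]
print(with_scalar)  # [False, False, True, True]
print(selected)     # [4.0, 3.0, 3.0, 4.0]
```

The first and third results match the output of the TensorFlow cell above; the second shows a scalar being compared against every element.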


### Example: how the loss function affects the training result

Network structure: two inputs, one output, and no hidden layer.

In [2]:
import tensorflow as tf
from numpy.random import RandomState

#### 1. Define the network parameters and variables.

In [3]:
batch_size = 8
dataset_size = 128

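# Two input nodes
x = tf.placeholder(tf.float32, shape=(None, 2), name="x-input")
# A regression problem usually has a single output node
y_ = tf.placeholder(tf.float32, shape=(None, 1), name='y-input')

# Forward pass of a network with no hidden layer: the weighted sum of the
# inputs passed through tanh, so y is confined to the interval (-1, 1).
w1 = tf.Variable(tf.random_normal([2, 1], stddev=1, seed=1), trainable=True)
y = tf.nn.tanh(tf.matmul(x, w1))
# Alternative without the activation (a plain weighted sum):
# y = tf.matmul(x, w1)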
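
Because the forward pass applies tanh, every prediction lies strictly between -1 and 1, while the targets built later in this example are roughly the sum of two inputs drawn from 0 to 1. The sketch below reproduces the forward pass with the standard library to make that ceiling visible. The weights and input rows are hypothetical values chosen for illustration, not the trained w1.

```python
import math


def forward(rows, weights):
    """tanh of the weighted sum of each input row, one output per row."""
    return [math.tanh(sum(x * w for x, w in zip(row, weights))) for row in rows]


weights = [1.0, 1.0]                           # hypothetical weights
rows = [[0.1, 0.2], [0.9, 0.9], [1.0, 1.0]]    # hypothetical inputs in [0, 1]

outputs = forward(rows, weights)
targets = [sum(row) for row in rows]           # noise-free targets x1 + x2

for out, target in zip(outputs, targets):
    print("prediction %.4f  target %.4f" % (out, target))
```

Whatever the weights, the prediction for a target above 1 stays below it, which is worth keeping in mind when reading the weights recorded in the training runs.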


#### 2. Define the custom loss function.

In [5]:
# Under-prediction costs more than over-prediction, so the model should lean toward predicting too much.
loss_less = 10
loss_more = 1
loss = tf.reduce_sum(tf.where(tf.greater(y, y_), 
                              (y - y_) * loss_more, 
                              (y_ - y) * loss_less))
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)
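
To see what the tf.where expression computes, the same piecewise rule can be written as an ordinary Python loop. This is a minimal sketch for checking the arithmetic by hand; `asymmetric_loss` is a name introduced here, not part of TensorFlow.

```python
def asymmetric_loss(predictions, targets, loss_less, loss_more):
    """Sum of per-sample costs: loss_more per unit of over-prediction,
    loss_less per unit of under-prediction."""
    total = 0.0
    for y, y_true in zip(predictions, targets):
        if y > y_true:
            total += (y - y_true) * loss_more   # predicted too much
        else:
            total += (y_true - y) * loss_less   # predicted too little (or exact)
    return total


# The same 0.5 error costs ten times more when it is an under-prediction.
over = asymmetric_loss([1.5], [1.0], loss_less=10, loss_more=1)
under = asymmetric_loss([0.5], [1.0], loss_less=10, loss_more=1)
print(over, under)  # 0.5 5.0
```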

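#### 3. Generate a simulated dataset.
The regression target is the sum of the two inputs plus a random term.

The random term adds unpredictable noise. Without it, comparing loss functions would be pointless, because every one of them reaches its minimum when the predictions are exactly right.

Noise is usually a small quantity with zero mean, so here it is drawn uniformly from -0.05 to 0.05.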

In [6]:
rdm = RandomState(1)
X = rdm.rand(dataset_size,2)
Y = [[x1+x2+(rdm.rand()/10.0-0.05)] for (x1, x2) in X]
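
The expression `rdm.rand()/10.0-0.05` rescales a uniform draw from 0 to 1 into the stated noise range. The sketch below checks that rescaling using the standard library's `random.Random` as a stand-in for NumPy's RandomState, so the individual values differ from those in the cell above; the sample count of 10000 is an arbitrary choice.

```python
import random

rng = random.Random(1)  # stdlib generator, used here instead of numpy's RandomState

# Same rescaling as in the dataset cell: [0, 1) -> [0, 0.1) -> [-0.05, 0.05)
noise = [rng.random() / 10.0 - 0.05 for _ in range(10000)]

mean_noise = sum(noise) / len(noise)
print("min  %.4f" % min(noise))
print("max  %.4f" % max(noise))
print("mean %.5f" % mean_noise)
```

The minimum and maximum stay inside the range and the mean sits close to zero, which is the property the text relies on.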

#### 4. Train the model.

In [7]:
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 5000
    for i in range(STEPS):
        # Walk through the dataset in consecutive mini-batches, wrapping around at the end.
        start = (i*batch_size) % dataset_size
        end = (i*batch_size) % dataset_size + batch_size
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        if i % 1000 == 0:
            print("After %d training step(s), w1 is: " % (i))
            print(sess.run(w1), "\n")
    print("Final w1 is: \n", sess.run(w1))

After 0 training step(s), w1 is: 
[[-0.81031823]
 [ 1.4855988 ]] 

After 1000 training step(s), w1 is: 
[[-0.02263279]
 [ 2.1988971 ]] 

After 2000 training step(s), w1 is: 
[[0.5252797]
 [2.5851707]] 

After 3000 training step(s), w1 is: 
[[0.9152925]
 [2.815458 ]] 

After 4000 training step(s), w1 is: 
[[1.1496375]
 [2.961983 ]] 

Final w1 is: 
 [[1.3268394]
 [3.0807076]]
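
The start and end indices in the training loop cycle through the dataset in fixed windows. The following sketch isolates that index arithmetic, assuming the same batch_size and dataset_size as above; `batch_window` is a helper name introduced for this note.

```python
batch_size = 8
dataset_size = 128


def batch_window(step):
    """Slice bounds used at a given training step."""
    start = (step * batch_size) % dataset_size
    end = start + batch_size
    return start, end


# One full pass over the data takes dataset_size // batch_size steps.
windows = [batch_window(i) for i in range(dataset_size // batch_size)]
covered = [idx for start, end in windows for idx in range(start, end)]

print(windows[0], windows[-1])  # (0, 8) (120, 128)
print(batch_window(16))         # (0, 8)
```

Since dataset_size is a multiple of batch_size, the end index never runs past the data and every sample is visited once per 16 steps.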


#### 5. Redefine the loss so that over-prediction costs more; the model should now lean toward predicting too little.

In [8]:
loss_less = 1
loss_more = 10
loss = tf.reduce_sum(tf.where(tf.greater(y, y_), 
                              (y - y_) * loss_more, 
                              (y_ - y) * loss_less))
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 5000
    for i in range(STEPS):
        start = (i*batch_size) % dataset_size
        end = (i*batch_size) % dataset_size + batch_size
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        if i % 1000 == 0:
            print("After %d training step(s), w1 is: " % (i))
            print(sess.run(w1), "\n")
    print("Final w1 is: \n", sess.run(w1))

After 0 training step(s), w1 is: 
[[-0.81031823]
 [ 1.4835987 ]] 

After 1000 training step(s), w1 is: 
[[0.05703859]
 [1.3401766 ]] 

After 2000 training step(s), w1 is: 
[[0.7567741]
 [1.1615651]] 

After 3000 training step(s), w1 is: 
[[1.0743011]
 [1.0788801]] 

After 4000 training step(s), w1 is: 
[[1.073686]
 [1.078645]] 

Final w1 is: 
 [[1.0749779]
 [1.0778998]]


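#### 6. Use MSE as the loss function.

In [9]:
# Labels come first and predictions second; MSE is symmetric, so the value is unchanged.
loss = tf.losses.mean_squared_error(labels=y_, predictions=y)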
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS = 5000
    for i in range(STEPS):
        start = (i*batch_size) % dataset_size
        end = (i*batch_size) % dataset_size + batch_size
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        if i % 1000 == 0:
            print("After %d training step(s), w1 is: " % (i))
            print(sess.run(w1), "\n")
    print("Final w1 is: \n", sess.run(w1))

After 0 training step(s), w1 is: 
[[-0.81031823]
 [ 1.4855988 ]] 

After 1000 training step(s), w1 is: 
[[-0.14825174]
 [ 2.0850313 ]] 

After 2000 training step(s), w1 is: 
[[0.24612428]
 [2.3461459 ]] 

After 3000 training step(s), w1 is: 
[[0.5294681]
 [2.4651117]] 

After 4000 training step(s), w1 is: 
[[0.7528874]
 [2.4787993]] 

Final w1 is: 
 [[0.9403336]
 [2.384106 ]]
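
The direction in which each loss pulls the predictions can also be checked on a toy problem that is much simpler than the network above. The sketch below is an illustration under stated assumptions, not a reproduction of the TensorFlow runs: the residual noise is taken to be evenly spread over -0.05 to 0.05, the only free parameter is a constant offset added to an otherwise perfect prediction, and all quantities are integers in units of 0.001 to keep the arithmetic exact.

```python
noise = range(-50, 51)       # residuals, units of 0.001, evenly spread over -0.05 .. 0.05
candidates = range(-50, 51)  # constant offsets to try, same units


def asymmetric(offset, loss_less, loss_more):
    """Total piecewise-linear cost of predicting `offset` against every residual."""
    total = 0
    for n in noise:
        if offset > n:
            total += (offset - n) * loss_more   # over-prediction
        else:
            total += (n - offset) * loss_less   # under-prediction
    return total


def squared(offset):
    """Total squared error of predicting `offset` against every residual."""
    return sum((offset - n) ** 2 for n in noise)


best_when_under_costs_more = min(candidates, key=lambda b: asymmetric(b, 10, 1))
best_when_over_costs_more = min(candidates, key=lambda b: asymmetric(b, 1, 10))
best_mse = min(candidates, key=squared)

print(best_when_under_costs_more)  # 41
print(best_when_over_costs_more)   # -41
print(best_mse)                    # 0
```

With under-prediction penalised ten times more, the best offset is positive (0.041); with over-prediction penalised more, it is negative (-0.041); squared error settles at zero. This is the same bias the two custom losses are meant to introduce in the example above.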
