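# Boston Housing Price Prediction: Linear Regression Practice

Following the course case study, complete the coding exercise hands-on.

Optimize the model with a gradient descent optimizer, try different hyperparameters such as the learning rate and the number of training epochs, and record the loss value and the values of the variables W and b after training.

Submission requirements:

1. A document (Word or txt format) recording the results of at least 5 runs with different hyperparameters

2. The source code file (.ipynb format), including its run output, for the run you consider the best

3. Both files compressed into a single archive and uploaded as an attachment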

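Grading criteria:

1. The code from the case study is complete, and the model runs and optimizes to a result: 8 points;

2. Hyperparameters have been tuned, and the record file contains at least 5 sets of data: 2 points;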

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.utils import shuffle
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

In [2]:
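df = pd.read_csv('boston.csv')
# Column index 12 holds the label; the first 12 columns are the features
y_data = df.values[:, 12]
x_data = df.values[:, :12]
# Min-max normalize each feature column to the range [0, 1]
x_data = (x_data - x_data.min(axis=0)) / (x_data.max(axis=0) - x_data.min(axis=0))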
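
The normalization above divides by each column's range, so a constant column would produce a division by zero. The following is a minimal sketch of a guarded variant, assuming that constant columns should simply map to 0. The helper name `min_max_scale` and the `demo` array are illustrative and not part of the original notebook.

```python
import numpy as np

def min_max_scale(x):
    """Scale each column of x to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=np.float64)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # Replace zero ranges with 1 so constant columns do not divide by zero
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (x - col_min) / safe_range

# Hypothetical data: the third column is constant
demo = np.array([[1.0, 10.0, 5.0],
                 [2.0, 30.0, 5.0],
                 [4.0, 20.0, 5.0]])
scaled = min_max_scale(demo)
print(scaled)
```

For columns with a non-zero range this gives the same result as the expression used in the cell above.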

In [3]:
def reset(b):
    # Rebuild the graph from scratch and return a fresh session.
    # NOTE: learning_rate is read from the global scope and must be set before calling reset().
    tf.reset_default_graph()
    x = tf.placeholder(tf.float32, [None, 12], name='X')  # 12 features
    y = tf.placeholder(tf.float32, [None, 1], name='Y')  # label
    with tf.name_scope('Model'):
        # Initialize the weights for the 12 features
        w = tf.Variable(tf.random_normal([12, 1], stddev=0.01), name='W')
        # Initialize the bias with the value passed in
        b = tf.Variable(b, name='b')
        # Multivariate linear model
        def model(x, w, b):
            return tf.matmul(x, w) + b
        # Forward computation node
        pred = model(x, w, b)
    # Loss function
    with tf.name_scope('LossFunction'):
        loss_function = tf.reduce_mean(tf.pow(y - pred, 2))  # mean squared error
    # Gradient descent optimizer
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss_function)
    sess = tf.Session()
    # Initialize the variables
    init = tf.global_variables_initializer()
    sess.run(init)
    return sess, x, y, w, b, optimizer, loss_function
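
To make explicit what one optimizer step on the graph built by `reset` computes, here is a minimal NumPy sketch. It assumes single-sample updates on the squared error, matching how the training loops in this notebook feed one row at a time. The function name `sgd_step` and the synthetic data are illustrative only and are not part of the original notebook.

```python
import numpy as np

def sgd_step(w, b, xs, ys, lr):
    """One gradient descent step on the squared error of a single sample."""
    pred = xs @ w + b                # forward pass, shape (1, 1)
    err = pred - ys                  # residual, shape (1, 1)
    loss = float(np.mean(err ** 2))  # loss before the update
    grad_w = 2.0 * xs.T @ err        # d(loss)/dw, shape (n_features, 1)
    grad_b = 2.0 * float(err.sum())  # d(loss)/db
    return w - lr * grad_w, b - lr * grad_b, loss

# Synthetic, noiseless data (an assumption for illustration only)
rng = np.random.default_rng(0)
X = rng.random((50, 3))
true_w = np.array([[1.0], [-2.0], [0.5]])
Y = X @ true_w + 4.0

w = np.zeros((3, 1))
b = 0.0
epoch_losses = []
for epoch in range(200):
    total = 0.0
    for i in range(len(X)):
        w, b, loss = sgd_step(w, b, X[i:i + 1], Y[i:i + 1], lr=0.05)
        total += loss
    epoch_losses.append(total / len(X))

print(epoch_losses[0], epoch_losses[-1], b)
```

As in the notebook, the reported epoch loss is the average of the per-sample losses measured before each update, not the loss of the final parameters over the whole dataset.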

## Exploring the hyperparameters
- Training epochs train_epochs: 5, 50, 100
- Learning rate learning_rates: 0.1, 0.01, 0.001
- Initial bias b: 1.0, 30.0, 100.0
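
The grid above can be enumerated as in the sketch below, which only counts the combinations and does not train anything. It assumes, as the search cell in this notebook does, that the epoch values are checkpoints inside a single run of `max(train_epochs)` epochs, so each (learning rate, initial bias) pair is trained once. The names `runs` and `records` are illustrative.

```python
from itertools import product

train_epochs = [5, 50, 100]
learning_rates = [0.1, 0.01, 0.001]
bs = [1.0, 30.0, 100.0]

# One training run per (learning_rate, initial bias) pair
runs = list(product(learning_rates, bs))

# One recorded result per run and epoch checkpoint
records = [(lr, b0, ep) for lr, b0 in runs for ep in train_epochs]

print(len(runs), len(records))
```

Recording at checkpoints keeps the number of training runs at 9 while still producing 27 result entries.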

In [4]:
# train_epochs = [5, 50, 100]
# learning_rates = [0.1, 0.01, 0.001]
# bs = [1., 30., 100.]
# f = open('result.txt', 'w', encoding='utf-8')
# for learning_rate in learning_rates:
#     for b0 in bs:
#         sess, x, y, w, b, optimizer, loss_function = reset(b0)
#         for epoch in range(max(train_epochs)):
#             loss_sum = 0
#             for xs, ys in zip(x_data, y_data):
#                 xs = xs.reshape(1, 12)
#                 ys = ys.reshape(1, 1)
#                 _, loss = sess.run([optimizer, loss_function], feed_dict={x: xs, y: ys})
#                 loss_sum += loss
#             b_hat = b.eval(sess)
#             w_hat = w.eval(sess)
#             loss_average = loss_sum / len(y_data)
#             x_data, y_data = shuffle(x_data, y_data)
#             if epoch + 1 in train_epochs:
#                 text = (f'Hyperparameters: train_epochs = {epoch+1}, learning_rate = {learning_rate}, initial bias b = {b0}\n'
#                         + f'After training: loss = {loss_average:.5f}, b = {b_hat:.5f},\n'
#                         + f'w = {list(np.around(w_hat[:, 0], decimals=2))}\n\n')
#                 f.write(text)  # the file is already opened with encoding='utf-8'
# f.close()

**The loss is lowest with train_epochs = 100, learning_rate = 0.01, and initial bias b = 100.0.**

In [48]:
learning_rate = 0.01
train_epochs = 100
b0 = 100.0
sess, x, y, w, b, optimizer, loss_function = reset(b0)
for epoch in range(train_epochs):
    loss_sum = 0
    for xs, ys in zip(x_data, y_data):
        xs = xs.reshape(1, 12)
        ys = ys.reshape(1, 1)
        _, loss = sess.run([optimizer, loss_function], feed_dict={x: xs, y: ys})
        loss_sum += loss
    b_hat = b.eval(sess)
    w_hat = w.eval(sess)
    loss_average = loss_sum / len(y_data)
    x_data, y_data = shuffle(x_data, y_data)
print(f'Hyperparameters: train_epochs = {epoch+1}, learning_rate = {learning_rate}, initial bias b = {b0}\n'
      + f'After training: loss = {loss_average:.5f}, b = {b_hat:.5f},\n'
      + f'w = {list(np.around(w_hat[:, 0], decimals=2))}')

Hyperparameters: train_epochs = 100, learning_rate = 0.01, initial bias b = 100.0
After training: loss = 23.63922, b = 31.31480,
w = [-10.58, 4.61, 0.58, 2.82, -8.92, 19.03, 0.58, -16.5, 7.05, -6.54, -8.7, -20.03]
