## 1. Purpose
- Fit a linear regression model to the known data and use it to predict PE (electrical power output).
- $ PE = \theta_0 + \theta_1 \cdot AT + \theta_2 \cdot V + \theta_3 \cdot AP + \theta_4 \cdot RH $
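
To make the formula concrete, here is a minimal sketch of how the coefficients $\theta_0 \dots \theta_4$ can be estimated by ordinary least squares with NumPy. The data is synthetic and the values in `true_theta` are made up for illustration; they are not taken from the CCPP dataset.

```python
import numpy as np

# Synthetic data only: four made-up features standing in for AT, V, AP, RH.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

# Hypothetical theta_0..theta_4, chosen only for this illustration.
true_theta = np.array([450.0, -2.0, -0.3, 0.1, -0.2])
y = true_theta[0] + X @ true_theta[1:] + rng.normal(scale=0.01, size=200)

# Prepend a column of ones so that theta_0 acts as the intercept.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimize ||X_design @ theta - y||^2.
theta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(theta)
```

The recovered `theta` should be close to `true_theta`. `LinearRegression` in scikit-learn solves the same least-squares problem and reports $\theta_0$ as `intercept_` and the remaining coefficients as `coef_`.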

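## 2. Dataset
- CCPP: data from a Combined Cycle Power Plant (CCPP), a generating unit that combines gas and steam turbines.
- Source: http://archive.ics.uci.edu/ml/machine-learning-databases/00294/
- Description: __9568__ samples in total, each with __5__ columns: AT (ambient temperature), V (exhaust vacuum), AP (ambient pressure), RH (relative humidity), and PE (electrical power output).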

## 3. Import libraries

In [1]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn import metrics

## 4. Load the data

- Convert the downloaded data to _csv_ format and read it with Pandas.
- Show the first 5 rows.

In [2]:
ccpp = pd.read_csv("../data/ccpp.csv")
ccpp.head(5)

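|  | AT | V | AP | RH | PE |
|---|---|---|---|---|---|
| 0 | 8.34 | 40.77 | 1010.84 | 90.01 | 480.48 |
| 1 | 23.64 | 58.49 | 1011.40 | 74.20 | 445.75 |
| 2 | 29.74 | 56.90 | 1007.15 | 41.91 | 438.76 |
| 3 | 19.07 | 49.69 | 1007.22 | 76.79 | 453.09 |
| 4 | 11.80 | 40.66 | 1017.13 | 97.20 | 464.43 |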


- Summary statistics

In [3]:
ccpp.describe()

|  | AT | V | AP | RH | PE |
|---|---|---|---|---|---|
| count | 9568.0 | 9568.0 | 9568.0 | 9568.0 | 9568.0 |
| mean | 19.651231 | 54.305804 | 1013.259078 | 73.308978 | 454.365009 |
| std | 7.452473 | 12.707893 | 5.938784 | 14.600269 | 17.066995 |
| min | 1.81 | 25.36 | 992.89 | 25.56 | 420.26 |
| 25% | 13.51 | 41.74 | 1009.1 | 63.3275 | 439.75 |
| 50% | 20.345 | 52.08 | 1012.94 | 74.975 | 451.55 |
| 75% | 25.72 | 66.54 | 1017.26 | 84.83 | 468.43 |
| max | 37.11 | 81.56 | 1033.3 | 100.16 | 495.76 |


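## 5. Split into training and test sets
- By default, `train_test_split` uses 75% of the samples as the training set and the remaining 25% as the test set.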

In [4]:
X = ccpp[['AT', 'V', 'AP', 'RH']]
y = ccpp[['PE']]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=66)
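
The default split sizes can be checked without the real data. The sketch below uses placeholder arrays with the same number of rows as the dataset (9568); the names `X_demo` and `y_demo` are introduced here for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same number of rows as the dataset (9568).
X_demo = np.zeros((9568, 4))
y_demo = np.zeros(9568)

# Neither test_size nor train_size is given, so the default 25% test share applies.
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=66)
print(len(X_tr), len(X_te))  # 7176 2392
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in each subset on every run.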

## 6. Train the model

In [5]:
linreg = LinearRegression()
linreg.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

- Intercept and coefficients

In [6]:
linreg.intercept_, linreg.coef_

(array([459.05715008]),
 array([[-1.97534206, -0.23747628,  0.05809034, -0.16157705]]))
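
As a sanity check, the fitted intercept and coefficients can be plugged into the formula from section 1 by hand. The sketch below copies the values printed above and applies them to the first row shown by `ccpp.head(5)`; the variable names are introduced here for illustration.

```python
import numpy as np

# Values copied from the linreg.intercept_ and linreg.coef_ output above.
intercept = 459.05715008
coef = np.array([-1.97534206, -0.23747628, 0.05809034, -0.16157705])  # AT, V, AP, RH

# First row of the dataset as shown by ccpp.head(5): AT, V, AP, RH.
row0 = np.array([8.34, 40.77, 1010.84, 90.01])

# PE = theta_0 + theta_1*AT + theta_2*V + theta_3*AP + theta_4*RH
pred = float(intercept + coef @ row0)
print(round(pred, 2))  # 477.08
```

The recorded PE for that row is 480.48, so the model's estimate is about 3.4 below the observed value. The negative coefficient on AT indicates that, in this fit, higher temperature is associated with lower predicted output.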

## 7. Predict

In [7]:
y_pred = linreg.predict(X_test)

## 8. Evaluate the model

- $ R^2 $ score

In [8]:
metrics.r2_score(y_test, y_pred)

0.9302147549213905
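
For reference, $R^2$ is defined as $1 - SS_{res} / SS_{tot}$, the share of the variance in the target that the model explains. The sketch below computes it from that definition on a few made-up values; `r2_from_definition` is a helper written for this illustration, not part of scikit-learn.

```python
import numpy as np

def r2_from_definition(y_true, y_hat):
    """Compute R^2 = 1 - SS_res / SS_tot for 1-D targets."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_hat = np.asarray(y_hat, dtype=float).ravel()
    ss_res = np.sum((y_true - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up observed and predicted values, for illustration only.
y_true_demo = [480.0, 445.0, 438.0, 453.0]
y_hat_demo = [477.0, 447.0, 440.0, 452.0]
r2_demo = r2_from_definition(y_true_demo, y_hat_demo)
print(r2_demo)
```

A score of about 0.93 on the test set, as reported above, means the model accounts for roughly 93% of the variance in PE on data it was not trained on.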

- MSE (mean squared error)

In [9]:
metrics.mean_squared_error(y_test, y_pred)

20.39029476832356

- MAE (mean absolute error)

In [10]:
metrics.mean_absolute_error(y_test, y_pred)

3.629817335844457
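
MSE is expressed in squared units of PE, which makes it hard to compare with MAE directly. A common companion metric is RMSE, the square root of MSE. The sketch below computes all three from their definitions on made-up values and also takes the square root of the MSE reported above; `mse_mae_rmse` is a helper written for this illustration.

```python
import numpy as np

def mse_mae_rmse(y_true, y_hat):
    """Return (MSE, MAE, RMSE) computed from their definitions."""
    err = np.asarray(y_true, dtype=float).ravel() - np.asarray(y_hat, dtype=float).ravel()
    mse = np.mean(err ** 2)       # mean of squared errors
    mae = np.mean(np.abs(err))    # mean of absolute errors
    rmse = np.sqrt(mse)           # back in the same units as the target
    return mse, mae, rmse

# Made-up observed and predicted values, for illustration only.
mse_demo, mae_demo, rmse_demo = mse_mae_rmse(
    [480.0, 445.0, 438.0, 453.0],
    [477.0, 447.0, 440.0, 452.0],
)
print(mse_demo, mae_demo, rmse_demo)

# RMSE implied by the test-set MSE reported above.
rmse_reported = np.sqrt(20.39029476832356)
print(round(float(rmse_reported), 2))  # 4.52
```

An RMSE of about 4.52 against an MAE of about 3.63 suggests that a few larger errors contribute to the squared metric, since RMSE weights large residuals more heavily than MAE does.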