![Illustration of multiple linear regression](多元线性回归图示.png)

![Objective](目标.png)

$\hat{y}^{(i)}=\theta_{0}+\theta_{1} X_{1}^{(i)}+\theta_{2} X_{2}^{(i)}+\ldots+\theta_{n} X_{n}^{(i)}$  
$\theta=\left(\theta_{0}, \theta_{1}, \theta_{2}, \ldots, \theta_{n}\right)^{T}$  
$\hat{y}^{(i)}=\theta_{0} X_{0}^{(i)}+\theta_{1} X_{1}^{(i)}+\theta_{2} X_{2}^{(i)}+\ldots+\theta_{n} X_{n}^{(i)}, X_{0}^{(i)} \equiv 1$  
$X^{(i)}=\left(X_{0}^{(i)}, X_{1}^{(i)}, X_{2}^{(i)}, \ldots, X_{n}^{(i)}\right)$  
$\hat{y}^{(i)}=X^{(i)} \cdot \theta$  

$X_{b}=\left(\begin{array}{ccccc}1 & X_{1}^{(1)} & X_{2}^{(1)} & \dots & X_{n}^{(1)} \\ 1 & X_{1}^{(2)} & X_{2}^{(2)} & \dots & X_{n}^{(2)} \\ \dots & & & & \dots \\ 1 & X_{1}^{(m)} & X_{2}^{(m)} & \dots & X_{n}^{(m)}\end{array}\right)$  
$\hat{y}=X_{b} \cdot \theta$
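The vectorized form $\hat{y}=X_{b} \cdot \theta$ can be checked numerically. The following is a minimal sketch with made-up toy values (the arrays `X` and `theta` are illustrative only): it prepends the constant column $X_{0} \equiv 1$ and compares the matrix product with the sample-by-sample formula.

```python
import numpy as np

# Toy data: m = 3 samples, n = 2 features (values chosen only for illustration)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
theta = np.array([0.5, 2.0, -1.0])  # (theta_0, theta_1, theta_2)

# Prepend the constant column X_0 = 1 to obtain X_b with shape (m, n + 1)
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# A single matrix-vector product yields the predictions for all m samples
y_hat = X_b @ theta

# The same predictions computed one sample at a time from the scalar formula
y_hat_loop = np.array([theta[0] + X[i] @ theta[1:] for i in range(len(X))])

print(y_hat)       # predictions from the matrix form
print(y_hat_loop)  # predictions from the per-sample form
```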

![Objective rewritten in matrix form](目标转换.png)
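As a brief sketch of the reasoning, assuming the objective in the figure above is the usual least-squares loss (the symbol $J$ is introduced here only for illustration), setting the gradient with respect to $\theta$ to zero gives a linear system in $\theta$:

$J(\theta)=\left(y-X_{b} \theta\right)^{T}\left(y-X_{b} \theta\right)$  
$\nabla_{\theta} J(\theta)=-2 X_{b}^{T}\left(y-X_{b} \theta\right)=0$  
$X_{b}^{T} X_{b} \theta=X_{b}^{T} y$

Provided $X_{b}^{T} X_{b}$ is invertible, this system has a unique solution.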

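The normal equation solution for multiple linear regression:  
$\theta=\left(X_{b}^{T} X_{b}\right)^{-1} X_{b}^{T} y$  
Drawback: high time complexity. Inverting the $(n+1) \times(n+1)$ matrix $X_{b}^{T} X_{b}$ costs roughly $O(n^{3})$ with standard algorithms.  
Advantage: the data does not need to be normalized (feature-scaled) beforehand.  
Think about why.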
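One way to explore the question is numerically. The sketch below uses synthetic data and a hypothetical helper `fit_normal_equation` (neither comes from the original notebook). It fits the same data twice, once with the first feature multiplied by 1000, and compares the results. Because the closed-form solution is the exact least-squares minimizer, rescaling a feature only rescales its coefficient inversely and leaves the predictions unchanged.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Hypothetical helper: return theta solving X_b^T X_b theta = X_b^T y."""
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])
    # Solving the linear system is equivalent to multiplying by the inverse
    return np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

# Synthetic data: y = 4 + 3*x1 - 2*x2 + small noise (illustrative values)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 4.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

theta = fit_normal_equation(X, y)

# Rescale the first feature by 1000, as if its unit had changed
X_scaled = X.copy()
X_scaled[:, 0] *= 1000.0
theta_scaled = fit_normal_equation(X_scaled, y)

ones = np.ones((X.shape[0], 1))
pred = np.hstack([ones, X]) @ theta
pred_scaled = np.hstack([ones, X_scaled]) @ theta_scaled

print(theta)                           # fitted on the original features
print(theta_scaled)                    # coefficient of feature 1 shrinks by 1000x
print(np.allclose(pred, pred_scaled))  # predictions are unaffected by the rescaling
```

Iterative methods such as gradient descent behave differently: there, features on very different scales slow down or destabilize convergence, which is why scaling matters for them.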

![](返回值.png)

## Implementing a multiple linear regression model
 

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets

In [2]:
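# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell requires an older scikit-learn version.
boston = datasets.load_boston()
X = boston.data
y = boston.target
# Drop the samples whose target value is capped at the dataset maximum of 50
X = X[y < 50]
y = y[y < 50]
print("X shape:" + str(X.shape))
print("y shape:" + str(y.shape))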

X shape:(490, 13)
y shape:(490,)


In [3]:
from Machine_Learning.LinearReg.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,seed=666)
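The `train_test_split` imported above is the author's own module and its source is not shown here. The following is only a guess at what such a function might look like: the `test_ratio` parameter, its default of 0.2, and the use of `np.random.default_rng` are assumptions, while the `seed` argument and the return order follow the call above.

```python
import numpy as np

def train_test_split(X, y, test_ratio=0.2, seed=None):
    """Hypothetical stand-in: shuffle the rows, then split into train and test."""
    assert X.shape[0] == y.shape[0], "X and y must have the same number of rows"
    assert 0.0 <= test_ratio <= 1.0, "test_ratio must lie in [0, 1]"

    # A seeded generator makes the shuffle reproducible
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(X))

    test_size = int(len(X) * test_ratio)
    test_idx = shuffled[:test_size]
    train_idx = shuffled[test_size:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Usage on toy data: 10 samples with 2 features each
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, seed=666)
print(Xtr.shape, Xte.shape)
```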

In [4]:
from Machine_Learning.LinearReg.LinearRegression import LinearRegression
reg = LinearRegression()
reg.fit_normal(X_train,y_train)

LinearRegression()
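The custom `LinearRegression` class is likewise imported from the author's package without its source. A minimal sketch of how `fit_normal` could apply the normal equation is given below. The class name `LinearRegressionSketch`, the private attribute `_theta`, and the `predict` method are assumptions; `coef_` and `interception_` mirror the attribute names used in this notebook.

```python
import numpy as np

class LinearRegressionSketch:
    """Hypothetical re-creation of a normal-equation linear regression."""

    def __init__(self):
        self.coef_ = None          # theta_1 ... theta_n
        self.interception_ = None  # theta_0
        self._theta = None

    def fit_normal(self, X_train, y_train):
        assert X_train.shape[0] == y_train.shape[0]
        # Build X_b by prepending a column of ones
        X_b = np.hstack([np.ones((len(X_train), 1)), X_train])
        # theta = (X_b^T X_b)^{-1} X_b^T y
        self._theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y_train
        self.interception_ = self._theta[0]
        self.coef_ = self._theta[1:]
        return self  # returning self is why the cell echoes the object

    def predict(self, X):
        X_b = np.hstack([np.ones((len(X), 1)), X])
        return X_b @ self._theta

    def __repr__(self):
        return "LinearRegressionSketch()"

# Usage on noise-free toy data generated from y = 1 + 2*x1 + 3*x2
X_toy = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
y_toy = 1.0 + 2.0 * X_toy[:, 0] + 3.0 * X_toy[:, 1]
model = LinearRegressionSketch().fit_normal(X_toy, y_toy)
print(model.coef_, model.interception_)
```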

In [5]:
reg.coef_

array([-1.12728076e-01,  3.83088307e-02, -4.09966537e-02,  7.27425361e-01,
       -1.39378594e+01,  3.37684332e+00, -2.39762421e-02, -1.21315896e+00,
        2.73164472e-01, -1.40027977e-02, -8.62432754e-01,  5.37440212e-03,
       -3.59762900e-01])

In [6]:
reg.interception_

36.81014683462021

In [8]:
reg.score(X_test,y_test)

0.7989582352420643
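The notebook does not show how `score` is computed. Assuming it returns the coefficient of determination $R^{2}$, as the `score` method of scikit-learn's `LinearRegression` does, a minimal sketch (the function name `r2_score_sketch` and the toy values are illustrative) would be:

```python
import numpy as np

def r2_score_sketch(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy values chosen only for illustration
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 9.0])
r2 = r2_score_sketch(y_true, y_pred)
print(r2)
```

Under this definition a perfect prediction scores 1, and always predicting the mean of `y_true` scores 0.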

### Using scikit-learn

In [4]:
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()

In [5]:
lin_reg.fit(X_train,y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [6]:
lin_reg.coef_

array([-1.12728076e-01,  3.83088307e-02, -4.09966537e-02,  7.27425361e-01,
       -1.39378594e+01,  3.37684332e+00, -2.39762421e-02, -1.21315896e+00,
        2.73164472e-01, -1.40027977e-02, -8.62432754e-01,  5.37440212e-03,
       -3.59762900e-01])

In [7]:
lin_reg.intercept_

36.81014683464063
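The coefficients match the custom model to every printed digit, while the two intercepts differ by about 2e-11. A plausible explanation, stated here as an assumption, is that the two implementations use different numerical routines: the custom model follows the normal equation, whereas scikit-learn, to the best of my knowledge, relies on a least-squares solver. The sketch below uses synthetic data to show that an explicit inverse and `np.linalg.lstsq` agree only up to floating-point rounding.

```python
import numpy as np

# Synthetic regression problem (illustrative values only)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 1.5 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)
X_b = np.hstack([np.ones((200, 1)), X])

# Route 1: explicit normal equation
theta_inv = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y

# Route 2: least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# The results agree up to rounding error, not necessarily bit for bit
max_diff = np.max(np.abs(theta_inv - theta_lstsq))
print(max_diff)

# The two intercepts reported in this notebook are equal within tolerance
same_intercept = np.isclose(36.81014683462021, 36.81014683464063)
print(same_intercept)
```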