#### Problem statement
A sample usually has more than one feature. In that case we can use multiple linear regression.

#### Matrix form of multiple linear regression
$$
\hat{y} = X_b\cdot\theta
$$
<br>
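$$
\text{where } X_b = \begin{pmatrix} 
1 & X_1^{(1)} & X_2^{(1)} & \dots & X_n^{(1)} \\
1 & X_1^{(2)} & X_2^{(2)} & \dots & X_n^{(2)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & X_1^{(m)} & X_2^{(m)} & \dots & X_n^{(m)}
\end{pmatrix} 
$$
Here $m$ is the number of samples and $n$ is the number of features.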
<br>
$$
\theta = \begin{pmatrix}
\theta_0 \\
\theta_1 \\
\vdots \\
\theta_n
\end{pmatrix}
$$
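The goal is still to make ${\sum\limits_{i=1}^m(y_i - \hat{y}_i)^2}$ as small as possible, where $m$ is the number of samples. In matrix form, this means minimizing $(y - X_b\cdot\theta)^T(y - X_b\cdot\theta)$.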
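
To make the matrix form concrete, here is a minimal sketch on toy data. The arrays `X`, `theta`, and `y` are made-up values for illustration only; they do not come from the dataset used later.

```python
import numpy as np

# Hypothetical toy data: 3 samples, 2 features each.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
# Hypothetical parameters: theta_0 (intercept), then theta_1 and theta_2.
theta = np.array([0.5, 2.0, -1.0])
# Hypothetical targets.
y = np.array([1.0, 2.0, 5.0])

# Prepend a column of ones so that theta_0 is multiplied by 1 for every sample.
X_b = np.hstack([np.ones((len(X), 1)), X])

# Prediction in matrix form: y_hat = X_b . theta
y_hat = X_b.dot(theta)

# The loss written as a sum of squares and as a matrix product is the same number.
residual = y - y_hat
loss_sum = np.sum(residual ** 2)
loss_matrix = residual.T.dot(residual)
print(y_hat, loss_sum, loss_matrix)
```

The column of ones is what lets the intercept live inside $\theta$ instead of being handled separately.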

#### Normal equation solution of multiple linear regression, obtained by least squares
$$\theta = (X_b^T \cdot X_b)^{-1}X_b^T \cdot y$$
where $\theta_0$ is the intercept and $\theta_1 \dots \theta_n$ are the coefficients.
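
For reference, one standard way to arrive at this solution is to expand the loss, take its gradient with respect to $\theta$, and set it to zero. The sketch below names the loss $J(\theta)$ (a label introduced here) and assumes that $X_b^T X_b$ is invertible:

$$
\begin{aligned}
J(\theta) &= (y - X_b\theta)^T(y - X_b\theta) = y^T y - 2\theta^T X_b^T y + \theta^T X_b^T X_b \theta \\
\nabla_\theta J(\theta) &= -2 X_b^T y + 2 X_b^T X_b \theta = 0 \\
X_b^T X_b \theta &= X_b^T y \\
\theta &= (X_b^T X_b)^{-1} X_b^T y
\end{aligned}
$$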

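#### The normal equation has a high time complexity
Solving it costs $O(n^3)$ in the number of features $n$; with optimized algorithms this drops to about $O(n^{2.4})$.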
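
The cubic cost comes from working with the $(n+1)\times(n+1)$ matrix $X_b^T X_b$. As a hedged aside, the sketch below computes $\theta$ three ways on synthetic data: with an explicit inverse, with `np.linalg.solve` on the same linear system, and with `np.linalg.lstsq`. The data and `true_theta` are made up for illustration. Avoiding the explicit inverse does not change the asymptotic cost, but it is generally the numerically safer choice.

```python
import numpy as np

# Synthetic, noise-free data (an assumption for illustration): 50 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_theta = np.array([4.0, 1.0, -2.0, 0.5])  # intercept first, then coefficients

X_b = np.hstack([np.ones((len(X), 1)), X])
y = X_b.dot(true_theta)

# 1) Normal equation with an explicit inverse.
theta_inv = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)

# 2) Same linear system, solved without forming the inverse.
theta_solve = np.linalg.solve(X_b.T.dot(X_b), X_b.T.dot(y))

# 3) Least-squares solver applied directly to X_b.
theta_lstsq, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print(theta_inv, theta_solve, theta_lstsq)
```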

In [15]:
#### Implement a multiple linear regression
import numpy as np

class LinearRegression(object):
    def __init__(self):
        self.coef = None
        self.intercept = None
        self._theta = None
        
    # x_train is a matrix: each row is one sample, each column is one feature
    def fit(self, x_train, y_train):
        # Prepend a column of ones so that theta[0] acts as the intercept
        x_b = np.hstack([np.ones((len(x_train), 1)), x_train])
        # Normal equation: theta = (X_b^T X_b)^{-1} X_b^T y
        theta = np.linalg.inv(x_b.T.dot(x_b)).dot(x_b.T).dot(y_train)
        # theta[0] is the intercept; theta[1:] are the per-feature coefficients
        self.intercept = theta[0]
        self.coef = theta[1:]
        self._theta = theta
        return self

    def predict(self, x_predict):
        # Add the same column of ones that was used during fit
        x_b = np.hstack([np.ones((len(x_predict), 1)), x_predict])
        y_hat = x_b.dot(self._theta)
        return y_hat

    def score(self, y_test, y_hat):
        # R squared: 1 - (residual sum of squares) / (total sum of squares)
        r_squared = 1 - (np.sum((y_test - y_hat)**2)) / np.sum(((y_test - np.mean(y_test))**2))
        return r_squared

In [16]:
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
diabetes = load_diabetes() 
print(diabetes.feature_names)
x_train, x_test, y_train, y_test = train_test_split(diabetes.data, diabetes.target, test_size = 0.25)
lreg1 = LinearRegression()
lreg1.fit(x_train, y_train)
print(lreg1.intercept)
print(lreg1.coef)

['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
153.37863243927978
[ -42.88663576 -238.47326425  535.12761562  332.71480091 -403.21337006
  101.0201579   -62.06773609  158.1621055   654.5748687    68.83769893]


In [17]:
y_hat = lreg1.predict(x_test)
print(y_hat)

[ 86.88238758 167.6106795  155.45672323 184.86596327 212.64072984
 189.15128679  75.84946832  74.54306265 215.5875488  204.14928884
 100.83366553 115.29992265 217.88928886 214.40195645 160.15363927
 180.40546156 259.4548514  268.59561159 110.31646324 150.82507154
 132.09018769  74.03621755 158.12279477 112.33430171 200.11333622
 133.63310459 245.2204352   81.30336712 129.36587662  80.79751765
 108.37121903  70.08978848 137.15405925 305.62463521  82.93027129
  84.45917548 195.52720445 189.72314944  75.4555974  169.15929234
  83.4016636  126.90928864 123.55674986 130.05021835 160.26077918
 132.75104073 198.34764372 126.52254562 103.78874585 140.43478971
 210.21495789 105.14994536 140.78023377 271.06796377 170.45254201
 124.01579401 151.99751627 185.09520448  91.31293901 131.80284958
 143.34286805 116.57341257 119.39799468 170.41380876 195.03500819
 195.19952277 222.72151049 108.28681449 192.79177924  62.73137431
 222.12416924  78.29779784  49.26468561 272.85523655 144.96377712
 158.98337

In [18]:
lreg1.score(y_test, y_hat)

0.47243325274544
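
To help read this score, the following minimal sketch applies the same $R^2$ formula to two extreme cases on made-up targets. The helper `r_squared` is a hypothetical standalone function written for this illustration, not part of the class above.

```python
import numpy as np

def r_squared(y_true, y_pred):
    # 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot

# Hypothetical targets for illustration.
y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Perfect predictions leave no residual error.
perfect = r_squared(y_true, y_true)

# Always predicting the mean is the baseline that the formula compares against.
baseline = r_squared(y_true, np.full_like(y_true, y_true.mean()))

print(perfect, baseline)
```

A perfect model scores 1 and the predict-the-mean baseline scores 0, so the score of about 0.47 above sits roughly halfway between the two.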