# Gradient Descent for Linear Regression (Vectorized)

![Vectorizing gradient descent](https://i.loli.net/2018/10/07/5bb9fa0a5d844.png)
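The vectorization shown in the figure can be sketched in NumPy as follows. This is a minimal illustration, assuming the mean-squared-error loss `np.sum((y - X_b.dot(theta)) ** 2) / len(y)` that appears in the warning output further down. The function names `dJ_loop` and `dJ_vectorized` are hypothetical and are not taken from playML, and the data is synthetic.

```python
import numpy as np

def dJ_loop(theta, X_b, y):
    """Gradient of the MSE loss, computed one component at a time."""
    m = len(y)
    residual = X_b.dot(theta) - y
    grad = np.empty(len(theta))
    for j in range(len(theta)):
        grad[j] = np.sum(residual * X_b[:, j])
    return grad * 2.0 / m

def dJ_vectorized(theta, X_b, y):
    """Same gradient as one matrix product: (2/m) * X_b^T (X_b theta - y)."""
    return X_b.T.dot(X_b.dot(theta) - y) * 2.0 / len(y)

# Small synthetic check (made-up data, not the Boston set)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X_b = np.hstack([np.ones((len(X), 1)), X])  # prepend the intercept column
y = rng.normal(size=100)
theta = rng.normal(size=4)

grads_match = np.allclose(dJ_loop(theta, X_b, y), dJ_vectorized(theta, X_b, y))
```

Both functions return the same vector. The vectorized form replaces the Python loop over features with a single matrix product, which is where the speedup comes from.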

In [1]:
import numpy as np
from sklearn import datasets

In [5]:
boston = datasets.load_boston()  # note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release

In [15]:
X = boston.data
y = boston.target

X = X[y < 50.0]
y = y[y < 50.0]
print(X.shape)
print(y.shape)

(490, 13)
(490,)


In [16]:
from playML.model_selection import train_test_split

In [17]:
X_train, X_test, y_train, y_test = train_test_split(X, y, seed=666)

## Using the normal equation

In [18]:
from playML.LinearRegression import LinearRegression

In [19]:
lin_reg1 = LinearRegression()

In [20]:
%time lin_reg1.fit_normal(X_train, y_train)

CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 27.8 ms


LinearRegression()

In [21]:
lin_reg1.score(X_test, y_test)

0.81298026026585901

## Using gradient descent (vectorized)

In [22]:
lin_reg2 = LinearRegression()

In [23]:
lin_reg2.fit_gd(X_train, y_train) # the features differ widely in scale, so the default eta makes the computation overflow

  return np.sum((y - X_b.dot(theta)) ** 2) / len(y)
  if (abs(J(theta, X_b, y) - J(last_theta, X_b, y)) < epsilon):


LinearRegression()
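The overflow above can be reproduced on synthetic data. The sketch below is an assumption about how a batch gradient descent like `fit_gd` might be written, pieced together from the two lines visible in the warning output (the loss `J` and the `epsilon` stopping test). It is not playML's actual code, and the default values for `eta`, `n_iters`, and `epsilon` are made up for illustration.

```python
import numpy as np

def J(theta, X_b, y):
    """Mean squared error, as in the warning output above."""
    return np.sum((y - X_b.dot(theta)) ** 2) / len(y)

def dJ(theta, X_b, y):
    """Vectorized gradient of J."""
    return X_b.T.dot(X_b.dot(theta) - y) * 2.0 / len(y)

def gradient_descent(X_b, y, eta=0.01, n_iters=10_000, epsilon=1e-8):
    """Hypothetical batch gradient descent; stops when J barely changes."""
    theta = np.zeros(X_b.shape[1])
    for _ in range(int(n_iters)):
        last_theta = theta
        theta = theta - eta * dJ(theta, X_b, y)
        if abs(J(theta, X_b, y) - J(last_theta, X_b, y)) < epsilon:
            break
    return theta

# Synthetic data: one feature of order 1, one of order 500
rng = np.random.default_rng(42)
m = 200
x_small = rng.normal(size=m)
x_large = rng.uniform(0, 500, size=m)
X_b = np.column_stack([np.ones(m), x_small, x_large])
y = 3.0 + 2.0 * x_small + 0.05 * x_large + rng.normal(scale=0.1, size=m)

# eta=0.01 is far too large for the big feature: theta blows up to inf/nan
with np.errstate(all="ignore"):
    theta_big_eta = gradient_descent(X_b, y, eta=0.01, n_iters=1000)

# A tiny eta keeps the iteration stable, at the cost of slow progress
theta_small_eta = gradient_descent(X_b, y, eta=1e-6, n_iters=1000)

diverged = not np.all(np.isfinite(theta_big_eta))
```

The step size has to be small enough for the largest-scale feature, which then makes progress along the small-scale features very slow. That trade-off is what the next cells run into.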

In [26]:
lin_reg2.fit_gd(X_train, y_train, eta=0.000001) # with such a small eta, the default number of iterations is not enough and the resulting score is low

LinearRegression()

In [27]:
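lin_reg2.score(X_test, y_test) # most likely not converged yet (the MSE loss of linear regression is convex, so this is not a local optimum); run more iterations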

0.27556634853389195

In [28]:
%time lin_reg2.fit_gd(X_train, y_train, eta=0.000001, n_iters=1e6)

CPU times: user 56.2 s, sys: 70 ms, total: 56.3 s
Wall time: 56.5 s


LinearRegression()

In [29]:
lin_reg2.score(X_test, y_test)

0.75418523539807625

## Standardize the data before using gradient descent (strongly recommended)

In [30]:
from sklearn.preprocessing import StandardScaler

In [31]:
standardScaler = StandardScaler()

In [32]:
standardScaler.fit(X_train)

StandardScaler(copy=True, with_mean=True, with_std=True)

In [33]:
X_train_standard = standardScaler.transform(X_train)

In [34]:
lin_reg3 = LinearRegression()

In [36]:
%time lin_reg3 = lin_reg3.fit_gd(X_train_standard, y_train)

CPU times: user 280 ms, sys: 20 ms, total: 300 ms
Wall time: 301 ms


In [37]:
X_test_standard = standardScaler.transform(X_test)

In [38]:
lin_reg3.score(X_test_standard, y_test)

0.81298806201222351
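What `StandardScaler` does to each column can be sketched in plain NumPy: subtract the column mean and divide by the column standard deviation, with both statistics computed on the training set only and then reused for the test set. The helper names `fit_standardizer` and `standardize` are hypothetical, and the data is synthetic.

```python
import numpy as np

def fit_standardizer(X_train):
    """Per-column mean and population std (ddof=0, as StandardScaler uses)."""
    return X_train.mean(axis=0), X_train.std(axis=0)

def standardize(X, mean, std):
    """Shift and scale each column with previously fitted statistics."""
    return (X - mean) / std

# Synthetic features on very different scales
rng = np.random.default_rng(1)
scales = np.array([1.0, 100.0, 1000.0])
X_train = rng.uniform(0, 1, size=(300, 3)) * scales
X_test = rng.uniform(0, 1, size=(100, 3)) * scales

mean, std = fit_standardizer(X_train)          # fit on training data only
X_train_std = standardize(X_train, mean, std)
X_test_std = standardize(X_test, mean, std)    # reuse the training statistics
```

After this step every training column has mean 0 and standard deviation 1, so a single step size suits all features. The test columns are only approximately centered, because they are transformed with the training statistics.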

## Advantages of gradient descent

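> The larger the dataset, the greater the running-time advantage of gradient descent over the normal equation. This matters most as the number of features grows, because the normal equation has to solve a linear system whose size is the number of features.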