### Ridge regression
Ridge regression is linear regression with L2 regularization. The penalty shrinks the weights, which helps reduce overfitting.
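
For reference, the objective can be written out as below. This follows the form documented for scikit-learn's `Ridge`; the symbols $X$ (feature matrix), $y$ (targets) and $w$ (weights) are notation assumed here rather than taken from these notes, and $\alpha$ is the regularization strength.

$$
\min_{w}\; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2
$$

Setting the gradient to zero, and ignoring the intercept for simplicity, gives the closed-form solution

$$
w = (X^\top X + \alpha I)^{-1} X^\top y
$$

With $\alpha = 0$ this reduces to ordinary least squares; a larger $\alpha$ pulls $w$ toward zero.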

#### API
~~~python
sklearn.linear_model.Ridge
~~~
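- Linear regression with L2 regularization
- `alpha`: regularization strength (the penalty coefficient), i.e. $\lambda$
  - Suggested values: 0 to 1, or 1 to 10
- `solver`: the solver used to fit the model
  - `sag`: Stochastic Average Gradient descent, an option suited to data sets with many samples and many features
- `normalize`: whether to normalize the features before fitting (deprecated in scikit-learn 1.0 and removed in 1.2; standardize with `StandardScaler` beforehand instead)
- `Ridge.coef_`: regression weights
- `Ridge.intercept_`: regression bias (intercept)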
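> `Ridge` fits the same kind of model as `SGDRegressor(penalty='l2', loss='squared_error')`.  
> The difference is the optimizer: `SGDRegressor` learns by stochastic gradient descent, whereas `Ridge` offers SAG (stochastic average gradient) among its solvers.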
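
The parameters and attributes listed above can be tried out with a minimal sketch like the following. The synthetic data from `make_regression` and the specific `tol`, `max_iter` and `random_state` values are assumptions chosen for illustration, not settings from these notes.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration; not the data set used in these notes)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X = StandardScaler().fit_transform(X)  # SAG converges faster on features of similar scale

# Default solver ("auto") versus the SAG solver, same regularization strength
ridge_auto = Ridge(alpha=1.0).fit(X, y)
ridge_sag = Ridge(alpha=1.0, solver="sag", tol=1e-6, max_iter=10000,
                  random_state=0).fit(X, y)

print("coef_ (auto):", ridge_auto.coef_)
print("coef_ (sag): ", ridge_sag.coef_)
print("intercept_:  ", ridge_auto.intercept_)
```

Both solvers minimize the same objective, so the two coefficient vectors should agree closely once SAG has converged.
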

***Effect of regularization strength on the result***

![Effect of different alpha values on the weights](src/sphx_glr_plot_ridge_path_001.png)

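- The stronger the regularization (larger `alpha`), the smaller the weight coefficients
- The weaker the regularization (smaller `alpha`), the larger the weight coefficients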
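
The two points above can be checked numerically. The sketch below is only an illustration under assumed settings: the synthetic data and the list of `alpha` values are made up for the example. It prints the L2 norm of `coef_` for increasing `alpha`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Synthetic regression problem (an assumption; any numeric data set would do)
X, y = make_regression(n_samples=100, n_features=8, noise=5.0, random_state=0)

alphas = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
coef_norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X, y)
    coef_norms.append(np.linalg.norm(model.coef_))  # overall size of the weight vector
    print(f"alpha={alpha:>7}: ||coef_||_2 = {coef_norms[-1]:.3f}")
```

The norm of the weight vector falls as `alpha` grows. An individual coefficient does not have to shrink monotonically, which is why the plot above shows a whole path per weight.
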

~~~python
# Load the data set.
# load_boston() was removed in scikit-learn 1.2, so the data is read from the
# original source, using the snippet from scikit-learn's deprecation notice.
import pandas as pd
import numpy as np
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
target = raw_df.values[1::2, 2]
data.shape  # (n_samples, n_features)
# Split into training and test sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data, target, random_state=22)
# Standardize: fit the scaler on the training set only, then apply it to the test set
from sklearn.preprocessing import StandardScaler
std = StandardScaler()
x_train = std.fit_transform(x_train)
x_test = std.transform(x_test)
~~~

> The Boston housing prices dataset has an ethical problem, and the scikit-learn maintainers strongly discourage its use unless the purpose is to study and educate about ethical issues in data science and machine learning.  
> Alternative datasets include the California housing dataset (`sklearn.datasets.fetch_california_housing`) and the Ames housing dataset.

~~~python
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Fit ridge regression with the default alpha=1.0 and evaluate it on the test set
clf = Ridge()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
ridge_err = mean_squared_error(y_test, y_pred)
print("coef_", clf.coef_)
print("intercept_", clf.intercept_)
print("err", ridge_err)
~~~

~~~
coef_ [-0.63591916  1.12109181 -0.09319611  0.74628129 -1.91888749  2.71927719
 -0.08590464 -3.25882705  2.41315949 -1.76930347 -1.74279405  0.87205004
 -3.89758657]
intercept_ 22.62137203166228
err 20.65644821435496
~~~

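~~~python
# Grid search over the Ridge hyperparameters
from sklearn.model_selection import GridSearchCV
estimator = Ridge()
params = {
    "alpha": [0.8, 0.5, 0.2, 0.1],
    # max_iter only matters for the iterative solvers; the default solver on dense
    # data ignores it. Recent scikit-learn releases require max_iter >= 1, so drop
    # the 0 there.
    "max_iter": [0, 10, 20, 30, 40, 50, 100, 200, 500, 800, 1000, 5000, 10000, 20000, 50000]
}
grid = GridSearchCV(estimator, params, cv=3, n_jobs=-1)
grid.fit(x_train, y_train)
print("best_params_", grid.best_params_)
print("best_score_", grid.best_score_)  # mean cross-validated R^2 (the default scorer)

# Refit on the full training set with the best parameters
ridgeGS = Ridge(alpha=grid.best_params_["alpha"], max_iter=grid.best_params_["max_iter"])
ridgeGS.fit(x_train, y_train)

print("ridgeGS coef_:", ridgeGS.coef_)
print("ridgeGS intercept_:", ridgeGS.intercept_)

# Evaluate the tuned model on its own predictions (not on y_pred from the default Ridge)
ridgeGS_pred = ridgeGS.predict(x_test)
ridgeGS_err = mean_squared_error(y_test, ridgeGS_pred)
print("ridgeGS_err", ridgeGS_err)
~~~

~~~
best_params_ {'alpha': 0.8, 'max_iter': 0}
best_score_ 0.6882262093590968
ridgeGS coef_: [-0.63829726  1.12608043 -0.08671537  0.74549151 -1.92600823  2.71728832
 -0.08424494 -3.26674542  2.43047046 -1.78614331 -1.74430576  0.87231885
 -3.90072162]
ridgeGS intercept_: 22.62137203166228
~~~

> `max_iter` has no effect with the default solver on dense data, so every `max_iter` value scores the same and the first one in the grid is reported.  
> The run recorded here also printed `ridgeGS_err 20.65644821435496`, but that value was computed from `y_pred`, the predictions of the default `Ridge` above, so it is the default model's error and not the tuned model's. Scoring `ridgeGS_pred` gives the tuned model's test error.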
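
The grid search above runs on features that were standardized with statistics from the whole training set, so every cross-validation fold has already seen part of its validation data. One possible way around this is to put the scaler and the model into a pipeline, as in the sketch below. It is only an illustration: the synthetic data, the `alpha` grid and the choice of `neg_mean_squared_error` as the scorer are assumptions, not settings from these notes.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration)
X, y = make_regression(n_samples=300, n_features=10, noise=15.0, random_state=22)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=22)

# The scaler is refit inside every CV fold, so validation data never leaks into it
pipe = make_pipeline(StandardScaler(), Ridge())
param_grid = {"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}
search = GridSearchCV(pipe, param_grid, cv=3, scoring="neg_mean_squared_error")
search.fit(X_train, y_train)

# GridSearchCV refits the best pipeline on the full training set by default
best_model = search.best_estimator_
test_mse = mean_squared_error(y_test, best_model.predict(X_test))
print("best alpha:", search.best_params_["ridge__alpha"])
print("CV MSE:", -search.best_score_)
print("test MSE:", test_mse)
```

Because `refit=True` is the default, `search.best_estimator_` can be used directly and there is no need to rebuild the model from `best_params_` by hand.
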
