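## [Assignment Focus]
Use the Lasso and Ridge models in Sklearn to train on various datasets. Make sure you understand the **data type** of what is fed into the model for training, and the meaning of each model parameter.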
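For reference, the `alpha` parameter used throughout this notebook is the weight on the penalty term. As documented for scikit-learn's `Lasso` and `Ridge`, the objectives being minimized are, with $X$ the feature matrix, $y$ the target, $w$ the coefficient vector, and $n$ the number of samples (the intercept is not penalized):

$$
\text{Lasso:}\quad \min_{w}\ \frac{1}{2n}\,\lVert y - Xw \rVert_2^2 + \alpha\,\lVert w \rVert_1
$$

$$
\text{Ridge:}\quad \min_{w}\ \lVert y - Xw \rVert_2^2 + \alpha\,\lVert w \rVert_2^2
$$

The $L_1$ penalty can push coefficients exactly to zero, while the $L_2$ penalty only shrinks them toward zero. Because the two objectives scale the squared error differently, the same `alpha` value does not mean the same regularization strength in both models.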

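There are many kinds of machine learning models, but the training data usually follows a fixed format. Make sure you understand that format, so that when you apply a new model you can start training as quickly as possible!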
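As a minimal sketch of what that fixed format looks like, the snippet below inspects the wine dataset used later in this notebook. The variable names `X` and `y` are chosen for illustration only; the point is that scikit-learn estimators expect a 2-D feature array and a 1-D target array with one entry per sample.

```python
import numpy as np
from sklearn import datasets

wine = datasets.load_wine()
X, y = wine.data, wine.target

# Features: a 2-D NumPy array of shape (n_samples, n_features).
# For the wine dataset this is (178, 13).
print(type(X), X.dtype, X.shape)

# Target: a 1-D NumPy array with one label per sample, shape (178,).
print(type(y), y.dtype, y.shape)

# The distinct target values show whether the task is classification
# (a few discrete labels) or regression (continuous values).
print(np.unique(y))
```

Any estimator in `sklearn.linear_model` can then be trained with `fit(X, y)` on arrays of this shape.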

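## Practice Time
Try other datasets from sklearn datasets (boston, ...) to train your own linear regression model, then add appropriate regularization and observe how training behaves.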

In [8]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score, accuracy_score

In [9]:
# Load the wine dataset
wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=4)

# Build a linear regression model
reg = linear_model.LinearRegression()
reg.fit(x_train, y_train)
y_pred = reg.predict(x_test)

In [10]:
print(reg.coef_)

[-1.09099883e-01  1.67405249e-02 -2.18753671e-01  4.66803998e-02
  3.20692287e-04  1.24491691e-01 -3.26192950e-01 -1.91327414e-01
  3.72016066e-02  7.57429505e-02 -1.55979636e-01 -2.85946973e-01
 -7.51809245e-04]


In [11]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 0.066969


In [12]:
# The data description shows the label is categorical, so try logistic regression
wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=4)
logreg = linear_model.LogisticRegression()
logreg.fit(x_train, y_train)
y_pred = logreg.predict(x_test)



In [13]:
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy: %2f' % accuracy)

Accuracy: 0.972222
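The MSE of the linear regression and the accuracy of the logistic regression are not directly comparable. One possible way to put them on the same scale, sketched here as an illustration and not as a recommended classifier, is to round the linear regression predictions to the nearest class label and compute accuracy on those. The rounding and clipping step is an assumption added for this comparison.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=4)

reg = linear_model.LinearRegression()
reg.fit(x_train, y_train)

# Round the continuous predictions to the nearest integer and clip them
# to the valid label range 0..2 so they can be scored as class labels.
y_pred_class = np.clip(np.rint(reg.predict(x_test)), 0, 2).astype(int)

reg_accuracy = accuracy_score(y_test, y_pred_class)
print('Accuracy (rounded linear regression): %.6f' % reg_accuracy)
```

This treats the labels 0, 1, 2 as if they were ordered and evenly spaced, which is exactly the assumption a regression model makes and a classifier does not.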


In [14]:
wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=4)

# Build a Lasso regression model with alpha=0.1
lasso = linear_model.Lasso(alpha=0.1)
lasso.fit(x_train, y_train)
y_pred = lasso.predict(x_test)

In [15]:
lasso.coef_

array([-0.00000000e+00,  0.00000000e+00, -0.00000000e+00,  3.11003765e-02,
        1.66568969e-04, -0.00000000e+00, -2.76524348e-01,  0.00000000e+00,
       -0.00000000e+00,  9.33441102e-02, -0.00000000e+00, -1.99489077e-02,
       -1.23750027e-03])

In [18]:
# The result above shows that some coefficients have become 0: Lasso has performed feature selection
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 0.101752


In [20]:
wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=4)

# Build a Lasso regression model with alpha=0.8
lasso = linear_model.Lasso(alpha=0.8)
lasso.fit(x_train, y_train)
y_pred = lasso.predict(x_test)

In [21]:
lasso.coef_

array([ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
       -0.        , -0.        ,  0.        , -0.        ,  0.00040687,
       -0.        , -0.        , -0.0016002 ])

In [22]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 0.423403


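Comparing alpha=0.1 with alpha=0.8: the stronger the regularization, the fewer features Lasso keeps (6 nonzero coefficients versus 2). With so few features left, the test MSE rises from 0.101752 to 0.423403.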
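The same comparison can be sketched as a single loop that records the number of nonzero coefficients and the test MSE for each `alpha`. The intermediate value 0.4 and the `results` dictionary are additions for illustration and are not part of the original cells.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=4)

results = {}
for alpha in [0.1, 0.4, 0.8]:  # 0.4 is an extra value added for illustration
    lasso = linear_model.Lasso(alpha=alpha)
    lasso.fit(x_train, y_train)

    # Count the features Lasso kept (coefficients that are not exactly 0).
    n_nonzero = int(np.count_nonzero(lasso.coef_))
    mse = mean_squared_error(y_test, lasso.predict(x_test))

    results[alpha] = (n_nonzero, mse)
    print('alpha=%.1f  nonzero coefficients=%d  MSE=%.6f' % (alpha, n_nonzero, mse))
```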

In [23]:
wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=4)

# Build a Ridge regression model with alpha=0.1
ridge = linear_model.Ridge(alpha=0.1)
ridge.fit(x_train, y_train)
y_pred = ridge.predict(x_test)

In [24]:
ridge.coef_

array([-0.10868129,  0.0167836 , -0.21758328,  0.04652221,  0.00034174,
        0.12146469, -0.32432992, -0.17652621,  0.03688698,  0.07593367,
       -0.15402058, -0.28450558, -0.00075365])

In [25]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 0.067096


In [26]:
wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=4)

# Build a Ridge regression model with alpha=0.8
ridge = linear_model.Ridge(alpha=0.8)
ridge.fit(x_train, y_train)
y_pred = ridge.predict(x_test)

In [27]:
ridge.coef_

array([-0.10651972,  0.01744693, -0.20563903,  0.04546906,  0.00040835,
        0.10345894, -0.314578  , -0.11444492,  0.03528817,  0.07732728,
       -0.13867993, -0.27637491, -0.0007665 ])

In [28]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 0.067716
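Unlike Lasso, Ridge leaves every coefficient nonzero in the outputs above and only shrinks them. A minimal sketch of that behaviour, using the overall size (L2 norm) of the coefficient vector as the summary, is shown below. The `ridge_summary` dictionary is a name introduced for this illustration.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split

wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=4)

ridge_summary = {}
for alpha in [0.1, 0.8]:
    ridge = linear_model.Ridge(alpha=alpha)
    ridge.fit(x_train, y_train)

    # L2 norm of the coefficient vector: shrinks as alpha grows.
    coef_norm = float(np.linalg.norm(ridge.coef_))
    # Number of coefficients that are exactly zero.
    n_zero = int(np.sum(ridge.coef_ == 0))

    ridge_summary[alpha] = (coef_norm, n_zero)
    print('alpha=%.1f  coefficient norm=%.6f  zero coefficients=%d'
          % (alpha, coef_norm, n_zero))
```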


In [31]:
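# Now switch to the Boston dataset
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell and the ones below need an older scikit-learn version
boston = datasets.load_boston()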

# First, build a linear regression model
x_train, x_test, y_train, y_test = train_test_split(boston.data, boston.target, test_size=0.2, random_state=4)
reg = linear_model.LinearRegression()
reg.fit(x_train, y_train)
y_pred = reg.predict(x_test)

In [32]:
reg.coef_

array([-1.15966452e-01,  4.71249231e-02,  8.25980146e-03,  3.23404531e+00,
       -1.66865890e+01,  3.88410651e+00, -1.08974442e-02, -1.54129540e+00,
        2.93208309e-01, -1.34059383e-02, -9.06296429e-01,  8.80823439e-03,
       -4.57723846e-01])

In [33]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 25.419587


In [39]:
# Next, build a Lasso linear regression model with alpha=0.1
boston = datasets.load_boston()
x_train, x_test, y_train, y_test = train_test_split(boston.data, boston.target, test_size=0.2, random_state=4)
lasso = linear_model.Lasso(alpha=0.1)
lasso.fit(x_train, y_train)
y_pred = lasso.predict(x_test)

In [40]:
lasso.coef_

array([-0.10618872,  0.04886351, -0.04536655,  1.14953069, -0.        ,
        3.82353877, -0.02089779, -1.23590613,  0.26008876, -0.01517094,
       -0.74673362,  0.00963864, -0.49877104])

In [41]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 26.452889


In [42]:
# Build a Lasso linear regression model with alpha=0.8
boston = datasets.load_boston()
x_train, x_test, y_train, y_test = train_test_split(boston.data, boston.target, test_size=0.2, random_state=4)
lasso = linear_model.Lasso(alpha=0.8)
lasso.fit(x_train, y_train)
y_pred = lasso.predict(x_test)

In [43]:
lasso.coef_

array([-0.07440077,  0.04677454, -0.        ,  0.        , -0.        ,
        1.78447355,  0.0038631 , -0.84210996,  0.24375563, -0.01579438,
       -0.70895873,  0.00814224, -0.65068371])

In [44]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 27.977013


In [45]:
# Build a Ridge linear regression model with alpha=0.1
boston = datasets.load_boston()
x_train, x_test, y_train, y_test = train_test_split(boston.data, boston.target, test_size=0.2, random_state=4)
ridge = linear_model.Ridge(alpha=0.1)
ridge.fit(x_train, y_train)
y_pred = ridge.predict(x_test)

In [46]:
ridge.coef_

array([-1.15381303e-01,  4.72528249e-02,  2.87371589e-03,  3.19642306e+00,
       -1.54713824e+01,  3.89388927e+00, -1.19943742e-02, -1.52347878e+00,
        2.90133016e-01, -1.34816989e-02, -8.93679905e-01,  8.86599187e-03,
       -4.58983115e-01])

In [47]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 25.455212


In [48]:
# Build a Ridge linear regression model with alpha=0.8
boston = datasets.load_boston()
x_train, x_test, y_train, y_test = train_test_split(boston.data, boston.target, test_size=0.2, random_state=4)
ridge = linear_model.Ridge(alpha=0.8)
ridge.fit(x_train, y_train)
y_pred = ridge.predict(x_test)

In [49]:
ridge.coef_

array([-1.12912392e-01,  4.78428668e-02, -2.01213474e-02,  3.00386014e+00,
       -1.02427847e+01,  3.92747463e+00, -1.66133713e-02, -1.44675151e+00,
        2.77329234e-01, -1.38269648e-02, -8.40149036e-01,  9.11382104e-03,
       -4.65169516e-01])

In [50]:
print('MSE: %2f' % mean_squared_error(y_test, y_pred))

MSE: 25.690490
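`load_boston` is no longer available in scikit-learn 1.2 and later. One way to repeat the same exercise on a current installation is to swap in another bundled regression dataset. The sketch below uses `load_diabetes` as a stand-in, which is a substitution and not the dataset used above, so the MSE values will not match the Boston results.

```python
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Stand-in regression dataset (442 samples, 10 features, continuous target).
diabetes = datasets.load_diabetes()
x_train, x_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, test_size=0.2, random_state=4)

# Same three models as above, with alpha=0.1 for the regularized ones.
models = {
    'linear': linear_model.LinearRegression(),
    'lasso': linear_model.Lasso(alpha=0.1),
    'ridge': linear_model.Ridge(alpha=0.1),
}

mse_by_model = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    mse_by_model[name] = mean_squared_error(y_test, model.predict(x_test))
    print('%s MSE: %.6f' % (name, mse_by_model[name]))
```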
