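## [Assignment Focus]
Use the Lasso and Ridge models in Sklearn to train on various datasets. Be sure to understand the **data type** of what is fed into the model for training, and the meaning of each model parameter.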

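There are many kinds of machine learning models, but the training data mostly follows a fixed format. Make sure you understand that format, so that you can start training as quickly as possible when you apply a new model.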
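As a minimal sketch of the format that sklearn estimators expect, the snippet below inspects the wine dataset used later in this notebook. The variable names `X` and `y` are chosen here for illustration only.

```python
import numpy as np
from sklearn import datasets

# load_wine returns the feature matrix X and the label vector y as NumPy arrays.
X, y = datasets.load_wine(return_X_y=True)

print(type(X), X.dtype, X.shape)  # 2-D array laid out as (n_samples, n_features)
print(type(y), y.dtype, y.shape)  # 1-D array laid out as (n_samples,)

# Every row of X must line up with exactly one entry of y.
assert X.shape[0] == y.shape[0]
```

Most sklearn models accept this same pair in `fit(X, y)`, which is why checking the shapes first makes it easier to swap in a new model.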

## Practice Time
Try using other sklearn datasets (boston, ...) to train your own linear regression model, and add appropriate regularization to observe how training behaves.

### 1. Boston House Price Predict

In [1]:
from sklearn import datasets, metrics
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.model_selection import train_test_split

#### 1. linear_model

In [2]:
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release.
boston = datasets.load_boston()
x_train, x_test, Y_train, Y_test = train_test_split(boston.data, boston.target, test_size=0.2, random_state=0)
linear_model = LinearRegression()
linear_model.fit(x_train, Y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [3]:
Y_pred = linear_model.predict(x_test)

In [4]:
MSE = metrics.mean_squared_error(Y_test, Y_pred)
print(f'Linear Regression MSE : {MSE}')

Linear Regression MSE : 33.44897999767654


In [5]:
linear_model.coef_

array([-1.19443447e-01,  4.47799511e-02,  5.48526168e-03,  2.34080361e+00,
       -1.61236043e+01,  3.70870901e+00, -3.12108178e-03, -1.38639737e+00,
        2.44178327e-01, -1.09896366e-02, -1.04592119e+00,  8.11010693e-03,
       -4.92792725e-01])

#### 2. Lasso_model

In [6]:
Lasso_model = Lasso(alpha=1)
Lasso_model.fit(x_train, Y_train)
Y_pred = Lasso_model.predict(x_test)
MSE = metrics.mean_squared_error(Y_test, Y_pred)
print(f'Linear Regression MSE with Lasso : {MSE}')

Linear Regression MSE with Lasso : 41.700096799949


In [7]:
Lasso_model.coef_

array([-0.05889028,  0.05317657, -0.        ,  0.        , -0.        ,
        0.67954962,  0.01684077, -0.6487664 ,  0.198738  , -0.01399421,
       -0.86421958,  0.00660309, -0.73120957])

#### 3. Ridge_model

In [8]:
Ridge_model = Ridge(alpha=1)
Ridge_model.fit(x_train, Y_train)
Y_pred = Ridge_model.predict(x_test)
MSE = metrics.mean_squared_error(Y_test, Y_pred)
print(f'Linear Regression MSE with Ridge : {MSE}')

Linear Regression MSE with Ridge : 34.23160611061538


In [9]:
Ridge_model.coef_

array([-1.16807614e-01,  4.60034842e-02, -2.37620690e-02,  2.27814972e+00,
       -8.55779612e+00,  3.75513528e+00, -1.04143035e-02, -1.28009479e+00,
        2.22037885e-01, -1.15255734e-02, -9.69288272e-01,  8.53481709e-03,
       -4.98849035e-01])

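- Conclusions:
    1. We use MSE to measure the error between the actual values and each model's predictions. The unregularized model has the lowest test error (33.45), while the L1 (41.70) and L2 (34.23) models both have higher error.
    2. `coef_` shows how the coefficients (weights) change. As mentioned in Day039, Lasso regularization drives some coefficients to exactly 0 (three of them here), while Ridge regularization shrinks coefficients toward 0, most strongly for highly collinear features, without setting them to exactly 0.
    3. With GridSearchCV we can try to find the best value of alpha.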
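The alpha search mentioned in the last point could look roughly like the sketch below. It rests on several assumptions that are not in the notebook: `load_diabetes` stands in for the Boston data, the list `alphas` is a made-up grid, and 5-fold cross-validation with negated MSE is just one reasonable scoring setup.

```python
from sklearn import datasets, metrics
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: load_diabetes stands in for the Boston data used above.
X, y = datasets.load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical grid of candidate alpha values.
alphas = [0.001, 0.01, 0.1, 1, 10]
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": alphas},
    scoring="neg_mean_squared_error",  # GridSearchCV maximizes, so MSE is negated
    cv=5,
)
search.fit(x_train, y_train)

# After fitting, the search object predicts with the best model refit on all training data.
best_alpha = search.best_params_["alpha"]
test_mse = metrics.mean_squared_error(y_test, search.predict(x_test))
print(f"best alpha: {best_alpha}, test MSE: {test_mse}")
```

The same pattern should work for `Lasso` by swapping the estimator. The best alpha depends on the dataset and the grid, so the value found here does not transfer to the Boston results above.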

### 2. Wine Class Predict

In [10]:
from sklearn.linear_model import LogisticRegression
import warnings
warnings.filterwarnings("ignore")

In [11]:
# Load data & split
wine = datasets.load_wine()
x_train, x_test, Y_train, Y_test = train_test_split(wine.data, wine.target, test_size=0.2, random_state=0)

#### 1. LogisticRegression with no regularization
***
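- `LogisticRegression()` has a `penalty` parameter that selects which regularization is applied; the default is 'l2'.
- Here we first test the accuracy without regularization, so `penalty` is set to 'none' (recent scikit-learn releases expect `penalty=None` instead of the string).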
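Besides `penalty`, the strength of the regularization is controlled by `C`, the inverse of the regularization strength, so a smaller `C` means a stronger penalty. The sketch below illustrates this on the wine data. The standardization step and the three `C` values are assumptions made for the illustration and are not part of the notebook.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_wine(return_X_y=True)
# Assumption: standardize the features so the solver converges quickly;
# the notebook itself trains on the raw features.
X_scaled = StandardScaler().fit_transform(X)

norms = {}
for C in [0.01, 1.0, 100.0]:  # hypothetical values, from strong to weak regularization
    model = LogisticRegression(C=C, max_iter=5000)  # default regularization is L2
    model.fit(X_scaled, y)
    norms[C] = float(np.linalg.norm(model.coef_))
    print(f"C={C}: coefficient norm = {norms[C]:.3f}")
```

The overall size of the coefficients should grow as `C` increases, which is the same shrinkage effect that `alpha` has in `Ridge`, only with the scale inverted.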

In [12]:
logist_model = LogisticRegression(penalty='none', solver='lbfgs')
logist_model.fit(x_train, Y_train)
Y_pred = logist_model.predict(x_test)
acc = metrics.accuracy_score(Y_test, Y_pred)
print(f'LogisticRegression Accuracy : {acc}')

LogisticRegression Accuracy : 0.9444444444444444


In [13]:
logist_model.coef_

array([[-2.12719592e+00,  2.39042453e+00,  2.15072872e+00,
        -8.74081406e-01, -5.52566763e-02,  3.17321773e-01,
         3.73948655e+00,  1.43578499e-01, -4.70857387e-01,
         3.69029239e-01, -4.14700281e-01,  2.29891069e+00,
         2.85211037e-02],
       [ 3.20599406e+00, -3.05498475e+00, -2.15690535e+00,
         5.98335701e-01, -8.98709600e-02, -3.73554987e-01,
         4.04315307e-01,  3.20982800e-01,  2.57247200e+00,
        -5.15112790e+00,  1.36376800e+00, -8.32887846e-01,
        -2.22883578e-02],
       [-1.70688972e+01,  1.72329661e+01, -6.22795977e-01,
         7.92439146e-01,  1.40334135e+00, -1.75546061e+01,
        -3.80548547e+01,  7.12283723e-01, -1.59545908e+01,
         3.82874727e+01, -9.31314985e+00, -3.07268026e+01,
         1.12219424e-02]])

#### 2. LogisticRegression with L1 regularization

In [14]:
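# The L1 penalty needs a solver that supports it, such as 'liblinear' or 'saga'.
L1_model = LogisticRegression(penalty='l1', solver='liblinear')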
L1_model.fit(x_train, Y_train)
#Y_pred = L1_model.predict(x_test)
#acc = metrics.accuracy_score(Y_test, Y_pred)
score = L1_model.score(x_test, Y_test)
#print(acc)
print(score)

0.9166666666666666


In [15]:
L1_model.coef_

array([[-1.39142630e-01,  4.86052294e-01,  0.00000000e+00,
        -5.77148947e-01, -5.70397488e-02,  0.00000000e+00,
         2.05670899e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         1.58573496e-02],
       [ 1.12157620e+00, -1.13375215e+00, -4.53008836e-01,
         2.31851863e-01, -9.08497440e-03,  0.00000000e+00,
         3.37514911e-03,  0.00000000e+00,  1.74087212e+00,
        -2.22211428e+00,  0.00000000e+00,  0.00000000e+00,
        -1.17398421e-02],
       [ 0.00000000e+00,  1.21059728e-01,  0.00000000e+00,
         6.48974001e-03,  1.51530605e-02,  0.00000000e+00,
        -3.17211986e+00,  0.00000000e+00,  0.00000000e+00,
         1.41371031e+00,  0.00000000e+00, -1.60028331e+00,
        -2.04284910e-03]])

#### 3. LogisticRegression with L2 regularization
***
- Because `LogisticRegression` defaults to penalty='l2', no change is needed.

In [16]:
L2_model = LogisticRegression()
L2_model.fit(x_train, Y_train)
Y_pred = L2_model.predict(x_test)
acc = metrics.accuracy_score(Y_test, Y_pred)
#score = L2_model.score(x_test, Y_test)
print(acc)
#print(score)

0.9444444444444444


In [17]:
L2_model.coef_

array([[-3.44368761e-01,  6.08577842e-01,  7.65204263e-01,
        -6.31559488e-01, -4.28209531e-02,  1.59693228e-01,
         1.34856745e+00,  8.34724547e-02, -2.43463699e-01,
        -5.45195152e-02, -1.25504087e-01,  8.04776031e-01,
         1.58161999e-02],
       [ 8.55714873e-01, -1.01077691e+00, -8.81924804e-01,
         2.49844566e-01,  1.32374297e-03,  1.42640370e-01,
         3.33911682e-01,  1.17291960e-01,  1.02891601e+00,
        -1.87763301e+00,  5.67854908e-01,  9.64181237e-02,
        -1.13359307e-02],
       [-2.86370705e-01,  5.21029698e-01,  6.84632488e-02,
         7.24896195e-02,  3.06567693e-02, -4.53372738e-01,
        -1.71648436e+00, -3.32455970e-03, -8.78924626e-01,
         1.23488978e+00, -4.22922225e-01, -1.28403533e+00,
        -1.65477024e-03]])

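- Conclusions:
    1. In terms of accuracy, the model without regularization and the model with L2 regularization both reach 0.944, while the model with L1 regularization reaches 0.917.
    2. From `coef_` we can check whether the coefficients (weights) really differ between the settings:
        1. Lasso (L1): some coefficients are exactly 0.
        2. Ridge (L2): some coefficients are close to 0 (e.g. on the order of e-02 or e-03).
        3. The terms whose Lasso coefficients are 0 are similar to the terms whose Ridge coefficients are close to 0.
        4. Although the Ridge accuracy is the same as the accuracy without regularization, the coefficients (weights) are not the same.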
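Rather than reading the coefficient arrays by eye, the sparsity difference can be counted directly. The following is a sketch under assumptions that differ from the cells above: the features are standardized, the `saga` solver is used for both models, and `C=0.1` is an arbitrary choice, so the counts will not match the arrays printed earlier.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=0
)

# Assumption: standardize the features so that saga converges.
x_train_s = StandardScaler().fit_transform(x_train)

# Same solver and C for both models; only the penalty differs.
l1 = LogisticRegression(penalty="l1", solver="saga", C=0.1,
                        max_iter=10000, random_state=0).fit(x_train_s, y_train)
l2 = LogisticRegression(solver="saga", C=0.1,
                        max_iter=10000, random_state=0).fit(x_train_s, y_train)

# Count coefficients that are exactly zero in each model.
n_zero_l1 = int(np.sum(l1.coef_ == 0))
n_zero_l2 = int(np.sum(l2.coef_ == 0))
print(f"exact zeros with L1: {n_zero_l1}, with L2: {n_zero_l2}")
```

Counting exact zeros separates the two behaviours described above: L1 removes terms entirely, while L2 only makes them small.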