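## [Assignment Focus]
Use the Lasso and Ridge models in sklearn to train on a variety of datasets. Make sure you understand the **data type** of what is passed to the model for training, as well as the meaning of each model parameter.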

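There are many kinds of machine learning models, but the training data usually follows a fixed format. Make sure you understand that format, so that you can start training as quickly as possible whenever you apply a new model.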
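
As a minimal sketch of the format mentioned above, the cell below inspects the arrays that `load_wine` returns. The variable names `X` and `y` are illustrative choices, not part of the original notebook; the point is only that estimators take a 2-D feature array and a 1-D target array.

```python
from sklearn import datasets

# Load the wine dataset bundled with scikit-learn.
wine = datasets.load_wine()
X, y = wine.data, wine.target  # X and y are illustrative names

# Features: a 2-D NumPy array with one row per sample and one column per feature.
print(type(X), X.dtype, X.shape)
# Targets: a 1-D NumPy array with one value per sample.
print(type(y), y.dtype, y.shape)
# Column names are stored separately from the numeric array.
print(wine.feature_names[:3])
```

Any dataset brought into this shape (samples as rows, features as columns, one target per row) can be passed to `fit` in the same way.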

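## Practice Time
Try using other datasets from sklearn datasets (boston, ...) to train your own linear regression model, and add appropriate regularization to observe how the training behaves.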
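
One possible way to approach this exercise is sketched below. It uses `load_diabetes` as the example dataset, which is an assumption on my part: recent scikit-learn releases no longer ship `load_boston`, so a different regression dataset is substituted. The `alpha` values and `random_state` are arbitrary illustrative choices.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumption: the diabetes dataset stands in for "other datasets".
X, y = datasets.load_diabetes(return_X_y=True)
train_X, test_X, train_Y, test_Y = train_test_split(
    X, y, test_size=0.2, random_state=42  # fixed seed so reruns are comparable
)

# Same data, three models: no penalty, L2 penalty, L1 penalty.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
}

results = {}
for name, model in models.items():
    model.fit(train_X, train_Y)
    mse = mean_squared_error(test_Y, model.predict(test_X))
    results[name] = (mse, model.coef_)
    # The coefficient norm shows how strongly the penalty shrinks the weights.
    print(f"{name}: MSE={mse:.2f}, coef L2 norm={np.linalg.norm(model.coef_):.2f}")
```

Comparing the test MSE and the coefficient norms side by side is one simple way to observe what the regularization does.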

In [19]:
from sklearn import datasets
from sklearn.linear_model import LinearRegression, LogisticRegression, Ridge, Lasso
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

In [7]:
wine = datasets.load_wine()
print(wine["DESCR"])
print(wine)

.. _wine_dataset:

Wine recognition dataset
------------------------

**Data Set Characteristics:**

    :Number of Instances: 178 (50 in each of three classes)
    :Number of Attributes: 13 numeric, predictive attributes and the class
    :Attribute Information:
 		- Alcohol
 		- Malic acid
 		- Ash
		- Alcalinity of ash  
 		- Magnesium
		- Total phenols
 		- Flavanoids
 		- Nonflavanoid phenols
 		- Proanthocyanins
		- Color intensity
 		- Hue
 		- OD280/OD315 of diluted wines
 		- Proline

    - class:
            - class_0
            - class_1
            - class_2
		
    :Summary Statistics:
    
                                   Min   Max   Mean     SD
    Alcohol:                      11.0  14.8    13.0   0.8
    Malic Acid:                   0.74  5.80    2.34  1.12
    Ash:                          1.36  3.23    2.36  0.27
    Alcalinity of Ash:            10.6  30.0    19.5   3.3
    Magnesium:                    70.0 162.0    99.7  14.3
    Total Phenols:                0

In [30]:
train_X, test_X, train_Y, test_Y = train_test_split(wine.data, wine.target, test_size=0.2)

In [31]:
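lgreg = LogisticRegression()
lgreg.fit(train_X, train_Y)
print(lgreg.coef_)  # one row of 13 coefficients per class
y_pred = lgreg.predict(test_X)
# sklearn metrics expect (y_true, y_pred); r2_score is not symmetric in its arguments
print(f"MSE={mean_squared_error(test_Y, y_pred)}")
print("R-square=%.2f" % r2_score(test_Y, y_pred))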

[[-5.96819058e-01  5.94649132e-01  1.07887629e+00 -5.64554121e-01
  -2.11910288e-02  1.46764077e-02  1.14532706e+00  8.60601851e-02
  -5.23119951e-01  4.45043314e-02 -9.55097199e-02  8.08739612e-01
   1.58069169e-02]
 [ 7.29594890e-01 -1.07082460e+00 -8.53596854e-01  2.44216050e-01
   1.28623279e-02  1.68373396e-01  5.88358770e-01  2.19227784e-01
   7.30758128e-01 -1.70601453e+00  8.20686923e-01  2.59156837e-01
  -1.27496729e-02]
 [-2.89873320e-01  6.96201752e-01  3.79938777e-02  1.80011922e-01
   6.04702809e-03 -7.01220411e-01 -1.68151949e+00 -4.91318262e-02
  -6.15932755e-01  1.05340204e+00 -4.69173062e-01 -1.28224868e+00
  -1.52868266e-04]]
MSE=0.027777777777777776
R-square=0.96
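
Since `LogisticRegression` is a classifier, MSE and R-square treat its predicted class labels as if they were numbers. A metric such as accuracy may be easier to interpret; the sketch below shows one way to compute it. The fixed `random_state` and the raised `max_iter` are my own assumptions, not settings from the original notebook.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

wine = datasets.load_wine()
train_X, test_X, train_Y, test_Y = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=0  # seed is an assumption
)

# Assumption: max_iter is raised so the solver has room to converge on unscaled features.
clf = LogisticRegression(max_iter=10000)
clf.fit(train_X, train_Y)

# Accuracy is the fraction of test samples whose predicted class matches the true class.
acc = accuracy_score(test_Y, clf.predict(test_X))
print(f"Accuracy={acc:.3f}")
```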




In [33]:
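# alpha sets the strength of the L2 penalty; a larger alpha shrinks the coefficients more
ridge = Ridge(alpha=1)
# The class labels (0, 1, 2) are treated here as a numeric regression target
ridge.fit(train_X, train_Y)
y_pred2 = ridge.predict(test_X)
print(ridge.coef_)
print(f"MSE={mean_squared_error(test_Y, y_pred2)}")
print("R-square=%.2f" % r2_score(test_Y, y_pred2))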

[-0.10323653  0.03361135 -0.13982234  0.03944378 -0.00035861  0.11253275
 -0.30879478 -0.1113211   0.02212428  0.0654157  -0.18715088 -0.29450906
 -0.00070602]
MSE=0.06165316490875786
R-square=0.88


In [35]:
# alpha sets the strength of the L1 penalty, which can drive coefficients to exactly zero
lasso = Lasso(alpha=0.1)
lasso.fit(train_X, train_Y)
y_pred3 = lasso.predict(test_X)
print(lasso.coef_)
print(f"MSE={mean_squared_error(test_Y, y_pred3)}")
print("R-square=%.2f" % r2_score(test_Y, y_pred3))

[-0.          0.         -0.          0.02878612  0.         -0.
 -0.27335835  0.         -0.          0.08738423 -0.         -0.03614732
 -0.0011859 ]
MSE=0.09037454555869318
R-square=0.86
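
The Lasso output above sets several coefficients to exactly zero. To see how this depends on `alpha`, the following sketch counts the non-zero coefficients for a few values. The list of alphas, the use of the full dataset instead of a train/test split, and the raised `max_iter` are all illustrative assumptions.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Lasso

wine = datasets.load_wine()
X, y = wine.data, wine.target

nonzero = {}
for alpha in (0.01, 0.1, 1.0, 10.0):  # illustrative values only
    # Assumption: max_iter is raised to help convergence on unscaled features.
    lasso = Lasso(alpha=alpha, max_iter=100000)
    lasso.fit(X, y)
    # Count the coefficients that the L1 penalty did not shrink to zero.
    nonzero[alpha] = int(np.count_nonzero(lasso.coef_))
    print(f"alpha={alpha}: {nonzero[alpha]} non-zero coefficients out of {X.shape[1]}")
```

Because the features are on very different scales, the penalty affects them unevenly; standardizing the features first would be a reasonable variation to try.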
