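## [Assignment Focus]
Use the Lasso and Ridge models in Sklearn to train on a variety of datasets. Make sure you understand the **data type** of what is fed into the model for training, as well as the meaning of each model parameter.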

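There are many kinds of machine learning models, but the training data they take mostly follows a fixed format. Make sure you understand that format, so that you can start training as quickly as possible whenever you apply a new model!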
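As a minimal sketch of that fixed format: scikit-learn estimators expect a 2-D feature array of shape `(n_samples, n_features)` and, for regression, a 1-D target array of shape `(n_samples,)`. The example uses `load_diabetes` purely as a stand-in dataset (an assumption, not a dataset used in this notebook), and the variable names are illustrative.

```python
from sklearn import datasets, linear_model

# Stand-in dataset bundled with scikit-learn (assumption: any regression dataset works here).
diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

print(type(X), X.dtype, X.shape)  # features: 2-D array (n_samples, n_features)
print(type(y), y.dtype, y.shape)  # target: 1-D array (n_samples,)

# Even a single feature must stay 2-D: reshape (n_samples,) to (n_samples, 1).
X_one = X[:, 2].reshape(-1, 1)
model = linear_model.Ridge(alpha=1.0).fit(X_one, y)
print(model.coef_.shape)  # one coefficient per feature column
```

Checking `shape` and `dtype` before calling `fit` is usually the fastest way to catch a mismatch when switching to a new model.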

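## Practice Time
Try other datasets from sklearn datasets (boston, ...) to train your own linear regression model, and add appropriate regularization to observe how training behaves.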

In [19]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

In [20]:
# Note: load_boston was removed in scikit-learn 1.2; this cell requires an older version.
bos = datasets.load_boston()

In [21]:
print(type(bos))
print(type(bos.target))
print(bos.data.shape)

<class 'sklearn.utils.Bunch'>
<class 'numpy.ndarray'>
(506, 13)


In [30]:
# Hold out 50% of the samples as the test set.
x_train, x_test, y_train, y_test = train_test_split(bos.data, bos.target, test_size=0.5, random_state=4)
print(x_train.shape, y_train.shape)

(253, 13) (253,)


In [34]:
lasso = linear_model.Lasso(alpha=0.2)

In [35]:
lasso.fit(x_train, y_train)
y_pred = lasso.predict(x_test)
y_pred

array([11.27298932, 26.40475464, 17.34176435, 13.97717024, 35.58475963,
       24.3288889 , 32.27535942, 19.11058768, 17.84177763, 22.09988206,
       29.15294476, 28.16313485, 19.3139107 , 27.77147548, 21.80609439,
       15.27803522, 21.41695149, 11.52971471, 11.40644606, 14.8566301 ,
        7.98412732, 19.77478498, 20.2454014 , 22.23414906, 17.17957211,
       20.28162988, 13.94011206, 14.69261675, 19.6003897 , 17.08629898,
       14.01561144, 24.45507163, 33.86649335, 21.97055998, 18.11362063,
       19.85393829, 29.97756259, 34.32544891, 24.92413571, 23.90049736,
       35.54970074, 30.64733641, 20.35052923, 31.23269382, 29.01681955,
       25.1463788 , 39.28463948, 17.55067061, 20.98526033, 23.38093446,
       33.58787809, 24.45778069, 18.54225885, 26.55792007, 13.63340016,
       22.80176109, 24.31358496, 32.99597955, 17.92634493, 32.69207431,
       16.43632294, 20.75580708, 30.5727099 , 14.20517233, 38.06578376,
       28.6543674 , 28.8308248 ,  9.16020244, 18.95343094, 21.75

In [36]:
lasso.coef_

array([-0.07229394,  0.05667466, -0.05718995,  0.        , -0.        ,
        3.22672062, -0.0084271 , -1.11015121,  0.29808871, -0.01663016,
       -0.58291083,  0.01153249, -0.52913977])
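Two of the coefficients above are exactly zero, which is the typical effect of Lasso's L1 penalty: a larger `alpha` pushes more coefficients to zero. The following is a rough sketch of that behavior on synthetic data from `make_regression`; the dataset, the `alpha` values, and the variable names are all assumptions chosen for illustration, not taken from this notebook.

```python
import numpy as np
from sklearn import linear_model
from sklearn.datasets import make_regression

# Synthetic data (assumption): 13 features, only 5 of which actually drive the target.
X, y = make_regression(n_samples=200, n_features=13, n_informative=5,
                       noise=10.0, random_state=0)

alphas = [0.01, 1.0, 10.0, 1e4]  # illustrative values, not tuned
nonzero_counts = []
for alpha in alphas:
    lasso = linear_model.Lasso(alpha=alpha, max_iter=10000).fit(X, y)
    # Count the coefficients that Lasso did not shrink exactly to zero.
    nonzero_counts.append(int(np.count_nonzero(lasso.coef_)))
    print(f"alpha={alpha:g}: {nonzero_counts[-1]} non-zero coefficients")
```

With a sufficiently large `alpha`, every coefficient is driven to zero and the model predicts only the intercept.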

In [37]:
print("Mean squared error: %.2f"
      % mean_squared_error(y_test, y_pred))

Mean squared error: 29.38


In [48]:
bos = datasets.load_boston()
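# Re-split with a 20% test set (the Lasso run above used 50%), so the two MSE values are measured on different test sets.
x_train, x_test, y_train, y_test = train_test_split(bos.data, bos.target, test_size=0.2, random_state=4)
print(x_train.shape, y_train.shape)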

(404, 13) (404,)


In [49]:
ridge = linear_model.Ridge(alpha=0.5)

In [50]:
ridge.fit(x_train, y_train)

Ridge(alpha=0.5, copy_X=True, fit_intercept=True, max_iter=None,
      normalize=False, random_state=None, solver='auto', tol=0.001)
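The printed representation above lists the parameters of `Ridge`; the exact set can differ between scikit-learn versions. One way to inspect them programmatically is `get_params()`, sketched here. `alpha` is the regularization strength and `fit_intercept` controls whether an intercept term is fitted.

```python
from sklearn import linear_model

ridge = linear_model.Ridge(alpha=0.5)
params = ridge.get_params()  # dict of constructor arguments and their current values
for name, value in sorted(params.items()):
    print(f"{name} = {value!r}")
```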

In [51]:
y_pred = ridge.predict(x_test)
y_pred

array([11.83957277, 26.87329086, 17.4128713 , 17.62760163, 36.7740556 ,
       25.29983385, 31.31918785, 19.40518212, 19.22459034, 23.70097477,
       28.67347769, 28.38181644, 19.05666085, 31.97167805, 21.64133647,
       15.41673457, 21.18188535, 11.67136262, 10.92236629, 13.747532  ,
        5.65840493, 18.20066137, 20.53521654, 22.37566603, 16.48444851,
       20.22870326, 17.33130935, 14.24036495, 20.76732026, 17.3218188 ,
       14.55110885, 23.6442802 , 34.56512039, 22.10164115, 16.92290607,
       20.10096203, 30.68517554, 35.75671748, 23.55438254, 24.52238616,
       36.89780762, 32.07148693, 19.31143935, 32.1475751 , 33.03291903,
       25.31032381, 40.57226863, 17.92299265, 19.65753166, 23.73058826,
       33.39417731, 25.8612625 , 18.14605661, 27.9217316 , 13.37020437,
       23.20026175, 24.43170803, 33.44129389, 17.00298751, 36.31915439,
       15.7541414 , 19.05010756, 31.93929328, 15.26550109, 39.20230142,
       27.60370805, 31.63593468,  9.98107329, 18.88607584, 21.65

In [52]:
print("Mean squared error: %.2f"
      % mean_squared_error(y_test, y_pred))

Mean squared error: 25.60
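The Lasso and Ridge errors reported above come from different splits (`test_size=0.5` versus `test_size=0.2`), so they are not directly comparable. A minimal sketch of a like-for-like comparison fits each model on the same split. It assumes `load_diabetes` as a stand-in dataset and reuses the `alpha` values from this notebook; the `models` and `results` names are illustrative.

```python
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Stand-in dataset (assumption); swap in any (n_samples, n_features) array and target.
diabetes = datasets.load_diabetes()
x_train, x_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, test_size=0.2, random_state=4)

models = {
    "LinearRegression": linear_model.LinearRegression(),
    "Lasso(alpha=0.2)": linear_model.Lasso(alpha=0.2),
    "Ridge(alpha=0.5)": linear_model.Ridge(alpha=0.5),
}

results = {}
for name, model in models.items():
    model.fit(x_train, y_train)  # every model sees the same training data
    results[name] = mean_squared_error(y_test, model.predict(x_test))
    print(f"{name}: test MSE = {results[name]:.2f}")
```

Keeping the split fixed means any difference in MSE reflects the model and its regularization rather than the choice of test set.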
