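## [Assignment focus]
Use the Lasso and Ridge models from sklearn to train on a variety of datasets. Make sure you understand the **data type** of what you pass to the model for training, and the meaning of each model parameter.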

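There are many kinds of machine learning models, but training data mostly follows a fixed format. Make sure you understand that format, so that you can start training quickly whenever you apply a new model.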
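As a quick way to check the data format before training, the sketch below prints the type, dtype, and shape of a feature matrix and a target vector. It uses `load_diabetes` purely as a stand-in dataset (an assumption, not a dataset this notebook trains on); the point is only that sklearn estimators expect `X` as a 2-D array of shape `(n_samples, n_features)` and `y` as an array with one entry per sample.

```python
from sklearn import datasets

# Stand-in dataset (assumption): any sklearn regression dataset would do here.
data = datasets.load_diabetes()
X, y = data.data, data.target

# X is a 2-D float array: one row per sample, one column per feature.
print(type(X).__name__, X.dtype, X.shape)
# y is a 1-D float array: one target value per sample.
print(type(y).__name__, y.dtype, y.shape)
```

If `X.shape[0]` and `y.shape[0]` differ, or `X` is 1-D, `fit` will raise an error, so this check is worth running first on any new dataset.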

## Practice
Try other datasets from sklearn datasets (boston, ...) to train your own linear regression model, then add suitable regularization and observe how training behaves.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score, accuracy_score
import pandas as pd

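
def sklearn_to_df(sklearn_dataset):
    """Convert a sklearn dataset (Bunch) into a DataFrame with a 'target' column."""
    df = pd.DataFrame(sklearn_dataset.data, columns=sklearn_dataset.feature_names)
    df['target'] = pd.Series(sklearn_dataset.target)
    return df

# load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell requires scikit-learn < 1.2.
boston = datasets.load_boston()
df_boston = sklearn_to_df(boston)
df_boston.head()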

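|   | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.09 | 1.0 | 296.0 | 15.3 | 396.9 | 4.98 | 24.0 |
| 1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.9 | 9.14 | 21.6 |
| 2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 | 34.7 |
| 3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 | 33.4 |
| 4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.9 | 5.33 | 36.2 |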


In [2]:
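# NOTE: X still contains the 'target' column, so the model is given the answer.
# That is why the last coefficient below is 1.0 and the MSE is 0.00.
# To train on the features only, use X = df_boston.drop(columns='target').
X = df_boston
# Convert to a NumPy array before adding an axis; indexing a Series with
# [:, np.newaxis] is no longer supported in recent pandas versions.
y = df_boston['target'].to_numpy()[:, np.newaxis]
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=4)
regr = linear_model.LinearRegression()
regr.fit(x_train, y_train)
y_pred = regr.predict(x_test)
print('Coefficients: ', regr.coef_)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))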

Coefficients:  [[-8.29947746e-17  2.01227923e-16 -2.66822155e-16  3.13577103e-15
  -3.45870996e-15  2.32975396e-15  8.58688121e-17  1.09016528e-16
   2.60859043e-16 -6.93889390e-17  1.00668172e-16  2.77555756e-17
   7.30101743e-16  1.00000000e+00]]
Mean squared error: 0.00
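A coefficient of exactly 1.0 on the last column together with an MSE of 0.00 is the signature of target leakage: the `target` column was part of the features. The sketch below shows one way to keep features and target separate before splitting. It uses `load_diabetes` as a stand-in dataset (an assumption, chosen because `load_boston` is unavailable in recent scikit-learn releases), so its numbers are not comparable to the Boston results in this notebook.

```python
import pandas as pd
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Stand-in dataset (assumption), converted to a DataFrame with a 'target' column.
diabetes = datasets.load_diabetes()
df = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
df['target'] = diabetes.target

# Features exclude the target column; the target stays a separate 1-D Series.
X = df.drop(columns='target')
y = df['target']

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=4)

regr = linear_model.LinearRegression()
regr.fit(x_train, y_train)
mse = mean_squared_error(y_test, regr.predict(x_test))

# One coefficient per feature, none for the target itself.
print('Number of coefficients:', regr.coef_.shape[0])
print('Mean squared error: %.2f' % mse)
```

With the target removed from `X`, the model has to predict from the features alone, so a nonzero test error is expected.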


In [3]:
lasso = linear_model.Lasso(alpha=1.0)
lasso.fit(x_train, y_train)
y_pred = lasso.predict(x_test)
print('Coefficients: ', lasso.coef_)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))

Coefficients:  [-0.00000000e+00  0.00000000e+00 -0.00000000e+00  0.00000000e+00
 -0.00000000e+00  0.00000000e+00 -0.00000000e+00  0.00000000e+00
  0.00000000e+00 -3.19531926e-04 -0.00000000e+00  1.33469959e-04
 -0.00000000e+00  9.84989873e-01]
Mean squared error: 0.01


In [4]:
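# Ridge applies an L2 penalty; alpha sets the regularization strength.
ridge = linear_model.Ridge(alpha=1.0)
ridge.fit(x_train, y_train)
# NOTE: this line predicts with `lasso`, not `ridge`, so the MSE printed below
# is the Lasso model's error. Use ridge.predict(x_test) to evaluate Ridge.
y_pred = lasso.predict(x_test)
print('Coefficients: ', ridge.coef_)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))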

Coefficients:  [[-1.17419242e-05  4.74978990e-06 -1.11046154e-06  2.77109462e-04
  -9.62480266e-04  3.51500605e-04 -4.25294647e-07 -1.33239148e-04
   2.89777093e-05 -1.26786629e-06 -8.16878498e-05  9.45876429e-07
  -5.21182540e-05  9.99904138e-01]]
Mean squared error: 0.01
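To see what the `alpha` parameter does in each model, the sketch below fits Lasso and Ridge over a few `alpha` values and counts the nonzero coefficients. The dataset (`load_diabetes`) and the `alpha` grid are assumptions picked for illustration, and each model is evaluated with its own `predict`. The general pattern to look for is that Lasso (L1 penalty) drives coefficients to exactly zero as `alpha` grows, while Ridge (L2 penalty) only shrinks them.

```python
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Stand-in dataset (assumption); features only, no target column in X.
X, y = datasets.load_diabetes(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=4)

results = []
for alpha in (0.01, 0.1, 1.0, 10.0):  # illustrative grid, not a tuned one
    lasso = linear_model.Lasso(alpha=alpha).fit(x_train, y_train)
    ridge = linear_model.Ridge(alpha=alpha).fit(x_train, y_train)

    # Count coefficients that are not exactly zero.
    n_lasso = int(np.sum(lasso.coef_ != 0))
    n_ridge = int(np.sum(ridge.coef_ != 0))

    # Evaluate each model with its own predictions.
    mse_lasso = mean_squared_error(y_test, lasso.predict(x_test))
    mse_ridge = mean_squared_error(y_test, ridge.predict(x_test))

    results.append((alpha, n_lasso, n_ridge))
    print('alpha=%g  Lasso nonzero=%d MSE=%.2f | Ridge nonzero=%d MSE=%.2f'
          % (alpha, n_lasso, mse_lasso, n_ridge, mse_ridge))
```

In practice `alpha` is usually chosen by cross-validation (for example with `LassoCV` or `RidgeCV`) rather than by reading a table like this one.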
