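## [Assignment Focus]
Use the Lasso and Ridge models in Sklearn to train on various datasets. Make sure you understand the **data type** of what is fed into the model for training, as well as the meaning of each model parameter.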

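There are many kinds of machine learning models, but the training data usually follows a fixed format. Make sure you understand that format, so that when you apply a new model you can start training as quickly as possible.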
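As a minimal sketch of what that fixed format looks like, the snippet below inspects a dataset bundled with scikit-learn. `load_diabetes` is only an assumed stand-in here (it is not the dataset used in this notebook); the point is that estimators expect a 2-D feature matrix `X` of shape `(n_samples, n_features)` and a 1-D target vector `y` of shape `(n_samples,)`.

```python
from sklearn import datasets

# Stand-in dataset chosen for illustration only (an assumption, not from the notebook)
diabetes = datasets.load_diabetes()
X = diabetes.data    # feature matrix: one row per sample, one column per feature
y = diabetes.target  # target vector: one value per sample

# Inspect the container type, element type, and shape before training
print(type(X), X.dtype, X.shape)
print(type(y), y.dtype, y.shape)
```

Checking `shape` and `dtype` like this before calling `fit` is a quick way to catch mismatched sample counts or non-numeric columns.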

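## Practice Time
Try other datasets from sklearn datasets (boston, ...) to train your own linear regression model, then add suitable regularization and observe how training behaves.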
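To get a feel for what regularization does before running the exercise, here is a rough sketch that fits Lasso and Ridge with a few `alpha` values and reports how the coefficients change. It assumes the `load_diabetes` dataset as a stand-in and an illustrative list of `alpha` values; neither comes from the notebook. Lasso (L1 penalty) tends to drive some coefficients to exactly zero, while Ridge (L2 penalty) shrinks them without zeroing them.

```python
import numpy as np
from sklearn import datasets, linear_model

# Stand-in dataset for illustration only (an assumption, not from the notebook)
X, y = datasets.load_diabetes(return_X_y=True)

alphas = [0.001, 0.1, 1.0]  # illustrative regularization strengths
lasso_zero_counts = []
ridge_norms = []

for alpha in alphas:
    lasso = linear_model.Lasso(alpha=alpha).fit(X, y)
    ridge = linear_model.Ridge(alpha=alpha).fit(X, y)

    # Count Lasso coefficients that were driven exactly to zero
    lasso_zero_counts.append(int(np.sum(lasso.coef_ == 0)))
    # Measure the overall size of the Ridge coefficient vector
    ridge_norms.append(float(np.linalg.norm(ridge.coef_)))

    print(f"alpha={alpha}: Lasso zero coefficients={lasso_zero_counts[-1]}, "
          f"Ridge coefficient norm={ridge_norms[-1]:.1f}")
```

A larger `alpha` means a stronger penalty, so it is worth comparing the printed values across rows rather than reading any single one in isolation.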

In [51]:
from sklearn import datasets, linear_model
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import mean_squared_error, make_scorer

In [43]:
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release
boston = datasets.load_boston()
X = boston['data']
y = boston['target']
feature_names = boston['feature_names']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Candidate regularization strengths for the grid search
parameters = {'alpha': [1.0, 0.1, 0.01, 0.001]}


In [37]:
# Linear regression (no regularization), used as the baseline
regression = linear_model.LinearRegression()
regression.fit(X_train, y_train)

# Predict on the held-out test set
y_pred = regression.predict(X_test)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))

Mean squared error: 24.31


In [58]:
# Lasso: wrap mean_squared_error as a scorer where lower is better
# (greater_is_better=False negates the score, so best_score_ is negative)
regression = GridSearchCV(linear_model.Lasso(), parameters, cv=5, scoring=make_scorer(mean_squared_error, greater_is_better=False))
regression.fit(X_train, y_train)
print(regression.best_params_, regression.best_score_)

# Predict on the held-out test set
y_pred = regression.predict(X_test)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))

{'alpha': 0.001} -23.6543125142037
Mean squared error: 24.31


In [59]:
# Ridge: wrap mean_squared_error as a scorer where lower is better
# (greater_is_better=False negates the score, so best_score_ is negative)
regression = GridSearchCV(linear_model.Ridge(), parameters, cv=5, scoring=make_scorer(mean_squared_error, greater_is_better=False))
regression.fit(X_train, y_train)
print(regression.best_params_, regression.best_score_)

# Predict on the held-out test set
y_pred = regression.predict(X_test)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))

{'alpha': 0.001} -23.65071934286278
Mean squared error: 24.31
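The negative `best_score_` values above come from the scorer convention: `GridSearchCV` always maximizes, so an error metric is negated. As a hedged sketch, the snippet below compares the hand-built scorer with scikit-learn's built-in `'neg_mean_squared_error'` scoring string, which should select the same `alpha` and report the same score. It assumes `load_diabetes` as a stand-in dataset; the names `custom` and `builtin` are illustrative and not part of the notebook.

```python
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, make_scorer
from sklearn.model_selection import GridSearchCV

# Stand-in dataset for illustration only (an assumption, not from the notebook)
X, y = datasets.load_diabetes(return_X_y=True)
parameters = {'alpha': [1.0, 0.1, 0.01, 0.001]}

# Hand-built scorer: MSE negated so that "greater is better" still holds
custom = GridSearchCV(
    linear_model.Ridge(), parameters, cv=5,
    scoring=make_scorer(mean_squared_error, greater_is_better=False),
)
# Built-in scoring string that follows the same negated-MSE convention
builtin = GridSearchCV(
    linear_model.Ridge(), parameters, cv=5,
    scoring='neg_mean_squared_error',
)

custom.fit(X, y)
builtin.fit(X, y)

print(custom.best_params_, custom.best_score_)
print(builtin.best_params_, builtin.best_score_)
```

To report a conventional MSE from either search, flip the sign of `best_score_`.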
