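## [Assignment Focus]
Learn how to use hyper-parameter search in scikit-learn to find the best hyperparameters.

### Assignment
Use a different dataset and apply hyper-parameter search to see whether you can find the best hyperparameter combination.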

In [1]:
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn import metrics
import warnings
warnings.filterwarnings('ignore')

In [2]:
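# load_digits is scikit-learn's built-in 8x8 handwritten digits dataset (1,797 samples), not the full MNIST
digits = datasets.load_digits()
x_train, x_test, Y_train, Y_test = train_test_split(digits.data, digits.target, test_size=0.2, random_state=0)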

In [3]:
GBD = GradientBoostingClassifier(random_state=5)
GBD.fit(x_train, Y_train)
acc = GBD.score(x_test, Y_test)
print(f"[No tune hyper-parameter]GradientBoostingClassifier accuracy:{acc}")

[No tune hyper-parameter]GradientBoostingClassifier accuracy:0.9583333333333334


In [4]:
# Candidate values to search: 3 x 3 = 9 combinations
hyper_per = {'n_estimators': [100, 150, 200], 'max_depth': [1, 3, 5]}

[The scoring parameter: defining model evaluation rules](https://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter)

In [5]:
# n_jobs=-1 uses all CPU cores; cv is left at its default (5-fold in recent scikit-learn versions)
grid_search = GridSearchCV(GBD, hyper_per, scoring='accuracy', n_jobs=-1)
grid_result = grid_search.fit(x_train, Y_train)

In [6]:
# best_score_ is the mean cross-validated accuracy on the training data, not the test-set accuracy
print("Best Accuracy: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

Best Accuracy: 0.955463 using {'max_depth': 3, 'n_estimators': 150}
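
The best score alone hides how close the other candidates were. The fitted search object also keeps per-combination results in `cv_results_`, which can be ranked and printed. The following is a minimal sketch rather than the notebook's own code: the reduced grid and `cv=3` are assumptions chosen only to keep the run short, and the names `param_grid`, `search`, and `rows` are illustrative.

```python
from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

digits = datasets.load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# Assumption: a deliberately small grid and 3 folds so the sketch runs quickly.
param_grid = {"n_estimators": [10, 20], "max_depth": [1, 2]}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=5),
    param_grid,
    scoring="accuracy",
    cv=3,
)
search.fit(x_train, y_train)

# cv_results_ holds one entry per parameter combination.
results = search.cv_results_
rows = sorted(
    zip(
        results["rank_test_score"],
        results["mean_test_score"],
        results["std_test_score"],
        results["params"],
    ),
    key=lambda row: row[0],  # rank 1 is the best combination
)
for rank, mean, std, params in rows:
    print(f"rank {rank}: {mean:.4f} +/- {std:.4f} {params}")
```

Comparing the mean and standard deviation across rows gives a sense of whether the top-ranked combination is clearly better or within noise of the others.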


In [7]:
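# Rebuild the classifier with the best parameters found by the grid search.
# Note: random_state is not set here (unlike the baseline), so the score may vary slightly between runs.
GBD_best_per = GradientBoostingClassifier(**grid_result.best_params_)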
GBD_best_per.fit(x_train, Y_train)
acc = GBD_best_per.score(x_test, Y_test)
print(f"[Tune hyper-parameter]GradientBoostingClassifier accuracy:{acc}")

[Tune hyper-parameter]GradientBoostingClassifier accuracy:0.9611111111111111
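
Because `GridSearchCV` uses `refit=True` by default, the fitted search object already holds a model retrained on the whole training set with the best parameters, so the manual re-fit above can be skipped. A minimal sketch, again with a reduced grid and `cv=3` as assumptions made for speed:

```python
from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

digits = datasets.load_digits()
x_train, x_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# Assumption: small grid and 3 folds, only to keep the example fast.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=5),
    {"n_estimators": [10, 20], "max_depth": [2]},
    scoring="accuracy",
    cv=3,
)
search.fit(x_train, y_train)

# best_estimator_ is a clone of the base estimator with the best parameters,
# refit on all of x_train; it keeps random_state=5 from the base estimator.
best_model = search.best_estimator_
test_acc = best_model.score(x_test, y_test)
print(f"best params: {search.best_params_}, test accuracy: {test_acc:.4f}")
```

Reusing `best_estimator_` also carries over every setting of the base estimator, including `random_state`, which keeps the tuned model comparable with the untuned baseline.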


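- Conclusion: With the best parameters found by GridSearchCV, test accuracy on the scikit-learn digits dataset (`load_digits`, not the full MNIST) rose from 0.958 to 0.961. With 360 test samples this is a gain of one correctly classified sample, so the improvement is marginal.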
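
The same workflow carries over to other datasets and scoring rules. The sketch below is an illustration, not part of the original notebook: the choice of `load_wine`, the `f1_macro` scorer, and the grid values are all assumptions.

```python
from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: the wine dataset stands in for "a different dataset".
wine = datasets.load_wine()
x_train, x_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=0, stratify=wine.target
)

param_grid = {"n_estimators": [50, 100], "max_depth": [1, 3]}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=5),
    param_grid,
    scoring="f1_macro",  # any scorer name from the scikit-learn scoring table works here
)
search.fit(x_train, y_train)

# search.score applies the same scorer (macro F1) to the held-out test set.
test_f1 = search.score(x_test, y_test)
print(f"best params: {search.best_params_}")
print(f"mean CV f1_macro: {search.best_score_:.4f}, test f1_macro: {test_f1:.4f}")
```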