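## [Assignment Focus]
Make sure you understand what each hyperparameter of the random forest model means, and observe how tuning these hyperparameters affects the results.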
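One way to observe the effect of a single hyperparameter is to vary it while holding everything else fixed. The sketch below is only an illustration: the helper name `sweep_param`, the candidate values, and the choice of the iris dataset are assumptions, not part of the original assignment. The model's `random_state` is fixed so that score differences come from the hyperparameter rather than from bootstrap sampling noise.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def sweep_param(name, values, seed=0):
    """Fit one forest per value of `name`; return {value: test accuracy}.

    `sweep_param` is a hypothetical helper written for this illustration.
    """
    data = load_iris()
    x_train, x_test, y_train, y_test = train_test_split(
        data.data, data.target, test_size=0.2, random_state=4)
    scores = {}
    for value in values:
        # Baseline settings; the swept parameter overrides its entry.
        params = {"n_estimators": 20, "random_state": seed}
        params[name] = value
        model = RandomForestClassifier(**params)
        model.fit(x_train, y_train)
        scores[value] = model.score(x_test, y_test)  # mean accuracy
    return scores


if __name__ == "__main__":
    for name, values in [("max_depth", [1, 2, 3, None]),
                         ("min_samples_leaf", [1, 5, 20]),
                         ("n_estimators", [1, 5, 20, 100])]:
        print(name, sweep_param(name, values))
```

On a dataset as small as iris the test set holds only 30 samples, so small differences in accuracy should be read with caution.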

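## Assignment

1. Try adjusting the parameters in RandomForestClassifier(...) and observe whether the results change.
2. Switch to other datasets (boston, wine) and compare the results with those of regression models and decision trees.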
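The comparison in item 2 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the assignment's reference solution: the helper `compare` is hypothetical, `LogisticRegression` and `LinearRegression` are used here as the "regression model" baselines, and `load_diabetes` stands in for boston because `load_boston` is no longer available in scikit-learn 1.2 and later.

```python
from sklearn.datasets import load_diabetes, load_wine
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor


def compare(data, models, metric):
    """Fit every model on the same split; return {model name: metric}.

    `compare` is a hypothetical helper written for this illustration.
    """
    x_train, x_test, y_train, y_test = train_test_split(
        data.data, data.target, test_size=0.2, random_state=4)
    results = {}
    for model in models:
        model.fit(x_train, y_train)
        results[type(model).__name__] = metric(y_test, model.predict(x_test))
    return results


# Classification: higher accuracy is better.
wine_scores = compare(
    load_wine(),
    [LogisticRegression(max_iter=10000),
     DecisionTreeClassifier(random_state=0),
     RandomForestClassifier(n_estimators=20, random_state=0)],
    accuracy_score)

# Regression: lower mean squared error is better.
diabetes_scores = compare(
    load_diabetes(),
    [LinearRegression(),
     DecisionTreeRegressor(random_state=0),
     RandomForestRegressor(n_estimators=20, random_state=0)],
    mean_squared_error)

print("wine accuracy:", wine_scores)
print("diabetes MSE:", diabetes_scores)
```

Using one shared split for all models keeps the comparison fair; a single split is still noisy, so cross-validation would give a more stable picture.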

In [1]:
from sklearn import datasets, metrics
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

In [2]:
print("#"*10, "classification", "#"*10)
for data in [datasets.load_iris(), datasets.load_wine(), datasets.load_breast_cancer()]:
    print("="*6, data['DESCR'].splitlines()[0], "shape is", data.data.shape)
    # The split uses a fixed random_state, so it is computed once per dataset.
    x_train, x_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=4)
    # For each model family: an unconstrained variant, then a heavily constrained one.
    models = [
        RandomForestClassifier(n_estimators=20, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1),
        RandomForestClassifier(n_estimators=10, criterion='entropy', max_depth=3, min_samples_split=0.6, min_samples_leaf=0.1),
        DecisionTreeClassifier(criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1),
        DecisionTreeClassifier(criterion='entropy', max_depth=3, min_samples_split=0.6, min_samples_leaf=0.1),
    ]
    for model in models:
        print("-"*6, type(model).__name__, ":", model.get_params())
        model.fit(x_train, y_train)
        y_pred = model.predict(x_test)
        print("Accuracy: ", metrics.accuracy_score(y_test, y_pred))
        print("Feature importance: ", dict(zip(data.feature_names, model.feature_importances_)))
print("#"*10, "regression", "#"*10)
# Note: scikit-learn 1.2 removed load_boston and the 'mse'/'mae' criterion names
# (now 'squared_error'/'absolute_error'), so this cell needs an older release.
for data in [datasets.load_boston(), datasets.load_diabetes()]:
    print("="*6, data['DESCR'].splitlines()[0], "shape is", data.data.shape)
    # The split uses a fixed random_state, so it is computed once per dataset.
    x_train, x_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=4)
    models = [
        RandomForestRegressor(n_estimators=20, criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1),
        RandomForestRegressor(n_estimators=10, criterion='mae', max_depth=3, min_samples_split=0.6, min_samples_leaf=0.1),
        DecisionTreeRegressor(criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1),
        DecisionTreeRegressor(criterion='mae', max_depth=3, min_samples_split=0.6, min_samples_leaf=0.1),
    ]
    for model in models:
        print("-"*6, type(model).__name__, ":", model.get_params())
        model.fit(x_train, y_train)
        y_pred = model.predict(x_test)
        print("MSE: ", metrics.mean_squared_error(y_test, y_pred))
        print("Feature importance: ", dict(zip(data.feature_names, model.feature_importances_)))

########## classification ##########
------ RandomForestClassifier : {'bootstrap': True, 'class_weight': None, 'criterion': 'gini', 'max_depth': None, 'max_features': 'auto', 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 20, 'n_jobs': None, 'oob_score': False, 'random_state': None, 'verbose': 0, 'warm_start': False}
Accuracy:  0.9666666666666667
Feature importance:  {'sepal length (cm)': 0.10242042493649711, 'sepal width (cm)': 0.019746328866933126, 'petal length (cm)': 0.5000921630444558, 'petal width (cm)': 0.377741083152114}
------ RandomForestClassifier : {'bootstrap': True, 'class_weight': None, 'criterion': 'entropy', 'max_depth': 3, 'max_features': 'auto', 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 0.1, 'min_samples_split': 0.6, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 10, 'n_jobs': 

MSE:  17.411298529411756
Feature importance:  {'CRIM': 0.04514698064066565, 'ZN': 0.00044197779955872346, 'INDUS': 0.005769995591428217, 'CHAS': 0.0011193870746782867, 'NOX': 0.021457300172308368, 'RM': 0.5155080261742151, 'AGE': 0.011728075813551264, 'DIS': 0.039790487996288176, 'RAD': 0.005654112268636201, 'TAX': 0.01572273208049088, 'PTRATIO': 0.01817434173787532, 'B': 0.02313502767628773, 'LSTAT': 0.29635155497401594}
------ RandomForestRegressor : {'bootstrap': True, 'criterion': 'mae', 'max_depth': 3, 'max_features': 'auto', 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_impurity_split': None, 'min_samples_leaf': 0.1, 'min_samples_split': 0.6, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 10, 'n_jobs': None, 'oob_score': False, 'random_state': None, 'verbose': 0, 'warm_start': False}
MSE:  60.42401813725488
Feature importance:  {'CRIM': 0.0, 'ZN': 0.0, 'INDUS': 0.0, 'CHAS': 0.0, 'NOX': 0.0, 'RM': 0.3333333333333333, 'AGE': 0.0, 'DIS': 0.0, 'RAD': 0.0, 'TAX': 0.0, '
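
The constrained models above pass floats for `min_samples_split` and `min_samples_leaf`. scikit-learn documents a float as a fraction of the number of samples, rounded up, so `min_samples_split=0.6` demands that a node hold at least 60% of the samples before it may be split. That allows very few splits, which is a plausible reason why most feature importances in the constrained runs are zero. The sketch below checks the interpretation on iris; the dataset choice and the variable names are assumptions made for illustration.

```python
import math

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

x, y = load_iris(return_X_y=True)
n_samples = x.shape[0]

# Documented rule: a float is a fraction of n_samples, rounded up.
min_split = math.ceil(0.6 * n_samples)
min_leaf = math.ceil(0.1 * n_samples)

# Same constraints expressed as fractions and as absolute counts.
as_fraction = DecisionTreeClassifier(
    min_samples_split=0.6, min_samples_leaf=0.1, random_state=0).fit(x, y)
as_count = DecisionTreeClassifier(
    min_samples_split=min_split, min_samples_leaf=min_leaf, random_state=0).fit(x, y)
# Unconstrained tree for comparison.
unconstrained = DecisionTreeClassifier(random_state=0).fit(x, y)

print("equivalent counts:", min_split, min_leaf)
print("leaves (fraction / count / unconstrained):",
      as_fraction.get_n_leaves(), as_count.get_n_leaves(),
      unconstrained.get_n_leaves())
# Features that are never used in a split get importance 0.
print("importances (fraction):", as_fraction.feature_importances_)
```

Both constrained trees should come out the same size, and far smaller than the unconstrained one.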