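### 1. Applying grid search to the Iris dataset
- Grid search performs cross-validation as part of the search: each candidate parameter value is scored by k-fold cross-validation on the training data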
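To make the point above concrete, here is a minimal sketch of what a grid search over `n_neighbors` does by hand: score each candidate with `cross_val_score` and keep the best mean. The names `mean_scores` and `best_k` are illustrative only, and the split settings are copied from the cells below. This is an approximation of the idea, not a reproduction of `GridSearchCV` internals.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, shuffle=True, random_state=0)

# Score every candidate value with 10-fold cross-validation on the training set.
mean_scores = {}
for k in range(1, 10):
    fold_scores = cross_val_score(KNeighborsClassifier(n_neighbors=k),
                                  x_train, y_train, cv=10)
    mean_scores[k] = fold_scores.mean()

# Keep the candidate with the highest mean fold accuracy.
best_k = max(mean_scores, key=mean_scores.get)
print(best_k, mean_scores[best_k])
```

The test set is never touched during this loop; it is held back for the final evaluation.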

In [1]:
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
import warnings
warnings.filterwarnings('ignore')

In [2]:
iris= load_iris()

In [3]:
x_train, x_test, y_train, y_test= train_test_split(iris.data, iris.target, 
                                                   test_size= 0.3,
                                                   shuffle= True,
                                                   random_state= 0)

In [4]:
x_train.shape

(105, 4)

In [5]:
y_train.shape

(105,)

In [6]:
x_test.shape

(45, 4)

In [7]:
y_test.shape

(45,)
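The shapes confirm a 105/45 split, but they say nothing about class balance. Because the split above does not pass `stratify`, the per-class counts can differ between the two sets. A quick check, as a sketch (the names `train_counts` and `test_counts` are illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, shuffle=True, random_state=0)

# Count how many samples of each of the three classes landed in each set.
train_counts = np.bincount(y_train, minlength=3)
test_counts = np.bincount(y_test, minlength=3)
print('train:', train_counts)
print('test: ', test_counts)
```

If the counts turn out to be uneven, passing `stratify=iris.target` to `train_test_split` is one way to keep the class proportions equal in both sets.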

### Applying grid search to KNN

In [8]:
param_knn= {'n_neighbors': range(1,10)}

In [9]:
knn= GridSearchCV(KNeighborsClassifier(), # model
                  param_knn,              # parameter grid to search
                  cv= 10)                 # number of cross-validation folds

In [10]:
knn.fit(x_train, y_train)

GridSearchCV(cv=10, estimator=KNeighborsClassifier(),
             param_grid={'n_neighbors': range(1, 10)})

In [11]:
print('Best parameters: ', knn.best_params_)
print('Best cross-validation score: ', knn.best_score_)
print('Best estimator: ', knn.best_estimator_)

Best parameters:  {'n_neighbors': 6}
Best cross-validation score:  0.9609090909090909
Best estimator:  KNeighborsClassifier(n_neighbors=6)
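The best score alone hides how close the other candidates were. As a sketch, the fitted search object also exposes `cv_results_`, which holds the mean and standard deviation of the fold scores for every candidate. The variable name `search` is illustrative; the setup repeats the cells above so the block runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, shuffle=True, random_state=0)

search = GridSearchCV(KNeighborsClassifier(), {'n_neighbors': range(1, 10)}, cv=10)
search.fit(x_train, y_train)

# One row per candidate: parameters, mean fold accuracy, spread across folds.
results = search.cv_results_
for params, mean, std in zip(results['params'],
                             results['mean_test_score'],
                             results['std_test_score']):
    print(params, round(mean, 4), round(std, 4))
```

When several candidates score within one standard deviation of each other, the choice among them is not strongly supported by the data.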


In [12]:
knn.predict(x_test)

array([2, 1, 0, 2, 0, 2, 0, 1, 1, 1, 2, 1, 1, 1, 1, 0, 1, 1, 0, 0, 2, 1,
       0, 0, 2, 0, 0, 1, 1, 0, 2, 1, 0, 2, 2, 1, 0, 2, 1, 1, 2, 0, 2, 0,
       0])

In [13]:
knn.score(x_test, y_test)

0.9777777777777777
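Calling `predict` or `score` on the search object works because `GridSearchCV` refits the best candidate on the whole training set by default (`refit=True`). The sketch below checks this by fitting a plain classifier with the best parameters and comparing test accuracy. The names `search` and `manual` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, shuffle=True, random_state=0)

search = GridSearchCV(KNeighborsClassifier(), {'n_neighbors': range(1, 10)}, cv=10)
search.fit(x_train, y_train)

# Fit a fresh model with the selected parameters on the full training set.
manual = KNeighborsClassifier(**search.best_params_).fit(x_train, y_train)

search_acc = search.score(x_test, y_test)   # delegates to best_estimator_
manual_acc = manual.score(x_test, y_test)
print(search_acc, manual_acc)
```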

### Applying grid search to a decision tree

In [14]:
param_dt= {'max_depth': range(1,10),
           'max_leaf_nodes': range(2,10),   # must be at least 2; scikit-learn rejects 1
           'min_samples_leaf': range(1,10)}

In [15]:
dt= GridSearchCV(DecisionTreeClassifier(), # model
                 param_dt,                 # parameter grid to search
                 cv= 5)                    # number of cross-validation folds

In [16]:
dt.fit(x_train, y_train)

GridSearchCV(cv=5, estimator=DecisionTreeClassifier(),
             param_grid={'max_depth': range(1, 10),
                         'max_leaf_nodes': range(2, 10),
                         'min_samples_leaf': range(1, 10)})

In [17]:
print('Best parameters: ', dt.best_params_)
print('Best cross-validation score: ', dt.best_score_)
print('Best estimator: ', dt.best_estimator_)

Best parameters:  {'max_depth': 4, 'max_leaf_nodes': 5, 'min_samples_leaf': 2}
Best cross-validation score:  0.9619047619047618
Best estimator:  DecisionTreeClassifier(max_depth=4, max_leaf_nodes=5, min_samples_leaf=2)
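A three-parameter grid grows multiplicatively, which is worth checking before fitting. The sketch below counts the candidates with `ParameterGrid`. It assumes `max_leaf_nodes` starts at 2, the smallest value scikit-learn accepts, so the count differs from a grid that also includes 1. The names `param_grid`, `n_candidates`, and `n_folds` are illustrative.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {'max_depth': range(1, 10),
              'max_leaf_nodes': range(2, 10),   # assumption: start at 2
              'min_samples_leaf': range(1, 10)}

n_candidates = len(ParameterGrid(param_grid))   # 9 * 8 * 9 combinations
n_folds = 5

print(n_candidates)             # 648
print(n_candidates * n_folds)   # 3240 cross-validation fits, plus one final refit
```

On a dataset as small as Iris this is cheap, but the same grid on a large dataset may call for a coarser grid or a randomized search.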


In [18]:
dt.predict(x_test)

array([2, 1, 0, 2, 0, 2, 0, 1, 1, 1, 2, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1,
       0, 0, 2, 0, 0, 1, 1, 0, 2, 1, 0, 2, 2, 1, 0, 2, 1, 1, 2, 0, 2, 0,
       0])

In [19]:
dt.score(x_test, y_test)

0.9555555555555556
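A single accuracy number does not show which classes are confused. As a sketch, the block below refits both models with the best parameters reported above and prints a confusion matrix for each. Setting `random_state=0` on the tree is an assumption added here for repeatability; the original search did not fix it, so the tree's numbers may differ slightly from the ones above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, shuffle=True, random_state=0)

models = {
    'knn': KNeighborsClassifier(n_neighbors=6),
    'tree': DecisionTreeClassifier(max_depth=4, max_leaf_nodes=5,
                                   min_samples_leaf=2,
                                   random_state=0),  # assumption: fixed seed
}

matrices = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    # Rows are true classes, columns are predicted classes.
    matrices[name] = confusion_matrix(y_test, model.predict(x_test))
    print(name)
    print(matrices[name])
```

The diagonal of each matrix counts correct predictions, so its sum divided by 45 equals the test accuracy of that model.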