# Hyperparameter Tuning

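- hyperparameter: a setting that you specify directly to configure the model
- model parameter: a variable that the model learns during training, such as the regression coefficients (weights) and the intercept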
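To make the distinction concrete, here is a minimal sketch. It assumes scikit-learn's `LogisticRegression` on the iris data, which is a choice made for illustration only: `C` and `max_iter` are hyperparameters set before training, while `coef_` and `intercept_` are model parameters that exist only after `fit()`.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hyperparameters: chosen by us before training (C and max_iter here).
clf = LogisticRegression(C=0.5, max_iter=1000)
hyperparams = clf.get_params()

# Model parameters: learned from the data during fit().
clf.fit(X, y)
coef_shape = clf.coef_.shape            # one weight per (class, feature)
intercept_shape = clf.intercept_.shape  # one intercept per class

print(hyperparams["C"], coef_shape, intercept_shape)
```

Tuning tools such as the ones below search over values like `C`; they never set `coef_` directly.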

### GridSearchCV

In [17]:
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

# Load the data
iris_input, iris_target = load_iris(return_X_y=True)

# Create the model
knn = KNeighborsClassifier()

# Parameter values to test
params = {
    'n_neighbors': range(1, 13, 2)
}

# First argument: the model
# Second argument: the parameters to test (dictionary)
# scoring: evaluation metric (e.g. accuracy, precision, recall, f1)
# cv: number of cross-validation folds
grid = GridSearchCV(knn, params, scoring='accuracy', cv=5)
grid.fit(iris_input, iris_target)

print("Best parameters:", grid.best_params_)
print("Best estimator:", grid.best_estimator_)
print("Best score:", grid.best_score_)

Best parameters: {'n_neighbors': 7}
Best estimator: KNeighborsClassifier(n_neighbors=7)
Best score: 0.9800000000000001


In [18]:
best_knn = grid.best_estimator_
# best_knn.predict(iris_input)
best_knn.score(iris_input, iris_target)

0.9733333333333334
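The score above is computed on the same data that was used for the search, so it tends to be optimistic. A minimal sketch of one common alternative, assuming a 75/25 stratified split (the split ratio and `random_state` are arbitrary choices for this example): run the search on the training portion only and report the score on the held-out portion.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

grid = GridSearchCV(
    KNeighborsClassifier(),
    {"n_neighbors": range(1, 13, 2)},
    scoring="accuracy",
    cv=5,
)
grid.fit(X_train, y_train)

cv_score = grid.best_score_              # mean cross-validated accuracy on the training split
test_score = grid.score(X_test, y_test)  # accuracy of the refit best model on unseen data

print(grid.best_params_, cv_score, test_score)
```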

### RandomizedSearchCV

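- You supply a list of values or a range of values for each hyperparameter; the search draws random values from it to look for the best hyperparameter combination.
    - When the search space is large, it can reach good results in a short time.
    - Because the values are sampled at random, it can miss the global optimum.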

In [19]:
from sklearn.model_selection import RandomizedSearchCV

# Create the model
knn = KNeighborsClassifier()

# Parameter values to test
params = {
    'n_neighbors': range(1, 100, 2)
}

# n_iter: number of hyperparameter combinations to sample (default: 10)
#         larger values take longer / smaller values are less likely to find a good combination
rd_search = RandomizedSearchCV(knn, params, cv=5, n_iter=10, random_state=0)
rd_search.fit(iris_input, iris_target)

print("Best parameters:", rd_search.best_params_)
print("Best estimator:", rd_search.best_estimator_)
print("Best score:", rd_search.best_score_)

Best parameters: {'n_neighbors': 5}
Best estimator: KNeighborsClassifier()
Best score: 0.9733333333333334
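Besides a list of values, `RandomizedSearchCV` also accepts a distribution object with an `rvs` method. The sketch below assumes `scipy.stats.randint(1, 100)`, which draws integers from 1 to 99; this is a different search space from the odd-only range used above and is chosen only to illustrate the API.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# A distribution instead of an explicit list: integers in [1, 100).
param_distributions = {"n_neighbors": randint(1, 100)}

search = RandomizedSearchCV(
    KNeighborsClassifier(),
    param_distributions,
    n_iter=10,        # number of sampled candidates
    cv=5,
    random_state=0,   # makes the sampled candidates reproducible
)
search.fit(X, y)

n_candidates = len(search.cv_results_["params"])
print(n_candidates, search.best_params_, search.best_score_)
```

Only `n_iter` candidates are evaluated regardless of how large the range is, which is where both the speed-up and the risk of missing the best value come from.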


---

### HyperOpt

**hyperopt.hp module**
<table border="1">
  <thead>
    <tr>
      <th>Function</th>
      <th>Description</th>
      <th>Usage</th>
      <th>Example</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>hp.uniform</td>
      <td>Samples a continuous real value uniformly between low and high</td>
      <td>hp.uniform(label, low, high)</td>
      <td><code>hp.uniform('learning_rate', 0.01, 0.1)</code></td>
    </tr>
    <tr>
      <td>hp.quniform</td>
      <td>Samples a uniform value rounded to a multiple of q (returned as a float)</td>
      <td>hp.quniform(label, low, high, q)</td>
      <td><code>hp.quniform('num_layers', 1, 5, 1)</code></td>
    </tr>
    <tr>
      <td>hp.loguniform</td>
      <td>Samples a real value on a log scale, exp(uniform(low, high)); low and high are exponents</td>
      <td>hp.loguniform(label, low, high)</td>
      <td><code>hp.loguniform('reg_param', -3, 0)</code></td>
    </tr>
    <tr>
      <td>hp.randint</td>
      <td>Samples an integer in the range [0, upper)</td>
      <td>hp.randint(label, upper)</td>
      <td><code>hp.randint('num_trees', 100)</code></td>
    </tr>
    <tr>
      <td>hp.choice</td>
      <td>Picks one value from the given list</td>
      <td>hp.choice(label, options)</td>
      <td><code>hp.choice('optimizer', ['adam', 'sgd', 'rmsprop'])</code></td>
    </tr>
    <tr>
      <td>hp.normal</td>
      <td>Samples a value from a normal distribution</td>
      <td>hp.normal(label, mean, std)</td>
      <td><code>hp.normal('dropout_rate', 0.3, 0.05)</code></td>
    </tr>
    <tr>
      <td>hp.lognormal</td>
      <td>Samples a value from a log-normal distribution</td>
      <td>hp.lognormal(label, mean, std)</td>
      <td><code>hp.lognormal('scale', 0, 1)</code></td>
    </tr>
  </tbody>
</table>
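Two rows of the table are easy to misread: `hp.quniform` returns floats, and the bounds of `hp.loguniform` are exponents rather than the values themselves. The sketch below does not call hyperopt; it re-implements the documented formulas with NumPy (an assumption made so the example runs without hyperopt installed) to show the resulting ranges.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# hp.uniform('learning_rate', 0.01, 0.1): uniform between low and high
uniform = rng.uniform(0.01, 0.1, size=n)

# hp.quniform('num_layers', 1, 5, 1): round(uniform(low, high) / q) * q
q = 1
quniform = np.round(rng.uniform(1, 5, size=n) / q) * q

# hp.loguniform('reg_param', -3, 0): exp(uniform(low, high))
loguniform = np.exp(rng.uniform(-3, 0, size=n))

print(sorted(set(quniform.tolist())))      # whole numbers, but stored as floats
print(loguniform.min(), loguniform.max())  # between exp(-3) and exp(0) = 1
```

Because `quniform` yields floats such as `3.0`, values meant for integer parameters need an explicit `int(...)` cast before they are passed to a model.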

In [20]:
# !pip install hyperopt

In [21]:
from hyperopt import hp

# Search space
search_space = {
    'x': hp.quniform('x', -10, 10, 1),
    'y': hp.quniform('y', -15, 15, 1)
}

In [22]:
import hyperopt

# Objective function
def objective(search_space):
    x = search_space['x']
    y = search_space['y']
    return {
        'loss': x ** 2 + 20 * y,
        'status': hyperopt.STATUS_OK
    }

In [23]:
from hyperopt import fmin, tpe, Trials

trials = Trials()

best_val = fmin(
    fn=objective,
    space=search_space,
    algo=tpe.suggest,
    max_evals=500,
    trials=trials
)
best_val

100%|██████████| 500/500 [00:05<00:00, 87.62trial/s, best loss: -300.0] 


{'x': np.float64(0.0), 'y': np.float64(-15.0)}
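The result can be checked by hand. With a step of 1 the search space is the integer grid x in [-10, 10], y in [-15, 15], and the loss x² + 20y is smallest when x² is 0 and y is at its lower bound. A brute-force sketch over that grid, independent of hyperopt:

```python
import itertools

def loss(x, y):
    # Same expression as the objective function above.
    return x ** 2 + 20 * y

xs = range(-10, 11)   # values hp.quniform('x', -10, 10, 1) can take
ys = range(-15, 16)   # values hp.quniform('y', -15, 15, 1) can take

best_x, best_y = min(itertools.product(xs, ys), key=lambda p: loss(*p))
best_loss = loss(best_x, best_y)

print(best_x, best_y, best_loss)  # 0 -15 -300
```

This matches the `best loss: -300.0` reported by `fmin`, so in this run TPE reached the true minimum of the grid.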

In [24]:
trials.results

[{'loss': 76.0, 'status': 'ok'},
 {'loss': 141.0, 'status': 'ok'},
 {'loss': 164.0, 'status': 'ok'},
 {'loss': -259.0, 'status': 'ok'},
 {'loss': -79.0, 'status': 'ok'},
 {'loss': -51.0, 'status': 'ok'},
 {'loss': 16.0, 'status': 'ok'},
 {'loss': 309.0, 'status': 'ok'},
 {'loss': -240.0, 'status': 'ok'},
 {'loss': -36.0, 'status': 'ok'},
 {'loss': 29.0, 'status': 'ok'},
 {'loss': 221.0, 'status': 'ok'},
 {'loss': 61.0, 'status': 'ok'},
 {'loss': -104.0, 'status': 'ok'},
 {'loss': -176.0, 'status': 'ok'},
 {'loss': 284.0, 'status': 'ok'},
 {'loss': 329.0, 'status': 'ok'},
 {'loss': -231.0, 'status': 'ok'},
 {'loss': -264.0, 'status': 'ok'},
 {'loss': 185.0, 'status': 'ok'},
 {'loss': -279.0, 'status': 'ok'},
 {'loss': -279.0, 'status': 'ok'},
 {'loss': -175.0, 'status': 'ok'},
 {'loss': -59.0, 'status': 'ok'},
 {'loss': -279.0, 'status': 'ok'},
 {'loss': -264.0, 'status': 'ok'},
 {'loss': -191.0, 'status': 'ok'},
 {'loss': -139.0, 'status': 'ok'},
 {'loss': -11.0, 'status': 'ok'},
 ... (output truncated)

In [25]:
trials.vals

{'x': [np.float64(6.0),
  np.float64(-9.0),
  np.float64(-8.0),
  np.float64(-1.0),
  np.float64(9.0),
  np.float64(-3.0),
  np.float64(-4.0),
  np.float64(7.0),
  np.float64(0.0),
  np.float64(-8.0),
  np.float64(3.0),
  np.float64(9.0),
  np.float64(-1.0),
  np.float64(4.0),
  np.float64(-2.0),
  np.float64(2.0),
  np.float64(7.0),
  np.float64(-3.0),
  np.float64(4.0),
  np.float64(5.0),
  np.float64(1.0),
  np.float64(1.0),
  np.float64(-5.0),
  np.float64(1.0),
  np.float64(1.0),
  np.float64(-6.0),
  np.float64(3.0),
  np.float64(1.0),
  np.float64(7.0),
  np.float64(-6.0),
  np.float64(5.0),
  np.float64(-1.0),
  np.float64(-1.0),
  np.float64(-9.0),
  np.float64(-3.0),
  np.float64(-6.0),
  np.float64(-1.0),
  np.float64(10.0),
  np.float64(-4.0),
  np.float64(-2.0),
  np.float64(0.0),
  np.float64(3.0),
  np.float64(8.0),
  np.float64(-7.0),
  np.float64(-4.0),
  np.float64(2.0),
  np.float64(-2.0),
  np.float64(6.0),
  np.float64(-10.0),
  np.float64(4.0),
  np.float64(2.0),
  ... (output truncated)
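`trials.results` and `trials.vals` are aligned by trial index, so entry i of one describes the same trial as entry i of the other. As a small sanity check, the sketch below copies the first five losses and `x` values from the outputs above (the `y` values are cut off in the listing) and recovers `y` from loss = x² + 20y.

```python
# First five entries copied from trials.results and trials.vals['x'] above.
losses = [76.0, 141.0, 164.0, -259.0, -79.0]
xs = [6.0, -9.0, -8.0, -1.0, 9.0]

# Invert loss = x**2 + 20*y to recover the y sampled in each trial.
ys = [(loss - x ** 2) / 20 for loss, x in zip(losses, xs)]

# Index of the best (lowest-loss) trial among these five.
best_index = min(range(len(losses)), key=losses.__getitem__)

print(ys)          # [2.0, 3.0, 5.0, -13.0, -8.0]
print(best_index)  # 3
```

Every recovered `y` is a whole number inside [-15, 15], which is consistent with the `hp.quniform('y', -15, 15, 1)` search space.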


- Tuning XGBoost hyperparameters with hyperopt

In [26]:
from xgboost import XGBClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, cross_val_score
from hyperopt import fmin, tpe, Trials, hp
import hyperopt

# 0. Load and split the data
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=42)

# 1. Search space
search_space = {
    'n_estimators': hp.quniform('n_estimators', 100, 500, 100),
    'max_depth': hp.quniform('max_depth', 3, 10, 1),
    'learning_rate': hp.uniform('learning_rate', 0.01, 0.2),
    'colsample_bytree': hp.uniform('colsample_bytree', 0.5, 1)
}

# 2. Objective function
def xgb_objective(ss):
    # hp.quniform returns floats, so cast the integer-valued parameters
    xgb_clf = XGBClassifier(
        n_estimators=int(ss['n_estimators']),
        max_depth=int(ss['max_depth']),
        learning_rate=ss['learning_rate'],
        colsample_bytree=ss['colsample_bytree']
    )
    mean_acc = cross_val_score(xgb_clf, X_train, y_train, scoring='accuracy', cv=3).mean()
    # fmin minimizes the loss, so return the negative accuracy
    return {
        'loss': -1 * mean_acc,
        'status': hyperopt.STATUS_OK
    }

# 3. Trials() + fmin()
trials = Trials()
best = fmin(
    fn=xgb_objective,
    space=search_space,
    algo=tpe.suggest,
    max_evals=50,
    trials=trials
)

best

100%|██████████| 50/50 [00:11<00:00,  4.21trial/s, best loss: -0.9741784037558686]


{'colsample_bytree': np.float64(0.5010247093957608),
 'learning_rate': np.float64(0.16532913013986236),
 'max_depth': np.float64(6.0),
 'n_estimators': np.float64(500.0)}
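The dictionary returned by `fmin` holds floats even for `n_estimators` and the tree depth, because both come from `hp.quniform`. A minimal sketch of turning it into keyword arguments for a model: `to_model_params` and `INT_PARAMS` are hypothetical names introduced here, and the key names assume each hp label matches the XGBoost parameter it tunes.

```python
# Values copied from the fmin result above (plain floats stand in for np.float64).
best = {
    "colsample_bytree": 0.5010247093957608,
    "learning_rate": 0.16532913013986236,
    "max_depth": 6.0,
    "n_estimators": 500.0,
}

# Hypothetical helper constant: parameters the estimator expects as integers.
INT_PARAMS = ("n_estimators", "max_depth")

def to_model_params(best, int_params=INT_PARAMS):
    """Cast each value to the Python type the estimator expects."""
    return {
        name: int(value) if name in int_params else float(value)
        for name, value in best.items()
    }

model_params = to_model_params(best)
print(model_params)
```

The result can then be unpacked as `XGBClassifier(**model_params)`, mirroring the `int(...)` casts inside the objective function.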

---

### Optuna

<table border="1">
    <thead>
        <tr>
            <th>Function</th>
            <th>Description</th>
            <th>Usage</th>
            <th>Example</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>suggest_uniform</td>
            <td>Samples a continuous real value (deprecated since v3.0.0; use <code>suggest_float</code>)</td>
            <td>trial.suggest_uniform(name, low, high)</td>
            <td><code>trial.suggest_uniform('learning_rate', 0.01, 0.1)</code></td>
        </tr>
        <tr>
            <td>suggest_discrete_uniform</td>
            <td>Samples a real value on a grid with spacing q (deprecated since v3.0.0; use <code>suggest_float</code> with <code>step</code>)</td>
            <td>trial.suggest_discrete_uniform(name, low, high, q)</td>
            <td><code>trial.suggest_discrete_uniform('num_layers', 1, 5, 1)</code></td>
        </tr>
        <tr>
            <td>suggest_loguniform</td>
            <td>Samples a real value on a log scale (deprecated since v3.0.0; use <code>suggest_float</code> with <code>log=True</code>)</td>
            <td>trial.suggest_loguniform(name, low, high)</td>
            <td><code>trial.suggest_loguniform('reg_param', 1e-3, 1)</code></td>
        </tr>
        <tr>
            <td>suggest_int</td>
            <td>Samples an integer between low and high, both inclusive</td>
            <td>trial.suggest_int(name, low, high, *, step=1, log=False)</td>
            <td><code>trial.suggest_int('num_trees', 1, 100)</code></td>
        </tr>
        <tr>
            <td>suggest_categorical</td>
            <td>Picks one value from the given list</td>
            <td>trial.suggest_categorical(name, choices)</td>
            <td><code>trial.suggest_categorical('optimizer', ['adam', 'sgd', 'rmsprop'])</code></td>
        </tr>
        <tr>
            <td>suggest_float</td>
            <td>Samples a continuous real value (<code>step</code> and <code>log</code> are optional)</td>
            <td>trial.suggest_float(name, low, high, *, step=None, log=False)</td>
            <td><code>trial.suggest_float('alpha', 0.1, 1.0, step=0.1)</code></td>
        </tr>
    </tbody>
</table>

In [27]:
# !pip install optuna

In [28]:
import optuna

# Objective function
# suggest_float replaces suggest_uniform, which is deprecated since Optuna v3.0.0
def objective(trial):
    x = trial.suggest_float('x', -10, 10)
    y = trial.suggest_float('y', -15, 15)
    return (x - 3) ** 2 + (y + 5) ** 2

# Create the study
study = optuna.create_study(direction='minimize')

# Run the optimization
study.optimize(objective, n_trials=500)

[I 2025-08-11 10:26:12,336] A new study created in memory with name: no-name-955652d4-4ae6-425b-b854-6ea4efb1c509
[I 2025-08-11 10:26:12,339] Trial 0 finished with value: 169.68452849813104 and parameters: {'x': -8.110586402710407, 'y': 1.7999557561823636}. Best is trial 0 with value: 169.68452849813104.
[I 2025-08-11 10:26:12,340] Trial 1 finished with value: 218.09856034625034 and parameters: {'x': -8.926865548109252, 'y': -13.709100903283584}. Best is trial 0 with value: 169.68452849813104.
[I 2025-08-11 10:26:12,341] Trial 2 finished with value: 28.79093183574342 and parameters: {'x': 5.836177193435702, 'y': -0.4451091381704213}.
... (output truncated)

In [29]:
study.best_value

0.004556208545386644

In [30]:
study.best_params

{'x': 3.066422930433542, 'y': -4.987991550557733}

In [31]:
import optuna.visualization as vis

vis.plot_param_importances(study).show()

In [32]:
vis.plot_optimization_history(study).show()

- Tuning XGBoost hyperparameters with optuna

In [33]:
# 1. Objective function
def xgb_optuna_objective(trial):
    params = {
        # step is passed as a keyword; positional use is deprecated since Optuna v3.5.0
        'n_estimators': trial.suggest_int('n_estimators', 100, 500, step=100),
        'max_depth': trial.suggest_int('max_depth', 3, 10),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.2),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0)
    }
    xgb_clf = XGBClassifier(**params)
    return cross_val_score(xgb_clf, X_train, y_train, scoring='accuracy', cv=3).mean()

# 2. Create the study object -> optimize
study = optuna.create_study(direction='maximize')
study.optimize(xgb_optuna_objective, n_trials=50)

# 3. Print the results
print(study.best_params)
print(study.best_value)

[I 2025-08-11 10:26:20,973] A new study created in memory with name: no-name-811deb45-fb77-4ff8-995e-03330007b698
[I 2025-08-11 10:26:21,159] Trial 0 finished with value: 0.9577464788732395 and parameters: {'n_estimators': 300, 'max_depth': 5, 'learning_rate': 0.1904060540304981, 'colsample_bytree': 0.8447521616138891}. Best is trial 0 with value: 0.9577464788732395.
... (output truncated)

{'n_estimators': 500, 'max_depth': 7, 'learning_rate': 0.1438440611670294, 'colsample_bytree': 0.5257773175292658}
0.9741784037558686


##### HyperOpt vs Optuna

- HyperOpt
    - 'n_estimators': np.float64(500.0)
    - 'max_depth': np.float64(6.0)
    - 'learning_rate': np.float64(0.16532913013986236)
    - 'colsample_bytree': np.float64(0.5010247093957608)

- Optuna
    - 'n_estimators': 500
    - 'max_depth': 7
    - 'learning_rate': 0.1438440611670294
    - 'colsample_bytree': 0.5257773175292658

In [35]:
from sklearn.metrics import classification_report

xgb_hopt = XGBClassifier(
    n_estimators=500,
    max_depth=6,
    learning_rate=0.17,
    colsample_bytree=0.5
)

xgb_optuna = XGBClassifier(
    n_estimators=500,
    max_depth=7,
    learning_rate=0.14,
    colsample_bytree=0.53
)

xgb_hopt.fit(X_train, y_train)
xgb_optuna.fit(X_train, y_train)

hopt_pred = xgb_hopt.predict(X_test)
optuna_pred = xgb_optuna.predict(X_test)

print('With the best HyperOpt parameters')
print(classification_report(y_test, hopt_pred))
print('With the best Optuna parameters')
print(classification_report(y_test, optuna_pred))

With the best HyperOpt parameters
              precision    recall  f1-score   support

           0       0.94      0.94      0.94        54
           1       0.97      0.97      0.97        89

    accuracy                           0.96       143
   macro avg       0.96      0.96      0.96       143
weighted avg       0.96      0.96      0.96       143

With the best Optuna parameters
              precision    recall  f1-score   support

           0       0.94      0.94      0.94        54
           1       0.97      0.97      0.97        89

    accuracy                           0.96       143
   macro avg       0.96      0.96      0.96       143
weighted avg       0.96      0.96      0.96       143

