# AutoML tools: Hyperparameter Optimization with Optuna

In this notebook we use Optuna for **hyperparameter optimization** in machine learning. Hyperparameter optimization is a critical step in improving the performance of machine learning models, and Optuna automates the search for good hyperparameter values.

Before we dive into the specifics of Optuna, let's take a moment to understand what **hyperparameters** are. Hyperparameters are the parameters of a machine learning model that are **not learned** from the data during training. They are **set prior** to training and can have a significant influence on a model's performance and generalization ability. Examples of hyperparameters include the learning rate of an optimizer, the number of hidden layers in a neural network, and the regularization strength in a regression model.
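To make the distinction concrete, here is a minimal sketch using scikit-learn's `Ridge` on made-up data (the data and the variable names `X_toy`, `y_toy` and `ridge` are illustrative assumptions, not part of this notebook's dataset). The regularization strength `alpha` is set before training, while the coefficients are learned from the data.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Tiny synthetic regression problem (illustrative data only)
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(50, 3))
y_toy = X_toy @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

# alpha is a hyperparameter: we choose it before training
ridge = Ridge(alpha=1.0)
ridge.fit(X_toy, y_toy)

# coef_ and intercept_ are parameters: they are learned during fit()
print('Hyperparameter alpha:', ridge.alpha)
print('Learned coefficients:', ridge.coef_)
print('Learned intercept:', ridge.intercept_)
```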

Selecting appropriate hyperparameters is a crucial aspect of developing effective machine learning models. Poorly chosen hyperparameters can lead to poor performance through overfitting or underfitting.

Let's explore Optuna in greater detail to better understand its features and functionalities.

In [0]:
pip install -q optuna

Python interpreter will be restarted.


In [0]:
import pandas as pd
from sklearn.model_selection import train_test_split
import optuna



In [0]:
data = pd.read_csv('../../Data/Boston.csv')

X = data.iloc[:, 1:14]
y = data.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

#### Define the objective function

As we found out before, according to LazyRegressor, the best performing model without hyperparameter tuning is GradientBoostingRegressor with **0.79 r2-score** (see notebook 'AutoML tools: LazyPredict & PyCaret'). Let's optimize this model with Optuna.

We start by **defining the objective function**. The objective function is a crucial component of hyperparameter optimization: it returns the metric you want to optimize (e.g., accuracy or loss). The function receives a trial, draws hyperparameter values from it, builds and trains a model, and evaluates its performance on held-out data (in this notebook, the test split created above).

Let's take a look at GradientBoostingRegressor documentation: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html.

As we can see, GradientBoostingRegressor has many parameters. Which of them should we optimize?

The overall parameters can be divided into 3 categories:

- **Tree-Specific Parameters**: min_samples_split, min_samples_leaf, max_leaf_nodes etc.
- **Boosting Parameters**: learning_rate, n_estimators, subsample
- **Miscellaneous Parameters**: loss, random_state etc.

We will tune only the first two types of parameters. *Let's follow the general approach for parameter tuning, explained in this article: https://luminousdata.wordpress.com/2017/07/27/complete-guide-to-parameter-tuning-in-gradient-boosting-gbm-in-python/.*

First, we keep the default **learning rate** (0.1) and determine the **optimum number of trees** (n_estimators) for this learning rate.

In [0]:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score

# Objective function
def objective_1(trial):

  # Define hyperparameters to optimize
  # Suggest the number of trees in range [10, 300]
  n_estimators = trial.suggest_int('n_estimators', 10, 300)

  model = GradientBoostingRegressor(
        n_estimators = n_estimators,
        random_state = 42
    )

  # Train the model
  model.fit(X_train, y_train)

  # Make predictions on the test set
  y_pred = model.predict(X_test)

  # Calculate r2
  r2 = r2_score(y_test, y_pred)
  return r2

#### Create and run the study

In [0]:
# Create a new study (set of trials)
study = optuna.create_study(direction='maximize')

# Optimize an objective function
study.optimize(objective_1, n_trials=50)

# Print the results
print('Study #1')
# Attribute 'trials' returns the list of all trials
print('Number of finished trials:', len(study.trials))
# Attribute 'best_trial' returns the best trial in the study
print('Best trial:')
trial = study.best_trial
# 'value' returns the r2-score of the best trial in the study
print('Value:', trial.value)
print('Params:')
for key, value in trial.params.items():
    print(f'    {key}: {value}')
print("\n")

[I 2023-08-23 11:47:46,240] A new study created in memory with name: no-name-8fd9faec-5e27-4141-9b81-164fc11bc8d7
[I 2023-08-23 11:48:02,022] Trial 0 finished with value: 0.8003995993911465 and parameters: {'n_estimators': 201}. Best is trial 0 with value: 0.8003995993911465.
[I 2023-08-23 11:48:11,950] Trial 1 finished with value: 0.7950426318200838 and parameters: {'n_estimators': 109}. Best is trial 0 with value: 0.8003995993911465.
[I 2023-08-23 11:48:20,805] Trial 2 finished with value: 0.7695193068351005 and parameters: {'n_estimators': 42}. Best is trial 0 with value: 0.8003995993911465.
[I 2023-08-23 11:48:30,300] Trial 3 finished with value: 0.7950507406790485 and parameters: {'n_estimators': 110}. Best is trial 0 with value: 0.8003995993911465.
[I 2023-08-23 11:48:39,814] Trial 4 finished with value: 0.7507099617049262 and parameters: {'n_estimators': 23}. Best is trial 0 with value: 0.8003995993911465.
[I 2023-08-23 11:48:49,656] Trial 5 finished with value: 0.74884712291273

Study #1
Number of finished trials: 50
Best trial:
Value: 0.8008881922455585
Params:
    n_estimators: 189




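Our study found **the optimum number of trees** for this run. Because the sampler is not seeded, the best value can differ between runs: the output above reports 189, while the following cells use n_estimators = 178, presumably taken from a different run. The same applies to the other hard-coded values in later cells.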

Now we will use this number of trees in our model and tune tree-specific parameters. We should choose the order of tuning variables wisely, i.e. start with the ones that have a bigger effect on the outcome. For example, we need to focus on variables max_depth and min_samples_split first, as they have a strong impact.

Let's tune **max_depth** and **min_samples_split**.

In [0]:
def objective_2(trial):

  # Define hyperparameters to optimize
  max_depth = trial.suggest_int('max_depth', 2, 32, log=True)
  min_samples_split = trial.suggest_float('min_samples_split', 0.1, 1)

  model = GradientBoostingRegressor(
        n_estimators = 178,
        max_depth = max_depth,
        min_samples_split = min_samples_split,
        random_state = 42
    )

  # Train the model
  model.fit(X_train, y_train)

  # Make predictions on the test set
  y_pred = model.predict(X_test)

  # Calculate r2
  r2 = r2_score(y_test, y_pred)
  return r2

study = optuna.create_study(direction='maximize')

study.optimize(objective_2, n_trials=100)

print('Study #2')
print('Number of finished trials:', len(study.trials))
print('Best trial:')
trial = study.best_trial
print('Value:', trial.value)
print('Params:')
for key, value in trial.params.items():
    print(f'    {key}: {value}')
print("\n")

[I 2023-08-23 11:55:36,595] A new study created in memory with name: no-name-6745870b-0f07-4a50-916b-eed85342e27b
[I 2023-08-23 11:55:45,607] Trial 0 finished with value: 0.7304096886753353 and parameters: {'max_depth': 4, 'min_samples_split': 0.9442376261210743}. Best is trial 0 with value: 0.7304096886753353.
[I 2023-08-23 11:55:54,443] Trial 1 finished with value: 0.831903178739239 and parameters: {'max_depth': 11, 'min_samples_split': 0.11847084474561576}. Best is trial 1 with value: 0.831903178739239.
[I 2023-08-23 11:56:03,211] Trial 2 finished with value: 0.7267917489718398 and parameters: {'max_depth': 2, 'min_samples_split': 0.24356963878346233}. Best is trial 1 with value: 0.831903178739239.
[I 2023-08-23 11:56:12,393] Trial 3 finished with value: 0.7016448920331522 and parameters: {'max_depth': 11, 'min_samples_split': 0.8210066605375843}. Best is trial 1 with value: 0.831903178739239.
[I 2023-08-23 11:56:21,707] Trial 4 finished with value: 0.7297976468108636 and parameters

Study #2
Number of finished trials: 100
Best trial:
Value: 0.8462853202470999
Params:
    max_depth: 20
    min_samples_split: 0.1313271874690831




We found the best values for max_depth and min_samples_split. At this point, there is already a clear improvement in r2-score compared to the untuned model (about 0.85 versus 0.79).

Now, let's keep max_depth fixed in our model and tune **min_samples_split** and **min_samples_leaf** together.

In [0]:
def objective_3(trial):

  # Define hyperparameters to optimize
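  # Candidate values 1, 11, ..., 61; step is passed by keyword, and 61 is
  # the largest value reachable from 1 in steps of 10
  min_samples_leaf = trial.suggest_int('min_samples_leaf', 1, 61, step=10)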
  min_samples_split = trial.suggest_float('min_samples_split', 0.1, 1)

  model = GradientBoostingRegressor(
        n_estimators=178,
        max_depth=24,
        min_samples_split=min_samples_split,
        min_samples_leaf=min_samples_leaf,
        random_state=42
    )

  # Train the model
  model.fit(X_train, y_train)

  # Make predictions on the test set
  y_pred = model.predict(X_test)

  # Calculate r2
  r2 = r2_score(y_test, y_pred)
  return r2

study = optuna.create_study(direction='maximize')

study.optimize(objective_3, n_trials=50)

print('Study #3')
print('Number of finished trials:', len(study.trials))
print('Best trial:')
trial = study.best_trial
print('Value:', trial.value)
print('Params:')
for key, value in trial.params.items():
    print(f'    {key}: {value}')
print("\n")

[I 2023-08-23 12:11:50,958] A new study created in memory with name: no-name-a88006d9-23f7-457e-b5a3-dbdffef56370
[I 2023-08-23 12:12:00,572] Trial 0 finished with value: 0.724142190764552 and parameters: {'min_samples_leaf': 1, 'min_samples_split': 0.9632831884501147}. Best is trial 0 with value: 0.724142190764552.
[I 2023-08-23 12:12:10,259] Trial 1 finished with value: 0.8179874345355909 and parameters: {'min_samples_leaf': 1, 'min_samples_split': 0.22745569637225194}. Best is trial 1 with value: 0.8179874345355909.
[I 2023-08-23 12:12:19,696] Trial 2 finished with value: 0.607112121014506 and parameters: {'min_samples_leaf': 51, 'min_samples_split': 0.7074774187047405}. Best is trial 1 with value: 0.8179874345355909.
[I 2023-08-23 12:12:28,852] Trial 3 finished with value: 0.6760410124931437 and parameters: {'min_samples_leaf': 21, 'min_samples_split': 0.8665010636574427}. Best is trial 1 with value: 0.8179874345355909.
[I 2023-08-23 12:12:37,734] Trial 4 finished with value: 0.719

Study #3
Number of finished trials: 50
Best trial:
Value: 0.8462114170415743
Params:
    min_samples_leaf: 1
    min_samples_split: 0.286991444225391




The last tree-specific parameter we need to tune is **max_features**. We will try values from 1 to 13 in steps of 2.

In [0]:
def objective_4(trial):

  # Define hyperparameters to optimize
  max_features = trial.suggest_int('max_features', 1, 13, step=2)

  model = GradientBoostingRegressor(
        n_estimators=178,
        max_depth=24,
        min_samples_split=0.31220765553286495,
        min_samples_leaf=1,
        max_features = max_features,
        random_state=42
    )

  # Train the model
  model.fit(X_train, y_train)

  # Make predictions on the test set
  y_pred = model.predict(X_test)

  # Calculate r2
  r2 = r2_score(y_test, y_pred)
  return r2

study = optuna.create_study(direction='maximize')

study.optimize(objective_4, n_trials=50)

print('Study #4')
print('Number of finished trials:', len(study.trials))
print('Best trial:')
trial = study.best_trial
print('Value:', trial.value)
print('Params:')
for key, value in trial.params.items():
    print(f'    {key}: {value}')
print("\n")

[I 2023-08-23 12:19:41,748] A new study created in memory with name: no-name-f20f0290-2235-489b-8de8-d6d2433b05e8
[I 2023-08-23 12:19:50,516] Trial 0 finished with value: 0.7050458824913537 and parameters: {'max_features': 1}. Best is trial 0 with value: 0.7050458824913537.
[I 2023-08-23 12:19:59,317] Trial 1 finished with value: 0.7089990497383067 and parameters: {'max_features': 3}. Best is trial 1 with value: 0.7089990497383067.
[I 2023-08-23 12:20:08,624] Trial 2 finished with value: 0.7692184356104415 and parameters: {'max_features': 5}. Best is trial 2 with value: 0.7692184356104415.
[I 2023-08-23 12:20:17,870] Trial 3 finished with value: 0.7050458824913537 and parameters: {'max_features': 1}. Best is trial 2 with value: 0.7692184356104415.
[I 2023-08-23 12:20:27,037] Trial 4 finished with value: 0.7692184356104415 and parameters: {'max_features': 5}. Best is trial 2 with value: 0.7692184356104415.
[I 2023-08-23 12:20:36,572] Trial 5 finished with value: 0.8504990113714167 and p

Study #4
Number of finished trials: 50
Best trial:
Value: 0.8504990113714167
Params:
    max_features: 11




Now we will tune the boosting parameter **subsample**.

In [0]:
def objective_5(trial):
  # Define hyperparameters to optimize
  subsample = trial.suggest_float('subsample', 0.6, 1, step=0.05)

  model = GradientBoostingRegressor(
        n_estimators=178,
        max_depth=24,
        min_samples_split=0.31220765553286495,
        min_samples_leaf=1,
        max_features = 11,
        subsample=subsample,
        random_state=42
    )

  # Train the model
  model.fit(X_train, y_train)

  # Make predictions on the test set
  y_pred = model.predict(X_test)

  # Calculate r2
  r2 = r2_score(y_test, y_pred)
  return r2

study = optuna.create_study(direction='maximize')

study.optimize(objective_5, n_trials=50)

print('Study #5')
print('Number of finished trials:', len(study.trials))
print('Best trial:')
trial = study.best_trial
print('Value:', trial.value)
print('Params:')
for key, value in trial.params.items():
    print(f'    {key}: {value}')
print("\n")

[I 2023-08-23 12:27:21,632] A new study created in memory with name: no-name-4f4086be-a4a6-4164-8f5b-232aa4165101
[I 2023-08-23 12:27:30,889] Trial 0 finished with value: 0.8075411367714808 and parameters: {'subsample': 0.85}. Best is trial 0 with value: 0.8075411367714808.
[I 2023-08-23 12:27:39,396] Trial 1 finished with value: 0.7748563040063576 and parameters: {'subsample': 0.8}. Best is trial 0 with value: 0.8075411367714808.
[I 2023-08-23 12:27:48,173] Trial 2 finished with value: 0.8096222068375332 and parameters: {'subsample': 0.9}. Best is trial 2 with value: 0.8096222068375332.
[I 2023-08-23 12:27:56,780] Trial 3 finished with value: 0.7597690635120866 and parameters: {'subsample': 0.7}. Best is trial 2 with value: 0.8096222068375332.
[I 2023-08-23 12:28:05,702] Trial 4 finished with value: 0.7661907966616728 and parameters: {'subsample': 0.75}. Best is trial 2 with value: 0.8096222068375332.
[I 2023-08-23 12:28:14,837] Trial 5 finished with value: 0.7407362662069623 and para

Study #5
Number of finished trials: 50
Best trial:
Value: 0.8504990113714167
Params:
    subsample: 1.0




We can see that the default value of subsample (1.0) is optimal.

Now we will create a final model with all the parameters we tuned.

In [0]:
final_model = GradientBoostingRegressor(
    n_estimators=178,
    max_depth=24,
    min_samples_split=0.31220765553286495,
    min_samples_leaf=1,
    max_features=11,
    subsample=1,
    random_state=42
)

final_model.fit(X_train, y_train)
final_predictions = final_model.predict(X_test)
final_score = r2_score(y_test, final_predictions)
print('Trained and evaluated the final model using the best hyperparameters.\n')
print('Final model score:', final_score)

Trained and evaluated the final model using the best hyperparameters.

Final model score: 0.8504990113714167


Perfect! Using the Optuna library, we tuned the hyperparameters of the model and improved the r2-score from 0.79 for the untuned model to about 0.85.

## Your turn!

Now it's your turn to put what you've learned about the Optuna library into practice! You will try to optimize the model hyperparameters for a classification problem. Select one of the best untuned models based on the results of LazyPredict, create an objective function and run the study. Good luck!

In [0]:
# Task: Import titanic.csv dataset

titanic_df = ...

In [0]:
# Features only: the target 'Survived' must not be part of X
X = titanic_df[['Sex', 'Embarked', 'Pclass', 'Age']]
y = titanic_df['Survived']

In [0]:
# Task: split the dataset into train and test sets

...



In [0]:
# Choose one of the best untuned models based on LazyPredict results (see notebook 'AutoML tools: LazyPredict & PyCaret')
# Find documentation for this model and check which hyperparameters you can tune

# Define objective function
def objective(trial):
  ...
# Note: use classification score function!

# create a new study
study = ...

# optimize an objective function
...

# print the results
...



Congratulations! :) You finished the notebook about hyperparameter optimization with Optuna.

This notebook has provided an introduction to the Optuna library and its significance in automating hyperparameter tuning for machine learning models. By utilizing its functionalities, we tuned the hyperparameters for a regression problem and a classification problem.

We encourage you to explore the Optuna documentation further.

**Documentation:**

https://optuna.readthedocs.io/en/stable/reference/index.html

Keep up the excellent work!