# Why Is Everyone on Kaggle Obsessed with Optuna for Hyperparameter Tuning?
## Let's find out by trying it out...
![](images/pixabay.jpg)
<figcaption style="text-align: center;">
    <strong>
        Photo by 
        <a href='https://pixabay.com/users/bomei615-2623913/?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=1751855'>Bo Mei</a>
        on 
        <a href='https://pixabay.com/?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=1751855'>Pixabay.</a> All images are by author unless specified otherwise.
    </strong>
</figcaption>

## Setup

In [10]:
import matplotlib.pyplot as plt
import numpy as np
import optuna
import pandas as pd
import seaborn as sns

optuna.logging.set_verbosity(optuna.logging.WARNING)

## Introduction

Turns out, I have been living under a rock...

Every single MOOC I have taken taught me to use GridSearch for hyperparameter tuning. Naively, I loved it and went the extra mile to learn its cousins, Randomized Search and Halving Grid Search.

Even though they were great, I somehow tried to avoid hyperparameter tuning as much as possible. Why?

The main reason is that they take such a damn long time to run. We are talking multiples of 24-hour sessions if you are doing an exhaustive GridSearch over a large grid. If you are not, a randomized search with a small budget can easily come up with hyperparameters that are worse than the defaults.

While I was complaining, Kagglers have been using Optuna almost exclusively for the past 2 years to do hyperparameter tuning. 

After giving it a try, I am truly amazed at how it takes the whole tuning experience to the next level. So, without further ado, let me show you how to use it in your own workflow.

## What is Optuna?

![](https://raw.githubusercontent.com/optuna/optuna/master/docs/image/optuna-logo.png)
<figcaption style="text-align: center;">
    <strong>
        Optuna logo
    </strong>
</figcaption>

Optuna is a next-generation automatic hyperparameter tuning framework, written completely in Python.

Its most prominent features are:
- the ability to define Pythonic search spaces using loops and conditionals.
- a framework-agnostic API: with Optuna, you can tune estimators from almost any ML or DL package, including Sklearn, PyTorch, TensorFlow, Keras, XGBoost, LightGBM, CatBoost, etc.
- a large suite of optimization algorithms with early stopping and pruning features baked in.
- easy parallelization with little or no changes to the code.
- built-in support for visual exploration of search results.

## Optuna basics

Let's familiarize ourselves with Optuna by minimizing a simple function, $(x-1)^2 + (y+3)^2$. We know the function reaches its minimum value of 0 at $x=1$ and $y=-3$. Let's see if Optuna can find these values:

In [8]:
import optuna  # pip install optuna


def objective(trial):
    x = trial.suggest_float("x", -7, 7)
    y = trial.suggest_float("y", -7, 7)
    return (x - 1) ** 2 + (y + 3) ** 2

After importing `optuna`, we define an objective function that returns the value we want to minimize.

In the body of the objective, we define the parameters to be optimized, in this case simply `x` and `y`. The argument `trial` is a special `Trial` object from Optuna that suggests a value for each hyperparameter every time the objective is called.

Among many others, it has a `suggest_float` method, which takes the name of the hyperparameter and the range in which to look for its optimal value. In other words,

```
x = trial.suggest_float("x", -7, 7)
```
plays the same role as `{"x": np.arange(-7, 7)}` in GridSearch, except that Optuna can pick any float between -7 and 7 rather than only the integers on a fixed grid.
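
The difference between a grid and a continuous range is easy to check. In this sketch, `rng.uniform` is only a stand-in for drawing continuous values from the interval; it is not how Optuna's sampler actually chooses points.

```python
import numpy as np

# A GridSearch-style grid: 14 integers from -7 to 6 (the stop value 7 is excluded).
grid = np.arange(-7, 7)

# Stand-in for continuous sampling from [-7, 7]; the seed is arbitrary.
rng = np.random.default_rng(0)
continuous_draws = rng.uniform(-7, 7, size=5)

print(grid)
print(continuous_draws)
```

A grid can only ever return one of its 14 candidates, while a continuous range lets the search land anywhere in between.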

To start the optimization, we create a `study` object with `optuna.create_study` and pass the `objective` function to its `optimize` method:

In [11]:
study = optuna.create_study()
study.optimize(objective, n_trials=100)  # number of iterations

In [13]:
best_params = study.best_params
best_params

{'x': 0.8806708977164549, 'y': -3.0941841160297767}

Pretty close, but not as close as you would want. Here, we ran only 100 trials, as can be seen with:

In [15]:
len(study.trials)

100

Here, I will introduce the first bit of magic that comes with Optuna. We can resume the optimization even after it has finished if we are not satisfied with the results!

This is a huge advantage over tools like GridSearch, where a finished search cannot be continued and has to be rerun from scratch. An Optuna study keeps the history of its trials and builds on it!

To continue searching, just call `optimize` again with the desired parameters. Here, we will add 100 more trials:

In [16]:
study.optimize(objective, n_trials=100)

In [18]:
best_params = study.best_params
best_params

{'x': 1.045706853335669, 'y': -2.9501109059847512}

As you can see, the results are closer to the optimal parameters: `x` is now within 0.05 of 1 and `y` is within 0.05 of -3.