### Optuna
Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. <br>
It is entirely written in Python.

### Concepts
Trial: A single execution of the objective function. <br>
Study: An optimization session, which is a set of trials. <br>
Parameter: A variable whose value is to be optimized.


The goal of a study is to find the optimal set of hyperparameter values through multiple trials.


#### A simple example

Let's minimize the function: <br>
f(x) = (x - 2)^2 <br>

We know f(x) reaches its minimum at x = 2, where f(x) = 0.
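
As a quick check of that claim, the minimum can be confirmed analytically by setting the first derivative to zero and checking the sign of the second derivative. This is a standard calculus argument and is independent of Optuna:

```latex
\begin{aligned}
f(x)   &= (x - 2)^2 \\
f'(x)  &= 2(x - 2) = 0 \;\Rightarrow\; x = 2 \\
f''(x) &= 2 > 0 \quad \text{(so } x = 2 \text{ is a minimum)} \\
f(2)   &= 0
\end{aligned}
```

An optimizer that works well on this problem should therefore report a best parameter close to 2 and a best value close to 0.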


In [None]:
# Import the library
import optuna

# By convention, the function to be optimized is named "objective"
def objective(trial):
    """
        Args:
            trial: A trial object corresponds to a single execution of the objective function and
                   is internally instantiated upon each invocation of the function

        Returns:
            The value of the objective, (x - 2)^2, for the sampled x
    """

    # Sample the variable which needs to be optimized
    # The "suggest" APIs are called inside the objective function to obtain the parameters of a trial.
    x = trial.suggest_float("x", -10, 10)

    # Return the objective value; by default Optuna minimizes it
    # In case of ML, it would be the loss
    return (x - 2)**2

# Create the study object
study = optuna.create_study()

# Call the optimize method by passing "objective" function and "n_trials"
study.optimize(objective, n_trials=100)

In [2]:
# Get the best parameters 
print(f"Best parameter is: {study.best_params}")

# Get the best value
print(f"Best value is: {study.best_value}")

# Get the best trial
print(f"Best trial is: {study.best_trial}")

# Get num of trials
print(f"Number of trials are: {len(study.trials)}")

# To get all trials
# study.trials

Best parameter is: {'x': 1.9949253522281367}
Best value is: 2.5752050008476984e-05
Best trial is: FrozenTrial(number=33, state=TrialState.COMPLETE, values=[2.5752050008476984e-05], datetime_start=datetime.datetime(2023, 8, 17, 13, 58, 19, 792665), datetime_complete=datetime.datetime(2023, 8, 17, 13, 58, 19, 798175), params={'x': 1.9949253522281367}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'x': FloatDistribution(high=10.0, log=False, low=-10.0, step=None)}, trial_id=33, value=None)
Number of trials are: 100


#### Pythonic search space

For hyperparameter sampling, Optuna provides the following methods: <br>
1  optuna.trial.Trial.suggest_categorical <br>
2  optuna.trial.Trial.suggest_discrete_uniform <br>
3  optuna.trial.Trial.suggest_float <br>
4  optuna.trial.Trial.suggest_int <br>
5  optuna.trial.Trial.suggest_loguniform <br>
6  optuna.trial.Trial.suggest_uniform <br>

Note: suggest_discrete_uniform, suggest_loguniform and suggest_uniform are deprecated since Optuna v3.0.0 in favor of suggest_float (with step or log=True). <br>

In [23]:
def objective(trial):
    # Categorical parameter
    optimizer = trial.suggest_categorical("optimizer", ["MomentumSGD", "Adam"])
    print(optimizer)

    # Integer parameter
    num_layers = trial.suggest_int("num_layers", 1, 3)
    print(num_layers)

    # Integer parameter (log)
    num_channels = trial.suggest_int("num_channels", 32, 512, log=True)
    print(num_channels)

    # Integer parameter (discretized)
    num_units = trial.suggest_int("num_units", 10, 100, step=5)
    print(num_units)

    # Floating point parameter
    dropout_rate = trial.suggest_float("dropout_rate", 0.0, 1.0)
    print(dropout_rate)

    # Floating point parameter (log)
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    print(learning_rate)

    # Floating point parameter (discretized)
    drop_path_rate = trial.suggest_float("drop_path_rate", 0.0, 1.0, step=0.1)
    print(drop_path_rate)

In [25]:
study = optuna.create_study()
study.optimize(objective, n_trials=4)

[I 2023-08-17 15:11:16,065] A new study created in memory with name: no-name-ced8d171-5f8c-4a7a-be23-7b03717d0421
[W 2023-08-17 15:11:16,070] Trial 0 failed with parameters: {'optimizer': 'MomentumSGD', 'num_layers': 1, 'num_channels': 120, 'num_units': 50, 'dropout_rate': 0.014343831891252656, 'learning_rate': 8.24688538144633e-05, 'drop_path_rate': 0.0} because of the following error: The value None could not be cast to float..
[W 2023-08-17 15:11:16,071] Trial 0 failed with value None.
[W 2023-08-17 15:11:16,075] Trial 1 failed with parameters: {'optimizer': 'MomentumSGD', 'num_layers': 2, 'num_channels': 46, 'num_units': 60, 'dropout_rate': 0.6910942427081528, 'learning_rate': 0.0002123946191339557, 'drop_path_rate': 0.6000000000000001} because of the following error: The value None could not be cast to float..
[W 2023-08-17 15:11:16,076] Trial 1 failed with value None.
[W 2023-08-17 15:11:16,079] Trial 2 failed with parameters: {'optimizer': 'Adam', 'num_layers': 2, 'num_channels'

MomentumSGD
1
120
50
0.014343831891252656
8.24688538144633e-05
0.0
MomentumSGD
2
46
60
0.6910942427081528
0.0002123946191339557
0.6000000000000001
Adam
2
355
20
0.2616117069526859
0.005407678060643385
0.7000000000000001
Adam
2
63
80
0.6933545727235636
0.00011992755330276369
0.0


##### Note
The difficulty of optimization increases roughly exponentially with regard to the number of parameters. That is, the number of necessary trials increases exponentially when you increase the number of parameters, so it is recommended to not add unimportant parameters.
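
To get a feel for this growth, here is a rough back-of-the-envelope sketch. It assumes a full grid with a fixed number of candidate values per parameter, which is only a proxy for the difficulty faced by smarter samplers; `grid_size` is a hypothetical helper, not part of Optuna.

```python
def grid_size(num_params, values_per_param):
    """Number of combinations in a full grid over the search space."""
    return values_per_param ** num_params


# With 10 candidate values per parameter, each added parameter
# multiplies the number of combinations by 10.
for num_params in (1, 2, 4, 8):
    print(f"{num_params} parameter(s): {grid_size(num_params, 10)} combinations")
```

Going from 2 to 8 parameters takes the grid from 100 to 100,000,000 combinations, which is why dropping unimportant parameters pays off.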

##### Efficient optimization Algorithms
Optuna enables efficient hyperparameter optimization by adopting state-of-the-art algorithms for sampling hyperparameters and efficiently pruning unpromising trials.

##### Sampling algorithms
Samplers continually narrow down the search space using the record of suggested parameter values and evaluated objective values, converging on a region whose parameters yield better objective values.

Available samplers:
1) Grid search: <b>GridSampler</b>
2) Random search: <b>RandomSampler</b>
3) Tree-structured Parzen estimator: <b>TPESampler</b>
4) CMA-ES: <b>CmaEsSampler</b>
5) Algorithm to enable partial fixed parameters implemented in <b>PartialFixedSampler</b>
6) Nondominated Sorting Genetic Algorithm II implemented in <b>NSGAIISampler</b>
7) A Quasi Monte Carlo sampling algorithm implemented in <b>QMCSampler</b>

The default sampler is <b>TPESampler</b>.
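
As a point of comparison for the samplers listed above, here is a minimal pure-Python sketch of the idea behind random search: each candidate is drawn independently of earlier results. It is a simplified illustration on the earlier f(x) = (x - 2)^2 example, not Optuna's implementation; `random_search` and its arguments are hypothetical names. History-based samplers such as TPE differ in that they use past results to decide where to sample next.

```python
import random


def objective(x):
    # Same toy function as in the simple example: minimum at x = 2.
    return (x - 2) ** 2


def random_search(n_trials, low=-10.0, high=10.0, seed=0):
    """Draw x uniformly at random n_trials times and keep the best result."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    best_x, best_value = None, float("inf")
    for _ in range(n_trials):
        x = rng.uniform(low, high)  # each draw ignores all previous trials
        value = objective(x)
        if value < best_value:
            best_x, best_value = x, value
    return best_x, best_value


best_x, best_value = random_search(n_trials=100)
print(f"Best x: {best_x}, best value: {best_value}")
```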

#### Setting Samplers
study = optuna.create_study(sampler=<b>optuna.samplers.RandomSampler()</b>)

In [36]:
study = optuna.create_study(sampler=optuna.samplers.RandomSampler())
print(f"Sampler used is: {study.sampler.__class__.__name__}")

[I 2023-08-17 15:34:38,838] A new study created in memory with name: no-name-92f28b9b-65b4-4cfa-b7a4-2ea93614e618


Sampler used is: RandomSampler


#### Pruning Algorithms
Pruners automatically stop unpromising trials at early stages of training (a.k.a. automated early stopping).

Pruning Algorithms:
1. Median Pruning algorithm: <b>MedianPruner</b>
2. Non-pruning algorithm: <b>NopPruner</b>
3. Pruner with tolerance: <b>PatientPruner</b>
4. Percentile-based pruning algorithm: <b>PercentilePruner</b>
5. Asynchronous Successive Halving algorithm: <b>SuccessiveHalvingPruner</b>
6. Hyperband algorithm: <b>HyperbandPruner</b>
7. Threshold pruning algorithm: <b>ThresholdPruner</b>
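
To make the first entry in the list more concrete, here is a simplified pure-Python sketch of a median pruning rule: a running trial is stopped when its intermediate value at a given step is worse than the median of earlier trials at the same step. It is loosely modeled on the idea behind MedianPruner and is not Optuna's actual implementation; `should_prune`, `completed_histories` and the startup threshold are hypothetical names chosen for illustration, and lower values are assumed to be better.

```python
from statistics import median


def should_prune(current_value, step, completed_histories, n_startup_trials=5):
    """Return True if current_value is worse than the median at this step.

    completed_histories is a list of per-trial lists of intermediate values
    (one value per step). Lower values are assumed to be better.
    """
    # Never prune until enough earlier trials exist to form a baseline.
    if len(completed_histories) < n_startup_trials:
        return False
    values_at_step = [h[step] for h in completed_histories if step < len(h)]
    if not values_at_step:
        return False
    return current_value > median(values_at_step)


# Intermediate errors of five earlier trials over three steps.
histories = [
    [0.5, 0.4, 0.3],
    [0.6, 0.5, 0.4],
    [0.7, 0.5, 0.2],
    [0.4, 0.3, 0.3],
    [0.9, 0.8, 0.7],
]

# The median at step 1 is 0.5, so 0.6 is pruned and 0.45 continues.
print(should_prune(0.6, 1, histories))
print(should_prune(0.45, 1, histories))
```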

To turn on the pruning feature, call <i>report()</i> and <i>should_prune()</i> after <b>each</b> step of the iterative training. <br>
<i>report()</i> periodically reports the intermediate objective values so that they can be monitored. <br>
<i>should_prune()</i> decides whether to terminate a trial that does not meet a predefined condition.

In [37]:
import logging
import sys

import sklearn.datasets
import sklearn.linear_model
import sklearn.model_selection


def objective(trial):
    iris = sklearn.datasets.load_iris()
    classes = list(set(iris.target))
    train_x, valid_x, train_y, valid_y = sklearn.model_selection.train_test_split(
        iris.data, iris.target, test_size=0.25, random_state=0
    )

    alpha = trial.suggest_float("alpha", 1e-5, 1e-1, log=True)
    clf = sklearn.linear_model.SGDClassifier(alpha=alpha)

    for step in range(100):
        clf.partial_fit(train_x, train_y, classes=classes)

        # Report intermediate objective value.
        intermediate_value = 1.0 - clf.score(valid_x, valid_y)
        trial.report(intermediate_value, step)

        # Handle pruning based on the intermediate value.
        if trial.should_prune():
            raise optuna.TrialPruned()

    return 1.0 - clf.score(valid_x, valid_y)

In [38]:
study = optuna.create_study(pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=20)

[I 2023-08-17 15:55:53,878] A new study created in memory with name: no-name-3502495e-d54d-4df0-a46b-e0a2f06d32d1
[I 2023-08-17 15:55:54,066] Trial 0 finished with value: 0.07894736842105265 and parameters: {'alpha': 0.008500363754434878}. Best is trial 0 with value: 0.07894736842105265.
[I 2023-08-17 15:55:54,223] Trial 1 finished with value: 0.3421052631578947 and parameters: {'alpha': 7.412350618752943e-05}. Best is trial 0 with value: 0.07894736842105265.
[I 2023-08-17 15:55:54,380] Trial 2 finished with value: 0.23684210526315785 and parameters: {'alpha': 0.008726733282400114}. Best is trial 0 with value: 0.07894736842105265.
[I 2023-08-17 15:55:54,538] Trial 3 finished with value: 0.21052631578947367 and parameters: {'alpha': 5.3743226938865354e-05}. Best is trial 0 with value: 0.07894736842105265.
[I 2023-08-17 15:55:54,696] Trial 4 finished with value: 0.052631578947368474 and parameters: {'alpha': 0.00040510090654396005}. Best is trial 4 with value: 0.052631578947368474.
[I 20

#### Which sampler and pruner should be used?
For non-deep-learning tasks, the following combinations work best:
- For <i>RandomSampler</i>, <i>MedianPruner</i> is the best.
- For <i>TPESampler</i>, <i>HyperbandPruner</i> is the best.

For deep learning tasks:


![DeepLearning_Sampler_and_pruner](.\Images\Deep_learning_Sampler_Pruner.png)


In [None]:
optuna.create_study()