# [Optuna](https://optuna.org/)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sebascarag/AI-SyntheticSound/blob/main/Testing%20with%20Optuna.ipynb)

Optuna is an open-source hyperparameter optimization library for Python. It automates the search for the best hyperparameters of a machine learning model, which improves model performance and saves tuning effort. It uses sampling-based optimization techniques to explore the hyperparameter space and find configurations that perform better on the metrics you specify.

Some of Optuna's features are:

* Automated Optimization: Optuna automates the search for good hyperparameters, which saves the time and effort of tuning by hand.

* Multiple Optimization Algorithms: It offers several samplers, such as TPE (Tree-structured Parzen Estimator), CMA-ES (Covariance Matrix Adaptation Evolution Strategy) and Random Search, so you can pick the one that best suits the problem and the search space.

* Flexible Search Spaces: Each hyperparameter can have its own search space, including continuous ranges, discrete ranges and categorical choices.

* Integration with Machine Learning Frameworks: It works with popular frameworks such as scikit-learn, PyTorch, TensorFlow and XGBoost.

* Multi-Objective Optimization: It can optimize several metrics at the same time, which matters for problems where more than one aspect of performance counts.

* Visualization and Analysis of Results: It provides tools to plot and analyze the optimization results, which helps to understand how the sampled hyperparameters and the scores evolve across trials.

* Parallel Trials: It can run trials in parallel, which speeds up the search on machines with several cores or GPUs.

* Data Persistence: When a study is backed by a storage such as a SQLite database, Optuna saves the trials and their hyperparameters, so the optimization can be paused and resumed later.

* Easy Integration into Existing Code: You can add Optuna to existing code with relatively little effort, so you can run optimization experiments without rewriting the training code.

* Complete Documentation: Optuna provides detailed documentation and practical examples that guide users through hyperparameter optimization.


## Example 1: DecisionTreeClassifier

[Ref 1:](https://www.themachinelearners.com/optuna-busqueda-hiperparametros/) Optuna: ¿El futuro de la búsqueda de hiperparámetros?

[Ref 2:](https://towardsdatascience.com/state-of-the-art-machine-learning-hyperparameter-optimization-with-optuna-a315d8564de1)  State-of-the-Art Machine Learning Hyperparameter Optimization with Optuna

Dataset:

[The California housing dataset](https://inria.github.io/scikit-learn-mooc/python_scripts/datasets_california_housing.html)

[SKLearn - Boston Housing Dataset](https://www.kaggle.com/code/alexandrecazals/sklearn-boston-housing-dataset)

[DecisionTreeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html)

In [None]:
# Install Optuna first (Colab does not ship with it)
!pip install optuna

import time

import pandas as pd
import numpy as np

# Dataset loading, splitting and scoring helpers
from sklearn import datasets
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score

# Model to tune
from sklearn.tree import DecisionTreeClassifier

# Optuna and the sampler used by the study
import optuna
from optuna.samplers import TPESampler

Collecting optuna
  Downloading optuna-3.2.0-py3-none-any.whl (390 kB)
Collecting alembic>=1.5.0 (from optuna)
  Downloading alembic-1.11.1-py3-none-any.whl (224 kB)
Collecting cmaes>=0.9.1 (from optuna)
  Downloading cmaes-0.10.0-py3-none-any.whl (29 kB)
Collecting colorlog (from optuna)
  Downloading colorlog-6.7.0-py2.py3-none-any.whl (11 kB)
Collecting Mako (from alembic>=1.5.0->optuna)
  Downloading Mako-1.2.4-py3-none-any.whl (78 kB)
Installing collected packages: Mako, colorlog, cmaes, alembic, optuna
Successfully installed Mako-1.2.4 alembic-1.11.1 cmaes-0.10.0 colorlog-6.7.0 optuna-3.2.0


In [None]:
data = datasets.fetch_california_housing(as_frame=True)

In [None]:
print(data.DESCR)

.. _california_housing_dataset:

California Housing dataset
--------------------------

**Data Set Characteristics:**

    :Number of Instances: 20640

    :Number of Attributes: 8 numeric, predictive attributes and the target

    :Attribute Information:
        - MedInc        median income in block group
        - HouseAge      median house age in block group
        - AveRooms      average number of rooms per household
        - AveBedrms     average number of bedrooms per household
        - Population    block group population
        - AveOccup      average number of household members
        - Latitude      block group latitude
        - Longitude     block group longitude

    :Missing Attribute Values: None

This dataset was obtained from the StatLib repository.
https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html

The target variable is the median house value for California districts,
expressed in hundreds of thousands of dollars ($100,000).

This dataset was derived

In [None]:
data.data.head()

| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.984127 | 1.02381 | 322.0 | 2.555556 | 37.88 | -122.23 |
| 1 | 8.3014 | 21.0 | 6.238137 | 0.97188 | 2401.0 | 2.109842 | 37.86 | -122.22 |
| 2 | 7.2574 | 52.0 | 8.288136 | 1.073446 | 496.0 | 2.80226 | 37.85 | -122.24 |
| 3 | 5.6431 | 52.0 | 5.817352 | 1.073059 | 558.0 | 2.547945 | 37.85 | -122.25 |
| 4 | 3.8462 | 52.0 | 6.281853 | 1.081081 | 565.0 | 2.181467 | 37.85 | -122.25 |


In [None]:
data.target.head()

0    4.526
1    3.585
2    3.521
3    3.413
4    3.422
Name: MedHouseVal, dtype: float64

In [None]:
data.frame.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20640 entries, 0 to 20639
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   MedInc       20640 non-null  float64
 1   HouseAge     20640 non-null  float64
 2   AveRooms     20640 non-null  float64
 3   AveBedrms    20640 non-null  float64
 4   Population   20640 non-null  float64
 5   AveOccup     20640 non-null  float64
 6   Latitude     20640 non-null  float64
 7   Longitude    20640 non-null  float64
 8   MedHouseVal  20640 non-null  float64
dtypes: float64(9)
memory usage: 1.4 MB


In [None]:
california_housing_data = data.data.copy()
california_housing_data['MedHouseVal'] = data.target.copy()
california_housing_data.describe()

| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude | MedHouseVal |
|---|---|---|---|---|---|---|---|---|---|
| count | 20640.0 | 20640.0 | 20640.0 | 20640.0 | 20640.0 | 20640.0 | 20640.0 | 20640.0 | 20640.0 |
| mean | 3.870671 | 28.639486 | 5.429 | 1.096675 | 1425.476744 | 3.070655 | 35.631861 | -119.569704 | 2.068558 |
| std | 1.899822 | 12.585558 | 2.474173 | 0.473911 | 1132.462122 | 10.38605 | 2.135952 | 2.003532 | 1.153956 |
| min | 0.4999 | 1.0 | 0.846154 | 0.333333 | 3.0 | 0.692308 | 32.54 | -124.35 | 0.14999 |
| 25% | 2.5634 | 18.0 | 4.440716 | 1.006079 | 787.0 | 2.429741 | 33.93 | -121.8 | 1.196 |
| 50% | 3.5348 | 29.0 | 5.229129 | 1.04878 | 1166.0 | 2.818116 | 34.26 | -118.49 | 1.797 |
| 75% | 4.74325 | 37.0 | 6.052381 | 1.099526 | 1725.0 | 3.282261 | 37.71 | -118.01 | 2.64725 |
| max | 15.0001 | 52.0 | 141.909091 | 34.066667 | 35682.0 | 1243.333333 | 41.95 | -114.31 | 5.00001 |


In [None]:
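X = data.data

# The target is a continuous price in units of $100,000.
y = data.target

# DecisionTreeClassifier needs discrete labels, so keep only the integer part
# of the price. With targets between 0.14999 and 5.00001 this gives classes 0 to 5.
y_encoded = y.astype('int')

# Alternative (unused): named price bands, label-encoded.
# y = ['cheapest' if i < 1  else 'cheap' if i < 3 else 'normal' if i < 4 else 'expensive' for i in california_housing_data.MedHouseVal ]
# lab_enc = preprocessing.LabelEncoder()
# y_encoded = lab_enc.fit_transform(y)
# y_encoded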
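A small worked example may make the effect of `astype('int')` on the target clearer. The prices below are taken from the outputs above (minimum, median, two of the first rows, maximum), except `0.99`, which is a made-up value added to show that truncation, not rounding, is applied.

```python
import numpy as np

# Median house values in units of $100,000 (0.99 is a hypothetical value).
prices = np.array([0.14999, 0.99, 1.797, 3.585, 4.526, 5.00001])

# Casting to int drops the fractional part, so each class is a $100,000-wide band.
classes = prices.astype("int")
print(classes)  # [0 0 1 3 4 5]

# Count how many samples fall in each class.
labels, counts = np.unique(classes, return_counts=True)
print(dict(zip(labels.tolist(), counts.tolist())))  # {0: 2, 1: 1, 3: 1, 4: 1, 5: 1}
```

The classifier therefore predicts a price band rather than a price, which is why accuracy is used as the score in this example.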

In [None]:
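# Hold out 20% of the data as a test set
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.2, random_state=1)

print(X_train.shape)
print(y_train.shape)

# Baseline: a decision tree with default hyperparameters
model = DecisionTreeClassifier(random_state=2).fit(X_train, y_train)
# Mean accuracy over 2 folds of the training split. The tuning objective below uses
# the default of 5 folds, so the two scores are not strictly comparable.
base_score = cross_val_score(model, X_train, y_train, scoring='accuracy', cv=2).mean()
# base_score = model.score(X_test,y_test)
base_score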

(16512, 8)
(16512,)


0.5924176356589147

In [None]:
start_time = time.time()

def objective(trial):
    # Define the search space
    params = {
        'random_state': 2,
        'criterion': trial.suggest_categorical('criterion', ['gini', 'entropy', 'log_loss']),  # default="gini"
        'max_depth': trial.suggest_int('max_depth', 1, X_train.shape[1]),  # default=None; capped here at the number of features
        'min_samples_split': trial.suggest_int('min_samples_split', 2, 20),  # default=2
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 20),  # default=1
        'min_impurity_decrease': trial.suggest_float('min_impurity_decrease', 0, 1),  # default=0.0
    }
    # cross_val_score clones and refits the estimator on each fold (5 folds by default),
    # so the model does not need to be fitted beforehand
    model = DecisionTreeClassifier(**params)
    score = cross_val_score(model, X_train, y_train, scoring='accuracy').mean()
    return score

# Maximize the mean cross-validated accuracy with the TPE sampler
study_DT = optuna.create_study(study_name="dectree_artf", direction="maximize", sampler=TPESampler())
study_DT.optimize(objective, n_trials=100)
# Elapsed wall-clock time in minutes (the "dollars" wording is a joke)
print("For only {time:.2f} dollars!".format(time = ((time.time() - start_time)/60)))

# The objective never calls trial.report(), so no trial can be pruned here
pruned_trials = study_DT.get_trials(states=[optuna.trial.TrialState.PRUNED])
complete_trials = study_DT.get_trials(states=[optuna.trial.TrialState.COMPLETE])

print("# Pruned trials: ", len(pruned_trials))
print("# Complete trials: ", len(complete_trials))
trial = study_DT.best_trial
print("Best Score: ", trial.value)
print("Best Params by key: ")
for key, value in trial.params.items():
    print("  {}: {}".format(key, value))


[I 2023-07-23 21:25:37,655] A new study created in memory with name: dectree_artf
[I 2023-07-23 21:25:37,833] Trial 0 finished with value: 0.4026768609488685 and parameters: {'criterion': 'gini', 'max_depth': 3, 'min_samples_split': 7, 'min_samples_leaf': 12, 'min_impurity_decrease': 0.5981561992523379}. Best is trial 0 with value: 0.4026768609488685.
[I 2023-07-23 21:25:38,005] Trial 1 finished with value: 0.4026768609488685 and parameters: {'criterion': 'gini', 'max_depth': 1, 'min_samples_split': 17, 'min_samples_leaf': 1, 'min_impurity_decrease': 0.5318828901559359}. Best is trial 0 with value: 0.4026768609488685.
[I 2023-07-23 21:25:38,261] Trial 2 finished with value: 0.4026768609488685 and parameters: {'criterion': 'log_loss', 'max_depth': 4, 'min_samples_split': 11, 'min_samples_leaf': 6, 'min_impurity_decrease': 0.3248139295874346}. Best is trial 0 with value: 0.4026768609488685.
[I 2023-07-23 21:25:39,007] Trial 3 finished with value: 0.4636027706765118 and parameters: {'crit

For only 0.66 dollars!
# Pruned trials:  0
# Complete trials:  100
Best Score:  0.6188213713906178
Best Params by key: 
  criterion: gini
  max_depth: 7
  min_samples_split: 8
  min_samples_leaf: 12
  min_impurity_decrease: 0.0004548410739498936


In [None]:
optuna.visualization.plot_optimization_history(study_DT)

In [None]:
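# Refit with the best parameters found by the study
print("Best Params object: ", study_DT.best_params)
# best_params holds only the tuned values, so the fixed random_state is passed again
modelo2 = DecisionTreeClassifier(random_state=2, **study_DT.best_params).fit(X_train, y_train)
# Score the tuned tree (modelo2). The "new score" recorded below was obtained by scoring
# the baseline `model` with 5 folds, which is why it differs from the study's best score.
score = cross_val_score(modelo2, X_train, y_train, scoring='accuracy').mean()
print("prev score:", base_score)
print("new score:", score)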

Best Params object:  {'criterion': 'gini', 'max_depth': 7, 'min_samples_split': 8, 'min_samples_leaf': 12, 'min_impurity_decrease': 0.0004548410739498936}
prev score: 0.5924176356589147
new score: 0.609132530619797


## Example 2: TensorFlow Sequential model

[Ref:](https://inside-machinelearning.com/en/optuna-tutorial/) Optuna: Get the Best out of your Hyperparameters – Easy Tutorial

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Sequential
import optuna

In [None]:
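(X_train, y_train), (X_test, y_test) = tf.keras.datasets.reuters.load_data(path="reuters.npz")

# Give every sequence the same length: shorter ones are zero-padded and longer ones
# are truncated, so with max_len = 2 only the last two tokens of each newswire are kept
max_len = 2
X_train = tf.keras.utils.pad_sequences(X_train, maxlen=max_len)
X_test = tf.keras.utils.pad_sequences(X_test, maxlen=max_len)
X_train.shape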

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/reuters.npz


(8982, 2)

In [None]:
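def create_model(trial):
    # Hyperparameters to optimize
    n_hidden = trial.suggest_int('n_hidden', 1, 3)
    n_units = trial.suggest_int('n_units', 32, 128)
    learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-1)

    # Reuters labels are integer class ids, so the output layer needs one unit per class.
    # (y_train[0].size is 1: a softmax over a single unit always outputs 1.0, which is
    # why every trial in the run logged below reports the same accuracy.)
    num_classes = int(y_train.max()) + 1

    model = Sequential()
    model.add(layers.Dense(n_units, input_dim=X_train.shape[1], activation='relu'))
    for i in range(n_hidden):
        model.add(layers.Dense(n_units, activation='relu'))
    model.add(layers.Dense(num_classes, activation='softmax'))
    # sparse_categorical_crossentropy takes integer labels directly, without one-hot encoding
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
                  metrics=['accuracy'])
    return model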

In [None]:
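def objective(trial):
    model = create_model(trial)
    history = model.fit(X_train, y_train,
                        epochs=10,
                        batch_size=32,
                        validation_split=0.2,
                        verbose=0)
    # Score on the held-out validation split (last epoch) rather than on the training
    # data, so that the study does not reward overfitting
    score = history.history['val_accuracy'][-1]
    return score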

In [None]:
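# The objective returns an accuracy, so the study must maximize it
# (create_study() minimizes by default)
study = optuna.create_study(direction="maximize")
# n_jobs=-1 runs trials in parallel threads, one per CPU core
study.optimize(objective, n_trials=10, n_jobs=-1)
print(study.best_params)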

[I 2023-07-24 04:31:35,709] A new study created in memory with name: no-name-175ff1d5-129e-4d7a-b078-ff23ea4e0d6a
  return dispatch_target(*args, **kwargs)
[I 2023-07-24 04:31:49,796] Trial 1 finished with value: 0.0480961911380291 and parameters: {'n_hidden': 2, 'n_units': 121, 'learning_rate': 0.019242318840980138}. Best is trial 1 with value: 0.0480961911380291.
[I 2023-07-24 04:31:50,186] Trial 0 finished with value: 0.0480961911380291 and parameters: {'n_hidden': 1, 'n_units': 127, 'learning_rate': 0.057730257518572435}. Best is trial 1 with value: 0.0480961911380291.
[I 2023-07-24 04:32:06,276] Trial 3 finished with value: 0.0480961911380291 and parameters: {'n_hidden': 3, 'n_units': 37, 'learning_rate': 0.07795702746923278}. Best is trial 1 with value: 0.0480961911380291.
[I 2023-07-24 04:32:14,448] Trial 2 finished with value: 0.0480961911380291 and parameters: {'n_hidden': 3, 'n_units': 97, 'learning_rate': 0.013957760799693157}. Best is trial 1 with value: 0.0480961911380291.

{'n_hidden': 2, 'n_units': 121, 'learning_rate': 0.019242318840980138}


In [None]:
optuna.visualization.plot_optimization_history(study)

## Extra: Pruning Mechanism

[Efficient Hyperparameter Optimization with Optuna Framework](https://broutonlab.com/blog/efficient-hyperparameter-optimization-with-optuna-framework) and its [Colab notebook](https://colab.research.google.com/drive/1OegNO802ZrluOqO4upRhy9OYiIH6KG6k?usp=sharing)