<a href="https://colab.research.google.com/github/Felipe-Oliveira11/Hyperparameter-Optimization/blob/master/Hyperopt.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Hyperopt


Hyperopt is a library for searching a hyperparameter space. One of the algorithms it offers is TPE (Tree-structured Parzen Estimator), which uses the results of earlier trials to concentrate the search on the most promising regions of the space. TPE is a form of <b>Bayesian optimization</b>.

<br>
TPE is one of the modern sequential model-based optimization algorithms. "Tree-structured" describes the search space, in which some hyperparameters may only exist depending on the values of others; it does not mean that the optimizer is a tree-based model.


The API is simple and easy to use: we define a search space, define an objective function, and run the optimization function.

 


<br>
<hr>

In [1]:
!pip install hyperopt 



In [2]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
import seaborn as sns 
import scipy.stats as stats

%matplotlib inline 
import warnings
warnings.filterwarnings('ignore')


from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score, roc_auc_score, precision_score



In [3]:
# data
path = '/content/drive/My Drive/Inteligência Artificial - Colab/ML- Supervisionado /Random Forest /data.csv'
data = pd.read_csv(path)
data.head()

Unnamed: 0,id,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,fractal_dimension_mean,radius_se,texture_se,perimeter_se,area_se,smoothness_se,compactness_se,concavity_se,concave points_se,symmetry_se,fractal_dimension_se,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst,Unnamed: 32
0,842302,M,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,1.095,0.9053,8.589,153.4,0.006399,0.04904,0.05373,0.01587,0.03003,0.006193,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189,
1,842517,M,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,0.5435,0.7339,3.398,74.08,0.005225,0.01308,0.0186,0.0134,0.01389,0.003532,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902,
2,84300903,M,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,0.7456,0.7869,4.585,94.03,0.00615,0.04006,0.03832,0.02058,0.0225,0.004571,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758,
3,84348301,M,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,0.09744,0.4956,1.156,3.445,27.23,0.00911,0.07458,0.05661,0.01867,0.05963,0.009208,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173,
4,84358402,M,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,0.05883,0.7572,0.7813,5.438,94.44,0.01149,0.02461,0.05688,0.01885,0.01756,0.005115,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678,


In [4]:
# Modeling
data.drop('id', axis=1, inplace=True)
data.drop('Unnamed: 32', axis=1, inplace=True)

X = data.drop('diagnosis', axis=1)
y = data['diagnosis']



X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.30, random_state=42)


# encode the target labels
label = LabelEncoder()
y_train = label.fit_transform(y_train)
y_test = label.transform(y_test)


# standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)



mdl = RandomForestClassifier(n_estimators=100, random_state=42)
mdl.fit(X_train, y_train)
y_pred = mdl.predict(X_test)

print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.96      0.99      0.98       108
           1       0.98      0.94      0.96        63

    accuracy                           0.97       171
   macro avg       0.97      0.96      0.97       171
weighted avg       0.97      0.97      0.97       171



<hr>
<br>
<br>

We will use four objects from hyperopt:


* fmin: the function that runs the optimization. It minimizes the objective function (here, the negated accuracy) over the parameter distributions passed in spaces.

* tpe: the module that provides the TPE search algorithm, passed to fmin as tpe.suggest.

* Trials: an object that stores the information produced at each iteration.

* hp: used to define the search distribution of each parameter.

In [5]:
from hyperopt import fmin, tpe, hp, Trials

In [6]:
# probability distribution of each parameter
spaces = {'n_estimators': hp.randint('n_estimators', 2000),
         'max_depth': hp.randint('max_depth', 30),
          'min_samples_leaf': hp.uniform('min_samples_leaf', 1,40),
          'max_features': hp.randint('max_features', 30)}

In [7]:
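# Objective function: score a model built from the sampled parameters
def objective(params):
    # randint can sample 0 and uniform returns floats, so cast to int and clamp to at least 1
    all_params = {k: max(1, int(v)) for k, v in params.items()}
    model = RandomForestClassifier(random_state=42, **all_params)
    # fmin minimizes, so return the negated cross-validated accuracy on the training set
    return -cross_val_score(model, X_train, y_train, cv=3, scoring='accuracy').mean()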

In [8]:
# trials stores the parameter sets that were evaluated
# available search algorithms include tpe.suggest and rand.suggest (random search, from hyperopt.rand)
# the algo parameter selects the search algorithm, e.g. tpe.suggest


trials = Trials()
best = fmin(objective, spaces, trials=trials, algo=tpe.suggest, max_evals=30)  

100%|██████████| 30/30 [00:00<00:00, 199.74it/s, best loss: -0.9707602339181286]


In [9]:
# best hyperparameters
best

{'max_depth': 13,
 'max_features': 26,
 'min_samples_leaf': 21.42255815419864,
 'n_estimators': 613}

In [10]:
# tuned model (these values differ from the `best` dictionary printed above)
mdl = RandomForestClassifier(n_estimators=1083, max_depth=33, min_samples_leaf=34, random_state=42)
mdl.fit(X_train, y_train)
y_pred = mdl.predict(X_test)

print(accuracy_score(y_test, y_pred))

0.9707602339181286


<hr>
<br>

In [11]:
from xgboost import XGBClassifier 

In [12]:
xgb = XGBClassifier(random_state=42)
xgb.fit(X_train, y_train)
y_pred = xgb.predict(X_test)

print('Without tuning')
print('\n')
print(classification_report(y_test, y_pred))

Without tuning


              precision    recall  f1-score   support

           0       0.96      0.98      0.97       108
           1       0.97      0.94      0.95        63

    accuracy                           0.96       171
   macro avg       0.97      0.96      0.96       171
weighted avg       0.96      0.96      0.96       171



In [13]:
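# Objective function: score an XGBClassifier built from the sampled parameters

def objective(params):
  # randint can sample 0, so clamp n_estimators to at least 1
  all_params = {**params, 'n_estimators': max(1, int(params['n_estimators']))}
  model = XGBClassifier(random_state=42, **all_params)
  # fmin minimizes, so return the negated cross-validated accuracy on the training set
  return -cross_val_score(model, X_train, y_train, cv=3, scoring='accuracy').mean()

In [14]:
spaces = {'max_depth': hp.choice('max_depth', range(1, 30, 1)),
          'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.5)),
          'n_estimators': hp.randint('n_estimators', 3000)}

In [15]:
trials = Trials()
best = fmin(objective, spaces, trials=trials, algo=tpe.suggest, max_evals=30)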

100%|██████████| 30/30 [00:00<00:00, 134.82it/s, best loss: -0.9649122807017544]


In [16]:
best

{'learning_rate': 0.028455791762700456, 'max_depth': 9, 'n_estimators': 1396}

In [19]:
xgb = XGBClassifier(learning_rate= 0.0284, max_depth= 9, n_estimators= 1396, random_state=42)

In [20]:
xgb.fit(X_train, y_train)
y_pred = xgb.predict(X_test)

print('Tuning')
print('\n')
print(classification_report(y_test, y_pred))

Tuning


              precision    recall  f1-score   support

           0       0.97      0.99      0.98       108
           1       0.98      0.95      0.97        63

    accuracy                           0.98       171
   macro avg       0.98      0.97      0.97       171
weighted avg       0.98      0.98      0.98       171



<hr>
<br>

### Conclusion


The library is simple and easy to use. In a functional style, I can define the range of each parameter for tuning and also assign a probability distribution to each one. Hyperopt uses Bayesian optimization, which is one of the most effective methods for tuning hyperparameters. Compared with GridSearch, we get a concrete result for the best combination of parameters within a fixed number of evaluations, which is usually much faster than fitting every point of a grid. The API is easy to understand and the code stays clean.

The biggest drawback I found is that the official documentation has few examples and tutorials, so the way out is to look for separate articles and tutorials about hyperopt.
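To give a rough sense of the speed difference, the sketch below counts model fits only; it trains nothing. The grid values and the number of cross-validation folds are assumptions chosen for illustration, while `max_evals = 30` matches the budget used in this notebook.

```python
from itertools import product

# hypothetical grid over the three XGBoost parameters (illustrative values)
grid = {
    'max_depth': list(range(1, 30)),
    'learning_rate': [0.01, 0.03, 0.1, 0.3, 0.5],
    'n_estimators': [100, 500, 1000, 1500, 2000, 3000],
}
cv_folds = 3     # assumed number of cross-validation folds
max_evals = 30   # evaluation budget given to fmin in this notebook

# an exhaustive grid search fits every combination once per fold
grid_points = len(list(product(*grid.values())))
print('grid search fits:', grid_points * cv_folds)   # 870 * 3 = 2610
print('hyperopt fits:   ', max_evals * cv_folds)     # 30 * 3 = 90
```

The count says nothing about which method finds the better model; it only shows why a fixed evaluation budget tends to finish sooner than an exhaustive grid.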