# Simulated Annealing for Hyperparameter Optimisation

Hyperparameter optimisation is an important part of the modelling process in supervised learning. It involves searching for the combination of hyperparameters that provides the best predictive performance. Because the number of possible combinations quickly becomes too large for an exhaustive search, a more intelligent search method is required.

One popular method is a randomised search, in which values for each hyperparameter are chosen at random and the best combination found is kept. Despite being a naive approach, it works surprisingly well in practice. However, I was curious to see what other approaches could be applied to this problem.
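To make the idea concrete, here is a minimal sketch of a randomised search over a discrete grid. It is not how `RandomizedSearchCV` is implemented; the function `random_search`, the toy grid and the toy scoring function are all hypothetical and stand in for a real model and cross-validation score.

```python
import random


def random_search(param_grid, score_fn, n_iter, seed=0):
    """Sample n_iter random combinations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw each hyperparameter independently and uniformly from its candidates.
        candidate = {name: rng.choice(values) for name, values in param_grid.items()}
        score = score_fn(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


# Hypothetical toy problem: the score is highest at depth=4, rate=0.1.
toy_grid = {"depth": [2, 4, 6, 8], "rate": [0.01, 0.1, 1.0]}


def toy_score(params):
    return -abs(params["depth"] - 4) - abs(params["rate"] - 0.1)


best_params, best_score = random_search(toy_grid, toy_score, n_iter=50)
print(best_params, best_score)
```

Each draw is independent of the previous ones, so the search never uses what it has already learned about good regions of the space.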

The search for optimal hyperparameters can essentially be formulated as a combinatorial optimisation problem. One heuristic method that has been shown to be successful for numerous discrete optimisation problems is simulated annealing. Simulated annealing is a metaheuristic inspired by the annealing process in metallurgy: it occasionally accepts a worse solution, with a probability that shrinks as a "temperature" parameter is lowered, which helps it escape local optima. As it can be easily applied to a hyperparameter search, I decided to try it out.
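The sketch below shows one common way to apply simulated annealing to a discrete hyperparameter grid. It is an illustration under stated assumptions, not the code behind `SimulatedAnnealingSearchCV`: the Metropolis acceptance rule, the geometric cooling schedule (`cooling`), the "change one hyperparameter" neighbour move and the toy scoring function are all choices made here for the example.

```python
import math
import random


def acceptance_probability(current, candidate, temperature):
    """Metropolis criterion for a score that is being maximised."""
    if candidate >= current:
        return 1.0
    # Worse candidates are accepted with a probability that falls as the
    # score gap grows or as the temperature drops.
    return math.exp((candidate - current) / temperature)


def simulated_annealing(param_grid, score_fn, n_iter=200,
                        initial_temperature=10.0, cooling=0.95, seed=0):
    rng = random.Random(seed)
    current = {k: rng.choice(v) for k, v in param_grid.items()}
    current_score = score_fn(current)
    best, best_score = dict(current), current_score
    temperature = initial_temperature
    for _ in range(n_iter):
        # Neighbour move (assumed): resample one randomly chosen hyperparameter.
        candidate = dict(current)
        name = rng.choice(list(param_grid))
        candidate[name] = rng.choice(param_grid[name])
        candidate_score = score_fn(candidate)
        if rng.random() < acceptance_probability(current_score, candidate_score, temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = dict(current), current_score
        temperature *= cooling  # geometric cooling (assumed schedule)
    return best, best_score


# Hypothetical toy problem: the score is highest at depth=4, rate=0.1.
sa_grid = {"depth": [2, 4, 6, 8], "rate": [0.01, 0.1, 1.0]}


def sa_score(params):
    return -abs(params["depth"] - 4) - abs(params["rate"] - 0.1)


sa_best, sa_best_score = simulated_annealing(sa_grid, sa_score)
print(sa_best, sa_best_score)
```

Early on, when the temperature is high, most moves are accepted and the search explores widely. As the temperature falls, worse moves are rejected more often and the search settles into a local refinement around the current solution, while the best combination seen so far is tracked separately.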

In [1]:
cd sa_hyperopt

C:\Users\user\Documents\git-repositories\simulated-annealing-hyperparameter-opt\sa_hyperopt


In [2]:
from sklearn.datasets import load_breast_cancer
import pandas as pd
import numpy as np
from simulated_annealing import SimulatedAnnealingSearchCV
from sklearn.ensemble import RandomForestClassifier
from lightgbm import LGBMClassifier
from hyperparameter_configs import get_lightgbm_parameters
from sklearn import metrics
from sklearn.model_selection import RandomizedSearchCV, train_test_split
import logging
import sys

In [3]:
seed = 0
np.random.seed(seed)
logging.basicConfig(level=logging.INFO, stream=sys.stdout)

## Load data

In [4]:
results = load_breast_cancer()
X = pd.DataFrame(results['data'], columns=list(results['feature_names']))
y = pd.Series(results['target'])

In [5]:
X.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 569 entries, 0 to 568
Data columns (total 30 columns):
mean radius                569 non-null float64
mean texture               569 non-null float64
mean perimeter             569 non-null float64
mean area                  569 non-null float64
mean smoothness            569 non-null float64
mean compactness           569 non-null float64
mean concavity             569 non-null float64
mean concave points        569 non-null float64
mean symmetry              569 non-null float64
mean fractal dimension     569 non-null float64
radius error               569 non-null float64
texture error              569 non-null float64
perimeter error            569 non-null float64
area error                 569 non-null float64
smoothness error           569 non-null float64
compactness error          569 non-null float64
concavity error            569 non-null float64
concave points error       569 non-null float64
symmetry error             569 

In [15]:
X.sort_index().head()

| | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness | mean concavity | mean concave points | mean symmetry | mean fractal dimension | ... | worst radius | worst texture | worst perimeter | worst area | worst smoothness | worst compactness | worst concavity | worst concave points | worst symmetry | worst fractal dimension |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17.99 | 10.38 | 122.8 | 1001.0 | 0.1184 | 0.2776 | 0.3001 | 0.1471 | 0.2419 | 0.07871 | ... | 25.38 | 17.33 | 184.6 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.1189 |
| 1 | 20.57 | 17.77 | 132.9 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | 0.1812 | 0.05667 | ... | 24.99 | 23.41 | 158.8 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.186 | 0.275 | 0.08902 |
| 2 | 19.69 | 21.25 | 130.0 | 1203.0 | 0.1096 | 0.1599 | 0.1974 | 0.1279 | 0.2069 | 0.05999 | ... | 23.57 | 25.53 | 152.5 | 1709.0 | 0.1444 | 0.4245 | 0.4504 | 0.243 | 0.3613 | 0.08758 |
| 3 | 11.42 | 20.38 | 77.58 | 386.1 | 0.1425 | 0.2839 | 0.2414 | 0.1052 | 0.2597 | 0.09744 | ... | 14.91 | 26.5 | 98.87 | 567.7 | 0.2098 | 0.8663 | 0.6869 | 0.2575 | 0.6638 | 0.173 |
| 4 | 20.29 | 14.34 | 135.1 | 1297.0 | 0.1003 | 0.1328 | 0.198 | 0.1043 | 0.1809 | 0.05883 | ... | 22.54 | 16.67 | 152.2 | 1575.0 | 0.1374 | 0.205 | 0.4 | 0.1625 | 0.2364 | 0.07678 |


In [7]:
params = get_lightgbm_parameters('binary')
gb = LGBMClassifier()
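args = {'cv': 5,
        # Scorers follow a "higher is better" convention, so this loss is negated.
        # On scikit-learn >= 0.22 the scorer is named 'neg_brier_score'
        # ('brier_score_loss' was removed as a scorer name in 0.24).
        'scoring': 'brier_score_loss',
        'n_jobs': -1}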
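For reference, the Brier score used as the scoring metric here is the mean squared difference between the predicted probability of the positive class and the actual 0/1 outcome, so lower is better. The snippet below is a small hand-worked illustration with made-up labels and probabilities (not taken from the breast cancer data); the negation at the end mirrors the "higher is better" scorer convention.

```python
# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 1, 1, 0]
y_prob = [0.1, 0.9, 0.8, 0.3]

# Brier score: mean of squared differences between probability and outcome.
squared_errors = [(p - t) ** 2 for p, t in zip(y_prob, y_true)]
brier = sum(squared_errors) / len(squared_errors)

# A search that maximises its score works with the negated loss.
neg_brier = -brier

print(round(brier, 4))      # 0.0375
print(round(neg_brier, 4))  # -0.0375
```

This is why the search logs report negative values and why the best score is printed with its sign flipped.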

## Simulated annealing

In [8]:
sim = SimulatedAnnealingSearchCV(gb, param_distributions=params, seed=0, **args)


In [9]:
%%time
sim.fit(X, y, initial_temperature=10)

INFO:root: Iteration 1: Local improvement from -0.0972 to -0.0881, parameters updated
INFO:root: -> Global improvement from -0.0972 to -0.0881, parameters updated
INFO:root: Iteration 2: Local improvement from -0.0881 to -0.0762, parameters updated
INFO:root: -> Global improvement from -0.0881 to -0.0762, parameters updated
INFO:root: Iteration 3: No improvement from -0.0762 to -0.0788 but parameters updated
INFO:root: Iteration 4: No improvement from -0.0788 to -0.0837, parameters unchanged
INFO:root: Iteration 5: Local improvement from -0.0788 to -0.0767, parameters updated
INFO:root: Iteration 6: Local improvement from -0.0767 to -0.0732, parameters updated
INFO:root: -> Global improvement from -0.0762 to -0.0732, parameters updated
INFO:root: Iteration 7: No improvement from -0.0732 to -0.0741 but parameters updated
INFO:root: Iteration 8: No improvement from -0.0741 to -0.0741, parameters unchanged
INFO:root: Iteration 9: Local improvement from -0.0741 to -0.0237, parameters updat

INFO:root: Iteration 88: Local improvement from -0.0272 to -0.0270, parameters updated
INFO:root: Iteration 89: No improvement from -0.0270 to -0.0273 but parameters updated
INFO:root: Iteration 90: No improvement from -0.0273 to -0.0273, parameters unchanged
INFO:root: Iteration 91: No improvement from -0.0273 to -0.0273, parameters unchanged
INFO:root: Iteration 92: No improvement from -0.0273 to -0.0787, parameters unchanged
INFO:root: Iteration 93: No improvement from -0.0273 to -0.0273, parameters unchanged
INFO:root: Iteration 94: No improvement from -0.0273 to -0.0273, parameters unchanged
INFO:root: Iteration 95: No improvement from -0.0273 to -0.0276 but parameters updated
INFO:root: Iteration 96: No improvement from -0.0276 to -0.0499, parameters unchanged
INFO:root: Iteration 97: Local improvement from -0.0276 to -0.0275, parameters updated
INFO:root: Iteration 98: No improvement from -0.0275 to -0.0379, parameters unchanged
INFO:root: Iteration 99: No improvement from -0.02

INFO:root: Iteration 180: No improvement from -0.0274 to -0.2338, parameters unchanged
INFO:root: Iteration 181: No improvement from -0.0274 to -0.0344, parameters unchanged
INFO:root: Iteration 182: No improvement from -0.0274 to -0.0274 but parameters updated
INFO:root: Iteration 183: No improvement from -0.0274 to -0.0434, parameters unchanged
INFO:root: Iteration 184: No improvement from -0.0274 to -0.0274, parameters unchanged
INFO:root: Iteration 185: Local improvement from -0.0274 to -0.0262, parameters updated
INFO:root: Iteration 186: No improvement from -0.0262 to -0.0963, parameters unchanged
INFO:root: Iteration 187: No improvement from -0.0262 to -0.0296, parameters unchanged
INFO:root: Iteration 188: No improvement from -0.0262 to -0.0265 but parameters updated
INFO:root: Iteration 189: No improvement from -0.0265 to -0.0265, parameters unchanged
INFO:root: Iteration 190: No improvement from -0.0265 to -0.0265, parameters unchanged
INFO:root: Iteration 191: Local improvem

In [10]:
print('{:.4f}'.format(-sim.best_score_))

0.0182


## Randomised search

In [11]:
randcv = RandomizedSearchCV(gb, param_distributions=params, n_iter=230, **args)

In [12]:
%%time
randcv_fit = randcv.fit(X, y)

Wall time: 19.4 s




In [13]:
print('{:.4f}'.format(-randcv_fit.best_score_))

0.0213


## Results

In [14]:
print('Best score found with simulated annealing = {:.4f}'.format(-sim.best_score_))
print('Best score found with randomised search = {:.4f}'.format(-randcv_fit.best_score_))

Best score found with simulated annealing = 0.0182
Best score found with randomised search = 0.0213
