# Bayesian Optimization

Bayesian optimization is a form of hyperparameter tuning.

This notebook is an example of using a general-purpose Python library, one not written for fastai, inside a fastai workflow.

Repository for Today: [BayesianOptimization](https://github.com/fmfn/BayesianOptimization)

## How does it work?

Bayesian optimization works by constructing a posterior distribution of functions (a Gaussian process) that best describes the function you want to optimize. As the number of observations grows, the posterior distribution improves, and the algorithm becomes more certain of which regions in parameter space are worth exploring and which are not, as seen in the picture below.

![](https://github.com/fmfn/BayesianOptimization/raw/master/examples/bo_example.png)

- Taken from the repository's GitHub README
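To make the description above concrete, here is a minimal NumPy sketch of a Gaussian process posterior over a one-dimensional toy function, followed by an upper-confidence-bound rule for picking the next point. It is an illustration under stated assumptions and not the library's implementation: `toy_objective`, the kernel length scale, and `kappa` are all hypothetical choices, and UCB is just one common acquisition rule.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    # The prior variance is 1 for this kernel; subtract what the observations explain.
    explained = np.sum(k_qo * np.linalg.solve(k_oo, k_qo.T).T, axis=1)
    std = np.sqrt(np.clip(1.0 - explained, 0.0, None))
    return mean, std

def toy_objective(x):
    # Hypothetical stand-in for an expensive training run.
    return np.sin(3.0 * x) + 0.5 * x

x_obs = np.array([0.1, 0.5, 0.9])      # points evaluated so far
y_obs = toy_objective(x_obs)
x_query = np.linspace(0.0, 1.0, 101)   # candidate points

mean, std = gp_posterior(x_obs, y_obs, x_query)

# Upper confidence bound: favour points with a high mean or high uncertainty.
kappa = 2.0
ucb = mean + kappa * std
next_x = x_query[np.argmax(ucb)]
print("next point to evaluate:", next_x)
```

The posterior standard deviation is close to zero at the three observed points and grows between them, which is the "more certain" behaviour the paragraph describes.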

In [2]:
!pip install bayesian-optimization

Collecting bayesian-optimization
  Downloading https://files.pythonhosted.org/packages/72/0c/173ac467d0a53e33e41b521e4ceba74a8ac7c7873d7b857a8fbdca88302d/bayesian-optimization-1.0.1.tar.gz
Building wheels for collected packages: bayesian-optimization
  Building wheel for bayesian-optimization (setup.py) ... done
  Created wheel for bayesian-optimization: filename=bayesian_optimization-1.0.1-cp36-none-any.whl size=10031 sha256=bf519bbd7268a6d30ad9b7ca7c245ebfb97b3d7f8e6c0b98f5cfe700ae7791d1
  Stored in directory: /root/.cache/pip/wheels/1d/0d/3b/6b9d4477a34b3905f246ff4e7acf6aafd4cc9b77d473629b77
Successfully built bayesian-optimization
Installing collected packages: bayesian-optimization
Successfully installed bayesian-optimization-1.0.1


In [0]:
from fastai import *
from fastai.tabular import *
from bayes_opt import BayesianOptimization
from fastprogress import *
from fastai.utils.mod_display import progress_disabled_ctx

For today's example, we will use the Adult dataset and tune weight decay, learning rate, and dropout

In [0]:
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')

In [0]:
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [FillMissing, Categorify, Normalize]

In [0]:
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                           .split_by_idx(list(range(800,1000)))
                           .label_from_df(cols=dep_var)
                           .databunch())

Next we need to define a `fit_with` function whose inputs are the hyperparameters we want to test. You can adjust *anything* in here; for this example, we only care about the hyperparameters that affect our model.

In [0]:
def fit_with(lr:float, wd:float, dp:float):
  # Create a Learner; dp is applied as the embedding dropout
  learn = tabular_learner(data, layers=[200,100], metrics=accuracy, emb_drop=dp, wd=wd)

  # Train for 3 epochs with the progress bar disabled
  with progress_disabled_ctx(learn) as learn:
    learn.fit_one_cycle(3, lr)

  # Validate and return the accuracy (index 0 of validate() is the loss)
  acc = float(learn.validate()[1])

  return acc

Finally, we need to define the range for each hyperparameter as a dict of (lower, upper) tuples

In [0]:
hps = {'lr': (1e-05, 1e-02),
      'wd': (4e-4, 0.4),
      'dp': (0.01, 0.5)}
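The optimizer calls the target function with keyword arguments named after the dict keys, so the keys have to match the function's parameter names. A small sanity check along these lines can catch a typo before any training starts. This is only a sketch: `check_bounds` is a hypothetical helper, not part of the library, and the `fit_with` below is a dummy with the same signature as the real one.

```python
import inspect

def fit_with(lr: float, wd: float, dp: float) -> float:
    # Hypothetical stand-in with the same signature as the notebook's fit_with;
    # it returns a dummy score instead of training a model.
    return 0.0

hps = {'lr': (1e-05, 1e-02),
       'wd': (4e-4, 0.4),
       'dp': (0.01, 0.5)}

def check_bounds(func, bounds):
    """Raise if the bound names differ from func's parameters or a range is inverted."""
    expected = set(inspect.signature(func).parameters)
    given = set(bounds)
    if expected != given:
        raise ValueError(
            f"name mismatch: missing {sorted(expected - given)}, "
            f"extra {sorted(given - expected)}")
    for name, (low, high) in bounds.items():
        if low >= high:
            raise ValueError(f"{name}: lower bound {low} is not below upper bound {high}")
    return True

print(check_bounds(fit_with, hps))
```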

Now we can build our optimizer

In [0]:
optim = BayesianOptimization(
    f=fit_with, # our function
    pbounds=hps, # our boundaries
    verbose=2, # 2 prints every step, 1 prints only when a new maximum is observed, 0 is silent
    random_state=1)

And now we do a search!

In [23]:
%time optim.maximize(n_iter=10)

| 21 | 0.835 | 0.2231 | 0.01 | 0.3484 |
CPU times: user 3min 33s, sys: 30.2 s, total: 4min 3s
Wall time: 3min 43s


Now let's look at our best result

In [25]:
print(optim.max)

{'target': 0.8349999785423279, 'params': {'dp': 0.2740201996616449, 'lr': 0.004197753198888915, 'wd': 0.27421371235854514}}
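The best record is simply the observation with the highest `target`. As a rough sketch of that selection, the snippet below ranks four records copied from this run's output; the list name `results` is a stand-in for the optimizer's result list, and plain Python is enough because each record is an ordinary dict.

```python
# Four records copied from this notebook's output, each shaped like one result entry.
results = [
    {'target': 0.824999988079071,
     'params': {'dp': 0.21434078230426126, 'lr': 0.007206041689487159, 'wd': 0.00044570417701101674}},
    {'target': 0.8149999976158142,
     'params': {'dp': 0.1581429605896015, 'lr': 0.0014760913492629594, 'wd': 0.0372985024696116}},
    {'target': 0.8199999928474426,
     'params': {'dp': 0.10126750357505873, 'lr': 0.0034621516631600474, 'wd': 0.15894828270257572}},
    {'target': 0.8349999785423279,
     'params': {'dp': 0.2740201996616449, 'lr': 0.004197753198888915, 'wd': 0.27421371235854514}},
]

# Single best observation: the record with the largest target.
best = max(results, key=lambda r: r['target'])

# Full ranking, best first, for comparing runner-up settings.
ranked = sorted(results, key=lambda r: r['target'], reverse=True)

for rank, res in enumerate(ranked, start=1):
    print('{}. acc={:.3f} params={}'.format(rank, res['target'], res['params']))
```

Looking at the runners-up as well as the maximum is useful here because the accuracies differ by only a few validation examples.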


We can also look at all of our results

In [27]:
for i, res in enumerate(optim.res):
  print('Iteration {} \n\t{}'.format(i, res))

Iteration 0 
	{'target': 0.824999988079071, 'params': {'dp': 0.21434078230426126, 'lr': 0.007206041689487159, 'wd': 0.00044570417701101674}}
Iteration 1 
	{'target': 0.8149999976158142, 'params': {'dp': 0.1581429605896015, 'lr': 0.0014760913492629594, 'wd': 0.0372985024696116}}
Iteration 2 
	{'target': 0.8199999928474426, 'params': {'dp': 0.10126750357505873, 'lr': 0.0034621516631600474, 'wd': 0.15894828270257572}}
Iteration 3 
	{'target': 0.8349999785423279, 'params': {'dp': 0.2740201996616449, 'lr': 0.004197753198888915, 'wd': 0.27421371235854514}}
Iteration 4 
	{'target': 0.824999988079071, 'params': {'dp': 0.11018160236844353, 'lr': 0.008782393189545545, 'wd': 0.011344082241891295}}
Iteration 5 
	{'target': 0.8199999928474426, 'params': {'dp': 0.5, 'lr': 0.01, 'wd': 0.4}}
Iteration 6 
	{'target': 0.824999988079071, 'params': {'dp': 0.3385290799874171, 'lr': 0.004178874975647598, 'wd': 0.2236524554469224}}
Iteration 7 
	{'target': 0.8100000023841858, 'params': {'dp': 0.0787895999116