# Notebook

This notebook shows a working implementation of Bayesian optimization in fastai, using the `bayesian-optimization` library. Essentially, we write a function, `fit_with`, that takes whatever hyperparameters we want to tune, and then give the optimizer a range for each one.

In [0]:
!pip install bayesian-optimization

Collecting bayesian-optimization
  Downloading https://files.pythonhosted.org/packages/72/0c/173ac467d0a53e33e41b521e4ceba74a8ac7c7873d7b857a8fbdca88302d/bayesian-optimization-1.0.1.tar.gz
Building wheels for collected packages: bayesian-optimization
  Building wheel for bayesian-optimization (setup.py) ... done
  Stored in directory: /root/.cache/pip/wheels/1d/0d/3b/6b9d4477a34b3905f246ff4e7acf6aafd4cc9b77d473629b77
Successfully built bayesian-optimization
Installing collected packages: bayesian-optimization
Successfully installed bayesian-optimization-1.0.1


In [0]:
from fastai import *
from fastai.tabular import *
from bayes_opt import BayesianOptimization
from fastprogress import *

Here the example will be the Adult dataset, where we will adjust the weight decay, learning rate, and embedding dropout.

In [0]:
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')

In [0]:
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [FillMissing, Categorify, Normalize]

In [0]:
# Rows 800-999 are held out as the validation set; no separate test set is used here.
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                           .split_by_idx(list(range(800,1000)))
                           .label_from_df(cols=dep_var)
                           .databunch())

Any hyperparameter you want tuned becomes an argument of `fit_with`, which trains a model and returns the score to maximize.

In [0]:
def fit_with(lr, wd, dp):
  
  # Create our learner with the given embedding dropout and weight decay
  learn = tabular_learner(data, layers=[200,100], metrics=accuracy, emb_drop=dp, wd=wd)
  
  # Train the model at the specified learning rate, with progress bars disabled
  with progress_disabled_ctx(learn) as learn:
    learn.fit_one_cycle(3, max_lr=lr)
    
  # validate() returns [loss, accuracy]; keep the accuracy
  acc = float(learn.validate(learn.data.valid_dl)[1])
  
  # The optimizer maximizes whatever this function returns, so return only the accuracy
  return acc

Next, we need to set the lower and upper bounds to search for each hyperparameter.

In [0]:
pbounds = {'lr': (1e-5, 1e-2), 'wd': (4e-4, 0.4), 'dp': (0.01, 0.5)}
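
These bounds are searched on a linear scale. For `lr`, that means a uniform draw from `(1e-5, 1e-2)` lands above `1e-3` about 90% of the time, so the small learning rates are rarely tried. One common workaround, sketched here as an assumption rather than something this notebook does, is to search over the base-10 exponent and convert it back inside the objective. The names `log_pbounds`, `make_log_lr_objective`, and `fit_with_stub` are hypothetical.

```python
import math

# Linear-scale bounds from the notebook.
pbounds = {'lr': (1e-5, 1e-2), 'wd': (4e-4, 0.4), 'dp': (0.01, 0.5)}

# Hypothetical alternative: search over log10(lr) instead of lr itself.
log_pbounds = {k: v for k, v in pbounds.items() if k != 'lr'}
low, high = pbounds['lr']
log_pbounds['log_lr'] = (math.log10(low), math.log10(high))  # roughly (-5, -2)


def make_log_lr_objective(fit_fn):
    """Wrap a fit function taking `lr` so that it accepts `log_lr` instead."""
    def objective(log_lr, wd, dp):
        return fit_fn(lr=10 ** log_lr, wd=wd, dp=dp)
    return objective


def fit_with_stub(lr, wd, dp):
    """Stand-in for the notebook's fit_with; it only echoes lr back."""
    return lr


fit_with_log_lr = make_log_lr_objective(fit_with_stub)
```

With the real `fit_with` in place of the stub, `fit_with_log_lr` and `log_pbounds` would be passed to the optimizer together, since the key `log_lr` has to match the wrapper's argument name.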

Now we make the optimizer.

In [0]:
optimizer = BayesianOptimization(
    f=fit_with,       # use our custom fit function
    pbounds=pbounds,  # use our limits
    verbose=2,        # 2 prints every step, 1 prints only new maxima, 0 is completely silent
    random_state=1)
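
`BayesianOptimization` calls `f` with keyword arguments named after the keys of `pbounds`, so those keys need to match the parameter names of `fit_with`. A small check like the one below can catch a mismatch before any training starts. It is only a sketch: `check_bounds_match` is a hypothetical helper, and `fit_with_stub` stands in for the real `fit_with`.

```python
import inspect


def check_bounds_match(fn, pbounds):
    """Raise ValueError if the pbounds keys differ from fn's parameter names."""
    expected = set(inspect.signature(fn).parameters)
    given = set(pbounds)
    if expected != given:
        raise ValueError(
            f"missing bounds for {sorted(expected - given)}, "
            f"unexpected bounds for {sorted(given - expected)}"
        )
    # Each bound should be a (lower, upper) pair with lower < upper.
    for name, (low, high) in pbounds.items():
        if not low < high:
            raise ValueError(f"bounds for {name!r} must satisfy lower < upper")


def fit_with_stub(lr, wd, dp):
    """Stand-in with the same signature as the notebook's fit_with."""
    return 0.0


pbounds = {'lr': (1e-5, 1e-2), 'wd': (4e-4, 0.4), 'dp': (0.01, 0.5)}
check_bounds_match(fit_with_stub, pbounds)  # passes silently when names line up
```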

Now we run it! This can take a while, since every iteration trains a new model for three epochs. Then we can print the best result.

In [0]:
optimizer.maximize()

In [0]:
print(optimizer.max)
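
`optimizer.max` is a dictionary holding the best score under `'target'` and the corresponding hyperparameters under `'params'`. The sketch below shows one way to unpack it. The values in `best` are placeholders for illustration, not results from this notebook, and `describe_best` is a hypothetical helper.

```python
def describe_best(best):
    """Format a dict shaped like `optimizer.max` as a readable string."""
    params = ", ".join(f"{k}={v:.4g}" for k, v in sorted(best['params'].items()))
    return f"target={best['target']:.4f} ({params})"


# Placeholder in the same shape as `optimizer.max`; the numbers are made up.
best = {'target': 0.8, 'params': {'lr': 0.005, 'wd': 0.1, 'dp': 0.25}}

print(describe_best(best))

# Because the keys match the arguments of fit_with, the params dict can be
# passed straight back to retrain with the best setting, e.g.
#   fit_with(**optimizer.max['params'])
```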