# Bayesian Optimization Example

In this example we use [Hyperas](https://github.com/maxpumperla/hyperas) to run Bayesian optimization on the [breast cancer](https://github.com/autonomio/datasets/blob/master/autonomio-datasets/breast_cancer.csv) classification task. The example runs on CPU and takes about 3 minutes.

We continue with the same example used in the [Grid Search](./grid_search_example.ipynb) and [Random Search](./random_search_example.ipynb) notebooks.

## Initial Setup

Import the packages we need for the computation.

In [1]:
import os
import pandas as pd
import wrangle as wr

from numpy import nan

from sklearn.model_selection import train_test_split
from keras.utils import to_categorical

Using TensorFlow backend.


### Load the dataset

Load, clean, and preprocess the dataset.

In [2]:
def data():
    """
    Data providing function:
    This function is separated from model() so that hyperopt
    won't reload data for each evaluation run.
    """
    # Mounting point
    MP = '/floyd/input/bcds'
    
    def breast_cancer():
        """Load the dataset from the mount point and clean it"""
        df = pd.read_csv(os.path.join(MP, 'breast_cancer.csv'))

        # then some minimal data cleanup
        df.drop("Unnamed: 32", axis=1, inplace=True)
        df.drop("id", axis=1, inplace=True)

        # separate to x and y
        y = df.diagnosis.values
        x = df.drop('diagnosis', axis=1).values

        # convert the string labels to binary
        y = (y == 'M').astype(int)

        return x, y
    
    # Load the dataset
    x, y = breast_cancer()

    # Normalize every feature to mean 0, std 1
    x = wr.mean_zero(pd.DataFrame(x)).values

    input_dim = x.shape[1] # number of columns

    # Train/test split: 67% / 33%
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=7)
    return x_train, y_train, x_test, y_test
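
As a rough illustration of the preprocessing above, the following standard-library sketch converts string labels to 0/1 and standardizes one feature column to mean 0 and standard deviation 1. It assumes `wr.mean_zero` performs an ordinary z-score transform, which is what the comment in `data()` suggests. The sample values and the helper name `standardize` are made up for illustration.

```python
from statistics import mean, pstdev

def standardize(column):
    """Z-score a list of numbers: subtract the mean, divide by the std."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (an assumption)
    return [(v - mu) / sigma for v in column]

# Hypothetical toy data, not taken from the breast cancer CSV
labels = ['M', 'B', 'B', 'M']
radius = [17.99, 13.54, 12.45, 20.57]

y = [int(label == 'M') for label in labels]  # 'M' -> 1, 'B' -> 0
x = standardize(radius)
```

After this step every feature is on a comparable scale, which generally helps gradient-based training of the small network defined in the next section.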

### Model definition

Define the model and the variables to search.

In [3]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten

from hyperas.distributions import choice, uniform
from hyperopt import Trials, STATUS_OK, tpe

def model(x_train, y_train, x_test, y_test):
    """
    Model providing function:
    Create Keras model with double curly brackets dropped-in as needed.
    Return value has to be a valid python dictionary with two customary keys:
        - loss: Specify a numeric evaluation metric to be minimized
        - status: Just use STATUS_OK and see hyperopt documentation if not feasible
    The last one is optional, though recommended, namely:
        - model: specify the model just created so that we can later use it again.
    """
    model = Sequential()
    
    # L1
    model.add(Dense({{choice([8,9,10])}}, 
                    input_dim=input_dim, 
                    kernel_initializer={{choice(['uniform', 'normal'])}}, 
                    activation={{choice(['relu', 'elu'])}}))
    # Dropout
    model.add(Dropout({{uniform(0, 1)}}))
    # L2
    model.add(Dense(1, 
                    kernel_initializer={{choice(['uniform', 'normal'])}}, 
                    activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', 
                  optimizer={{choice(['nadam', 'adam', 'sgd'])}}, 
                  metrics=['accuracy'])
    
    model.fit(x_train, y_train,
              batch_size=1024,
              epochs=10,
              verbose=2,
              validation_data=(x_test, y_test))
    
    score, acc = model.evaluate(x_test, y_test, verbose=0)
    print('Test accuracy:', acc)
    return {'loss': -acc, 'status': STATUS_OK, 'model': model}
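
hyperopt minimizes the returned `loss`, which is why `model()` returns `-acc`: the trial with the highest accuracy ends up with the lowest loss. A minimal sketch of that convention, using made-up accuracy values (not results from this notebook):

```python
# Hypothetical accuracies from three trials; values are illustrative only
accuracies = [0.91, 0.95, 0.88]

# Same convention as model(): report negative accuracy as the loss
results = [{'loss': -acc, 'trial': i} for i, acc in enumerate(accuracies)]

# A minimizer picks the smallest loss, i.e. the largest accuracy
best = min(results, key=lambda r: r['loss'])
best_accuracy = -best['loss']
```

Returning the cross-entropy `score` as the loss would be another reasonable choice; the notebook optimizes accuracy directly instead.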

## SMBO in action

Run 5 iterations using the [Tree-structured Parzen Estimator (TPE) algorithm](https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf) provided by [hyperopt](https://github.com/hyperopt/hyperopt).
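
To give a feel for the idea behind TPE, here is a heavily simplified, one-dimensional sketch in plain Python. It is not hyperopt's implementation: the toy objective, the fixed kernel bandwidth, and all function names are assumptions made for illustration. The loop splits past trials into a "good" and a "bad" group, models each group with a kernel density estimate, and evaluates next the candidate that looks most likely under the good group relative to the bad one.

```python
import math
import random

def objective(x):
    """Toy loss with its minimum at x = 0.3 (stand-in for a model's loss)."""
    return (x - 0.3) ** 2

def kde(x, points, bandwidth=0.1):
    """Average of Gaussian kernels centred on `points`, evaluated at x."""
    total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
    return total / len(points)

def tpe_like_search(n_startup=10, n_iter=30, gamma=0.25, n_candidates=24, seed=0):
    rng = random.Random(seed)
    # 1. Start with a few purely random evaluations
    history = [(x, objective(x)) for x in (rng.random() for _ in range(n_startup))]
    for _ in range(n_iter):
        # 2. Split past trials into "good" (lowest losses) and "bad" (the rest)
        ranked = sorted(history, key=lambda t: t[1])
        n_good = max(1, int(gamma * len(ranked)))
        good = [x for x, _ in ranked[:n_good]]
        bad = [x for x, _ in ranked[n_good:]]
        # 3. Draw candidates near good points, clipped to [0, 1]
        candidates = [min(1.0, max(0.0, rng.gauss(rng.choice(good), 0.1)))
                      for _ in range(n_candidates)]
        # 4. Keep the candidate most likely under "good" relative to "bad"
        best = max(candidates, key=lambda c: kde(c, good) / (kde(c, bad) + 1e-12))
        # 5. Evaluate it and add it to the history
        history.append((best, objective(best)))
    # Return the (x, loss) pair with the lowest loss seen so far
    return min(history, key=lambda t: t[1])

best_x, best_loss = tpe_like_search()
```

In the notebook, `tpe.suggest` plays the role of steps 2 to 4, and `max_evals` bounds the total number of evaluations.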

In [4]:
from hyperas import optim

best_run, best_model = optim.minimize(model=model,
                                      data=data,
                                      algo=tpe.suggest,
                                      max_evals=5,
                                      trials=Trials(),
                                      notebook_name='bayesian_optimization_example')

>>> Imports:
#coding=utf-8

try:
    import os
except:
    pass

try:
    import pandas as pd
except:
    pass

try:
    import wrangle as wr
except:
    pass

try:
    from numpy import nan
except:
    pass

try:
    from sklearn.model_selection import train_test_split
except:
    pass

try:
    from keras.utils import to_categorical
except:
    pass

try:
    from keras.models import Sequential
except:
    pass

try:
    from keras.layers import Dense, Dropout, Flatten
except:
    pass

try:
    from hyperas.distributions import choice, uniform
except:
    pass

try:
    from hyperopt import Trials, STATUS_OK, tpe
except:
    pass

try:
    from hyperas import optim
except:
    pass

>>> Hyperas search space:

def get_space():
    return {
        'Dense': hp.choice('Dense', [8,9,10]),
        'kernel_initializer': hp.choice('kernel_initializer', ['uniform', 'normal']),
        'activation': hp.choice('activation', ['relu', 'elu']),
        'Dropout': hp.uniform('Dropout', 0, 1),
   

### Results

Let's see which configuration gives us the best performance.

In [5]:
x_train, y_train, x_test, y_test = data()
print("Evaluation of best performing model:")
print(best_model.evaluate(x_test, y_test))
print("Best performing model chosen hyper-parameters:")
print(best_run)

Evaluation of best performing model:
[0.6786171705164807, 0.9308510663661551]
Best performing model chosen hyper-parameters:
{'Dense': 2, 'Dropout': 0.8366666847115819, 'activation': 0, 'kernel_initializer': 1, 'kernel_initializer_1': 0, 'optimizer': 0}
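
Note that for parameters defined with `choice`, the `best_run` printed above holds the index of the selected option rather than the option itself. The sketch below maps those indices back to values using the choice lists from `model()`. It assumes `kernel_initializer` refers to the first `Dense` layer and `kernel_initializer_1` to the second, following their order of appearance. The `choices` dictionary and the `decode` helper are written for this illustration and are not part of hyperas.

```python
# Option lists copied from the {{choice(...)}} calls in model()
choices = {
    'Dense': [8, 9, 10],
    'kernel_initializer': ['uniform', 'normal'],    # first Dense layer (assumed)
    'activation': ['relu', 'elu'],
    'kernel_initializer_1': ['uniform', 'normal'],  # second Dense layer (assumed)
    'optimizer': ['nadam', 'adam', 'sgd'],
}

# best_run as printed above
best_run = {'Dense': 2, 'Dropout': 0.8366666847115819, 'activation': 0,
            'kernel_initializer': 1, 'kernel_initializer_1': 0, 'optimizer': 0}

def decode(run, choices):
    """Replace choice indices with the options they point to."""
    return {name: choices[name][value] if name in choices else value
            for name, value in run.items()}

best_params = decode(best_run, choices)
```

Continuous parameters such as `Dropout` are passed through unchanged. Depending on your hyperas version, passing `eval_space=True` to `optim.minimize` may return the selected values directly.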


**That's all folks - don't forget to shut down your workspace once you're done 🙂**