## Hyperparameter optimization
Aim: To make the dropout layer in our model as effective as possible by tuning the dropout rate with Bayesian hyperparameter optimization, using the `hyperopt` library. The same approach can tune other hyperparameters of the model, such as the learning rate, number of neurons per layer, and batch size.
Hyperparameters are the settings fixed before training starts. They define the model's architecture (for example, the number of neurons per layer) and the training process (for example, the learning rate), as opposed to the weights learned during training.

General method:
- Define the range of possible values for the hyperparameter you want to optimize
- Define a method for sampling hyperparameter values
- Define a metric to evaluate the performance of the model; in our case we will use the validation loss
- Define a cross-validation method
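
The four steps above can be sketched with plain random search, before any Bayesian machinery is involved. This is a minimal illustration only: `toy_validation_loss` is a hypothetical stand-in for training the real model, and the learning rate range and fold count are assumptions.

```python
import random

# Step 1: range of allowed values for each hyperparameter (illustrative ranges).
SEARCH_RANGES = {"dr": (0.0, 0.5), "lr": (1e-4, 1e-1)}


def sample_hyperparameters(rng):
    """Step 2: draw one candidate uniformly from each range."""
    return {name: rng.uniform(low, high) for name, (low, high) in SEARCH_RANGES.items()}


def toy_validation_loss(params, fold):
    """Step 3: hypothetical stand-in for training a model and measuring its validation loss."""
    fold_offset = 0.01 * fold  # pretend each fold yields a slightly different loss
    return (params["dr"] - 0.25) ** 2 + (params["lr"] - 0.01) ** 2 + fold_offset


def cross_validated_loss(params, n_folds=3):
    """Step 4: average the metric over all folds."""
    losses = [toy_validation_loss(params, fold) for fold in range(n_folds)]
    return sum(losses) / n_folds


def random_search(n_trials=20, seed=0):
    """Evaluate n_trials random candidates and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = sample_hyperparameters(rng)
        loss = cross_validated_loss(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search()
print(best_params, best_loss)
```

Random search treats every trial independently. Bayesian optimization keeps the same four ingredients but replaces the sampling step with one informed by earlier results.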

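Bayesian optimization fits a probabilistic surrogate model, classically a Gaussian process, to the trial results observed so far. The surrogate approximates how well the model will perform for a given hyperparameter value, so promising values can be tried first. Note that the `tpe` algorithm we import from `hyperopt` is a Tree-structured Parzen Estimator, which models the densities of good and bad hyperparameter values instead of using a Gaussian process.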

### Bayesian Optimization
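Bayesian optimization is a sequential model-based optimization (SMBO) algorithm: each trial's result updates the surrogate model, which is then used to choose the next hyperparameter values to try.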
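
To make the SMBO loop concrete, here is a heavily simplified, one-dimensional sketch in the spirit of a Tree-structured Parzen Estimator. It is not `hyperopt`'s implementation: the objective `toy_loss`, the fixed kernel bandwidth, and all default arguments are assumptions chosen for illustration.

```python
import math
import random


def toy_loss(x):
    """Hypothetical objective standing in for a validation loss; minimum at x = 0.3."""
    return (x - 0.3) ** 2


def kde(points, bandwidth=0.1):
    """Return a Gaussian kernel density estimate built from `points`."""
    def density(x):
        total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
        return total / (len(points) * bandwidth * math.sqrt(2 * math.pi))
    return density


def smbo(n_startup=10, n_iter=30, n_candidates=24, gamma=0.25, seed=0):
    rng = random.Random(seed)
    # Start with a few random trials so the surrogate has data to fit.
    history = []
    for _ in range(n_startup):
        x = rng.uniform(0.0, 1.0)
        history.append((x, toy_loss(x)))
    for _ in range(n_iter):
        # Split past trials into "good" (lowest losses) and "bad" (the rest).
        ranked = sorted(history, key=lambda trial: trial[1])
        n_good = max(1, int(gamma * len(ranked)))
        good_points = [x for x, _loss in ranked[:n_good]]
        bad_points = [x for x, _loss in ranked[n_good:]]
        good, bad = kde(good_points), kde(bad_points)
        # Propose candidates near good trials, clipped to the search range [0, 1].
        candidates = [
            min(1.0, max(0.0, rng.gauss(rng.choice(good_points), 0.1)))
            for _k in range(n_candidates)
        ]
        # Keep the candidate with the highest good/bad density ratio.
        next_x = max(candidates, key=lambda x: good(x) / (bad(x) + 1e-12))
        # Evaluate the real objective and record the result for the next round.
        history.append((next_x, toy_loss(next_x)))
    return history


history = smbo()
best_x, best_loss = min(history, key=lambda trial: trial[1])
print(best_x, best_loss)
```

The key difference from random search is that each new candidate depends on every earlier trial through the two density estimates.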



In [None]:
pip install hyperopt



In [11]:
import tensorflow as tf
import numpy as np
import hyperopt
from hyperopt import hp, fmin, tpe, Trials
from keras.layers import (Input, Conv2D, MaxPooling2D, BatchNormalization)
import os

The hyperparameter space defines all the hyperparameters we are going to tune for the model and their accepted ranges of values. In this case we are going to focus on the following hyperparameters: learning rate, dropout rate, batch normalization, batch size, and pooling type. The space defined in the next cell covers the learning rate, dropout rate, optimizer, and batch size.

In [4]:
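space = {
    # Uniform distribution to find an appropriate learning rate
    'lr': hp.uniform('lr', 0.0, 1.0),
    # Uniform distribution to find an appropriate dropout rate
    'dr': hp.uniform('dr', 0.0, 0.5),
    # Categorical choice to find the best optimizer
    'optimizer': hp.choice('optimizer', ['adam', 'Nadam']),
    # Quantized uniform distribution to find the batch size.
    # NOTE: the range 16-128 in steps of 16 is an assumption; adjust as needed.
    # quniform returns floats, so cast the value to int before using it.
    'batch_size': hp.quniform('batch_size', 16, 128, 16),
}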

In [None]:
# Print one random sample of each hyperparameter drawn from the search space.
# These values are not based on any specific model yet.
print(hyperopt.pyll.stochastic.sample(space))

{'dr': 0.19436712036448606, 'lr': 0.7366261700596459, 'optimizer': 'Nadam'}


The `fmin` function searches for the input that minimizes a scalar-valued function.

First we define a function for `fmin` to optimize. Because `fmin` minimizes, the function must return a value where lower is better. A validation loss can be returned as is; if we instead judge the model's performance by the validation accuracy, our `optimize` function must return the negative of the accuracy.
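
As a minimal sketch of that return convention, the objective below negates an accuracy so that minimizing it maximizes the accuracy. `train_and_evaluate` is a hypothetical stand-in for building and training the real model, and the name `objective` is illustrative. `hyperopt` accepts either a plain float or a dictionary with `loss` and `status` keys as the return value.

```python
def train_and_evaluate(params):
    """Hypothetical stand-in for model training; returns a validation accuracy in [0, 1]."""
    return max(0.0, 1.0 - abs(params["dr"] - 0.25) - abs(params["lr"] - 0.01))


def objective(params):
    """Return a loss for a minimizer: the negative of the validation accuracy."""
    val_accuracy = train_and_evaluate(params)
    # A minimizer treats lower as better, so negate a metric where higher is better.
    return {"loss": -val_accuracy, "status": "ok"}  # "ok" is the value of hyperopt.STATUS_OK


print(objective({"dr": 0.25, "lr": 0.01}))
```

A function of this shape can be passed to `fmin` in place of `optimize` once the stand-in is replaced by real training code.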

In [None]:
def optimize(hype_space):
  """
  Build and train the model based on the hype_space parameters.
  Return a value for fmin to minimize: the validation loss, or the
  negative of the validation accuracy, for that model.
  """


In [None]:
trials = Trials()
best = fmin(optimize,
            space=space,
            algo=tpe.suggest,  # pass the function itself; do not call it
            trials=trials,     # Trials object that records every evaluation
            max_evals=15,
            )