<a href="https://colab.research.google.com/github/vinay10949/AnalyticsAndML/blob/master/Python/Hyperparameters_Optimization_with_Bayesian_Optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a id=000> </a>
# <center style='color:blue;background:yellow'> Hyperparameters Optimization with Bayesian Optimization </center>
Author:  [**Dayal Chand Aichara**](https://www.linkedin.com/in/dcaichara/) <div style="text-align:left">Date: 15-08-2019 </div>
***  
There are three steps to optimize hyperparameters with bayesian optimization.
1. Define Objective Function  <br>
    Write a function which has model and its output which has be maximized or minimized. 
2. Define parameters Search range <br>
    Write a domain space of parameters in which parameters have to be optimized.
3. Define bayesian optimization and optimize.   <br>
    Put model function and domain space in Bayesian Optimization function and optimize parameters.


***
### Install <span style='color:blue;background:orange;font-family:romon;font-size:25px'>[**bayesian-optimization**](https://github.com/fmfn/BayesianOptimization) </span> python package via pip.
`pip install bayesian-optimization`

##  Notebook Content
1. [Import Libraries](#0)
1. [Data](#1)
1. [Simple Example](#2)
1. [LGBM](#3)
1. [CatBoost](#4)
1. [XGBoost](#5)


<a id=0> </a>
##  <span style='color:red;background:gray'>1. </span> <span style='color:blue;background:orange'>Import Libraries </span>

In [1]:
! pip install catboost
! pip install bayesian-optimization

Collecting catboost
[?25l  Downloading https://files.pythonhosted.org/packages/b1/61/2b8106c8870601671d99ca94d8b8d180f2b740b7cdb95c930147508abcf9/catboost-0.23-cp36-none-manylinux1_x86_64.whl (64.7MB)
[K     |████████████████████████████████| 64.8MB 43kB/s 
Installing collected packages: catboost
Successfully installed catboost-0.23
Collecting bayesian-optimization
  Downloading https://files.pythonhosted.org/packages/b5/26/9842333adbb8f17bcb3d699400a8b1ccde0af0b6de8d07224e183728acdf/bayesian_optimization-1.1.0-py3-none-any.whl
Installing collected packages: bayesian-optimization
Successfully installed bayesian-optimization-1.1.0


In [0]:
import pandas as pd
import numpy as np
import lightgbm as lgb
import catboost as cgb
import xgboost as xgb
from bayes_opt import BayesianOptimization
from sklearn.datasets import load_boston
from sklearn.metrics import r2_score

import warnings
warnings.filterwarnings('ignore')


<a id=1> </a>
##  <span style='color:red;background:gray'>2. </span> <span style='color:blue;background:orange'>Data </span>

In [0]:
boston=load_boston()
X =pd.DataFrame(boston.data,columns=boston.feature_names)
y = boston.target

In [6]:
df= X
df['Price'] = y
df.head()

Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,Price
0,0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.09,1.0,296.0,15.3,396.9,4.98,24.0
1,0.02731,0.0,7.07,0.0,0.469,6.421,78.9,4.9671,2.0,242.0,17.8,396.9,9.14,21.6
2,0.02729,0.0,7.07,0.0,0.469,7.185,61.1,4.9671,2.0,242.0,17.8,392.83,4.03,34.7
3,0.03237,0.0,2.18,0.0,0.458,6.998,45.8,6.0622,3.0,222.0,18.7,394.63,2.94,33.4
4,0.06905,0.0,2.18,0.0,0.458,7.147,54.2,6.0622,3.0,222.0,18.7,396.9,5.33,36.2


In [7]:
X.shape

(506, 14)

### Data has 14 columns in which 13 are features and <span style='background:blue;color:gray'>**last column is Price**</span>. 

<a id=2> </a>
##  <span style='color:red;background:gray'>3. </span> <span style='color:blue;background:orange'>Simple Example </span>

In [0]:
# Define objective function
def simple_fx(x, y, z ):
    return -x ** 2 - (y - 1) -z** 2 + 1

In [0]:
# Search Space
pds = {'x': (1, 4), 'y': (-3, 3), 'z': (1,6)}

In [11]:
# optimization function and optimization
optimizer = BayesianOptimization(f=simple_fx,
                                 pbounds=pds,
                                 random_state=1)
optimizer.maximize(init_points=3,n_iter=10)

|   iter    |  target   |     x     |     y     |     z     |
-------------------------------------------------------------
| [0m 1       [0m | [0m-5.39    [0m | [0m 2.251   [0m | [0m 1.322   [0m | [0m 1.001   [0m |
| [95m 2       [0m | [95m-1.654   [0m | [95m 1.907   [0m | [95m-2.119   [0m | [95m 1.462   [0m |
| [0m 3       [0m | [0m-8.406   [0m | [0m 1.559   [0m | [0m-0.9266  [0m | [0m 2.984   [0m |
| [0m 4       [0m | [0m-28.0    [0m | [0m 2.128   [0m | [0m 1.739   [0m | [0m 4.872   [0m |
| [0m 5       [0m | [0m-2.014   [0m | [0m 1.906   [0m | [0m-1.845   [0m | [0m 1.492   [0m |
| [95m 6       [0m | [95m-1.346   [0m | [95m 1.985   [0m | [95m-2.067   [0m | [95m 1.214   [0m |
| [95m 7       [0m | [95m-0.03446 [0m | [95m 1.69    [0m | [95m-2.091   [0m | [95m 1.126   [0m |
| [95m 8       [0m | [95m 0.3578  [0m | [95m 1.508   [0m | [95m-1.981   [0m | [95m 1.162   [0m |
| [95m 9       [0m | [95m 1.295   [0

In [12]:
# Check best results
optimizer.max

{'params': {'x': 1.0, 'y': -3.0, 'z': 1.0}, 'target': 3.0}

In [13]:
#Get search history
optimizer.res

[{'params': {'x': 2.251066014107722,
   'y': 1.3219469606529488,
   'z': 1.0005718740867244},
  'target': -5.390389235737196},
 {'params': {'x': 1.9069977178955193,
   'y': -2.119464655097322,
   'z': 1.461692973843989},
  'target': -1.653721990746281},
 {'params': {'x': 1.5587806341330128,
   'y': -0.9266356377417138,
   'z': 2.9838373711533497},
  'target': -8.406446885097736},
 {'params': {'x': 2.1275211422930003,
   'y': 1.7392712101561543,
   'z': 4.872234510476437},
  'target': -28.00428654613743},
 {'params': {'x': 1.9060736160797531,
   'y': -1.8448083718893056,
   'z': 1.4918239751751468},
  'target': -2.0138470309334173},
 {'params': {'x': 1.9845712352311047,
   'y': -2.0671024148912105,
   'z': 1.2142710302113944},
  'target': -1.3458747076261433},
 {'params': {'x': 1.6900995572226563,
   'y': -2.090698702947696,
   'z': 1.1263741333635533},
  'target': -0.03445649868701817},
 {'params': {'x': 1.5080035830943876,
   'y': -1.981263949848377,
   'z': 1.161633026124484},
  'tar

<a id=3> </a>
##  <span style='color:red;background:gray'>4. </span> <span style='color:blue;background:orange'>LightGBM </span>

In [0]:
dtrain = lgb.Dataset(data=X, label=y)



In [0]:
def lgb_r2_score(preds, dtrain):
    labels = dtrain.get_label()
    return 'r2', r2_score(labels, preds), True

In [0]:
# Objective Function
def hyp_lgbm(num_leaves, feature_fraction, bagging_fraction, max_depth, min_split_gain, min_child_weight):
      
        params = {'application':'regression','num_iterations': 200,
                  'learning_rate':0.05, 'early_stopping_round':50,
                  'metric':'lgb_r2_score'} # Default parameters
        params["num_leaves"] = int(round(num_leaves))
        params['feature_fraction'] = max(min(feature_fraction, 1), 0)
        params['bagging_fraction'] = max(min(bagging_fraction, 1), 0)
        params['max_depth'] = int(round(max_depth))
        params['min_split_gain'] = min_split_gain
        params['min_child_weight'] = min_child_weight
        cv_results = lgb.cv(params, dtrain, nfold=5, seed=101,categorical_feature=[], stratified=False,
                            verbose_eval =None, feval=lgb_r2_score)
        # print(cv_results)
        return np.max(cv_results['r2-mean'])

In [0]:
# Domain space-- Range of hyperparameters 
pds = {'num_leaves': (80, 100),
          'feature_fraction': (0.1, 0.9),
          'bagging_fraction': (0.8, 1),
          'max_depth': (17, 25),
          'min_split_gain': (0.001, 0.1),
          'min_child_weight': (10, 25)
          }

In [18]:
# Surrogate model
optimizer = BayesianOptimization(hyp_lgbm, pds, random_state=77)
                                  
# Optimize
optimizer.maximize(init_points=5, n_iter=15)

|   iter    |  target   | baggin... | featur... | max_depth | min_ch... | min_sp... | num_le... |
-------------------------------------------------------------------------------------------------
| [0m 1       [0m | [0m 0.9766  [0m | [0m 0.9838  [0m | [0m 0.6138  [0m | [0m 23.03   [0m | [0m 12.09   [0m | [0m 0.009645[0m | [0m 95.76   [0m |
| [0m 2       [0m | [0m 0.976   [0m | [0m 0.8652  [0m | [0m 0.5329  [0m | [0m 18.92   [0m | [0m 18.18   [0m | [0m 0.04065 [0m | [0m 94.3    [0m |
| [0m 3       [0m | [0m 0.976   [0m | [0m 0.9673  [0m | [0m 0.5708  [0m | [0m 19.37   [0m | [0m 14.22   [0m | [0m 0.07085 [0m | [0m 88.45   [0m |
| [95m 4       [0m | [95m 0.9848  [0m | [95m 0.8115  [0m | [95m 0.6976  [0m | [95m 20.62   [0m | [95m 12.64   [0m | [95m 0.005888[0m | [95m 85.85   [0m |
| [95m 5       [0m | [95m 0.9848  [0m | [95m 0.8134  [0m | [95m 0.7009  [0m | [95m 17.51   [0m | [95m 16.48   [0m | [95m 0.03705 [0m |

In [19]:
optimizer.max

{'params': {'bagging_fraction': 0.8079975488198999,
  'feature_fraction': 0.888668466429874,
  'max_depth': 24.36990209738294,
  'min_child_weight': 14.780047999821054,
  'min_split_gain': 0.007680843341096549,
  'num_leaves': 84.8425584516649},
 'target': 0.9921436484500147}

In [20]:
def bayesion_opt_lgbm(X=X, y=y, init_iter=3, n_iters=7, random_state=11, seed = 101, num_iterations = 200):
  dtrain = lgb.Dataset(data=X, label=y)
  def lgb_r2_score(preds, dtrain):
      labels = dtrain.get_label()
      return 'r2', r2_score(labels, preds), True
  # Objective Function
  def hyp_lgbm(num_leaves, feature_fraction, bagging_fraction, max_depth, min_split_gain, min_child_weight):
        
          params = {'application':'regression','num_iterations': num_iterations,
                    'learning_rate':0.05, 'early_stopping_round':50,
                    'metric':'lgb_r2_score'} # Default parameters
          params["num_leaves"] = int(round(num_leaves))
          params['feature_fraction'] = max(min(feature_fraction, 1), 0)
          params['bagging_fraction'] = max(min(bagging_fraction, 1), 0)
          params['max_depth'] = int(round(max_depth))
          params['min_split_gain'] = min_split_gain
          params['min_child_weight'] = min_child_weight
          cv_results = lgb.cv(params, dtrain, nfold=5, seed=seed,categorical_feature=[], stratified=False,
                              verbose_eval =None, feval=lgb_r2_score)
          # print(cv_results)
          return np.max(cv_results['r2-mean'])
  # Domain space-- Range of hyperparameters 
  pds = {'num_leaves': (50, 70),
            'feature_fraction': (0.1, 0.9),
            'bagging_fraction': (0.8, 1),
            'max_depth': (13, 23),
            'min_split_gain': (0.001, 0.1),
            'min_child_weight': (10, 25)
            }

  # Surrogate model
  optimizer = BayesianOptimization(hyp_lgbm, pds, random_state=random_state)
                                    
  # Optimize
  optimizer.maximize(init_points=init_iter, n_iter=n_iters)

bayesion_opt_lgbm(X=X, y=y, init_iter=5, n_iters=15, random_state=717, seed = 1011, num_iterations = 300)

|   iter    |  target   | baggin... | featur... | max_depth | min_ch... | min_sp... | num_le... |
-------------------------------------------------------------------------------------------------
| [0m 1       [0m | [0m 0.9853  [0m | [0m 0.9304  [0m | [0m 0.6844  [0m | [0m 14.93   [0m | [0m 17.8    [0m | [0m 0.06796 [0m | [0m 66.09   [0m |
| [95m 2       [0m | [95m 0.9917  [0m | [95m 0.9067  [0m | [95m 0.8264  [0m | [95m 17.43   [0m | [95m 13.84   [0m | [95m 0.02859 [0m | [95m 50.87   [0m |
| [0m 3       [0m | [0m 0.9862  [0m | [0m 0.8938  [0m | [0m 0.8533  [0m | [0m 19.0    [0m | [0m 23.03   [0m | [0m 0.03538 [0m | [0m 60.4    [0m |
| [0m 4       [0m | [0m 0.9769  [0m | [0m 0.8164  [0m | [0m 0.5826  [0m | [0m 13.68   [0m | [0m 16.29   [0m | [0m 0.05168 [0m | [0m 52.17   [0m |
| [0m 5       [0m | [0m 0.9742  [0m | [0m 0.9952  [0m | [0m 0.5607  [0m | [0m 19.89   [0m | [0m 23.38   [0m | [0m 0.09741 [0m | [0m 5

<a id=4> </a>
##  <span style='color:red;background:gray'>5. </span> <span style='color:blue;background:orange'>CatBoost </span>

In [0]:
cat_features = []

cv_dataset = cgb.Pool(data=X,
                  label=y,
                  cat_features=cat_features)

In [0]:
def hyp_cat(depth, bagging_temperature):
    params = {"iterations": 300,
              "learning_rate": 0.05,
              "eval_metric": "R2",
              "verbose": False}
    params[ "depth"] = int(round(depth))
    params["bagging_temperature"] = bagging_temperature

    scores = cgb.cv(cv_dataset,
                params,
                fold_count=3)
    return np.max(scores['test-R2-mean'])

In [0]:
pds = {'depth': (6, 10),
          'bagging_temperature': (1,5),
          }

In [24]:
# Surrogate model
optimizer = BayesianOptimization(hyp_cat, pds, random_state=100)
                                  
# Optimize
optimizer.maximize(init_points=3, n_iter=7)

|   iter    |  target   | baggin... |   depth   |
-------------------------------------------------
| [0m 1       [0m | [0m 0.9412  [0m | [0m 3.174   [0m | [0m 7.113   [0m |
| [0m 2       [0m | [0m 0.8966  [0m | [0m 2.698   [0m | [0m 9.379   [0m |
| [95m 3       [0m | [95m 0.963   [0m | [95m 1.019   [0m | [95m 6.486   [0m |
| [0m 4       [0m | [0m 0.963   [0m | [0m 1.0     [0m | [0m 6.0     [0m |
| [0m 5       [0m | [0m 0.963   [0m | [0m 5.0     [0m | [0m 6.0     [0m |
| [0m 6       [0m | [0m 0.963   [0m | [0m 3.097   [0m | [0m 6.006   [0m |
| [0m 7       [0m | [0m 0.963   [0m | [0m 1.988   [0m | [0m 6.0     [0m |
| [0m 8       [0m | [0m 0.9302  [0m | [0m 1.001   [0m | [0m 7.679   [0m |
| [0m 9       [0m | [0m 0.8521  [0m | [0m 5.0     [0m | [0m 10.0    [0m |
| [0m 10      [0m | [0m 0.963   [0m | [0m 4.131   [0m | [0m 6.002   [0m |


In [25]:
optimizer.max

{'params': {'bagging_temperature': 1.0188754247638903,
  'depth': 6.486276483132457},
 'target': 0.9629577215514743}

<a id=5> </a>
##  <span style='color:red;background:gray'>6. </span> <span style='color:blue;background:orange'>XGBoost </span>

In [0]:
dtrain = xgb.DMatrix(X, y, feature_names=X.columns.values)
def xgb_r2(preds, dtrain):
    labels = dtrain.get_label()
    return 'r2', r2_score(preds, labels)

In [0]:
def hyp_xgb(max_depth, subsample, colsample_bytree,min_child_weight, gamma ):
    params = {
    'n_estimators': 300,
    'eta': 0.05,
    'objective': 'reg:linear',
    'eval_metric':'mae',
    'silent': 1
     }
    params['max_depth'] = int(round(max_depth))
    params['subsample'] = max(min(subsample, 1), 0)
    params['colsample_bytree'] = max(min(colsample_bytree, 1), 0)
    params['min_child_weight'] = int(min_child_weight)
    params['gamma'] = max(gamma, 0)
    scores = xgb.cv(params, dtrain, num_boost_round=1000,verbose_eval=False, early_stopping_rounds=5, feval=xgb_r2, maximize=True, nfold=5)
    return  scores['test-r2-mean'].iloc[-1]

In [0]:
pds ={
  'min_child_weight':(14, 20),
  'gamma':(0, 5),
  'subsample':(0.5, 1),
  'colsample_bytree':(0.1, 1),
  'max_depth': (6, 10)
}

In [0]:
# Surrogate model
optimizer = BayesianOptimization(hyp_xgb, pds, random_state=103)
                                  
# Optimize
optimizer.maximize(init_points=5, n_iter=15)

|   iter    |  target   | colsam... |   gamma   | max_depth | min_ch... | subsample |
-------------------------------------------------------------------------------------
| [0m 1       [0m | [0m 0.9732  [0m | [0m 0.4889  [0m | [0m 0.8711  [0m | [0m 6.684   [0m | [0m 18.97   [0m | [0m 0.7936  [0m |
| [95m 2       [0m | [95m 0.9774  [0m | [95m 0.5134  [0m | [95m 4.113   [0m | [95m 9.286   [0m | [95m 15.84   [0m | [95m 0.6004  [0m |
| [0m 3       [0m | [0m 0.97    [0m | [0m 0.4629  [0m | [0m 4.739   [0m | [0m 8.71    [0m | [0m 17.65   [0m | [0m 0.8362  [0m |
| [0m 4       [0m | [0m 0.8031  [0m | [0m 0.1072  [0m | [0m 1.683   [0m | [0m 7.429   [0m | [0m 16.92   [0m | [0m 0.9391  [0m |
| [95m 5       [0m | [95m 0.9909  [0m | [95m 0.7794  [0m | [95m 3.152   [0m | [95m 7.506   [0m | [95m 17.54   [0m | [95m 0.8687  [0m |
| [95m 6       [0m | [95m 0.9919  [0m | [95m 1.0     [0m | [95m 5.0     [0m | [95m 6.0     [0m

In [0]:
optimizer.max

{'params': {'colsample_bytree': 1.0,
  'gamma': 2.830706438799185,
  'max_depth': 8.201814553535812,
  'min_child_weight': 14.0,
  'subsample': 1.0},
 'target': 0.9989032}

<center><span style='color:red;background:pink;font-size:40px'>End of the Notebook </span> </center>

***

### <center style='color:blue;background:yellow'>  [GOT TO TOP](#000) </center>