
In [0]:
!pip install optuna

# Getting Started
This notebook is for new and aspiring data scientists. It demonstrates how object-oriented programming in Python can be used to tune machine learning (ML) models.
The notebook is divided into 2 parts, so feel free to jump to the section that suits you.
1. Introduction to Bayesian Optimization
2. ML Object Classes for hyper-parameter tuning

We will use the breast cancer dataset that ships with sklearn.

In [0]:
# -*- coding: utf-8 -*-
"""
Created on Tue Apr 14 16:11:15 2020

@author: Kshitij
"""
import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, confusion_matrix, classification_report

# use available dataset for classification
dataset = load_breast_cancer()

X = dataset.data
y = dataset.target

X_train, X_test, y_train, y_test = train_test_split(X, y, train_size = 2/3, random_state = 1)



In [0]:
# We use the native LightGBM API (not the sklearn wrapper), so the
# parameters are passed as a dict
param = {'objective': 'binary', 'learning_rate': 0.5,
         'reg_alpha': 0.5, 'reg_lambda': 0.5}

param['metric'] = 'auc'

In [0]:
# train the model
model = lgb.train(param, lgb.Dataset(X_train, label = y_train))

In [0]:
# make prediction on test dataset
# unlike sklearn classifier the 'predict' method gives probability in lightgbm
pred=model.predict(X_test)
# we get the y_pred with threshold at 0.5
y_pred = np.where(pred>0.5,1,0)

In [20]:
y_pred

array([1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1,
       0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
       1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1,
       1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1,
       0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
       1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0,
       1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1])

In [29]:
# see the f1 score: harmonic mean of precision and recall
f1_score(y_test, y_pred)


0.9561752988047808

<b>Conclusion:</b> The above f1-score was obtained with arbitrarily chosen hyper-parameter values; it could be improved by tuning the hyper-parameters with Bayesian optimization.

# Introduction to Bayesian Optimization
Not too long ago, selecting hyper-parameters was called an 'art': some would say tuning requires skill, while others would say it is based on intuition. Moreover, the more complex the ML algorithm, the more hyper-parameters there are to tune.

Fortunately, there are ways to tune hyper-parameters that are more effective than a random search, such as [Bayesian Optimization](https://arxiv.org/pdf/1807.02811.pdf), which arrives at the best hyper-parameter values by estimating and updating a probability distribution that describes potential values of the objective function. In simple words, when the algorithm finds a good result for a particular value of a hyper-parameter, it intensifies the search around that value. We will discuss two Python libraries, __Hyperopt__ and __Optuna__, which use this approach.
1. Hyperopt Tuning<br>
In the code below, five names are imported from the Hyperopt library: <b>fmin</b>, <b>hp</b>, <b>tpe</b>, <b>Trials</b> and <b>STATUS_OK</b>. <br>The first, <b>fmin</b>, is the function that minimizes the loss returned by the objective function. If we want our model to reach a higher accuracy, the loss to minimize is (1 minus accuracy); in the same way, if we want to maximize the F1 score, the loss to minimize is (1 minus F1 score).<br>
Next we need a parameter space from which the hyper-parameter values are sampled. This search space is defined with <b>hp</b>.<br>
The <b>tpe</b> module provides the algorithm that searches the space defined above, and a <b>Trials</b> object records every trial. Finally, <b>STATUS_OK</b> must be returned from the objective function to mark the run as successful.
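
To make the idea concrete, here is a minimal pure-Python sketch of a search loop that minimizes a loss of the form (1 minus score) and spends half of its evaluations near the best point found so far. It only illustrates the intuition described above and is not Hyperopt's TPE algorithm; `toy_minimize`, the quadratic score and the local `STATUS_OK` constant are hypothetical stand-ins.

```python
import random

STATUS_OK = "ok"  # local stand-in for the status flag returned by an objective


def objective(params):
    # Hypothetical score that peaks at x = 0.3 (higher is better, at most 1.0)
    score = 1.0 - (params["x"] - 0.3) ** 2
    # The optimizer minimizes, so we hand it 1 - score as the loss
    return {"loss": 1.0 - score, "status": STATUS_OK}


def toy_minimize(fn, low, high, max_evals, seed=0):
    """Toy search: alternate between exploring the whole range and
    sampling close to the best value seen so far."""
    rng = random.Random(seed)
    best_x, best_loss = None, float("inf")
    width = high - low
    for i in range(max_evals):
        if best_x is None or i % 2 == 0:
            x = rng.uniform(low, high)  # explore the full range
        else:
            # exploit: sample near the current best, clipped to the range
            x = min(high, max(low, rng.gauss(best_x, 0.05 * width)))
        result = fn({"x": x})
        if result["status"] == STATUS_OK and result["loss"] < best_loss:
            best_x, best_loss = x, result["loss"]
    return {"x": best_x}, best_loss


best, best_loss = toy_minimize(objective, 0.0, 1.0, max_evals=200)
print(best, best_loss)
```

The returned dictionary has the same shape as the result of a real search: one entry per tuned value, with the fixed settings left out.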

In [0]:
from hyperopt import  fmin, hp, tpe, Trials, STATUS_OK


In [0]:
def objective(params):
    
    h_model = lgb.train(params, lgb.Dataset(X_train, label = y_train))
    pred=h_model.predict(X_test)
    y_pred = np.array(list(map(lambda x: int(x), pred>0.5)))
    f1sc = f1_score(y_test, y_pred)
    loss = 1 - f1sc
    return {'loss': loss, 'status' : STATUS_OK}

space = {
    'lambda_l1': hp.uniform('lambda_l1', 0.0, 1.0),
    'lambda_l2': hp.uniform("lambda_l2", 0.0, 1.0),
    'learning_rate' : hp.loguniform('learning_rate', np.log(0.05), np.log(0.25)),
    'objective' : 'binary',
    'metric' : 'auc'
    }
trials = Trials()
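
The learning rate above is drawn with `hp.loguniform`, whose bounds are given on the log scale, which is why they are wrapped in `np.log`. The sketch below mimics that sampling rule with the standard library only (the helper `sample_loguniform` is a hypothetical name): a value is drawn uniformly between log(low) and log(high) and then exponentiated, so the samples are spread evenly across orders of magnitude.

```python
import math
import random


def sample_loguniform(rng, low, high):
    # Draw uniformly on the log scale, then map back with exp
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
samples = [sample_loguniform(rng, 0.05, 0.25) for _ in range(1000)]

# Roughly half of the samples fall below the geometric midpoint of the range
geo_mid = math.sqrt(0.05 * 0.25)
below = sum(s < geo_mid for s in samples)
print(min(samples), max(samples), below)
```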


In [73]:
# define the maximum number of evaluations
best = fmin(fn=objective, space=space, algo=tpe.suggest, trials= trials, max_evals=100)

100%|██████████| 100/100 [00:09<00:00, 10.39it/s, best loss: 0.02788844621513953]


In [97]:
best

{'lambda_l1': 0.16001564250154893,
 'lambda_l2': 0.19419820383038114,
 'learning_rate': 0.13580995267128684}

In [0]:
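# train the model on the best parameters found
# fmin returns only the sampled values, so the fixed entries are added back
best_params = dict(best, objective='binary', metric='auc')
h_model = lgb.train(best_params, lgb.Dataset(X_train, label = y_train))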

In [0]:
# get the y_pred
pred_h=h_model.predict(X_test)

y_predh = list(map(lambda x: int(x), pred_h>0.5))

In [37]:
f1_score(y_test, y_predh)

0.9682539682539683

<b>Conclusion:</b> The optimized parameters improve the f1-score on the test dataset (0.968 against 0.956 with the initial values).<br>
Also notice that when fmin is called we have to specify the maximum number of evaluations (max_evals), that is, how many points of the search space are tried in the search for the optimum parameters.

2. Optuna<br>
The Optuna library also optimizes hyper-parameters with Bayesian Optimization, but it is easier to code with than Hyperopt. Firstly, the objective function and the hyper-parameter space are defined inside a single function. Secondly, unlike Hyperopt, which only minimizes the objective function, Optuna lets us choose whether to maximize or minimize the objective.<br>

Notice that we have used 'minimize' as the direction since we want an output similar to Hyperopt. We could also have returned the f1-score with the direction set to 'maximize'. The optimized hyper-parameters are stored in the best_params attribute of the study.

In [0]:
import optuna
import optuna.integration.lightgbm as lgbo
def optuna_obj(trial):
    '''Defining the parameters space inside the function for optuna optimization'''
    params = {
        'lambda_l1': trial.suggest_loguniform('lambda_l1', 1e-8, 10.0),
        'lambda_l2': trial.suggest_loguniform("lambda_l2", 1e-8, 10.0),
        'learning_rate' : trial.suggest_loguniform('learning_rate', 0.05, 0.25),
        'objective' : 'binary',
        'metric' : 'auc'
            }
    o_model = lgb.train(params, lgbo.Dataset(X_train, label = y_train))
    pred=o_model.predict(X_test)
    y_pred = np.array(list(map(lambda x: int(x), pred>0.5)))
    f1sc = f1_score(y_test, y_pred)
    loss = 1 - f1sc
    return loss
study = optuna.create_study(direction='minimize')
study.optimize(optuna_obj, n_trials=1000)
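
Minimizing (1 minus f1-score) and maximizing the f1-score pick the same trial, which is why either direction can be used. A quick check on made-up numbers (the trial names and scores below are illustrative, not results from this notebook):

```python
# Hypothetical f1-scores for three trials
f1_scores = {"trial_a": 0.956, "trial_b": 0.968, "trial_c": 0.960}

# direction='maximize': keep the trial with the highest score
best_by_max = max(f1_scores, key=f1_scores.get)

# direction='minimize': convert each score to a loss and keep the lowest
losses = {name: 1.0 - score for name, score in f1_scores.items()}
best_by_min = min(losses, key=losses.get)

print(best_by_max, best_by_min)  # prints: trial_b trial_b
```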

In [99]:
study.best_params

{'lambda_l1': 5.555188524228337e-05,
 'lambda_l2': 6.246148576377471e-06,
 'learning_rate': 0.13212756681506424}

In [42]:
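# best_params holds only the suggested values, so the fixed entries are added back
best_params_o = dict(study.best_params, objective='binary', metric='auc')
o_model = lgb.train(best_params_o, lgb.Dataset(X_train, label = y_train))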
pred_o=o_model.predict(X_test)

y_predo = list(map(lambda x: int(x), pred_o>0.5))
f1_score(y_test, y_predo)

0.9603174603174603

__Conclusion__: Optuna also improves on the initial result (0.960 against 0.956), although less than Hyperopt did here (0.968).<br> We should also observe that our initial parameters and the best parameters from Hyperopt and Optuna are all different. While it is understandable for manually defined parameters to be different, the difference between the Optuna and Hyperopt best parameters could come from the number of evaluations, the different search spaces, and each library's internal optimization algorithm. In both cases the f1-score improves with the new parameters.

# ML Object Classes for hyper-parameter tuning
The focus of this section is to demonstrate how to create a classifier class. There are plenty of resources available on OOP, but demonstrations of complete classes are limited. We will turn the above example into a class. When building a class it is important to visualize its structure; here we create a simple flow, as shown in the figure below.
![alt text](https://raw.githubusercontent.com/kshitijmamgain/Mlclass/master/Classflow.png)

In [0]:
class Mlclass():
    '''Parameter tuning class: tunes the LightGBM model with different optimization techniques - Hyperopt, Optuna.'''
    def __init__(self, x_train, y_train):
        '''Initializes the parameter tuning class and the LightGBM Dataset object.
        Parameters
        ----------
        x_train: data (string, numpy array, pandas DataFrame, or list of numpy arrays) – Data source of Dataset.
        y_train: label (list, numpy 1-D array, pandas Series / one-column DataFrame, or None) – Label of the data.'''
        self.x_train = x_train
        self.y_train = y_train
        self.train_set = lgb.Dataset(data=x_train, label=y_train)

    def tuning(self, optim_type):
        '''Method takes the optimization type and tunes the model'''
        #call the optim_type: Hyperopt or Optuna
        optimization = getattr(self, optim_type)
        return optimization()
  
    def hyperopt_method(self):
        # This method is called by tuning when user inputs 'hyperopt_method' while calling the tuning method
    
        #define the hyperopt space
        space = {'lambda_l1': hp.uniform('lambda_l1', 0.0, 1.0),
                 'lambda_l2': hp.uniform("lambda_l2", 0.0, 1.0),
                 'learning_rate' : hp.loguniform('learning_rate',
                                                 np.log(0.05), np.log(0.25)),
                 'objective' : 'binary'}
        # define algorithm and trials inside the class
        algo, trials= tpe.suggest, Trials()
        #Call fmin with the bound method so that the class's own objective is used
        best = fmin(fn=self.objective,space=space,algo=algo,trials=trials,max_evals=1000)
        self.params = best
        return best, trials
    def objective(self, params):
        # same objective function with added self
        h_model = lgb.train(params, lgb.Dataset(X_train, label = y_train))
        pred=h_model.predict(X_test)
        y_pred = np.array(list(map(lambda x: int(x), pred>0.5)))
        f1sc = f1_score(y_test, y_pred)
        loss = 1 - f1sc
        return {'loss': loss,'status' : STATUS_OK}

The class is first initialized with the training dataset and the target. A class definition starts with the keyword class, and the functions defined inside it are known as methods. Our first method is <b>\_\_init\_\_</b>, which is used to initialize an instance of the class. Here we want each instance to be initialized with the data-set and the target, so the input parameters are 'x_train' and 'y_train'. The self parameter of a method refers to the instance the method is called on, and prefixing a variable with 'self.' makes it an attribute specific to that instance. Creating an instance of this class is as easy as:

In [0]:
Obj = Mlclass(X_train, y_train)
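
To see what `self` does in isolation, the sketch below uses a stripped-down, hypothetical `Holder` class with the same constructor signature. Each instance keeps its own copy of the attributes, so setting `params` on one instance leaves the other untouched.

```python
class Holder:
    """Hypothetical minimal class that only stores what it is given."""

    def __init__(self, x_train, y_train):
        # Attributes prefixed with self. belong to this instance only
        self.x_train = x_train
        self.y_train = y_train
        self.params = None  # filled in later, e.g. by a tuning step

    def set_params(self, params):
        self.params = params


first = Holder([[1.0], [2.0]], [0, 1])
second = Holder([[3.0]], [1])

first.set_params({"learning_rate": 0.1})
print(first.params, second.params)  # prints: {'learning_rate': 0.1} None
```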

With the instance created, we can use the method added for Hyperopt optimization. The user passes the name of the optimization method, here 'hyperopt_method', and tuning calls it to tune the model.

In [82]:
Obj.tuning('hyperopt_method')

100%|██████████| 1000/1000 [01:47<00:00,  9.29it/s, best loss: 0.019920318725099473]


({'lambda_l1': 0.00013088713969735405,
  'lambda_l2': 0.0007421261199961893,
  'learning_rate': 0.12828502142851694},
 <hyperopt.base.Trials at 0x7f21bdfc9da0>)
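
The `tuning` method relies on `getattr` to turn a method name given as a string into a call. A small sketch of the same dispatch pattern follows, with an added check against a list of allowed names so that an unexpected string fails with a clear message. The `Tuner` class, the `METHODS` tuple and the returned strings are hypothetical and only stand in for the real tuning methods.

```python
class Tuner:
    # Names that tuning() is allowed to dispatch to (an assumption of this sketch)
    METHODS = ("hyperopt_method", "optuna_method")

    def tuning(self, optim_type):
        if optim_type not in self.METHODS:
            raise ValueError(
                "optim_type must be one of {}, got {!r}".format(self.METHODS, optim_type)
            )
        # Look up the bound method by name, then call it
        optimization = getattr(self, optim_type)
        return optimization()

    def hyperopt_method(self):
        return "tuned with hyperopt"

    def optuna_method(self):
        return "tuned with optuna"


tuner = Tuner()
print(tuner.tuning("hyperopt_method"))  # prints: tuned with hyperopt
```

Without such a check, `getattr` would happily return any attribute of the instance, including ones that are not tuning methods.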

Let's add a similar method for Optuna

In [0]:
class Mlclass():
    '''Parameter tuning class: tunes the LightGBM model with different optimization techniques - Hyperopt, Optuna.'''
    def __init__(self, x_train, y_train):
        '''Initializes the parameter tuning class and the LightGBM Dataset object.
        Parameters
        ----------
        x_train: data (string, numpy array, pandas DataFrame, or list of numpy arrays) – Data source of Dataset.
        y_train: label (list, numpy 1-D array, pandas Series / one-column DataFrame, or None) – Label of the data.'''
        self.x_train = x_train
        self.y_train = y_train
        self.train_set = lgb.Dataset(data=x_train, label=y_train)

    def tuning(self, optim_type):
        '''Method takes the optimization type and tunes the model'''
        #call the optim_type: Hyperopt or Optuna
        optimization = getattr(self, optim_type)
        return optimization()
     
    def optuna_method(self):
        study = optuna.create_study(direction='minimize')
        study.optimize(self.optuna_obj, n_trials=1000)
        self.params = study.best_params
        return study
    
    def optuna_obj(self, trial):
        '''Same optuna objective with parameters space inside the function for optuna optimization'''
        params = {'lambda_l1': trial.suggest_loguniform('lambda_l1', 1e-8, 10.0),
                  'lambda_l2': trial.suggest_loguniform("lambda_l2", 1e-8, 10.0),
                  'learning_rate' : trial.suggest_loguniform('learning_rate', 0.05, 0.25),
                  'objective' : 'binary'}
        
        o_model = lgb.train(params, lgbo.Dataset(X_train, label = y_train))
        pred=o_model.predict(X_test)
        y_pred = np.array(list(map(lambda x: int(x), pred>0.5)))
        f1sc = f1_score(y_test, y_pred)
        loss = 1 - f1sc
        return loss

In [0]:
Obj = Mlclass(X_train, y_train)
Obj.tuning('optuna_method')

To use the best parameters later, each tuning method stores them in the instance attribute _self.params_, which can then be accessed by a train method that we define next.<br>
Once the model is trained, we can evaluate it by passing the test data-set and test target as parameters. The complete class with all its methods is shown below.


In [0]:
class Mlclass():
    '''Parameter tuning class: tunes the LightGBM model with different optimization techniques - Hyperopt, Optuna.'''
    def __init__(self, x_train, y_train):
        '''Initializes the parameter tuning class and the LightGBM Dataset object.
        Parameters
        ----------
        x_train: data (string, numpy array, pandas DataFrame, or list of numpy arrays) – Data source of Dataset.
        y_train: label (list, numpy 1-D array, pandas Series / one-column DataFrame, or None) – Label of the data.'''
        self.x_train = x_train
        self.y_train = y_train
        self.train_set = lgb.Dataset(data=x_train, label=y_train)

    def tuning(self, optim_type):
        '''Method takes the optimization type and tunes the model'''
        #call the optim_type: Hyperopt or Optuna
        optimization = getattr(self, optim_type)
        return optimization()
  
    def hyperopt_method(self):
        # This method is called by tuning when user inputs 'hyperopt_method' while calling the tuning method
    
        #define the hyperopt space
        space = {'lambda_l1': hp.uniform('lambda_l1', 0.0, 1.0),
                 'lambda_l2': hp.uniform("lambda_l2", 0.0, 1.0),
                 'learning_rate' : hp.loguniform('learning_rate',
                                                 np.log(0.05), np.log(0.25)),
                 'objective' : 'binary'}
        # define algorithm and trials inside the class
        algo, trials= tpe.suggest, Trials()
        #Call fmin with the bound method so that the class's own objective is used
        best = fmin(fn=self.objective,space=space,algo=algo,trials=trials,max_evals=1000)
        self.params = best
        return best, trials
    def objective(self, params):
        # same objective function with added self
        h_model = lgb.train(params, lgb.Dataset(X_train, label = y_train))
        pred=h_model.predict(X_test)
        y_pred = np.array(list(map(lambda x: int(x), pred>0.5)))
        f1sc = f1_score(y_test, y_pred)
        loss = 1 - f1sc
        return {'loss': loss,'status' : STATUS_OK}
    
    def optuna_method(self):
        study = optuna.create_study(direction='minimize')
        study.optimize(self.optuna_obj, n_trials=1000)
        self.params = study.best_params
        return study
    
    def optuna_obj(self, trial):
        '''Same optuna objective with parameters space inside the function for optuna optimization'''
        params = {'lambda_l1': trial.suggest_loguniform('lambda_l1', 1e-8, 10.0),
                  'lambda_l2': trial.suggest_loguniform("lambda_l2", 1e-8, 10.0),
                  'learning_rate' : trial.suggest_loguniform('learning_rate', 0.05, 0.25),
                  'objective' : 'binary'}
        
        o_model = lgb.train(params, lgbo.Dataset(X_train, label = y_train))
        pred=o_model.predict(X_test)
        y_pred = np.array(list(map(lambda x: int(x), pred>0.5)))
        f1sc = f1_score(y_test, y_pred)
        loss = 1 - f1sc
        return loss

    def train(self):
        """Trains the model on the best parameters found by tuning"""
        print("Model will be trained on the following parameters: \n{}".format(self.params))
        #train the model with best parameters
        self.gbm = lgb.train(self.params, self.train_set)
    def evaluate(self, x_test, y_test):
        # predict the values from x_test
        pred = self.gbm.predict(x_test)
        y_pred = np.where(pred>0.5,1,0)
        #print confusion matrix
        print(confusion_matrix(y_test,y_pred))
        #print classification report
        print(classification_report(y_test, y_pred))

In [95]:
Obj = Mlclass(X_train, y_train)
Obj.tuning('hyperopt_method')
Obj.train()


100%|██████████| 1000/1000 [01:46<00:00,  9.38it/s, best loss: 0.02400000000000002]
Model will be trained on the following parameters: 
{'lambda_l1': 0.004494621171141966, 'lambda_l2': 0.014337729938273924, 'learning_rate': 0.09527473901625513}


TypeError: ignored

In [96]:
Obj.evaluate(X_test, y_test)

[[ 60   6]
 [  2 122]]
              precision    recall  f1-score   support

           0       0.97      0.91      0.94        66
           1       0.95      0.98      0.97       124

    accuracy                           0.96       190
   macro avg       0.96      0.95      0.95       190
weighted avg       0.96      0.96      0.96       190
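
The figures reported for class 1 can be reproduced by hand from the confusion matrix printed above the report. The sketch below assumes sklearn's layout, where rows are true labels and columns are predicted labels, so the matrix reads [[TN, FP], [FN, TP]].

```python
# Counts read from the confusion matrix above: [[TN, FP], [FN, TP]]
tn, fp, fn, tp = 60, 6, 2, 122

precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
accuracy = (tp + tn) / (tn + fp + fn + tp)

print(round(precision, 2), round(recall, 2), round(f1, 2), round(accuracy, 2))
# prints: 0.95 0.98 0.97 0.96
```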

