# Classification: Voting Classifier vs. Vecstack vs. Mlxtend

**Comparing 3 stacking "methods" for classification problems: Vecstack, Voting Classifier, Mlxtend**<br>

### Table of Contents

* [**Database and Simple Model**](#Database-and-Simple-Mode)
    * [Database: breast cancer](#Database)
    * [Models](#Models)
    * [In-Sample Scoring: Standalone Models](#ins1)
    * [Out-of-Sample Scoring: Standalone Models](#ins2)


* [**Vecstack**](#Vec)
    * [In-Sample Scoring: Vecstack](#Veci)
    * [Out-of-Sample Scoring: Vecstack](#Veco)


* [**Voting Classifier**](#vc)
    * [In-Sample Scoring: Voting Classifier](#vci)
    * [Out-of-Sample Scoring: Voting Classifier](#vco)


* [**Mlxtend**](#ml)
    * [In-Sample Scoring: Mlxtend](#mli)
    * [Out-of-Sample Scoring: Mlxtend](#mlo)


* [**Grid Search**](#ad)

### Methodology
The aim was to compare the results of 3 stacking "methods".<br>
**For Vecstack and Mlxtend**, I used *Gradient Boosting Classifier (GBC), XGBoost Classifier (XGB) and Random Forest (RF)* as base models and *Light Gradient Boosting Machine (LGBM)* as the meta-learner.<br>**For Voting Classifier**, I combined all four models with equal weights (soft voting).<br>



### Findings
Comparisons (based on out-of-sample scores):
- 1) Comparison between Vecstack and Mlxtend: GBC and RF have exactly the same scores in both runs (0.89 and 0.93). The 0.982 printed next to them in the Mlxtend run is computed from the LightGBM predictions (`LGB_pred`), so it should not be compared with the Vecstack XGB score of 0.973. Since the base models are the same in both runs, we could expect similar results for both stacking models, **but**:<br>
**the Mlxtend stacked model scores 0.903 while the Vecstack one scores 0.982**
- 2) Comparison between Voting Classifier and Vecstack (only comparing the stacked model results):<br>
**Voting Classifier is at 0.964 while Vecstack is at 0.982**

### Conclusion
**Of the three methods, only Vecstack produced a stacked score (0.982) higher than all of its base models (GBC, XGB, RF).** In that sense it combined the individual learners into a stronger one. Note that LGBM on its own also reaches 0.982 in the Voting Classifier section, so the stacked model matches its meta-learner's standalone score rather than beating it.<br>

Before doing this exercise, I expected minor differences between the scores (mostly related to different implementations of the stacking algorithm), not a gap of **8 percentage points** between two methods.<br>

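Another important point is that only Mlxtend has a lower out-of-sample score than in-sample score (which is expected). For Vecstack and Voting Classifier it is the opposite. I am not sure of the reasons; it may come from the dataset or the cross-validation. Two things are worth keeping in mind: the "in-sample" scores in this notebook are 3-fold cross-validated accuracies on the training split, not accuracies on the rows the model was fitted to, and the test split has only 114 rows, so a single prediction moves accuracy by almost 0.9 percentage points. **In general, the out-of-sample score should "always" be lower than the in-sample score.**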
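To separate the three quantities discussed above, here is a minimal sketch using only scikit-learn. It computes, for a single model, the cross-validated accuracy on the training split, the accuracy on the rows the model was fitted to, and the accuracy on the held-out test split. The Random Forest settings and the names `cv_acc`, `train_acc`, `test_acc` and `step` are illustrative assumptions, not values from this notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, cross_val_score, train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, shuffle=False)

# Illustrative model only; random_state is fixed so the sketch is repeatable
clf = RandomForestClassifier(n_estimators=100, random_state=0)

# 1) Cross-validated accuracy on the training split
#    (what this notebook reports as "in sample")
cv_acc = cross_val_score(clf, X_train, y_train, scoring="accuracy",
                         cv=KFold(n_splits=3)).mean()

# 2) Accuracy on the same rows the model was fitted to
clf.fit(X_train, y_train)
train_acc = accuracy_score(y_train, clf.predict(X_train))

# 3) Accuracy on the held-out test split
test_acc = accuracy_score(y_test, clf.predict(X_test))

# One test row is worth 1 / len(y_test) of accuracy
step = 1.0 / len(y_test)

print("test rows:", len(y_test), "accuracy step per row:", round(step, 4))
print("cv:", cv_acc, "train:", train_acc, "test:", test_acc)
```

With a test set this small, a test score slightly above the cross-validated score can be within a handful of predictions, so it is not necessarily a sign that something is wrong.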

**Important note:** I am NOT claiming that Vecstack is better than Mlxtend or Voting Classifier in general; for this dataset it appears to be. The results might differ with another dataset or different feature engineering.


**Stacking Classifier resources:**<br>
Vecstack: [Github](https://github.com/vecxoz/vecstack)<br>
Voting Classifier: [Sklearn Ensemble](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html)<br>
Mlxtend: [Documentation](https://rasbt.github.io/mlxtend/)<br>


#### Basic Libraries

In [1]:
import pandas as pd
import numpy as np

## Database and Simple Model <a class = "anchor" id="Database-and-Simple-Mode"></a>

#### Database <a class = "anchor" id="Database"></a>
I load the "breast cancer" dataset for this exercise (it is one of the pre-built datasets available in Sklearn).

In [2]:
from sklearn.datasets import load_breast_cancer

#import breast cancer data set (binary target [0,1])
df = load_breast_cancer()

#quick check on the data and target shape
print("features shape",df.data.shape)
print("target shape",df.target.shape)

features = df.data
target = df.target

features shape (569, 30)
target shape (569,)


#### Modelling Libraries

In [3]:
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier, VotingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold


In [4]:
#Cross Validation
kfold = KFold(n_splits=3, shuffle=False)  # random_state only applies when shuffle=True; recent scikit-learn raises an error otherwise

#Cross Validation function
def classification_crossvalscore(model, x, y):
    acc = cross_val_score(model, x, y, scoring = 'accuracy', cv = kfold )
    #prec = cross_val_score(model, x, y, scoring = 'precision', cv = 5)
    #recall = cross_val_score(model, x, y, scoring = 'recall', cv = 5)
    #rocauc = cross_val_score(model, x, y, scoring = 'roc_auc', cv = 5)
    return acc#, prec, recall, rocauc

#### Split in training and testing

In [5]:
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size = .2, random_state = 42, shuffle = False)
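Because `shuffle = False` makes `train_test_split` take the last 20% of the rows as the test set, it can be worth checking that both splits have a similar class mix before comparing scores. A minimal sketch; the names `train_share` and `test_share` are illustrative and not part of the notebook:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, shuffle=False)

# Share of each class (0 = malignant, 1 = benign) in the two splits
train_share = np.bincount(y_train, minlength=2) / len(y_train)
test_share = np.bincount(y_test, minlength=2) / len(y_test)

print("train rows:", len(y_train), "class shares:", train_share.round(3))
print("test rows: ", len(y_test), "class shares:", test_share.round(3))
```

If the shares differ noticeably, one option is to split with `shuffle=True` and `stratify=target`, at the cost of changing every score reported below.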

#### Models - with Gridsearch <a class = "anchor" id="Models"></a>

In [6]:
#Gradient Boosting
model1 = GradientBoostingClassifier(criterion='friedman_mse',
                                    learning_rate= 0.5,
                                    loss= 'deviance',  # renamed 'log_loss' in scikit-learn 1.1; 'deviance' was later removed
                                    max_depth= 100, 
                                    n_estimators= 100)

#model1 = model1.fit(X_train,y_train)

#XGBoost
model2 = XGBClassifier(booster='gbtree',
                       gamma= 0, 
                       learning_rate= 0.5,
                       max_depth= 3,
                       n_estimators= 100, 
                       objective= 'binary:logistic')

#model2 = model2.fit(X_train,y_train)

#Random Forest
model3 = RandomForestClassifier(criterion= 'gini',
                                max_depth = 100,
                                min_samples_leaf= 1,
                                min_samples_split= 2,
                                n_estimators= 20)

#model3 = model3.fit(X_train,y_train)

#lightGBM
model4 = LGBMClassifier(boosting_type = 'gbdt',
                        learning_rate = 0.5,
                        max_depth = -1,
                        n_estimators = 1000,
                        num_leaves = 31)


#### In Sample Scoring Standalone Model <a class = "anchor" id="ins1"></a>

In [7]:
for model,label in zip([model1,model2,model3, model4],["GBC","XGB", "RF", "LGBM"]):
              print(label,classification_crossvalscore(model,X_train,y_train).mean())

GBC 0.907764029278
XGB 0.953816660857
RF 0.958246194958
LGBM 0.956067735564


In [8]:
#predict model and stack model
GBC_pred = model1.fit(X_train,y_train).predict(X_test)
XGB_pred = model2.fit(X_train,y_train).predict(X_test)
RF_pred = model3.fit(X_train,y_train).predict(X_test)
LGB_pred = model4.fit(X_train,y_train).predict(X_test)

#### Out Sample Scoring Standalone Model <a class = "anchor" id="ins2"></a>

In [9]:
print("Out Sample scoring-Standalone")


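#NOTE: this loop repeats the cross-validation on the training split, so the numbers printed
#below are CV scores; test-set accuracy is computed from the *_pred arrays in later cells
for model,label in zip([model1,model2,model3, model4],["GBC","XGB", "RF", "LGBM"]):
    print(label,classification_crossvalscore(model,X_train,y_train).mean())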

Out Sample scoring-Standalone
GBC 0.927558963634
XGB 0.953816660857
RF 0.949416172883
LGBM 0.956067735564


## Vecstack Stacking <a class = "anchor" id="Vec"></a>

In [10]:
from vecstack import stacking

#Initialize first level
models = [model1, model2, model3] #input models as a list for metamodel

#Performing Stacking
S_train, S_test = stacking(models,                     # list of models
                           X_train, y_train, X_test,   # data
                           regression=False,           # classification task (set to True for regression)
                           mode='oof_pred_bag',        # mode: out-of-fold predictions for the train set; predict the test set in each fold and combine
                           save_dir=None,              # do not save result and log (to save in current dir - set to '.')
                           metric=accuracy_score,      # metric: callable
                           n_folds=3,                  # number of folds
                           shuffle=True,               # shuffle the data
                           random_state=42,            # ensure reproducibility
                           verbose=2)                  # print all info


# Initialize 2nd level model with LGBM (model4)
model = model4
    
# Fit 2nd level model
stack = model.fit(S_train, y_train)

# Predict
#y_pred = model.predict(S_test)

task:         [classification]
n_classes:    [2]
metric:       [accuracy_score]
mode:         [oof_pred_bag]
n_models:     [3]

model  0:     [GradientBoostingClassifier]
    fold  0:  [0.94078947]
    fold  1:  [0.92763158]
    fold  2:  [0.91390728]
    ----
    MEAN:     [0.92744278] + [0.01097542]
    FULL:     [0.92747253]

model  1:     [XGBClassifier]
    fold  0:  [0.95394737]
    fold  1:  [0.94736842]
    fold  2:  [0.95364238]
    ----
    MEAN:     [0.95165272] + [0.00303202]
    FULL:     [0.95164835]

model  2:     [RandomForestClassifier]
    fold  0:  [0.96052632]
    fold  1:  [0.96052632]
    fold  2:  [0.94701987]
    ----
    MEAN:     [0.95602417] + [0.00636700]
    FULL:     [0.95604396]
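
To make the `oof_pred_bag` mode more concrete, here is a rough scikit-learn-only sketch of the idea described in the comments above: each base model predicts every training row only from folds it was not fitted on, and predicts the test set once per fold. This is not Vecstack's implementation. `oof_pred_bag_sketch` is a hypothetical helper, the base-model settings are illustrative, and combining the per-fold test predictions by majority vote is an assumption about how class labels are aggregated.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import KFold, train_test_split


def oof_pred_bag_sketch(models, X_train, y_train, X_test, n_folds=3, seed=42):
    """Hypothetical helper: out-of-fold labels for train, per-fold vote for test."""
    cv = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    S_train = np.zeros((X_train.shape[0], len(models)), dtype=int)
    S_test = np.zeros((X_test.shape[0], len(models)), dtype=int)
    for j, model in enumerate(models):
        fold_test_preds = []
        for fit_idx, holdout_idx in cv.split(X_train):
            fitted = clone(model).fit(X_train[fit_idx], y_train[fit_idx])
            # Training rows are predicted only by a model that never saw them
            S_train[holdout_idx, j] = fitted.predict(X_train[holdout_idx])
            fold_test_preds.append(fitted.predict(X_test))
        votes = np.vstack(fold_test_preds)  # shape: (n_folds, n_test)
        # Assumed aggregation: most frequent label across the folds
        S_test[:, j] = np.apply_along_axis(
            lambda v: np.bincount(v).argmax(), 0, votes)
    return S_train, S_test


data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, shuffle=False)

base_models = [GradientBoostingClassifier(random_state=0),
               RandomForestClassifier(n_estimators=20, random_state=0)]
S_train_sketch, S_test_sketch = oof_pred_bag_sketch(
    base_models, X_train, y_train, X_test)
print(S_train_sketch.shape, S_test_sketch.shape)
```

The meta-learner is then fitted on the stacked training features and predicts from the stacked test features, which is why the notebook calls `stack.predict(S_test)` and not `stack.predict(X_test)`.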



#### In Sample Scoring-Vecstack <a class = "anchor" id="Veci"></a>

In [11]:
print("In Sample scoring-Vecstack")


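#NOTE: `stack` is model4 (LGBM) and cross_val_score refits a clone of it on X_train,
#so the "Vecstack" line below is the LGBM cross-validation score on the raw features
for model,label in zip([model1,model2,model3, stack],["GBC","XGB", "RF", "Vecstack"]):
    print(label,classification_crossvalscore(model,X_train,y_train).mean())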

In Sample scoring-Vecstack
GBC 0.916550482166
XGB 0.953816660857
RF 0.953860230045
Vecstack 0.956067735564
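
The Vecstack line above is identical to the standalone LGBM score, which is what one would expect given that `cross_val_score` clones the estimator it receives and refits it on the data passed in (here `X_train`). Below is a hedged sketch of scoring a meta-learner on out-of-fold features instead, using only scikit-learn. The decision tree is a stand-in for LGBM, and `S_train_sketch`, `raw_score` and `stacked_score` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import (KFold, cross_val_predict,
                                     cross_val_score, train_test_split)
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, shuffle=False)

kfold = KFold(n_splits=3, shuffle=True, random_state=42)
base_models = [GradientBoostingClassifier(random_state=0),
               RandomForestClassifier(n_estimators=20, random_state=0)]

# One column of out-of-fold class predictions per base model
S_train_sketch = np.column_stack(
    [cross_val_predict(m, X_train, y_train, cv=kfold) for m in base_models])

# Stand-in meta-learner (the notebook uses LGBM here)
meta = DecisionTreeClassifier(max_depth=3, random_state=0)

# Meta-learner cross-validated on the raw features: the stacking step is ignored
raw_score = cross_val_score(meta, X_train, y_train,
                            scoring="accuracy", cv=kfold).mean()
# Meta-learner cross-validated on the stacked features
stacked_score = cross_val_score(meta, S_train_sketch, y_train,
                                scoring="accuracy", cv=kfold).mean()

print("meta-learner on raw features:    ", raw_score)
print("meta-learner on stacked features:", stacked_score)
```

Reusing the same folds for the base models and the meta-learner is a simplification; a nested scheme would give a cleaner estimate.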


#### Out of Sample scoring-Vecstack <a class = "anchor" id="Veco"></a>

In [12]:
#predict model and stack model
GBC_pred = model1.fit(X_train,y_train).predict(X_test)
XGB_pred = model2.fit(X_train,y_train).predict(X_test)
RF_pred = model3.fit(X_train,y_train).predict(X_test)
stack_pred = stack.predict(S_test)  #has to be on the S_test not on X_test

In [13]:
#accuracy score out of sample
print("Out of Sample scoring-Vecstack")

for model, label in zip([GBC_pred, XGB_pred, RF_pred, stack_pred],["GBC","XGB", "RF", "Vecstack"]):
    print(label,accuracy_score(y_test, model, normalize = True)) 

Out of Sample scoring-Vecstack
GBC 0.894736842105
XGB 0.973684210526
RF 0.938596491228
Vecstack 0.982456140351


## Voting Classifier Stacking <a class = "anchor" id="vc"></a>

In [14]:
estimators = [('GBC', model1),('XGB', model2),('RF', model3), ('LGBM', model4)]

#Voting Classifier
ecf = VotingClassifier(estimators= estimators,
                          voting='soft',            #soft for probability
                          weights=None,
                          n_jobs=1,
                          flatten_transform=True    #only affects transform(); None is rejected by recent scikit-learn
                          )


#Voting Classifier fit
vc = ecf.fit(X_train, y_train)

#Voting Classifier predict
VC_pred=vc.predict(X_test)
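
Unlike the two stacking libraries, soft voting has no meta-learner: with `weights=None` it averages the class probabilities of the fitted estimators and picks the most probable class. A minimal sketch that reproduces this by hand with two scikit-learn models; the model choices and the names `vc_sketch`, `mean_proba` and `manual_pred` are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, shuffle=False)

vc_sketch = VotingClassifier(
    estimators=[("GBC", GradientBoostingClassifier(random_state=0)),
                ("RF", RandomForestClassifier(n_estimators=20, random_state=0))],
    voting="soft",
    weights=None)
vc_sketch.fit(X_train, y_train)

# Equal-weight average of the class probabilities of the fitted estimators
mean_proba = np.mean(
    [est.predict_proba(X_test) for est in vc_sketch.estimators_], axis=0)
# Most probable class per row, mapped back to the original labels
manual_pred = vc_sketch.classes_[np.argmax(mean_proba, axis=1)]

print("manual average matches predict():",
      np.array_equal(manual_pred, vc_sketch.predict(X_test)))
```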

#### In Sample Scoring-Voting Classifier <a class = "anchor" id="vci"></a>

In [15]:
print("In Sample scoring-Voting Classifier")

for model,label in zip([model1,model2,model3,model4, vc],["GBC","XGB", "RF","LGBM" ,"VC_stack"]):
              print(label,classification_crossvalscore(model,X_train,y_train).mean())

In Sample scoring-Voting Classifier
GBC 0.923158475659
XGB 0.953816660857
RF 0.938451260602
LGBM 0.956067735564
VC_stack 0.949474265133


#### Out of Sample Scoring-Voting Classifier <a class = "anchor" id="vco"></a>

In [16]:
#accuracy score out of sample
print("Out of Sample scoring-Voting Classifier")


for model, label in zip([GBC_pred, XGB_pred, RF_pred,LGB_pred,VC_pred],["GBC","XGB", "RF","LGBM" ,"VC_stack"]):
    print(label,accuracy_score(y_test, model, normalize = True)) 

Out of Sample scoring-Voting Classifier
GBC 0.894736842105
XGB 0.973684210526
RF 0.938596491228
LGBM 0.982456140351
VC_stack 0.964912280702


## Mlxtend Stacking <a class = "anchor" id="ml"></a>

In [17]:
from mlxtend.classifier import StackingClassifier

In [18]:
#models classifier
classifiers = [model1,model2,model3]

#stacking classifier
stack_c = StackingClassifier(classifiers = classifiers,
                            meta_classifier=model4)

#mlxtend classifier fit
stack = stack_c.fit(X_train, y_train)

#mlxtend classifier predict
mlx_pred=stack.predict(X_test)

#### In Sample Scoring-mlxtend Classifier <a class = "anchor" id="mli"></a>

In [19]:
print("In Sample scoring-mlxtend Classifier")

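#NOTE: the last entry is `vc` (the voting classifier), so the "mlx_stack" line below reports
#the voting classifier's CV score; pass `stack` instead to score the Mlxtend model
for model,label in zip([model1,model2,model3, vc],["GBC","XGB", "RF", "mlx_stack"]):
    print(label,classification_crossvalscore(model,X_train,y_train).mean())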

In Sample scoring-mlxtend Classifier
GBC 0.91878703381
XGB 0.953816660857
RF 0.945088300221
mlx_stack 0.94945974207


#### Out of Sample Scoring-mlxtend Classifier <a class = "anchor" id="mlo"></a>

In [20]:
#accuracy score out of sample
print("Out of Sample scoring-mlxtend Classifier")

#the second entry is LGB_pred, so it is labelled LGBM
for model, label in zip([GBC_pred, LGB_pred, RF_pred, mlx_pred],["GBC","LGBM", "RF", "mlx_stack"]):
    print(label,accuracy_score(y_test, model, normalize = True)) 

Out of Sample scoring-mlxtend Classifier
GBC 0.894736842105
LGBM 0.982456140351
RF 0.938596491228
mlx_stack 0.90350877193
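
One possible contributor to the gap between the Mlxtend and Vecstack stacked scores: as I understand the Mlxtend documentation, `StackingClassifier` fits the meta-classifier on predictions the base models make for the same rows they were trained on, while Vecstack's `oof_pred_bag` mode uses out-of-fold predictions. The sketch below, with scikit-learn only, shows how different those two kinds of meta-features can look for a very deep gradient boosting model. It illustrates the mechanism and does not diagnose this notebook's result; `refit_acc` and `oof_acc` are illustrative names.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, shuffle=False)

# Deep trees, similar in spirit to model1 (random_state added for repeatability)
gbc = GradientBoostingClassifier(learning_rate=0.5, max_depth=100,
                                 n_estimators=100, random_state=0)

# Meta-feature built from predictions on the rows the model was fitted to
refit_pred = gbc.fit(X_train, y_train).predict(X_train)
refit_acc = accuracy_score(y_train, refit_pred)

# Meta-feature built from out-of-fold predictions
oof_pred = cross_val_predict(
    gbc, X_train, y_train,
    cv=KFold(n_splits=3, shuffle=True, random_state=42))
oof_acc = accuracy_score(y_train, oof_pred)

print("agreement with y_train, same-row predictions:   ", refit_acc)
print("agreement with y_train, out-of-fold predictions:", oof_acc)
```

When the same-row predictions agree with the labels much more often than the out-of-fold ones, a meta-learner trained on them can learn to trust that base model more than its test-time accuracy justifies. Mlxtend also provides `StackingCVClassifier`, which uses cross-validated predictions and may be a closer match to the Vecstack setup.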


## Ad hoc Grid search <a class = "anchor" id="ad"></a>

In [145]:
#### GBM model
#model
model = GradientBoostingClassifier()

#grid search parameters
para =  {'loss':['deviance'],   # renamed 'log_loss' in scikit-learn 1.1; 'deviance' was later removed
         'learning_rate':[0.1,0.5, 0.8],
         'n_estimators':[100, 200, 1000],
         'criterion':['friedman_mse'],
         'max_depth':[None, 100]}

#grid search
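#fit_params, iid and return_train_score='warn' are no longer accepted by recent scikit-learn;
#cv=3 keeps the 3-fold split that cv=None meant when this notebook was run
grid_search = GridSearchCV(estimator=model,
                            param_grid=para,
                            scoring=None,
                            n_jobs=1,
                            refit=True,
                            cv=3,
                            verbose=5,
                            pre_dispatch='2*n_jobs',
                            error_score='raise')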


#GridSearch fit
grid_search.fit(X_train,y_train)



Fitting 3 folds for each of 18 candidates, totalling 54 fits
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=100 
[CV]  criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=100, score=0.9342105263157895, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=100 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.1s remaining:    0.0s


[CV]  criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=100, score=0.9342105263157895, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=100 
[CV]  criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=100, score=0.9403973509933775, total=   0.1s
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=200 


[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.3s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    0.4s remaining:    0.0s


[CV]  criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=200, score=0.9342105263157895, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=200 
[CV]  criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=200, score=0.9342105263157895, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=200 
[CV]  criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=200, score=0.9403973509933775, total=   0.1s
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=1000 
[CV]  criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=1000, score=0.9276315789473685, total=   0.1s
[CV] criterion=friedman_mse, learning_rate=0.1, loss=deviance, max_depth=None, n_estimators=1000 
[CV]  criterion=friedman_mse, learning_rate=0.1, loss=dev

[CV]  criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=100, score=0.9403973509933775, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=200 
[CV]  criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=200, score=0.9407894736842105, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=200 
[CV]  criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=200, score=0.9342105263157895, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=200 
[CV]  criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=200, score=0.9337748344370861, total=   0.0s
[CV] criterion=friedman_mse, learning_rate=0.8, loss=deviance, max_depth=None, n_estimators=1000 
[CV]  criterion=friedman_mse, learning_rate=0.8, loss=devia

[Parallel(n_jobs=1)]: Done  54 out of  54 | elapsed:    9.9s finished


GridSearchCV(cv=None, error_score='raise',
       estimator=GradientBoostingClassifier(criterion='friedman_mse', init=None,
              learning_rate=0.1, loss='deviance', max_depth=3,
              max_features=None, max_leaf_nodes=None,
              min_impurity_decrease=0.0, min_impurity_split=None,
              min_samples_leaf=1, min_samples_split=2,
              min_weight_fraction_leaf=0.0, n_estimators=100,
              presort='auto', random_state=None, subsample=1.0, verbose=0,
              warm_start=False),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'loss': ['deviance'], 'learning_rate': [0.1, 0.5, 0.8], 'n_estimators': [100, 200, 1000], 'criterion': ['friedman_mse'], 'max_depth': [None, 100]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',
       scoring=None, verbose=5)

In [146]:
#Best GridSearch Parameters
print(grid_search.best_params_)

#Best GridSearch estimator
best_gridsearch1 = grid_search.best_estimator_

#Fit model with best estimator (not needed: refit=True already refits it on X_train, y_train)
#best_gridsearch1.fit(X_train, y_train)

{'criterion': 'friedman_mse', 'learning_rate': 0.5, 'loss': 'deviance', 'max_depth': 100, 'n_estimators': 100}
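
The `verbose=5` log above is long and hard to scan. As a compact alternative, the fitted search object exposes `cv_results_`, which can be loaded into a DataFrame and sorted by rank. A minimal sketch on a deliberately tiny, illustrative grid (the grid values and the names `small_grid`, `search` and `results` are assumptions, not the grids used in this notebook):

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, shuffle=False)

# Tiny illustrative grid: 2 x 2 = 4 candidates
small_grid = {"n_estimators": [10, 20], "min_samples_leaf": [1, 30]}

search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid=small_grid,
                      scoring="accuracy",
                      cv=3)
search.fit(X_train, y_train)

# One row per candidate, best-ranked first
results = (pd.DataFrame(search.cv_results_)
           [["params", "mean_test_score", "std_test_score", "rank_test_score"]]
           .sort_values("rank_test_score"))

print(results.to_string(index=False))
print("best:", search.best_params_)
```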


In [147]:
#### XGB model
#model
model = XGBClassifier()

#grid search parameters
para = {'max_depth':[3,10,100],
        'learning_rate':[0.1, 0.5], #= "eta"
        'n_estimators':[100,350, 1000],
        'objective':['binary:logistic'],  
        'booster':['gbtree'],
        'gamma':[0,0.5,1]}

#grid search
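#fit_params, iid and return_train_score='warn' are no longer accepted by recent scikit-learn;
#cv=3 keeps the 3-fold split that cv=None meant when this notebook was run
grid_search = GridSearchCV(estimator=model,
                            param_grid=para,
                            scoring=None,
                            n_jobs=1,
                            refit=True,
                            cv=3,
                            verbose=5,
                            pre_dispatch='2*n_jobs',
                            error_score='raise')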


#GridSearch fit
grid_search.fit(X_train,y_train)

Fitting 3 folds for each of 54 candidates, totalling 162 fits
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic, score=0.9276315789473685, total=   0.0s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.0s remaining:    0.0s


[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic, score=0.9736842105263158, total=   0.0s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic 


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.1s remaining:    0.0s


[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic, score=0.9735099337748344, total=   0.0s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=350, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=350, objective=binary:logistic, score=0.9342105263157895, total=   0.7s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=350, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=350, objective=binary:logistic, score=0.9736842105263158, total=   0.0s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=350, objective=binary:logistic 


[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    0.9s remaining:    0.0s


[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=350, objective=binary:logistic, score=0.9735099337748344, total=   0.0s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=1000, objective=binary:logistic, score=0.9342105263157895, total=   0.2s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=1000, objective=binary:logistic, score=0.9736842105263158, total=   0.2s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.1, max_depth=3, n_estimators=1000, objective=binary:logistic, score=0.9735099337748344, total=   0.2s
[CV] booster=gbtree, gamma=0, learning_rate=0.1, max_depth=10, n_estimators=100, 

[CV]  booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=350, objective=binary:logistic, score=0.9736842105263158, total=   0.0s
[CV] booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=350, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=350, objective=binary:logistic, score=0.9602649006622517, total=   0.1s
[CV] booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=1000, objective=binary:logistic, score=0.9539473684210527, total=   0.2s
[CV] booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=1000, objective=binary:logistic, score=0.9736842105263158, total=   0.1s
[CV] booster=gbtree, gamma=0, learning_rate=0.5, max_depth=10, n_estimators=

[CV]  booster=gbtree, gamma=0.5, learning_rate=0.1, max_depth=100, n_estimators=350, objective=binary:logistic, score=0.9276315789473685, total=   0.1s
[CV] booster=gbtree, gamma=0.5, learning_rate=0.1, max_depth=100, n_estimators=350, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0.5, learning_rate=0.1, max_depth=100, n_estimators=350, objective=binary:logistic, score=0.9671052631578947, total=   0.1s
[CV] booster=gbtree, gamma=0.5, learning_rate=0.1, max_depth=100, n_estimators=350, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0.5, learning_rate=0.1, max_depth=100, n_estimators=350, objective=binary:logistic, score=0.9536423841059603, total=   0.0s
[CV] booster=gbtree, gamma=0.5, learning_rate=0.1, max_depth=100, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=0.5, learning_rate=0.1, max_depth=100, n_estimators=1000, objective=binary:logistic, score=0.9276315789473685, total=   0.4s
[CV] booster=gbtree, gamma=0.5, learning_rate=0.1, max_de

[CV]  booster=gbtree, gamma=0.5, learning_rate=0.5, max_depth=100, n_estimators=1000, objective=binary:logistic, score=0.9536423841059603, total=   0.4s
[CV] booster=gbtree, gamma=1, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic 
[CV]  booster=gbtree, gamma=1, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic, score=0.9144736842105263, total=   0.0s
[CV] booster=gbtree, gamma=1, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic 
[CV]  booster=gbtree, gamma=1, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic, score=0.9671052631578947, total=   0.0s
[CV] booster=gbtree, gamma=1, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic 
[CV]  booster=gbtree, gamma=1, learning_rate=0.1, max_depth=3, n_estimators=100, objective=binary:logistic, score=0.9602649006622517, total=   0.0s
[CV] booster=gbtree, gamma=1, learning_rate=0.1, max_depth=3, n_estimators=350, ob

[CV]  booster=gbtree, gamma=1, learning_rate=0.5, max_depth=3, n_estimators=350, objective=binary:logistic, score=0.9470198675496688, total=   0.1s
[CV] booster=gbtree, gamma=1, learning_rate=0.5, max_depth=3, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=1, learning_rate=0.5, max_depth=3, n_estimators=1000, objective=binary:logistic, score=0.9407894736842105, total=   1.1s
[CV] booster=gbtree, gamma=1, learning_rate=0.5, max_depth=3, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=1, learning_rate=0.5, max_depth=3, n_estimators=1000, objective=binary:logistic, score=0.9671052631578947, total=   0.6s
[CV] booster=gbtree, gamma=1, learning_rate=0.5, max_depth=3, n_estimators=1000, objective=binary:logistic 
[CV]  booster=gbtree, gamma=1, learning_rate=0.5, max_depth=3, n_estimators=1000, objective=binary:logistic, score=0.9470198675496688, total=   0.5s
[CV] booster=gbtree, gamma=1, learning_rate=0.5, max_depth=10, n_estimators=100, 

[Parallel(n_jobs=1)]: Done 162 out of 162 | elapsed:   44.5s finished


GridSearchCV(cv=None, error_score='raise',
       estimator=XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, objective='binary:logistic', random_state=0,
       reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
       silent=True, subsample=1),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'max_depth': [3, 10, 100], 'learning_rate': [0.1, 0.5], 'n_estimators': [100, 350, 1000], 'objective': ['binary:logistic'], 'booster': ['gbtree'], 'gamma': [0, 0.5, 1]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',
       scoring=None, verbose=5)

In [148]:
#Best GridSearch Parameters
print(grid_search.best_params_)

#Best GridSearch estimator
best_gridsearch2 = grid_search.best_estimator_

#Fit model with best estimator (not needed: refit=True already refits it on X_train, y_train)
#best_gridsearch2.fit(X_train, y_train)

{'booster': 'gbtree', 'gamma': 0, 'learning_rate': 0.5, 'max_depth': 3, 'n_estimators': 100, 'objective': 'binary:logistic'}


In [149]:
#### Random Forest model
#model
model = RandomForestClassifier()

#grid search parameters
para =  {'n_estimators':[10, 20, 100, 1000],
         'criterion':['gini'], 
         'max_depth':[None, 100, 200],              
         'min_samples_split':[2,4],
         'min_samples_leaf':[1,30]}

#grid search
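#fit_params, iid and return_train_score='warn' are no longer accepted by recent scikit-learn;
#cv=3 keeps the 3-fold split that cv=None meant when this notebook was run
grid_search = GridSearchCV(estimator=model,
                            param_grid=para,
                            scoring=None,
                            n_jobs=1,
                            refit=True,
                            cv=3,
                            verbose=5,
                            pre_dispatch='2*n_jobs',
                            error_score='raise')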


#GridSearch fit
grid_search.fit(X_train,y_train)

Fitting 3 folds for each of 48 candidates, totalling 144 fits
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=10 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=10, score=0.9144736842105263, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=10 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=10, score=0.9671052631578947, total=   0.0s

[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.0s remaining:    0.0s



[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=10 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=10, score=0.9470198675496688, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=20 


[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.1s remaining:    0.0s


[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=20, score=0.9210526315789473, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=20 


[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    0.2s remaining:    0.0s


[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=20, score=0.9802631578947368, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=20 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=20, score=0.9735099337748344, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=100 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=100, score=0.9276315789473685, total=   0.3s
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=100 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=100, score=0.9671052631578947, total=   0.1s
[CV] criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=100 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=1, min_samples_s

[CV]  criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=100, score=0.9013157894736842, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=100 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=100, score=0.9473684210526315, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=100 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=100, score=0.9337748344370861, total=   0.0s
[CV] criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=1000 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=1000, score=0.8947368421052632, total=   0.9s
[CV] criterion=gini, max_depth=None, min_samples_leaf=30, min_samples_split=4, n_estimators=1000 
[CV]  criterion=gini, max_depth=None, min_samples_leaf=30

[CV]  criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=100, score=0.8947368421052632, total=   0.0s
[CV] criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=100 
[CV]  criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=100, score=0.9539473684210527, total=   0.0s
[CV] criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=100 
[CV]  criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=100, score=0.9403973509933775, total=   0.0s
[CV] criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=1000 
[CV]  criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=1000, score=0.8881578947368421, total=   1.4s
[CV] criterion=gini, max_depth=100, min_samples_leaf=30, min_samples_split=2, n_estimators=1000 
[CV]  criterion=gini, max_depth=100, min_samples_leaf=30, min_sam

[CV]  criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=100, score=0.9342105263157895, total=   0.1s
[CV] criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=100 
[CV]  criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=100, score=0.9802631578947368, total=   0.1s
[CV] criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=100 
[CV]  criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=100, score=0.9337748344370861, total=   0.1s
[CV] criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=1000 
[CV]  criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=1000, score=0.9342105263157895, total=   1.4s
[CV] criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_split=4, n_estimators=1000 
[CV]  criterion=gini, max_depth=200, min_samples_leaf=1, min_samples_spli

[Parallel(n_jobs=1)]: Done 144 out of 144 | elapsed:  1.2min finished


GridSearchCV(cv=None, error_score='raise',
       estimator=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'n_estimators': [10, 20, 100, 1000], 'criterion': ['gini'], 'max_depth': [None, 100, 200], 'min_samples_split': [2, 4], 'min_samples_leaf': [1, 30]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',
       scoring=None, verbose=5)

In [150]:
#Best GridSearch Parameters
print(grid_search.best_params_)

#Best GridSearch estimator
best_gridsearch3 = grid_search.best_estimator_

#Refitting is optional: with refit=True, GridSearchCV has already refit the best estimator on X_train, y_train
#best_gridsearch3.fit(X_train, y_train)

{'criterion': 'gini', 'max_depth': 100, 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 20}
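
The best parameters printed above come from cross-validation on the training data only, so they say nothing yet about out-of-sample performance. A minimal sketch of how the refit best estimator could be scored on a held-out set is shown here. The reduced parameter grid, the `train_test_split` settings, the `random_state` values and the names `best_rf` and `test_accuracy` are assumptions made for illustration; they are not the notebook's actual split or grid.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Breast cancer data, as in the notebook; the split below is an assumed one.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# A deliberately small grid (assumption) so the sketch runs in a few seconds.
grid_search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 20], "min_samples_leaf": [1, 30]},
    cv=3,
    refit=True,  # the best candidate is refit on all of X_train, y_train
)
grid_search.fit(X_train, y_train)

# best_estimator_ is already fitted, so it can be scored directly.
best_rf = grid_search.best_estimator_
test_accuracy = best_rf.score(X_test, y_test)

print(grid_search.best_params_)
print(round(test_accuracy, 3))
```

Because `refit=True`, calling `.fit` again on `best_estimator_` would only repeat work that `GridSearchCV` has already done.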


In [151]:
#### LightGBM
#model
model = LGBMClassifier()

#grid search parameters
para = {'boosting_type': ['gbdt', 'dart'],
        'num_leaves': [31, 50],
        'max_depth': [-1],
        'learning_rate': [0.1, 0.5],
        'n_estimators': [100, 300, 1000]}

#grid search
#cv=None meant 3 folds in the scikit-learn version used here; newer releases default to 5 folds
grid_search = GridSearchCV(estimator=model,
                            param_grid=para,
                            scoring=None,
                            n_jobs=1,
                            refit=True,
                            cv=None,
                            verbose=5,
                            pre_dispatch='2*n_jobs',
                            error_score='raise')


#GridSearch fit
grid_search.fit(X_train, y_train)

Fitting 3 folds for each of 24 candidates, totalling 72 fits
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=31 
[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=31, score=0.9210526315789473, total=   0.0s
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=31 
[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=31, score=0.9868421052631579, total=   0.0s
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=31 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.0s remaining:    0.0s


[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=31, score=0.9602649006622517, total=   0.0s
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=50 


[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.1s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    0.1s remaining:    0.0s


[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=50, score=0.9210526315789473, total=   0.0s
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=50 
[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=50, score=0.9868421052631579, total=   0.0s
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=50 
[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=100, num_leaves=50, score=0.9602649006622517, total=   0.0s
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31 
[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31, score=0.9342105263157895, total=   0.0s
[CV] boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31 
[CV]  boosting_type=gbdt, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31, score=0.97368421

[CV]  boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31, score=0.9276315789473685, total=   0.1s
[CV] boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31 
[CV]  boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31, score=0.9802631578947368, total=   0.1s
[CV] boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31 
[CV]  boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=31, score=0.9668874172185431, total=   0.1s
[CV] boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=50 
[CV]  boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=50, score=0.9276315789473685, total=   0.6s
[CV] boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=50 
[CV]  boosting_type=dart, learning_rate=0.1, max_depth=-1, n_estimators=300, num_leaves=50, score=0.98026315

[Parallel(n_jobs=1)]: Done  72 out of  72 | elapsed:   16.1s finished


GridSearchCV(cv=None, error_score='raise',
       estimator=LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=1.0,
        learning_rate=0.1, max_depth=-1, min_child_samples=20,
        min_child_weight=0.001, min_split_gain=0.0, n_estimators=100,
        n_jobs=-1, num_leaves=31, objective=None, random_state=None,
        reg_alpha=0.0, reg_lambda=0.0, silent=True, subsample=1.0,
        subsample_for_bin=200000, subsample_freq=1),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'boosting_type': ['gbdt', 'dart'], 'num_leaves': [31, 50], 'max_depth': [-1], 'learning_rate': [0.1, 0.5], 'n_estimators': [100, 300, 1000]},
       pre_dispatch='2*n_jobs', refit=True, return_train_score='warn',
       scoring=None, verbose=5)

In [152]:
#Best GridSearch Parameters
print(grid_search.best_params_)

#Best GridSearch estimator
best_gridsearch4 = grid_search.best_estimator_

#Refitting is optional: with refit=True, GridSearchCV has already refit the best estimator on X_train, y_train
#best_gridsearch4.fit(X_train, y_train)

{'boosting_type': 'dart', 'learning_rate': 0.5, 'max_depth': -1, 'n_estimators': 1000, 'num_leaves': 31}
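
The verbose `[CV]` log is hard to compare across candidates by eye. One alternative is to read the fitted search object's `cv_results_` attribute into a table, sketched below under stated assumptions. A `RandomForestClassifier` with a small grid stands in for the LightGBM search so that the example has no dependency beyond scikit-learn and pandas; the names `search`, `results` and `summary` are illustrative. The same pattern should apply to the `grid_search` object fitted above.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Stand-in search (assumption): small grid, fixed seed, 3 folds.
search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 20], "min_samples_leaf": [1, 30]},
    cv=3,
)
search.fit(X, y)

# cv_results_ is a dict of arrays with one entry per candidate.
results = pd.DataFrame(search.cv_results_)

# Keep the columns that summarise each candidate and sort by rank (1 = best).
summary = results[
    ["params", "mean_test_score", "std_test_score", "rank_test_score"]
].sort_values("rank_test_score")

print(summary.to_string(index=False))
```

Looking at `std_test_score` next to `mean_test_score` helps judge whether the winning candidate is meaningfully better than the runners-up or only ahead by fold-to-fold noise.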
