This notebook documents how the models changed from one tuning run to the next.

We compare:

feature importances, top-10 feature lists, parameter spaces, and partial dependence plots (PDPs).

The next cell defines the feature lists and parameters; the cells after it compute the commonalities among those lists.

In [2]:
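# top-10 features for each model, in order of importance

# values from the paper
# NOTE: 'da_overRH1' below is spelled differently from 'daOverRH1' in the
# other lists, so it never matches in the set intersections further down.

paper_top10 = ["mindaOverRH", "daOverRH2", "maxdaOverRH",
               "min_ecross3", "norm_LyapunovTime", "da_overRH1",
               "min_ecross2", "norm_a3_slope", "min_ecross1", "norm_a1_slope"]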

paper_params = {'base_score':0.5, 'colsample_bylevel':1, 'colsample_bytree':1.0,
       'gamma':0, 'learning_rate':0.002, 'max_delta_step':0, 'max_depth':8,
       'min_child_weight':1.2, 'missing':None, 'n_estimators':5000, 'nthread':-1,
       'objective':'binary:logistic', 'reg_alpha':0, 'reg_lambda':1,
       'scale_pos_weight':1, 'seed':27, 'silent':True, 'subsample':0.5}

# non-tuned top 10 : 
non_tuned_top10 = ['mindaOverRH', 'maxdaOverRH', 'min_ecross3',
 'min_ecross2', 'norm_LyapunovTime', 'daOverRH2',
 'daOverRH1', 'norm_max_a2', 'norm_max_window10_a3', 'min_ecross1']

non_tuned_params = {'criterion':'friedman_mse', 'init':None,
              'learning_rate':0.1, 'loss':'deviance', 'max_depth':3,
              'max_features':None, 'max_leaf_nodes':None,
              'min_impurity_split':1e-07, 'min_samples_leaf':1,
              'min_samples_split':2,'min_weight_fraction_leaf':0.0,
              'n_estimators':5000, 'presort':'auto', 'random_state':None,
              'subsample':1.0, 'verbose':0, 'warm_start':False}

# partial top 10 features:
partial_top10 = ['norm_max_a2', 'mindaOverRH', 'maxdaOverRH',
 'norm_std_a2', 'daOverRH2', 'norm_LyapunovTime', 
 'daOverRH1', 'norm_std_a3', 'min_ecross3', 'norm_max_window10_a3']

partial_params = {'criterion':'friedman_mse', 'init':None,
              'learning_rate':0.01, 'loss':'deviance', 'max_depth':9,
              'max_features':0.5, 'max_leaf_nodes':None,
              'min_impurity_split':1e-07, 'min_samples_leaf':1,
              'min_samples_split':2, 'min_weight_fraction_leaf':0.0,
              'n_estimators':5900, 'presort':'auto', 'random_state':None,
              'subsample':0.8, 'verbose':0, 'warm_start':False}


# Full parameter space top 10:
full_top10 = ['norm_max_a2', 'maxdaOverRH', 'mindaOverRH', 
 'norm_std_a2', 'daOverRH2', 'norm_LyapunovTime',
 'norm_std_a3', 'daOverRH1', 'norm_max_a1', 'max_ecross2']

full_params = {'criterion':'friedman_mse', 'init':None,
              'learning_rate':0.029, 'loss':'deviance', 'max_depth':9,
              'max_features':0.59, 'max_leaf_nodes':13,
              'min_impurity_split':1e-07, 'min_samples_leaf':4,
              'min_samples_split':20, 'min_weight_fraction_leaf':0.0,
              'n_estimators':2300, 'presort':'auto', 'random_state':None,
              'subsample':0.9, 'verbose':0, 'warm_start':False}

In [15]:
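# original_top10 is not defined in the cells shown here;
# it must already exist in the session for this cell to run.
commonalities = (set(non_tuned_top10) & set(original_top10) &
                 set(partial_top10) & set(full_top10) & set(paper_top10))

In [16]:
print(commonalities)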

set(['maxdaOverRH', 'mindaOverRH', 'daOverRH2', 'norm_LyapunovTime'])


Without the paper's XGBoost list:

In [17]:
commonalities = (set(non_tuned_top10) & set(original_top10) &
                 set(partial_top10) & set(full_top10))
print(commonalities)

set(['mindaOverRH', 'daOverRH1', 'daOverRH2', 'norm_LyapunovTime', 'maxdaOverRH', 'norm_max_a2'])
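
The paper's list spells one feature `da_overRH1`, while the other lists use `daOverRH1`, so that feature cannot appear in any intersection that includes `paper_top10`. The sketch below compares names while ignoring case and underscores. It assumes the two spellings refer to the same feature, which the source does not confirm; `normalize` and `common_features` are hypothetical helpers, and `original_top10` is left out because it is not defined in the cells shown.

```python
# Lists copied from the cell above (original_top10 is not available here).
paper_top10 = ["mindaOverRH", "daOverRH2", "maxdaOverRH",
               "min_ecross3", "norm_LyapunovTime", "da_overRH1",
               "min_ecross2", "norm_a3_slope", "min_ecross1", "norm_a1_slope"]
non_tuned_top10 = ['mindaOverRH', 'maxdaOverRH', 'min_ecross3',
                   'min_ecross2', 'norm_LyapunovTime', 'daOverRH2',
                   'daOverRH1', 'norm_max_a2', 'norm_max_window10_a3', 'min_ecross1']
partial_top10 = ['norm_max_a2', 'mindaOverRH', 'maxdaOverRH',
                 'norm_std_a2', 'daOverRH2', 'norm_LyapunovTime',
                 'daOverRH1', 'norm_std_a3', 'min_ecross3', 'norm_max_window10_a3']
full_top10 = ['norm_max_a2', 'maxdaOverRH', 'mindaOverRH',
              'norm_std_a2', 'daOverRH2', 'norm_LyapunovTime',
              'norm_std_a3', 'daOverRH1', 'norm_max_a1', 'max_ecross2']


def normalize(name):
    """Hypothetical helper: drop underscores and case before comparing."""
    return name.replace("_", "").lower()


def common_features(lists, key=lambda name: name):
    """Intersect the lists after mapping every name through `key`."""
    sets = [set(key(name) for name in lst) for lst in lists]
    return set.intersection(*sets)


all_lists = [paper_top10, non_tuned_top10, partial_top10, full_top10]
exact = common_features(all_lists)                 # spelling-sensitive
loose = common_features(all_lists, key=normalize)  # spelling-insensitive

print(sorted(exact))
print(sorted(loose))
```

Any name that appears only in `loose` is a candidate spelling mismatch worth checking by hand before changing the lists.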


Taking order into account (a feature counts only if it sits at the same position in every list):

In [21]:
commonalities_order = [i for i, j, k, l, m in zip(non_tuned_top10, original_top10,
                                                  partial_top10, full_top10, paper_top10)
                       if i == j == k == l == m]

print(commonalities_order)

[]


Again, without the paper's list:

In [25]:
commonalities_order = [i for i, j, k, l in zip(non_tuned_top10, original_top10,
                                               partial_top10, full_top10)
                       if i == j == k == l]

print(commonalities_order)

[]
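
The position-by-position comparison returns an empty list because no feature occupies the same index in every list. A softer view, sketched here, records the rank of each shared feature in each list. It uses only the three scikit-learn lists defined above, since `original_top10` is not defined in the cells shown; the names `lists`, `shared`, and `ranks` are illustrative.

```python
# Lists copied from the cell above.
non_tuned_top10 = ['mindaOverRH', 'maxdaOverRH', 'min_ecross3',
                   'min_ecross2', 'norm_LyapunovTime', 'daOverRH2',
                   'daOverRH1', 'norm_max_a2', 'norm_max_window10_a3', 'min_ecross1']
partial_top10 = ['norm_max_a2', 'mindaOverRH', 'maxdaOverRH',
                 'norm_std_a2', 'daOverRH2', 'norm_LyapunovTime',
                 'daOverRH1', 'norm_std_a3', 'min_ecross3', 'norm_max_window10_a3']
full_top10 = ['norm_max_a2', 'maxdaOverRH', 'mindaOverRH',
              'norm_std_a2', 'daOverRH2', 'norm_LyapunovTime',
              'norm_std_a3', 'daOverRH1', 'norm_max_a1', 'max_ecross2']

lists = {"non_tuned": non_tuned_top10,
         "partial": partial_top10,
         "full": full_top10}

# Features present in every list, regardless of position.
shared = set.intersection(*(set(lst) for lst in lists.values()))

# 1-based rank of each shared feature in each list.
ranks = {feature: {name: lst.index(feature) + 1 for name, lst in lists.items()}
         for feature in shared}

# Print features ordered by their summed rank (lower means more important overall).
for feature in sorted(ranks, key=lambda f: (sum(ranks[f].values()), f)):
    print(feature, ranks[feature])
```

This makes shifts visible that the strict comparison hides, such as a feature that is near the bottom of one list and at the top of another.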


The scores across models:

| Model | Accuracy | AUC Score (Test) | ROC_AUC_Score |
|---|---|---|---|
| From the paper | 0.89 | 0.901043 | 0.949938574939 |
| From the replication | 0.882 | 0.895542 | 0.945628156566 |
| Partial parameter space hyperopt | 0.8927 | 0.898141 | 0.950990052553 |
| Full parameter space hyperopt | 0.8933 | 0.899324 | 0.950904739967 |

The paper's output labels the second score `PR_AUC Score (Test)`; the other three runs label it `AUC Score (Test)`.
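
To make the size of the differences easier to read, the sketch below subtracts the paper's scores from each of the other runs. It covers only accuracy and ROC AUC, because the remaining score is labelled differently in the paper's output and may not be the same metric. The dictionary layout and key names are illustrative; the numbers are the ones reported above.

```python
# Scores as reported above.
scores = {
    "paper":       {"accuracy": 0.89,   "roc_auc": 0.949938574939},
    "replication": {"accuracy": 0.882,  "roc_auc": 0.945628156566},
    "partial":     {"accuracy": 0.8927, "roc_auc": 0.950990052553},
    "full":        {"accuracy": 0.8933, "roc_auc": 0.950904739967},
}

baseline = scores["paper"]

# Difference of each run from the paper, per metric.
deltas = {
    model: {metric: value - baseline[metric] for metric, value in vals.items()}
    for model, vals in scores.items()
    if model != "paper"
}

for model in ("replication", "partial", "full"):
    print("%-12s accuracy %+.4f   roc_auc %+.6f"
          % (model, deltas[model]["accuracy"], deltas[model]["roc_auc"]))
```

A positive value means the run scored higher than the paper on that metric.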

In [1]:
# only the parameters that are shared by the scikit-learn models,
# so the values can be compared side by side

non_tuned_params = {
    'learning_rate':0.1, 'max_depth':3,
    'max_features':None, 'max_leaf_nodes':None,
    'min_impurity_split':1e-07, 'min_samples_leaf':1,
    'min_samples_split':2, 'n_estimators':5000,
    'subsample':1.0}

original_top10_params = {
    'learning_rate':0.002, 'max_depth':8,
    'max_features':0.5, 'max_leaf_nodes':8,
    'min_impurity_split':1e-07, 'min_samples_leaf':1,
    'min_samples_split':2, 'n_estimators':5000,
    'subsample':0.5}

partial_params = {
    'learning_rate':0.01, 'max_depth':9,
    'max_features':0.5, 'max_leaf_nodes':None,
    'min_impurity_split':1e-07, 'min_samples_leaf':1,
    'min_samples_split':2, 'n_estimators':5900,
    'subsample':0.8}

full_params = {
    'learning_rate':0.029, 'max_depth':9,
    'max_features':0.59, 'max_leaf_nodes':13,
    'min_impurity_split':1e-07, 'min_samples_leaf':4,
    'min_samples_split':20, 'n_estimators':2300,
    'subsample':0.9}
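
The four dictionaries can also be compared programmatically. The sketch below splits the parameters into those that take the same value in every model and those that vary; the names `models`, `varying`, and `constant` are illustrative, and the values are copied from the cell above.

```python
non_tuned_params = {
    'learning_rate': 0.1, 'max_depth': 3, 'max_features': None,
    'max_leaf_nodes': None, 'min_impurity_split': 1e-07,
    'min_samples_leaf': 1, 'min_samples_split': 2,
    'n_estimators': 5000, 'subsample': 1.0}
original_top10_params = {
    'learning_rate': 0.002, 'max_depth': 8, 'max_features': 0.5,
    'max_leaf_nodes': 8, 'min_impurity_split': 1e-07,
    'min_samples_leaf': 1, 'min_samples_split': 2,
    'n_estimators': 5000, 'subsample': 0.5}
partial_params = {
    'learning_rate': 0.01, 'max_depth': 9, 'max_features': 0.5,
    'max_leaf_nodes': None, 'min_impurity_split': 1e-07,
    'min_samples_leaf': 1, 'min_samples_split': 2,
    'n_estimators': 5900, 'subsample': 0.8}
full_params = {
    'learning_rate': 0.029, 'max_depth': 9, 'max_features': 0.59,
    'max_leaf_nodes': 13, 'min_impurity_split': 1e-07,
    'min_samples_leaf': 4, 'min_samples_split': 20,
    'n_estimators': 2300, 'subsample': 0.9}

models = {"non_tuned": non_tuned_params,
          "original": original_top10_params,
          "partial": partial_params,
          "full": full_params}

varying = {}   # parameter -> {model: value} where at least two models differ
constant = []  # parameters with one value across all models

for param in sorted(non_tuned_params):
    values = {name: params[param] for name, params in models.items()}
    if len(set(values.values())) > 1:
        varying[param] = values
    else:
        constant.append(param)

print("constant:", constant)
for param, values in sorted(varying.items()):
    print(param, values)
```

Only the parameters in `varying` can explain differences between the models' feature rankings and scores.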


Below are three of the feature-importance plots:

XGBoost from paper 

![title](Feature_Importance_XGBoost.png) 
    
Partial Parameter Space Hyperopt

![title](Partial_Parameter_Space_Importances.png)

Full Parameter Space Hyperopt

![title](Full_Parameter_Space_Importances.png)


Looking at the figures, there are some significant differences between the models. For example, `daOverRH2` and `maxdaOverRH`:

![title](daoverrh2_maxdaoverrhp.png) ![title](daoverrh2_maxdaoverrhF.png)

`mindaOverRH` and `daOverRH2`:

![title](mindaoverrh_daoverrh2P.png) ![title](mindaoverrh_daoverrh2F.png)

`mindaOverRH` and `norm_max_a2`:

![title](mindaOverRH_normaxa2P.png) ![title](mindaOverRH_normaxa2F.png)

`norm_max_a2` and `daOverRH2`:

![title](normmaxa2_daoverrh2P.png) ![title](normmaxa2_daoverrh2F.png)

`norm_max_a2` and `min_ecross3`:

![title](normmaxa2_minecross3P.png) ![title](normmaxa2_minecross3F.png)

`norm_max_a2` and `norm_std_a3`:

![title](normmaxa2_normstda3P.png) ![title](normmaxa2_normstda3.png)