<a href="https://colab.research.google.com/github/parulnith/Demo1/blob/master/Selecting_a_Better_XGBoost_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chapter 8: Debugging XGBoost

### Selecting a Better XGBoost Model

we want to start our debugging exercise on
solid footing by selecting a highly stable, generalizable, and valuable model. To do
that, we won’t just rely on grid search. Instead we’ll select a model as inspired by the
[Caruana et al. cross-validation ranking approach](https://www.cs.cornell.edu/people/tj/publications/caruana_etal_04a.pdf) used in the **2004 Knowledge Discovery in Databases (KDD) Cup**. We’ll also compare these results to a standard ran‐
dom grid search so we can get an idea of the difference between a grid search and the
cross-validation ranking procedure described in this section. Then, before moving
onto sensitivity analysis, we’ll do a basic estimation of our model’s business value to
sanity check that we’re not wasting money.

## Setting the Environment

Download the [zipped **data_and_package** folder](https://github.com/ml-for-high-risk-apps-book/Machine-Learning-for-High-Risk-Applications-Book/blob/main/code/data_and_package.zip) onto your local system and save it as `Data.zip`. 

In [1]:
# Upload the downloaded zipped file from your system to the colab environment. 
from google.colab import files
uploaded = files.upload()

Saving Data.zip to Data.zip


In [2]:
!unzip -q "/content/Data.zip" 
%cd /content/Data

/content/Data


In [3]:
# Installing the libraries
%pip install 'XGBoost==1.6'

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting XGBoost==1.6
  Downloading xgboost-1.6.0-py3-none-manylinux2014_x86_64.whl (193.7 MB)
[K     |████████████████████████████████| 193.7 MB 37 kB/s 
Installing collected packages: XGBoost
  Attempting uninstall: XGBoost
    Found existing installation: xgboost 0.90
    Uninstalling xgboost-0.90:
      Successfully uninstalled xgboost-0.90
Successfully installed XGBoost-1.6.0


### Global hyperpameters

In [4]:
SEED = 12345 # global random seed for better reproducibility
ROUND = 3 # insane precision is unhelpful
N_MODELS = 50 # should be less than 100

### Python imports and inits

In [5]:
# suppres Pandas future warning ... they are deprecating `append` ... thanks
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

import numpy as np   # array, vector, matrix calculations
import pandas as pd  # DataFrame handling
import xgboost as xgb 
import matplotlib.pyplot as plt # general plotting
pd.options.display.max_columns = 999 # enable display of all columns in notebook

# for grid search custom functions
import itertools
import json

# for model eval
from sklearn.metrics import accuracy_score, f1_score, log_loss, mean_squared_error, roc_auc_score

# display plots in-notebook
%matplotlib inline   

# set numpy random seed
np.random.seed(SEED)

### Importing dataset 

In [7]:
data = pd.read_csv('../Data/Data/credit_line_increase.csv')
data.head()

Unnamed: 0,ID,LIMIT_BAL,SEX,RACE,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,PAY_6,BILL_AMT1,BILL_AMT2,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6,DELINQ_NEXT
0,1,20000,2,1.0,2,1,24,2,2,-1,-1,-2,-2,3913,3102,689,0,0,0,0,689,0,0,0,0,1
1,2,120000,2,2.0,2,2,26,-1,2,0,0,0,2,2682,1725,2682,3272,3455,3261,0,1000,1000,1000,0,2000,1
2,3,90000,2,3.0,2,2,34,0,0,0,0,0,0,29239,14027,13559,14331,14948,15549,1518,1500,1000,1000,1000,5000,0
3,4,50000,2,4.0,2,1,37,0,0,0,0,0,0,46990,48233,49291,28314,28959,29547,2000,2019,1200,1100,1069,1000,0
4,5,50000,1,3.0,2,1,57,-1,0,-1,0,0,0,8617,5670,35835,20940,19146,19131,2000,36681,10000,9000,689,679,0


### Assign target and inputs for models
Note that Demographic features are not used as model inputs.

In [10]:
id_col = 'ID'
groups = ['SEX', 'RACE', 'EDUCATION', 'MARRIAGE', 'AGE']
target = 'DELINQ_NEXT'

In [11]:
np.random.seed(SEED)

split_train_test = 2/3

split = np.random.rand(len(data)) < split_train_test
train = data[split].copy()
test = data[~split].copy()

split_test_valid = 1/2

split = np.random.rand(len(test)) < split_test_valid
valid = test[split].copy()
test = test[~split].copy()

del data

print(f"Train/Validation/Test sizes: {len(train)}/{len(valid)}/{len(test)}")

Train/Validation/Test sizes: 19919/5045/5036


In [12]:
target = 'DELINQ_NEXT'
demographic_cols = ['SEX', 'RACE','EDUCATION', 'MARRIAGE', 'AGE']
features = [col for col in train.columns if col not in demographic_cols + ['ID',target]]

print('target =', target)
print('predictors =', features)

target = DELINQ_NEXT
predictors = ['LIMIT_BAL', 'PAY_0', 'PAY_2', 'PAY_3', 'PAY_4', 'PAY_5', 'PAY_6', 'BILL_AMT1', 'BILL_AMT2', 'BILL_AMT3', 'BILL_AMT4', 'BILL_AMT5', 'BILL_AMT6', 'PAY_AMT1', 'PAY_AMT2', 'PAY_AMT3', 'PAY_AMT4', 'PAY_AMT5', 'PAY_AMT6']


In [13]:
# Converting Pandas dataframe into DMatrix 
training_frame = xgb.DMatrix(train[features], train[target])
validation_frame = xgb.DMatrix(valid[features], valid[target])
test_frame = xgb.DMatrix(test[features], test[target])

## Training XGBoost model

#### Utility Functions For Training 
Using random grid search to find the best hyperparameter values

In [8]:
def _train(_dtrain, _dvalid, _mono_constraints=None, _xgb_params=None, _ntree=None,
          _early_stopping_rounds=None, _verbose=None, _seed=None, _logger=None):

    """ Wrapper for XGBoost train method.

    :param _dtrain: Training data in LightSVM format.
    :param _dvalid: Validation data in LightSVM format.
    :param _mono_constraints: User-supplied monotonicity constraints.
    :param _xgb_params: XGBoost hyperparameters.
    :param _ntree: Number of trees in XGBoost model, default 250.
    :param _early_stopping_rounds: XGBoost early stopping rounds, default 100.
    :param _verbose: Whether to display training iterations, default True.
    :param _seed: Random seed for better interpretability, defaults to global SEED.
    :return: Trained XGBoost model.

    """

    if _mono_constraints is not None:
        _xgb_params['monotone_constraints'] = _mono_constraints

    # must train on AUC
    xgb_params['eval_metric'] = 'auc'
        
    print('Training with parameters:')
    print(json.dumps(_xgb_params, indent=2))        
        
    watchlist = [(_dtrain, 'train'), (_dvalid, 'eval')]

    # train
    model = xgb.train(_xgb_params,
                      _dtrain,
                      _ntree,
                      early_stopping_rounds=_early_stopping_rounds,
                      evals=watchlist,
                      verbose_eval=_verbose)

    return model


def random_grid_train(_dtrain, _dvalid, _mono_constraints=None, _xgb_params=None, 
                      _cv_params=None, _n_models=None, _ntree=None, 
                      _early_stopping_rounds=None, _verbose=None,
                      _seed=None):
    
    """ Performs a random grid search over _n_models and _cv_params.

    :param _dtrain: Training data in LightSVM format.
    :param _dvalid: Validation data in LightSVM format.
    :param _mono_constraints: User-supplied monotonicity constraints.
    :param _xgb_params: XGBoost hyperparameters.
    :param _cv_params: Dictionary of lists of potential XGBoost parameters over which to search.
    :param _n_models: Number of random models to evaluate.
    :param _ntree: Number of trees in XGBoost model, default 250.
    :param _early_stopping_rounds: XGBoost early stopping rounds, default 100.
    :param _verbose: Whether to display training iterations, default True.
    :param _seed: Random seed for better interpretability, defaults to global SEED.
    :return: tuple of (best candidate model from random grid search, entire grid of models)

    """

    print('Starting random grid search over %d models.' % int(_n_models))

    # cartesian product of _cv_params
    keys, values = zip(*_cv_params.items())
    experiments = [dict(zip(keys, v)) for v in itertools.product(*values)]

    # select randomly from cartesian product space
    selected_experiments = np.random.choice(len(experiments), _n_models)

    # pull in global params for objective, monotonicity etc.
    _params: dict = _xgb_params

    # init grid search loop conditional on eval_metric
    best_candidate = None
    best_score = 0

    # full dict of grid candidates
    candidates = {}
    
    # grid search loop
    for i, exp in enumerate(selected_experiments):

        _params.update(experiments[exp])  # override global params with current grid run params

        print('Grid search run %d/%d.' % (int(i + 1), int(_n_models)))

        # train on current params
        candidate = _train(_dtrain, _dvalid, _mono_constraints=_mono_constraints, _ntree=_ntree,
                           _xgb_params=xgb_params, _early_stopping_rounds=_early_stopping_rounds,
                           _verbose=_verbose, _seed=_seed)

        candidates[i] = {'model': candidate,  'score': candidate.best_score}
        
        if candidate.best_score > best_score:
            best_candidate = candidate
            best_score = candidate.best_score
            print('Grid search new best score discovered at iteration %d/%d: %.4f.' %
                     (int(i + 1), int(_n_models), candidate.best_score))

        print()
        print('----------- ------------')
        print()
    
    return best_candidate, candidates


#### Train random grid search 

In [14]:
# default params
xgb_params = {'booster': 'gbtree',
              'colsample_bytree': 0.6,
              'eta': 0.001,
              'max_depth': 5,
              'objective': 'binary:logistic',
              'reg_alpha': 0.005,
              'reg_lambda': 0.005,
              'seed': SEED,
              'subsample': 0.6}

gs_params = {'colsample_bytree': [0.3, 0.5, 0.7],
             'eta': [0.005, 0.05, 0.3],
             'max_depth': [3, 5, 7],
             'reg_alpha': [0.0005, 0.005, 0.05],
             'reg_lambda': [0.0005, 0.005, 0.05],
             'subsample': [0.3, 0.5, 0.7]}

# grid search prelims 
xgb_params['nthread'] = 16
train_mean_y = float(train[target].mean())
xgb_params['base_score'] = train_mean_y  # mean of y

# +1 positive correlation to target
# 0 no correlation to target
# -1 negative correlation to target
mono_constraints = {'LIMIT_BAL': -1,
                    'PAY_0': 1,
                    'PAY_2': 1,
                    'PAY_3': 1,
                    'PAY_4': 1,
                    'PAY_5': 1,
                    'PAY_6': 1,
                    'BILL_AMT1': -1,
                    'BILL_AMT2': -1,
                    'BILL_AMT3': -1,
                    'BILL_AMT4': -1,
                    'BILL_AMT5': -1,
                    'BILL_AMT6': -1,
                    'PAY_AMT1': -1,
                    'PAY_AMT2': -1,
                    'PAY_AMT3': -1,
                    'PAY_AMT4': -1,
                    'PAY_AMT5': -1,
                    'PAY_AMT6': -1}

n_gs_models = 50
ntree = 1000
early_stopping_rounds = 50
verbose = False

# train
best_xgb, grid = random_grid_train(training_frame, validation_frame, _mono_constraints=mono_constraints, 
                                   _xgb_params=xgb_params, _cv_params=gs_params, _n_models=n_gs_models,
                                   _ntree=ntree, _early_stopping_rounds=early_stopping_rounds, 
                                   _verbose=verbose, _seed=SEED)


Starting random grid search over 50 models.
Grid search run 1/50.
Training with parameters:
{
  "booster": "gbtree",
  "colsample_bytree": 0.7,
  "eta": 0.05,
  "max_depth": 5,
  "objective": "binary:logistic",
  "reg_alpha": 0.005,
  "reg_lambda": 0.0005,
  "seed": 12345,
  "subsample": 0.3,
  "nthread": 16,
  "base_score": 0.22029218334253728,
  "monotone_constraints": {
    "LIMIT_BAL": -1,
    "PAY_0": 1,
    "PAY_2": 1,
    "PAY_3": 1,
    "PAY_4": 1,
    "PAY_5": 1,
    "PAY_6": 1,
    "BILL_AMT1": -1,
    "BILL_AMT2": -1,
    "BILL_AMT3": -1,
    "BILL_AMT4": -1,
    "BILL_AMT5": -1,
    "BILL_AMT6": -1,
    "PAY_AMT1": -1,
    "PAY_AMT2": -1,
    "PAY_AMT3": -1,
    "PAY_AMT4": -1,
    "PAY_AMT5": -1,
    "PAY_AMT6": -1
  },
  "eval_metric": "auc"
}
Grid search new best score discovered at iteration 1/50: 0.7781.

----------- ------------

Grid search run 2/50.
Training with parameters:
{
  "booster": "gbtree",
  "colsample_bytree": 0.7,
  "eta": 0.005,
  "max_depth": 7,
  "obj

## Overall Rank after Grid Search based on AUC

In [15]:
models_rank = pd.DataFrame().from_dict(grid, orient='index')
models_rank.sort_values(by='score', ascending=False, inplace=True)
models_rank.reset_index(inplace=True)
models_rank

# index is index in grid dictionary
# "model 0 " = grid[35]

Unnamed: 0,index,model,score
0,35,<xgboost.core.Booster object at 0x7fc2e575f610>,0.779246
1,12,<xgboost.core.Booster object at 0x7fc2e5fde990>,0.778798
2,2,<xgboost.core.Booster object at 0x7fc2e5fdad90>,0.778757
3,24,<xgboost.core.Booster object at 0x7fc2e56f3890>,0.778717
4,49,<xgboost.core.Booster object at 0x7fc2e5fdac10>,0.778612
5,43,<xgboost.core.Booster object at 0x7fc2e575fd10>,0.778604
6,3,<xgboost.core.Booster object at 0x7fc2e575fb90>,0.778585
7,13,<xgboost.core.Booster object at 0x7fc2e5fdeed0>,0.778562
8,1,<xgboost.core.Booster object at 0x7fc2e5fdef10>,0.778538
9,10,<xgboost.core.Booster object at 0x7fc2e56e6490>,0.778391


## Cross-Validation(CV) Rank Model Selection

In [16]:
scores_frame = pd.DataFrame(columns=['fold', target] + ['grid_' + str(i) for i in range(0, len(grid))])
scores_frame['fold'] = np.random.choice(5, valid.shape[0])
scores_frame[target] = valid[target].reset_index(drop=True) 

for i in range(0, len(grid)):
    scores_frame['grid_' + str(i)] = grid[i]['model'].predict(validation_frame, iteration_range=(0, grid[i]['model'].best_iteration))
    
scores_frame

Unnamed: 0,fold,DELINQ_NEXT,grid_0,grid_1,grid_2,grid_3,grid_4,grid_5,grid_6,grid_7,grid_8,grid_9,grid_10,grid_11,grid_12,grid_13,grid_14,grid_15,grid_16,grid_17,grid_18,grid_19,grid_20,grid_21,grid_22,grid_23,grid_24,grid_25,grid_26,grid_27,grid_28,grid_29,grid_30,grid_31,grid_32,grid_33,grid_34,grid_35,grid_36,grid_37,grid_38,grid_39,grid_40,grid_41,grid_42,grid_43,grid_44,grid_45,grid_46,grid_47,grid_48,grid_49
0,1,0,0.221286,0.208298,0.217799,0.219338,0.206882,0.220918,0.223615,0.222002,0.205431,0.222171,0.204398,0.215501,0.214513,0.208342,0.216655,0.225181,0.222008,0.197121,0.212810,0.221339,0.202247,0.225816,0.239688,0.191390,0.210892,0.204824,0.217636,0.199058,0.221288,0.203353,0.202247,0.221706,0.216862,0.203323,0.202581,0.211211,0.212053,0.222010,0.223613,0.202154,0.222550,0.216655,0.204824,0.219209,0.223203,0.221077,0.200450,0.190231,0.221749,0.216441
1,4,0,0.077636,0.073772,0.056054,0.067098,0.058368,0.170727,0.215132,0.215671,0.068296,0.196531,0.075092,0.075746,0.064782,0.073579,0.088214,0.174997,0.215623,0.072462,0.094049,0.169439,0.059403,0.169611,0.064231,0.093251,0.063547,0.087981,0.063163,0.054343,0.077647,0.076187,0.059403,0.214552,0.174381,0.059833,0.069265,0.060835,0.074593,0.177467,0.215132,0.091235,0.213308,0.088214,0.087976,0.067392,0.052601,0.194652,0.072231,0.066836,0.216111,0.106648
2,0,1,0.406944,0.431426,0.434341,0.437330,0.448536,0.284362,0.226468,0.225673,0.432211,0.232619,0.456742,0.432793,0.418810,0.431587,0.475254,0.278141,0.225613,0.413683,0.417919,0.284539,0.421660,0.279563,0.465336,0.420028,0.430973,0.415757,0.427744,0.399570,0.406956,0.426181,0.421660,0.227728,0.277449,0.422064,0.451932,0.428331,0.411904,0.248119,0.226467,0.419707,0.228090,0.475254,0.415764,0.438980,0.411743,0.236143,0.433100,0.417296,0.224778,0.389346
3,0,0,0.040661,0.034568,0.027171,0.031559,0.007832,0.154167,0.215132,0.213611,0.032905,0.196531,0.043770,0.005596,0.039803,0.035029,0.043473,0.166071,0.214319,0.059815,0.039699,0.155237,0.027988,0.163143,0.048702,0.039050,0.028543,0.054053,0.026823,0.027765,0.040688,0.016499,0.027988,0.213932,0.169084,0.030948,0.036690,0.025779,0.044255,0.176718,0.215132,0.058730,0.211661,0.043473,0.054037,0.031561,0.011327,0.190565,0.020672,0.033850,0.215524,0.043698
4,2,0,0.309868,0.324695,0.332560,0.333235,0.297807,0.247037,0.221846,0.222354,0.333370,0.224659,0.326723,0.326726,0.328874,0.324826,0.312709,0.241993,0.222354,0.296158,0.344369,0.247079,0.320216,0.242400,0.310329,0.298708,0.331873,0.304848,0.327833,0.325438,0.309877,0.335210,0.320216,0.222540,0.235593,0.314679,0.327842,0.335203,0.312245,0.240906,0.221845,0.306495,0.224318,0.312709,0.304851,0.332854,0.328171,0.232019,0.320167,0.325354,0.220699,0.338281
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5040,2,0,0.186945,0.180367,0.189749,0.199419,0.243228,0.217183,0.217785,0.219667,0.175712,0.200119,0.164001,0.236919,0.183062,0.180386,0.151602,0.206935,0.219234,0.208001,0.223314,0.217199,0.206014,0.208600,0.154258,0.204571,0.210215,0.181563,0.213037,0.237838,0.186948,0.200728,0.206014,0.219203,0.211900,0.203741,0.208719,0.197417,0.193508,0.198664,0.217785,0.170188,0.219723,0.151602,0.181565,0.200048,0.229623,0.207624,0.199828,0.179790,0.218568,0.203866
5041,2,0,0.173339,0.161411,0.154168,0.156192,0.140643,0.207456,0.219049,0.218779,0.153228,0.209937,0.167489,0.136511,0.152864,0.161688,0.148556,0.205109,0.218948,0.163112,0.172771,0.207725,0.165573,0.206002,0.204336,0.162788,0.154820,0.167036,0.152280,0.153156,0.173342,0.146797,0.165573,0.218717,0.206048,0.166885,0.155528,0.156444,0.164883,0.208921,0.219050,0.165352,0.218816,0.148556,0.167031,0.155164,0.149390,0.212636,0.166154,0.156890,0.218260,0.192109
5042,1,1,0.801150,0.800072,0.815356,0.817892,0.888723,0.416857,0.242196,0.241190,0.820270,0.331959,0.839424,0.880192,0.804500,0.800086,0.830861,0.418247,0.241025,0.833151,0.848454,0.416624,0.863921,0.418070,0.831435,0.846099,0.833838,0.800167,0.828767,0.865696,0.801160,0.836198,0.863921,0.242052,0.419350,0.861856,0.833500,0.824707,0.817480,0.392248,0.242192,0.791377,0.246163,0.830861,0.800174,0.818735,0.876255,0.326370,0.818963,0.816505,0.240034,0.823890
5043,3,0,0.214329,0.215937,0.220764,0.229855,0.217794,0.212693,0.219412,0.219569,0.219879,0.216300,0.222394,0.223151,0.219256,0.215914,0.227699,0.214807,0.219255,0.219475,0.197215,0.212701,0.217300,0.215447,0.182706,0.209647,0.220017,0.217051,0.217141,0.221483,0.214334,0.211717,0.217300,0.219646,0.208452,0.218146,0.208467,0.204676,0.214402,0.217782,0.219412,0.214952,0.218850,0.227699,0.217049,0.229344,0.216311,0.218112,0.206691,0.213261,0.219436,0.201225


#### Utiltity Function for max. acc.

In [17]:
def max_acc(y, phat, res=0.01): 

    """ Utility function for finding max. accuracy at some cutoff. 
    
        :param y: Known y values.
        :param phat: Model scores.
        :param res: Resolution over which to search for max. accuracy, default 0.01.
        :return: Max. accuracy for model scores.
    
    """
    
    # init frame to store acc at different cutoffs
    acc_frame = pd.DataFrame(columns=['cut', 'acc'])
    
    # copy known y and score values into a temporary frame
    temp_df = pd.concat([y, phat], axis=1)
    
    # find accuracy at different cutoffs and store in acc_frame
    for cut in np.arange(0, 1 + res, res):
        temp_df['decision'] = np.where(temp_df.iloc[:, 1] > cut, 1, 0)
        acc = accuracy_score(temp_df.iloc[:, 0], temp_df['decision'])
        acc_frame = acc_frame.append({'cut': cut,
                                      'acc': acc},
                                     ignore_index=True)

    # find max accurcay across all cutoffs
    max_acc = acc_frame['acc'].max()
    
    # house keeping
    del acc_frame, temp_df
    
    return max_acc

#### Utility function for max. F1

In [18]:
def max_f1(y, phat, res=0.01): 
    
    """ Utility function for finding max. F1 at some cutoff. 
    
        :param y: Known y values.
        :param phat: Model scores.
        :param res: Resolution over which to search for max. F1, default 0.01.
        :return: Max. F1 for model scores.
    
    """
    
    # init frame to store f1 at different cutoffs
    f1_frame = pd.DataFrame(columns=['cut', 'f1'])
    
    # copy known y and score values into a temporary frame
    temp_df = pd.concat([y, phat], axis=1)
    
    # find f1 at different cutoffs and store in acc_frame
    for cut in np.arange(0, 1 + res, res):
        temp_df['decision'] = np.where(temp_df.iloc[:, 1] > cut, 1, 0)
        f1 = f1_score(temp_df.iloc[:, 0], temp_df['decision'])
        f1_frame = f1_frame.append({'cut': cut,
                                    'f1': f1},
                                    ignore_index=True)
        
    # find max f1 across all cutoffs
    max_f1 = f1_frame['f1'].max()
    
     # house keeping
    del f1_frame, temp_df
    
    return max_f1

#### Rank all scores

In [19]:
eval_frame = pd.DataFrame() # init frame to hold score ranking
metric_list = ['acc', 'auc', 'f1', 'logloss', 'mse'] # metric to use for evaluation

# create eval frame row-by-row
for fold in sorted(scores_frame['fold'].unique()): # loop through folds 
    for metric_name in metric_list: # loop through metrics
        
        # init row dict to hold each rows values
        row_dict = {'fold': fold,
                    'metric': metric_name}
        
        # cache known y values for fold
        fold_y = scores_frame.loc[scores_frame['fold'] == fold, target]
        
        for col_name in scores_frame.columns[2:]:
            
            # cache fold scores
            fold_scores = scores_frame.loc[scores_frame['fold'] == fold, col_name]
            
            # calculate evaluation metric for fold
            # with reasonable precision 
            
            if metric_name == 'acc':
                row_dict[col_name] = np.round(max_acc(fold_y, fold_scores), ROUND)
                
            if metric_name == 'auc':
                row_dict[col_name] = np.round(roc_auc_score(fold_y, fold_scores), ROUND)
                
            if metric_name == 'f1':
                row_dict[col_name] = np.round(max_f1(fold_y, fold_scores), ROUND) 
                
            if metric_name == 'logloss':
                row_dict[col_name] = np.round(log_loss(fold_y, fold_scores), ROUND)
                
            if metric_name == 'mse':
                row_dict[col_name] = np.round(mean_squared_error(fold_y, fold_scores), ROUND)
        
        # append row values to eval_frame
        eval_frame = eval_frame.append(row_dict, ignore_index=True)

# init a temporary frame to hold rank information
rank_names = [name + '_rank' for name in eval_frame.columns if name not in ['fold', 'metric']]
rank_frame = pd.DataFrame(columns=rank_names)        

# set columns to necessary order
eval_frame = eval_frame[['fold', 'metric'] + [name for name in sorted(eval_frame.columns) if name not in ['fold', 'metric']]]

# determine score ranks row-by-row
for i in range(0, eval_frame.shape[0]):
        
        # get ranks for row based on metric
        metric_name = eval_frame.loc[i, 'metric']
        if metric_name in ['logloss', 'mse']:
            ranks = eval_frame.iloc[i, 2:].rank().values
        else:
            ranks = eval_frame.iloc[i, 2:].rank(ascending=False).values
        
        # create single-row frame and append to rank_frame
        row_frame = pd.DataFrame(ranks.reshape(1, ranks.shape[0]), columns=rank_names)
        rank_frame = rank_frame.append(row_frame, ignore_index=True)
        
        # house keeping
        del row_frame

# merge ranks onto eval_frame
eval_frame = pd.concat([eval_frame, rank_frame], axis=1)

# house keeping
del rank_frame
        
eval_frame

Unnamed: 0,fold,metric,grid_0,grid_1,grid_10,grid_11,grid_12,grid_13,grid_14,grid_15,grid_16,grid_17,grid_18,grid_19,grid_2,grid_20,grid_21,grid_22,grid_23,grid_24,grid_25,grid_26,grid_27,grid_28,grid_29,grid_3,grid_30,grid_31,grid_32,grid_33,grid_34,grid_35,grid_36,grid_37,grid_38,grid_39,grid_4,grid_40,grid_41,grid_42,grid_43,grid_44,grid_45,grid_46,grid_47,grid_48,grid_49,grid_5,grid_6,grid_7,grid_8,grid_9,grid_0_rank,grid_1_rank,grid_2_rank,grid_3_rank,grid_4_rank,grid_5_rank,grid_6_rank,grid_7_rank,grid_8_rank,grid_9_rank,grid_10_rank,grid_11_rank,grid_12_rank,grid_13_rank,grid_14_rank,grid_15_rank,grid_16_rank,grid_17_rank,grid_18_rank,grid_19_rank,grid_20_rank,grid_21_rank,grid_22_rank,grid_23_rank,grid_24_rank,grid_25_rank,grid_26_rank,grid_27_rank,grid_28_rank,grid_29_rank,grid_30_rank,grid_31_rank,grid_32_rank,grid_33_rank,grid_34_rank,grid_35_rank,grid_36_rank,grid_37_rank,grid_38_rank,grid_39_rank,grid_40_rank,grid_41_rank,grid_42_rank,grid_43_rank,grid_44_rank,grid_45_rank,grid_46_rank,grid_47_rank,grid_48_rank,grid_49_rank
0,0.0,acc,0.835,0.832,0.833,0.828,0.832,0.832,0.832,0.819,0.82,0.828,0.827,0.824,0.832,0.826,0.82,0.829,0.83,0.83,0.833,0.832,0.823,0.835,0.828,0.83,0.826,0.818,0.821,0.825,0.834,0.829,0.833,0.827,0.819,0.83,0.825,0.817,0.832,0.833,0.83,0.827,0.829,0.828,0.832,0.821,0.825,0.824,0.819,0.821,0.832,0.826,1.5,12.0,5.5,26.5,12.0,12.0,12.0,47.0,44.5,26.5,30.0,38.5,12.0,33.0,44.5,23.0,19.0,19.0,5.5,12.0,40.0,1.5,26.5,19.0,33.0,49.0,42.0,36.0,3.0,23.0,5.5,30.0,47.0,19.0,36.0,50.0,12.0,5.5,19.0,30.0,23.0,26.5,12.0,42.0,36.0,38.5,47.0,42.0,12.0,33.0
1,0.0,auc,0.794,0.795,0.795,0.793,0.797,0.795,0.79,0.79,0.794,0.791,0.796,0.79,0.796,0.794,0.791,0.794,0.793,0.795,0.794,0.792,0.797,0.794,0.791,0.795,0.794,0.79,0.789,0.794,0.794,0.795,0.793,0.793,0.786,0.793,0.792,0.792,0.79,0.794,0.795,0.794,0.794,0.793,0.794,0.787,0.795,0.79,0.786,0.794,0.795,0.787,20.5,9.0,9.0,30.5,1.5,9.0,42.5,42.5,20.5,38.0,3.5,42.5,3.5,20.5,38.0,20.5,30.5,9.0,20.5,35.0,1.5,20.5,38.0,9.0,20.5,42.5,46.0,20.5,20.5,9.0,30.5,30.5,49.5,30.5,35.0,35.0,42.5,20.5,9.0,20.5,20.5,30.5,20.5,47.5,9.0,42.5,49.5,20.5,9.0,47.5
2,0.0,f1,0.581,0.576,0.58,0.575,0.581,0.576,0.573,0.571,0.573,0.578,0.578,0.567,0.58,0.585,0.567,0.583,0.579,0.583,0.577,0.573,0.587,0.581,0.57,0.575,0.585,0.554,0.571,0.585,0.577,0.578,0.575,0.573,0.568,0.577,0.577,0.554,0.573,0.577,0.575,0.585,0.572,0.579,0.573,0.574,0.584,0.567,0.568,0.563,0.574,0.567,10.0,24.5,12.5,27.5,10.0,24.5,34.5,39.5,34.5,17.0,17.0,45.5,12.5,3.5,45.5,7.5,14.5,7.5,21.0,34.5,1.0,10.0,41.0,27.5,3.5,49.5,39.5,3.5,21.0,17.0,27.5,34.5,42.5,21.0,21.0,49.5,34.5,21.0,27.5,3.5,38.0,14.5,34.5,30.5,6.0,45.5,42.5,48.0,30.5,45.5
3,0.0,logloss,0.424,0.424,0.423,0.43,0.423,0.424,0.426,0.48,0.527,0.427,0.424,0.479,0.424,0.426,0.48,0.426,0.426,0.424,0.424,0.425,0.427,0.424,0.427,0.424,0.426,0.527,0.482,0.426,0.424,0.424,0.424,0.482,0.527,0.424,0.431,0.525,0.426,0.424,0.424,0.43,0.499,0.427,0.424,0.528,0.425,0.479,0.527,0.527,0.423,0.5,11.5,11.5,2.0,33.5,2.0,11.5,25.0,38.5,47.0,30.5,11.5,36.5,11.5,25.0,38.5,25.0,25.0,11.5,11.5,20.5,30.5,11.5,30.5,11.5,25.0,47.0,40.5,25.0,11.5,11.5,11.5,40.5,47.0,11.5,35.0,44.0,25.0,11.5,11.5,33.5,42.0,30.5,11.5,50.0,20.5,36.5,47.0,47.0,2.0,43.0
4,0.0,mse,0.131,0.131,0.13,0.131,0.131,0.131,0.132,0.154,0.172,0.133,0.131,0.154,0.131,0.132,0.154,0.132,0.132,0.131,0.132,0.131,0.132,0.131,0.132,0.131,0.132,0.172,0.155,0.132,0.131,0.131,0.131,0.155,0.172,0.132,0.133,0.172,0.132,0.132,0.131,0.132,0.162,0.132,0.131,0.173,0.132,0.154,0.172,0.172,0.131,0.162,10.0,10.0,1.0,10.0,10.0,10.0,26.0,37.5,46.5,34.5,10.0,37.5,10.0,26.0,37.5,26.0,26.0,10.0,26.0,10.0,26.0,10.0,26.0,10.0,26.0,46.5,40.5,26.0,10.0,10.0,10.0,40.5,46.5,26.0,34.5,46.5,26.0,26.0,10.0,26.0,42.5,26.0,10.0,50.0,26.0,37.5,46.5,46.5,10.0,42.5
5,1.0,acc,0.794,0.794,0.796,0.787,0.795,0.794,0.795,0.792,0.793,0.792,0.791,0.784,0.795,0.794,0.791,0.794,0.791,0.793,0.794,0.791,0.79,0.794,0.794,0.792,0.794,0.783,0.784,0.793,0.792,0.795,0.793,0.793,0.79,0.792,0.787,0.788,0.795,0.794,0.793,0.791,0.793,0.794,0.793,0.788,0.792,0.785,0.79,0.795,0.796,0.79,14.0,14.0,1.5,45.5,5.5,14.0,5.5,30.5,23.5,30.5,36.0,48.5,5.5,14.0,36.0,14.0,36.0,23.5,14.0,36.0,40.5,14.0,14.0,30.5,14.0,50.0,48.5,23.5,30.5,5.5,23.5,23.5,40.5,30.5,45.5,43.5,5.5,14.0,23.5,36.0,23.5,14.0,23.5,43.5,30.5,47.0,40.5,5.5,1.5,40.5
6,1.0,auc,0.754,0.756,0.753,0.752,0.755,0.756,0.754,0.747,0.747,0.752,0.753,0.743,0.755,0.752,0.748,0.749,0.748,0.754,0.752,0.753,0.752,0.754,0.753,0.753,0.752,0.746,0.741,0.752,0.753,0.755,0.753,0.748,0.734,0.752,0.747,0.747,0.754,0.752,0.753,0.751,0.749,0.751,0.754,0.743,0.752,0.744,0.734,0.748,0.755,0.74,9.5,1.5,16.5,25.5,4.5,1.5,9.5,40.5,40.5,25.5,16.5,45.5,4.5,25.5,36.5,33.5,36.5,9.5,25.5,16.5,25.5,9.5,16.5,16.5,25.5,43.0,47.0,25.5,16.5,4.5,16.5,36.5,49.5,25.5,40.5,40.5,9.5,25.5,16.5,31.5,33.5,31.5,9.5,45.5,25.5,44.0,49.5,36.5,4.5,48.0
7,1.0,f1,0.545,0.55,0.547,0.558,0.545,0.548,0.558,0.537,0.524,0.543,0.556,0.525,0.545,0.544,0.533,0.543,0.546,0.544,0.547,0.546,0.543,0.545,0.534,0.543,0.544,0.527,0.53,0.542,0.542,0.538,0.549,0.539,0.526,0.545,0.538,0.53,0.558,0.547,0.544,0.542,0.536,0.545,0.539,0.521,0.545,0.525,0.526,0.528,0.539,0.536,16.0,5.0,9.0,2.0,16.0,7.0,2.0,36.0,49.0,25.5,4.0,47.5,16.0,21.5,40.0,25.5,11.5,21.5,9.0,11.5,25.5,16.0,39.0,25.5,21.5,44.0,41.5,29.0,29.0,34.5,6.0,32.0,45.5,16.0,34.5,41.5,2.0,9.0,21.5,29.0,37.5,16.0,32.0,50.0,16.0,47.5,45.5,43.0,32.0,37.5
8,1.0,logloss,0.482,0.483,0.486,0.499,0.484,0.483,0.484,0.517,0.558,0.484,0.485,0.517,0.486,0.489,0.516,0.488,0.487,0.486,0.482,0.487,0.491,0.482,0.489,0.485,0.489,0.557,0.519,0.488,0.487,0.486,0.484,0.52,0.557,0.481,0.497,0.556,0.484,0.482,0.485,0.496,0.534,0.489,0.485,0.558,0.484,0.517,0.557,0.557,0.485,0.534,3.5,6.5,20.5,35.0,10.5,6.5,10.5,38.0,49.5,10.5,16.0,38.0,20.5,29.5,36.0,26.5,24.0,20.5,3.5,24.0,32.0,3.5,29.5,16.0,29.5,46.5,40.0,26.5,24.0,20.5,10.5,41.0,46.5,1.0,34.0,44.0,10.5,3.5,16.0,33.0,42.5,29.5,16.0,49.5,10.5,38.0,46.5,46.5,16.0,42.5
9,1.0,mse,0.154,0.154,0.155,0.158,0.154,0.154,0.154,0.17,0.186,0.155,0.155,0.17,0.154,0.156,0.17,0.156,0.156,0.155,0.154,0.155,0.156,0.154,0.156,0.155,0.156,0.186,0.171,0.156,0.155,0.154,0.155,0.171,0.186,0.154,0.158,0.185,0.154,0.154,0.155,0.157,0.177,0.156,0.155,0.186,0.155,0.17,0.186,0.186,0.155,0.177,6.5,6.5,18.5,34.5,6.5,6.5,6.5,37.5,47.5,18.5,18.5,37.5,6.5,28.5,37.5,28.5,28.5,18.5,6.5,18.5,28.5,6.5,28.5,18.5,28.5,47.5,40.5,28.5,18.5,6.5,18.5,40.5,47.5,6.5,34.5,44.0,6.5,6.5,18.5,33.0,42.5,28.5,18.5,47.5,18.5,37.5,47.5,47.5,18.5,42.5


#### Display simple ranked list

In [20]:
rank_frame = eval_frame[[name for name in eval_frame.columns if name.endswith('rank')]].mean().sort_values()
rank_frame 

grid_2_rank     10.32
grid_5_rank     10.52
grid_1_rank     10.68
grid_4_rank     11.28
grid_12_rank    12.60
grid_0_rank     12.98
grid_21_rank    12.98
grid_48_rank    13.58
grid_30_rank    13.86
grid_29_rank    14.56
grid_17_rank    14.68
grid_18_rank    15.16
grid_37_rank    15.16
grid_33_rank    16.20
grid_42_rank    16.22
grid_38_rank    16.52
grid_23_rank    16.82
grid_36_rank    17.98
grid_6_rank     17.98
grid_9_rank     19.04
grid_19_rank    20.06
grid_28_rank    20.18
grid_10_rank    20.90
grid_15_rank    22.40
grid_41_rank    22.46
grid_44_rank    22.48
grid_22_rank    23.28
grid_24_rank    23.88
grid_13_rank    23.88
grid_27_rank    23.90
grid_16_rank    24.84
grid_3_rank     27.56
grid_20_rank    27.68
grid_39_rank    30.08
grid_31_rank    34.14
grid_40_rank    35.58
grid_34_rank    35.90
grid_49_rank    38.48
grid_14_rank    38.96
grid_7_rank     39.40
grid_45_rank    40.38
grid_11_rank    40.66
grid_47_rank    40.68
grid_8_rank     41.72
grid_35_rank    41.92
grid_26_ra

## Test correlation with grid search

#### Join grid search rank and CV rank

In [21]:
corr_frame = pd.DataFrame(columns=['grid_search', 'cv_rank'])
corr_frame['grid_search'] = list(range(0, N_MODELS))

rank_list = []
for name in rank_frame.index:
    if len(name) == 11:
        rank_list.append(int(name[5]))
    else: 
        rank_list.append(int(name[5:7]))

corr_frame['cv_rank'] = rank_list
corr_frame

Unnamed: 0,grid_search,cv_rank
0,0,2
1,1,5
2,2,1
3,3,4
4,4,12
5,5,0
6,6,21
7,7,48
8,8,30
9,9,29


#### Display correlation between grid search rank and CV rank

In [22]:
corr_frame.corr()

Unnamed: 0,grid_search,cv_rank
grid_search,1.0,0.351261
cv_rank,0.351261,1.0


#### Best model grid search params

In [23]:
json.loads(grid[35]['model'].save_config())

{'learner': {'generic_param': {'fail_on_invalid_gpu_id': '0',
   'gpu_id': '-1',
   'n_jobs': '16',
   'nthread': '16',
   'random_state': '12345',
   'seed': '12345',
   'seed_per_iteration': '0',
   'validate_parameters': '1'},
  'gradient_booster': {'gbtree_model_param': {'num_parallel_tree': '1',
    'num_trees': '220',
    'size_leaf_vector': '0'},
   'gbtree_train_param': {'predictor': 'auto',
    'process_type': 'default',
    'tree_method': 'exact',
    'updater': 'grow_colmaker,prune',
    'updater_seq': 'grow_colmaker,prune'},
   'name': 'gbtree',
   'specified_updater': False,
   'updater': {'grow_colmaker': {'colmaker_train_param': {'default_direction': 'learn',
      'opt_dense_col': '1'},
     'train_param': {'alpha': '0.000500000024',
      'cache_opt': '1',
      'colsample_bylevel': '1',
      'colsample_bynode': '1',
      'colsample_bytree': '0.5',
      'eta': '0.0500000007',
      'gamma': '0',
      'grow_policy': 'depthwise',
      'interaction_constraints': '',


#### Best model CV rank params

In [24]:
json.loads(grid[2]['model'].save_config())

{'learner': {'generic_param': {'fail_on_invalid_gpu_id': '0',
   'gpu_id': '-1',
   'n_jobs': '16',
   'nthread': '16',
   'random_state': '12345',
   'seed': '12345',
   'seed_per_iteration': '0',
   'validate_parameters': '1'},
  'gradient_booster': {'gbtree_model_param': {'num_parallel_tree': '1',
    'num_trees': '178',
    'size_leaf_vector': '0'},
   'gbtree_train_param': {'predictor': 'auto',
    'process_type': 'default',
    'tree_method': 'exact',
    'updater': 'grow_colmaker,prune',
    'updater_seq': 'grow_colmaker,prune'},
   'name': 'gbtree',
   'specified_updater': False,
   'updater': {'grow_colmaker': {'colmaker_train_param': {'default_direction': 'learn',
      'opt_dense_col': '1'},
     'train_param': {'alpha': '0.000500000024',
      'cache_opt': '1',
      'colsample_bylevel': '1',
      'colsample_bynode': '1',
      'colsample_bytree': '0.699999988',
      'eta': '0.0500000007',
      'gamma': '0',
      'grow_policy': 'depthwise',
      'interaction_constraint

### Estimate business value 

In [25]:
def get_confusion_matrix(valid, y_name, yhat_name, by=None, level=None, cutoff=0.5):

    """ Creates confusion matrix from pandas DataFrame of y and yhat values, can be sliced
        by a variable and level.
        :param valid: Validation DataFrame of actual (y) and predicted (yhat) values.
        :param y_name: Name of actual value column.
        :param yhat_name: Name of predicted value column.
        :param by: By variable to slice frame before creating confusion matrix, default None.
        :param level: Value of by variable to slice frame before creating confusion matrix, default None.
        :param cutoff: Cutoff threshold for confusion matrix, default 0.5.
        :return: Confusion matrix as pandas DataFrame.
    """

    # determine levels of target (y) variable
    # sort for consistency
    level_list = list(valid[y_name].unique())
    level_list.sort(reverse=True)

    # init confusion matrix
    cm_frame = pd.DataFrame(columns=['actual: ' + str(i) for i in level_list],
                            index=['predicted: ' + str(i) for i in level_list])

    # don't destroy original data
    frame_ = valid.copy(deep=True)

    # convert numeric predictions to binary decisions using cutoff
    dname = 'd_' + str(y_name)
    frame_[dname] = np.where(frame_[yhat_name] > cutoff, 1, 0)

    # slice frame
    if (by is not None) & (level is not None):
        frame_ = frame_[valid[by] == level]

    # calculate size of each confusion matrix value
    for i, lev_i in enumerate(level_list):
        for j, lev_j in enumerate(level_list):
            cm_frame.iat[j, i] = frame_[(frame_[y_name] == lev_i) & (frame_[dname] == lev_j)].shape[0]
            # i, j vs. j, i nasty little bug ... updated 8/30/19

    return cm_frame

In [26]:
valid.reset_index(inplace=True, drop=True)
valid['p1'] = scores_frame['grid_2'].copy(deep=True)
valid

Unnamed: 0,ID,LIMIT_BAL,SEX,RACE,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,PAY_6,BILL_AMT1,BILL_AMT2,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6,DELINQ_NEXT,p1
0,9,140000,2,2.0,3,1,28,0,0,2,0,0,0,11285,14096,12108,12211,11793,3719,3329,0,432,1000,1000,1000,0,0.217799
1,12,260000,2,3.0,1,2,51,-1,-1,-1,-1,-1,2,12261,21670,9966,8517,22287,13668,21818,9966,8583,22301,0,3640,0,0.056054
2,17,20000,1,2.0,1,2,24,0,0,2,2,2,2,15376,18010,17428,18338,17905,19104,3200,0,1500,0,1650,0,1,0.434341
3,18,320000,1,3.0,1,1,49,0,0,0,-1,-1,-1,253286,246536,194663,70074,5856,195599,10358,10000,75940,20000,195599,50000,0,0.027171
4,19,360000,2,3.0,1,1,49,1,-2,-2,-2,-2,-2,0,0,0,0,0,0,0,0,0,0,0,0,0,0.332560
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5040,29961,10000,1,3.0,2,1,29,0,0,0,0,0,0,9406,9968,9385,5163,780,0,3009,2000,2009,0,0,0,0,0.189749
5041,29962,260000,1,4.0,1,2,33,-2,-2,-2,-2,-2,-2,0,263,0,1368,101,955,263,0,1368,101,955,0,0,0.154168
5042,29977,40000,1,2.0,2,2,47,2,2,3,2,2,2,52358,54892,53415,51259,47151,46934,4000,0,2000,0,3520,0,1,0.815356
5043,29987,360000,1,1.0,1,2,35,-1,-1,-2,-2,-2,-2,2220,0,0,0,0,0,0,0,0,0,0,0,0,0.220764


In [27]:
cm18 = get_confusion_matrix(valid, target, 'p1', cutoff=0.18)
cm18

Unnamed: 0,actual: 1,actual: 0
predicted: 1,841,1282
predicted: 0,310,2612


In [28]:
TRUE_POSITIVE_AMOUNT    = 0       # revenue for rejecting a defaulting customer
TRUE_NEGATIVE_AMOUNT    = 23000   # revenue for accepting a paying customer, ~ customer LTV
FALSE_POSITIVE_AMOUNT   = -23000  # revenue for rejecting a paying customer, ~ -customer LTV 
FALSE_NEGATIVE_AMOUNT   = -85000 # revenue for accepting a defaulting customer, ~ -mean(LIMIT_BAL)

In [29]:
business_impact = cm18.iloc[0, 0]*TRUE_POSITIVE_AMOUNT +\
                  cm18.iloc[0, 1]*FALSE_POSITIVE_AMOUNT +\
                  cm18.iloc[1, 0]*FALSE_NEGATIVE_AMOUNT +\
                  cm18.iloc[1, 1]*TRUE_NEGATIVE_AMOUNT

print('Estimated business impact $%.2f' % business_impact)

Estimated business impact $4240000.00
