# Credit Card Lead Prediction
--------------
## Problem Statement

Happy Customer Bank is a mid-sized private bank that deals in all kinds of banking products, such as savings accounts, current accounts, investment products, and credit products. The bank also cross-sells products to its existing customers through channels such as tele-calling, e-mail, and recommendations on net banking and mobile banking. In this case, Happy Customer Bank wants to cross-sell its credit cards to existing customers and has identified a set of customers who are eligible for them.

Now, the bank is looking for your help in identifying customers that could show higher intent towards a recommended credit card, given:
* Customer details (gender, age, region, etc.)
* Details of the customer's relationship with the bank (`Channel_Code`, `Vintage`, `Avg_Account_Balance`, etc.)

**Link:** https://datahack.analyticsvidhya.com/contest/job-a-thon-2/?utm_source=sendinblue&utm_campaign=JobAThon__Now_Live__Registrations__05282021&utm_medium=email#LeaderBoard

## Evaluation
The evaluation metric for this competition is `roc_auc_score` across all entries in the test set.
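A minimal sketch of what this metric measures, assuming the usual pairwise reading of ROC AUC: the fraction of (lead, non-lead) pairs in which the lead receives the higher score, with ties counted as half. The helper `pairwise_auc` and the toy inputs are hypothetical and only for illustration; in practice `roc_auc_score` from scikit-learn would be called on predicted probabilities rather than on hard class labels.

```python
def pairwise_auc(y_true, y_score):
    """Illustrative O(n^2) ROC AUC: share of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two non-leads, two leads
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = pairwise_auc(toy_labels, toy_scores)
print(toy_auc)  # 3 of the 4 pairs are ordered correctly -> 0.75
```

Because the metric depends only on how the scores rank customers, it is unaffected by the choice of a classification threshold.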

In [1]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

In [2]:
train_df = pd.read_csv('data/train_s3TEQDk.csv')
train_df.head()

,ID,Gender,Age,Region_Code,Occupation,Channel_Code,Vintage,Credit_Product,Avg_Account_Balance,Is_Active,Is_Lead
0,NNVBBKZB,Female,73,RG268,Other,X3,43,No,1045696,No,0
1,IDD62UNG,Female,30,RG277,Salaried,X1,32,No,581988,No,0
2,HD3DSEMC,Female,56,RG268,Self_Employed,X3,26,No,1484315,Yes,0
3,BF3NC7KV,Male,34,RG270,Salaried,X1,19,No,470454,No,0
4,TEASRWXV,Female,30,RG282,Salaried,X1,33,No,886787,No,0


In [3]:
train_df.shape

(245725, 11)

In [4]:
train_df_copy = train_df.copy()

In [5]:
train_df_copy.describe()

,Age,Vintage,Avg_Account_Balance,Is_Lead
count,245725.0,245725.0,245725.0,245725.0
mean,43.856307,46.959141,1128403.0,0.237208
std,14.828672,32.353136,852936.4,0.425372
min,23.0,7.0,20790.0,0.0
25%,30.0,20.0,604310.0,0.0
50%,43.0,32.0,894601.0,0.0
75%,54.0,73.0,1366666.0,0.0
max,85.0,135.0,10352010.0,1.0


In [6]:
train_df_copy['Gender'].value_counts()

Male      134197
Female    111528
Name: Gender, dtype: int64

In [7]:
train_df_copy['Region_Code'].value_counts()

RG268    35934
RG283    29416
RG254    26840
RG284    19320
RG277    12826
RG280    12775
RG269     7863
RG270     7720
RG261     7633
RG257     6101
RG251     5950
RG282     5829
RG274     5286
RG272     5252
RG281     5093
RG273     4497
RG252     4286
RG279     3976
RG263     3687
RG275     3245
RG260     3110
RG256     2847
RG264     2793
RG276     2764
RG259     2586
RG250     2496
RG255     2018
RG258     1951
RG253     1858
RG278     1822
RG262     1788
RG266     1578
RG265     1546
RG271     1542
RG267     1497
Name: Region_Code, dtype: int64

In [8]:
train_df_copy['Channel_Code'].value_counts()

X1    103718
X3     68712
X2     67726
X4      5569
Name: Channel_Code, dtype: int64

In [9]:
train_df_copy['Credit_Product'].value_counts()

No     144357
Yes     72043
Name: Credit_Product, dtype: int64

In [10]:
train_df_copy['Is_Active'].value_counts()

No     150290
Yes     95435
Name: Is_Active, dtype: int64

In [11]:
train_df_copy['Is_Lead'].value_counts()

0    187437
1     58288
Name: Is_Lead, dtype: int64

In [12]:
train_df_copy.dtypes

ID                     object
Gender                 object
Age                     int64
Region_Code            object
Occupation             object
Channel_Code           object
Vintage                 int64
Credit_Product         object
Avg_Account_Balance     int64
Is_Active              object
Is_Lead                 int64
dtype: object

In [13]:
train_df_copy.isnull().sum()

ID                         0
Gender                     0
Age                        0
Region_Code                0
Occupation                 0
Channel_Code               0
Vintage                    0
Credit_Product         29325
Avg_Account_Balance        0
Is_Active                  0
Is_Lead                    0
dtype: int64

In [14]:
train_df_copy['Credit_Product'] = train_df_copy['Credit_Product'].fillna(train_df_copy['Credit_Product'].mode()[0])

In [15]:
train_df_copy.isnull().sum()

ID                     0
Gender                 0
Age                    0
Region_Code            0
Occupation             0
Channel_Code           0
Vintage                0
Credit_Product         0
Avg_Account_Balance    0
Is_Active              0
Is_Lead                0
dtype: int64

In [16]:
train_df_copy['Credit_Product'].value_counts()

No     173682
Yes     72043
Name: Credit_Product, dtype: int64

In [34]:
X = train_df_copy.drop(['Is_Lead', 'ID'], axis=1)
y = train_df_copy['Is_Lead']

In [35]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)

In [36]:
categorical_cols = []
numerical_cols = []

for label, content in X_train.items():
    if pd.api.types.is_string_dtype(content):
        categorical_cols.append(label)
    else:
        numerical_cols.append(label)

In [37]:
categorical_cols

['Gender',
 'Region_Code',
 'Occupation',
 'Channel_Code',
 'Credit_Product',
 'Is_Active']

In [38]:
numerical_cols

['Age', 'Vintage', 'Avg_Account_Balance']

In [39]:
label_encoder = LabelEncoder()
minmax_scaler = MinMaxScaler()

for col in categorical_cols:
    label_encoder_fit = label_encoder.fit(X_train[col])
    X_train[col] = label_encoder_fit.transform(X_train[col])
    X_test[col] = label_encoder_fit.transform(X_test[col])

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  X_train[col] = label_encoder_fit.transform(X_train[col])
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  X_test[col] = label_encoder_fit.transform(X_test[col])
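The warning above appears because pandas treats the frames returned by `train_test_split` as possible slices of `X`. One common way to avoid it, sketched here on a hypothetical toy frame (the column values and the names `toy`, `X_tr`, `X_te` are made up for illustration), is to take explicit copies of the splits before assigning the encoded columns.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for the real feature frame
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "Age": [30, 45, 52, 23, 61, 38],
})

# shuffle=False keeps the example deterministic: first 4 rows train, last 2 test
X_tr, X_te = train_test_split(toy, test_size=2, shuffle=False)

# Explicit copies make each split an independent DataFrame
X_tr = X_tr.copy()
X_te = X_te.copy()

# Fit on the training split only, then apply the same mapping to both splits
encoder = LabelEncoder().fit(X_tr["Gender"])
X_tr["Gender"] = encoder.transform(X_tr["Gender"])
X_te["Gender"] = encoder.transform(X_te["Gender"])

print(X_tr["Gender"].tolist(), X_te["Gender"].tolist())
```

Note that `LabelEncoder.transform` raises an error for categories that were not present when it was fitted, so this pattern assumes every category in the test split also occurs in the training split.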


In [40]:
minmax_scaler_fit = minmax_scaler.fit(X_train[numerical_cols])
X_train[numerical_cols] = minmax_scaler_fit.transform(X_train[numerical_cols])



In [41]:
X_train

,Gender,Age,Region_Code,Occupation,Channel_Code,Vintage,Credit_Product,Avg_Account_Balance,Is_Active
138403,0,0.112903,25,2,2,0.101562,0,0.163097,0
117015,0,0.516129,29,3,0,0.234375,0,0.024960,0
322,0,0.161290,11,2,0,0.046875,0,0.081589,0
64910,0,0.403226,18,3,2,0.664062,1,0.114638,1
39919,0,0.129032,24,1,0,0.093750,0,0.065166,0
...,...,...,...,...,...,...,...,...,...
119879,1,0.322581,19,3,1,0.093750,1,0.220621,0
103694,1,0.258065,19,3,3,0.046875,0,0.078487,0
131932,1,0.564516,1,3,1,0.437500,1,0.099225,1
146867,1,0.096774,4,1,0,0.195312,0,0.159047,0


In [42]:
X_test[numerical_cols] = minmax_scaler_fit.transform(X_test[numerical_cols])



In [43]:
X_test

,Gender,Age,Region_Code,Occupation,Channel_Code,Vintage,Credit_Product,Avg_Account_Balance,Is_Active
241356,0,0.951613,1,1,1,0.343750,1,0.038406,0
150884,1,0.467742,30,3,2,0.429688,0,0.059732,0
43550,1,0.129032,34,2,0,0.140625,1,0.204343,0
62555,0,0.161290,18,3,0,0.156250,0,0.240550,0
147096,1,0.112903,1,2,0,0.093750,0,0.204813,0
...,...,...,...,...,...,...,...,...,...
109172,0,0.112903,9,2,0,0.250000,0,0.037882,0
66179,0,0.129032,10,2,0,0.203125,0,0.048161,0
198583,1,0.177419,22,2,0,0.109375,0,0.042292,0
122392,1,0.838710,32,1,2,0.765625,1,0.094089,0


In [44]:
from sklearn.ensemble import RandomForestClassifier
import xgboost as xgb
from sklearn.ensemble import GradientBoostingClassifier
from catboost import CatBoostClassifier
from sklearn.ensemble import AdaBoostClassifier

In [49]:
from sklearn.metrics import accuracy_score

def _model_experimentation(models, X_train, X_test, y_train, y_test):
    '''
    Fit each model with its default hyperparameters (no tuning) and
    return its accuracy on the training and test sets.
    '''
    train_scores = {}
    test_scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        train_preds = model.predict(X_train)
        test_preds = model.predict(X_test)
        # Accuracy is computed on hard class predictions
        train_scores[name] = accuracy_score(y_train, train_preds)
        test_scores[name] = accuracy_score(y_test, test_preds)
    return train_scores, test_scores
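The function above reports accuracy, while the competition is scored on ROC AUC. A hedged sketch of a variant that scores the predicted probability of the positive class instead; the name `score_models_by_auc`, the synthetic data from `make_classification`, and the small random forest are assumptions for illustration and are not part of the original notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def score_models_by_auc(models, X_train, X_test, y_train, y_test):
    """Fit each model and return train/test ROC AUC from predicted probabilities."""
    train_auc, test_auc = {}, {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Column 1 of predict_proba holds the probability of class 1
        train_auc[name] = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
        test_auc[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    return train_auc, test_auc


# Synthetic stand-in data, only to show the call pattern
X_demo, y_demo = make_classification(n_samples=300, n_features=6, random_state=0)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=42
)

demo_models = {"RF": RandomForestClassifier(n_estimators=20, random_state=0)}
train_auc, test_auc = score_models_by_auc(
    demo_models, Xd_train, Xd_test, yd_train, yd_test
)
print(train_auc, test_auc)
```

All five classifiers used in this notebook expose `predict_proba`, so the same loop should apply to them, though the resulting numbers would differ from the accuracies reported here.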

In [50]:
from sklearn.metrics import accuracy_score
deep_models = {'XGB': xgb.XGBClassifier(n_jobs=-1),
               'GBR': GradientBoostingClassifier(),
               'ADA': AdaBoostClassifier(),
               'CAT': CatBoostClassifier(),
               'RF': RandomForestClassifier(n_jobs=-1)}

train_scores, test_scores = _model_experimentation(deep_models, X_train, X_test, y_train, y_test)



Learning rate set to 0.092792
0:	learn: 0.6432985	total: 24.2ms	remaining: 24.2s
1:	learn: 0.6060412	total: 42.8ms	remaining: 21.3s
2:	learn: 0.5781547	total: 65.1ms	remaining: 21.6s
...
990:	learn: 0.4170842	total: 20.6s	remaining: 187ms
991:	learn: 0.4170680	total: 20.6s	remaining: 166ms
992:	learn: 0.4170479	total: 20.7s	remaining: 146ms
...

In [51]:
train_scores

{'XGB': 0.8083566366485085,
 'GBR': 0.7908631625457103,
 'ADA': 0.7820670089008005,
 'CAT': 0.8081647839913492,
 'RF': 0.9999418628311638}

In [52]:
test_scores

{'XGB': 0.7923437966304023,
 'GBR': 0.7904717979326623,
 'ADA': 0.7815187606826013,
 'CAT': 0.793293361187227,
 'RF': 0.7789278059632654}
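For context on the accuracies above: the classes are imbalanced, so a model that always predicts "not a lead" already scores well on accuracy. A small sketch using the counts from the `Is_Lead` value counts shown earlier (the variable names are illustrative):

```python
# Class counts copied from the Is_Lead value_counts() output
n_not_lead = 187437
n_lead = 58288

# Accuracy of a trivial model that predicts Is_Lead == 0 for every customer
baseline_accuracy = n_not_lead / (n_not_lead + n_lead)
print(round(baseline_accuracy, 4))  # 0.7628
```

The test accuracies of roughly 0.78 to 0.79 are therefore only a few points above this baseline, which is one reason a ranking metric such as ROC AUC may be more informative for this problem.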

In [54]:
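from sklearn.utils import resample

# Upsample the minority class (Is_Lead == 1) with replacement so that it
# matches the 187437 rows of the majority class (Is_Lead == 0)
total = train_df_copy[train_df_copy['Is_Lead'] == 1]
df_upsample = resample(total, replace=True, n_samples=187437, random_state=42)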

In [56]:
total_ = train_df_copy[train_df_copy['Is_Lead'] == 0]

In [57]:
train_df_copy = pd.concat([df_upsample, total_])

In [60]:
X = train_df_copy.drop(['Is_Lead', 'ID'], axis=1)
y = train_df_copy['Is_Lead']

In [61]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
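Because the minority rows are duplicated before this split, copies of the same customer can land in both `X_train` and `X_test`, which may inflate the test scores. A sketch of the alternative ordering, splitting first and upsampling only the training portion, on a hypothetical toy frame (`toy`, `train_part`, `test_part` and `train_balanced` are illustrative names):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Hypothetical toy data: 5 leads and 15 non-leads
toy = pd.DataFrame({
    "ID": range(20),
    "Is_Lead": [1] * 5 + [0] * 15,
})

# Split first; stratify keeps the class ratio similar in both parts
train_part, test_part = train_test_split(
    toy, test_size=0.30, random_state=42, stratify=toy["Is_Lead"]
)

minority = train_part[train_part["Is_Lead"] == 1]
majority = train_part[train_part["Is_Lead"] == 0]

# Upsample the minority class inside the training part only
minority_upsampled = resample(
    minority, replace=True, n_samples=len(majority), random_state=42
)
train_balanced = pd.concat([minority_upsampled, majority])

print(train_balanced["Is_Lead"].value_counts())
```

With this ordering the test part keeps the original class distribution and shares no rows with the balanced training data.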

In [62]:
label_encoder = LabelEncoder()
minmax_scaler = MinMaxScaler()

for col in categorical_cols:
    label_encoder_fit = label_encoder.fit(X_train[col])
    X_train[col] = label_encoder_fit.transform(X_train[col])
    X_test[col] = label_encoder_fit.transform(X_test[col])



In [63]:
minmax_scaler_fit = minmax_scaler.fit(X_train[numerical_cols])
X_train[numerical_cols] = minmax_scaler_fit.transform(X_train[numerical_cols])
X_test[numerical_cols] = minmax_scaler_fit.transform(X_test[numerical_cols])


In [64]:
from sklearn.metrics import accuracy_score
deep_models = {'XGB': xgb.XGBClassifier(n_jobs=-1),
               'GBR': GradientBoostingClassifier(),
               'ADA': AdaBoostClassifier(),
               'CAT': CatBoostClassifier(),
               'RF': RandomForestClassifier(n_jobs=-1)}

train_scores, test_scores = _model_experimentation(deep_models, X_train, X_test, y_train, y_test)



Learning rate set to 0.111132
0:	learn: 0.6620887	total: 31.4ms	remaining: 31.3s
1:	learn: 0.6435959	total: 59.5ms	remaining: 29.7s
2:	learn: 0.6287595	total: 85.4ms	remaining: 28.4s
...
974:	learn: 0.5162416	total: 28.6s	remaining: 733ms
975:	learn: 0.5162071	total: 28.6s	remaining: 704ms
976:	learn: 0.5161920	total: 28.7s	remaining: 675ms
...

In [65]:
train_scores

{'XGB': 0.7422021180514536,
 'GBR': 0.7161742457442714,
 'ADA': 0.6968038687402586,
 'CAT': 0.7457118794562728,
 'RF': 0.9999961891841425}

In [68]:
test_scores

{'XGB': 0.7303557614504326,
 'GBR': 0.7162978046112943,
 'ADA': 0.6974382685861128,
 'CAT': 0.7339658376532726,
 'RF': 0.8876074797933543}

In [None]:
for name, model in models.items():