# Competitive data analysis

## 1. Introduction

This notebook contains the course project for Geek University (https://geekbrains.ru, Faculty of Artificial Intelligence). The course aims to show students how to take part in Kaggle competitions.


The task of the project is to predict potential debt on a loan from customer data gathered from different sources (https://www.kaggle.com/c/geekbrains-competitive-data-analysis/overview).


__Metric:__ ROC-AUC


__Type of the task:__ binary classification 
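
As a reminder of what the metric measures, ROC-AUC can be read as the share of (positive, negative) pairs in which the positive example gets the higher score. The sketch below is a plain-Python illustration of that reading, not the competition's scoring code; the function name `pairwise_auc` and the toy labels and scores are made up for the example.

```python
from itertools import product


def pairwise_auc(y_true, y_score):
    """ROC-AUC as the share of correctly ordered (positive, negative) pairs.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Toy labels and scores (illustrative only, not competition data).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
toy_auc = pairwise_auc(y_true, y_score)
print(toy_auc)  # 0.75: three of the four pairs are ordered correctly
```

Only the ordering of the scores matters here, which is why the metric works directly on predicted probabilities and needs no decision threshold.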

## 2. Data description

There are two groups of data: __basic__, used for fitting the model and making predictions, and __additional__, used for building features.

### Basic data:


__train.csv__ - application id + target

__test.csv__ - application id without target

### Additional data:

__applications_history.csv__ - client's previous applications 

__bki.csv__ - client's previous credits provided by other financial institutions 

__client_profile.csv__ - other information about clients 

__payments.csv__ - repayment history for the previously disbursed credits 
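
One common way to turn an additional table into features is to aggregate it per application and join the result onto the basic data. The sketch below shows the idea on tiny made-up frames. It assumes that the additional table shares the `APPLICATION_NUMBER` key, and the `AMOUNT` column is a hypothetical name; neither is confirmed by the description above.

```python
import pandas as pd

# Miniature stand-ins for train.csv and payments.csv (values are made up).
train_demo = pd.DataFrame({"APPLICATION_NUMBER": [1, 2, 3], "TARGET": [0, 1, 0]})
payments_demo = pd.DataFrame({
    "APPLICATION_NUMBER": [1, 1, 2],   # assumed join key
    "AMOUNT": [100.0, 300.0, 50.0],    # hypothetical column name
})

# One row per application: count, mean and max of the payment amounts.
agg = (
    payments_demo.groupby("APPLICATION_NUMBER")["AMOUNT"]
    .agg(["count", "mean", "max"])
    .add_prefix("PAYMENTS_AMOUNT_")
    .reset_index()
)

# A left join keeps every training row; applications without payments get NaN.
train_feats = train_demo.merge(agg, how="left", on="APPLICATION_NUMBER")
print(train_feats)
```

Applications with no rows in the additional table end up with missing values, which gradient boosting libraries such as LightGBM can handle natively.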

## 3. Importing libraries and data

In [1]:
from typing import List, Optional
from tqdm import tqdm

import numpy as np
import pandas as pd


import scipy.stats as st
from scipy.stats import probplot, ks_2samp
from sklearn.metrics import roc_auc_score, roc_curve, auc
from sklearn.ensemble import RandomForestRegressor

from imblearn.under_sampling import RandomUnderSampler 
from collections import Counter

from sklearn.model_selection import train_test_split, KFold, cross_val_score
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import check_is_fitted
import missingno as msno

import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt

import lightgbm as lgb
import xgboost as xgb
%matplotlib inline

In [2]:
PATH = 'data/'

In [109]:
train = pd.read_csv(PATH+"train.csv")
test = pd.read_csv(PATH+"test.csv")
applications_history = pd.read_csv(PATH+"applications_history.csv")
bki = pd.read_csv(PATH+"bki.csv")
client_profile = pd.read_csv(PATH+"client_profile.csv")
payments = pd.read_csv(PATH+"payments.csv")


print("train.shape = {} rows, {} cols".format(*train.shape))
print("test.shape = {} rows, {} cols".format(*test.shape))
print("applications_history.shape = {} rows, {} cols".format(*applications_history.shape))
print("bki.shape = {} rows, {} cols".format(*bki.shape))
print("client_profile.shape = {} rows, {} cols".format(*client_profile.shape))
print("payments.shape = {} rows, {} cols".format(*payments.shape))

train.shape = 110093 rows, 3 cols
test.shape = 165141 rows, 2 cols
applications_history.shape = 1670214 rows, 26 cols
bki.shape = 945234 rows, 17 cols
client_profile.shape = 250000 rows, 24 cols
payments.shape = 1023932 rows, 8 cols


## 4. Exploratory data analysis

## 5. Solution strategy

Since the goal is a high ROC-AUC score, it is important to understand which features help or hurt and which data operations raise or lower the score. To start working on the solution and to compare results across different feature sets, I decided to use a LightGBM classifier with its built-in handling of categorical features.


I will get a first result using the basic data (train.csv + client_profile.csv) and compare it with the results obtained by dropping and adding features one at a time. Since LightGBM is fast enough, the model can be refit every time and validated with 5-fold cross-validation without spending much time or resources.


Additionally, I will use an XGBoost classifier, with categorical features encoded, for stacking with LightGBM.
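
The stacking idea can be sketched as follows: out-of-fold (OOF) predictions of the base models become the input features of a small meta-model. This is only an illustration under stated assumptions. Two scikit-learn models stand in for LightGBM and XGBoost so that the example runs without those libraries, the data is synthetic, and the logistic-regression meta-model is one possible choice rather than the one used in this project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (not the competition data).
X_demo, y_demo = make_classification(n_samples=400, n_features=10, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Stand-ins for the two boosting models.
base_models = {
    "tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "logreg": LogisticRegression(max_iter=1000),
}

# Level 1: one column of OOF probabilities per base model.
oof = np.column_stack([
    cross_val_predict(m, X_demo, y_demo, cv=cv, method="predict_proba")[:, 1]
    for m in base_models.values()
])

# Level 2: the meta-model is itself evaluated out of fold on the OOF columns.
meta_oof = cross_val_predict(
    LogisticRegression(), oof, y_demo, cv=cv, method="predict_proba"
)[:, 1]
stacked_auc = roc_auc_score(y_demo, meta_oof)
print(f"stacked OOF ROC-AUC = {stacked_auc:.4f}")
```

Feeding the meta-model OOF predictions rather than in-sample predictions keeps it from learning on outputs that the base models produced for rows they had already seen.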

## 6. Solution

### 6.1 First result

In [110]:
#join train.csv and client_profile.csv
train = pd.merge(train, client_profile, how='left', on = 'APPLICATION_NUMBER')

In [111]:
#join test.csv and client_profile.csv
test = pd.merge(test, client_profile, how='left', on = 'APPLICATION_NUMBER')
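
A left join like the two above should neither add nor drop rows, but it can leave profile columns empty: client_profile.csv has 250000 rows while train and test together have 275234, so some applications presumably have no profile. A minimal sketch of how this could be checked, on made-up frames (the `AGE` values are illustrative):

```python
import pandas as pd

# Toy stand-ins for train.csv and client_profile.csv.
left = pd.DataFrame({"APPLICATION_NUMBER": [10, 11, 12], "TARGET": [0, 1, 0]})
profile = pd.DataFrame({"APPLICATION_NUMBER": [10, 11], "AGE": [12000, 15000]})

merged = pd.merge(
    left, profile, how="left", on="APPLICATION_NUMBER",
    validate="many_to_one",  # raises MergeError if profile keys are duplicated
    indicator=True,          # adds a _merge column: "both" or "left_only"
)

# Rows that found no matching profile.
n_unmatched = int((merged["_merge"] == "left_only").sum())
merged = merged.drop(columns="_merge")

print(len(merged), n_unmatched)  # 3 1
```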

In [112]:
# Build the list of categorical features for LightGBM's built-in categorical handling
def cat_featutes(data):
    # "object" replaces the np.object alias, which was removed in NumPy 1.24
    cat_features = list(data.select_dtypes(include=["object"]).columns)
    # these count-like numeric columns are also treated as categorical
    cat_features = cat_features + ['AMT_REQ_CREDIT_BUREAU_HOUR', 'AMT_REQ_CREDIT_BUREAU_DAY', 'AMT_REQ_CREDIT_BUREAU_WEEK', 'AMT_REQ_CREDIT_BUREAU_MON', 'AMT_REQ_CREDIT_BUREAU_QRT']
    return cat_features

# Cast the listed columns to the pandas "category" dtype (modifies `data` in place)
def cat_featutes_maker(data, cat_features):
    data[cat_features] = data[cat_features].astype('category')
    return data

In [113]:
cat_features = cat_featutes(train)

In [114]:
train = cat_featutes_maker(train, cat_features)
test = cat_featutes_maker(test, cat_features)
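
Casting train and test to `category` separately lets pandas build the category list of each frame independently, so the same value can receive different integer codes in the two frames. Whether the model library re-aligns the categories at prediction time depends on the library and its version, so the sketch below shows one way to make the alignment explicit. The frames are toy data and the variable names are illustrative.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

train_demo = pd.DataFrame({"GENDER": ["M", "F", "F"]})
test_demo = pd.DataFrame({"GENDER": ["M", "XNA", "M"]})

# Independent casts: "M" gets code 1 in train but code 0 in test.
train_independent = train_demo["GENDER"].astype("category")
test_independent = test_demo["GENDER"].astype("category")

# Shared dtype built from the training categories only.
gender_dtype = CategoricalDtype(
    categories=sorted(train_demo["GENDER"].dropna().unique())
)
train_aligned = train_demo["GENDER"].astype(gender_dtype)
test_aligned = test_demo["GENDER"].astype(gender_dtype)

print(list(test_independent.cat.codes))  # [0, 1, 0]
print(list(test_aligned.cat.codes))      # [1, -1, 1]; unseen "XNA" becomes NaN
```

With the shared dtype, values that never occurred in the training data become missing in the test data instead of silently receiving a new code.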

In [115]:
# the best set of hyperparameters, provided by the course teacher :)
params = {
    'boosting_type': 'gbdt',
    'n_estimators': 5555,
    'learning_rate': 0.005134,
    'num_leaves': 54,
    'max_depth': 10,
    'subsample_for_bin': 240000,
    'reg_alpha': 0.436193,
    'reg_lambda': 0.479169,
    'colsample_bytree': 0.508716,
    'min_split_gain': 0.024766,
    'subsample': 0.7,
    'is_unbalance': False,
    'random_state': 27,
    'silent': -1,
    'verbose': -1
}

In [92]:
model = lgb.LGBMClassifier(**params)

In [93]:
#5-fold cross-validation
def make_cross_validation(X: pd.DataFrame,
                          y: pd.Series,
                          estimator: object,
                          metric: callable,
                          cv_strategy):
    """
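    Run cross-validation and collect out-of-fold (OOF) predictions.

    Parameters
    ----------
    X: pd.DataFrame
        Feature matrix.

    y: pd.Series
        Target vector.

    estimator: object
        Model with fit and predict_proba methods (here: lgb.LGBMClassifier).

    metric: callable
        Quality metric called as metric(y_true, y_pred).

    cv_strategy: cross-validation generator
        KFold or StratifiedKFold.

    Returns
    -------
    estimators: list
        Estimator appended after each fold.

    oof_score: float
        Metric computed on the OOF predictions.

    fold_train_scores: List[float]
        Metric on the training part of each fold.

    fold_valid_scores: List[float]
        Metric on the validation part of each fold.

    oof_predictions: np.array
        OOF predictions.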

    """
    from sklearn.base import clone  # local import keeps this cell self-contained

    estimators, fold_train_scores, fold_valid_scores = [], [], []
    oof_predictions = np.zeros(X.shape[0])

    for fold_number, (train_idx, valid_idx) in enumerate(cv_strategy.split(X, y)):
        # split() yields positional indices, so index with .iloc rather than .loc
        x_train, x_valid = X.iloc[train_idx], X.iloc[valid_idx]
        y_train, y_valid = y.iloc[train_idx], y.iloc[valid_idx]

        # fit a fresh copy per fold; appending the same object would store one model five times
        fold_estimator = clone(estimator)
        # early_stopping_rounds and verbose are fit() arguments in LightGBM 3.x;
        # LightGBM 4.x expects callbacks instead
        fold_estimator.fit(
            x_train, y_train,
            eval_set=[(x_train, y_train), (x_valid, y_valid)],
            early_stopping_rounds=500, eval_metric="auc", verbose=-1,
        )
        y_train_pred = fold_estimator.predict_proba(x_train)[:, 1]
        y_valid_pred = fold_estimator.predict_proba(x_valid)[:, 1]

        fold_train_scores.append(metric(y_train, y_train_pred))
        fold_valid_scores.append(metric(y_valid, y_valid_pred))
        oof_predictions[valid_idx] = y_valid_pred

        msg = (
            f"Fold: {fold_number+1}, train-observations = {len(train_idx)}, "
            f"valid-observations = {len(valid_idx)}\n"
            f"train-score = {round(fold_train_scores[fold_number], 4)}, "
            f"valid-score = {round(fold_valid_scores[fold_number], 4)}, "
            f"target_median = {np.median(y_valid_pred)}"
        )
        print(msg)
        print("="*69)
        estimators.append(fold_estimator)

    oof_score = metric(y, oof_predictions)
    print(f"CV-results train: {round(np.mean(fold_train_scores), 4)} +/- {round(np.std(fold_train_scores), 3)}")
    print(f"CV-results valid: {round(np.mean(fold_valid_scores), 4)} +/- {round(np.std(fold_valid_scores), 3)}")
    print(f"OOF-score = {round(oof_score, 4)}")

    return estimators, oof_score, fold_train_scores, fold_valid_scores, oof_predictions

In [12]:
%%time
# random_state only takes effect with shuffle=True; recent scikit-learn versions raise an error otherwise
cv_strategy = KFold(n_splits=5)

estimators, oof_score, fold_train_scores, fold_valid_scores, oof_predictions = make_cross_validation(
    train.drop(["TARGET"], axis=1), train["TARGET"], model, metric=roc_auc_score, cv_strategy=cv_strategy
)



Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1449]	training's auc: 0.821575	training's binary_logloss: 0.227792	valid_1's auc: 0.720861	valid_1's binary_logloss: 0.258537
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8216, valid-score = 0.7209, target_median = 0.06634254136008258




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1266]	training's auc: 0.812883	training's binary_logloss: 0.22891	valid_1's auc: 0.716222	valid_1's binary_logloss: 0.263484
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8129, valid-score = 0.7162, target_median = 0.06590021240429883




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1373]	training's auc: 0.81646	training's binary_logloss: 0.230285	valid_1's auc: 0.727456	valid_1's binary_logloss: 0.252346
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8165, valid-score = 0.7275, target_median = 0.06782127933455603




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1556]	training's auc: 0.826173	training's binary_logloss: 0.226751	valid_1's auc: 0.72011	valid_1's binary_logloss: 0.254166
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8262, valid-score = 0.7201, target_median = 0.06520441579779819




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1097]	training's auc: 0.804073	training's binary_logloss: 0.236064	valid_1's auc: 0.727921	valid_1's binary_logloss: 0.244994
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8041, valid-score = 0.7279, target_median = 0.06991997992573797
CV-results train: 0.8162 +/- 0.008
CV-results valid: 0.7225 +/- 0.005
OOF-score = 0.7216
CPU times: user 10min 3s, sys: 2.73 s, total: 10min 6s
Wall time: 2min 48s


In [13]:
basic_oof_score = oof_score
basic_oof_score

0.7215856282876134
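
The OOF score (0.7216) is slightly lower than the mean of the per-fold validation scores (0.7225). One plausible reason, not verified on this data, is that the OOF score ranks predictions from all five models together, so small shifts in the predicted probabilities between folds can cost a few correctly ordered pairs. The toy example below exaggerates the effect with two made-up folds.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Two toy folds: each is ranked perfectly on its own,
# but the second model outputs systematically higher scores.
folds = [
    (np.array([0, 1]), np.array([0.1, 0.2])),
    (np.array([0, 1]), np.array([0.6, 0.7])),
]

fold_aucs = [roc_auc_score(y, p) for y, p in folds]
mean_fold_auc = float(np.mean(fold_aucs))

# Pooling the folds mimics the OOF score: all predictions are ranked together.
y_all = np.concatenate([y for y, _ in folds])
p_all = np.concatenate([p for _, p in folds])
pooled_auc = float(roc_auc_score(y_all, p_all))

print(mean_fold_auc, pooled_auc)  # 1.0 0.75
```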

### 6.2 Importance of existing features

In [14]:
# DataFrame for collecting the results
scores = pd.DataFrame(columns=["feature", "oof_score", "diff_with_basic", "useful"])

In [16]:
features = list(train.columns)
features.remove('TARGET')

In [17]:
%%time
# results without each existing feature (the cell magic has to be the first line of the cell)

for feature in features:
    print(f"Without {feature}")
    tr = train.drop([feature], axis=1)
    cv_strategy = KFold(n_splits=5)  # random_state has no effect without shuffle=True

    estimators, oof_score, fold_train_scores, fold_valid_scores, oof_predictions = make_cross_validation(
        tr.drop(["TARGET"], axis=1), tr["TARGET"], model, metric=roc_auc_score, cv_strategy=cv_strategy
    )

    # a positive difference means the score dropped without the feature, so the feature is useful
    diff_with_basic = round(basic_oof_score - oof_score, 4)
    useful = 'YES' if diff_with_basic > 0 else 'NO'

    new_row = {"feature": feature, "oof_score": oof_score, "diff_with_basic": diff_with_basic, "useful": useful}
    # DataFrame.append was removed in pandas 2.0; add the row by label instead
    scores.loc[len(scores)] = pd.Series(new_row)

    print("*"*69)

Without APPLICATION_NUMBER




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1351]	training's auc: 0.812627	training's binary_logloss: 0.230182	valid_1's auc: 0.722294	valid_1's binary_logloss: 0.25842
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8126, valid-score = 0.7223, target_median = 0.0664727617262303




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1527]	training's auc: 0.818347	training's binary_logloss: 0.226578	valid_1's auc: 0.718854	valid_1's binary_logloss: 0.263167
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8183, valid-score = 0.7189, target_median = 0.06584059404491413




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1439]	training's auc: 0.814631	training's binary_logloss: 0.230363	valid_1's auc: 0.729075	valid_1's binary_logloss: 0.251998
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8146, valid-score = 0.7291, target_median = 0.06761831308958474




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1739]	training's auc: 0.828053	training's binary_logloss: 0.225777	valid_1's auc: 0.719168	valid_1's binary_logloss: 0.254099
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8281, valid-score = 0.7192, target_median = 0.06519595276938103




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1160]	training's auc: 0.802496	training's binary_logloss: 0.235859	valid_1's auc: 0.72774	valid_1's binary_logloss: 0.245067
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8025, valid-score = 0.7277, target_median = 0.06965247982283945
CV-results train: 0.8152 +/- 0.008
CV-results valid: 0.7234 +/- 0.004
OOF-score = 0.7224
*********************************************************************
Without NAME_CONTRACT_TYPE




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1400]	training's auc: 0.817343	training's binary_logloss: 0.228755	valid_1's auc: 0.719151	valid_1's binary_logloss: 0.258933
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8173, valid-score = 0.7192, target_median = 0.06982111688578997




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1094]	training's auc: 0.802817	training's binary_logloss: 0.231973	valid_1's auc: 0.715713	valid_1's binary_logloss: 0.263682
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8028, valid-score = 0.7157, target_median = 0.07019266415459756




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1585]	training's auc: 0.824203	training's binary_logloss: 0.227617	valid_1's auc: 0.725537	valid_1's binary_logloss: 0.252478
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8242, valid-score = 0.7255, target_median = 0.07092951773688085




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1267]	training's auc: 0.812624	training's binary_logloss: 0.231136	valid_1's auc: 0.716969	valid_1's binary_logloss: 0.254655
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8126, valid-score = 0.717, target_median = 0.06933371323798171




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[671]	training's auc: 0.776456	training's binary_logloss: 0.244602	valid_1's auc: 0.725284	valid_1's binary_logloss: 0.246187
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7765, valid-score = 0.7253, target_median = 0.07472526164133249
CV-results train: 0.8067 +/- 0.017
CV-results valid: 0.7205 +/- 0.004
OOF-score = 0.7189
*********************************************************************
Without GENDER




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1361]	training's auc: 0.81558	training's binary_logloss: 0.229779	valid_1's auc: 0.716546	valid_1's binary_logloss: 0.259342
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8156, valid-score = 0.7165, target_median = 0.0668545201169893




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1365]	training's auc: 0.814698	training's binary_logloss: 0.22847	valid_1's auc: 0.715337	valid_1's binary_logloss: 0.263738
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8147, valid-score = 0.7153, target_median = 0.06698400885188634




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1248]	training's auc: 0.80884	training's binary_logloss: 0.232813	valid_1's auc: 0.725177	valid_1's binary_logloss: 0.252776
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8088, valid-score = 0.7252, target_median = 0.06877490405428138




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1485]	training's auc: 0.821498	training's binary_logloss: 0.228624	valid_1's auc: 0.717322	valid_1's binary_logloss: 0.254766
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8215, valid-score = 0.7173, target_median = 0.06576045349622757




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[935]	training's auc: 0.792647	training's binary_logloss: 0.239696	valid_1's auc: 0.72549	valid_1's binary_logloss: 0.245466
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7926, valid-score = 0.7255, target_median = 0.07060134062903238
CV-results train: 0.8107 +/- 0.01
CV-results valid: 0.72 +/- 0.004
OOF-score = 0.7189
*********************************************************************
Without CHILDRENS




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1381]	training's auc: 0.81754	training's binary_logloss: 0.22898	valid_1's auc: 0.721199	valid_1's binary_logloss: 0.258508
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8175, valid-score = 0.7212, target_median = 0.06626434761791401




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1115]	training's auc: 0.804256	training's binary_logloss: 0.231533	valid_1's auc: 0.717288	valid_1's binary_logloss: 0.263382
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8043, valid-score = 0.7173, target_median = 0.06671336615638605




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1346]	training's auc: 0.814483	training's binary_logloss: 0.230857	valid_1's auc: 0.728607	valid_1's binary_logloss: 0.252101
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8145, valid-score = 0.7286, target_median = 0.06801091064823289




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1443]	training's auc: 0.821185	training's binary_logloss: 0.228607	valid_1's auc: 0.719329	valid_1's binary_logloss: 0.254309
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8212, valid-score = 0.7193, target_median = 0.06548139194434255




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[818]	training's auc: 0.786062	training's binary_logloss: 0.241386	valid_1's auc: 0.727048	valid_1's binary_logloss: 0.245478
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7861, valid-score = 0.727, target_median = 0.07096653380395332
CV-results train: 0.8087 +/- 0.013
CV-results valid: 0.7227 +/- 0.004
OOF-score = 0.7217
*********************************************************************
Without TOTAL_SALARY




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1247]	training's auc: 0.809746	training's binary_logloss: 0.231419	valid_1's auc: 0.720795	valid_1's binary_logloss: 0.258701
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8097, valid-score = 0.7208, target_median = 0.0665524424190858




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[651]	training's auc: 0.776595	training's binary_logloss: 0.240679	valid_1's auc: 0.716993	valid_1's binary_logloss: 0.264139
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.7766, valid-score = 0.717, target_median = 0.06866800721085677




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1357]	training's auc: 0.813401	training's binary_logloss: 0.231311	valid_1's auc: 0.728125	valid_1's binary_logloss: 0.25219
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8134, valid-score = 0.7281, target_median = 0.0680763057930966




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[2034]	training's auc: 0.841296	training's binary_logloss: 0.221698	valid_1's auc: 0.718159	valid_1's binary_logloss: 0.254214
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8413, valid-score = 0.7182, target_median = 0.0646138034146949




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[875]	training's auc: 0.788692	training's binary_logloss: 0.240618	valid_1's auc: 0.726985	valid_1's binary_logloss: 0.245316
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7887, valid-score = 0.727, target_median = 0.07086155296398836
CV-results train: 0.8059 +/- 0.022
CV-results valid: 0.7222 +/- 0.005
OOF-score = 0.7214
*********************************************************************
Without AMOUNT_CREDIT




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1425]	training's auc: 0.813566	training's binary_logloss: 0.229889	valid_1's auc: 0.719141	valid_1's binary_logloss: 0.25911
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8136, valid-score = 0.7191, target_median = 0.06652483600796867




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[749]	training's auc: 0.781073	training's binary_logloss: 0.238975	valid_1's auc: 0.713948	valid_1's binary_logloss: 0.264543
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.7811, valid-score = 0.7139, target_median = 0.06801746961804067




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[915]	training's auc: 0.788741	training's binary_logloss: 0.238829	valid_1's auc: 0.72459	valid_1's binary_logloss: 0.253023
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.7887, valid-score = 0.7246, target_median = 0.06945376625380581




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1447]	training's auc: 0.816335	training's binary_logloss: 0.229955	valid_1's auc: 0.716088	valid_1's binary_logloss: 0.254876
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8163, valid-score = 0.7161, target_median = 0.06581537910286223




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[596]	training's auc: 0.769847	training's binary_logloss: 0.246933	valid_1's auc: 0.723061	valid_1's binary_logloss: 0.246598
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7698, valid-score = 0.7231, target_median = 0.07279501315031475
CV-results train: 0.7939 +/- 0.018
CV-results valid: 0.7194 +/- 0.004
OOF-score = 0.718
*********************************************************************
Without AMOUNT_ANNUITY




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1411]	training's auc: 0.813291	training's binary_logloss: 0.229945	valid_1's auc: 0.720579	valid_1's binary_logloss: 0.258832
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8133, valid-score = 0.7206, target_median = 0.06664172283784817




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1305]	training's auc: 0.809318	training's binary_logloss: 0.229873	valid_1's auc: 0.715689	valid_1's binary_logloss: 0.26358
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8093, valid-score = 0.7157, target_median = 0.06650805824600216




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1287]	training's auc: 0.807522	training's binary_logloss: 0.232908	valid_1's auc: 0.726549	valid_1's binary_logloss: 0.252492
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8075, valid-score = 0.7265, target_median = 0.06815050132423624




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1400]	training's auc: 0.815153	training's binary_logloss: 0.230453	valid_1's auc: 0.716632	valid_1's binary_logloss: 0.254876
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8152, valid-score = 0.7166, target_median = 0.06598423139162429




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[843]	training's auc: 0.785061	training's binary_logloss: 0.241718	valid_1's auc: 0.72402	valid_1's binary_logloss: 0.245818
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7851, valid-score = 0.724, target_median = 0.0709926143067034
CV-results train: 0.8061 +/- 0.011
CV-results valid: 0.7207 +/- 0.004
OOF-score = 0.7197
*********************************************************************
Without EDUCATION_LEVEL




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1431]	training's auc: 0.817916	training's binary_logloss: 0.229012	valid_1's auc: 0.721372	valid_1's binary_logloss: 0.258573
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8179, valid-score = 0.7214, target_median = 0.06692214268838449




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[951]	training's auc: 0.794075	training's binary_logloss: 0.234859	valid_1's auc: 0.714626	valid_1's binary_logloss: 0.264244
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.7941, valid-score = 0.7146, target_median = 0.06769484604162324




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1489]	training's auc: 0.819014	training's binary_logloss: 0.22938	valid_1's auc: 0.725614	valid_1's binary_logloss: 0.252841
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.819, valid-score = 0.7256, target_median = 0.06821143644799717




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1104]	training's auc: 0.802535	training's binary_logloss: 0.234215	valid_1's auc: 0.715893	valid_1's binary_logloss: 0.25523
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8025, valid-score = 0.7159, target_median = 0.06697834533761524




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[346]	training's auc: 0.754948	training's binary_logloss: 0.254102	valid_1's auc: 0.723727	valid_1's binary_logloss: 0.248859
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7549, valid-score = 0.7237, target_median = 0.07573774165424421
CV-results train: 0.7977 +/- 0.023
CV-results valid: 0.7202 +/- 0.004
OOF-score = 0.7173
*********************************************************************
Without FAMILY_STATUS




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1470]	training's auc: 0.820047	training's binary_logloss: 0.228118	valid_1's auc: 0.721796	valid_1's binary_logloss: 0.258512
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.82, valid-score = 0.7218, target_median = 0.06589975675142945




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1498]	training's auc: 0.82142	training's binary_logloss: 0.226177	valid_1's auc: 0.71631	valid_1's binary_logloss: 0.263526
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8214, valid-score = 0.7163, target_median = 0.06593880831118956




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1628]	training's auc: 0.826288	training's binary_logloss: 0.227237	valid_1's auc: 0.727416	valid_1's binary_logloss: 0.252185
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8263, valid-score = 0.7274, target_median = 0.06736170291565559




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1827]	training's auc: 0.835312	training's binary_logloss: 0.223973	valid_1's auc: 0.718307	valid_1's binary_logloss: 0.254149
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8353, valid-score = 0.7183, target_median = 0.06502772528989653




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1219]	training's auc: 0.807831	training's binary_logloss: 0.234612	valid_1's auc: 0.727687	valid_1's binary_logloss: 0.244997
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8078, valid-score = 0.7277, target_median = 0.06942496811799123
CV-results train: 0.8222 +/- 0.009
CV-results valid: 0.7223 +/- 0.005
OOF-score = 0.7216
*********************************************************************
Without REGION_POPULATION




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1650]	training's auc: 0.826312	training's binary_logloss: 0.22605	valid_1's auc: 0.721022	valid_1's binary_logloss: 0.25856
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8263, valid-score = 0.721, target_median = 0.0658393381158445




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1154]	training's auc: 0.804964	training's binary_logloss: 0.231381	valid_1's auc: 0.716858	valid_1's binary_logloss: 0.263488
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.805, valid-score = 0.7169, target_median = 0.06674126111733751




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1163]	training's auc: 0.804517	training's binary_logloss: 0.234039	valid_1's auc: 0.727624	valid_1's binary_logloss: 0.252151
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8045, valid-score = 0.7276, target_median = 0.06832030616657767




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1585]	training's auc: 0.82463	training's binary_logloss: 0.227401	valid_1's auc: 0.718558	valid_1's binary_logloss: 0.254332
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8246, valid-score = 0.7186, target_median = 0.06535250966248357




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[929]	training's auc: 0.791543	training's binary_logloss: 0.239704	valid_1's auc: 0.726769	valid_1's binary_logloss: 0.2452
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7915, valid-score = 0.7268, target_median = 0.0702595682355118
CV-results train: 0.8104 +/- 0.013
CV-results valid: 0.7222 +/- 0.004
OOF-score = 0.7212
*********************************************************************
Without AGE




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1562]	training's auc: 0.821382	training's binary_logloss: 0.22812	valid_1's auc: 0.719864	valid_1's binary_logloss: 0.259175
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8214, valid-score = 0.7199, target_median = 0.06642217562258562




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1625]	training's auc: 0.824104	training's binary_logloss: 0.225968	valid_1's auc: 0.71479	valid_1's binary_logloss: 0.26415
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8241, valid-score = 0.7148, target_median = 0.06648357938818503




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1403]	training's auc: 0.813479	training's binary_logloss: 0.231625	valid_1's auc: 0.724656	valid_1's binary_logloss: 0.252968
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8135, valid-score = 0.7247, target_median = 0.06852031622785229




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1368]	training's auc: 0.814589	training's binary_logloss: 0.231176	valid_1's auc: 0.717338	valid_1's binary_logloss: 0.254957
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8146, valid-score = 0.7173, target_median = 0.06618403935405373




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1017]	training's auc: 0.794782	training's binary_logloss: 0.238971	valid_1's auc: 0.725976	valid_1's binary_logloss: 0.245455
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7948, valid-score = 0.726, target_median = 0.07041896083683055
CV-results train: 0.8137 +/- 0.01
CV-results valid: 0.7205 +/- 0.004
OOF-score = 0.7195
*********************************************************************
Without DAYS_ON_LAST_JOB




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1648]	training's auc: 0.824425	training's binary_logloss: 0.226922	valid_1's auc: 0.719647	valid_1's binary_logloss: 0.258836
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8244, valid-score = 0.7196, target_median = 0.06641363273218646




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1477]	training's auc: 0.816797	training's binary_logloss: 0.227639	valid_1's auc: 0.715136	valid_1's binary_logloss: 0.263452
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8168, valid-score = 0.7151, target_median = 0.06615850382447008




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1060]	training's auc: 0.796281	training's binary_logloss: 0.236375	valid_1's auc: 0.726856	valid_1's binary_logloss: 0.25292
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.7963, valid-score = 0.7269, target_median = 0.0685300148986184




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1991]	training's auc: 0.837211	training's binary_logloss: 0.223141	valid_1's auc: 0.714831	valid_1's binary_logloss: 0.255079
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8372, valid-score = 0.7148, target_median = 0.06475642352385455




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1266]	training's auc: 0.806417	training's binary_logloss: 0.23505	valid_1's auc: 0.724935	valid_1's binary_logloss: 0.245189
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8064, valid-score = 0.7249, target_median = 0.06941561218429534
CV-results train: 0.8162 +/- 0.014
CV-results valid: 0.7203 +/- 0.005
OOF-score = 0.7194
*********************************************************************
Without OWN_CAR_AGE




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1681]	training's auc: 0.826877	training's binary_logloss: 0.225934	valid_1's auc: 0.719841	valid_1's binary_logloss: 0.258741
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8269, valid-score = 0.7198, target_median = 0.06607917712855958




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1354]	training's auc: 0.813991	training's binary_logloss: 0.22856	valid_1's auc: 0.714327	valid_1's binary_logloss: 0.263832
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.814, valid-score = 0.7143, target_median = 0.06618716296410916




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1778]	training's auc: 0.829715	training's binary_logloss: 0.225839	valid_1's auc: 0.72752	valid_1's binary_logloss: 0.252352
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8297, valid-score = 0.7275, target_median = 0.06729855565586088




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1797]	training's auc: 0.832666	training's binary_logloss: 0.22475	valid_1's auc: 0.717956	valid_1's binary_logloss: 0.254431
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8327, valid-score = 0.718, target_median = 0.06478177225972043




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1002]	training's auc: 0.795705	training's binary_logloss: 0.238533	valid_1's auc: 0.724906	valid_1's binary_logloss: 0.245546
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7957, valid-score = 0.7249, target_median = 0.07042673481732228
CV-results train: 0.8198 +/- 0.014
CV-results valid: 0.7209 +/- 0.005
OOF-score = 0.7201
*********************************************************************
Without FLAG_PHONE




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1386]	training's auc: 0.817332	training's binary_logloss: 0.228951	valid_1's auc: 0.721565	valid_1's binary_logloss: 0.258515
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8173, valid-score = 0.7216, target_median = 0.06600125295848795




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1017]	training's auc: 0.799615	training's binary_logloss: 0.233149	valid_1's auc: 0.716898	valid_1's binary_logloss: 0.26343
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.7996, valid-score = 0.7169, target_median = 0.06689500101889123




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1397]	training's auc: 0.816429	training's binary_logloss: 0.230151	valid_1's auc: 0.727297	valid_1's binary_logloss: 0.252182
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8164, valid-score = 0.7273, target_median = 0.067654932591425




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1517]	training's auc: 0.824278	training's binary_logloss: 0.227556	valid_1's auc: 0.719353	valid_1's binary_logloss: 0.254194
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8243, valid-score = 0.7194, target_median = 0.06519959559377775




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1033]	training's auc: 0.798977	training's binary_logloss: 0.237476	valid_1's auc: 0.726983	valid_1's binary_logloss: 0.245113
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.799, valid-score = 0.727, target_median = 0.06977887188995602
CV-results train: 0.8113 +/- 0.01
CV-results valid: 0.7224 +/- 0.004
OOF-score = 0.7216
*********************************************************************
Without FLAG_EMAIL




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1582]	training's auc: 0.825487	training's binary_logloss: 0.226386	valid_1's auc: 0.721924	valid_1's binary_logloss: 0.258384
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8255, valid-score = 0.7219, target_median = 0.06572505096537772




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1161]	training's auc: 0.80717	training's binary_logloss: 0.230773	valid_1's auc: 0.716586	valid_1's binary_logloss: 0.263454
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8072, valid-score = 0.7166, target_median = 0.06647799119611314




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1482]	training's auc: 0.820665	training's binary_logloss: 0.228974	valid_1's auc: 0.72884	valid_1's binary_logloss: 0.252063
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8207, valid-score = 0.7288, target_median = 0.06775347238261113




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1704]	training's auc: 0.831807	training's binary_logloss: 0.225112	valid_1's auc: 0.719689	valid_1's binary_logloss: 0.254069
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8318, valid-score = 0.7197, target_median = 0.06506423985453422




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1134]	training's auc: 0.804892	training's binary_logloss: 0.235692	valid_1's auc: 0.727115	valid_1's binary_logloss: 0.245046
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8049, valid-score = 0.7271, target_median = 0.06969273500289402
CV-results train: 0.818 +/- 0.01
CV-results valid: 0.7228 +/- 0.005
OOF-score = 0.7219
*********************************************************************
Without FAMILY_SIZE




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1473]	training's auc: 0.821303	training's binary_logloss: 0.227819	valid_1's auc: 0.722145	valid_1's binary_logloss: 0.258436
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8213, valid-score = 0.7221, target_median = 0.06585462005335856




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1202]	training's auc: 0.80929	training's binary_logloss: 0.230123	valid_1's auc: 0.716632	valid_1's binary_logloss: 0.263441
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8093, valid-score = 0.7166, target_median = 0.06650736912973773




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1627]	training's auc: 0.826138	training's binary_logloss: 0.227042	valid_1's auc: 0.728151	valid_1's binary_logloss: 0.252108
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8261, valid-score = 0.7282, target_median = 0.0672750465269193




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1505]	training's auc: 0.823167	training's binary_logloss: 0.227758	valid_1's auc: 0.720301	valid_1's binary_logloss: 0.254187
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8232, valid-score = 0.7203, target_median = 0.06547381286614638




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1330]	training's auc: 0.814021	training's binary_logloss: 0.232697	valid_1's auc: 0.727485	valid_1's binary_logloss: 0.244879
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.814, valid-score = 0.7275, target_median = 0.06908862363414794
CV-results train: 0.8188 +/- 0.006
CV-results valid: 0.7229 +/- 0.004
OOF-score = 0.722
*********************************************************************
Without EXTERNAL_SCORING_RATING_1




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1566]	training's auc: 0.820738	training's binary_logloss: 0.228498	valid_1's auc: 0.713648	valid_1's binary_logloss: 0.260573
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8207, valid-score = 0.7136, target_median = 0.06692913194466493




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1167]	training's auc: 0.803607	training's binary_logloss: 0.232811	valid_1's auc: 0.71141	valid_1's binary_logloss: 0.264695
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8036, valid-score = 0.7114, target_median = 0.06802174655707592




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1298]	training's auc: 0.808286	training's binary_logloss: 0.233738	valid_1's auc: 0.722578	valid_1's binary_logloss: 0.25332
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8083, valid-score = 0.7226, target_median = 0.0693977856431638




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1717]	training's auc: 0.827242	training's binary_logloss: 0.227137	valid_1's auc: 0.710709	valid_1's binary_logloss: 0.256307
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8272, valid-score = 0.7107, target_median = 0.06575237951220023




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[620]	training's auc: 0.768843	training's binary_logloss: 0.247554	valid_1's auc: 0.721137	valid_1's binary_logloss: 0.247494
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7688, valid-score = 0.7211, target_median = 0.0728924878057954
CV-results train: 0.8057 +/- 0.02
CV-results valid: 0.7159 +/- 0.005
OOF-score = 0.7145
*********************************************************************
Without EXTERNAL_SCORING_RATING_2




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1694]	training's auc: 0.820772	training's binary_logloss: 0.230351	valid_1's auc: 0.704278	valid_1's binary_logloss: 0.262881
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8208, valid-score = 0.7043, target_median = 0.06876143773862618




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1890]	training's auc: 0.8287	training's binary_logloss: 0.226389	valid_1's auc: 0.701058	valid_1's binary_logloss: 0.267322
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8287, valid-score = 0.7011, target_median = 0.06867463032654333




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1753]	training's auc: 0.82059	training's binary_logloss: 0.231004	valid_1's auc: 0.709822	valid_1's binary_logloss: 0.257057
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8206, valid-score = 0.7098, target_median = 0.07012847115484779




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[2258]	training's auc: 0.842307	training's binary_logloss: 0.224128	valid_1's auc: 0.705569	valid_1's binary_logloss: 0.258129
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8423, valid-score = 0.7056, target_median = 0.06805455192546019




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1240]	training's auc: 0.800119	training's binary_logloss: 0.239346	valid_1's auc: 0.709418	valid_1's binary_logloss: 0.249635
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8001, valid-score = 0.7094, target_median = 0.07207490731405469
CV-results train: 0.8225 +/- 0.014
CV-results valid: 0.706 +/- 0.003
OOF-score = 0.7052
*********************************************************************
Without EXTERNAL_SCORING_RATING_3




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1659]	training's auc: 0.813433	training's binary_logloss: 0.232966	valid_1's auc: 0.695147	valid_1's binary_logloss: 0.264643
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8134, valid-score = 0.6951, target_median = 0.07122876188810616




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1210]	training's auc: 0.792959	training's binary_logloss: 0.237097	valid_1's auc: 0.689247	valid_1's binary_logloss: 0.269904
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.793, valid-score = 0.6892, target_median = 0.07044617437248285




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1603]	training's auc: 0.811784	training's binary_logloss: 0.234661	valid_1's auc: 0.698333	valid_1's binary_logloss: 0.259009
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8118, valid-score = 0.6983, target_median = 0.07193834628665646




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1156]	training's auc: 0.790614	training's binary_logloss: 0.240104	valid_1's auc: 0.69398	valid_1's binary_logloss: 0.260081
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.7906, valid-score = 0.694, target_median = 0.07144091437630211




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1667]	training's auc: 0.814139	training's binary_logloss: 0.235282	valid_1's auc: 0.690894	valid_1's binary_logloss: 0.252242
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8141, valid-score = 0.6909, target_median = 0.07350127619678101
CV-results train: 0.8046 +/- 0.011
CV-results valid: 0.6935 +/- 0.003
OOF-score = 0.6928
*********************************************************************
Without AMT_REQ_CREDIT_BUREAU_HOUR




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1746]	training's auc: 0.832564	training's binary_logloss: 0.224169	valid_1's auc: 0.721135	valid_1's binary_logloss: 0.258417
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8326, valid-score = 0.7211, target_median = 0.06559521940460637




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1801]	training's auc: 0.8334	training's binary_logloss: 0.222073	valid_1's auc: 0.716793	valid_1's binary_logloss: 0.263426
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8334, valid-score = 0.7168, target_median = 0.06531849827392214




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1701]	training's auc: 0.829027	training's binary_logloss: 0.226161	valid_1's auc: 0.728412	valid_1's binary_logloss: 0.252168
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.829, valid-score = 0.7284, target_median = 0.06716234401108881




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1440]	training's auc: 0.821184	training's binary_logloss: 0.228573	valid_1's auc: 0.719416	valid_1's binary_logloss: 0.254237
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8212, valid-score = 0.7194, target_median = 0.06558616475706716




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[857]	training's auc: 0.788924	training's binary_logloss: 0.240639	valid_1's auc: 0.726939	valid_1's binary_logloss: 0.245446
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7889, valid-score = 0.7269, target_median = 0.07091217683199771
CV-results train: 0.821 +/- 0.017
CV-results valid: 0.7225 +/- 0.004
OOF-score = 0.7214
*********************************************************************
Without AMT_REQ_CREDIT_BUREAU_DAY




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1509]	training's auc: 0.822805	training's binary_logloss: 0.227315	valid_1's auc: 0.721116	valid_1's binary_logloss: 0.258453
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8228, valid-score = 0.7211, target_median = 0.06596150676839396




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1042]	training's auc: 0.801213	training's binary_logloss: 0.232696	valid_1's auc: 0.71661	valid_1's binary_logloss: 0.26351
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8012, valid-score = 0.7166, target_median = 0.06695691171261373




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1847]	training's auc: 0.834986	training's binary_logloss: 0.22418	valid_1's auc: 0.728169	valid_1's binary_logloss: 0.252128
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.835, valid-score = 0.7282, target_median = 0.06702634770353831




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1786]	training's auc: 0.834506	training's binary_logloss: 0.224144	valid_1's auc: 0.719067	valid_1's binary_logloss: 0.254242
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8345, valid-score = 0.7191, target_median = 0.06500993120852505




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1288]	training's auc: 0.812578	training's binary_logloss: 0.233261	valid_1's auc: 0.727089	valid_1's binary_logloss: 0.245052
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8126, valid-score = 0.7271, target_median = 0.06939615318247958
CV-results train: 0.8212 +/- 0.013
CV-results valid: 0.7224 +/- 0.005
OOF-score = 0.7217
*********************************************************************
Without AMT_REQ_CREDIT_BUREAU_WEEK




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1640]	training's auc: 0.827543	training's binary_logloss: 0.225588	valid_1's auc: 0.720562	valid_1's binary_logloss: 0.258469
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8275, valid-score = 0.7206, target_median = 0.06568742636770501




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1331]	training's auc: 0.815049	training's binary_logloss: 0.228269	valid_1's auc: 0.716404	valid_1's binary_logloss: 0.263344
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.815, valid-score = 0.7164, target_median = 0.06614334788210331




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1750]	training's auc: 0.831166	training's binary_logloss: 0.225451	valid_1's auc: 0.728176	valid_1's binary_logloss: 0.252218
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8312, valid-score = 0.7282, target_median = 0.06733415660546246




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1929]	training's auc: 0.839733	training's binary_logloss: 0.222306	valid_1's auc: 0.720475	valid_1's binary_logloss: 0.254172
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8397, valid-score = 0.7205, target_median = 0.06484551327416993




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1166]	training's auc: 0.806902	training's binary_logloss: 0.235198	valid_1's auc: 0.727313	valid_1's binary_logloss: 0.245106
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8069, valid-score = 0.7273, target_median = 0.0695570372979177
CV-results train: 0.8241 +/- 0.012
CV-results valid: 0.7226 +/- 0.004
OOF-score = 0.7217
*********************************************************************
Without AMT_REQ_CREDIT_BUREAU_MON




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[2224]	training's auc: 0.849318	training's binary_logloss: 0.218432	valid_1's auc: 0.721577	valid_1's binary_logloss: 0.258303
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8493, valid-score = 0.7216, target_median = 0.06513129876943499




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1752]	training's auc: 0.831061	training's binary_logloss: 0.222769	valid_1's auc: 0.716703	valid_1's binary_logloss: 0.263348
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8311, valid-score = 0.7167, target_median = 0.06557799953938168




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1573]	training's auc: 0.824171	training's binary_logloss: 0.227869	valid_1's auc: 0.727809	valid_1's binary_logloss: 0.252319
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8242, valid-score = 0.7278, target_median = 0.06750036709682267




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1902]	training's auc: 0.83833	training's binary_logloss: 0.222644	valid_1's auc: 0.719366	valid_1's binary_logloss: 0.254144
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8383, valid-score = 0.7194, target_median = 0.06467873124831205




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1457]	training's auc: 0.820561	training's binary_logloss: 0.230751	valid_1's auc: 0.72675	valid_1's binary_logloss: 0.24504
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8206, valid-score = 0.7268, target_median = 0.068791793533353
CV-results train: 0.8327 +/- 0.01
CV-results valid: 0.7224 +/- 0.004
OOF-score = 0.7216
*********************************************************************
Without AMT_REQ_CREDIT_BUREAU_QRT




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[2012]	training's auc: 0.840998	training's binary_logloss: 0.221119	valid_1's auc: 0.720386	valid_1's binary_logloss: 0.25857
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.841, valid-score = 0.7204, target_median = 0.06551187966022594




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1204]	training's auc: 0.809273	training's binary_logloss: 0.230203	valid_1's auc: 0.716253	valid_1's binary_logloss: 0.263498
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8093, valid-score = 0.7163, target_median = 0.06661563407094727




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1957]	training's auc: 0.83809	training's binary_logloss: 0.223009	valid_1's auc: 0.728339	valid_1's binary_logloss: 0.252123
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8381, valid-score = 0.7283, target_median = 0.06707315704170258




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1797]	training's auc: 0.834342	training's binary_logloss: 0.224144	valid_1's auc: 0.719433	valid_1's binary_logloss: 0.254229
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8343, valid-score = 0.7194, target_median = 0.06474031598959565




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1197]	training's auc: 0.807822	training's binary_logloss: 0.234735	valid_1's auc: 0.72695	valid_1's binary_logloss: 0.245208
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8078, valid-score = 0.7269, target_median = 0.06970536961771155
CV-results train: 0.8261 +/- 0.014
CV-results valid: 0.7223 +/- 0.005
OOF-score = 0.7214
*********************************************************************
Without AMT_REQ_CREDIT_BUREAU_YEAR




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[2112]	training's auc: 0.843798	training's binary_logloss: 0.220243	valid_1's auc: 0.720695	valid_1's binary_logloss: 0.258447
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8438, valid-score = 0.7207, target_median = 0.06512803581588945




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1333]	training's auc: 0.813421	training's binary_logloss: 0.22865	valid_1's auc: 0.716768	valid_1's binary_logloss: 0.263413
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8134, valid-score = 0.7168, target_median = 0.06576418696726159




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1152]	training's auc: 0.80391	training's binary_logloss: 0.234102	valid_1's auc: 0.727531	valid_1's binary_logloss: 0.252402
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8039, valid-score = 0.7275, target_median = 0.06813130199247623




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1594]	training's auc: 0.824824	training's binary_logloss: 0.227132	valid_1's auc: 0.719381	valid_1's binary_logloss: 0.254204
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8248, valid-score = 0.7194, target_median = 0.06517929920357468




Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[860]	training's auc: 0.787858	training's binary_logloss: 0.240804	valid_1's auc: 0.726562	valid_1's binary_logloss: 0.245452
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.7879, valid-score = 0.7266, target_median = 0.07082871095121657
CV-results train: 0.8148 +/- 0.019
CV-results valid: 0.7222 +/- 0.004
OOF-score = 0.7209
*********************************************************************
CPU times: user 4h 1min 54s, sys: 47.4 s, total: 4h 2min 41s
Wall time: 1h 4min 45s


In [18]:
scores

Unnamed: 0,feature,oof_score,diff_with_basic,useful
0,APPLICATION_NUMBER,0.722438,-0.0009,NO
1,NAME_CONTRACT_TYPE,0.718877,0.0027,YES
2,GENDER,0.718944,0.0026,YES
3,CHILDRENS,0.721696,-0.0001,NO
4,TOTAL_SALARY,0.721375,0.0002,YES
5,AMOUNT_CREDIT,0.717975,0.0036,YES
6,AMOUNT_ANNUITY,0.719704,0.0019,YES
7,EDUCATION_LEVEL,0.717317,0.0043,YES
8,FAMILY_STATUS,0.721563,0.0,NO
9,REGION_POPULATION,0.72118,0.0004,YES
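
The `diff_with_basic` and `useful` columns appear to come from comparing each leave-one-feature-out OOF score with the score of the model trained on all features. Below is a minimal sketch of that comparison. It assumes the difference is `baseline - oof_score` rounded to four decimals, and that a feature counts as useful when the rounded difference is positive. The `summarize_ablation` helper, the `baseline_score` argument and the toy numbers are hypothetical and not taken from the notebook.

```python
import pandas as pd


def summarize_ablation(oof_scores: dict, baseline_score: float) -> pd.DataFrame:
    """Compare leave-one-feature-out OOF scores with a baseline score.

    Hypothetical helper: a positive difference means the score dropped
    when the feature was removed, so the feature is marked as useful.
    """
    scores = pd.DataFrame(
        {"feature": list(oof_scores), "oof_score": list(oof_scores.values())}
    )
    scores["diff_with_basic"] = (baseline_score - scores["oof_score"]).round(4)
    scores["useful"] = scores["diff_with_basic"].gt(0).map({True: "YES", False: "NO"})
    return scores


# Toy numbers chosen for illustration only.
example = summarize_ablation(
    {"FEATURE_A": 0.7100, "FEATURE_B": 0.7250}, baseline_score=0.7200
)
print(example)
```

With this convention, removing a useful feature lowers the OOF score, so the larger the positive difference, the more the model relies on that feature.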


### 6.3 Feature transformation

This section introduces only the new features and data transformations that improved the score under the same evaluation approach (5-fold cross-validation).

In total I created about 35 new features, but only a few of them increased the score.
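
To make the evaluation scheme concrete, here is a minimal sketch of scoring a feature set with 5-fold cross-validation and an out-of-fold (OOF) ROC-AUC. It uses scikit-learn's `LogisticRegression` on synthetic data as a lightweight stand-in for the LightGBM model trained in this notebook. The helper name `oof_auc` and the toy data are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold


def oof_auc(X: np.ndarray, y: np.ndarray, n_splits: int = 5, seed: int = 42) -> float:
    """Return the ROC-AUC of out-of-fold predictions from K-fold CV."""
    oof_preds = np.zeros(len(y))
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, valid_idx in folds.split(X, y):
        # Stand-in model; the notebook itself trains LightGBM here.
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        # Each row is predicted exactly once, by a model that never saw it.
        oof_preds[valid_idx] = model.predict_proba(X[valid_idx])[:, 1]
    return roc_auc_score(y, oof_preds)


# Synthetic data: the target depends on the first column only.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
y = (X[:, 0] + rng.normal(scale=1.0, size=2000) > 0).astype(int)

score_all = oof_auc(X, y)               # all features
score_without_0 = oof_auc(X[:, 1:], y)  # informative feature dropped
print(round(score_all, 4), round(score_without_0, 4))
```

A candidate feature is worth keeping when the OOF score with it is higher than the score without it, ideally by more than the fold-to-fold spread.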

In [116]:
def feature_transformation(data):
    """Add the engineered client-profile features to `data` and return it."""
    # Ratio of the credit amount to the annuity payment.
    data['CREDIT_TIME'] = data['AMOUNT_CREDIT'] / data['AMOUNT_ANNUITY']
    # Treat the value 365243 in DAYS_ON_LAST_JOB as missing.
    data["DAYS_ON_LAST_JOB"] = data["DAYS_ON_LAST_JOB"].replace(365243, np.nan)

    rating_columns = [
        "EXTERNAL_SCORING_RATING_1",
        "EXTERNAL_SCORING_RATING_2",
        "EXTERNAL_SCORING_RATING_3",
    ]
    # Fill missing external ratings with 1. Assigning the result back avoids
    # calling fillna(inplace=True) on a column selection.
    data[rating_columns] = data[rating_columns].fillna(1)

    # Other aggregates that were tried: "min", "max", "nanmedian", "var".
    for function_name in ["mean"]:
        feature_name = "EXTERNAL_SCORING_RATING_{}".format(function_name)
        # Look the NumPy function up by name instead of using eval().
        data[feature_name] = getattr(np, function_name)(data[rating_columns], axis=1)
    return data
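
As a quick illustration of what this function produces, the sketch below applies the same three steps (credit-to-annuity ratio, replacement of the 365243 value, row-wise mean of the filled external ratings) to a tiny hand-made frame. All values are invented, and the aggregate is looked up by name with `getattr(np, name)`.

```python
import numpy as np
import pandas as pd

rating_columns = [
    "EXTERNAL_SCORING_RATING_1",
    "EXTERNAL_SCORING_RATING_2",
    "EXTERNAL_SCORING_RATING_3",
]

# Invented values for illustration only.
toy = pd.DataFrame({
    "AMOUNT_CREDIT": [100000.0, 240000.0],
    "AMOUNT_ANNUITY": [10000.0, 12000.0],
    "DAYS_ON_LAST_JOB": [1200.0, 365243.0],
    "EXTERNAL_SCORING_RATING_1": [0.5, np.nan],
    "EXTERNAL_SCORING_RATING_2": [0.7, 0.4],
    "EXTERNAL_SCORING_RATING_3": [np.nan, 0.1],
})

toy["CREDIT_TIME"] = toy["AMOUNT_CREDIT"] / toy["AMOUNT_ANNUITY"]
toy["DAYS_ON_LAST_JOB"] = toy["DAYS_ON_LAST_JOB"].replace(365243, np.nan)
toy[rating_columns] = toy[rating_columns].fillna(1)

# Resolve the aggregate by name; any NumPy reduction that accepts `axis` fits here.
aggregate = getattr(np, "mean")
toy["EXTERNAL_SCORING_RATING_mean"] = aggregate(toy[rating_columns].to_numpy(), axis=1)

print(toy[["CREDIT_TIME", "DAYS_ON_LAST_JOB", "EXTERNAL_SCORING_RATING_mean"]])
```

Note that filling a missing rating with 1 pulls the row mean upwards, so a client with no ratings at all ends up with a mean of exactly 1.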

In [117]:
train = feature_transformation(train)

In [118]:
test = feature_transformation(test)

In [119]:
def bki_features(data, bki_data):
    """Add diff_1: the per-application sum of AMT_CREDIT_SUM / DAYS_CREDIT over active credits."""
    # Copy the filtered frame so that adding a column does not trigger
    # pandas' SettingWithCopyWarning.
    bki_active = bki_data[bki_data['CREDIT_ACTIVE'] == 'Active'].copy()
    bki_active['diff_1'] = bki_active['AMT_CREDIT_SUM'] / bki_active['DAYS_CREDIT']

    diff_sums = bki_active.groupby("APPLICATION_NUMBER", as_index=False)["diff_1"].sum()

    # Applications that are absent from the global client_profile frame.
    unknown_customers = list(set(data['APPLICATION_NUMBER']) - set(client_profile['APPLICATION_NUMBER']))

    # Drop their aggregated values before merging.
    diff_sums = diff_sums.loc[~diff_sums['APPLICATION_NUMBER'].isin(unknown_customers)]

    data = pd.merge(data, diff_sums, how='left', on='APPLICATION_NUMBER')

    # Unknown applications get diff_1 = 0; known applications without
    # active credits keep NaN from the left merge.
    data.loc[data['APPLICATION_NUMBER'].isin(unknown_customers), 'diff_1'] = 0

    return data
    

In [120]:
train = bki_features(train, bki)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  bki_active['diff_1'] = bki_active['AMT_CREDIT_SUM'] / bki_active['DAYS_CREDIT']
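
pandas prints this warning because `bki_active` comes from filtering another DataFrame, so it cannot tell whether the new column is meant to reach the original frame. A minimal sketch of one common way to make the intent explicit is to take a copy of the filtered rows first. The toy frame `bki_toy` below is hypothetical and stands in for `bki`.

```python
import pandas as pd

# Hypothetical stand-in for the bureau data.
bki_toy = pd.DataFrame({
    "APPLICATION_NUMBER": [1, 1, 2],
    "CREDIT_ACTIVE": ["Active", "Closed", "Active"],
    "AMT_CREDIT_SUM": [1000.0, 300.0, 900.0],
    "DAYS_CREDIT": [10, 30, 45],
})

# .copy() makes the filtered rows an independent DataFrame,
# so adding a column to it is unambiguous.
bki_active = bki_toy[bki_toy["CREDIT_ACTIVE"] == "Active"].copy()
bki_active["diff_1"] = bki_active["AMT_CREDIT_SUM"] / bki_active["DAYS_CREDIT"]

print(bki_active)
```

The computed values are the same either way; the copy only removes the ambiguity that triggers the warning.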


In [121]:
test = bki_features(test, bki)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  bki_active['diff_1'] = bki_active['AMT_CREDIT_SUM'] / bki_active['DAYS_CREDIT']


In [122]:
train.head()

Unnamed: 0,APPLICATION_NUMBER,TARGET,NAME_CONTRACT_TYPE,GENDER,CHILDRENS,TOTAL_SALARY,AMOUNT_CREDIT,AMOUNT_ANNUITY,EDUCATION_LEVEL,FAMILY_STATUS,...,EXTERNAL_SCORING_RATING_3,AMT_REQ_CREDIT_BUREAU_HOUR,AMT_REQ_CREDIT_BUREAU_DAY,AMT_REQ_CREDIT_BUREAU_WEEK,AMT_REQ_CREDIT_BUREAU_MON,AMT_REQ_CREDIT_BUREAU_QRT,AMT_REQ_CREDIT_BUREAU_YEAR,CREDIT_TIME,EXTERNAL_SCORING_RATING_mean,diff_1
0,123687442,0,Cash,M,1.0,157500.0,855000.0,25128.0,Secondary / secondary special,Married,...,0.71657,0.0,0.0,1.0,0.0,0.0,2.0,34.025788,0.687756,
1,123597908,1,Cash,,,,,,,,...,1.0,,,,,,,,1.0,0.0
2,123526683,0,Cash,F,0.0,135000.0,1006920.0,42660.0,Higher education,Married,...,0.267869,0.0,0.0,0.0,7.0,0.0,4.0,23.603376,0.650006,4554.621849
3,123710391,1,Cash,M,0.0,180000.0,518562.0,22972.5,Secondary / secondary special,Married,...,0.170446,0.0,0.0,0.0,0.0,0.0,0.0,22.573164,0.447248,
4,123590329,1,Cash,,,,,,,,...,1.0,,,,,,,,1.0,0.0
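
In the preview above, `diff_1` is empty for some rows and `0.0` for others. The sketch below reproduces that behaviour on toy data, following the logic of `bki_features`: a left merge leaves NaN where an application has no active bureau credit, and applications without a client profile are then set to 0. All values and the names `applications`, `profile_ids` and `bureau` are hypothetical.

```python
import pandas as pd

applications = pd.DataFrame({"APPLICATION_NUMBER": [1, 2, 3]})
profile_ids = [1, 3]  # application 2 has no client profile
bureau = pd.DataFrame({
    "APPLICATION_NUMBER": [3, 3],
    "CREDIT_ACTIVE": ["Active", "Active"],
    "AMT_CREDIT_SUM": [1000.0, 500.0],
    "DAYS_CREDIT": [10, 5],
})

# Per-credit ratio, then the sum per application.
active = bureau[bureau["CREDIT_ACTIVE"] == "Active"].copy()
active["diff_1"] = active["AMT_CREDIT_SUM"] / active["DAYS_CREDIT"]
per_app = active.groupby("APPLICATION_NUMBER", as_index=False)["diff_1"].sum()

# Left merge keeps every application; unmatched ones get NaN.
merged = applications.merge(per_app, how="left", on="APPLICATION_NUMBER")

# Applications without a profile are set to 0 explicitly.
no_profile = ~merged["APPLICATION_NUMBER"].isin(profile_ids)
merged.loc[no_profile, "diff_1"] = 0

print(merged)
```

Application 1 (profile, no active credit) keeps NaN, application 2 (no profile) gets 0, and application 3 gets the summed ratio.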


In [123]:
test.head()

Unnamed: 0,APPLICATION_NUMBER,NAME_CONTRACT_TYPE,GENDER,CHILDRENS,TOTAL_SALARY,AMOUNT_CREDIT,AMOUNT_ANNUITY,EDUCATION_LEVEL,FAMILY_STATUS,REGION_POPULATION,...,EXTERNAL_SCORING_RATING_3,AMT_REQ_CREDIT_BUREAU_HOUR,AMT_REQ_CREDIT_BUREAU_DAY,AMT_REQ_CREDIT_BUREAU_WEEK,AMT_REQ_CREDIT_BUREAU_MON,AMT_REQ_CREDIT_BUREAU_QRT,AMT_REQ_CREDIT_BUREAU_YEAR,CREDIT_TIME,EXTERNAL_SCORING_RATING_mean,diff_1
0,123724268,Cash,M,0.0,117000.0,1125000.0,32895.0,Secondary / secondary special,Married,0.028663,...,1.0,0.0,0.0,0.0,0.0,1.0,4.0,34.199726,0.876089,
1,123456549,Cash,F,2.0,81000.0,312768.0,17095.5,Secondary / secondary special,Married,0.019689,...,0.18849,0.0,0.0,1.0,0.0,0.0,2.0,18.295341,0.588884,
2,123428178,Credit Card,F,2.0,157500.0,450000.0,22500.0,Secondary / secondary special,Married,0.019101,...,0.382502,0.0,0.0,0.0,0.0,1.0,6.0,20.0,0.511682,2700.0
3,123619984,Cash,,,,,,,,,...,1.0,,,,,,,,1.0,0.0
4,123671104,Cash,F,1.0,90000.0,254700.0,24939.0,Higher education,Married,0.015221,...,0.415347,0.0,0.0,0.0,0.0,1.0,0.0,10.21292,0.546552,42724.734376


### 6.4 Final datasets

In [124]:
# a final set of features
features = ['TARGET', 'NAME_CONTRACT_TYPE',
 'GENDER', 
 'EDUCATION_LEVEL',
 'FAMILY_STATUS', 
 'AGE',
 'DAYS_ON_LAST_JOB',
 'OWN_CAR_AGE',  
 'EXTERNAL_SCORING_RATING_1',
 'EXTERNAL_SCORING_RATING_2',
 'EXTERNAL_SCORING_RATING_3',
 'CREDIT_TIME',
 'AMOUNT_CREDIT',
 'AMOUNT_ANNUITY',
 'REGION_POPULATION',
 'AMT_REQ_CREDIT_BUREAU_HOUR',
 'AMT_REQ_CREDIT_BUREAU_DAY',
 'AMT_REQ_CREDIT_BUREAU_WEEK',
 'AMT_REQ_CREDIT_BUREAU_MON',
 'AMT_REQ_CREDIT_BUREAU_QRT', 
 'EXTERNAL_SCORING_RATING_mean', 
 'diff_1', 
 ]

In [125]:
train = train[features]

In [126]:
features.remove('TARGET')

In [127]:
test_AN = test['APPLICATION_NUMBER']

In [128]:
test = test[features]

In [129]:
test.head()

Unnamed: 0,NAME_CONTRACT_TYPE,GENDER,EDUCATION_LEVEL,FAMILY_STATUS,AGE,DAYS_ON_LAST_JOB,OWN_CAR_AGE,EXTERNAL_SCORING_RATING_1,EXTERNAL_SCORING_RATING_2,EXTERNAL_SCORING_RATING_3,...,AMOUNT_CREDIT,AMOUNT_ANNUITY,REGION_POPULATION,AMT_REQ_CREDIT_BUREAU_HOUR,AMT_REQ_CREDIT_BUREAU_DAY,AMT_REQ_CREDIT_BUREAU_WEEK,AMT_REQ_CREDIT_BUREAU_MON,AMT_REQ_CREDIT_BUREAU_QRT,EXTERNAL_SCORING_RATING_mean,diff_1
0,Cash,M,Secondary / secondary special,Married,16007.0,2646.0,20.0,1.0,0.628266,1.0,...,1125000.0,32895.0,0.028663,0.0,0.0,0.0,0.0,1.0,0.876089,
1,Cash,F,Secondary / secondary special,Married,10315.0,459.0,,1.0,0.578161,0.18849,...,312768.0,17095.5,0.019689,0.0,0.0,1.0,0.0,0.0,0.588884,
2,Credit Card,F,Secondary / secondary special,Married,13016.0,977.0,,1.0,0.152544,0.382502,...,450000.0,22500.0,0.019101,0.0,0.0,0.0,0.0,1.0,0.511682,2700.0
3,Cash,,,,,,,1.0,1.0,1.0,...,,,,,,,,,1.0,0.0
4,Cash,F,Higher education,Married,17743.0,9258.0,,0.718604,0.505704,0.415347,...,254700.0,24939.0,0.015221,0.0,0.0,0.0,0.0,1.0,0.546552,42724.734376


In [130]:
train.head()

Unnamed: 0,TARGET,NAME_CONTRACT_TYPE,GENDER,EDUCATION_LEVEL,FAMILY_STATUS,AGE,DAYS_ON_LAST_JOB,OWN_CAR_AGE,EXTERNAL_SCORING_RATING_1,EXTERNAL_SCORING_RATING_2,...,AMOUNT_CREDIT,AMOUNT_ANNUITY,REGION_POPULATION,AMT_REQ_CREDIT_BUREAU_HOUR,AMT_REQ_CREDIT_BUREAU_DAY,AMT_REQ_CREDIT_BUREAU_WEEK,AMT_REQ_CREDIT_BUREAU_MON,AMT_REQ_CREDIT_BUREAU_QRT,EXTERNAL_SCORING_RATING_mean,diff_1
0,0,Cash,M,Secondary / secondary special,Married,15728.0,1719.0,11.0,0.700784,0.645914,...,855000.0,25128.0,0.019101,0.0,0.0,1.0,0.0,0.0,0.687756,
1,1,Cash,,,,,,,1.0,1.0,...,,,,,,,,,1.0,0.0
2,0,Cash,F,Higher education,Married,21557.0,3618.0,,1.0,0.682149,...,1006920.0,42660.0,0.026392,0.0,0.0,0.0,7.0,0.0,0.650006,4554.621849
3,1,Cash,M,Secondary / secondary special,Married,22338.0,,,1.0,0.171299,...,518562.0,22972.5,0.031329,0.0,0.0,0.0,0.0,0.0,0.447248,
4,1,Cash,,,,,,,1.0,1.0,...,,,,,,,,,1.0,0.0


### 6.5 LightGBM 

__6.5.1 LGBM model with final features__

Before fitting on the whole training set and predicting on the test set, I run cross-validation with early stopping and average the best number of estimators across the folds. This average is then used as the fixed number of estimators in the final model.
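
The splitter used in this section does not enable shuffling, so each validation fold is a contiguous block of rows, and `random_state` only has an effect when `shuffle=True`. The sketch below illustrates this with scikit-learn's `KFold`; the row count 110093 is taken from the fold sizes reported in the training log of this section (88074 + 22019), and `placeholder` is a hypothetical stand-in for the feature matrix.

```python
import numpy as np
from sklearn.model_selection import KFold

n_rows = 110093  # 88074 train + 22019 validation rows, as reported in the log
placeholder = np.zeros((n_rows, 1))  # only the number of rows matters here

cv = KFold(n_splits=5)  # shuffle defaults to False

valid_sizes = []
first_valid_idx = None
for fold, (train_idx, valid_idx) in enumerate(cv.split(placeholder)):
    valid_sizes.append(len(valid_idx))
    if fold == 0:
        first_valid_idx = valid_idx

print(valid_sizes)
print(first_valid_idx[:5], first_valid_idx[-1])
```

If the rows are ordered in some meaningful way (for example by time), contiguous folds behave differently from shuffled ones, so the choice is worth making deliberately.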

In [131]:
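# Folds are not shuffled, so random_state has no effect and is omitted
# (recent scikit-learn versions reject random_state without shuffle=True)
cv_strategy = KFold(n_splits=5)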

estimators, oof_score, fold_train_scores, fold_valid_scores, oof_predictions = make_cross_validation(
    train.drop(["TARGET"], axis=1), train["TARGET"], model, metric=roc_auc_score, cv_strategy=cv_strategy
)



Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1593]	training's auc: 0.826169	training's binary_logloss: 0.22415	valid_1's auc: 0.732335	valid_1's binary_logloss: 0.256325
Fold: 1, train-observations = 88074, valid-observations = 22019
train-score = 0.8262, valid-score = 0.7323, target_median = 0.06431217116199961
Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[2120]	training's auc: 0.847243	training's binary_logloss: 0.216086	valid_1's auc: 0.729083	valid_1's binary_logloss: 0.260528
Fold: 2, train-observations = 88074, valid-observations = 22019
train-score = 0.8472, valid-score = 0.7291, target_median = 0.06270943347445858
Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1245]	training's auc: 0.814281	training's binary_logloss: 0.230141	valid_1's auc: 0.739057	valid_1's binary_logloss: 0.249937
Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.8143, valid-score = 0.7391, target_median = 0.06542598300522556
Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[2216]	training's auc: 0.849925	training's binary_logloss: 0.216873	valid_1's auc: 0.729174	valid_1's binary_logloss: 0.251797
Fold: 4, train-observations = 88075, valid-observations = 22018
train-score = 0.8499, valid-score = 0.7292, target_median = 0.06262188921016779
Training until validation scores don't improve for 500 rounds
Early stopping, best iteration is:
[1603]	training's auc: 0.829598	training's binary_logloss: 0.226533	valid_1's auc: 0.74166	valid_1's binary_logloss: 0.242391
Fold: 5, train-observations = 88075, valid-observations = 22018
train-score = 0.8296, valid-score = 0.7417, target_median = 0.06620613036055613
CV-results train: 0.8334 +/- 0.013
CV-results valid: 0.7343 +/- 0.005
OOF-score = 0.7333
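
The two `CV-results` lines can be reproduced from the per-fold AUC values printed in the log. This is a minimal sketch assuming the summary is the mean and the population standard deviation (NumPy's default, `ddof=0`) of the fold scores; how `make_cross_validation` actually computes it is not shown in this section.

```python
import numpy as np

# Per-fold AUC values copied from the training log above.
fold_train_scores = [0.826169, 0.847243, 0.814281, 0.849925, 0.829598]
fold_valid_scores = [0.732335, 0.729083, 0.739057, 0.729174, 0.74166]

train_summary = f"{np.mean(fold_train_scores):.4f} +/- {np.std(fold_train_scores):.3f}"
valid_summary = f"{np.mean(fold_valid_scores):.4f} +/- {np.std(fold_valid_scores):.3f}"

print("CV-results train:", train_summary)
print("CV-results valid:", valid_summary)
```

The gap of roughly 0.10 between the train and validation AUC indicates that the model fits the training folds considerably better than unseen data. The OOF score (0.7333) is slightly lower than the mean validation score, presumably because it is computed once on the pooled out-of-fold predictions rather than averaged per fold.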


__6.5.2 LGBM model for predictions on a test dataset__

In [132]:
# average best iteration across the five validation folds
np.mean([1593, 2120, 1245, 2216, 1603])

1755.4
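
The best iterations above are typed in by hand. As a hedged alternative, they could be read from the fitted fold models, assuming that `estimators` returned by `make_cross_validation` is a list of fitted `LGBMClassifier` objects trained with early stopping (which expose `best_iteration_`). The sketch uses `SimpleNamespace` stand-ins so that it runs without training anything.

```python
from types import SimpleNamespace

# Stand-ins for the fitted fold models; real LGBMClassifier objects trained
# with early stopping expose the same attribute name, best_iteration_.
estimators = [
    SimpleNamespace(best_iteration_=n) for n in (1593, 2120, 1245, 2216, 1603)
]

best_iterations = [est.best_iteration_ for est in estimators]
mean_best_iteration = sum(best_iterations) / len(best_iterations)

# n_estimators must be an integer, so the mean is rounded.
n_estimators_final = int(round(mean_best_iteration))

print(mean_best_iteration, n_estimators_final)
```

Reading the values from the models avoids copy errors when the cross-validation is re-run with different features or parameters.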

In [138]:
params = {
    'boosting_type': 'gbdt',
    'n_estimators': 1755,  # rounded mean of the best iterations from cross-validation
    'learning_rate': 0.005134,
    'num_leaves': 54,
    'max_depth': 10,
    'subsample_for_bin': 240000,
    'reg_alpha': 0.436193,
    'reg_lambda': 0.479169,
    'colsample_bytree': 0.508716,
    'min_split_gain': 0.024766,
    'subsample': 0.7,
    'is_unbalance': False,
    'random_state': 42,
    'silent': -1,
    'verbose': -1
}

In [139]:
x_train = train.drop(["TARGET"], axis=1)
y_train = train["TARGET"]

In [140]:
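model = lgb.LGBMClassifier(**params)
# The model is trained on the whole training set, so the reported AUC is a
# training score only; early stopping is not used here.
# Note: LightGBM 4.x no longer accepts verbose in fit();
# use callbacks=[lgb.log_evaluation(200)] there instead.
model.fit(
    X=x_train,
    y=y_train,
    eval_set=[(x_train, y_train)],
    eval_metric="auc",
    verbose=200
)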

[50]	training's auc: 0.734748	training's binary_logloss: 0.271551
[100]	training's auc: 0.737772	training's binary_logloss: 0.265371
[150]	training's auc: 0.741419	training's binary_logloss: 0.261018
[200]	training's auc: 0.74439	training's binary_logloss: 0.25766
[250]	training's auc: 0.747636	training's binary_logloss: 0.254951
[300]	training's auc: 0.750704	training's binary_logloss: 0.25276
[350]	training's auc: 0.753871	training's binary_logloss: 0.25085
[400]	training's auc: 0.757061	training's binary_logloss: 0.249225
[450]	training's auc: 0.760281	training's binary_logloss: 0.247736
[500]	training's auc: 0.763952	training's binary_logloss: 0.246358
[550]	training's auc: 0.766909	training's binary_logloss: 0.245137
[600]	training's auc: 0.769855	training's binary_logloss: 0.243983
[650]	training's auc: 0.772681	training's binary_logloss: 0.242913
[700]	training's auc: 0.775506	training's binary_logloss: 0.241872
[750]	training's auc: 0.778229	training's binary_logloss: 0.240904


LGBMClassifier(colsample_bytree=0.508716, is_unbalance=False,
               learning_rate=0.005134, max_depth=10, min_split_gain=0.024766,
               n_estimators=1755, num_leaves=54, random_state=42,
               reg_alpha=0.436193, reg_lambda=0.479169, silent=-1,
               subsample=0.7, subsample_for_bin=240000, verbose=-1)

In [141]:
y_pred_test_lgbm = model.predict_proba(test)

In [142]:
y_pred_test_lgbm

array([[0.96537049, 0.03462951],
       [0.73901369, 0.26098631],
       [0.8603227 , 0.1396773 ],
       ...,
       [0.91480179, 0.08519821],
       [0.980853  , 0.019147  ],
       [0.95463667, 0.04536333]])
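
`predict_proba` returns one column per class, in the order of `model.classes_`. For a 0/1 target the second column is therefore the predicted probability of `TARGET = 1`, which is the column that matters for ROC-AUC. A minimal check on the first three rows printed above:

```python
import numpy as np

# First three rows of the predict_proba output shown above.
proba = np.array([
    [0.96537049, 0.03462951],
    [0.73901369, 0.26098631],
    [0.8603227, 0.1396773],
])

# Each row holds P(TARGET=0) and P(TARGET=1), so it sums to 1.
row_sums = proba.sum(axis=1)

# Only the positive-class column is kept for the submission.
positive_class_proba = proba[:, 1]

print(row_sums, positive_class_proba)
```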

In [145]:
result_test_x = pd.DataFrame({
    "APPLICATION_NUMBER": test_AN,
    "TARGET": y_pred_test_lgbm[:,1]})
result_test_x.head()

Unnamed: 0,APPLICATION_NUMBER,TARGET
0,123724268,0.03463
1,123456549,0.260986
2,123428178,0.139677
3,123619984,0.085198
4,123671104,0.014274


In [146]:
filename = 'lgbm_only.csv'
result_test_x.to_csv(filename, index=False)  # do not write the DataFrame index as a column
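
Before uploading, the saved file can be sanity-checked. The sketch below is one possible set of checks, not part of the original workflow: it writes a small frame to an in-memory buffer instead of a real file and verifies the columns, the value range and the uniqueness of the ids. The three rows are taken from the preview above.

```python
import io
import pandas as pd

submission = pd.DataFrame({
    "APPLICATION_NUMBER": [123724268, 123456549, 123428178],
    "TARGET": [0.03463, 0.260986, 0.139677],
})

# Write to memory and read back, as a stand-in for the CSV file on disk.
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

checks = {
    "columns": list(restored.columns) == ["APPLICATION_NUMBER", "TARGET"],
    "no_missing": not restored.isna().any().any(),
    "probabilities_in_range": bool(restored["TARGET"].between(0, 1).all()),
    "unique_ids": bool(restored["APPLICATION_NUMBER"].is_unique),
}
print(checks)
```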

### 6.6 XGBoost

__6.6.1 XGBoost model with final features__

The same procedure is repeated for XGBoost. The categorical features are first converted to dummy variables.

In [147]:
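# Convert categorical features to numeric dummy columns
def tun_dataset(data):
    # Numeric columns are kept as they are
    numerical_features = data.select_dtypes(include=[np.number])
    # Only columns of pandas 'category' dtype are encoded; columns of any other
    # non-numeric dtype (e.g. object) are dropped
    categorical_features = pd.get_dummies(data.select_dtypes(include=['category']), drop_first=True)
    num_plus_cat = numerical_features.join(categorical_features)
    return num_plus_cat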

In [148]:
train = tun_dataset(train)

In [149]:
test = tun_dataset(test)
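
Because the encoding is applied to train and test separately, the two frames can end up with different dummy columns when a category is present in only one of them. The sketch below shows one way to guard against that with `reindex`. It is an addition, not part of the original notebook; `to_numeric_frame` mirrors `tun_dataset`, and the toy frames and the "Incomplete higher" level are hypothetical.

```python
import numpy as np
import pandas as pd

def to_numeric_frame(data: pd.DataFrame) -> pd.DataFrame:
    """Numeric columns plus one-hot encoded 'category' columns (same idea as tun_dataset)."""
    numerical = data.select_dtypes(include=[np.number])
    categorical = pd.get_dummies(data.select_dtypes(include=["category"]), drop_first=True)
    return numerical.join(categorical)

toy_train = pd.DataFrame({
    "AMOUNT_CREDIT": [855000.0, 450000.0, 254700.0],
    "EDUCATION_LEVEL": pd.Categorical(
        ["Higher education", "Incomplete higher", "Secondary / secondary special"]
    ),
})
toy_test = pd.DataFrame({
    "AMOUNT_CREDIT": [312768.0, 1125000.0],
    "EDUCATION_LEVEL": pd.Categorical(
        ["Higher education", "Secondary / secondary special"]
    ),
})

train_encoded = to_numeric_frame(toy_train)
test_encoded = to_numeric_frame(toy_test)

# The test frame lacks the dummy for a level it never saw;
# reindex adds it (filled with 0) and enforces the training column order.
test_aligned = test_encoded.reindex(columns=train_encoded.columns, fill_value=0)

print(list(test_encoded.columns))
print(list(test_aligned.columns))
```

In the notebook the training frame still contains `TARGET` at this point, so the alignment would be done on the feature columns only.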

In [150]:
params = {
    "objective": "binary:logistic",
    "booster": "gbtree",
    "eval_metric": "auc",
    "eta": 0.05,
    "max_depth": 6,
    "gamma": 10,
    "subsample": 0.85,
    "colsample_bytree": 0.7,
    "colsample_bylevel": 0.632,
    "min_child_weight": 30,
    "alpha": 0,
    "lambda": 0,
    "nthread": 6,
    "random_seed": 27,
    "n_estimators": 5555,
}

In [151]:
model = xgb.XGBClassifier(**params)

In [152]:
# Folds are not shuffled, so random_state has no effect and is omitted
cv_strategy = KFold(n_splits=5)

estimators, oof_score, fold_train_scores, fold_valid_scores, oof_predictions = make_cross_validation(
    train.drop(["TARGET"], axis=1), train["TARGET"], model, metric=roc_auc_score, cv_strategy=cv_strategy
)



Parameters: { random_seed } might not be used.

  This may not be accurate due to some parameters are only used in language bindings but
  passed down to XGBoost core.  Or some parameters are not used but slip through this
  verification. Please open an issue if you find above cases.
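
The warning above means that XGBoost did not recognise the `random_seed` key, so the intended seed of 27 was most likely not applied. In the scikit-learn wrapper (`XGBClassifier`) the corresponding parameter is `random_state`; the native API calls it `seed`. A minimal sketch of renaming the key before building the model, with `normalize_seed_key` being a hypothetical helper and only a subset of the parameters shown:

```python
def normalize_seed_key(params: dict) -> dict:
    """Return a copy of params with 'random_seed' renamed to 'random_state'."""
    cleaned = dict(params)  # leave the original dictionary untouched
    if "random_seed" in cleaned:
        cleaned["random_state"] = cleaned.pop("random_seed")
    return cleaned

# Subset of the parameters used in this section.
xgb_params = {
    "objective": "binary:logistic",
    "max_depth": 6,
    "subsample": 0.85,
    "random_seed": 27,
}

cleaned_params = normalize_seed_key(xgb_params)
print(cleaned_params)
```

Since `subsample` and the column sampling parameters introduce randomness, applying the seed can change the scores slightly compared with the log shown here.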


[0]	validation_0-auc:0.65609	validation_1-auc:0.64318
Multiple eval metrics have been passed: 'validation_1-auc' will be used for early stopping.

Will train until validation_1-auc hasn't improved in 500 rounds.
[1]	validation_0-auc:0.68945	validation_1-auc:0.67182
[2]	validation_0-auc:0.69617	validation_1-auc:0.68083
[3]	validation_0-auc:0.69949	validation_1-auc:0.68537
[4]	validation_0-auc:0.70189	validation_1-auc:0.68643
[5]	validation_0-auc:0.70498	validation_1-auc:0.68961
[6]	validation_0-auc:0.70923	validation_1-auc:0.69282
[7]	validation_0-auc:0.71116	validation_1-auc:0.69533
[8]	validation_0-auc:0.71289	validation_1-auc:0.69804
[9]	validation_0-auc:0.71372	validation_1-auc:0.69813
[10]	validation

[140]	validation_0-auc:0.74804	validation_1-auc:0.72298
[141]	validation_0-auc:0.74814	validation_1-auc:0.72304
[142]	validation_0-auc:0.74813	validation_1-auc:0.72298
[143]	validation_0-auc:0.74823	validation_1-auc:0.72308
[144]	validation_0-auc:0.74823	validation_1-auc:0.72316
[145]	validation_0-auc:0.74833	validation_1-auc:0.72340
[146]	validation_0-auc:0.74890	validation_1-auc:0.72387
[147]	validation_0-auc:0.74890	validation_1-auc:0.72387
[148]	validation_0-auc:0.74891	validation_1-auc:0.72397
[149]	validation_0-auc:0.74891	validation_1-auc:0.72397
[150]	validation_0-auc:0.74890	validation_1-auc:0.72394
[151]	validation_0-auc:0.74905	validation_1-auc:0.72403
[152]	validation_0-auc:0.74905	validation_1-auc:0.72403
[153]	validation_0-auc:0.74916	validation_1-auc:0.72418
[154]	validation_0-auc:0.74918	validation_1-auc:0.72461
[155]	validation_0-auc:0.74913	validation_1-auc:0.72465
[156]	validation_0-auc:0.74914	validation_1-auc:0.72451
[157]	validation_0-auc:0.74914	validation_1-auc:

[287]	validation_0-auc:0.75616	validation_1-auc:0.72761
[288]	validation_0-auc:0.75616	validation_1-auc:0.72761
[289]	validation_0-auc:0.75616	validation_1-auc:0.72761
[290]	validation_0-auc:0.75641	validation_1-auc:0.72781
[291]	validation_0-auc:0.75641	validation_1-auc:0.72781
[292]	validation_0-auc:0.75681	validation_1-auc:0.72847
[293]	validation_0-auc:0.75681	validation_1-auc:0.72847
[294]	validation_0-auc:0.75681	validation_1-auc:0.72844
[295]	validation_0-auc:0.75682	validation_1-auc:0.72846
[296]	validation_0-auc:0.75682	validation_1-auc:0.72846
[297]	validation_0-auc:0.75684	validation_1-auc:0.72850
[298]	validation_0-auc:0.75684	validation_1-auc:0.72850
[299]	validation_0-auc:0.75689	validation_1-auc:0.72854
[300]	validation_0-auc:0.75689	validation_1-auc:0.72854
[301]	validation_0-auc:0.75689	validation_1-auc:0.72854
[302]	validation_0-auc:0.75689	validation_1-auc:0.72854
[303]	validation_0-auc:0.75689	validation_1-auc:0.72854
[304]	validation_0-auc:0.75689	validation_1-auc:

[434]	validation_0-auc:0.76025	validation_1-auc:0.72904
[435]	validation_0-auc:0.76031	validation_1-auc:0.72897
[436]	validation_0-auc:0.76049	validation_1-auc:0.72893
[437]	validation_0-auc:0.76049	validation_1-auc:0.72893
[438]	validation_0-auc:0.76046	validation_1-auc:0.72908
[439]	validation_0-auc:0.76041	validation_1-auc:0.72893
[440]	validation_0-auc:0.76042	validation_1-auc:0.72903
[441]	validation_0-auc:0.76042	validation_1-auc:0.72903
[442]	validation_0-auc:0.76042	validation_1-auc:0.72903
[443]	validation_0-auc:0.76042	validation_1-auc:0.72903
[444]	validation_0-auc:0.76047	validation_1-auc:0.72895
[445]	validation_0-auc:0.76047	validation_1-auc:0.72895
[446]	validation_0-auc:0.76055	validation_1-auc:0.72885
[447]	validation_0-auc:0.76055	validation_1-auc:0.72885
[448]	validation_0-auc:0.76055	validation_1-auc:0.72885
[449]	validation_0-auc:0.76055	validation_1-auc:0.72885
[450]	validation_0-auc:0.76055	validation_1-auc:0.72885
[451]	validation_0-auc:0.76055	validation_1-auc:

[581]	validation_0-auc:0.76387	validation_1-auc:0.72963
[582]	validation_0-auc:0.76387	validation_1-auc:0.72963
[583]	validation_0-auc:0.76387	validation_1-auc:0.72963
[584]	validation_0-auc:0.76387	validation_1-auc:0.72963
[585]	validation_0-auc:0.76387	validation_1-auc:0.72963
[586]	validation_0-auc:0.76387	validation_1-auc:0.72963
[587]	validation_0-auc:0.76389	validation_1-auc:0.72966
[588]	validation_0-auc:0.76389	validation_1-auc:0.72966
[589]	validation_0-auc:0.76389	validation_1-auc:0.72966
[590]	validation_0-auc:0.76389	validation_1-auc:0.72966
[591]	validation_0-auc:0.76392	validation_1-auc:0.72980
[592]	validation_0-auc:0.76392	validation_1-auc:0.72980
[593]	validation_0-auc:0.76400	validation_1-auc:0.72979
[594]	validation_0-auc:0.76400	validation_1-auc:0.72979
[595]	validation_0-auc:0.76400	validation_1-auc:0.72979
[596]	validation_0-auc:0.76409	validation_1-auc:0.72970
[597]	validation_0-auc:0.76409	validation_1-auc:0.72970
[598]	validation_0-auc:0.76409	validation_1-auc:

[728]	validation_0-auc:0.76686	validation_1-auc:0.72993
[729]	validation_0-auc:0.76698	validation_1-auc:0.72990
[730]	validation_0-auc:0.76707	validation_1-auc:0.72977
[731]	validation_0-auc:0.76713	validation_1-auc:0.72987
[732]	validation_0-auc:0.76713	validation_1-auc:0.72987
[733]	validation_0-auc:0.76713	validation_1-auc:0.72987
[734]	validation_0-auc:0.76709	validation_1-auc:0.72982
[735]	validation_0-auc:0.76709	validation_1-auc:0.72982
[736]	validation_0-auc:0.76709	validation_1-auc:0.72982
[737]	validation_0-auc:0.76709	validation_1-auc:0.72982
[738]	validation_0-auc:0.76711	validation_1-auc:0.72985
[739]	validation_0-auc:0.76711	validation_1-auc:0.72985
[740]	validation_0-auc:0.76711	validation_1-auc:0.72985
[741]	validation_0-auc:0.76711	validation_1-auc:0.72985
[742]	validation_0-auc:0.76709	validation_1-auc:0.72981
[743]	validation_0-auc:0.76709	validation_1-auc:0.72981
[744]	validation_0-auc:0.76717	validation_1-auc:0.73018
[745]	validation_0-auc:0.76726	validation_1-auc:

[875]	validation_0-auc:0.76915	validation_1-auc:0.73107
[876]	validation_0-auc:0.76930	validation_1-auc:0.73115
[877]	validation_0-auc:0.76930	validation_1-auc:0.73115
[878]	validation_0-auc:0.76930	validation_1-auc:0.73115
[879]	validation_0-auc:0.76928	validation_1-auc:0.73113
[880]	validation_0-auc:0.76928	validation_1-auc:0.73113
[881]	validation_0-auc:0.76928	validation_1-auc:0.73113
[882]	validation_0-auc:0.76928	validation_1-auc:0.73113
[883]	validation_0-auc:0.76928	validation_1-auc:0.73113
[884]	validation_0-auc:0.76928	validation_1-auc:0.73113
[885]	validation_0-auc:0.76928	validation_1-auc:0.73113
[886]	validation_0-auc:0.76931	validation_1-auc:0.73115
[887]	validation_0-auc:0.76931	validation_1-auc:0.73115
[888]	validation_0-auc:0.76931	validation_1-auc:0.73115
[889]	validation_0-auc:0.76931	validation_1-auc:0.73115
[890]	validation_0-auc:0.76931	validation_1-auc:0.73115
[891]	validation_0-auc:0.76931	validation_1-auc:0.73115
[892]	validation_0-auc:0.76931	validation_1-auc:

[1021]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1022]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1023]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1024]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1025]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1026]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1027]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1028]	validation_0-auc:0.77090	validation_1-auc:0.73140
[1029]	validation_0-auc:0.77102	validation_1-auc:0.73132
[1030]	validation_0-auc:0.77102	validation_1-auc:0.73132
[1031]	validation_0-auc:0.77109	validation_1-auc:0.73146
[1032]	validation_0-auc:0.77109	validation_1-auc:0.73146
[1033]	validation_0-auc:0.77109	validation_1-auc:0.73146
[1034]	validation_0-auc:0.77109	validation_1-auc:0.73146
[1035]	validation_0-auc:0.77109	validation_1-auc:0.73146
[1036]	validation_0-auc:0.77112	validation_1-auc:0.73166
[1037]	validation_0-auc:0.77120	validation_1-auc:0.73157
[1038]	validation_0-auc:0.77120

[1165]	validation_0-auc:0.77249	validation_1-auc:0.73136
[1166]	validation_0-auc:0.77249	validation_1-auc:0.73136
[1167]	validation_0-auc:0.77249	validation_1-auc:0.73136
[1168]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1169]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1170]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1171]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1172]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1173]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1174]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1175]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1176]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1177]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1178]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1179]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1180]	validation_0-auc:0.77251	validation_1-auc:0.73158
[1181]	validation_0-auc:0.77265	validation_1-auc:0.73144
[1182]	validation_0-auc:0.77265

[1309]	validation_0-auc:0.77372	validation_1-auc:0.73137
[1310]	validation_0-auc:0.77372	validation_1-auc:0.73137
[1311]	validation_0-auc:0.77372	validation_1-auc:0.73137
[1312]	validation_0-auc:0.77372	validation_1-auc:0.73137
[1313]	validation_0-auc:0.77377	validation_1-auc:0.73145
[1314]	validation_0-auc:0.77374	validation_1-auc:0.73142
[1315]	validation_0-auc:0.77374	validation_1-auc:0.73142
[1316]	validation_0-auc:0.77374	validation_1-auc:0.73142
[1317]	validation_0-auc:0.77375	validation_1-auc:0.73142
[1318]	validation_0-auc:0.77375	validation_1-auc:0.73142
[1319]	validation_0-auc:0.77375	validation_1-auc:0.73142
[1320]	validation_0-auc:0.77375	validation_1-auc:0.73142
[1321]	validation_0-auc:0.77377	validation_1-auc:0.73147
[1322]	validation_0-auc:0.77377	validation_1-auc:0.73147
[1323]	validation_0-auc:0.77377	validation_1-auc:0.73147
[1324]	validation_0-auc:0.77377	validation_1-auc:0.73147
[1325]	validation_0-auc:0.77377	validation_1-auc:0.73147
[1326]	validation_0-auc:0.77377

[1453]	validation_0-auc:0.77459	validation_1-auc:0.73149
[1454]	validation_0-auc:0.77460	validation_1-auc:0.73154
[1455]	validation_0-auc:0.77465	validation_1-auc:0.73172
[1456]	validation_0-auc:0.77472	validation_1-auc:0.73170
[1457]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1458]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1459]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1460]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1461]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1462]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1463]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1464]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1465]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1466]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1467]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1468]	validation_0-auc:0.77475	validation_1-auc:0.73173
[1469]	validation_0-auc:0.77486	validation_1-auc:0.73158
[1470]	validation_0-auc:0.77486

[8]	validation_0-auc:0.71511	validation_1-auc:0.70160
[9]	validation_0-auc:0.71465	validation_1-auc:0.69991
[10]	validation_0-auc:0.71541	validation_1-auc:0.70096
[11]	validation_0-auc:0.71486	validation_1-auc:0.70139
[12]	validation_0-auc:0.71405	validation_1-auc:0.70132
[13]	validation_0-auc:0.71428	validation_1-auc:0.69973
[14]	validation_0-auc:0.71380	validation_1-auc:0.69945
[15]	validation_0-auc:0.71465	validation_1-auc:0.69987
[16]	validation_0-auc:0.71391	validation_1-auc:0.70000
[17]	validation_0-auc:0.71337	validation_1-auc:0.69959
[18]	validation_0-auc:0.71370	validation_1-auc:0.69963
[19]	validation_0-auc:0.71358	validation_1-auc:0.69931
[20]	validation_0-auc:0.71353	validation_1-auc:0.69942
[21]	validation_0-auc:0.71433	validation_1-auc:0.70027
[22]	validation_0-auc:0.71448	validation_1-auc:0.70032
[23]	validation_0-auc:0.71535	validation_1-auc:0.70104
[24]	validation_0-auc:0.71563	validation_1-auc:0.70182
[25]	validation_0-auc:0.71589	validation_1-auc:0.70171
[26]	validat

[156]	validation_0-auc:0.75035	validation_1-auc:0.72430
[157]	validation_0-auc:0.75040	validation_1-auc:0.72432
[158]	validation_0-auc:0.75051	validation_1-auc:0.72433
[159]	validation_0-auc:0.75057	validation_1-auc:0.72436
[160]	validation_0-auc:0.75057	validation_1-auc:0.72436
[161]	validation_0-auc:0.75054	validation_1-auc:0.72439
[162]	validation_0-auc:0.75059	validation_1-auc:0.72413
[163]	validation_0-auc:0.75059	validation_1-auc:0.72413
[164]	validation_0-auc:0.75059	validation_1-auc:0.72413
[165]	validation_0-auc:0.75046	validation_1-auc:0.72381
[166]	validation_0-auc:0.75046	validation_1-auc:0.72381
[167]	validation_0-auc:0.75060	validation_1-auc:0.72389
[168]	validation_0-auc:0.75085	validation_1-auc:0.72400
[169]	validation_0-auc:0.75095	validation_1-auc:0.72401
[170]	validation_0-auc:0.75120	validation_1-auc:0.72408
[171]	validation_0-auc:0.75127	validation_1-auc:0.72406
[172]	validation_0-auc:0.75135	validation_1-auc:0.72410
[173]	validation_0-auc:0.75135	validation_1-auc:

[303]	validation_0-auc:0.75786	validation_1-auc:0.72635
[304]	validation_0-auc:0.75786	validation_1-auc:0.72635
[305]	validation_0-auc:0.75796	validation_1-auc:0.72640
[306]	validation_0-auc:0.75796	validation_1-auc:0.72640
[307]	validation_0-auc:0.75796	validation_1-auc:0.72640
[308]	validation_0-auc:0.75797	validation_1-auc:0.72628
[309]	validation_0-auc:0.75797	validation_1-auc:0.72628
[310]	validation_0-auc:0.75797	validation_1-auc:0.72628
[311]	validation_0-auc:0.75808	validation_1-auc:0.72636
[312]	validation_0-auc:0.75808	validation_1-auc:0.72636
[313]	validation_0-auc:0.75826	validation_1-auc:0.72651
[314]	validation_0-auc:0.75827	validation_1-auc:0.72645
[315]	validation_0-auc:0.75840	validation_1-auc:0.72636
[316]	validation_0-auc:0.75840	validation_1-auc:0.72636
[317]	validation_0-auc:0.75840	validation_1-auc:0.72636
[318]	validation_0-auc:0.75840	validation_1-auc:0.72636
[319]	validation_0-auc:0.75840	validation_1-auc:0.72636
[320]	validation_0-auc:0.75840	validation_1-auc:

[450]	validation_0-auc:0.76130	validation_1-auc:0.72712
[451]	validation_0-auc:0.76130	validation_1-auc:0.72712
[452]	validation_0-auc:0.76146	validation_1-auc:0.72712
[453]	validation_0-auc:0.76146	validation_1-auc:0.72712
[454]	validation_0-auc:0.76146	validation_1-auc:0.72712
[455]	validation_0-auc:0.76149	validation_1-auc:0.72716
[456]	validation_0-auc:0.76152	validation_1-auc:0.72714
[457]	validation_0-auc:0.76152	validation_1-auc:0.72714
[458]	validation_0-auc:0.76152	validation_1-auc:0.72714
[459]	validation_0-auc:0.76152	validation_1-auc:0.72714
[460]	validation_0-auc:0.76152	validation_1-auc:0.72714
[461]	validation_0-auc:0.76152	validation_1-auc:0.72714
[462]	validation_0-auc:0.76152	validation_1-auc:0.72714
[463]	validation_0-auc:0.76152	validation_1-auc:0.72714
[464]	validation_0-auc:0.76152	validation_1-auc:0.72714
[465]	validation_0-auc:0.76152	validation_1-auc:0.72714
[466]	validation_0-auc:0.76152	validation_1-auc:0.72714
[467]	validation_0-auc:0.76152	validation_1-auc:

[597]	validation_0-auc:0.76339	validation_1-auc:0.72732
[598]	validation_0-auc:0.76339	validation_1-auc:0.72732
[599]	validation_0-auc:0.76339	validation_1-auc:0.72749
[600]	validation_0-auc:0.76339	validation_1-auc:0.72749
[601]	validation_0-auc:0.76359	validation_1-auc:0.72762
[602]	validation_0-auc:0.76359	validation_1-auc:0.72762
[603]	validation_0-auc:0.76357	validation_1-auc:0.72759
[604]	validation_0-auc:0.76357	validation_1-auc:0.72759
[605]	validation_0-auc:0.76361	validation_1-auc:0.72756
[606]	validation_0-auc:0.76369	validation_1-auc:0.72760
[607]	validation_0-auc:0.76369	validation_1-auc:0.72760
[608]	validation_0-auc:0.76369	validation_1-auc:0.72760
[609]	validation_0-auc:0.76384	validation_1-auc:0.72779
[610]	validation_0-auc:0.76384	validation_1-auc:0.72779
[611]	validation_0-auc:0.76384	validation_1-auc:0.72779
[612]	validation_0-auc:0.76384	validation_1-auc:0.72779
[613]	validation_0-auc:0.76392	validation_1-auc:0.72786
[614]	validation_0-auc:0.76392	validation_1-auc:

[744]	validation_0-auc:0.76533	validation_1-auc:0.72796
[745]	validation_0-auc:0.76533	validation_1-auc:0.72796
[746]	validation_0-auc:0.76533	validation_1-auc:0.72796
[747]	validation_0-auc:0.76533	validation_1-auc:0.72798
[748]	validation_0-auc:0.76533	validation_1-auc:0.72798
[749]	validation_0-auc:0.76533	validation_1-auc:0.72798
[750]	validation_0-auc:0.76533	validation_1-auc:0.72798
[751]	validation_0-auc:0.76533	validation_1-auc:0.72798
[752]	validation_0-auc:0.76533	validation_1-auc:0.72798
[753]	validation_0-auc:0.76533	validation_1-auc:0.72798
[754]	validation_0-auc:0.76533	validation_1-auc:0.72798
[755]	validation_0-auc:0.76533	validation_1-auc:0.72798
[756]	validation_0-auc:0.76533	validation_1-auc:0.72798
[757]	validation_0-auc:0.76533	validation_1-auc:0.72798
[758]	validation_0-auc:0.76533	validation_1-auc:0.72798
[759]	validation_0-auc:0.76535	validation_1-auc:0.72789
[760]	validation_0-auc:0.76535	validation_1-auc:0.72789
[761]	validation_0-auc:0.76535	validation_1-auc:

[891]	validation_0-auc:0.76713	validation_1-auc:0.72869
[892]	validation_0-auc:0.76713	validation_1-auc:0.72864
[893]	validation_0-auc:0.76713	validation_1-auc:0.72864
[894]	validation_0-auc:0.76713	validation_1-auc:0.72864
[895]	validation_0-auc:0.76713	validation_1-auc:0.72864
[896]	validation_0-auc:0.76733	validation_1-auc:0.72859
[897]	validation_0-auc:0.76733	validation_1-auc:0.72859
[898]	validation_0-auc:0.76741	validation_1-auc:0.72852
[899]	validation_0-auc:0.76741	validation_1-auc:0.72852
[900]	validation_0-auc:0.76741	validation_1-auc:0.72852
[901]	validation_0-auc:0.76741	validation_1-auc:0.72852
[902]	validation_0-auc:0.76741	validation_1-auc:0.72852
[903]	validation_0-auc:0.76741	validation_1-auc:0.72852
[904]	validation_0-auc:0.76741	validation_1-auc:0.72852
[905]	validation_0-auc:0.76741	validation_1-auc:0.72852
[906]	validation_0-auc:0.76753	validation_1-auc:0.72845
[907]	validation_0-auc:0.76753	validation_1-auc:0.72845
[908]	validation_0-auc:0.76757	validation_1-auc:

[1037]	validation_0-auc:0.76924	validation_1-auc:0.72810
[1038]	validation_0-auc:0.76924	validation_1-auc:0.72810
[1039]	validation_0-auc:0.76924	validation_1-auc:0.72810
[1040]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1041]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1042]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1043]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1044]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1045]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1046]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1047]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1048]	validation_0-auc:0.76925	validation_1-auc:0.72807
[1049]	validation_0-auc:0.76933	validation_1-auc:0.72812
[1050]	validation_0-auc:0.76933	validation_1-auc:0.72812
[1051]	validation_0-auc:0.76933	validation_1-auc:0.72812
[1052]	validation_0-auc:0.76933	validation_1-auc:0.72812
[1053]	validation_0-auc:0.76933	validation_1-auc:0.72812
[1054]	validation_0-auc:0.76941

[1181]	validation_0-auc:0.77047	validation_1-auc:0.72823
[1182]	validation_0-auc:0.77047	validation_1-auc:0.72823
[1183]	validation_0-auc:0.77043	validation_1-auc:0.72820
[1184]	validation_0-auc:0.77043	validation_1-auc:0.72820
[1185]	validation_0-auc:0.77053	validation_1-auc:0.72817
[1186]	validation_0-auc:0.77053	validation_1-auc:0.72817
[1187]	validation_0-auc:0.77053	validation_1-auc:0.72817
[1188]	validation_0-auc:0.77053	validation_1-auc:0.72817
[1189]	validation_0-auc:0.77053	validation_1-auc:0.72817
[1190]	validation_0-auc:0.77061	validation_1-auc:0.72811
[1191]	validation_0-auc:0.77061	validation_1-auc:0.72811
[1192]	validation_0-auc:0.77061	validation_1-auc:0.72811
[1193]	validation_0-auc:0.77061	validation_1-auc:0.72811
[1194]	validation_0-auc:0.77061	validation_1-auc:0.72811
[1195]	validation_0-auc:0.77061	validation_1-auc:0.72811
[1196]	validation_0-auc:0.77064	validation_1-auc:0.72821
[1197]	validation_0-auc:0.77073	validation_1-auc:0.72828
[1198]	validation_0-auc:0.77073

[1325]	validation_0-auc:0.77185	validation_1-auc:0.72828
[1326]	validation_0-auc:0.77185	validation_1-auc:0.72828
[1327]	validation_0-auc:0.77185	validation_1-auc:0.72828
[1328]	validation_0-auc:0.77185	validation_1-auc:0.72828
[1329]	validation_0-auc:0.77185	validation_1-auc:0.72828
[1330]	validation_0-auc:0.77185	validation_1-auc:0.72828
[1331]	validation_0-auc:0.77181	validation_1-auc:0.72836
[1332]	validation_0-auc:0.77181	validation_1-auc:0.72836
[1333]	validation_0-auc:0.77181	validation_1-auc:0.72836
[1334]	validation_0-auc:0.77181	validation_1-auc:0.72836
[1335]	validation_0-auc:0.77196	validation_1-auc:0.72829
[1336]	validation_0-auc:0.77196	validation_1-auc:0.72829
[1337]	validation_0-auc:0.77196	validation_1-auc:0.72829
[1338]	validation_0-auc:0.77198	validation_1-auc:0.72833
[1339]	validation_0-auc:0.77198	validation_1-auc:0.72833
[1340]	validation_0-auc:0.77198	validation_1-auc:0.72833
[1341]	validation_0-auc:0.77198	validation_1-auc:0.72833
[1342]	validation_0-auc:0.77198

[73]	validation_0-auc:0.73487	validation_1-auc:0.72742
[74]	validation_0-auc:0.73552	validation_1-auc:0.72726
[75]	validation_0-auc:0.73540	validation_1-auc:0.72735
[76]	validation_0-auc:0.73584	validation_1-auc:0.72788
[77]	validation_0-auc:0.73601	validation_1-auc:0.72822
[78]	validation_0-auc:0.73629	validation_1-auc:0.72833
[79]	validation_0-auc:0.73642	validation_1-auc:0.72831
[80]	validation_0-auc:0.73668	validation_1-auc:0.72874
[81]	validation_0-auc:0.73717	validation_1-auc:0.72887
[82]	validation_0-auc:0.73750	validation_1-auc:0.72918
[83]	validation_0-auc:0.73794	validation_1-auc:0.72893
[84]	validation_0-auc:0.73801	validation_1-auc:0.72921
[85]	validation_0-auc:0.73817	validation_1-auc:0.72908
[86]	validation_0-auc:0.73845	validation_1-auc:0.72952
[87]	validation_0-auc:0.73857	validation_1-auc:0.72958
[88]	validation_0-auc:0.73889	validation_1-auc:0.72972
[89]	validation_0-auc:0.73926	validation_1-auc:0.72993
[90]	validation_0-auc:0.73960	validation_1-auc:0.72991
[91]	valid

[220]	validation_0-auc:0.75320	validation_1-auc:0.73596
[221]	validation_0-auc:0.75320	validation_1-auc:0.73596
[222]	validation_0-auc:0.75341	validation_1-auc:0.73603
[223]	validation_0-auc:0.75343	validation_1-auc:0.73588
[224]	validation_0-auc:0.75341	validation_1-auc:0.73590
[225]	validation_0-auc:0.75363	validation_1-auc:0.73601
[226]	validation_0-auc:0.75363	validation_1-auc:0.73601
[227]	validation_0-auc:0.75365	validation_1-auc:0.73591
[228]	validation_0-auc:0.75364	validation_1-auc:0.73588
[229]	validation_0-auc:0.75364	validation_1-auc:0.73588
[230]	validation_0-auc:0.75371	validation_1-auc:0.73583
[231]	validation_0-auc:0.75371	validation_1-auc:0.73583
[232]	validation_0-auc:0.75376	validation_1-auc:0.73585
[233]	validation_0-auc:0.75376	validation_1-auc:0.73585
[234]	validation_0-auc:0.75381	validation_1-auc:0.73604
[235]	validation_0-auc:0.75384	validation_1-auc:0.73602
[236]	validation_0-auc:0.75401	validation_1-auc:0.73585
[237]	validation_0-auc:0.75401	validation_1-auc:

[367]	validation_0-auc:0.75816	validation_1-auc:0.73716
[368]	validation_0-auc:0.75816	validation_1-auc:0.73716
[369]	validation_0-auc:0.75816	validation_1-auc:0.73716
[370]	validation_0-auc:0.75816	validation_1-auc:0.73716
[371]	validation_0-auc:0.75842	validation_1-auc:0.73737
[372]	validation_0-auc:0.75842	validation_1-auc:0.73737
[373]	validation_0-auc:0.75842	validation_1-auc:0.73737
[374]	validation_0-auc:0.75842	validation_1-auc:0.73737
[375]	validation_0-auc:0.75842	validation_1-auc:0.73737
[376]	validation_0-auc:0.75842	validation_1-auc:0.73737
[377]	validation_0-auc:0.75842	validation_1-auc:0.73737
[378]	validation_0-auc:0.75842	validation_1-auc:0.73737
[379]	validation_0-auc:0.75842	validation_1-auc:0.73737
[380]	validation_0-auc:0.75849	validation_1-auc:0.73711
[381]	validation_0-auc:0.75849	validation_1-auc:0.73711
[382]	validation_0-auc:0.75849	validation_1-auc:0.73711
[383]	validation_0-auc:0.75850	validation_1-auc:0.73715
[384]	validation_0-auc:0.75851	validation_1-auc:

[514]	validation_0-auc:0.76140	validation_1-auc:0.73803
[515]	validation_0-auc:0.76146	validation_1-auc:0.73797
[516]	validation_0-auc:0.76149	validation_1-auc:0.73802
[517]	validation_0-auc:0.76149	validation_1-auc:0.73802
[518]	validation_0-auc:0.76159	validation_1-auc:0.73807
[519]	validation_0-auc:0.76159	validation_1-auc:0.73807
[520]	validation_0-auc:0.76159	validation_1-auc:0.73807
[521]	validation_0-auc:0.76159	validation_1-auc:0.73807
[522]	validation_0-auc:0.76159	validation_1-auc:0.73807
[523]	validation_0-auc:0.76166	validation_1-auc:0.73812
[524]	validation_0-auc:0.76166	validation_1-auc:0.73812
[525]	validation_0-auc:0.76166	validation_1-auc:0.73812
[526]	validation_0-auc:0.76167	validation_1-auc:0.73818
[527]	validation_0-auc:0.76167	validation_1-auc:0.73818
[528]	validation_0-auc:0.76167	validation_1-auc:0.73818
[529]	validation_0-auc:0.76167	validation_1-auc:0.73818
[530]	validation_0-auc:0.76167	validation_1-auc:0.73818
[531]	validation_0-auc:0.76167	validation_1-auc:

[661]	validation_0-auc:0.76459	validation_1-auc:0.73882
[662]	validation_0-auc:0.76459	validation_1-auc:0.73882
[663]	validation_0-auc:0.76459	validation_1-auc:0.73882
[664]	validation_0-auc:0.76459	validation_1-auc:0.73882
[665]	validation_0-auc:0.76459	validation_1-auc:0.73882
[666]	validation_0-auc:0.76468	validation_1-auc:0.73874
[667]	validation_0-auc:0.76470	validation_1-auc:0.73871
[668]	validation_0-auc:0.76470	validation_1-auc:0.73871
[669]	validation_0-auc:0.76470	validation_1-auc:0.73871
[670]	validation_0-auc:0.76470	validation_1-auc:0.73871
[671]	validation_0-auc:0.76470	validation_1-auc:0.73871
[672]	validation_0-auc:0.76471	validation_1-auc:0.73855
[673]	validation_0-auc:0.76480	validation_1-auc:0.73850
[674]	validation_0-auc:0.76485	validation_1-auc:0.73848
[675]	validation_0-auc:0.76485	validation_1-auc:0.73848
[676]	validation_0-auc:0.76491	validation_1-auc:0.73838
[677]	validation_0-auc:0.76491	validation_1-auc:0.73838
[678]	validation_0-auc:0.76491	validation_1-auc:

[808]	validation_0-auc:0.76691	validation_1-auc:0.73892
[809]	validation_0-auc:0.76691	validation_1-auc:0.73892
[810]	validation_0-auc:0.76691	validation_1-auc:0.73892
[811]	validation_0-auc:0.76691	validation_1-auc:0.73892
[812]	validation_0-auc:0.76696	validation_1-auc:0.73906
[813]	validation_0-auc:0.76696	validation_1-auc:0.73906
[814]	validation_0-auc:0.76696	validation_1-auc:0.73906
[815]	validation_0-auc:0.76696	validation_1-auc:0.73906
[816]	validation_0-auc:0.76696	validation_1-auc:0.73906
[817]	validation_0-auc:0.76696	validation_1-auc:0.73906
[818]	validation_0-auc:0.76696	validation_1-auc:0.73906
[819]	validation_0-auc:0.76696	validation_1-auc:0.73906
[820]	validation_0-auc:0.76696	validation_1-auc:0.73906
[821]	validation_0-auc:0.76696	validation_1-auc:0.73906
[822]	validation_0-auc:0.76696	validation_1-auc:0.73906
[823]	validation_0-auc:0.76696	validation_1-auc:0.73906
[824]	validation_0-auc:0.76696	validation_1-auc:0.73906
[825]	validation_0-auc:0.76696	validation_1-auc:

[955]	validation_0-auc:0.76798	validation_1-auc:0.73915
[956]	validation_0-auc:0.76798	validation_1-auc:0.73915
[957]	validation_0-auc:0.76798	validation_1-auc:0.73915
[958]	validation_0-auc:0.76800	validation_1-auc:0.73921
[959]	validation_0-auc:0.76800	validation_1-auc:0.73921
[960]	validation_0-auc:0.76800	validation_1-auc:0.73921
[961]	validation_0-auc:0.76800	validation_1-auc:0.73921
[962]	validation_0-auc:0.76800	validation_1-auc:0.73921
[963]	validation_0-auc:0.76802	validation_1-auc:0.73919
[964]	validation_0-auc:0.76802	validation_1-auc:0.73919
[965]	validation_0-auc:0.76802	validation_1-auc:0.73916
[966]	validation_0-auc:0.76802	validation_1-auc:0.73916
[967]	validation_0-auc:0.76799	validation_1-auc:0.73923
[968]	validation_0-auc:0.76799	validation_1-auc:0.73923
[969]	validation_0-auc:0.76799	validation_1-auc:0.73923
[970]	validation_0-auc:0.76799	validation_1-auc:0.73923
[971]	validation_0-auc:0.76799	validation_1-auc:0.73923
[972]	validation_0-auc:0.76799	validation_1-auc:

[1100]	validation_0-auc:0.76885	validation_1-auc:0.73901
[1101]	validation_0-auc:0.76885	validation_1-auc:0.73901
[1102]	validation_0-auc:0.76885	validation_1-auc:0.73901
[1103]	validation_0-auc:0.76893	validation_1-auc:0.73890
[1104]	validation_0-auc:0.76893	validation_1-auc:0.73890
[1105]	validation_0-auc:0.76893	validation_1-auc:0.73890
[1106]	validation_0-auc:0.76894	validation_1-auc:0.73909
[1107]	validation_0-auc:0.76894	validation_1-auc:0.73909
[1108]	validation_0-auc:0.76897	validation_1-auc:0.73919
[1109]	validation_0-auc:0.76897	validation_1-auc:0.73919
[1110]	validation_0-auc:0.76897	validation_1-auc:0.73919
[1111]	validation_0-auc:0.76897	validation_1-auc:0.73919
[1112]	validation_0-auc:0.76905	validation_1-auc:0.73926
[1113]	validation_0-auc:0.76905	validation_1-auc:0.73926
[1114]	validation_0-auc:0.76905	validation_1-auc:0.73926
[1115]	validation_0-auc:0.76905	validation_1-auc:0.73926
[1116]	validation_0-auc:0.76905	validation_1-auc:0.73926
[1117]	validation_0-auc:0.76912

[1244]	validation_0-auc:0.77063	validation_1-auc:0.73917
[1245]	validation_0-auc:0.77063	validation_1-auc:0.73917
[1246]	validation_0-auc:0.77063	validation_1-auc:0.73917
[1247]	validation_0-auc:0.77063	validation_1-auc:0.73917
[1248]	validation_0-auc:0.77063	validation_1-auc:0.73917
[1249]	validation_0-auc:0.77063	validation_1-auc:0.73917
[1250]	validation_0-auc:0.77063	validation_1-auc:0.73917
[1251]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1252]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1253]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1254]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1255]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1256]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1257]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1258]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1259]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1260]	validation_0-auc:0.77067	validation_1-auc:0.73913
[1261]	validation_0-auc:0.77067

[1388]	validation_0-auc:0.77160	validation_1-auc:0.73924
[1389]	validation_0-auc:0.77160	validation_1-auc:0.73924
[1390]	validation_0-auc:0.77160	validation_1-auc:0.73924
[1391]	validation_0-auc:0.77160	validation_1-auc:0.73924
[1392]	validation_0-auc:0.77160	validation_1-auc:0.73924
[1393]	validation_0-auc:0.77160	validation_1-auc:0.73924
[1394]	validation_0-auc:0.77160	validation_1-auc:0.73924
[1395]	validation_0-auc:0.77160	validation_1-auc:0.73924
Stopping. Best iteration:
[895]	validation_0-auc:0.76745	validation_1-auc:0.73947

Fold: 3, train-observations = 88074, valid-observations = 22019
train-score = 0.7675, valid-score = 0.7395, target_median = 0.0650995671749115
Parameters: { random_seed } might not be used.

  This may not be accurate due to some parameters are only used in language bindings but
  passed down to XGBoost core.  Or some parameters are not used but slip through this
  verification. Please open an issue if you find above cases.


[0]	validation_0-auc:0.65747	va

[127]	validation_0-auc:0.74574	validation_1-auc:0.72012
[128]	validation_0-auc:0.74576	validation_1-auc:0.72001
[129]	validation_0-auc:0.74576	validation_1-auc:0.72001
[130]	validation_0-auc:0.74596	validation_1-auc:0.72000
[131]	validation_0-auc:0.74611	validation_1-auc:0.72017
[132]	validation_0-auc:0.74632	validation_1-auc:0.72017
[133]	validation_0-auc:0.74632	validation_1-auc:0.72017
[134]	validation_0-auc:0.74635	validation_1-auc:0.72025
[135]	validation_0-auc:0.74658	validation_1-auc:0.72051
[136]	validation_0-auc:0.74665	validation_1-auc:0.72059
[137]	validation_0-auc:0.74676	validation_1-auc:0.72069
[138]	validation_0-auc:0.74696	validation_1-auc:0.72091
[139]	validation_0-auc:0.74699	validation_1-auc:0.72081
[140]	validation_0-auc:0.74723	validation_1-auc:0.72097
[141]	validation_0-auc:0.74742	validation_1-auc:0.72074
[142]	validation_0-auc:0.74805	validation_1-auc:0.72130
[143]	validation_0-auc:0.74809	validation_1-auc:0.72131
[144]	validation_0-auc:0.74808	validation_1-auc:

[274]	validation_0-auc:0.75631	validation_1-auc:0.72554
[275]	validation_0-auc:0.75648	validation_1-auc:0.72535
[276]	validation_0-auc:0.75648	validation_1-auc:0.72535
[277]	validation_0-auc:0.75648	validation_1-auc:0.72535
[278]	validation_0-auc:0.75648	validation_1-auc:0.72535
[279]	validation_0-auc:0.75659	validation_1-auc:0.72556
[280]	validation_0-auc:0.75659	validation_1-auc:0.72556
[281]	validation_0-auc:0.75662	validation_1-auc:0.72536
[282]	validation_0-auc:0.75663	validation_1-auc:0.72537
[283]	validation_0-auc:0.75679	validation_1-auc:0.72545
[284]	validation_0-auc:0.75678	validation_1-auc:0.72555
[285]	validation_0-auc:0.75678	validation_1-auc:0.72554
[286]	validation_0-auc:0.75677	validation_1-auc:0.72551
[287]	validation_0-auc:0.75677	validation_1-auc:0.72551
[288]	validation_0-auc:0.75695	validation_1-auc:0.72551
[289]	validation_0-auc:0.75695	validation_1-auc:0.72551
[290]	validation_0-auc:0.75695	validation_1-auc:0.72551
[291]	validation_0-auc:0.75699	validation_1-auc:

[421]	validation_0-auc:0.76085	validation_1-auc:0.72702
[422]	validation_0-auc:0.76085	validation_1-auc:0.72702
[423]	validation_0-auc:0.76085	validation_1-auc:0.72702
[424]	validation_0-auc:0.76083	validation_1-auc:0.72697
[425]	validation_0-auc:0.76083	validation_1-auc:0.72697
[426]	validation_0-auc:0.76083	validation_1-auc:0.72697
[427]	validation_0-auc:0.76083	validation_1-auc:0.72697
[428]	validation_0-auc:0.76086	validation_1-auc:0.72692
[429]	validation_0-auc:0.76086	validation_1-auc:0.72692
[430]	validation_0-auc:0.76086	validation_1-auc:0.72692
[431]	validation_0-auc:0.76090	validation_1-auc:0.72698
[432]	validation_0-auc:0.76090	validation_1-auc:0.72698
[433]	validation_0-auc:0.76108	validation_1-auc:0.72715
[434]	validation_0-auc:0.76128	validation_1-auc:0.72708
[435]	validation_0-auc:0.76128	validation_1-auc:0.72708
[436]	validation_0-auc:0.76128	validation_1-auc:0.72708
[437]	validation_0-auc:0.76128	validation_1-auc:0.72708
[438]	validation_0-auc:0.76128	validation_1-auc:

[568]	validation_0-auc:0.76367	validation_1-auc:0.72772
[569]	validation_0-auc:0.76367	validation_1-auc:0.72772
[570]	validation_0-auc:0.76367	validation_1-auc:0.72772
[571]	validation_0-auc:0.76367	validation_1-auc:0.72772
[572]	validation_0-auc:0.76374	validation_1-auc:0.72782
[573]	validation_0-auc:0.76374	validation_1-auc:0.72782
[574]	validation_0-auc:0.76375	validation_1-auc:0.72781
[575]	validation_0-auc:0.76375	validation_1-auc:0.72781
[576]	validation_0-auc:0.76375	validation_1-auc:0.72781
[577]	validation_0-auc:0.76375	validation_1-auc:0.72781
[578]	validation_0-auc:0.76375	validation_1-auc:0.72781
[579]	validation_0-auc:0.76375	validation_1-auc:0.72781
[580]	validation_0-auc:0.76375	validation_1-auc:0.72781
[581]	validation_0-auc:0.76375	validation_1-auc:0.72781
[582]	validation_0-auc:0.76375	validation_1-auc:0.72781
[583]	validation_0-auc:0.76387	validation_1-auc:0.72785
[584]	validation_0-auc:0.76387	validation_1-auc:0.72785
[585]	validation_0-auc:0.76387	validation_1-auc:

[715]	validation_0-auc:0.76597	validation_1-auc:0.72825
[716]	validation_0-auc:0.76597	validation_1-auc:0.72825
[717]	validation_0-auc:0.76605	validation_1-auc:0.72837
[718]	validation_0-auc:0.76605	validation_1-auc:0.72837
[719]	validation_0-auc:0.76610	validation_1-auc:0.72833
[720]	validation_0-auc:0.76610	validation_1-auc:0.72833
[721]	validation_0-auc:0.76610	validation_1-auc:0.72833
[722]	validation_0-auc:0.76611	validation_1-auc:0.72832
[723]	validation_0-auc:0.76614	validation_1-auc:0.72833
[724]	validation_0-auc:0.76615	validation_1-auc:0.72845
[725]	validation_0-auc:0.76620	validation_1-auc:0.72843
[726]	validation_0-auc:0.76620	validation_1-auc:0.72843
[727]	validation_0-auc:0.76626	validation_1-auc:0.72841
[728]	validation_0-auc:0.76626	validation_1-auc:0.72841
[729]	validation_0-auc:0.76636	validation_1-auc:0.72822
[730]	validation_0-auc:0.76636	validation_1-auc:0.72822
[731]	validation_0-auc:0.76639	validation_1-auc:0.72826
[732]	validation_0-auc:0.76639	validation_1-auc:

[862]	validation_0-auc:0.76761	validation_1-auc:0.72832
[863]	validation_0-auc:0.76761	validation_1-auc:0.72832
[864]	validation_0-auc:0.76761	validation_1-auc:0.72832
[865]	validation_0-auc:0.76760	validation_1-auc:0.72829
[866]	validation_0-auc:0.76759	validation_1-auc:0.72831
[867]	validation_0-auc:0.76759	validation_1-auc:0.72831
[868]	validation_0-auc:0.76768	validation_1-auc:0.72828
[869]	validation_0-auc:0.76768	validation_1-auc:0.72828
[870]	validation_0-auc:0.76768	validation_1-auc:0.72828
[871]	validation_0-auc:0.76773	validation_1-auc:0.72811
[872]	validation_0-auc:0.76772	validation_1-auc:0.72809
[873]	validation_0-auc:0.76772	validation_1-auc:0.72809
[874]	validation_0-auc:0.76772	validation_1-auc:0.72809
[875]	validation_0-auc:0.76772	validation_1-auc:0.72809
[876]	validation_0-auc:0.76772	validation_1-auc:0.72809
[877]	validation_0-auc:0.76772	validation_1-auc:0.72809
[878]	validation_0-auc:0.76772	validation_1-auc:0.72809
[879]	validation_0-auc:0.76772	validation_1-auc:

[1009]	validation_0-auc:0.76921	validation_1-auc:0.72801
[1010]	validation_0-auc:0.76921	validation_1-auc:0.72801
[1011]	validation_0-auc:0.76921	validation_1-auc:0.72801
[1012]	validation_0-auc:0.76921	validation_1-auc:0.72801
[1013]	validation_0-auc:0.76921	validation_1-auc:0.72801
[1014]	validation_0-auc:0.76921	validation_1-auc:0.72801
[1015]	validation_0-auc:0.76921	validation_1-auc:0.72801
[1016]	validation_0-auc:0.76922	validation_1-auc:0.72800
[1017]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1018]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1019]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1020]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1021]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1022]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1023]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1024]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1025]	validation_0-auc:0.76922	validation_1-auc:0.72792
[1026]	validation_0-auc:0.76927

[1153]	validation_0-auc:0.77032	validation_1-auc:0.72811
[1154]	validation_0-auc:0.77032	validation_1-auc:0.72811
[1155]	validation_0-auc:0.77032	validation_1-auc:0.72811
[1156]	validation_0-auc:0.77032	validation_1-auc:0.72811
[1157]	validation_0-auc:0.77032	validation_1-auc:0.72811
[1158]	validation_0-auc:0.77032	validation_1-auc:0.72811
[1159]	validation_0-auc:0.77049	validation_1-auc:0.72814
[1160]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1161]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1162]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1163]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1164]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1165]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1166]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1167]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1168]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1169]	validation_0-auc:0.77057	validation_1-auc:0.72817
[1170]	validation_0-auc:0.77057

[39]	validation_0-auc:0.72067	validation_1-auc:0.71937
[40]	validation_0-auc:0.72091	validation_1-auc:0.71949
[41]	validation_0-auc:0.72113	validation_1-auc:0.71955
[42]	validation_0-auc:0.72167	validation_1-auc:0.71984
[43]	validation_0-auc:0.72199	validation_1-auc:0.71978
[44]	validation_0-auc:0.72248	validation_1-auc:0.72003
[45]	validation_0-auc:0.72260	validation_1-auc:0.72017
[46]	validation_0-auc:0.72244	validation_1-auc:0.71973
[47]	validation_0-auc:0.72253	validation_1-auc:0.71993
[48]	validation_0-auc:0.72230	validation_1-auc:0.71961
[49]	validation_0-auc:0.72298	validation_1-auc:0.71963
[50]	validation_0-auc:0.72329	validation_1-auc:0.72043
[51]	validation_0-auc:0.72392	validation_1-auc:0.72093
[52]	validation_0-auc:0.72419	validation_1-auc:0.72109
[53]	validation_0-auc:0.72458	validation_1-auc:0.72140
[54]	validation_0-auc:0.72491	validation_1-auc:0.72131
[55]	validation_0-auc:0.72548	validation_1-auc:0.72128
[56]	validation_0-auc:0.72619	validation_1-auc:0.72211
[57]	valid

[187]	validation_0-auc:0.75039	validation_1-auc:0.73832
[188]	validation_0-auc:0.75044	validation_1-auc:0.73839
[189]	validation_0-auc:0.75052	validation_1-auc:0.73860
[190]	validation_0-auc:0.75052	validation_1-auc:0.73860
[191]	validation_0-auc:0.75060	validation_1-auc:0.73850
[192]	validation_0-auc:0.75075	validation_1-auc:0.73837
[193]	validation_0-auc:0.75082	validation_1-auc:0.73857
[194]	validation_0-auc:0.75094	validation_1-auc:0.73858
[195]	validation_0-auc:0.75098	validation_1-auc:0.73861
[196]	validation_0-auc:0.75098	validation_1-auc:0.73861
[197]	validation_0-auc:0.75126	validation_1-auc:0.73863
[198]	validation_0-auc:0.75135	validation_1-auc:0.73854
[199]	validation_0-auc:0.75154	validation_1-auc:0.73875
[200]	validation_0-auc:0.75149	validation_1-auc:0.73875
[201]	validation_0-auc:0.75161	validation_1-auc:0.73850
[202]	validation_0-auc:0.75161	validation_1-auc:0.73850
[203]	validation_0-auc:0.75168	validation_1-auc:0.73866
[204]	validation_0-auc:0.75168	validation_1-auc:

[334]	validation_0-auc:0.75739	validation_1-auc:0.74095
[335]	validation_0-auc:0.75739	validation_1-auc:0.74095
[336]	validation_0-auc:0.75744	validation_1-auc:0.74092
[337]	validation_0-auc:0.75744	validation_1-auc:0.74092
[338]	validation_0-auc:0.75744	validation_1-auc:0.74092
[339]	validation_0-auc:0.75744	validation_1-auc:0.74092
[340]	validation_0-auc:0.75750	validation_1-auc:0.74095
[341]	validation_0-auc:0.75750	validation_1-auc:0.74095
[342]	validation_0-auc:0.75750	validation_1-auc:0.74095
[343]	validation_0-auc:0.75762	validation_1-auc:0.74108
[344]	validation_0-auc:0.75767	validation_1-auc:0.74116
[345]	validation_0-auc:0.75767	validation_1-auc:0.74116
[346]	validation_0-auc:0.75767	validation_1-auc:0.74116
[347]	validation_0-auc:0.75773	validation_1-auc:0.74111
[348]	validation_0-auc:0.75781	validation_1-auc:0.74067
[349]	validation_0-auc:0.75781	validation_1-auc:0.74067
[350]	validation_0-auc:0.75781	validation_1-auc:0.74067
[351]	validation_0-auc:0.75781	validation_1-auc:

[481]	validation_0-auc:0.76178	validation_1-auc:0.74173
[482]	validation_0-auc:0.76178	validation_1-auc:0.74173
[483]	validation_0-auc:0.76178	validation_1-auc:0.74173
[484]	validation_0-auc:0.76178	validation_1-auc:0.74173
[485]	validation_0-auc:0.76178	validation_1-auc:0.74173
[486]	validation_0-auc:0.76178	validation_1-auc:0.74173
[487]	validation_0-auc:0.76178	validation_1-auc:0.74173
[488]	validation_0-auc:0.76185	validation_1-auc:0.74169
[489]	validation_0-auc:0.76185	validation_1-auc:0.74169
[490]	validation_0-auc:0.76185	validation_1-auc:0.74169
[491]	validation_0-auc:0.76185	validation_1-auc:0.74169
[492]	validation_0-auc:0.76192	validation_1-auc:0.74189
[493]	validation_0-auc:0.76198	validation_1-auc:0.74192
[494]	validation_0-auc:0.76198	validation_1-auc:0.74192
[495]	validation_0-auc:0.76198	validation_1-auc:0.74192
[496]	validation_0-auc:0.76198	validation_1-auc:0.74192
[497]	validation_0-auc:0.76200	validation_1-auc:0.74189
[498]	validation_0-auc:0.76203	validation_1-auc:

[628]	validation_0-auc:0.76419	validation_1-auc:0.74170
[629]	validation_0-auc:0.76419	validation_1-auc:0.74170
[630]	validation_0-auc:0.76419	validation_1-auc:0.74170
[631]	validation_0-auc:0.76424	validation_1-auc:0.74169
[632]	validation_0-auc:0.76424	validation_1-auc:0.74169
[633]	validation_0-auc:0.76424	validation_1-auc:0.74169
[634]	validation_0-auc:0.76424	validation_1-auc:0.74169
[635]	validation_0-auc:0.76424	validation_1-auc:0.74169
[636]	validation_0-auc:0.76424	validation_1-auc:0.74169
[637]	validation_0-auc:0.76424	validation_1-auc:0.74169
[638]	validation_0-auc:0.76432	validation_1-auc:0.74172
[639]	validation_0-auc:0.76438	validation_1-auc:0.74179
[640]	validation_0-auc:0.76447	validation_1-auc:0.74164
[641]	validation_0-auc:0.76446	validation_1-auc:0.74167
[642]	validation_0-auc:0.76457	validation_1-auc:0.74170
[643]	validation_0-auc:0.76457	validation_1-auc:0.74170
[644]	validation_0-auc:0.76457	validation_1-auc:0.74170
[645]	validation_0-auc:0.76457	validation_1-auc:

[775]	validation_0-auc:0.76656	validation_1-auc:0.74254
[776]	validation_0-auc:0.76656	validation_1-auc:0.74254
[777]	validation_0-auc:0.76680	validation_1-auc:0.74256
[778]	validation_0-auc:0.76680	validation_1-auc:0.74256
[779]	validation_0-auc:0.76680	validation_1-auc:0.74256
[780]	validation_0-auc:0.76680	validation_1-auc:0.74256
[781]	validation_0-auc:0.76680	validation_1-auc:0.74256
[782]	validation_0-auc:0.76680	validation_1-auc:0.74256
[783]	validation_0-auc:0.76680	validation_1-auc:0.74243
[784]	validation_0-auc:0.76685	validation_1-auc:0.74239
[785]	validation_0-auc:0.76685	validation_1-auc:0.74239
[786]	validation_0-auc:0.76697	validation_1-auc:0.74249
[787]	validation_0-auc:0.76697	validation_1-auc:0.74249
[788]	validation_0-auc:0.76699	validation_1-auc:0.74252
[789]	validation_0-auc:0.76704	validation_1-auc:0.74229
[790]	validation_0-auc:0.76704	validation_1-auc:0.74229
[791]	validation_0-auc:0.76704	validation_1-auc:0.74229
[792]	validation_0-auc:0.76704	validation_1-auc:

[922]	validation_0-auc:0.76813	validation_1-auc:0.74209
[923]	validation_0-auc:0.76813	validation_1-auc:0.74209
[924]	validation_0-auc:0.76821	validation_1-auc:0.74218
[925]	validation_0-auc:0.76821	validation_1-auc:0.74218
[926]	validation_0-auc:0.76819	validation_1-auc:0.74207
[927]	validation_0-auc:0.76819	validation_1-auc:0.74207
[928]	validation_0-auc:0.76819	validation_1-auc:0.74207
[929]	validation_0-auc:0.76819	validation_1-auc:0.74207
[930]	validation_0-auc:0.76819	validation_1-auc:0.74207
[931]	validation_0-auc:0.76819	validation_1-auc:0.74207
[932]	validation_0-auc:0.76819	validation_1-auc:0.74207
[933]	validation_0-auc:0.76818	validation_1-auc:0.74205
[934]	validation_0-auc:0.76818	validation_1-auc:0.74205
[935]	validation_0-auc:0.76818	validation_1-auc:0.74205
[936]	validation_0-auc:0.76818	validation_1-auc:0.74205
[937]	validation_0-auc:0.76818	validation_1-auc:0.74205
[938]	validation_0-auc:0.76818	validation_1-auc:0.74205
[939]	validation_0-auc:0.76818	validation_1-auc:

[1068]	validation_0-auc:0.76983	validation_1-auc:0.74198
[1069]	validation_0-auc:0.76983	validation_1-auc:0.74198
[1070]	validation_0-auc:0.76983	validation_1-auc:0.74198
[1071]	validation_0-auc:0.76989	validation_1-auc:0.74198
[1072]	validation_0-auc:0.76989	validation_1-auc:0.74198
[1073]	validation_0-auc:0.76989	validation_1-auc:0.74198
[1074]	validation_0-auc:0.76988	validation_1-auc:0.74198
[1075]	validation_0-auc:0.76988	validation_1-auc:0.74198
[1076]	validation_0-auc:0.76988	validation_1-auc:0.74198
[1077]	validation_0-auc:0.76988	validation_1-auc:0.74198
[1078]	validation_0-auc:0.76992	validation_1-auc:0.74189
[1079]	validation_0-auc:0.76992	validation_1-auc:0.74189
[1080]	validation_0-auc:0.76993	validation_1-auc:0.74194
[1081]	validation_0-auc:0.76993	validation_1-auc:0.74194
[1082]	validation_0-auc:0.76993	validation_1-auc:0.74194
[1083]	validation_0-auc:0.76993	validation_1-auc:0.74194
[1084]	validation_0-auc:0.76993	validation_1-auc:0.74194
[1085]	validation_0-auc:0.76993

[1212]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1213]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1214]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1215]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1216]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1217]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1218]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1219]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1220]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1221]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1222]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1223]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1224]	validation_0-auc:0.77154	validation_1-auc:0.74213
[1225]	validation_0-auc:0.77156	validation_1-auc:0.74212
[1226]	validation_0-auc:0.77156	validation_1-auc:0.74212
[1227]	validation_0-auc:0.77156	validation_1-auc:0.74212
[1228]	validation_0-auc:0.77156	validation_1-auc:0.74212
[1229]	validation_0-auc:0.77156

In [153]:
np.mean([1075, 884, 895, 745, 767])

873.2
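
The list averaged above appears to hold the best iteration found by early stopping in each cross-validation fold (895 matches the best iteration reported for fold 3), and the mean is then reused as `n_estimators` for the final fit. A minimal sketch of that step, assuming the per-fold values were collected by hand; the helper name `mean_best_iteration` is hypothetical and not part of the notebook:

```python
from statistics import mean
from typing import Sequence


def mean_best_iteration(best_iterations: Sequence[int]) -> int:
    """Average the per-fold best iterations and round to a whole number of trees."""
    if not best_iterations:
        raise ValueError("need at least one fold")
    return round(mean(best_iterations))


# Values copied from the cell above (assumed to be one best iteration per fold).
fold_best_iterations = [1075, 884, 895, 745, 767]
n_estimators = mean_best_iteration(fold_best_iterations)
print(n_estimators)
```

Averaging over folds is only a heuristic: the final model is trained on all rows, so the best number of trees may differ somewhat from the per-fold values.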

In [154]:
x_train = train.drop(["TARGET"], axis=1)
y_train = train["TARGET"]

Unfortunately, no exploratory data analysis was done, so the next cells are needed to equalize the set of features in train and test.

In [155]:
tt_features = list(set(test.columns) - set(train.columns))  # columns present only in test

In [156]:
tr_features = list(set(train.columns) - set(test.columns))  # columns present only in train

In [157]:
train = train.drop(tr_features, axis=1)

In [158]:
test = test.drop(tt_features, axis=1)
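
The cells above drop the columns that appear in only one of the two frames. The same idea can be expressed by keeping just the shared columns, in the training order. The sketch below works on plain column-name sequences; the helper name `shared_columns` and the toy column names are hypothetical and not taken from the dataset:

```python
from typing import Iterable, List


def shared_columns(train_cols: Iterable[str], test_cols: Iterable[str]) -> List[str]:
    """Return the columns present in both inputs, preserving the order of train_cols."""
    test_set = set(test_cols)
    return [col for col in train_cols if col in test_set]


# Toy column names for illustration only.
train_cols = ["AGE", "INCOME", "TARGET", "TRAIN_ONLY"]
test_cols = ["INCOME", "AGE", "TEST_ONLY"]

common = shared_columns(train_cols, test_cols)
print(common)
```

With pandas frames this would be applied as `train[common]` and `test[common]`, which also gives both frames the same column order. Because the target column exists only in train, it has to be saved before the selection.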

In [159]:
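params = {
    "objective": "binary:logistic",
    "booster": "gbtree",
    "eval_metric": "auc",
    "eta": "0.05",  # passed as a string; XGBoost parses it (learning_rate=0.05 in the output below)
    "max_depth": 6,
    "gamma": 10,
    "subsample": 0.85,
    "colsample_bytree": 0.7,
    "colsample_bylevel": 0.632,
    "min_child_weight": 30,
    "alpha": 0,
    "lambda": 0,
    "nthread": 6,
    "random_seed": 27,  # not recognized by XGBoost (see the warning below); random_state stays at 0
    "n_estimators": 873,  # mean of the values averaged in cell In [153] (873.2)
}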

In [160]:
model = xgb.XGBClassifier(**params)
model.fit(
    X=x_train,
    y=y_train,
    eval_set=[(x_train, y_train)],    
    eval_metric="auc",
    verbose=200
)

Parameters: { random_seed } might not be used.

  This may not be accurate due to some parameters are only used in language bindings but
  passed down to XGBoost core.  Or some parameters are not used but slip through this
  verification. Please open an issue if you find above cases.


[0]	validation_0-auc:0.65332
[200]	validation_0-auc:0.75256
[400]	validation_0-auc:0.75880
[600]	validation_0-auc:0.76335
[800]	validation_0-auc:0.76624
[872]	validation_0-auc:0.76689


XGBClassifier(alpha=0, base_score=0.5, booster='gbtree',
              colsample_bylevel=0.632, colsample_bynode=1, colsample_bytree=0.7,
              eta='0.05', eval_metric='auc', gamma=10, gpu_id=-1,
              importance_type='gain', interaction_constraints='', lambda=0,
              learning_rate=0.0500000007, max_delta_step=0, max_depth=6,
              min_child_weight=30, missing=nan, monotone_constraints='()',
              n_estimators=873, n_jobs=6, nthread=6, num_parallel_tree=1,
              random_seed=27, random_state=0, reg_alpha=0, reg_lambda=0,
              scale_pos_weight=1, subsample=0.85, tree_method='exact', ...)

In [161]:
y_pred_test_xgb = model.predict_proba(test)

In [162]:
y_pred_test_xgb

array([[0.92899084, 0.07100917],
       [0.71526265, 0.28473738],
       [0.87644124, 0.12355877],
       ...,
       [0.9152941 , 0.08470587],
       [0.9814359 , 0.0185641 ],
       [0.9480052 , 0.05199481]], dtype=float32)

### 6.7 LightGBM + XGBoost

Average the LightGBM and XGBoost predictions.

In [163]:
scores = pd.DataFrame({
    "lgbm": y_pred_test_lgbm[:,1],
    "xgb": y_pred_test_xgb[:,1]
})

In [164]:
scores.head(10)

| | lgbm | xgb |
|---|---|---|
| 0 | 0.03463 | 0.071009 |
| 1 | 0.260986 | 0.284737 |
| 2 | 0.139677 | 0.123559 |
| 3 | 0.085198 | 0.084706 |
| 4 | 0.014274 | 0.013599 |
| 5 | 0.031712 | 0.031053 |
| 6 | 0.057907 | 0.056952 |
| 7 | 0.100388 | 0.103192 |
| 8 | 0.008492 | 0.013319 |
| 9 | 0.021653 | 0.024609 |


In [165]:
scores_mean = scores.mean(axis=1)

In [166]:
result_test_x = pd.DataFrame({
    "APPLICATION_NUMBER": test_AN,
    "TARGET": scores_mean})
result_test_x.head()

| | APPLICATION_NUMBER | TARGET |
|---|---|---|
| 0 | 123724268 | 0.052819 |
| 1 | 123456549 | 0.272862 |
| 2 | 123428178 | 0.131618 |
| 3 | 123619984 | 0.084952 |
| 4 | 123671104 | 0.013937 |


In [167]:
filename = 'xgb+lgbm.csv'
result_test_x.to_csv(filename, index=False)
# the best Kaggle submission (0.735448)
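
The simple mean used in this section is a special case of a weighted blend with equal weights. A minimal sketch, assuming each model's positive-class probabilities are available as a 1-D array; the function name `blend` and the optional `weights` argument are illustrative and not part of the notebook:

```python
from typing import Optional, Sequence

import numpy as np


def blend(predictions: Sequence[np.ndarray],
          weights: Optional[Sequence[float]] = None) -> np.ndarray:
    """Weighted average of per-model probability vectors (equal weights by default)."""
    stacked = np.vstack(predictions)  # shape: (n_models, n_samples)
    return np.average(stacked, axis=0, weights=weights)


# First three rows of the `scores` table shown earlier in this section.
lgbm_scores = np.array([0.034630, 0.260986, 0.139677])
xgb_scores = np.array([0.071009, 0.284737, 0.123559])

blended = blend([lgbm_scores, xgb_scores])
print(blended)
```

Unequal weights, for example `blend([lgbm_scores, xgb_scores], weights=[0.6, 0.4])`, would favour one model; whether that helps would have to be checked on validation data.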

## 7. Conclusions

This work showed that new features do not always help, and that dropping existing features can noticeably improve the score.
Selecting the models' hyperparameters and averaging the predictions of different models are also very important.