# Comments:
    
This notebook improves on my baseline, which you can find here: https://www.kaggle.com/code/ragnar123/amex-lgbm-dart-cv-0-7963

The main differences from the previous solution are new features and a seed blend to boost the LB score. A single 5-fold model using seed 42 achieves an out-of-fold CV of 0.7977 and a public leaderboard score of 0.799. With a seed blend (training three models with seeds 42, 52, and 62, then averaging their predictions), the LB score improves nicely.
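
As a rough illustration of the seed blend described above, here is a minimal sketch that averages per-seed prediction vectors element-wise. The helper name `blend_seeds` and the toy prediction values are made up for the example; in practice each vector would come from a full 5-fold run with that seed.

```python
import numpy as np

def blend_seeds(predictions_by_seed):
    """Average prediction vectors from several seeds element-wise."""
    stacked = np.vstack(
        [np.asarray(p, dtype=np.float64) for p in predictions_by_seed.values()]
    )
    return stacked.mean(axis=0)

# Toy stand-ins for the test predictions of models trained with seeds 42, 52, 62
preds = {
    42: [0.10, 0.80, 0.40],
    52: [0.20, 0.70, 0.50],
    62: [0.30, 0.90, 0.30],
}
blend = blend_seeds(preds)
print(blend)
```

A plain mean is the simplest choice here; averaging ranks instead of raw probabilities is a common alternative when the models are calibrated differently.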

The main features that boost CV are the following:

* The difference between the last value and the lag-1 value
* The difference between the last value and the average (this feature gives a nice boost)

This feature engineering is applied to all the last columns, so we actually add a lot of features; this model used 1368 features.
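
To make the two feature types concrete, here is a small sketch on toy data. The column `P_2` and its values are only illustrative, and "last" here is taken as the literal last statement per customer, which is an assumption of this sketch rather than a copy of the preprocessing code.

```python
import pandas as pd

# Toy statement history: two customers, one illustrative numeric column
df = pd.DataFrame({
    "customer_ID": ["a", "a", "a", "b", "b"],
    "P_2": [1.0, 2.0, 4.0, 3.0, 5.0],
})

grouped = df.groupby("customer_ID")["P_2"]
agg = grouped.agg(["mean", "last"])
agg.columns = [f"P_2_{c}" for c in agg.columns]

# Difference between the last value and the customer's average
agg["P_2_last_mean_diff"] = agg["P_2_last"] - agg["P_2_mean"]

# Difference between the last value and the previous one (lag 1)
agg["P_2_diff1"] = grouped.apply(lambda s: s.diff(1).iloc[-1])

print(agg)
```

Both features describe how the most recent statement deviates from the customer's own history, which plain `mean`/`min`/`max` aggregates do not capture.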

I uploaded the test predictions so you can avoid running training and inference.

Next Steps:

* Could try feature selection; maybe a lot of the features are just noise. I actually ran permutation importance and reduced the number of features to approximately 1000, and the CV was almost the same. Maybe there is a better feature selection technique that can boost performance.

* Could try different models; maybe a neural network with the same features, or a subset of them, blended with LGBM can work. In my experience, blending tree models and neural networks works great because they are very diverse, so the boost is nice.

* Could try more feature engineering; maybe we can create more features that extract the hidden signal of the dataset. I would actually work on this option first and really try to capture all the signal that the dataset has.
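
For the feature selection idea, here is a generic sketch of permutation importance in plain NumPy. It is not the exact procedure used for the ~1000-feature experiment; the helper name, the toy data, the stand-in `predict` function, and the "keep if importance > 0" rule are all assumptions for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Mean drop in `metric` when each column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its link with the target
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - metric(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances

# Toy data: only column 0 carries signal, column 1 is pure noise
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2))
y = (X[:, 0] > 0).astype(int)

def predict(X):
    # Stand-in for a fitted model that only looks at column 0
    return (X[:, 0] > 0).astype(int)

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

importances = permutation_importance(predict, X, y, accuracy)
keep = [j for j, imp in enumerate(importances) if imp > 0]
print(importances, keep)
```

With a real model, `predict` would wrap the trained booster and `metric` would be the competition metric evaluated on held-out data, so the importances reflect generalization rather than fit.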

# Preprocessing

In [2]:
import gc
import os
import joblib
import random
import warnings
import itertools
import scipy as sp
import numpy as np
import pandas as pd
from tqdm import tqdm
import lightgbm as lgb
from itertools import combinations
pd.set_option('display.width', 1000)
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
from sklearn.preprocessing import LabelEncoder
warnings.filterwarnings('ignore')
from sklearn.model_selection import StratifiedKFold, train_test_split

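def get_difference(data, num_features):
    # For each customer, keep the last row of the 1-, 3- and 6-step differences
    df1,df3,df6 = [],[],[]
    customer_ids = []
    # Group by a scalar key so that customer_id is a plain value, not a 1-tuple (pandas >= 2.0)
    for customer_id, df in tqdm(data.groupby('customer_ID')):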
        diff_df1 = df[num_features].diff(1).iloc[[-1]].values.astype(np.float32)
        diff_df3 = df[num_features].diff(3).iloc[[-1]].values.astype(np.float32)
        diff_df6 = df[num_features].diff(6).iloc[[-1]].values.astype(np.float32)

        df1.append(diff_df1)
        df3.append(diff_df3)
        df6.append(diff_df6)
        customer_ids.append(customer_id)
        
    df1 = np.concatenate(df1, axis = 0)
    df1 = pd.DataFrame(df1, columns = [col + '_diff1' for col in df[num_features].columns])
    df3 = np.concatenate(df3, axis = 0)
    df3 = pd.DataFrame(df3, columns = [col + '_diff3' for col in df[num_features].columns])
    df6 = np.concatenate(df6, axis = 0)
    df6 = pd.DataFrame(df6, columns = [col + '_diff6' for col in df[num_features].columns])
    df_all = pd.concat([df1,df3,df6],axis=1)
    df_all['customer_ID'] = customer_ids
    return df_all

def read_preprocess_data():
    train = pd.read_parquet('../input/amex-data-integer-dtypes-parquet-format/train.parquet')
    features = train.drop(['customer_ID', 'S_2'], axis = 1).columns.to_list()
    cat_features = [
        "B_30",
        "B_38",
        "D_114",
        "D_116",
        "D_117",
        "D_120",
        "D_126",
        "D_63",
        "D_64",
        "D_66",
        "D_68",
    ]
    num_features = [col for col in features if col not in cat_features]
    print('Starting training feature engineering...')
    train_num_agg = train.groupby("customer_ID")[num_features].agg(['first', 'mean', 'std', 'min', 'max'])
    train_num_agg.columns = ['_'.join(x) for x in train_num_agg.columns]
    train_num_agg.reset_index(inplace = True)
    
    # Mean of each customer's last two statements, stored under the '_last' suffix
    train_tail2 = train.groupby("customer_ID").tail(2)
    train_tail2_num_agg = train_tail2.groupby("customer_ID")[num_features].agg(['mean'])
    train_tail2_num_agg.columns = ['_'.join([xx.replace('mean','last') for xx in x]) for x in train_tail2_num_agg.columns]
    train_tail2_num_agg.reset_index(inplace = True)

    train_num_agg = train_num_agg.merge(train_tail2_num_agg, how = 'inner', on = 'customer_ID')
    # Lag Features
    for col in num_features:
        train_num_agg[f'{col}_last_mean_diff'] = train_num_agg[f'{col}_last'] - train_num_agg[f'{col}_mean']
        train_num_agg[f'{col}_last_first_diff'] = train_num_agg[f'{col}_last'] - train_num_agg[f'{col}_first']

    train_cat_agg = train.groupby("customer_ID")[cat_features].agg(['count', 'first', 'last', 'nunique'])
    train_cat_agg.columns = ['_'.join(x) for x in train_cat_agg.columns]
    train_cat_agg.reset_index(inplace = True)
    train_labels = pd.read_csv('../input/amex-default-prediction/train_labels.csv')
    # Transform float64 columns to float32
    cols = list(train_num_agg.dtypes[train_num_agg.dtypes == 'float64'].index)
    for col in tqdm(cols):
        train_num_agg[col] = train_num_agg[col].astype(np.float32)
    # Transform int64 columns to int32
    cols = list(train_cat_agg.dtypes[train_cat_agg.dtypes == 'int64'].index)
    for col in tqdm(cols):
        train_cat_agg[col] = train_cat_agg[col].astype(np.int32)
    # Get the difference
    train_diff = get_difference(train, num_features)
    train = train_num_agg.merge(train_cat_agg, how = 'inner', on = 'customer_ID').\
            merge(train_diff, how = 'inner', on = 'customer_ID').\
            merge(train_labels, how = 'inner', on = 'customer_ID')
    del train_num_agg, train_cat_agg, train_diff, train_tail2_num_agg
    gc.collect()
    
    

        
    test = pd.read_parquet('../input/amex-data-integer-dtypes-parquet-format/test.parquet')
    print('Starting test feature engineering...')
    test_num_agg = test.groupby("customer_ID")[num_features].agg(['first', 'mean', 'std', 'min', 'max'])
    test_num_agg.columns = ['_'.join(x) for x in test_num_agg.columns]
    test_num_agg.reset_index(inplace = True)

    
    # Mean of each customer's last two statements, stored under the '_last' suffix
    test_tail2 = test.groupby("customer_ID").tail(2)
    test_tail2_num_agg = test_tail2.groupby("customer_ID")[num_features].agg(['mean'])
    test_tail2_num_agg.columns = ['_'.join([xx.replace('mean','last') for xx in x]) for x in test_tail2_num_agg.columns]
    test_tail2_num_agg.reset_index(inplace = True)

    test_num_agg = test_num_agg.merge(test_tail2_num_agg, how = 'inner', on = 'customer_ID')

    # Lag Features
    for col in num_features:
        test_num_agg[f'{col}_last_mean_diff'] = test_num_agg[f'{col}_last'] - test_num_agg[f'{col}_mean']
        test_num_agg[f'{col}_last_first_diff'] = test_num_agg[f'{col}_last'] - test_num_agg[f'{col}_first']


    test_cat_agg = test.groupby("customer_ID")[cat_features].agg(['count', 'first', 'last', 'nunique'])
    test_cat_agg.columns = ['_'.join(x) for x in test_cat_agg.columns]
    test_cat_agg.reset_index(inplace = True)
    # Transform float64 columns to float32
    cols = list(test_num_agg.dtypes[test_num_agg.dtypes == 'float64'].index)
    for col in tqdm(cols):
        test_num_agg[col] = test_num_agg[col].astype(np.float32)
    # Transform int64 columns to int32
    cols = list(test_cat_agg.dtypes[test_cat_agg.dtypes == 'int64'].index)
    for col in tqdm(cols):
        test_cat_agg[col] = test_cat_agg[col].astype(np.int32)
    # Get the difference
    test_diff = get_difference(test, num_features)
    test = test_num_agg.merge(test_cat_agg, how = 'inner', on = 'customer_ID').\
            merge(test_diff, how = 'inner', on = 'customer_ID')
    del test_num_agg, test_cat_agg, test_diff
    gc.collect()

    features = train.drop(['customer_ID'], axis = 1).columns.to_list()
    num_features = [col for col in features if col not in cat_features]
    num_cols = [col for col in num_features if (('last' in col or 'mean' in col) and 'diff' not in col)]
    for col in num_cols:
        train[col + '_round2'] = train[col].round(2)
        test[col + '_round2'] = test[col].round(2)
    
    print('train.shape:',train.shape)
    print('test.shape:',test.shape)

    # Save files to disk
    train.to_parquet('train_fe.parquet')
    test.to_parquet('test_fe.parquet')
    
# Read & Preprocess Data
# read_preprocess_data()

# Training & Inference
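
The training cell in this section scores every fold with `amex_metric`, which averages a normalized weighted Gini coefficient with the share of positives captured in the top 4% of weighted predictions (negatives carry a weight of 20). Here is a quick sanity check on toy data, assuming the metric exactly as implemented in that cell; the toy labels and scores are made up.

```python
import numpy as np

def amex_metric(y_true, y_pred):
    # Same computation as the metric used in the training cell
    labels = np.transpose(np.array([y_true, y_pred]))
    labels = labels[labels[:, 1].argsort()[::-1]]
    weights = np.where(labels[:, 0] == 0, 20, 1)
    cut_vals = labels[np.cumsum(weights) <= int(0.04 * np.sum(weights))]
    top_four = np.sum(cut_vals[:, 0]) / np.sum(labels[:, 0])
    gini = [0, 0]
    for i in [1, 0]:
        labels = np.transpose(np.array([y_true, y_pred]))
        labels = labels[labels[:, i].argsort()[::-1]]
        weight = np.where(labels[:, 0] == 0, 20, 1)
        weight_random = np.cumsum(weight / np.sum(weight))
        total_pos = np.sum(labels[:, 0] * weight)
        cum_pos_found = np.cumsum(labels[:, 0] * weight)
        lorentz = cum_pos_found / total_pos
        gini[i] = np.sum((lorentz - weight_random) * weight)
    return 0.5 * (gini[1] / gini[0] + top_four)

# Toy data: one positive among 100 rows
y_true = np.array([1] + [0] * 99)
perfect = np.linspace(1.0, 0.0, 100)   # the positive gets the highest score
worst = perfect[::-1]                  # the positive gets the lowest score

perfect_score = amex_metric(y_true, perfect)  # a perfect ranking should score 1.0
worst_score = amex_metric(y_true, worst)
print(perfect_score, worst_score)
```

Only the ranking of the predictions matters for this metric, so it can be applied to raw probabilities, blended scores, or ranks alike.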

In [3]:
# ====================================================
# Library
# ====================================================
import os
import gc
import warnings
warnings.filterwarnings('ignore')
import random
import scipy as sp
import numpy as np
import pandas as pd
import joblib
import itertools
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
from tqdm.auto import tqdm
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.preprocessing import LabelEncoder
import lightgbm as lgb
from itertools import combinations
from catboost import CatBoostClassifier
# ====================================================
# Configurations
# ====================================================
class CFG:
    input_dir = './'
    seed = 42
    n_folds = 5
    target = 'target'
    boosting_type = 'dart'
    metric = 'binary_logloss'

# ====================================================
# Seed everything
# ====================================================
def seed_everything(seed):
    random.seed(seed)
    np.random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)

# ====================================================
# Read data
# ====================================================
def read_data():
    train = pd.read_parquet(CFG.input_dir + 'train_fe_v2.parquet')
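    # Load the test features (previously this re-read the train file);
    # the file name is assumed to mirror the train file's naming
    test = pd.read_parquet(CFG.input_dir + 'test_fe_v2.parquet')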
    return train, test

# ====================================================
# Amex metric
# ====================================================
def amex_metric(y_true, y_pred):
    labels = np.transpose(np.array([y_true, y_pred]))
    labels = labels[labels[:, 1].argsort()[::-1]]
    weights = np.where(labels[:,0]==0, 20, 1)
    cut_vals = labels[np.cumsum(weights) <= int(0.04 * np.sum(weights))]
    top_four = np.sum(cut_vals[:,0]) / np.sum(labels[:,0])
    gini = [0,0]
    for i in [1,0]:
        labels = np.transpose(np.array([y_true, y_pred]))
        labels = labels[labels[:, i].argsort()[::-1]]
        weight = np.where(labels[:,0]==0, 20, 1)
        weight_random = np.cumsum(weight / np.sum(weight))
        total_pos = np.sum(labels[:, 0] *  weight)
        cum_pos_found = np.cumsum(labels[:, 0] * weight)
        lorentz = cum_pos_found / total_pos
        gini[i] = np.sum((lorentz - weight_random) * weight)
    return 0.5 * (gini[1]/gini[0] + top_four)

# ====================================================
# LGBM amex metric
# ====================================================
def lgb_amex_metric(y_pred, y_true):
    y_true = y_true.get_label()
    return 'amex_metric', amex_metric(y_true, y_pred), True

# ====================================================
# Train & Evaluate
# ====================================================
def train_and_evaluate(train, test):
    # Label encode categorical features
    cat_features = [
        "B_30",
        "B_38",
        "D_114",
        "D_116",
        "D_117",
        "D_120",
        "D_126",
        "D_63",
        "D_64",
        "D_66",
        "D_68"
    ]
    cat_features = [f"{cf}_last" for cf in cat_features] +  [f"{cf}_first" for cf in cat_features]
    for cat_col in cat_features:
        encoder = LabelEncoder()
        train[cat_col] = encoder.fit_transform(train[cat_col])
        test[cat_col] = encoder.transform(test[cat_col])
#     # Round last float features to 2 decimal place
#     num_cols = list(train.dtypes[(train.dtypes == 'float32') | (train.dtypes == 'float64')].index)
#     num_cols = [col for col in num_cols if 'last' in col]
#     for col in num_cols:
#         train[col + '_round2'] = train[col].round(2)
#         test[col + '_round2'] = test[col].round(2)
#     # Get the difference between last and mean
#     num_cols = [col for col in train.columns if 'last' in col]
#     num_cols = [col[:-5] for col in num_cols if 'round' not in col]
#     for col in num_cols:
#         try:
#             train[f'{col}_last_mean_diff'] = train[f'{col}_last'] - train[f'{col}_mean']
#             test[f'{col}_last_mean_diff'] = test[f'{col}_last'] - test[f'{col}_mean']
#         except:
#             pass
    # Transform float64 and float32 to float16
    num_cols = list(train.dtypes[(train.dtypes == 'float32') | (train.dtypes == 'float64')].index)
#     for col in tqdm(num_cols):
#         train[col] = train[col].astype(np.float16)
#         test[col] = test[col].astype(np.float16)
    # Get feature list
    features = [col for col in train.columns if col not in ['customer_ID', CFG.target]]
    features = [col for col in features if 'B_29' not in col]

    params = {
        'objective': 'binary',
        'metric': CFG.metric,
        'boosting': CFG.boosting_type,
        'seed': CFG.seed,
        'num_leaves': 100,
        'learning_rate': 0.01,
        'feature_fraction': 0.20,
        'bagging_freq': 10,
        'bagging_fraction': 0.50,
        'n_jobs': -1,
        'lambda_l2': 2,
        'min_data_in_leaf': 40,
        }
    # Create a numpy array to store test predictions
    test_predictions = np.zeros(len(test))
    # Create a numpy array to store out of folds predictions
    oof_predictions = np.zeros(len(train))
    kfold = StratifiedKFold(n_splits = CFG.n_folds, shuffle = True, random_state = CFG.seed)
    for fold, (trn_ind, val_ind) in enumerate(kfold.split(train, train[CFG.target])):
        print(' ')
        print('-'*50)
        print(f'Training fold {fold} with {len(features)} features...')
        x_train, x_val = train[features].iloc[trn_ind], train[features].iloc[val_ind]
        y_train, y_val = train[CFG.target].iloc[trn_ind], train[CFG.target].iloc[val_ind]
        model = CatBoostClassifier(iterations=10000,learning_rate=0.03, random_state=22,task_type='GPU')
        model.fit(x_train, y_train, eval_set=[(x_val, y_val)], cat_features=cat_features,  verbose=100, use_best_model=True)
        joblib.dump(model, f'cat_fold{fold}_seed{CFG.seed}.pkl')
        val_pred = model.predict_proba(x_val)[:, 1]

#         lgb_train = lgb.Dataset(x_train, y_train, categorical_feature = cat_features)
#         lgb_valid = lgb.Dataset(x_val, y_val, categorical_feature = cat_features)
#         model = lgb.train(
#             params = params,
#             train_set = lgb_train,
#             num_boost_round = 10500,
#             valid_sets = [lgb_train, lgb_valid],
#             early_stopping_rounds = 1500,
#             verbose_eval = 500,
#             feval = lgb_amex_metric
#             )
#         # Save best model
#         joblib.dump(model, f'/content/drive/MyDrive/Amex/Models/lgbm_{CFG.boosting_type}_fold{fold}_seed{CFG.seed}.pkl')
        # Predict validation
#         val_pred = model.predict(x_val)
        # Add to out of folds array
        oof_predictions[val_ind] = val_pred
        # Predict the test set
        # Predict in chunks of 10,000 rows; stepping by chunk start avoids an empty final chunk
        test_pred_list = []
        for start in tqdm(range(0, len(test), 10000)):
            test_pred_tmp = model.predict_proba(test.iloc[start:start + 10000][features])[:, 1]
            test_pred_list.append(test_pred_tmp)
        test_pred = np.concatenate(test_pred_list)
#         test_pred = model.predict(test[features])
        test_predictions += test_pred / CFG.n_folds
        # Compute fold metric
        score = amex_metric(y_val, val_pred)
        print(f'Our fold {fold} CV score is {score}')
        del x_train, x_val, y_train, y_val
        gc.collect()
    # Compute out of folds metric
    score = amex_metric(train[CFG.target], oof_predictions)
    print(f'Our out of folds CV score is {score}')
    # Create a dataframe to store out of folds predictions
    oof_df = pd.DataFrame({'customer_ID': train['customer_ID'], 'target': train[CFG.target], 'prediction': oof_predictions})
    oof_df.to_csv(f'oof_cat_baseline_{CFG.n_folds}fold_seed{CFG.seed}.csv', index = False)
    # Create a dataframe to store test prediction
    test_df = pd.DataFrame({'customer_ID': test['customer_ID'], 'prediction': test_predictions})
    test_df.to_csv('../sub/test_lgbm_5fold_seed42_v2.csv', index = False)
    


In [4]:
seed_everything(CFG.seed)
train, test = read_data()
print(train.shape)

(458913, 2358)


In [5]:
train_and_evaluate(train, test)

 
--------------------------------------------------
Training fold 0 with 2356 features...
0:	learn: 0.6563804	test: 0.6562318	best: 0.6562318 (0)	total: 63.3ms	remaining: 10m 32s
100:	learn: 0.2402303	test: 0.2393476	best: 0.2393476 (100)	total: 5.72s	remaining: 9m 21s
200:	learn: 0.2288250	test: 0.2285330	best: 0.2285330 (200)	total: 11.3s	remaining: 9m 8s
300:	learn: 0.2238839	test: 0.2243531	best: 0.2243531 (300)	total: 16.8s	remaining: 9m
400:	learn: 0.2207308	test: 0.2220360	best: 0.2220360 (400)	total: 22.3s	remaining: 8m 53s
500:	learn: 0.2182962	test: 0.2204984	best: 0.2204984 (500)	total: 27.7s	remaining: 8m 46s
600:	learn: 0.2163246	test: 0.2194388	best: 0.2194388 (599)	total: 33.1s	remaining: 8m 38s
700:	learn: 0.2145994	test: 0.2187049	best: 0.2187049 (700)	total: 38.5s	remaining: 8m 31s
800:	learn: 0.2130946	test: 0.2181350	best: 0.2181350 (800)	total: 43.9s	remaining: 8m 23s
900:	learn: 0.2116648	test: 0.2177056	best: 0.2177056 (900)	total: 49.3s	remaining: 8m 17s
1000:	

8700:	learn: 0.1520688	test: 0.2138012	best: 0.2137257 (7782)	total: 7m 48s	remaining: 1m 9s
8800:	learn: 0.1514877	test: 0.2138200	best: 0.2137257 (7782)	total: 7m 53s	remaining: 1m 4s
8900:	learn: 0.1509393	test: 0.2138227	best: 0.2137257 (7782)	total: 7m 58s	remaining: 59.1s
9000:	learn: 0.1503637	test: 0.2138345	best: 0.2137257 (7782)	total: 8m 4s	remaining: 53.8s
9100:	learn: 0.1497938	test: 0.2138349	best: 0.2137257 (7782)	total: 8m 9s	remaining: 48.4s
9200:	learn: 0.1492514	test: 0.2138466	best: 0.2137257 (7782)	total: 8m 15s	remaining: 43s
9300:	learn: 0.1486969	test: 0.2138436	best: 0.2137257 (7782)	total: 8m 20s	remaining: 37.6s
9400:	learn: 0.1481560	test: 0.2138434	best: 0.2137257 (7782)	total: 8m 26s	remaining: 32.2s
9500:	learn: 0.1476203	test: 0.2138775	best: 0.2137257 (7782)	total: 8m 31s	remaining: 26.9s
9600:	learn: 0.1470711	test: 0.2138733	best: 0.2137257 (7782)	total: 8m 36s	remaining: 21.5s
9700:	learn: 0.1464986	test: 0.2138585	best: 0.2137257 (7782)	total: 8m 42

Our fold 0 CV score is 0.8012420303906174
 
--------------------------------------------------
Training fold 1 with 2356 features...
0:	learn: 0.6561923	test: 0.6561567	best: 0.6561567 (0)	total: 250ms	remaining: 41m 44s
100:	learn: 0.2396201	test: 0.2410770	best: 0.2410770 (100)	total: 6.23s	remaining: 10m 10s
200:	learn: 0.2280419	test: 0.2308644	best: 0.2308644 (200)	total: 12.1s	remaining: 9m 48s
300:	learn: 0.2230345	test: 0.2269370	best: 0.2269370 (300)	total: 18s	remaining: 9m 39s
400:	learn: 0.2197767	test: 0.2246839	best: 0.2246839 (400)	total: 23.7s	remaining: 9m 28s
500:	learn: 0.2173276	test: 0.2232986	best: 0.2232986 (500)	total: 29.6s	remaining: 9m 21s
600:	learn: 0.2152865	test: 0.2223534	best: 0.2223534 (600)	total: 35.4s	remaining: 9m 14s
700:	learn: 0.2135718	test: 0.2217101	best: 0.2217101 (700)	total: 41.2s	remaining: 9m 7s
800:	learn: 0.2120168	test: 0.2212069	best: 0.2212069 (800)	total: 46.9s	remaining: 8m 58s
900:	learn: 0.2106323	test: 0.2208514	best: 0.2208514

8700:	learn: 0.1515696	test: 0.2177902	best: 0.2176242 (6358)	total: 8m 3s	remaining: 1m 12s
8800:	learn: 0.1510364	test: 0.2177942	best: 0.2176242 (6358)	total: 8m 9s	remaining: 1m 6s
8900:	learn: 0.1504559	test: 0.2177900	best: 0.2176242 (6358)	total: 8m 14s	remaining: 1m 1s
9000:	learn: 0.1499073	test: 0.2177859	best: 0.2176242 (6358)	total: 8m 20s	remaining: 55.5s
9100:	learn: 0.1493553	test: 0.2177829	best: 0.2176242 (6358)	total: 8m 25s	remaining: 49.9s
9200:	learn: 0.1487901	test: 0.2177772	best: 0.2176242 (6358)	total: 8m 31s	remaining: 44.4s
9300:	learn: 0.1482045	test: 0.2177895	best: 0.2176242 (6358)	total: 8m 36s	remaining: 38.8s
9400:	learn: 0.1476455	test: 0.2178073	best: 0.2176242 (6358)	total: 8m 42s	remaining: 33.3s
9500:	learn: 0.1471038	test: 0.2178108	best: 0.2176242 (6358)	total: 8m 47s	remaining: 27.7s
9600:	learn: 0.1465508	test: 0.2178437	best: 0.2176242 (6358)	total: 8m 53s	remaining: 22.2s
9700:	learn: 0.1459970	test: 0.2178598	best: 0.2176242 (6358)	total: 8m

Our fold 1 CV score is 0.7920508854200967
 
--------------------------------------------------
Training fold 2 with 2356 features...
0:	learn: 0.6558370	test: 0.6559445	best: 0.6559445 (0)	total: 339ms	remaining: 56m 25s
100:	learn: 0.2398455	test: 0.2416623	best: 0.2416623 (100)	total: 6.17s	remaining: 10m 5s
200:	learn: 0.2282743	test: 0.2311625	best: 0.2311625 (200)	total: 11.9s	remaining: 9m 41s
300:	learn: 0.2232106	test: 0.2271367	best: 0.2271367 (300)	total: 17.6s	remaining: 9m 26s
400:	learn: 0.2200058	test: 0.2248935	best: 0.2248935 (400)	total: 23.3s	remaining: 9m 17s
500:	learn: 0.2176111	test: 0.2234475	best: 0.2234475 (500)	total: 28.9s	remaining: 9m 7s
600:	learn: 0.2156354	test: 0.2224493	best: 0.2224493 (600)	total: 34.4s	remaining: 8m 58s
700:	learn: 0.2139303	test: 0.2217245	best: 0.2217245 (700)	total: 40s	remaining: 8m 50s
800:	learn: 0.2123990	test: 0.2211750	best: 0.2211750 (800)	total: 45.5s	remaining: 8m 42s
900:	learn: 0.2110026	test: 0.2207154	best: 0.2207154 

8700:	learn: 0.1518852	test: 0.2168445	best: 0.2168065 (7369)	total: 8m	remaining: 1m 11s
8800:	learn: 0.1512948	test: 0.2168285	best: 0.2168065 (7369)	total: 8m 5s	remaining: 1m 6s
8900:	learn: 0.1507655	test: 0.2168347	best: 0.2168065 (7369)	total: 8m 11s	remaining: 1m
9000:	learn: 0.1501981	test: 0.2168407	best: 0.2168065 (7369)	total: 8m 16s	remaining: 55.1s
9100:	learn: 0.1496293	test: 0.2168378	best: 0.2168065 (7369)	total: 8m 22s	remaining: 49.6s
9200:	learn: 0.1490624	test: 0.2168519	best: 0.2168065 (7369)	total: 8m 28s	remaining: 44.1s
9300:	learn: 0.1484923	test: 0.2168468	best: 0.2168065 (7369)	total: 8m 33s	remaining: 38.6s
9400:	learn: 0.1479641	test: 0.2168423	best: 0.2168065 (7369)	total: 8m 39s	remaining: 33.1s
9500:	learn: 0.1474124	test: 0.2168523	best: 0.2168065 (7369)	total: 8m 44s	remaining: 27.6s
9600:	learn: 0.1468678	test: 0.2168690	best: 0.2168065 (7369)	total: 8m 50s	remaining: 22s
9700:	learn: 0.1463401	test: 0.2168954	best: 0.2168065 (7369)	total: 8m 55s	rem

Our fold 2 CV score is 0.7948414342584729
 
--------------------------------------------------
Training fold 3 with 2356 features...
0:	learn: 0.6559801	test: 0.6561131	best: 0.6561131 (0)	total: 272ms	remaining: 45m 19s
100:	learn: 0.2393968	test: 0.2426858	best: 0.2426858 (100)	total: 6.34s	remaining: 10m 21s
200:	learn: 0.2277490	test: 0.2321627	best: 0.2321627 (200)	total: 12.2s	remaining: 9m 55s
300:	learn: 0.2228136	test: 0.2281034	best: 0.2281034 (300)	total: 17.9s	remaining: 9m 36s
400:	learn: 0.2196474	test: 0.2258491	best: 0.2258491 (400)	total: 23.8s	remaining: 9m 29s
500:	learn: 0.2172825	test: 0.2244233	best: 0.2244233 (500)	total: 29.6s	remaining: 9m 21s
600:	learn: 0.2152328	test: 0.2234363	best: 0.2234363 (600)	total: 35.4s	remaining: 9m 13s
700:	learn: 0.2135565	test: 0.2227466	best: 0.2227466 (700)	total: 41.1s	remaining: 9m 5s
800:	learn: 0.2120187	test: 0.2222172	best: 0.2222172 (800)	total: 46.8s	remaining: 8m 57s
900:	learn: 0.2106431	test: 0.2218288	best: 0.22182

8700:	learn: 0.1514151	test: 0.2176491	best: 0.2176381 (7952)	total: 8m 4s	remaining: 1m 12s
8800:	learn: 0.1508292	test: 0.2176455	best: 0.2176381 (7952)	total: 8m 10s	remaining: 1m 6s
8900:	learn: 0.1502247	test: 0.2176423	best: 0.2176381 (7952)	total: 8m 15s	remaining: 1m 1s
9000:	learn: 0.1496661	test: 0.2176525	best: 0.2176381 (7952)	total: 8m 21s	remaining: 55.6s
9100:	learn: 0.1491065	test: 0.2176774	best: 0.2176381 (7952)	total: 8m 26s	remaining: 50s
9200:	learn: 0.1485727	test: 0.2176849	best: 0.2176381 (7952)	total: 8m 32s	remaining: 44.5s
9300:	learn: 0.1480186	test: 0.2176888	best: 0.2176381 (7952)	total: 8m 37s	remaining: 38.9s
9400:	learn: 0.1474430	test: 0.2176680	best: 0.2176381 (7952)	total: 8m 43s	remaining: 33.3s
9500:	learn: 0.1468976	test: 0.2176784	best: 0.2176381 (7952)	total: 8m 48s	remaining: 27.8s
9600:	learn: 0.1463346	test: 0.2177012	best: 0.2176381 (7952)	total: 8m 54s	remaining: 22.2s
9700:	learn: 0.1457706	test: 0.2176830	best: 0.2176381 (7952)	total: 8m 

Our fold 3 CV score is 0.7931036149203536
 
--------------------------------------------------
Training fold 4 with 2356 features...
0:	learn: 0.6561225	test: 0.6561543	best: 0.6561543 (0)	total: 313ms	remaining: 52m 12s
100:	learn: 0.2401478	test: 0.2401986	best: 0.2401986 (100)	total: 6.14s	remaining: 10m 2s
200:	learn: 0.2284808	test: 0.2291907	best: 0.2291907 (200)	total: 12s	remaining: 9m 43s
300:	learn: 0.2236734	test: 0.2251445	best: 0.2251445 (300)	total: 17.7s	remaining: 9m 30s
400:	learn: 0.2205221	test: 0.2229195	best: 0.2229195 (400)	total: 23.4s	remaining: 9m 20s
500:	learn: 0.2180981	test: 0.2214304	best: 0.2214304 (500)	total: 29.1s	remaining: 9m 11s
600:	learn: 0.2160736	test: 0.2204101	best: 0.2204101 (600)	total: 34.9s	remaining: 9m 5s
700:	learn: 0.2143819	test: 0.2196940	best: 0.2196940 (700)	total: 40.4s	remaining: 8m 56s
800:	learn: 0.2129014	test: 0.2191436	best: 0.2191436 (800)	total: 45.9s	remaining: 8m 47s
900:	learn: 0.2114735	test: 0.2186348	best: 0.2186348 

8700:	learn: 0.1524686	test: 0.2148384	best: 0.2147591 (5567)	total: 8m 8s	remaining: 1m 12s
8800:	learn: 0.1519016	test: 0.2148650	best: 0.2147591 (5567)	total: 8m 13s	remaining: 1m 7s
8900:	learn: 0.1513254	test: 0.2148812	best: 0.2147591 (5567)	total: 8m 19s	remaining: 1m 1s
9000:	learn: 0.1507707	test: 0.2148704	best: 0.2147591 (5567)	total: 8m 24s	remaining: 56s
9100:	learn: 0.1502243	test: 0.2148820	best: 0.2147591 (5567)	total: 8m 30s	remaining: 50.4s
9200:	learn: 0.1496639	test: 0.2148932	best: 0.2147591 (5567)	total: 8m 36s	remaining: 44.8s
9300:	learn: 0.1490823	test: 0.2149201	best: 0.2147591 (5567)	total: 8m 42s	remaining: 39.2s
9400:	learn: 0.1485421	test: 0.2149105	best: 0.2147591 (5567)	total: 8m 47s	remaining: 33.6s
9500:	learn: 0.1479786	test: 0.2149286	best: 0.2147591 (5567)	total: 8m 53s	remaining: 28s
9600:	learn: 0.1474446	test: 0.2149488	best: 0.2147591 (5567)	total: 8m 58s	remaining: 22.4s
9700:	learn: 0.1468877	test: 0.2149543	best: 0.2147591 (5567)	total: 9m 4s

Our fold 4 CV score is 0.7959801966825741
Our out of folds CV score is 0.7952451818721405


# Read Submission File
This is the submission file corresponding to the output of the previous pipeline (using the average blend of 3 seeds).

In [6]:
# sub = pd.read_csv('../input/amex-sub/test_lgbm_baseline_5fold_seed_blend.csv')
# sub.to_csv('test_lgbm_baseline_5fold_seed_blend.csv', index = False)