# Feature Engineering

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Introduction" data-toc-modified-id="Introduction-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Introduction</a></span></li><li><span><a href="#Load-Libraries" data-toc-modified-id="Load-Libraries-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Load Libraries</a></span></li><li><span><a href="#Load-data-from-Pickle" data-toc-modified-id="Load-data-from-Pickle-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Load data from Pickle</a></span></li><li><span><a href="#Split-the-dataset" data-toc-modified-id="Split-the-dataset-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Split the dataset</a></span></li><li><span><a href="#List-of-features-to-drop" data-toc-modified-id="List-of-features-to-drop-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>List of features to drop</a></span></li><li><span><a href="#Feature-Engineering" data-toc-modified-id="Feature-Engineering-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Feature Engineering</a></span><ul class="toc-item"><li><span><a href="#CatBoost" data-toc-modified-id="CatBoost-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>CatBoost</a></span></li><li><span><a href="#Pickle-the-CatBoost-Features" data-toc-modified-id="Pickle-the-CatBoost-Features-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Pickle the CatBoost Features</a></span></li><li><span><a href="#LGBM" data-toc-modified-id="LGBM-6.3"><span class="toc-item-num">6.3&nbsp;&nbsp;</span>LGBM</a></span></li><li><span><a href="#Pickle-the-LGBM-features" data-toc-modified-id="Pickle-the-LGBM-features-6.4"><span class="toc-item-num">6.4&nbsp;&nbsp;</span>Pickle the LGBM features</a></span></li></ul></li><li><span><a href="#Next-Step---Modeling" data-toc-modified-id="Next-Step---Modeling-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Next Step - Modeling</a></span></li></ul></div>

## Introduction
This is where the bulk of the feature engineering takes place. The code can be found under `src/fe_modeling.py`. Here we use feature aggregation to capture the historical activity of the transactions. Each transaction carries information about previous transactions, including aggregations, so that we can track the mean and standard deviation over specific time intervals. These properties should allow our models to perform a `behavioral analysis` of the activity of each credit card. One side effect is that the feature space can grow considerably.

We are also `frequency encoding` values, doing `min-max normalization`, clipping negative values, and treating `NaNs`.
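
To make these transformations concrete, here is a minimal sketch on toy data. The helper names (`frequency_encode`, `min_max_normalize`, `clip_and_fill`), the toy columns, and the `-999` sentinel are assumptions for illustration; the actual implementation lives in `src/fe_modeling.py` and may differ.

```python
import numpy as np
import pandas as pd

def frequency_encode(train, val, col):
    """Replace each value of `col` with its relative frequency in train and val combined."""
    combined = pd.concat([train[col], val[col]], ignore_index=True)
    freq = combined.value_counts(normalize=True)  # NaN values are not counted
    return train[col].map(freq), val[col].map(freq)

def min_max_normalize(s):
    """Scale a numeric Series to [0, 1]; a constant Series maps to all zeros."""
    lo, hi = s.min(), s.max()
    if hi == lo:
        return pd.Series(0.0, index=s.index)
    return (s - lo) / (hi - lo)

def clip_and_fill(s, fill_value=-999):
    """Clip negative values to 0, then replace NaN with a sentinel (assumed here to be -999)."""
    return s.clip(lower=0).fillna(fill_value)

# Toy data, not taken from the project's pickle files
train = pd.DataFrame({'card1': ['a', 'a', 'b'], 'amt': [-5.0, 10.0, np.nan]})
val = pd.DataFrame({'card1': ['a', 'c'], 'amt': [2.0, 4.0]})

train['card1_FE'], val['card1_FE'] = frequency_encode(train, val, 'card1')
train['amt_scaled'] = min_max_normalize(train['amt'])
train['amt_clean'] = clip_and_fill(train['amt'])
```

Frequency encoding turns a high-cardinality categorical into a single numeric column, which tree-based models can split on directly.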

## Load Libraries

In [1]:
import warnings
warnings.filterwarnings("ignore")
import os
import sys
import gc
import pickle

module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
from src.fe_modeling import *

%matplotlib inline

# Some Pandas options so that we can examine all the columns in the dataset with head
pd.options.display.max_rows = 10000
pd.options.display.max_columns = 10000
pd.options.display.max_colwidth = 1000

In [3]:
def seed_everything(seed=0):
    random.seed(seed)
    np.random.seed(seed)

In [4]:
SEED = 42
seed_everything(SEED)
START_DATE = datetime.datetime.strptime('2017-11-30', '%Y-%m-%d')
TARGET = 'isFraud'
NFOLDS = 5

## Load data from Pickle
We have already done some cleaning and feature engineering on the datasets and saved the results to pickle files. Here, we load the datasets from pickle so that we can continue working.

In [5]:
%%time
X = pd.read_pickle('../data/train_reduced.pkl')
X_test = pd.read_pickle('../data/test_reduced.pkl')

y = X[TARGET]
X = X.drop(TARGET, axis=1)
        
print(f'X.shape : {X.shape}, X_test.shape : {X_test.shape}')

X.shape : (590540, 434), X_test.shape : (506691, 434)
Wall time: 7.41 s


## Split the dataset
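As per usual, we split our data into training and validation. The split uses `shuffle=False`, so the first 80% of the rows become the training set and the last 20% the validation set; this keeps the split ordered by time, provided the rows are sorted chronologically.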

In [6]:
def train_val_split_by_time(X, y, test_size=0.2):
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=test_size, shuffle=False)
    
    print(f'train.shape: {X_train.shape}, val.shape: {X_val.shape}')
    
    return X_train, y_train, X_val, y_val

In [7]:
X_train, y_train, X_val, y_val = train_val_split_by_time(X, y)

train.shape: (472432, 434), val.shape: (118108, 434)


## List of features to drop
This list contains temporary features that are created during the feature engineering stage. They are dropped at the end of that stage, once they are no longer needed.

In [9]:
cols_to_drop = [
    'D5_DT_W_std_score',
    'ProductCD_TransactionAmt_DT_W',
    'D4_DT_D_std_score',
    'D15_DT_D_std_score',
    'D3_DT_W_std_score',
    'D11_DT_W_std_score',
    'card3_card5_DT_W_week_day_dist',
    'card5_DT_W_week_day_dist',
    'D10_DT_D_std_score',
    'card3_card5_DT_D',
    'ProductCD_cents_DT_D',
    'D4_DT_W_std_score',
    'D15_DT_W_std_score',
    'uid_DT_D',
    'card3_DT_W_week_day_dist',
    'D10_DT_W_std_score',
    'D8_DT_D_std_score',
    'card3_card5_DT_W',
    'ProductCD_cents_DT_W',
    'uid_DT_W',
    'D8_DT_W_std_score'
]
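
As a rough sketch of how such a list might be applied, the snippet below drops the temporary columns while skipping any name that is not present. The helper `drop_temporary_features` and the toy frame are hypothetical; `fe1` may handle this step differently.

```python
import pandas as pd

def drop_temporary_features(df, cols_to_drop):
    """Return a copy of df without the temporary columns; names not present are skipped."""
    return df.drop(columns=cols_to_drop, errors='ignore')

# Toy frame with two temporary columns and one feature to keep
demo = pd.DataFrame({'uid_DT_D': [1, 2], 'uid_DT_W': [0, 0], 'TransactionAmt': [10.0, 20.0]})
cleaned = drop_temporary_features(demo, ['uid_DT_D', 'uid_DT_W', 'D8_DT_W_std_score'])
print(cleaned.columns.tolist())
```

Passing `errors='ignore'` keeps the call safe when a column in the list was never created for a particular algorithm.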

## Feature Engineering
This function does the bulk of the feature engineering. These are the most important tasks:
* Identify valid credit cards - those cards that have more than two entries.
* Create timeblocks for use in feature aggregation to examine transactions based on periodic behavior.
* Use feature aggregation to capture behavioral aspects of card transactions.
* Feature encoding.
* Removing temporary and unnecessary features.

The datasets differ slightly for `CatBoost` and `LGBM`, so we run the feature engineering twice. In the future, I plan to do the work common to both algorithms in a single pass and handle the steps needed only for `CatBoost` separately; those results can then be merged into a copy of the shared output for `CatBoost`.
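
The first three tasks in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/fe_modeling.py`: the helper names, the `TransactionDT` column (taken to be seconds elapsed since the start date), and the choice to compute statistics over train and validation combined are all hypothetical.

```python
import pandas as pd

def valid_mask(train, val, col, min_count=3):
    """True where the value of `col` occurs at least `min_count` times
    (that is, more than two entries) in train and val combined."""
    counts = pd.concat([train[col], val[col]]).value_counts()
    valid = counts[counts >= min_count].index
    return train[col].isin(valid), val[col].isin(valid)

def add_timeblocks(df, dt_col='TransactionDT'):
    """Add day (DT_D) and week (DT_W) block indices, assuming dt_col holds elapsed seconds."""
    out = df.copy()
    elapsed = pd.to_timedelta(out[dt_col], unit='s')
    out['DT_D'] = elapsed.dt.days      # whole days since the start date
    out['DT_W'] = out['DT_D'] // 7     # whole weeks since the start date
    return out

def group_aggregation(train, val, group_col, value_col, aggs=('mean', 'std')):
    """Add <group>_<value>_<agg> columns, with statistics computed per group."""
    train, val = train.copy(), val.copy()
    combined = pd.concat([train[[group_col, value_col]], val[[group_col, value_col]]])
    stats = combined.groupby(group_col)[value_col].agg(list(aggs))
    for agg in aggs:
        name = f'{group_col}_{value_col}_{agg}'
        train[name] = train[group_col].map(stats[agg])
        val[name] = val[group_col].map(stats[agg])
    return train, val

# Toy data: uid 1 has three transactions, uid 2 has one
df = pd.DataFrame({
    'uid': [1, 1, 1, 2],
    'TransactionDT': [0, 86400, 8 * 86400, 9 * 86400],
    'TransactionAmt': [10.0, 30.0, 20.0, 5.0],
})
df = add_timeblocks(df)
train, val = df.iloc[:3], df.iloc[3:]
train_mask, val_mask = valid_mask(train, val, 'uid')
train, val = group_aggregation(train, val, 'uid', 'TransactionAmt')
```

A group with a single transaction gets `NaN` for its standard deviation, which is one reason `NaN` handling matters after aggregation.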


### CatBoost 

In [13]:
X_train_fe, X_val_fe, feature_cols1, category_cols1 = fe1(X_train, X_val, cols_to_drop, algo='CatBoost')

Rare data card1 5134
No intersection in Train card1 20399
Intersection in Train card1 452033
####################
Rare data ProductCD_card1 10509
No intersection in Train ProductCD_card1 33115
Intersection in Train ProductCD_card1 439317
####################
Rare data card1_addr1 21640
No intersection in Train card1_addr1 57867
Intersection in Train card1_addr1 414565
####################
Rare data TransactionAmt_dist2 18260
No intersection in Train TransactionAmt_dist2 49343
Intersection in Train TransactionAmt_dist2 423089
####################
No intersection in Train card2 6102
Intersection in Train card2 466330
####################
No intersection in Train card3 146
Intersection in Train card3 472286
####################
No intersection in Train card4 0
Intersection in Train card4 472432
####################
No intersection in Train card5 7339
Intersection in Train card5 465093
####################
No intersection in Train card6 45
Intersection in Train card6 472387
###############

uid_aggregation: card5_id_14_mean
uid_aggregation: card5_id_14_std
uid_aggregation: uid_id_14_mean
uid_aggregation: uid_id_14_std
uid_aggregation: card3_card5_id_14_mean
uid_aggregation: card3_card5_id_14_std
uid_aggregation: uid_V258_mean
uid_aggregation: uid_V258_std
uid_aggregation: card3_card5_V258_mean
uid_aggregation: card3_card5_V258_std
uid_aggregation: uid_V306_mean
uid_aggregation: uid_V306_std
uid_aggregation: card3_card5_V306_mean
uid_aggregation: card3_card5_V306_std
uid_aggregation: uid_V307_mean
uid_aggregation: uid_V307_std
uid_aggregation: card3_card5_V307_mean
uid_aggregation: card3_card5_V307_std
uid_aggregation: uid_V308_mean
uid_aggregation: uid_V308_std
uid_aggregation: card3_card5_V308_mean
uid_aggregation: card3_card5_V308_std
uid_aggregation: uid_V294_mean
uid_aggregation: uid_V294_std
uid_aggregation: card3_card5_V294_mean
uid_aggregation: card3_card5_V294_std
timeblock frequency encoding: ProductCD_TransactionAmt_DT_D
timeblock frequency encoding: ProductCD_T

processing nan group agg for: ['V143', 'V144', 'V145', 'V150', 'V151', 'V152', 'V159', 'V160', 'V164', 'V165', 'V166']
processing nan group agg for: ['V167', 'V168', 'V172', 'V173', 'V176', 'V177', 'V178', 'V179', 'V181', 'V182', 'V183', 'V186', 'V187', 'V190', 'V191', 'V192', 'V193', 'V196', 'V199', 'V202', 'V203', 'V204', 'V205', 'V206', 'V207', 'V211', 'V212', 'V213', 'V214', 'V215', 'V216']
processing nan group agg for: ['V169', 'V170', 'V171', 'V174', 'V175', 'V180', 'V184', 'V185', 'V188', 'V189', 'V194', 'V195', 'V197', 'V198', 'V200', 'V201', 'V208', 'V209', 'V210']
processing nan group agg for: ['V217', 'V218', 'V219', 'V223', 'V224', 'V225', 'V226', 'V228', 'V229', 'V230', 'V231', 'V232', 'V233', 'V235', 'V236', 'V237', 'V240', 'V241', 'V242', 'V243', 'V244', 'V246', 'V247', 'V248', 'V249', 'V252', 'V253', 'V254', 'V257', 'V258', 'V260', 'V261', 'V262', 'V263', 'V264', 'V265', 'V266', 'V267', 'V268', 'V269', 'V273', 'V274', 'V275', 'V276', 'V277', 'V278']
processing nan group

Column: V295  | Dominator: 0.0
Column: V296  | Dominator: 0.0
Column: V297  | Dominator: 0.0
Column: V298  | Dominator: 0.0
Column: V299  | Dominator: 0.0
Column: V300  | Dominator: 0.0
Column: V301  | Dominator: 0.0
Column: V305  | Dominator: 1.0
Column: V309  | Dominator: 0.0
Column: V311  | Dominator: 0.0
Column: V316  | Dominator: 0.0
Column: V317  | Dominator: 0.0
Column: V318  | Dominator: 0.0
Column: V319  | Dominator: 0.0
Column: V320  | Dominator: 0.0
Column: V321  | Dominator: 0.0
Column: V322  | Dominator: -999.0
Column: V323  | Dominator: -999.0
Column: V324  | Dominator: -999.0
Column: V325  | Dominator: -999.0
Column: V326  | Dominator: -999.0
Column: V327  | Dominator: -999.0
Column: V328  | Dominator: -999.0
Column: V329  | Dominator: -999.0
Column: V330  | Dominator: -999.0
Column: V331  | Dominator: -999.0
Column: V332  | Dominator: -999.0
Column: V333  | Dominator: -999.0
Column: V334  | Dominator: -999.0
Column: V335  | Dominator: -999.0
Column: V336  | Dominator: -

Column: nan_group_508595  | Dominator: 1
Column: nan_group_508589  | Dominator: 1
Column: nan_group_12  | Dominator: 0
Column: nan_group_508189  | Dominator: 1
Column: nan_group_524216  | Dominator: 1
Column: nan_group_585385  | Dominator: 1
Column: nan_group_100  | Dominator: 0
Column: nan_group_519723  | Dominator: 1
Column: nan_group_559  | Dominator: 0
Column: nan_group_781  | Dominator: 0
Column: nan_group_225  | Dominator: 0
Column: nan_group_13087  | Dominator: 0
Column: nan_group_182  | Dominator: 0
Column: nan_group_187  | Dominator: 0
Column: nan_group_8310  | Dominator: 0
Column: nan_group_8325  | Dominator: 0
Column: nan_group_56771  | Dominator: 0
Column: nan_group_77261  | Dominator: 0
Column: nan_group_13160  | Dominator: 0
Column: nan_group_15575  | Dominator: 0
Column: nan_group_160  | Dominator: 0
Column: nan_group_573  | Dominator: 0
Duplicate card3_FE_FULL
Duplicate card3_TransactionAmt_mean
Duplicate card3_TransactionAmt_std
Duplicate card3_id_01_mean
Duplicate car

In [15]:
X_train_fe.head()

Unnamed: 0,TransactionID,TransactionAmt,ProductCD,card1,card2,card3,card4,card5,card6,addr1,addr2,dist1,dist2,P_emaildomain,R_emaildomain,C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12,C13,C14,M1,M2,M3,M4,M5,M6,M7,M8,M9,id_01,id_02,id_03,id_05,id_06,id_07,id_11,id_12,id_13,id_14,id_15,id_16,id_17,id_18,id_19,id_20,id_21,id_22,id_23,id_24,id_25,id_26,id_27,id_28,id_29,id_32,id_34,id_35,id_36,id_37,id_38,DeviceType,uid,is_december,card1_FE_FULL,card2_FE_FULL,card5_FE_FULL,ProductCD_card1_FE_FULL,card1_addr1_FE_FULL,TransactionAmt_dist2_FE_FULL,uid_FE_FULL,card3_card5,card3_DT_D_hour_dist,card3_DT_D_hour_dist_best,card3_DT_W_week_day_dist_best,card5_DT_D_hour_dist,card5_DT_D_hour_dist_best,card5_DT_W_week_day_dist_best,card3_card5_DT_D_hour_dist,card3_card5_DT_D_hour_dist_best,card3_card5_DT_W_week_day_dist_best,uid_D1_mean,uid_D1_std,card3_card5_D1_mean,card3_card5_D1_std,uid_D2_mean,uid_D2_std,card3_card5_D2_mean,card3_card5_D2_std,uid_D3_mean,uid_D3_std,card3_card5_D3_mean,card3_card5_D3_std,uid_D4_mean,uid_D4_std,card3_card5_D4_mean,card3_card5_D4_std,uid_D5_mean,uid_D5_std,card3_card5_D5_mean,card3_card5_D5_std,uid_D6_mean,uid_D6_std,card3_card5_D6_mean,card3_card5_D6_std,uid_D7_mean,uid_D7_std,card3_card5_D7_mean,card3_card5_D7_std,uid_D8_mean,uid_D8_std,card3_card5_D8_mean,card3_card5_D8_std,uid_D9_mean,card3_card5_D9_mean,card3_card5_D9_std,uid_D10_mean,uid_D10_std,card3_card5_D10_mean,card3_card5_D10_std,uid_D11_mean,uid_D11_std,card3_card5_D11_mean,card3_card5_D11_std,uid_D12_mean,uid_D12_std,card3_card5_D12_mean,card3_card5_D12_std,uid_D13_mean,uid_D13_std,card3_card5_D13_mean,card3_card5_D13_std,uid_D14_mean,uid_D14_std,card3_card5_D14_mean,card3_card5_D14_std,uid_D15_mean,uid_D15_std,card3_card5_D15_mean,card3_card5_D15_std,D8_not_same_day,D3_DT_D_min_max,D3_DT_D_std_score,D4_DT_D_min_max,D5_DT_D_min_max,D5_DT_D_std_score,D10_DT_D_min_max,D11_DT_D_min_max,D11_DT_D_std_score,D15_DT_D_min_max,D3_DT_W_min_max,D4_DT_W_min_max,D5_DT_W_min_max,D10_DT_W_min_max,D11_DT_W_
min_max,D15_DT_W_min_max,D1_scaled,D2_scaled,D1_FE_FULL,D2_FE_FULL,D3_FE_FULL,D4_FE_FULL,D5_FE_FULL,D10_FE_FULL,D11_FE_FULL,D15_FE_FULL,TransactionAmt_check,card1_TransactionAmt_mean,card1_TransactionAmt_std,card2_TransactionAmt_mean,card2_TransactionAmt_std,card5_TransactionAmt_mean,card5_TransactionAmt_std,uid_TransactionAmt_mean,uid_TransactionAmt_std,card3_card5_TransactionAmt_mean,card3_card5_TransactionAmt_std,TransactionAmt_DT_D_min_max,TransactionAmt_DT_D_std_score,TransactionAmt_DT_W_min_max,TransactionAmt_DT_W_std_score,card1_id_01_mean,card1_id_01_std,card2_id_01_mean,card2_id_01_std,card5_id_01_mean,card5_id_01_std,uid_id_01_mean,uid_id_01_std,card3_card5_id_01_mean,card3_card5_id_01_std,card1_id_02_mean,card1_id_02_std,card2_id_02_mean,card2_id_02_std,card5_id_02_mean,card5_id_02_std,uid_id_02_mean,uid_id_02_std,card3_card5_id_02_mean,card3_card5_id_02_std,card1_id_05_mean,card1_id_05_std,card2_id_05_mean,card2_id_05_std,card5_id_05_mean,card5_id_05_std,uid_id_05_mean,uid_id_05_std,card3_card5_id_05_mean,card3_card5_id_05_std,card1_id_06_mean,card1_id_06_std,card2_id_06_mean,card2_id_06_std,card3_id_06_std,card5_id_06_mean,card5_id_06_std,uid_id_06_mean,uid_id_06_std,card3_card5_id_06_mean,card3_card5_id_06_std,card1_id_09_mean,card1_id_09_std,card2_id_09_mean,card2_id_09_std,card5_id_09_mean,card5_id_09_std,uid_id_09_mean,card3_card5_id_09_mean,card3_card5_id_09_std,card1_id_14_mean,card1_id_14_std,card2_id_14_mean,card2_id_14_std,card5_id_14_mean,card5_id_14_std,uid_id_14_mean,uid_id_14_std,card3_card5_id_14_mean,card3_card5_id_14_std,uid_V258_mean,uid_V258_std,card3_card5_V258_mean,card3_card5_V258_std,uid_V306_mean,uid_V306_std,card3_card5_V306_mean,card3_card5_V306_std,uid_V307_mean,uid_V307_std,card3_card5_V307_mean,card3_card5_V307_std,uid_V308_mean,uid_V308_std,card3_card5_V308_mean,card3_card5_V308_std,uid_V294_mean,uid_V294_std,card3_card5_V294_mean,card3_card5_V294_std,ProductCD_TransactionAmt_DT_D,ProductCD_TransactionAmt_FE_FULL,ProductCD_c
ents_FE_FULL,c_cols_0_bin,c_cols_0_bin_FE_FULL,C1_FE_FULL,C2_FE_FULL,C3_FE_FULL,C4_FE_FULL,C5_FE_FULL,C6_FE_FULL,C8_FE_FULL,C9_FE_FULL,C10_FE_FULL,C11_FE_FULL,C12_FE_FULL,C13_FE_FULL,C14_FE_FULL,card1_C1_mean,card1_C1_std,card2_C1_mean,card2_C1_std,card5_C1_mean,card5_C1_std,uid_C1_mean,uid_C1_std,card3_card5_C1_mean,card3_card5_C1_std,card1_C2_mean,card1_C2_std,card2_C2_mean,card2_C2_std,card5_C2_mean,card5_C2_std,uid_C2_mean,uid_C2_std,card3_card5_C2_mean,card3_card5_C2_std,card1_C3_mean,card1_C3_std,card2_C3_mean,card2_C3_std,card5_C3_mean,card5_C3_std,uid_C3_mean,uid_C3_std,card3_card5_C3_mean,card3_card5_C3_std,card1_C4_mean,card1_C4_std,card2_C4_mean,card2_C4_std,card5_C4_mean,card5_C4_std,uid_C4_mean,uid_C4_std,card3_card5_C4_mean,card3_card5_C4_std,card1_C5_mean,card1_C5_std,card2_C5_mean,card2_C5_std,card5_C5_mean,card5_C5_std,uid_C5_mean,uid_C5_std,card3_card5_C5_mean,card3_card5_C5_std,card1_C6_mean,card1_C6_std,card2_C6_mean,card2_C6_std,card5_C6_mean,card5_C6_std,uid_C6_mean,uid_C6_std,card3_card5_C6_mean,card3_card5_C6_std,card1_C7_mean,card1_C7_std,card2_C7_mean,card2_C7_std,card5_C7_mean,card5_C7_std,uid_C7_mean,uid_C7_std,card3_card5_C7_mean,card3_card5_C7_std,card1_C8_mean,card1_C8_std,card2_C8_mean,card2_C8_std,card5_C8_mean,card5_C8_std,uid_C8_mean,uid_C8_std,card3_card5_C8_mean,card3_card5_C8_std,card1_C9_mean,card1_C9_std,card2_C9_mean,card2_C9_std,card5_C9_mean,card5_C9_std,uid_C9_mean,uid_C9_std,card3_card5_C9_mean,card3_card5_C9_std,card1_C10_mean,card1_C10_std,card2_C10_mean,card2_C10_std,card5_C10_mean,card5_C10_std,uid_C10_mean,uid_C10_std,card3_card5_C10_mean,card3_card5_C10_std,card1_C11_mean,card1_C11_std,card2_C11_mean,card2_C11_std,card5_C11_mean,card5_C11_std,uid_C11_mean,uid_C11_std,card3_card5_C11_mean,card3_card5_C11_std,card1_C12_mean,card1_C12_std,card2_C12_mean,card2_C12_std,card5_C12_mean,card5_C12_std,uid_C12_mean,uid_C12_std,card3_card5_C12_mean,card3_card5_C12_std,card1_C13_mean,card1_C13_std,card2_C13_mean,card2_C13_std,c
ard5_C13_mean,card5_C13_std,uid_C13_mean,uid_C13_std,card3_card5_C13_mean,card3_card5_C13_std,card1_C14_mean,card1_C14_std,card2_C14_mean,card2_C14_std,card5_C14_mean,card5_C14_std,uid_C14_mean,uid_C14_std,card3_card5_C14_mean,card3_card5_C14_std,card1_dist1_mean,card1_dist1_std,card2_dist1_mean,card2_dist1_std,card3_dist1_std,card5_dist1_mean,card5_dist1_std,uid_dist1_mean,uid_dist1_std,card3_card5_dist1_mean,card3_card5_dist1_std,nan_group_0_sum,nan_group_0_mean,nan_group_0_std,nan_group_1_sum,nan_group_1_mean,nan_group_1_std,nan_group_2_sum,nan_group_2_mean,nan_group_2_std,nan_group_3_sum,nan_group_3_mean,nan_group_3_std,nan_group_4_sum,nan_group_4_mean,nan_group_4_std,nan_group_5_sum,nan_group_5_mean,nan_group_5_std,nan_group_6_sum,nan_group_8_sum,nan_group_8_mean,nan_group_8_std,nan_group_9_sum,nan_group_9_mean,nan_group_9_std,nan_group_10_sum,nan_group_10_mean,nan_group_10_std,nan_group_11_sum,nan_group_11_mean,nan_group_11_std,nan_group_12_sum,nan_group_12_mean,nan_group_12_std,nan_group_13_sum,nan_group_13_mean,nan_group_13_std,nan_group_14_sum,DeviceInfo_FE_FULL,DeviceInfo_device_FE_FULL,DeviceInfo_version_FE_FULL,id_30_FE_FULL,id_30_version_FE_FULL,id_31_FE_FULL,id_31_device_FE_FULL,id_33_FE_FULL,id_01_FE_FULL,id_05_FE_FULL,id_06_FE_FULL,id_11_FE_FULL,id_13_FE_FULL,id_17_FE_FULL,id_19_FE_FULL,id_20_FE_FULL,nan_group_22364,nan_group_10882,nan_group_179,nan_group_8300,nan_group_65706,nan_group_1269,nan_group_280797,nan_group_262878,nan_group_168922,nan_group_309841,nan_group_76022,nan_group_279287,nan_group_528353,nan_group_89113,nan_group_76073,nan_group_168969,nan_group_77096,nan_group_89164,nan_group_314,nan_group_450909,nan_group_450721,nan_group_460110,nan_group_449124,nan_group_12,nan_group_453675,nan_group_100,nan_group_481325,nan_group_559,nan_group_781,nan_group_225,nan_group_138568,nan_group_13087,nan_group_182,nan_group_187,nan_group_8310,nan_group_8325,nan_group_56771,nan_group_77261,nan_group_13160,nan_group_15575,nan_group_446169,nan_group_4986
05,nan_group_160,nan_group_573,nan_group_138574
0,2987000,4.241327,4,13926.0,327.0,1,1,142.0,0,315.0,1,19.0,1,48,49,1.0,1.0,1,0.0,0.0,1.0,1,0.0,1.0,0.0,2.0,0.0,1.0,1.0,1,1,1,2,0,1,2,2,2,,,1,,,1,,2,,1,3,2,,1,,,1,1,3,1,1,1,2,2,2,1,4,2,2,2,2,2,113985,1,7.3e-05,0.010893,0.000476,5.2e-05,0.110099,0.001429,2e-06,243,-14.682021,-16.0,0.0,-10.0,-20.0,-2.0,-10.0,-20,-2,14.0,,49.78125,98.8125,,,96.8125,120.5625,13.0,,23.21875,46.71875,,,86.1875,140.75,,,39.3125,80.25,1,1,0.0,0.0,1,1,,,,1,271.25,inf,,0.71875,0.171387,13.0,,67.375,123.5,13.0,,89.3125,149.75,1,1,,,1,1,,,1,1,6.667969,16.328125,0.0,,116.9375,178.25,1,0.026694,-0.29547,,,,0.018705,0.026804,-0.866757,0.0,0.026694,,,0.018705,0.026639,0.0,0.021881,,0.005045,0.475492,0.010541,0.286047,0.524674,0.003204,0.001929,0.295088,1,351.936035,371.153198,269.468292,404.310577,189.505783,331.647064,68.5,,189.505783,331.647064,0.020518,-0.307812,0.016626,-0.280889,-5.0,0.0,-7.820312,13.15625,-8.421875,9.726562,,,-8.421875,9.726562,153111.0,96778.710938,112361.515625,122155.867188,168419.765625,99183.804688,,,168419.765625,99183.804688,4.332031,10.273438,2.316406,5.585938,5.875,9.867188,,,5.875,9.867188,-6.667969,12.9375,-5.222656,13.703125,1,-15.6875,26.109375,,,-15.6875,26.109375,0.666504,1.154297,0.079834,0.849121,1.5,1.915039,,1.5,1.915039,-375.0,81.4375,-335.75,67.8125,-374.0,75.125,,1,-374.0,75.125,,,1.0,0.0,0.0,,133.509964,606.992615,117.0,,187.089355,622.895142,0.0,,157.259933,624.966431,1.0,,0.238403,0.753418,0.002538,0.001411,0.042517,66,0.257124,0.536443,0.535545,0.995887,0.765203,0.630669,0.578372,0.758064,0.387676,0.767843,0.151856,0.828393,0.33755,0.542197,1.186523,0.450195,9.25,75.875,6.800781,23.671875,1.0,,6.800781,23.671875,1.186523,0.393799,9.1875,82.375,6.578125,22.875,1.0,,6.578125,22.875,0.0,0.0,0.019119,0.146851,0.0,0.0,1,,0.0,0.0,0.302246,0.637695,3.996094,54.03125,0.071167,0.308105,0.0,,0.071167,0.308105,0.093018,0.293945,2.386719,16.328125,6.089844,28.25,0.0,,6.089844,28.25,1.395508,1.049805,6.582031,56.5625,5.355469,18.6875,1.0,,5.355469,18.6875,0.0,0.0
,0.0,0.0,0.0,0.0,1,,0.0,0.0,0.348877,0.650391,2.728516,31.078125,0.074707,0.324219,0.0,,0.074707,0.324219,0.744141,0.693359,2.382812,11.53125,4.519531,15.570312,1.0,,4.519531,15.570312,0.325684,0.565918,2.613281,27.078125,0.067627,0.291016,0.0,,0.067627,0.291016,1.302734,0.599121,6.992188,59.71875,4.832031,16.40625,2.0,,4.832031,16.40625,0.0,0.0,0.068542,0.253174,0.067627,0.251465,0.0,,0.067627,0.251465,1.279297,0.766113,17.0,73.625,26.515625,103.375,1.0,,26.515625,103.375,1.023438,0.407715,5.890625,41.21875,5.738281,19.546875,1.0,,5.738281,19.546875,37.5,50.78125,136.75,inf,1,87.3125,0,19.0,,87.3125,0,9.0,0.818359,0.404541,9.0,0.391357,0.499023,0.0,,,9.0,0.40918,0.50293,7.0,0.350098,0.489502,255.0,5.930233,24.826441,1,0.0,,,0.0,,,0.0,,,0.0,,,239.0,7.46875,28.735708,2.0,0.181763,0.404541,1,0.799055,0.799055,1,1,1,0.762451,0.762451,1,0.755761,0.768238,0.768238,0.761273,0.784401,0.763997,0.764084,0.76418,1,1,1,1,1,1,1,0,1,1,1,0,1,0,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,2987001,3.401197,4,2755.0,404.0,1,2,102.0,0,325.0,1,,1,16,49,1.0,1.0,1,0.0,0.0,1.0,1,0.0,0.0,0.0,1.0,0.0,1.0,1.0,2,2,2,0,1,1,2,2,2,,,1,,,1,,2,,1,3,2,,1,,,1,1,3,1,1,1,2,2,2,1,4,2,2,2,2,2,165461,1,0.001157,0.005219,0.049409,0.001048,0.000103,0.01817,2e-06,225,-14.682021,-16.0,0.0,-14.068421,-18.0,0.0,-14.511905,-18,0,0.0,,62.46875,132.5,,,139.25,167.75,,,26.859375,65.6875,0.0,,119.9375,184.625,,,40.46875,90.125,1,1,213.125,123.9375,1,1,16.90625,72.9375,,1,139.5,233.0,,0.592773,0.291748,0.0,,122.75,176.0,,,118.3125,175.875,1,1,42.71875,102.8125,1,1,27.890625,45.4375,1,1,176.75,141.5,0.0,,143.0,195.5,1,,,0.0,,,0.0,,,0.0,,0.0,,0.0,,0.0,0.0,,0.474362,0.475492,0.445148,0.282091,0.524674,0.375809,0.472935,0.295088,1,232.698868,441.577118,227.853912,356.084045,212.483826,345.099121,29.0,,231.598236,359.79068,0.00835,-0.466953,0.006766,-0.468352,-7.539062,11.101562,-7.890625,13.65625,-9.28125,12.203125,,,-8.046875,11.210938,153593.109375,189083.765625,121260.648438,120521.335938,137190.859375,146466.15625,,,99874.757812,117258.554688,2.791016,6.511719,2.591797,6.0625,1.811523,5.144531,,,2.328125,5.503906,-8.867188,24.765625,-6.382812,16.578125,1,-5.617188,15.320312,,,-4.789062,14.273438,0.342773,0.872559,0.080688,1.503906,0.096741,0.95752,,0.085938,0.967773,-352.25,67.4375,-333.75,70.625,-330.5,90.75,,1,-333.75,76.1875,,,1.316406,2.003906,0.0,,2104.952148,10804.65332,0.0,,4186.664062,19393.453125,0.0,,3033.937988,13800.174805,0.0,,34.34375,167.75,0.009762,0.018083,0.459993,71,0.024615,0.536443,0.535545,0.995887,0.765203,0.630669,0.578372,0.758064,0.311232,0.767843,0.659891,0.828393,0.33755,0.542197,2.59375,7.933594,6.050781,63.0625,13.5625,156.375,1.0,,7.589844,65.1875,2.244141,7.0,6.003906,69.0625,14.921875,179.25,1.0,,7.355469,73.25,0.008781,0.10791,0.012657,0.117432,0.006237,0.085449,1,,0.006924,0.090027,0.08197,0.300049,2.533203,45.8125,5.613281,82.0,0.0,,2.576172,44.53125,1.043945,8.648438,1.347656,10.695312,2.482422,16.4375,0.0,,2.785156,17.390625,1.84082,5.753906,4.4
14062,47.34375,7.964844,83.0625,1.0,,5.214844,46.9375,0.001464,0.038269,0.00292,0.059662,3.574219,72.625,1,,0.287842,18.4375,0.117126,0.406494,1.708984,24.53125,7.058594,112.8125,0.0,,2.228516,35.59375,1.818359,5.109375,1.725586,7.539062,2.238281,10.546875,0.0,,2.509766,11.140625,0.127319,0.471924,1.557617,22.296875,6.671875,111.9375,0.0,,1.879883,36.0,1.762695,5.300781,4.53125,49.8125,9.84375,110.3125,1.0,,5.734375,51.6875,0.080505,0.272217,0.062927,0.245605,5.125,101.875,0.0,,0.450684,24.875,10.820312,36.6875,13.242188,56.09375,22.890625,120.4375,1.0,,20.734375,79.8125,2.212891,6.421875,3.966797,34.09375,6.78125,55.8125,1.0,,5.25,34.46875,120.8125,inf,98.4375,inf,1,122.8125,1,,,122.625,1,0.0,,,7.0,0.304443,0.470459,7.0,0.388916,0.501465,7.0,0.318115,0.476807,7.0,0.350098,0.489502,19.0,0.44186,0.502486,1,0.0,,,0.0,,,0.0,,,0.0,,,4.0,0.125,0.336011,2.0,0.181763,0.404541,1,0.799055,0.799055,1,1,1,0.762451,0.762451,1,0.755761,0.768238,0.768238,0.761273,0.784401,0.763997,0.764084,0.76418,1,1,1,1,1,1,1,1,0,1,1,1,1,0,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
2,2987002,4.094345,4,4663.0,490.0,1,4,166.0,1,330.0,1,287.0,1,35,49,1.0,1.0,1,0.0,0.0,1.0,1,0.0,1.0,0.0,1.0,0.0,1.0,1.0,1,1,1,0,0,0,0,0,0,,,1,,,1,,2,,1,3,2,,1,,,1,1,3,1,1,1,2,2,2,1,4,2,2,2,2,2,180760,1,0.001876,0.064726,0.097098,0.001859,5.8e-05,0.051793,7e-06,252,-14.682021,-16.0,0.0,-15.987198,-21.0,0.0,-16.007143,-21,0,97.75,71.25,78.1875,134.0,130.375,35.3125,130.875,150.875,51.65625,42.65625,25.96875,55.34375,98.25,71.75,119.125,169.625,56.34375,36.84375,43.5625,87.9375,1,1,176.125,185.5,1,1,23.734375,75.25,,1,116.1875,210.125,,0.561523,0.304199,98.25,71.75,104.875,160.875,413.25,71.75,108.125,158.875,1,1,0.399902,1.549805,1,1,46.4375,107.75,1,1,60.53125,139.0,413.25,71.75,140.625,182.375,1,,,0.0,,,0.0,0.649485,0.97366,0.453237,,0.0,,0.0,0.645492,0.453237,0.0,,0.474362,0.475492,0.445148,0.282091,0.524674,0.375809,0.00042,0.000615,1,97.015373,100.131348,133.095764,208.109558,97.90815,135.407013,73.5,17.058722,98.448311,135.899841,0.017591,-0.346086,0.014254,-0.325975,-6.5,7.472656,-7.957031,12.359375,-9.875,13.265625,,,-7.058594,12.132812,104099.445312,48882.363281,127363.296875,138514.234375,134787.03125,144849.515625,,,94994.734375,116771.5625,2.5,5.097656,3.259766,6.726562,2.802734,6.128906,,,3.435547,6.667969,-6.5,11.835938,-6.996094,15.945312,1,-7.148438,17.625,,,-7.265625,17.296875,0.0,0.0,0.118591,0.88916,0.127686,0.924805,,0.173828,0.952148,-340.0,67.0625,-362.0,92.625,-336.0,88.75,,1,-339.0,74.125,,,1.185547,0.772461,0.0,0.0,32.063747,140.180511,24.5,49.0,132.171158,353.848572,0.0,0.0,61.715012,226.757095,0.0,0.0,0.297363,1.296875,0.035923,0.051785,0.459993,66,0.257124,0.536443,0.535545,0.995887,0.765203,0.630669,0.578372,0.758064,0.387676,0.767843,0.659891,0.828393,0.33755,0.542197,10.0625,33.0625,9.359375,59.125,9.40625,53.34375,1.0,0.0,8.945312,32.03125,8.945312,29.140625,9.21875,65.3125,8.835938,59.25,1.5,0.577148,8.234375,30.171875,0.0,0.0,0.000889,0.030685,0.000698,0.028305,1,0.0,0.00067,0.027847,0.010834,0.180054,0.898438,28.703125,0.447021,23.70
3125,0.0,0.0,0.153931,10.5,8.382812,31.828125,6.382812,27.0625,7.226562,29.390625,0.0,0.0,7.308594,29.53125,7.53125,24.265625,6.746094,35.4375,6.742188,31.640625,1.5,0.577148,6.519531,23.625,0.0,0.0,0.501465,23.859375,0.274658,21.03125,1,0.0,0.0003,0.018295,0.018951,0.273193,1.03125,37.6875,0.544922,32.15625,0.0,0.0,0.121765,5.664062,6.449219,20.296875,5.230469,18.109375,5.769531,18.5625,1.0,0.0,5.835938,18.65625,0.010834,0.127075,1.108398,38.09375,0.597168,32.125,0.0,0.0,0.16626,5.425781,7.328125,24.1875,6.898438,42.6875,6.800781,37.90625,1.0,0.0,6.46875,23.625,0.055054,0.228149,0.774902,33.28125,0.455811,29.625,0.0,0.0,0.073547,0.275635,33.65625,110.6875,32.46875,106.8125,33.0,112.625,3.0,1.826172,32.96875,109.75,7.957031,25.390625,7.0625,29.53125,7.5,27.84375,1.0,0.0,7.390625,24.484375,78.9375,219.125,126.125,inf,1,117.8125,1,287.0,0.0,117.8125,1,9.0,0.818359,0.404541,9.0,0.391357,0.499023,9.0,0.5,0.514648,9.0,0.40918,0.50293,9.0,0.449951,0.510254,19.0,0.44186,0.502486,1,0.0,,,0.0,,,0.0,,,0.0,,,4.0,0.125,0.336011,2.0,0.181763,0.404541,1,0.799055,0.799055,1,1,1,0.762451,0.762451,1,0.755761,0.768238,0.768238,0.761273,0.784401,0.763997,0.764084,0.76418,1,1,1,1,1,1,1,1,0,1,1,0,1,0,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0
3,2987003,3.931826,4,18132.0,567.0,1,2,117.0,1,476.0,1,,1,54,49,2.0,5.0,1,0.0,0.0,4.0,1,0.0,1.0,0.0,1.0,0.0,25.0,1.0,2,2,2,0,1,0,2,2,2,,,1,,,1,,2,,1,3,2,,1,,,1,1,3,1,1,1,2,2,2,1,4,2,2,2,2,2,153415,1,0.007127,0.010426,0.044471,0.006611,0.000435,0.026755,0.000142,229,-14.682021,-16.0,0.0,-15.555556,-17.0,0.0,-15.555556,-17,0,143.125,20.171875,118.0,167.0,143.125,20.171875,175.875,175.375,0.702148,1.287109,26.3125,55.78125,128.75,18.859375,167.875,203.125,0.766113,1.366211,41.28125,85.375,1,1,179.125,235.5,1,1,42.71875,113.1875,,1,175.75,251.5,,0.571777,0.307861,116.0,20.25,155.75,196.25,,,157.875,193.625,1,1,26.8125,108.5625,1,1,43.15625,110.875,1,1,74.125,166.375,145.75,18.859375,197.0,212.25,1,0.0,-0.542432,0.171846,0.0,-0.539373,0.120863,,,0.159712,0.0,0.143075,0.0,0.120863,,0.159712,0.175049,0.175049,0.00124,0.001219,0.123775,0.000784,0.110707,0.001571,0.472935,0.000986,1,123.41539,192.707825,133.86615,216.912796,125.156227,194.968552,66.39286,24.985306,125.144241,195.011383,0.014819,-0.382346,0.012008,-0.368688,-5.433594,9.125,-6.300781,9.554688,-7.144531,11.4375,,,-7.144531,11.4375,87683.242188,92841.789062,105996.234375,116191.359375,107135.671875,110397.210938,,,107135.671875,110397.210938,2.779297,6.367188,2.748047,6.34375,3.0625,6.5,,,3.0625,6.5,-5.132812,15.515625,-5.363281,16.390625,1,-6.667969,16.859375,,,-6.667969,16.859375,-0.184204,3.269531,0.051758,2.259766,0.07666,1.551758,,0.07666,1.551758,-307.75,25.984375,-308.0,26.625,-332.75,71.4375,,1,-332.75,71.4375,,,1.200195,2.369141,204.095245,202.40741,39.570004,216.426956,2284.880859,579.714966,220.284393,589.61261,819.226196,492.60379,88.520142,358.475677,28.515625,15.054688,0.450439,2.447266,0.032019,0.003615,0.459993,66,0.257124,0.177924,0.028738,0.995887,0.765203,0.630669,0.03015,0.758064,0.387676,0.767843,0.659891,0.828393,0.005175,0.542197,11.0,53.90625,10.71875,59.71875,10.15625,48.84375,2.0,0.0,10.164062,48.875,10.75,56.53125,10.492188,63.5625,9.515625,51.78125,5.0,0.0,9.523438,51.8125,0.000475,0.
021805,0.003899,0.074219,0.001181,0.035431,1,0.0,0.001181,0.035431,1.294922,33.15625,1.80957,38.9375,0.604004,24.078125,0.0,0.0,0.604492,24.078125,7.773438,29.46875,6.875,27.625,7.890625,29.59375,0.0,0.0,7.894531,29.609375,8.484375,41.375,8.085938,45.21875,7.433594,32.6875,4.0,0.0,7.4375,32.6875,0.000238,0.015411,0.000487,0.022064,0.086121,13.453125,1,0.0,0.086121,13.460938,0.803711,17.703125,1.141602,21.453125,0.48291,22.84375,0.0,0.0,0.483154,22.84375,5.9375,18.75,5.332031,17.515625,6.273438,19.484375,1.0,0.0,6.277344,19.484375,0.705078,16.109375,0.995117,19.15625,0.462891,21.8125,0.0,0.0,0.463135,21.8125,8.226562,41.96875,7.992188,46.65625,7.4375,35.875,1.0,0.0,7.441406,35.875,0.075317,0.284668,0.081726,0.294434,0.200073,18.953125,0.0,0.0,0.200073,18.96875,37.71875,112.75,34.40625,108.1875,39.53125,118.0625,36.625,7.429688,39.53125,118.0625,8.40625,33.25,8.023438,35.28125,8.148438,29.046875,1.0,0.0,8.148438,29.046875,142.125,inf,118.5625,inf,1,101.25,1,,,101.3125,1,0.0,,,9.0,0.391357,0.499023,9.0,0.5,0.514648,9.0,0.40918,0.50293,9.0,0.449951,0.510254,5639.0,131.139542,376.56958,1,0.0,,,0.0,,,0.0,,,0.0,,,5576.0,174.25,429.604248,0.0,0.0,0.0,1,0.799055,0.799055,1,1,1,0.762451,0.762451,1,0.755761,0.768238,0.768238,0.761273,0.784401,0.763997,0.764084,0.76418,1,1,1,1,1,1,0,0,0,0,1,1,1,0,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0
4,2987004,3.931826,1,4497.0,514.0,1,2,102.0,0,420.0,1,,1,16,49,1.0,1.0,1,0.0,0.0,1.0,1,1.0,0.0,1.0,1.0,0.0,1.0,1.0,2,2,2,3,2,2,2,2,2,0.0,70787.0,1,,,1,100.0,1,,0,1,1,166.0,1,542.0,144.0,1,1,3,1,1,1,2,1,1,0,3,1,0,1,1,1,36245,1,3e-05,0.025724,0.049409,7e-06,0.110099,0.026755,2e-06,225,-14.682021,-16.0,0.0,-14.068421,-18.0,0.0,-14.511905,-18,0,0.0,,62.46875,132.5,,,139.25,167.75,,,26.859375,65.6875,,,119.9375,184.625,,,40.46875,90.125,1,1,213.125,123.9375,1,1,16.90625,72.9375,,1,139.5,233.0,,0.592773,0.291748,,,122.75,176.0,,,118.3125,175.875,1,1,42.71875,102.8125,1,1,27.890625,45.4375,1,1,176.75,141.5,,,143.0,195.5,1,,,,,,,,,,,,,,,,0.0,,0.474362,0.475492,0.445148,0.286047,0.524674,0.128733,0.472935,0.150901,1,96.971352,56.631653,224.353806,385.83432,212.483826,345.099121,50.0,,231.598236,359.79068,0.014819,-0.382346,0.012008,-0.368688,-2.5,2.886719,-10.84375,18.515625,-9.28125,12.203125,0.0,,-8.046875,11.210938,92559.5,40373.558594,124166.976562,113564.945312,137190.859375,146466.15625,70787.0,,99874.757812,117258.554688,0.333252,0.577148,2.349609,5.894531,1.811523,5.144531,,,2.328125,5.503906,-4.0,6.929688,-6.375,16.5,1,-5.617188,15.320312,,,-4.789062,14.273438,1.0,1.0,0.095154,0.915039,0.096741,0.95752,,0.085938,0.967773,-360.0,84.875,-356.75,85.125,-330.5,90.75,-480.0,1,-333.75,76.1875,1.0,,1.316406,2.003906,0.0,,2104.952148,10804.65332,0.0,,4186.664062,19393.453125,0.0,,3033.937988,13800.174805,0.0,,34.34375,167.75,0.018743,0.016659,0.055922,60,0.01451,0.536443,0.535545,0.995887,0.765203,0.630669,0.578372,0.177793,0.311232,0.169728,0.659891,0.828393,0.33755,0.542197,6.109375,5.941406,10.421875,81.4375,13.5625,156.375,1.0,,7.589844,65.1875,5.5,5.84375,10.546875,88.75,14.921875,179.25,1.0,,7.355469,73.25,0.0,0.0,0.010071,0.111084,0.006237,0.085449,1,,0.006924,0.090027,0.055542,0.235718,4.246094,57.78125,5.613281,82.0,0.0,,2.576172,44.53125,3.945312,4.855469,3.142578,18.359375,2.482422,16.4375,0.0,,2.785156,17.390625,4.0,3.677734,7.777344,60.84375,7.964844,83.0625,1.
0,,5.214844,46.9375,0.0,0.0,0.062561,3.71875,3.574219,72.625,1,,0.287842,18.4375,0.222168,0.427734,2.724609,32.9375,7.058594,112.8125,1.0,,2.228516,35.59375,4.390625,4.285156,2.996094,13.484375,2.238281,10.546875,0.0,,2.509766,11.140625,0.222168,0.427734,2.423828,29.234375,6.671875,111.9375,1.0,,1.879883,36.0,4.554688,3.972656,7.996094,64.0625,9.84375,110.3125,1.0,,5.734375,51.6875,0.0,0.0,0.135864,5.253906,5.125,101.875,0.0,,0.450684,24.875,18.0625,14.429688,20.46875,83.9375,22.890625,120.4375,1.0,,20.734375,79.8125,4.445312,4.074219,6.730469,44.28125,6.78125,55.8125,1.0,,5.25,34.46875,12.25,8.90625,122.625,inf,1,122.8125,1,,,122.625,1,0.0,,,0.0,,,0.0,,,0.0,,,0.0,,,19.0,0.44186,0.502486,1,9.0,0.290323,0.461414,10.0,0.526367,0.513184,20.0,0.434783,0.501206,8.0,0.5,0.516113,7.0,0.21875,0.420013,2.0,0.181763,0.404541,1,1.5e-05,0.000559,0,0,0,0.001797,0.003079,0,0.033114,0.768238,0.768238,0.225492,0.784401,0.133151,0.008663,0.001143,1,1,1,1,1,1,1,1,1,1,0,1,1,1,0,1,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1


### Pickle the CatBoost Features

In [16]:
# Persist the CatBoost feature matrices and labels for the modeling step.
X_train_fe.to_pickle('../data/X_train_fe_engineered_cat.pkl')
y_train.to_pickle('../data/y_train_fe_engineered_cat.pkl')

X_val_fe.to_pickle('../data/X_val_fe_engineered_cat.pkl')
y_val.to_pickle('../data/y_val_fe_engineered_cat.pkl')

# Persist the categorical and feature column lists alongside the data.
with open('../data/cat_cols_fe_engineered_cat.pkl', 'wb') as f:
    pickle.dump(category_cols1, f)

with open('../data/feat_cols_fe_engineered_cat.pkl', 'wb') as f:
    pickle.dump(feature_cols1, f)
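
When these artifacts are read back in the modeling step, a quick consistency check on the two column lists can catch mismatches early. The sketch below is a minimal round trip using only the standard library. The temporary directory, the helper names `save_list` and `load_list`, and the toy column names are placeholders rather than the notebook's actual data. It also assumes that `category_cols1` and `feature_cols1` are plain Python lists of column names and that every categorical column is expected to appear among the features.

```python
import pickle
import tempfile
from pathlib import Path


def save_list(path, values):
    """Pickle a list of column names to `path` (hypothetical helper)."""
    with open(path, 'wb') as f:
        pickle.dump(list(values), f)


def load_list(path):
    """Load a pickled list of column names from `path` (hypothetical helper)."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Toy stand-ins for category_cols1 / feature_cols1 (assumed values).
category_cols = ['ProductCD', 'card4', 'card6']
feature_cols = ['TransactionAmt', 'ProductCD', 'card4', 'card6']

with tempfile.TemporaryDirectory() as tmp:
    cat_path = Path(tmp) / 'cat_cols.pkl'
    feat_path = Path(tmp) / 'feat_cols.pkl'
    save_list(cat_path, category_cols)
    save_list(feat_path, feature_cols)

    loaded_cat = load_list(cat_path)
    loaded_feat = load_list(feat_path)

# Categorical columns that are absent from the feature list (ideally none).
missing = [c for c in loaded_cat if c not in loaded_feat]
```

If `missing` is non-empty after loading the real files, the two lists were probably produced by different runs of the feature engineering step.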

### LGBM

In [11]:
# Rebinds X_train and X_val to the LGBM-engineered frames (the originals are overwritten).
X_train, X_val, category_cols1 = fe1(X_train, X_val, cols_to_drop, algo='LGBM')

Rare data card1 5134
No intersection in Train card1 20399
Intersection in Train card1 452033
####################
Rare data ProductCD_card1 10509
No intersection in Train ProductCD_card1 33115
Intersection in Train ProductCD_card1 439317
####################
Rare data card1_addr1 21640
No intersection in Train card1_addr1 57867
Intersection in Train card1_addr1 414565
####################
Rare data TransactionAmt_dist2 18260
No intersection in Train TransactionAmt_dist2 49343
Intersection in Train TransactionAmt_dist2 423089
####################
No intersection in Train card2 6102
Intersection in Train card2 466330
####################
No intersection in Train card3 146
Intersection in Train card3 472286
####################
No intersection in Train card4 0
Intersection in Train card4 472432
####################
No intersection in Train card5 7339
Intersection in Train card5 465093
####################
No intersection in Train card6 45
Intersection in Train card6 472387
###############

uid_aggregation: card5_id_14_mean
uid_aggregation: card5_id_14_std
uid_aggregation: uid_id_14_mean
uid_aggregation: uid_id_14_std
uid_aggregation: card3_card5_id_14_mean
uid_aggregation: card3_card5_id_14_std
uid_aggregation: uid_V258_mean
uid_aggregation: uid_V258_std
uid_aggregation: card3_card5_V258_mean
uid_aggregation: card3_card5_V258_std
uid_aggregation: uid_V306_mean
uid_aggregation: uid_V306_std
uid_aggregation: card3_card5_V306_mean
uid_aggregation: card3_card5_V306_std
uid_aggregation: uid_V307_mean
uid_aggregation: uid_V307_std
uid_aggregation: card3_card5_V307_mean
uid_aggregation: card3_card5_V307_std
uid_aggregation: uid_V308_mean
uid_aggregation: uid_V308_std
uid_aggregation: card3_card5_V308_mean
uid_aggregation: card3_card5_V308_std
uid_aggregation: uid_V294_mean
uid_aggregation: uid_V294_std
uid_aggregation: card3_card5_V294_mean
uid_aggregation: card3_card5_V294_std
timeblock frequency encoding: ProductCD_TransactionAmt_DT_D
timeblock frequency encoding: ProductCD_T

processing nan group agg for: ['V143', 'V144', 'V145', 'V150', 'V151', 'V152', 'V159', 'V160', 'V164', 'V165', 'V166']
processing nan group agg for: ['V167', 'V168', 'V172', 'V173', 'V176', 'V177', 'V178', 'V179', 'V181', 'V182', 'V183', 'V186', 'V187', 'V190', 'V191', 'V192', 'V193', 'V196', 'V199', 'V202', 'V203', 'V204', 'V205', 'V206', 'V207', 'V211', 'V212', 'V213', 'V214', 'V215', 'V216']
processing nan group agg for: ['V169', 'V170', 'V171', 'V174', 'V175', 'V180', 'V184', 'V185', 'V188', 'V189', 'V194', 'V195', 'V197', 'V198', 'V200', 'V201', 'V208', 'V209', 'V210']
processing nan group agg for: ['V217', 'V218', 'V219', 'V223', 'V224', 'V225', 'V226', 'V228', 'V229', 'V230', 'V231', 'V232', 'V233', 'V235', 'V236', 'V237', 'V240', 'V241', 'V242', 'V243', 'V244', 'V246', 'V247', 'V248', 'V249', 'V252', 'V253', 'V254', 'V257', 'V258', 'V260', 'V261', 'V262', 'V263', 'V264', 'V265', 'V266', 'V267', 'V268', 'V269', 'V273', 'V274', 'V275', 'V276', 'V277', 'V278']
processing nan group
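
The `uid_aggregation` lines in the log name features of the form `<group>_<column>_<stat>`. The source of `fe1` is not shown here, so the following is only a sketch of how such group-level mean and standard-deviation features are commonly built with pandas. The helper name `add_group_aggregations` and the toy data are hypothetical, and the real function may differ, for example in whether the statistics are computed on training and validation rows together (as assumed below) or on training rows only.

```python
import pandas as pd


def add_group_aggregations(train, val, group_cols, value_cols, stats=('mean', 'std')):
    """Add <group>_<value>_<stat> columns computed over train and val combined."""
    train, val = train.copy(), val.copy()
    full = pd.concat([train, val], axis=0)
    for group in group_cols:
        for value in value_cols:
            for stat in stats:
                name = f'{group}_{value}_{stat}'
                # One statistic per group key, then mapped back onto each row.
                lookup = full.groupby(group)[value].agg(stat)
                train[name] = train[group].map(lookup)
                val[name] = val[group].map(lookup)
    return train, val


# Toy data: 'uid' stands in for a card-level identifier (assumed).
train = pd.DataFrame({'uid': ['a', 'a', 'b'], 'V258': [1.0, 3.0, 5.0]})
val = pd.DataFrame({'uid': ['a', 'b'], 'V258': [2.0, 7.0]})

train_fe, val_fe = add_group_aggregations(train, val, ['uid'], ['V258'])
```

A group that contains a single row has an undefined sample standard deviation, which pandas reports as `NaN`. That is one likely reason a `_std` column can be empty while the matching `_mean` column is populated.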

In [12]:
X_train.head(3)

Unnamed: 0,TransactionAmt,ProductCD,card1,card2,card3,card4,card5,card6,addr1,addr2,dist1,dist2,P_emaildomain,R_emaildomain,C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12,C13,C14,M1,M2,M3,M4,M5,M6,M7,M8,M9,id_01,id_02,id_03,id_04,id_05,id_06,id_07,id_08,id_09,id_10,id_11,id_12,id_13,id_14,id_15,id_16,id_17,id_18,id_19,id_20,id_21,id_22,id_23,id_24,id_25,id_26,id_27,id_28,id_29,id_32,id_34,id_35,id_36,id_37,id_38,DeviceType,is_december,card1_FE_FULL,card2_FE_FULL,card3_FE_FULL,card5_FE_FULL,ProductCD_card1_FE_FULL,card1_addr1_FE_FULL,TransactionAmt_dist2_FE_FULL,uid_FE_FULL,card3_DT_D_hour_dist,card3_DT_D_hour_dist_best,card3_DT_W_week_day_dist_best,card5_DT_D_hour_dist,card5_DT_D_hour_dist_best,card5_DT_W_week_day_dist_best,card3_card5_DT_D_hour_dist,card3_card5_DT_D_hour_dist_best,card3_card5_DT_W_week_day_dist_best,uid_D1_mean,uid_D1_std,card3_card5_D1_mean,card3_card5_D1_std,uid_D2_mean,uid_D2_std,card3_card5_D2_mean,card3_card5_D2_std,uid_D3_mean,uid_D3_std,card3_card5_D3_mean,card3_card5_D3_std,uid_D4_mean,uid_D4_std,card3_card5_D4_mean,card3_card5_D4_std,uid_D5_mean,uid_D5_std,card3_card5_D5_mean,card3_card5_D5_std,uid_D6_mean,uid_D6_std,card3_card5_D6_mean,card3_card5_D6_std,uid_D7_mean,uid_D7_std,card3_card5_D7_mean,card3_card5_D7_std,uid_D8_mean,uid_D8_std,card3_card5_D8_mean,card3_card5_D8_std,uid_D9_mean,uid_D9_std,card3_card5_D9_mean,card3_card5_D9_std,uid_D10_mean,uid_D10_std,card3_card5_D10_mean,card3_card5_D10_std,uid_D11_mean,uid_D11_std,card3_card5_D11_mean,card3_card5_D11_std,uid_D12_mean,uid_D12_std,card3_card5_D12_mean,card3_card5_D12_std,uid_D13_mean,uid_D13_std,card3_card5_D13_mean,card3_card5_D13_std,uid_D14_mean,uid_D14_std,card3_card5_D14_mean,card3_card5_D14_std,uid_D15_mean,uid_D15_std,card3_card5_D15_mean,card3_card5_D15_std,D9_not_na,D8_not_same_day,D8_D9_decimal_dist,D3_DT_D_min_max,D3_DT_D_std_score,D4_DT_D_min_max,D5_DT_D_min_max,D5_DT_D_std_score,D6_DT_D_min_max,D6_DT_D_std_score,D7_DT_D_min_max,D7_DT_D_std_score,D8_DT_D_min_max,D10_DT_D_min
_max,D11_DT_D_min_max,D11_DT_D_std_score,D12_DT_D_min_max,D12_DT_D_std_score,D13_DT_D_min_max,D13_DT_D_std_score,D14_DT_D_min_max,D14_DT_D_std_score,D15_DT_D_min_max,D3_DT_W_min_max,D4_DT_W_min_max,D5_DT_W_min_max,D6_DT_W_min_max,D6_DT_W_std_score,D7_DT_W_min_max,D7_DT_W_std_score,D8_DT_W_min_max,D10_DT_W_min_max,D11_DT_W_min_max,D12_DT_W_min_max,D12_DT_W_std_score,D13_DT_W_min_max,D13_DT_W_std_score,D14_DT_W_min_max,D14_DT_W_std_score,D15_DT_W_min_max,D1_scaled,D2_scaled,D1_FE_FULL,D2_FE_FULL,D3_FE_FULL,D4_FE_FULL,D5_FE_FULL,D6_FE_FULL,D7_FE_FULL,D8_FE_FULL,D9_FE_FULL,D10_FE_FULL,D11_FE_FULL,D12_FE_FULL,D13_FE_FULL,D14_FE_FULL,D15_FE_FULL,TransactionAmt_check,card1_TransactionAmt_mean,card1_TransactionAmt_std,card2_TransactionAmt_mean,card2_TransactionAmt_std,card3_TransactionAmt_mean,card3_TransactionAmt_std,card5_TransactionAmt_mean,card5_TransactionAmt_std,uid_TransactionAmt_mean,uid_TransactionAmt_std,card3_card5_TransactionAmt_mean,card3_card5_TransactionAmt_std,TransactionAmt_DT_D_min_max,TransactionAmt_DT_D_std_score,TransactionAmt_DT_W_min_max,TransactionAmt_DT_W_std_score,card1_id_01_mean,card1_id_01_std,card2_id_01_mean,card2_id_01_std,card3_id_01_mean,card3_id_01_std,card5_id_01_mean,card5_id_01_std,uid_id_01_mean,uid_id_01_std,card3_card5_id_01_mean,card3_card5_id_01_std,card1_id_02_mean,card1_id_02_std,card2_id_02_mean,card2_id_02_std,card3_id_02_mean,card3_id_02_std,card5_id_02_mean,card5_id_02_std,uid_id_02_mean,uid_id_02_std,card3_card5_id_02_mean,card3_card5_id_02_std,card1_id_05_mean,card1_id_05_std,card2_id_05_mean,card2_id_05_std,card3_id_05_mean,card3_id_05_std,card5_id_05_mean,card5_id_05_std,uid_id_05_mean,uid_id_05_std,card3_card5_id_05_mean,card3_card5_id_05_std,card1_id_06_mean,card1_id_06_std,card2_id_06_mean,card2_id_06_std,card3_id_06_mean,card3_id_06_std,card5_id_06_mean,card5_id_06_std,uid_id_06_mean,uid_id_06_std,card3_card5_id_06_mean,card3_card5_id_06_std,card1_id_09_mean,card1_id_09_std,card2_id_09_mean,card2_id_09_std,card3_id_09
_mean,card3_id_09_std,card5_id_09_mean,card5_id_09_std,uid_id_09_mean,uid_id_09_std,card3_card5_id_09_mean,card3_card5_id_09_std,card1_id_14_mean,card1_id_14_std,card2_id_14_mean,card2_id_14_std,card3_id_14_mean,card3_id_14_std,card5_id_14_mean,card5_id_14_std,uid_id_14_mean,uid_id_14_std,card3_card5_id_14_mean,card3_card5_id_14_std,uid_V258_mean,uid_V258_std,card3_card5_V258_mean,card3_card5_V258_std,uid_V306_mean,uid_V306_std,card3_card5_V306_mean,card3_card5_V306_std,uid_V307_mean,uid_V307_std,card3_card5_V307_mean,card3_card5_V307_std,uid_V308_mean,uid_V308_std,card3_card5_V308_mean,card3_card5_V308_std,uid_V294_mean,uid_V294_std,card3_card5_V294_mean,card3_card5_V294_std,ProductCD_TransactionAmt_DT_D,ProductCD_TransactionAmt_FE_FULL,ProductCD_cents_FE_FULL,c_cols_0_bin,c_cols_0_bin_FE_FULL,C1_FE_FULL,C2_FE_FULL,C3_FE_FULL,C4_FE_FULL,C5_FE_FULL,C6_FE_FULL,C7_FE_FULL,C8_FE_FULL,C9_FE_FULL,C10_FE_FULL,C11_FE_FULL,C12_FE_FULL,C13_FE_FULL,C14_FE_FULL,card1_C1_mean,card1_C1_std,card2_C1_mean,card2_C1_std,card3_C1_mean,card3_C1_std,card5_C1_mean,card5_C1_std,uid_C1_mean,uid_C1_std,card3_card5_C1_mean,card3_card5_C1_std,card1_C2_mean,card1_C2_std,card2_C2_mean,card2_C2_std,card3_C2_mean,card3_C2_std,card5_C2_mean,card5_C2_std,uid_C2_mean,uid_C2_std,card3_card5_C2_mean,card3_card5_C2_std,card1_C3_mean,card1_C3_std,card2_C3_mean,card2_C3_std,card3_C3_mean,card3_C3_std,card5_C3_mean,card5_C3_std,uid_C3_mean,uid_C3_std,card3_card5_C3_mean,card3_card5_C3_std,card1_C4_mean,card1_C4_std,card2_C4_mean,card2_C4_std,card3_C4_mean,card3_C4_std,card5_C4_mean,card5_C4_std,uid_C4_mean,uid_C4_std,card3_card5_C4_mean,card3_card5_C4_std,card1_C5_mean,card1_C5_std,card2_C5_mean,card2_C5_std,card3_C5_mean,card3_C5_std,card5_C5_mean,card5_C5_std,uid_C5_mean,uid_C5_std,card3_card5_C5_mean,card3_card5_C5_std,card1_C6_mean,card1_C6_std,card2_C6_mean,card2_C6_std,card3_C6_mean,card3_C6_std,card5_C6_mean,card5_C6_std,uid_C6_mean,uid_C6_std,card3_card5_C6_mean,card3_card5_C6_std,card1_C7_mean,c
ard1_C7_std,card2_C7_mean,card2_C7_std,card3_C7_mean,card3_C7_std,card5_C7_mean,card5_C7_std,uid_C7_mean,uid_C7_std,card3_card5_C7_mean,card3_card5_C7_std,card1_C8_mean,card1_C8_std,card2_C8_mean,card2_C8_std,card3_C8_mean,card3_C8_std,card5_C8_mean,card5_C8_std,uid_C8_mean,uid_C8_std,card3_card5_C8_mean,card3_card5_C8_std,card1_C9_mean,card1_C9_std,card2_C9_mean,card2_C9_std,card3_C9_mean,card3_C9_std,card5_C9_mean,card5_C9_std,uid_C9_mean,uid_C9_std,card3_card5_C9_mean,card3_card5_C9_std,card1_C10_mean,card1_C10_std,card2_C10_mean,card2_C10_std,card3_C10_mean,card3_C10_std,card5_C10_mean,card5_C10_std,uid_C10_mean,uid_C10_std,card3_card5_C10_mean,card3_card5_C10_std,card1_C11_mean,card1_C11_std,card2_C11_mean,card2_C11_std,card3_C11_mean,card3_C11_std,card5_C11_mean,card5_C11_std,uid_C11_mean,uid_C11_std,card3_card5_C11_mean,card3_card5_C11_std,card1_C12_mean,card1_C12_std,card2_C12_mean,card2_C12_std,card3_C12_mean,card3_C12_std,card5_C12_mean,card5_C12_std,uid_C12_mean,uid_C12_std,card3_card5_C12_mean,card3_card5_C12_std,card1_C13_mean,card1_C13_std,card2_C13_mean,card2_C13_std,card3_C13_mean,card3_C13_std,card5_C13_mean,card5_C13_std,uid_C13_mean,uid_C13_std,card3_card5_C13_mean,card3_card5_C13_std,card1_C14_mean,card1_C14_std,card2_C14_mean,card2_C14_std,card3_C14_mean,card3_C14_std,card5_C14_mean,card5_C14_std,uid_C14_mean,uid_C14_std,card3_card5_C14_mean,card3_card5_C14_std,card1_dist1_mean,card1_dist1_std,card2_dist1_mean,card2_dist1_std,card3_dist1_mean,card3_dist1_std,card5_dist1_mean,card5_dist1_std,uid_dist1_mean,uid_dist1_std,card3_card5_dist1_mean,card3_card5_dist1_std,nan_group_0_sum,nan_group_0_mean,nan_group_0_std,nan_group_1_sum,nan_group_1_mean,nan_group_1_std,nan_group_2_sum,nan_group_2_mean,nan_group_2_std,nan_group_3_sum,nan_group_3_mean,nan_group_3_std,nan_group_4_sum,nan_group_4_mean,nan_group_4_std,nan_group_5_sum,nan_group_5_mean,nan_group_5_std,nan_group_6_sum,nan_group_6_mean,nan_group_6_std,nan_group_7_sum,nan_group_7_mean,nan_group_7_s
td,nan_group_8_sum,nan_group_8_mean,nan_group_8_std,nan_group_9_sum,nan_group_9_mean,nan_group_9_std,nan_group_10_sum,nan_group_10_mean,nan_group_10_std,nan_group_11_sum,nan_group_11_mean,nan_group_11_std,nan_group_12_sum,nan_group_12_mean,nan_group_12_std,nan_group_13_sum,nan_group_13_mean,nan_group_13_std,nan_group_14_sum,nan_group_14_mean,nan_group_14_std,DeviceInfo_FE_FULL,DeviceInfo_device_FE_FULL,DeviceInfo_version_FE_FULL,id_30_FE_FULL,id_30_device_FE_FULL,id_30_version_FE_FULL,id_31_FE_FULL,id_31_device_FE_FULL,id_33_FE_FULL,id_01_FE_FULL,id_03_FE_FULL,id_04_FE_FULL,id_05_FE_FULL,id_06_FE_FULL,id_07_FE_FULL,id_08_FE_FULL,id_09_FE_FULL,id_10_FE_FULL,id_11_FE_FULL,id_13_FE_FULL,id_14_FE_FULL,id_17_FE_FULL,id_18_FE_FULL,id_19_FE_FULL,id_20_FE_FULL,id_21_FE_FULL,id_22_FE_FULL,id_24_FE_FULL,id_25_FE_FULL,id_26_FE_FULL
0,4.241327,4,13926.0,327.0,150.0,1,142.0,0,315.0,87.0,19.0,,48,49,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,2.0,0.0,1.0,1.0,1,1,1,2,0,1,2,2,2,,,,,,,,,,,,2,,,3,2,,,,,,,3,,,,2,2,2,,4,2,2,2,2,2,1,7.3e-05,0.010893,0.885029,0.000476,5.2e-05,0.110099,0.001429,2e-06,-14.682021,-16.0,0.0,-10.0,-20.0,-2.0,-10.0,-20,-2,14.0,,49.78125,98.8125,,,96.8125,120.5625,13.0,,23.21875,46.71875,,,86.1875,140.75,,,39.3125,80.25,,,0.0,0.0,,,,,,,271.25,inf,,,0.71875,0.171387,13.0,,67.375,123.5,13.0,,89.3125,149.75,,,,,,,,,,,6.667969,16.328125,0.0,,116.9375,178.25,0,0,,0.026694,-0.29547,,,,,,,,0.0,0.018705,0.026804,-0.866757,,,,,,,0.0,0.026694,,,,,,,0.0,0.018705,0.026639,,,,,,,0.0,0.021881,,0.005045,0.475492,0.010541,0.286047,0.524674,0.876068,0.934099,0.873123,0.873123,0.003204,0.001929,0.89041,0.895093,0.894695,0.295088,1,351.936035,371.153198,269.468292,404.310577,146.422134,243.120865,189.505783,331.647064,68.5,,189.505783,331.647064,0.020518,-0.307812,0.016626,-0.280889,-5.0,0.0,-7.820312,13.15625,-8.03125,12.726562,-8.421875,9.726562,,,-8.421875,9.726562,153111.0,96778.710938,112361.515625,122155.867188,116327.4375,121059.523438,168419.765625,99183.804688,,,168419.765625,99183.804688,4.332031,10.273438,2.316406,5.585938,2.523438,5.929688,5.875,9.867188,,,5.875,9.867188,-6.667969,12.9375,-5.222656,13.703125,-5.96875,15.507812,-15.6875,26.109375,,,-15.6875,26.109375,0.666504,1.154297,0.079834,0.849121,0.108398,1.078125,1.5,1.915039,,,1.5,1.915039,-375.0,81.4375,-335.75,67.8125,-347.25,85.375,-374.0,75.125,,,-374.0,75.125,,,1.0,0.0,0.0,,133.509964,606.992615,117.0,,187.089355,622.895142,0.0,,157.259933,624.966431,1.0,,0.238403,0.753418,0.002538,0.001411,0.042517,66,0.257124,0.536443,0.535545,0.995887,0.765203,0.630669,0.578372,0.885871,0.758064,0.387676,0.767843,0.151856,0.828393,0.33755,0.542197,1.186523,0.450195,9.25,75.875,9.789062,58.5625,6.800781,23.671875,1.0,,6.800781,23.671875,1.186523,0.393799,9.1875,82.375,9.445312,63.53125,6.578125,22.875,1.0,,6.578125,22.875,0.0,0.0,0.019119,0.
146851,0.006351,0.159912,0.0,0.0,0.0,,0.0,0.0,0.302246,0.637695,3.996094,54.03125,1.552734,35.21875,0.071167,0.308105,0.0,,0.071167,0.308105,0.093018,0.293945,2.386719,16.328125,6.265625,27.265625,6.089844,28.25,0.0,,6.089844,28.25,1.395508,1.049805,6.582031,56.5625,7.152344,41.0625,5.355469,18.6875,1.0,,5.355469,18.6875,0.0,0.0,0.0,0.0,0.17395,14.203125,0.0,0.0,0.0,,0.0,0.0,0.348877,0.650391,2.728516,31.078125,1.201172,28.171875,0.074707,0.324219,0.0,,0.074707,0.324219,0.744141,0.693359,2.382812,11.53125,5.042969,17.59375,4.519531,15.570312,1.0,,4.519531,15.570312,0.325684,0.565918,2.613281,27.078125,1.239258,27.609375,0.067627,0.291016,0.0,,0.067627,0.291016,1.302734,0.599121,6.992188,59.71875,7.21875,44.5,4.832031,16.40625,2.0,,4.832031,16.40625,0.0,0.0,0.068542,0.253174,0.314697,19.859375,0.067627,0.251465,0.0,,0.067627,0.251465,1.279297,0.766113,17.0,73.625,32.5,107.3125,26.515625,103.375,1.0,,26.515625,103.375,1.023438,0.407715,5.890625,41.21875,7.386719,32.875,5.738281,19.546875,1.0,,5.738281,19.546875,37.5,50.78125,136.75,inf,118.25,inf,87.3125,232.25,19.0,,87.3125,232.25,9.0,0.818359,0.404541,9.0,0.391357,0.499023,0.0,,,9.0,0.40918,0.50293,7.0,0.350098,0.489502,255.0,5.930233,24.826441,0.0,,,0.0,,,0.0,,,0.0,,,0.0,,,0.0,,,239.0,7.46875,28.735708,2.0,0.181763,0.404541,0.0,,,0.799055,0.799055,0.936223,0.868654,0.868654,0.874068,0.762451,0.762451,0.875895,0.755761,0.887689,0.887689,0.768238,0.768238,0.991271,0.991271,0.873123,0.873123,0.761273,0.784401,0.864456,0.763997,0.923607,0.764084,0.76418,0.991264,0.991247,0.991962,0.99131,0.991257
1,3.401197,4,2755.0,404.0,150.0,2,102.0,0,325.0,87.0,,,16,49,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,2,2,2,0,1,1,2,2,2,,,,,,,,,,,,2,,,3,2,,,,,,,3,,,,2,2,2,,4,2,2,2,2,2,1,0.001157,0.005219,0.885029,0.049409,0.001048,0.000103,0.01817,2e-06,-14.682021,-16.0,0.0,-14.068421,-18.0,0.0,-14.511905,-18,0,0.0,,62.46875,132.5,,,139.25,167.75,,,26.859375,65.6875,0.0,,119.9375,184.625,,,40.46875,90.125,,,213.125,123.9375,,,16.90625,72.9375,,,139.5,233.0,,,0.592773,0.291748,0.0,,122.75,176.0,,,118.3125,175.875,,,42.71875,102.8125,,,27.890625,45.4375,,,176.75,141.5,0.0,,143.0,195.5,0,0,,,,0.0,,,,,,,0.0,0.0,,,,,,,,,0.0,,0.0,,,,,,0.0,0.0,,,,,,,,0.0,0.0,,0.474362,0.475492,0.445148,0.282091,0.524674,0.876068,0.934099,0.873123,0.873123,0.375809,0.472935,0.89041,0.895093,0.894695,0.295088,1,232.698868,441.577118,227.853912,356.084045,146.422134,243.120865,212.483826,345.099121,29.0,,231.598236,359.79068,0.00835,-0.466953,0.006766,-0.468352,-7.539062,11.101562,-7.890625,13.65625,-8.03125,12.726562,-9.28125,12.203125,,,-8.046875,11.210938,153593.109375,189083.765625,121260.648438,120521.335938,116327.4375,121059.523438,137190.859375,146466.15625,,,99874.757812,117258.554688,2.791016,6.511719,2.591797,6.0625,2.523438,5.929688,1.811523,5.144531,,,2.328125,5.503906,-8.867188,24.765625,-6.382812,16.578125,-5.96875,15.507812,-5.617188,15.320312,,,-4.789062,14.273438,0.342773,0.872559,0.080688,1.503906,0.108398,1.078125,0.096741,0.95752,,,0.085938,0.967773,-352.25,67.4375,-333.75,70.625,-347.25,85.375,-330.5,90.75,,,-333.75,76.1875,,,1.316406,2.003906,0.0,,2104.952148,10804.65332,0.0,,4186.664062,19393.453125,0.0,,3033.937988,13800.174805,0.0,,34.34375,167.75,0.009762,0.018083,0.459993,71,0.024615,0.536443,0.535545,0.995887,0.765203,0.630669,0.578372,0.885871,0.758064,0.311232,0.767843,0.659891,0.828393,0.33755,0.542197,2.59375,7.933594,6.050781,63.0625,9.789062,58.5625,13.5625,156.375,1.0,,7.589844,65.1875,2.244141,7.0,6.003906,69.0625,9.445312,63.53125,14.921875,179.25,1.0,,7.
355469,73.25,0.008781,0.10791,0.012657,0.117432,0.006351,0.159912,0.006237,0.085449,0.0,,0.006924,0.090027,0.08197,0.300049,2.533203,45.8125,1.552734,35.21875,5.613281,82.0,0.0,,2.576172,44.53125,1.043945,8.648438,1.347656,10.695312,6.265625,27.265625,2.482422,16.4375,0.0,,2.785156,17.390625,1.84082,5.753906,4.414062,47.34375,7.152344,41.0625,7.964844,83.0625,1.0,,5.214844,46.9375,0.001464,0.038269,0.00292,0.059662,0.17395,14.203125,3.574219,72.625,0.0,,0.287842,18.4375,0.117126,0.406494,1.708984,24.53125,1.201172,28.171875,7.058594,112.8125,0.0,,2.228516,35.59375,1.818359,5.109375,1.725586,7.539062,5.042969,17.59375,2.238281,10.546875,0.0,,2.509766,11.140625,0.127319,0.471924,1.557617,22.296875,1.239258,27.609375,6.671875,111.9375,0.0,,1.879883,36.0,1.762695,5.300781,4.53125,49.8125,7.21875,44.5,9.84375,110.3125,1.0,,5.734375,51.6875,0.080505,0.272217,0.062927,0.245605,0.314697,19.859375,5.125,101.875,0.0,,0.450684,24.875,10.820312,36.6875,13.242188,56.09375,32.5,107.3125,22.890625,120.4375,1.0,,20.734375,79.8125,2.212891,6.421875,3.966797,34.09375,7.386719,32.875,6.78125,55.8125,1.0,,5.25,34.46875,120.8125,inf,98.4375,inf,118.25,inf,122.8125,inf,,,122.625,inf,0.0,,,7.0,0.304443,0.470459,7.0,0.388916,0.501465,7.0,0.318115,0.476807,7.0,0.350098,0.489502,19.0,0.44186,0.502486,0.0,,,0.0,,,0.0,,,0.0,,,0.0,,,0.0,,,4.0,0.125,0.336011,2.0,0.181763,0.404541,0.0,,,0.799055,0.799055,0.936223,0.868654,0.868654,0.874068,0.762451,0.762451,0.875895,0.755761,0.887689,0.887689,0.768238,0.768238,0.991271,0.991271,0.873123,0.873123,0.761273,0.784401,0.864456,0.763997,0.923607,0.764084,0.76418,0.991264,0.991247,0.991962,0.99131,0.991257
2,4.094345,4,4663.0,490.0,150.0,4,166.0,1,330.0,87.0,287.0,,35,49,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,1,1,1,0,0,0,0,0,0,,,,,,,,,,,,2,,,3,2,,,,,,,3,,,,2,2,2,,4,2,2,2,2,2,1,0.001876,0.064726,0.885029,0.097098,0.001859,5.8e-05,0.051793,7e-06,-14.682021,-16.0,0.0,-15.987198,-21.0,0.0,-16.007143,-21,0,97.75,71.25,78.1875,134.0,130.375,35.3125,130.875,150.875,51.65625,42.65625,25.96875,55.34375,98.25,71.75,119.125,169.625,56.34375,36.84375,43.5625,87.9375,,,176.125,185.5,,,23.734375,75.25,,,116.1875,210.125,,,0.561523,0.304199,98.25,71.75,104.875,160.875,413.25,71.75,108.125,158.875,,,0.399902,1.549805,,,46.4375,107.75,,,60.53125,139.0,413.25,71.75,140.625,182.375,0,0,,,,0.0,,,,,,,0.0,0.0,0.649485,0.97366,,,,,,,0.453237,,0.0,,,,,,0.0,0.0,0.645492,,,,,,,0.453237,0.0,,0.474362,0.475492,0.445148,0.282091,0.524674,0.876068,0.934099,0.873123,0.873123,0.375809,0.00042,0.89041,0.895093,0.894695,0.000615,1,97.015373,100.131348,133.095764,208.109558,146.422134,243.120865,97.90815,135.407013,73.5,17.058722,98.448311,135.899841,0.017591,-0.346086,0.014254,-0.325975,-6.5,7.472656,-7.957031,12.359375,-8.03125,12.726562,-9.875,13.265625,,,-7.058594,12.132812,104099.445312,48882.363281,127363.296875,138514.234375,116327.4375,121059.523438,134787.03125,144849.515625,,,94994.734375,116771.5625,2.5,5.097656,3.259766,6.726562,2.523438,5.929688,2.802734,6.128906,,,3.435547,6.667969,-6.5,11.835938,-6.996094,15.945312,-5.96875,15.507812,-7.148438,17.625,,,-7.265625,17.296875,0.0,0.0,0.118591,0.88916,0.108398,1.078125,0.127686,0.924805,,,0.173828,0.952148,-340.0,67.0625,-362.0,92.625,-347.25,85.375,-336.0,88.75,,,-339.0,74.125,,,1.185547,0.772461,0.0,0.0,32.063747,140.180511,24.5,49.0,132.171158,353.848572,0.0,0.0,61.715012,226.757095,0.0,0.0,0.297363,1.296875,0.035923,0.051785,0.459993,66,0.257124,0.536443,0.535545,0.995887,0.765203,0.630669,0.578372,0.885871,0.758064,0.387676,0.767843,0.659891,0.828393,0.33755,0.542197,10.0625,33.0625,9.359375,59.125,9.789062,58.5625,9.406
25,53.34375,1.0,0.0,8.945312,32.03125,8.945312,29.140625,9.21875,65.3125,9.445312,63.53125,8.835938,59.25,1.5,0.577148,8.234375,30.171875,0.0,0.0,0.000889,0.030685,0.006351,0.159912,0.000698,0.028305,0.0,0.0,0.00067,0.027847,0.010834,0.180054,0.898438,28.703125,1.552734,35.21875,0.447021,23.703125,0.0,0.0,0.153931,10.5,8.382812,31.828125,6.382812,27.0625,6.265625,27.265625,7.226562,29.390625,0.0,0.0,7.308594,29.53125,7.53125,24.265625,6.746094,35.4375,7.152344,41.0625,6.742188,31.640625,1.5,0.577148,6.519531,23.625,0.0,0.0,0.501465,23.859375,0.17395,14.203125,0.274658,21.03125,0.0,0.0,0.0003,0.018295,0.018951,0.273193,1.03125,37.6875,1.201172,28.171875,0.544922,32.15625,0.0,0.0,0.121765,5.664062,6.449219,20.296875,5.230469,18.109375,5.042969,17.59375,5.769531,18.5625,1.0,0.0,5.835938,18.65625,0.010834,0.127075,1.108398,38.09375,1.239258,27.609375,0.597168,32.125,0.0,0.0,0.16626,5.425781,7.328125,24.1875,6.898438,42.6875,7.21875,44.5,6.800781,37.90625,1.0,0.0,6.46875,23.625,0.055054,0.228149,0.774902,33.28125,0.314697,19.859375,0.455811,29.625,0.0,0.0,0.073547,0.275635,33.65625,110.6875,32.46875,106.8125,32.5,107.3125,33.0,112.625,3.0,1.826172,32.96875,109.75,7.957031,25.390625,7.0625,29.53125,7.386719,32.875,7.5,27.84375,1.0,0.0,7.390625,24.484375,78.9375,219.125,126.125,inf,118.25,inf,117.8125,inf,287.0,0.0,117.8125,inf,9.0,0.818359,0.404541,9.0,0.391357,0.499023,9.0,0.5,0.514648,9.0,0.40918,0.50293,9.0,0.449951,0.510254,19.0,0.44186,0.502486,0.0,,,0.0,,,0.0,,,0.0,,,0.0,,,0.0,,,4.0,0.125,0.336011,2.0,0.181763,0.404541,0.0,,,0.799055,0.799055,0.936223,0.868654,0.868654,0.874068,0.762451,0.762451,0.875895,0.755761,0.887689,0.887689,0.768238,0.768238,0.991271,0.991271,0.873123,0.873123,0.761273,0.784401,0.864456,0.763997,0.923607,0.764084,0.76418,0.991264,0.991247,0.991962,0.99131,0.991257
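
Several `_std` columns in the preview above contain `inf`. A check along the following lines can list the affected columns before the data reaches a model. This is a sketch on a small frame in the style of the preview. The helper name `columns_with_inf` is hypothetical, and replacing `inf` with `NaN` is one possible treatment, not something the notebook prescribes. One plausible cause, stated here as an assumption, is overflow when statistics are computed on reduced-precision columns.

```python
import numpy as np
import pandas as pd


def columns_with_inf(df):
    """Return {column: count} for numeric columns that contain +/-inf."""
    numeric = df.select_dtypes(include=[np.number])
    counts = np.isinf(numeric).sum()
    return counts[counts > 0].to_dict()


# Small illustrative frame; column names follow the preview's naming style.
toy = pd.DataFrame({
    'card1_dist1_mean': [37.5, 120.8125, 78.9375],
    'card1_dist1_std': [50.78125, np.inf, 219.125],
    'ProductCD': ['W', 'W', 'C'],
})

report = columns_with_inf(toy)

# Optional treatment: mark infinities as missing values.
cleaned = toy.replace([np.inf, -np.inf], np.nan)
```

Running `columns_with_inf(X_train)` on the engineered frame would show how widespread the issue is, which helps decide whether to clip, recompute in higher precision, or treat the values as missing.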


### Pickle the LGBM features
We pickle the LGBM feature sets, the labels, and the list of categorical columns so they can be loaded later in the modeling step.

In [16]:
# Persist the LGBM feature matrices and labels for the modeling step.
X_train.to_pickle('../data/X_train_fe_engineered.pkl')
y_train.to_pickle('../data/y_train_fe_engineered.pkl')

X_val.to_pickle('../data/X_val_fe_engineered.pkl')
y_val.to_pickle('../data/y_val_fe_engineered.pkl')

# Persist the categorical column list alongside the data.
with open('../data/cat_cols_fe_engineered.pkl', 'wb') as f:
    pickle.dump(category_cols1, f)
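
For the later modeling step, the saved artifacts can be read back with `pd.read_pickle` and `pickle.load`. The sketch below round-trips toy data through a temporary directory rather than `../data/`, so the file names, the `target` label name, and the values are placeholders.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-ins for the engineered features, labels, and categorical columns.
X_train = pd.DataFrame({'TransactionAmt': [4.24, 3.40], 'card4': [1, 2]})
y_train = pd.Series([0, 1], name='target')
category_cols = ['card4']

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)

    # Save, mirroring the cell above.
    X_train.to_pickle(data_dir / 'X_train.pkl')
    y_train.to_pickle(data_dir / 'y_train.pkl')
    with open(data_dir / 'cat_cols.pkl', 'wb') as f:
        pickle.dump(category_cols, f)

    # Load, as a modeling notebook might.
    X_loaded = pd.read_pickle(data_dir / 'X_train.pkl')
    y_loaded = pd.read_pickle(data_dir / 'y_train.pkl')
    with open(data_dir / 'cat_cols.pkl', 'rb') as f:
        cat_loaded = pickle.load(f)

# Features and labels should still line up row for row.
rows_match = len(X_loaded) == len(y_loaded)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.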

## Next Step - Modeling