## Tabular Playground Series March 2021

<img src="https://i.imgur.com/uHVJtv0.png">



<br><br>

### Notebook Contents:

0. [**Imports, Data Loading and Preprocessing**](#loading)

1. [**Optuna Hyperparameter Optimization**](#optuna)

2. [**Submission**](#submission)

In [3]:
pip install optuna

Collecting optuna
  Downloading optuna-2.6.0-py3-none-any.whl (293 kB)
[K     |████████████████████████████████| 293 kB 4.5 MB/s eta 0:00:01
Collecting cliff
  Downloading cliff-3.7.0-py3-none-any.whl (80 kB)
[K     |████████████████████████████████| 80 kB 10.7 MB/s eta 0:00:01
[?25hCollecting alembic
  Downloading alembic-1.5.7-py2.py3-none-any.whl (159 kB)
[K     |████████████████████████████████| 159 kB 15.2 MB/s eta 0:00:01
[?25hCollecting sqlalchemy>=1.1.0
  Downloading SQLAlchemy-1.4.2-cp38-cp38-macosx_10_14_x86_64.whl (1.4 MB)
[K     |████████████████████████████████| 1.4 MB 24.2 MB/s eta 0:00:01
[?25hCollecting colorlog
  Downloading colorlog-4.8.0-py2.py3-none-any.whl (10 kB)
Collecting cmaes>=0.8.2
  Downloading cmaes-0.8.2-py3-none-any.whl (15 kB)
Collecting greenlet!=0.4.17
  Downloading greenlet-1.0.0-cp38-cp38-macosx_10_14_x86_64.whl (86 kB)
[K     |████████████████████████████████| 86 kB 9.6 MB/s  eta 0:00:01
Collecting Mako
  Downloading Mako-1.1.4-py2.py3-none-

<a id="loading"></a>

##### 0. Imports, Data Loading and Preprocessing

In [9]:
import numpy as np
import pandas as pd
pd.options.display.max_columns = 100
from lightgbm import LGBMClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import QuantileTransformer, StandardScaler
from sklearn.feature_selection import VarianceThreshold, SelectKBest
import warnings
warnings.filterwarnings('ignore')
import optuna
import os
root_path = '../../input'

In [10]:
train = pd.read_csv(os.path.join(root_path, 'train.csv'))
test = pd.read_csv(os.path.join(root_path, 'test.csv'))
sample_submission = pd.read_csv(os.path.join(root_path, 'sample_submission.csv'))

categorical_cols = [i for i in train.columns if 'cat' in i]

dataset = pd.concat([train, test], axis = 0, ignore_index = True)
train_len = len(train)

In [11]:
float_cols = list(set(dataset.select_dtypes(['float']).columns.tolist()) - set(['target']))

In [12]:
for col in float_cols:
    transformer = QuantileTransformer(n_quantiles=100, 
                                      random_state=0, output_distribution="normal")   # from optimal commit 9
    data_len = len(dataset)
    raw_vec = dataset[col].values.reshape(data_len, 1)
    transformer.fit(raw_vec)
    dataset[col+"_qt"] = transformer.transform(raw_vec)

In [13]:
dataset = pd.get_dummies(dataset, columns = categorical_cols)
train_preprocessed = dataset.iloc[:train_len, ]
test_preprocessed = dataset.iloc[train_len:, ]

test_cols_always_0 = (test_preprocessed.drop('target',1).sum(axis = 0)
                      .rename("n_non_null").to_frame().query("n_non_null == 0").index.tolist())

features = list(set(train_preprocessed.drop(['id', 'target'], 1).columns.tolist()) - set(test_cols_always_0))

assert train_preprocessed.shape[1] == test_preprocessed.shape[1]

**Disclaimer:** 

I did not inspect whether some of the categorical columns in train have values not present in test or viceversa. A simple solution would be to drop them in train if you knew it beforehand.

<a id="optuna"></a>

### Optuna

Look [here](https://optuna.readthedocs.io/en/stable/tutorial/) for reference about Optuna library. 

Look [here](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMClassifier.html) for a set of Lightgbm Classifier hyperparameters.


Skip and go [here](#hyperparams) to find my best parameters.

In [14]:
#Set to False if you want to skip it

OPTUNA_OPTIMIZATION = True

In [15]:
N_SPLITS = 10 #Number of folds for validation
N_TRIALS = 50 #Number of trials to find best hyperparameters

In [16]:
def objective(trial, cv=StratifiedKFold(N_SPLITS, shuffle = True, random_state = 29)):
    
    
    param_lgb = {
        "random_state": trial.suggest_int("random_state", 1, 100),
        "objective": "binary",
        "metric": "binary_logloss",
        "verbosity": -1,
        "boosting_type": "gbdt",
        "reg_alpha": trial.suggest_float("reg_alpha", 1e-8, 10.0, log=True),
        "reg_lambda": trial.suggest_float("reg_lambda", 1e-8, 10.0, log=True),
        "max_depth": trial.suggest_int("max_depth", -1, 10),
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.4, 1.0),
        "subsample": trial.suggest_float("subsample", 0.4, 1.0),
        "subsample_freq": trial.suggest_int("subsample_freq", 1, 7),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
    }
    
    model = LGBMClassifier(**param_lgb)
    
    val_aucs = []
    aucs = []
    
    for kfold, (train_idx, val_idx) in enumerate(cv.split(train_preprocessed[features].values, train['target'].values)):
        
        model.fit(train_preprocessed.loc[train_idx, features], train_preprocessed.loc[train_idx, 'target'])
        print('Fitted {}'.format(type(model).__name__))
        val_true = train.loc[val_idx, 'target'].values
        
        preds = model.predict(train_preprocessed.loc[val_idx, features])
        
        auc = roc_auc_score(val_true, preds)
        
        print('Fold: {}\t AUC: {}\n'.format(kfold, auc))
        aucs.append(auc)
    
    print('Average AUC: {}'.format(np.average(auc)))
    return np.average(aucs)

In [None]:
if OPTUNA_OPTIMIZATION:
    study = optuna.create_study(study_name = 'lgbm_parameter_opt', direction="maximize")
    study.optimize(objective, n_trials=N_TRIALS) 
    
    trial = study.best_trial
    
    print("  Value: {}".format(trial.value))
    
    print("  Params: ")
    for key, value in trial.params.items():
        print("    {}: {}".format(key, value))
else:
    trial = {"reg_alpha": 0.362136938773081,
             "reg_lambda": 2.930297242488071,
             "max_depth": 10,
             "n_estimators": 306,
             "num_leaves": 71,
             "colsample_bytree": 0.7121396258381646,
             "subsample": 0.793959734582999,
             "subsample_freq": 2,
             "min_child_samples": 18}

[32m[I 2021-03-22 14:02:51,857][0m A new study created in memory with name: lgbm_parameter_opt[0m


Fitted LGBMClassifier
Fold: 0	 AUC: 0.77603247991778

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7743390699409278

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7748151740490261

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7717161233720741

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7771396746210494

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7770258485732717

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7755892276950306

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7743321366352652

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7768273620128942

Fitted LGBMClassifier


[32m[I 2021-03-22 14:10:54,984][0m Trial 0 finished with value: 0.7748517129890805 and parameters: {'random_state': 54, 'reg_alpha': 3.103717893483926e-07, 'reg_lambda': 0.001309168497610965, 'max_depth': 4, 'n_estimators': 444, 'num_leaves': 204, 'colsample_bytree': 0.43956080673594555, 'subsample': 0.9894353835585619, 'subsample_freq': 7, 'min_child_samples': 46}. Best is trial 0 with value: 0.7748517129890805.[0m


Fold: 9	 AUC: 0.7707000330734854

Average AUC: 0.7707000330734854
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7662981016483555

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7657248456646158

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7642956260188394

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7589821859353484

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7639102084075216

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7616060398256839

Fitted LGBMClassifier
Fold: 6	 AUC: 0.761184083978846

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7618424969341093

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7640527035315531

Fitted LGBMClassifier


[32m[I 2021-03-22 14:34:05,328][0m Trial 1 finished with value: 0.7627437947974908 and parameters: {'random_state': 43, 'reg_alpha': 2.1716166620751105, 'reg_lambda': 1.697400524470599e-05, 'max_depth': 1, 'n_estimators': 278, 'num_leaves': 113, 'colsample_bytree': 0.9712223303913177, 'subsample': 0.7443715832551345, 'subsample_freq': 3, 'min_child_samples': 100}. Best is trial 0 with value: 0.7748517129890805.[0m


Fold: 9	 AUC: 0.7595416560300344

Average AUC: 0.7595416560300344
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7793609973332508

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7780654963307576

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7771068227357004

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7743038784215692

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7771683436973033

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7792777120689347

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7755480387236134

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7761481491406433

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7783116372414131

Fitted LGBMClassifier


[32m[I 2021-03-22 15:48:31,484][0m Trial 2 finished with value: 0.776966013275581 and parameters: {'random_state': 80, 'reg_alpha': 1.4874281280228713e-05, 'reg_lambda': 3.7474775963656076e-05, 'max_depth': 0, 'n_estimators': 401, 'num_leaves': 176, 'colsample_bytree': 0.9233645688008493, 'subsample': 0.8318454482716555, 'subsample_freq': 1, 'min_child_samples': 95}. Best is trial 2 with value: 0.776966013275581.[0m


Fold: 9	 AUC: 0.7743690570626237

Average AUC: 0.7743690570626237
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7716629109644175

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7717540653880001

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7698996429398942

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7659626780213252

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7698209798793529

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7699833675988901

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7712802723817892

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7663411623269804

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7719345025279574

Fitted LGBMClassifier


[32m[I 2021-03-22 15:54:13,777][0m Trial 3 finished with value: 0.7693936702386086 and parameters: {'random_state': 26, 'reg_alpha': 0.02304443766443416, 'reg_lambda': 0.00023986021923115463, 'max_depth': 2, 'n_estimators': 491, 'num_leaves': 180, 'colsample_bytree': 0.6299648299104261, 'subsample': 0.6590852202024423, 'subsample_freq': 4, 'min_child_samples': 68}. Best is trial 2 with value: 0.776966013275581.[0m


Fold: 9	 AUC: 0.7652971203574792

Average AUC: 0.7652971203574792
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7764928457073457

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7763878246724382

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7757613562810419

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7703063283950012

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7767829546181076

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7772627450763772

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7748221073546885

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7732725962822736

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7771725265063985

Fitted LGBMClassifier


[32m[I 2021-03-22 15:57:07,634][0m Trial 4 finished with value: 0.7749222622638012 and parameters: {'random_state': 87, 'reg_alpha': 4.8538412341497015, 'reg_lambda': 0.4402351886068279, 'max_depth': 6, 'n_estimators': 161, 'num_leaves': 44, 'colsample_bytree': 0.47961199131211063, 'subsample': 0.5244731757610268, 'subsample_freq': 7, 'min_child_samples': 60}. Best is trial 2 with value: 0.776966013275581.[0m


Fold: 9	 AUC: 0.7709613377443411

Average AUC: 0.7709613377443411
Fitted LGBMClassifier
Fold: 0	 AUC: 0.775357889249304

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7753107028258283

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7728316551614117

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7701175541689755

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7752195484022457

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7759533489304369

Fitted LGBMClassifier
Fold: 6	 AUC: 0.774338134087324

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7709739857578147

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7758858019846534

Fitted LGBMClassifier


[32m[I 2021-03-22 16:02:56,037][0m Trial 5 finished with value: 0.7736801452012172 and parameters: {'random_state': 90, 'reg_alpha': 1.0119255910300629, 'reg_lambda': 0.002789337135652381, 'max_depth': 5, 'n_estimators': 405, 'num_leaves': 6, 'colsample_bytree': 0.9063006803103841, 'subsample': 0.5407238082912899, 'subsample_freq': 3, 'min_child_samples': 33}. Best is trial 2 with value: 0.776966013275581.[0m


Fold: 9	 AUC: 0.7708128314441767

Average AUC: 0.7708128314441767
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7684010046411264

Fitted LGBMClassifier
Fold: 1	 AUC: 0.769783037863427

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7657993544481841

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7607824886543157

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7672387543551146

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7668325082946872

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7652876194260474

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7666456057758693

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7688761728956208

Fitted LGBMClassifier


[32m[I 2021-03-22 16:05:49,882][0m Trial 6 finished with value: 0.7661998481481952 and parameters: {'random_state': 55, 'reg_alpha': 1.8501815780142468e-05, 'reg_lambda': 3.376357340406781e-07, 'max_depth': 3, 'n_estimators': 144, 'num_leaves': 163, 'colsample_bytree': 0.5479853415897755, 'subsample': 0.9833669536886586, 'subsample_freq': 7, 'min_child_samples': 50}. Best is trial 2 with value: 0.776966013275581.[0m


Fold: 9	 AUC: 0.7623519351275592

Average AUC: 0.7623519351275592
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7789006315436853

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7762929553665623

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7766404309619541

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7701934096687053

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7745981416093044

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7766394951083504

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7749659777270039

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7730819503490403

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7770471164170613

Fitted LGBMClassifier


[32m[I 2021-03-22 16:14:20,750][0m Trial 7 finished with value: 0.7750611536670855 and parameters: {'random_state': 96, 'reg_alpha': 0.003483046331187322, 'reg_lambda': 4.4640302968341976e-05, 'max_depth': -1, 'n_estimators': 254, 'num_leaves': 254, 'colsample_bytree': 0.4400444663403229, 'subsample': 0.49806889471870425, 'subsample_freq': 4, 'min_child_samples': 25}. Best is trial 2 with value: 0.776966013275581.[0m


Fold: 9	 AUC: 0.7722514279191878

Average AUC: 0.7722514279191878
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7784888559580013

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7790482453304158

Fitted LGBMClassifier
Fold: 2	 AUC: 0.779267092413101

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7756748240612343

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7786568017349175

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7790944959002879

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7788562241488989

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7745449577337699

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7776680552832006

Fitted LGBMClassifier


[32m[I 2021-03-22 16:18:06,913][0m Trial 8 finished with value: 0.7776772200332646 and parameters: {'random_state': 3, 'reg_alpha': 0.030970364241333752, 'reg_lambda': 0.007657665642184858, 'max_depth': -1, 'n_estimators': 82, 'num_leaves': 202, 'colsample_bytree': 0.9114658574454664, 'subsample': 0.5970475960826434, 'subsample_freq': 2, 'min_child_samples': 66}. Best is trial 8 with value: 0.7776772200332646.[0m


Fold: 9	 AUC: 0.7754726477688184

Average AUC: 0.7754726477688184
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7688446962585554

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7665678500368098

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7670814282340321

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7613201137240169

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7645593484231129

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7630764484428776

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7622584267967666

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7634456883409828

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7661056125400366

Fitted LGBMClassifier


[32m[I 2021-03-22 16:23:00,750][0m Trial 9 finished with value: 0.7644110051743676 and parameters: {'random_state': 56, 'reg_alpha': 7.463966789086342e-08, 'reg_lambda': 0.00022104010818645902, 'max_depth': 1, 'n_estimators': 366, 'num_leaves': 229, 'colsample_bytree': 0.42535866126584376, 'subsample': 0.6142795525247925, 'subsample_freq': 4, 'min_child_samples': 70}. Best is trial 8 with value: 0.7776772200332646.[0m


Fold: 9	 AUC: 0.7608504389464846

Average AUC: 0.7608504389464846
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7772535006688271

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7770527030065622

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7741146648008637

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7707014583406708

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7762661294653941

Fitted LGBMClassifier
Fold: 5	 AUC: 0.777202599363058

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7751940977493611

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7708037288790109

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7768204287072318

Fitted LGBMClassifier


[32m[I 2021-03-22 16:26:12,227][0m Trial 10 finished with value: 0.7744885585723679 and parameters: {'random_state': 3, 'reg_alpha': 0.05371719088181714, 'reg_lambda': 1.6025217603694422, 'max_depth': 10, 'n_estimators': 64, 'num_leaves': 110, 'colsample_bytree': 0.7890269737124365, 'subsample': 0.404399868268499, 'subsample_freq': 1, 'min_child_samples': 6}. Best is trial 8 with value: 0.7776772200332646.[0m


Fold: 9	 AUC: 0.769476274742698

Average AUC: 0.769476274742698
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7800388349572182

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7782593606873602

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7798533076866838

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7764646160257715

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7796483272153235

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7795502394860786

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7752607088415409

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7752116792429793

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7787174153750385

Fitted LGBMClassifier


[32m[I 2021-03-22 16:30:47,947][0m Trial 11 finished with value: 0.7779609075227527 and parameters: {'random_state': 72, 'reg_alpha': 0.00011030701399700635, 'reg_lambda': 0.0388996595539981, 'max_depth': -1, 'n_estimators': 328, 'num_leaves': 152, 'colsample_bytree': 0.8232174503974193, 'subsample': 0.7973354562081402, 'subsample_freq': 1, 'min_child_samples': 95}. Best is trial 11 with value: 0.7779609075227527.[0m


Fold: 9	 AUC: 0.7766045857095325

Average AUC: 0.7766045857095325
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7813093532336287

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7791231935086638

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7793096566328019

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7754323695007502

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7787678487540055

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7796506383172112

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7786470894005655

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7773192044395253

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7774640106654442

Fitted LGBMClassifier


[32m[I 2021-03-22 16:40:13,713][0m Trial 12 finished with value: 0.7784043427156473 and parameters: {'random_state': 72, 'reg_alpha': 0.0002072908529811167, 'reg_lambda': 0.04449291635757082, 'max_depth': -1, 'n_estimators': 295, 'num_leaves': 138, 'colsample_bytree': 0.8014957354718788, 'subsample': 0.867248658762283, 'subsample_freq': 2, 'min_child_samples': 84}. Best is trial 12 with value: 0.7784043427156473.[0m


Fold: 9	 AUC: 0.7770200627038774

Average AUC: 0.7770200627038774
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7817540092367837

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7790880305214273

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7805787996609287

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7749854309278297

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7794604888429016

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7798421915719261

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7789871637634929

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7767306780640549

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7784004805631081

Fitted LGBMClassifier


[32m[I 2021-03-22 16:54:31,144][0m Trial 13 finished with value: 0.7784814157849066 and parameters: {'random_state': 72, 'reg_alpha': 9.893609479693223e-05, 'reg_lambda': 0.04843690799503364, 'max_depth': 8, 'n_estimators': 304, 'num_leaves': 69, 'colsample_bytree': 0.7790805720861474, 'subsample': 0.8679924052832362, 'subsample_freq': 2, 'min_child_samples': 88}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.7749868846966136

Average AUC: 0.7749868846966136
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7794281048843547

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7777874393883571

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7787919241586065

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7746356442305504

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7779757456875809

Fitted LGBMClassifier
Fold: 5	 AUC: 0.781153898819754

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7796224371677591

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7758700921982428

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7775694996271536

Fitted LGBMClassifier


[32m[I 2021-03-22 17:02:02,654][0m Trial 14 finished with value: 0.7776669328387131 and parameters: {'random_state': 69, 'reg_alpha': 3.466245842018993e-06, 'reg_lambda': 0.1969220510129864, 'max_depth': 8, 'n_estimators': 255, 'num_leaves': 76, 'colsample_bytree': 0.7202237643673648, 'subsample': 0.8966109150109464, 'subsample_freq': 2, 'min_child_samples': 83}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.7738345422247721

Average AUC: 0.7738345422247721
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7817789919628663

Fitted LGBMClassifier
Fold: 1	 AUC: 0.778597591875202

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7791088589705368

Fitted LGBMClassifier
Fold: 3	 AUC: 0.775536483214176

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7792694035149886

Fitted LGBMClassifier
Fold: 5	 AUC: 0.780690286074697

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7780784556206009

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7773122711338626

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7784282423178801

Fitted LGBMClassifier


[32m[I 2021-03-22 17:07:51,019][0m Trial 15 finished with value: 0.7783372557277624 and parameters: {'random_state': 71, 'reg_alpha': 0.0012804202823178098, 'reg_lambda': 5.1895854827648735, 'max_depth': 10, 'n_estimators': 307, 'num_leaves': 74, 'colsample_bytree': 0.7965732992275475, 'subsample': 0.9101595952018134, 'subsample_freq': 5, 'min_child_samples': 84}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.774571972592812

Average AUC: 0.774571972592812
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7660774113905846

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7654106898813751

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7658872619162753

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7589322204831831

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7626142394782264

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7615944843162463

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7603008264888386

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7618735056443727

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7649294956427

Fitted LGBMClassifier


[32m[I 2021-03-22 17:13:12,191][0m Trial 16 finished with value: 0.7627688928296839 and parameters: {'random_state': 36, 'reg_alpha': 0.00020569077258377502, 'reg_lambda': 0.03639103649341312, 'max_depth': 8, 'n_estimators': 203, 'num_leaves': 3, 'colsample_bytree': 0.6779181996454812, 'subsample': 0.8976309474116398, 'subsample_freq': 2, 'min_child_samples': 83}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.760068793055037

Average AUC: 0.760068793055037
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7810215839568762

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7773164254108357

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7765969594207714

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7738527855716756

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7784758966681581

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7785767634260925

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7771600351433572

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7733531025179005

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7772201808566762

Fitted LGBMClassifier


[32m[I 2021-03-22 17:17:46,100][0m Trial 17 finished with value: 0.7767464387657715 and parameters: {'random_state': 66, 'reg_alpha': 7.869830896191237e-07, 'reg_lambda': 0.07247527402713776, 'max_depth': 7, 'n_estimators': 218, 'num_leaves': 80, 'colsample_bytree': 0.728026982430669, 'subsample': 0.7604393266842182, 'subsample_freq': 3, 'min_child_samples': 80}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.7738906546853725

Average AUC: 0.7738906546853725
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7804349007564917

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7776745491941832

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7797182137951167

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7772983759904155

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7790630477953445

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7811710123865703

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7789973440246467

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7751552484119535

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7793114998078875

Fitted LGBMClassifier


[32m[I 2021-03-22 17:23:23,118][0m Trial 18 finished with value: 0.7783376325114837 and parameters: {'random_state': 98, 'reg_alpha': 6.678525074977078e-05, 'reg_lambda': 5.198624778657669, 'max_depth': 8, 'n_estimators': 350, 'num_leaves': 42, 'colsample_bytree': 0.847159597813053, 'subsample': 0.8450860390640311, 'subsample_freq': 5, 'min_child_samples': 98}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.7745521329522274

Average AUC: 0.7745521329522274
Fitted LGBMClassifier
Fold: 0	 AUC: 0.778335712646014

Fitted LGBMClassifier
Fold: 1	 AUC: 0.778696147531249

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7759552206376445

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7739480942722313

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7773538994999597

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7778313788563417

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7751940977493611

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7747096850873166

Fitted LGBMClassifier
Fold: 8	 AUC: 0.778101127244796

Fitted LGBMClassifier


[32m[I 2021-03-22 17:31:41,074][0m Trial 19 finished with value: 0.7762080900065912 and parameters: {'random_state': 83, 'reg_alpha': 1.8738786216365535e-08, 'reg_lambda': 1.8037876745260654e-08, 'max_depth': 5, 'n_estimators': 304, 'num_leaves': 125, 'colsample_bytree': 0.6106164669710735, 'subsample': 0.9583267412111518, 'subsample_freq': 2, 'min_child_samples': 88}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.7719555365409976

Average AUC: 0.7719555365409976
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7806356984187566

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7796423297632649

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7765414359112274

Fitted LGBMClassifier
Fold: 3	 AUC: 0.775100632223891

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7783199743274812

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7803664179571043

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7762184751151162

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7752079928928082

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7785943449197107

Fitted LGBMClassifier


[32m[I 2021-03-22 17:34:13,646][0m Trial 20 finished with value: 0.7774434634859647 and parameters: {'random_state': 62, 'reg_alpha': 0.001465149944566875, 'reg_lambda': 0.6802622704681884, 'max_depth': 9, 'n_estimators': 214, 'num_leaves': 35, 'colsample_bytree': 0.7561729533866004, 'subsample': 0.7032958752183506, 'subsample_freq': 1, 'min_child_samples': 74}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.7738073333302865

Average AUC: 0.7738073333302865
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7814370744248536

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7785776992796963

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7780243358914625

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7751760483289408

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7786850314164914

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7813454520744691

Fitted LGBMClassifier
Fold: 6	 AUC: 0.778237156989967

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7761217911662771

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7809480110269119

Fitted LGBMClassifier


[32m[I 2021-03-22 17:39:32,597][0m Trial 21 finished with value: 0.7782634273877438 and parameters: {'random_state': 96, 'reg_alpha': 5.122639355317028e-05, 'reg_lambda': 9.145792440333492, 'max_depth': 7, 'n_estimators': 348, 'num_leaves': 36, 'colsample_bytree': 0.8433809047398819, 'subsample': 0.8409386798280204, 'subsample_freq': 6, 'min_child_samples': 100}. Best is trial 13 with value: 0.7784814157849066.[0m


Fold: 9	 AUC: 0.774081673278368

Average AUC: 0.774081673278368
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7820135773640843

Fitted LGBMClassifier
Fold: 1	 AUC: 0.779295322094675

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7799444621102664

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7749900531316046

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7787761858400739

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7802072486609362

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7778998616557292

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7769721967709353

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7785013473210426

Fitted LGBMClassifier


[32m[I 2021-03-22 17:54:20,666][0m Trial 22 finished with value: 0.7786212639623306 and parameters: {'random_state': 100, 'reg_alpha': 5.811138644985759e-06, 'reg_lambda': 9.66053736844112, 'max_depth': 9, 'n_estimators': 401, 'num_leaves': 98, 'colsample_bytree': 0.8608126069266873, 'subsample': 0.8524756463121624, 'subsample_freq': 5, 'min_child_samples': 93}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7776123846739581

Average AUC: 0.7776123846739581
Fitted LGBMClassifier
Fold: 0	 AUC: 0.780011541129248

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7796066988492264

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7792795837761424

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7759255871756648

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7776981281398601

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7798139618903522

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7778262887257648

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7780238679646606

Fitted LGBMClassifier
Fold: 8	 AUC: 0.779019079795238

Fitted LGBMClassifier


[32m[I 2021-03-22 18:02:35,190][0m Trial 23 finished with value: 0.7781480835778674 and parameters: {'random_state': 76, 'reg_alpha': 2.0540372920678053e-06, 'reg_lambda': 0.012937590358078482, 'max_depth': 9, 'n_estimators': 488, 'num_leaves': 91, 'colsample_bytree': 0.9993999738254409, 'subsample': 0.9458336734739123, 'subsample_freq': 5, 'min_child_samples': 90}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7742760983325166

Average AUC: 0.7742760983325166
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7811182679057155

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7795738469638774

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7782024619295325

Fitted LGBMClassifier
Fold: 3	 AUC: 0.776748698952353

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7813153792178095

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7797963803967338

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7777207997640554

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7759519736821532

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7785938769929088

Fitted LGBMClassifier


[32m[I 2021-03-22 18:09:16,866][0m Trial 24 finished with value: 0.7783608396905753 and parameters: {'random_state': 36, 'reg_alpha': 9.802472986564822e-06, 'reg_lambda': 1.4551127130310493, 'max_depth': 10, 'n_estimators': 395, 'num_leaves': 147, 'colsample_bytree': 0.8722581730393842, 'subsample': 0.7815839705752803, 'subsample_freq': 6, 'min_child_samples': 77}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7745867111006144

Average AUC: 0.7745867111006144
Fitted LGBMClassifier
Fold: 0	 AUC: 0.779140775002282

Fitted LGBMClassifier
Fold: 1	 AUC: 0.778396326286135

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7779410506271462

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7755920067237201

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7780562519232077

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7814310484406727

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7776055984679939

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7754143486124523

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7789765155755369

Fitted LGBMClassifier


[32m[I 2021-03-22 18:17:59,622][0m Trial 25 finished with value: 0.7776215016810388 and parameters: {'random_state': 91, 'reg_alpha': 0.0002534476646070017, 'reg_lambda': 0.19948200673550567, 'max_depth': 6, 'n_estimators': 450, 'num_leaves': 137, 'colsample_bytree': 0.6677511845148127, 'subsample': 0.8712538544237614, 'subsample_freq': 3, 'min_child_samples': 59}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7736610951512415

Average AUC: 0.7736610951512415
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7806028465334077

Fitted LGBMClassifier
Fold: 1	 AUC: 0.778106685302175

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7794711370308575

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7764877555767687

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7796090099511139

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7812265358961143

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7785721412223175

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7747624295681713

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7787484240853019

Fitted LGBMClassifier


[32m[I 2021-03-22 18:22:35,359][0m Trial 26 finished with value: 0.7780462451633273 and parameters: {'random_state': 17, 'reg_alpha': 0.005046025131942895, 'reg_lambda': 0.001544503469050444, 'max_depth': 9, 'n_estimators': 289, 'num_leaves': 99, 'colsample_bytree': 0.7741280248691558, 'subsample': 0.7171203039805842, 'subsample_freq': 6, 'min_child_samples': 88}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7728754864670457

Average AUC: 0.7728754864670457
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7808878653135929

Fitted LGBMClassifier
Fold: 1	 AUC: 0.777666212108115

Fitted LGBMClassifier
Fold: 2	 AUC: 0.776420676557787

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7764646160257715

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7779322456142761

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7805292736034434

Fitted LGBMClassifier
Fold: 6	 AUC: 0.776711224863229

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7770175400193255

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7782792532828661

Fitted LGBMClassifier


[32m[I 2021-03-22 18:29:02,502][0m Trial 27 finished with value: 0.7776077873373557 and parameters: {'random_state': 47, 'reg_alpha': 0.0006402014822484011, 'reg_lambda': 3.2003778975319863e-06, 'max_depth': 7, 'n_estimators': 428, 'num_leaves': 59, 'colsample_bytree': 0.9580623654720174, 'subsample': 0.8077207447780798, 'subsample_freq': 5, 'min_child_samples': 75}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7741689659851502

Average AUC: 0.7741689659851502
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7779780567894684

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7761139220070107

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7739791029824946

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7710433758786838

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7758487958223312

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7770587004586207

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7741141968740617

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7724189437221239

Fitted LGBMClassifier
Fold: 8	 AUC: 0.776379019659568

Fitted LGBMClassifier


[32m[I 2021-03-22 18:37:53,027][0m Trial 28 finished with value: 0.7745535522514416 and parameters: {'random_state': 80, 'reg_alpha': 1.8149004908135086e-07, 'reg_lambda': 0.011448305596901166, 'max_depth': 4, 'n_estimators': 372, 'num_leaves': 127, 'colsample_bytree': 0.8867767356061373, 'subsample': 0.9382374089436435, 'subsample_freq': 2, 'min_child_samples': 93}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7706014083200537

Average AUC: 0.7706014083200537
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7750687161921459

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7743589625364335

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7720233458496525

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7687586604976718

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7737019533615759

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7737287792627441

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7729792404160203

Fitted LGBMClassifier
Fold: 7	 AUC: 0.771773050662024

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7760121193954723

Fitted LGBMClassifier


[32m[I 2021-03-22 19:44:58,880][0m Trial 29 finished with value: 0.7729303813984666 and parameters: {'random_state': 58, 'reg_alpha': 3.788434100738878e-06, 'reg_lambda': 0.10710225567319921, 'max_depth': 3, 'n_estimators': 442, 'num_leaves': 59, 'colsample_bytree': 0.8260419479154154, 'subsample': 0.8706329948220672, 'subsample_freq': 3, 'min_child_samples': 42}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7708989858109253

Average AUC: 0.7708989858109253
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7792171269609357

Fitted LGBMClassifier
Fold: 1	 AUC: 0.776730210137253

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7749377765775519

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7733845506228436

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7778142652895254

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7792559762983434

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7779151320474598

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7746648097657283

Fitted LGBMClassifier
Fold: 8	 AUC: 0.779140775002282

Fitted LGBMClassifier


[32m[I 2021-03-22 19:50:47,574][0m Trial 30 finished with value: 0.7764769028189324 and parameters: {'random_state': 66, 'reg_alpha': 4.002624998704606e-05, 'reg_lambda': 0.0005292700904098723, 'max_depth': 6, 'n_estimators': 256, 'num_leaves': 103, 'colsample_bytree': 0.7305799662232784, 'subsample': 0.9798477284065219, 'subsample_freq': 4, 'min_child_samples': 60}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7717084054874016

Average AUC: 0.7717084054874016
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7832484647264565

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7782496483530085

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7792189701360213

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7761749750418115

Fitted LGBMClassifier
Fold: 4	 AUC: 0.777539426770494

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7815740114915062

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7771368955923601

Fitted LGBMClassifier
Fold: 7	 AUC: 0.776528944548186

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7795303468905728

Fitted LGBMClassifier


[32m[I 2021-03-22 19:58:26,448][0m Trial 31 finished with value: 0.7784213509020396 and parameters: {'random_state': 34, 'reg_alpha': 1.2268571525996644e-05, 'reg_lambda': 1.7434409990658855, 'max_depth': 10, 'n_estimators': 388, 'num_leaves': 131, 'colsample_bytree': 0.8674540046508605, 'subsample': 0.7764751075217927, 'subsample_freq': 6, 'min_child_samples': 77}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7750118254699799

Average AUC: 0.7750118254699799
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7807064923200315

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7783458929071677

Fitted LGBMClassifier
Fold: 2	 AUC: 0.7772937537866406

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7757655105580149

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7792337726009502

Fitted LGBMClassifier
Fold: 5	 AUC: 0.778583696731755

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7774487402737135

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7778707246526733

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7794049653333576

Fitted LGBMClassifier


[32m[I 2021-03-22 20:04:42,190][0m Trial 32 finished with value: 0.7780846055781698 and parameters: {'random_state': 22, 'reg_alpha': 6.381622454224719e-07, 'reg_lambda': 3.4668176840868177, 'max_depth': 9, 'n_estimators': 324, 'num_leaves': 119, 'colsample_bytree': 0.9622074346915549, 'subsample': 0.7446375569488948, 'subsample_freq': 6, 'min_child_samples': 100}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7761925066173931

Average AUC: 0.7761925066173931
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7811765704439491

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7797954730752522

Fitted LGBMClassifier
Fold: 2	 AUC: 0.777970187630202

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7756007832044682

Fitted LGBMClassifier
Fold: 4	 AUC: 0.7782644508179372

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7799550817660998

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7767038236307646

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7754587560072386

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7779562924867549

Fitted LGBMClassifier


[32m[I 2021-03-22 20:14:36,377][0m Trial 33 finished with value: 0.7777384808808194 and parameters: {'random_state': 41, 'reg_alpha': 9.105395525604685e-06, 'reg_lambda': 0.6320956492490926, 'max_depth': 10, 'n_estimators': 461, 'num_leaves': 135, 'colsample_bytree': 0.8660654205739193, 'subsample': 0.8138277897176545, 'subsample_freq': 5, 'min_child_samples': 89}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7745033897455281

Average AUC: 0.7745033897455281
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7799065200943404

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7784615621300311

Fitted LGBMClassifier
Fold: 2	 AUC: 0.777891085174981

Fitted LGBMClassifier
Fold: 3	 AUC: 0.7752422200264409

Fitted LGBMClassifier
Fold: 4	 AUC: 0.777403864952125

Fitted LGBMClassifier
Fold: 5	 AUC: 0.7800739979444545

Fitted LGBMClassifier
Fold: 6	 AUC: 0.7790579576647676

Fitted LGBMClassifier
Fold: 7	 AUC: 0.7772997797708213

Fitted LGBMClassifier
Fold: 8	 AUC: 0.7771572561146677

Fitted LGBMClassifier


[32m[I 2021-03-22 20:21:11,206][0m Trial 34 finished with value: 0.7776556077510413 and parameters: {'random_state': 30, 'reg_alpha': 0.00046742618080696435, 'reg_lambda': 8.018427327396678, 'max_depth': 8, 'n_estimators': 379, 'num_leaves': 179, 'colsample_bytree': 0.8021309331006635, 'subsample': 0.8574056191501539, 'subsample_freq': 7, 'min_child_samples': 72}. Best is trial 22 with value: 0.7786212639623306.[0m


Fold: 9	 AUC: 0.7740618336377837

Average AUC: 0.7740618336377837
Fitted LGBMClassifier
Fold: 0	 AUC: 0.7798981830082723

Fitted LGBMClassifier
Fold: 1	 AUC: 0.7787886772031153



Best Params: 
    
    reg_alpha: 0.362136938773081
    reg_lambda: 2.930297242488071
    max_depth: 10
    n_estimators: 306
    num_leaves: 71
    colsample_bytree: 0.7121396258381646
    subsample: 0.793959734582999
    subsample_freq: 2
    min_child_samples: 18

In [10]:
if OPTUNA_OPTIMIZATION:
    final_model = LGBMClassifier(**trial.params)
else:
    final_model = LGBMClassifier(**trial)

In [11]:
test_preds = []

skf = StratifiedKFold(N_SPLITS, shuffle = True, random_state = 29)
aucs = []
for kfold, (train_idx, val_idx) in enumerate(skf.split(train_preprocessed[features].values, 
                                                      train_preprocessed['target'].values)):
        
        final_model.fit(train_preprocessed.loc[train_idx, features], train_preprocessed.loc[train_idx, 'target'])
        print('Fitted {}'.format(type(final_model).__name__))
        val_true = train.loc[val_idx, 'target'].values
        
        preds = final_model.predict(train_preprocessed.loc[val_idx, features])
        
        auc = roc_auc_score(val_true, preds)
        aucs.append(auc)
        print('Fold: {}\t Validation AUC: {}\n'.format(kfold, auc))
        
        test_preds.append(final_model.predict_proba(test_preprocessed[features])[:, 1])
        
print("Best Parameters mean AUC: {}".format(np.mean(aucs)))

Fitted LGBMClassifier
Fold: 0	 Validation AUC: 0.7813630335680872

Fitted LGBMClassifier
Fold: 1	 Validation AUC: 0.7797455076230869

Fitted LGBMClassifier
Fold: 2	 Validation AUC: 0.7801258351038276

Fitted LGBMClassifier
Fold: 3	 Validation AUC: 0.7765705729142831

Fitted LGBMClassifier
Fold: 4	 Validation AUC: 0.7798306360624886

Fitted LGBMClassifier
Fold: 5	 Validation AUC: 0.7797695544955657

Fitted LGBMClassifier
Fold: 6	 Validation AUC: 0.778813192002396

Fitted LGBMClassifier
Fold: 7	 Validation AUC: 0.7770425227454083

Fitted LGBMClassifier
Fold: 8	 Validation AUC: 0.7792939183142692

Fitted LGBMClassifier
Fold: 9	 Validation AUC: 0.7769395743604536

Best Parameters mean AUC: 0.7789494347189867


<a id = "submission"></a>

### Submission

In [12]:
test_predictions = np.mean(test_preds, axis = 0)

In [13]:
len(test_predictions) == len(test)

True

In [14]:
sample_submission['target'] = test_predictions

sample_submission.to_csv("submission.csv", index = False)