# Introduction

### Competition: [Titanic Kaggle](https://www.kaggle.com/c/titanic/overview)

This notebook contains a simple data science project framework, intended for learning and portfolio building.

# Libs

In [28]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
from sklearn.pipeline import Pipeline

import tensorflow as tf

from tensorflow.keras import layers
from tensorflow.keras.layers import Input, Dense, BatchNormalization, Dropout, Embedding,  Flatten
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.optimizers import RMSprop

from tensorflow.data import Dataset
from sklearn.model_selection import train_test_split, KFold
from sklearn.preprocessing import QuantileTransformer,  KBinsDiscretizer, StandardScaler
from tensorflow import keras
from sklearn import metrics
from sklearn.impute import SimpleImputer

from sklearn.model_selection import GridSearchCV, cross_val_score

import optuna

# Load Dataset

In this step we load the data into our working environment. Because we are not dealing with live data, reading the CSV files with pandas is enough.

In [2]:
%%time

train = pd.read_csv("data/train.csv")
test = pd.read_csv("data/test.csv")

Wall time: 137 ms


# Preprocessing

In [3]:
%%time

train['Survived'] = train['Survived'].astype(str)

train['n_missing'] = train.isna().sum(axis=1)
test['n_missing'] = test.isna().sum(axis=1)

train['Pclass'] = train['Pclass'].astype(str)
test['Pclass'] = test['Pclass'].astype(str)

features = [col for col in train.columns if col not in ['Survived', 'PassengerId']]

Wall time: 85.8 ms


### *Name* Column 

In [4]:
print(len(train['Name'].unique()))
print(train['Name'].unique()[0:5])

891
['Braund, Mr. Owen Harris'
 'Cumings, Mrs. John Bradley (Florence Briggs Thayer)'
 'Heikkinen, Miss. Laina' 'Futrelle, Mrs. Jacques Heath (Lily May Peel)'
 'Allen, Mr. William Henry']


**With the *Name* column as it is, we can't use it in our models.** Every passenger has a unique name, so the raw name carries no information about our variable of interest (*Survived*).

One thing we can see in this column is the presence of titles. **We can probably assume different survival rates when considering different titles.**

In [5]:
name_and_title = [name.split(", ")[1] for name in train['Name']]
title = [name.split(".")[0] for name in name_and_title]
print(len(title))

891


In [6]:
print(len(np.unique(title)))
np.unique(title)

17


array(['Capt', 'Col', 'Don', 'Dr', 'Jonkheer', 'Lady', 'Major', 'Master',
       'Miss', 'Mlle', 'Mme', 'Mr', 'Mrs', 'Ms', 'Rev', 'Sir',
       'the Countess'], dtype='<U12')
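
The two `split` calls above can be wrapped in a single helper so that exactly the same logic is applied to train and test. The sketch below is one possible way to do it with a regular expression; the name `extract_title`, the pattern and the `"Unknown"` fallback are assumptions made for illustration, not part of the original notebook.

```python
import re

# Matches the text between the first ", " and the following "." (e.g. "Mr", "the Countess").
_TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def extract_title(name: str) -> str:
    """Return the title embedded in a Titanic-style passenger name."""
    match = _TITLE_PATTERN.search(name)
    # Fall back to an explicit placeholder instead of raising on unexpected formats.
    return match.group(1).strip() if match else "Unknown"


names = [
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
]
titles = [extract_title(n) for n in names]
print(titles)
```

Unlike the chained `split` calls, this version does not raise an `IndexError` when a name has no comma; whether a silent fallback is preferable to a loud failure depends on how much the input format is trusted.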

In [7]:
train['Name'] = title
test['Name'] = [name.split(".")[0] for name in [name.split(", ")[1] for name in test['Name']]] 

In [8]:
train = pd.concat([train, pd.get_dummies(train['Name']).filter(['Miss', 'Mr', 'Mrs', 'Ms'])], axis = 1)
train.drop('Name', axis = 1, inplace = True)

test = pd.concat([test, pd.get_dummies(test['Name']).filter(['Miss', 'Mr', 'Mrs', 'Ms'])], axis = 1)
test.drop('Name', axis = 1, inplace = True)
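
Because `pd.get_dummies` is called separately on train and test, a title that is absent from one of them produces no column there, and `filter` silently skips it. A minimal sketch of one way to guarantee identical columns in both frames, using `reindex`; the helper name `title_dummies` and the toy data are assumptions for illustration.

```python
import pandas as pd

KEPT_TITLES = ["Miss", "Mr", "Mrs", "Ms"]


def title_dummies(titles: pd.Series, kept=KEPT_TITLES) -> pd.DataFrame:
    """One-hot encode titles, always returning exactly the columns in `kept`."""
    dummies = pd.get_dummies(titles)
    # reindex adds any missing column (filled with 0) and drops titles we do not keep.
    return dummies.reindex(columns=kept, fill_value=0).astype(int)


train_titles = pd.Series(["Mr", "Mrs", "Miss", "Ms", "Dr"])
test_titles = pd.Series(["Mr", "Miss", "Col"])  # no "Mrs" or "Ms" in this toy test set

train_enc = title_dummies(train_titles)
test_enc = title_dummies(test_titles)
print(list(train_enc.columns) == list(test_enc.columns))
```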

### Dealing with the Ticket feature

In [9]:
train['Ticket'][0:5]

0           A/5 21171
1            PC 17599
2    STON/O2. 3101282
3              113803
4              373450
Name: Ticket, dtype: object

**One hypothesis we can make** is that the ticket numbers carry no relevant information, while the prefix might.

In [10]:
ticket_prefixes = [ticket.split()[0] for ticket in train['Ticket']]
ticket_prefixes[0:5]

['A/5', 'PC', 'STON/O2.', '113803', '373450']

In [11]:
for i in range(len(ticket_prefixes)):
    try: 
        int(ticket_prefixes[i])
        ticket_prefixes[i] = "number_only"
    
    except Exception:
        pass

In [12]:
ticket_prefixes[0:5]

['A/5', 'PC', 'STON/O2.', 'number_only', 'number_only']

In [13]:
print(len(np.unique(ticket_prefixes)))
np.unique(ticket_prefixes)

44


array(['A./5.', 'A.5.', 'A/4', 'A/4.', 'A/5', 'A/5.', 'A/S', 'A4.', 'C',
       'C.A.', 'C.A./SOTON', 'CA', 'CA.', 'F.C.', 'F.C.C.', 'Fa', 'LINE',
       'P/PP', 'PC', 'PP', 'S.C./A.4.', 'S.C./PARIS', 'S.O./P.P.',
       'S.O.C.', 'S.O.P.', 'S.P.', 'S.W./PP', 'SC', 'SC/AH', 'SC/PARIS',
       'SC/Paris', 'SCO/W', 'SO/C', 'SOTON/O.Q.', 'SOTON/O2', 'SOTON/OQ',
       'STON/O', 'STON/O2.', 'SW/PP', 'W./C.', 'W.E.P.', 'W/C', 'WE/P',
       'number_only'], dtype='<U11')

In [14]:
ticket_prefixes = [s.replace(".", "") for s in ticket_prefixes]
ticket_prefixes = [s.replace(",", "") for s in ticket_prefixes]
ticket_prefixes = [s.upper() for s in ticket_prefixes]

In [15]:
print(len(np.unique(ticket_prefixes)))
np.unique(ticket_prefixes)

34


array(['A/4', 'A/5', 'A/S', 'A4', 'A5', 'C', 'CA', 'CA/SOTON', 'FA', 'FC',
       'FCC', 'LINE', 'NUMBER_ONLY', 'P/PP', 'PC', 'PP', 'SC', 'SC/A4',
       'SC/AH', 'SC/PARIS', 'SCO/W', 'SO/C', 'SO/PP', 'SOC', 'SOP',
       'SOTON/O2', 'SOTON/OQ', 'SP', 'STON/O', 'STON/O2', 'SW/PP', 'W/C',
       'WE/P', 'WEP'], dtype='<U11')
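
The cleaning steps above (take the first token, flag purely numeric tickets, strip punctuation, upper-case) can be collected into one function and reused for both train and test. This is a sketch under the assumption that `str.isdigit` is an acceptable stand-in for the `int(...)` check used in the notebook; the name `ticket_prefix` is hypothetical.

```python
def ticket_prefix(ticket: str) -> str:
    """Normalise a raw ticket string to its prefix, or NUMBER_ONLY if it has none."""
    first = ticket.split()[0]
    if first.isdigit():
        # The ticket starts with a plain number, so there is no prefix to keep.
        return "NUMBER_ONLY"
    # Remove punctuation variants and unify case, e.g. "STON/O2." becomes "STON/O2".
    return first.replace(".", "").replace(",", "").upper()


raw_tickets = ["A/5 21171", "PC 17599", "STON/O2. 3101282", "113803", "SC/Paris 2123"]
prefixes = [ticket_prefix(t) for t in raw_tickets]
print(prefixes)
```

Applied as `train['Ticket'].map(ticket_prefix)` and `test['Ticket'].map(ticket_prefix)`, a helper like this would avoid maintaining two copies of the same loop.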

In [16]:
test_ticket_prefixes = [ticket.split()[0] for ticket in test['Ticket']]
for i in range(len(test_ticket_prefixes)):
    try: 
        int(test_ticket_prefixes[i])
        test_ticket_prefixes[i] = "number_only"
    
    except Exception:
        pass

test_ticket_prefixes = [s.replace(".", "") for s in test_ticket_prefixes]
test_ticket_prefixes = [s.replace(",", "") for s in test_ticket_prefixes]
test_ticket_prefixes = [s.upper() for s in test_ticket_prefixes]

In [17]:
train['Ticket'] = ticket_prefixes
test['Ticket'] = test_ticket_prefixes

In [18]:
train = pd.concat([train, pd.get_dummies(train['Ticket']).filter(['PC', 'CA', 'NUMBER_ONLY'])], axis = 1)
train.drop('Ticket', axis = 1, inplace = True)

test = pd.concat([test, pd.get_dummies(test['Ticket']).filter(['PC', 'CA', 'NUMBER_ONLY'])], axis = 1)
test.drop('Ticket', axis = 1, inplace = True)

### Dealing with the Cabin feature

As with the *Ticket* feature, I will assume that the cabin number carries no relevant information and keep only the leading letter.

In [19]:
train['Cabin'].unique()

array([nan, 'C85', 'C123', 'E46', 'G6', 'C103', 'D56', 'A6',
       'C23 C25 C27', 'B78', 'D33', 'B30', 'C52', 'B28', 'C83', 'F33',
       'F G73', 'E31', 'A5', 'D10 D12', 'D26', 'C110', 'B58 B60', 'E101',
       'F E69', 'D47', 'B86', 'F2', 'C2', 'E33', 'B19', 'A7', 'C49', 'F4',
       'A32', 'B4', 'B80', 'A31', 'D36', 'D15', 'C93', 'C78', 'D35',
       'C87', 'B77', 'E67', 'B94', 'C125', 'C99', 'C118', 'D7', 'A19',
       'B49', 'D', 'C22 C26', 'C106', 'C65', 'E36', 'C54',
       'B57 B59 B63 B66', 'C7', 'E34', 'C32', 'B18', 'C124', 'C91', 'E40',
       'T', 'C128', 'D37', 'B35', 'E50', 'C82', 'B96 B98', 'E10', 'E44',
       'A34', 'C104', 'C111', 'C92', 'E38', 'D21', 'E12', 'E63', 'A14',
       'B37', 'C30', 'D20', 'B79', 'E25', 'D46', 'B73', 'C95', 'B38',
       'B39', 'B22', 'C86', 'C70', 'A16', 'C101', 'C68', 'A10', 'E68',
       'B41', 'A20', 'D19', 'D50', 'D9', 'A23', 'B50', 'A26', 'D48',
       'E58', 'C126', 'B71', 'B51 B53 B55', 'D49', 'B5', 'B20', 'F G63',
       'C62 C64',

In [20]:
cabin_prefix = []
for cabin in train['Cabin']:
    # Keep only the first character of the cabin code; missing values (NaN) pass through unchanged
    cabin_prefix.append(cabin[:1] if isinstance(cabin, str) else cabin)

In [21]:
np.unique(cabin_prefix)

array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'T', 'nan'], dtype='<U32')

In [22]:
cabin_test_prefix = []
for cabin in test['Cabin']:
    # Keep only the first character of the cabin code; missing values (NaN) pass through unchanged
    cabin_test_prefix.append(cabin[:1] if isinstance(cabin, str) else cabin)

In [23]:
train['Cabin'] = cabin_prefix
test['Cabin'] = cabin_test_prefix

In [24]:
train = pd.concat([train, pd.get_dummies(train['Cabin']).filter(['NaN', 'B', 'C'])], axis = 1)
train.drop('Cabin', axis = 1, inplace = True)

test = pd.concat([test, pd.get_dummies(test['Cabin']).filter(['NaN', 'B', 'C'])], axis = 1)
test.drop('Cabin', axis = 1, inplace = True)
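
By default `pd.get_dummies` skips missing values, so the `'NaN'` entry in the `filter` list above most likely matches no column and only `B` and `C` are kept. If an explicit "cabin unknown" indicator is wanted, one option is sketched below on toy data; the helper name `deck_dummies` and the column name `cabin_missing` are assumptions, not names from the notebook.

```python
import pandas as pd


def deck_dummies(cabin: pd.Series, decks=("B", "C")) -> pd.DataFrame:
    """Indicator columns for selected cabin letters plus an explicit missing flag."""
    letter = cabin.str[:1]  # missing values stay missing
    out = pd.DataFrame(index=cabin.index)
    for d in decks:
        # Comparisons against a missing value count as "not this deck".
        out[d] = letter.eq(d).fillna(False).astype(int)
    out["cabin_missing"] = cabin.isna().astype(int)
    return out


cabins = pd.Series(["C85", None, "B28", "E46"])
encoded = deck_dummies(cabins)
print(encoded)
```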

## Pclass, Sex and Embarked variables

In [25]:
train = pd.get_dummies(train, columns = ['Pclass', 'Sex', 'Embarked'])
test = pd.get_dummies(test, columns = ['Pclass', 'Sex', 'Embarked'])

## Imputer and Scaler

In [26]:
%%time

features = [col for col in train.columns if col not in ['Survived', 'PassengerId']]
numerical_features = [col for col in features if col in ['Age', 'SibSp', 'Parch', 'Fare', 'n_missing']]

pipe = Pipeline([
        ('imputer', SimpleImputer(strategy='mean',missing_values=np.nan)),
        ("scaler", StandardScaler())
        ])

train[numerical_features] = pipe.fit_transform(train[numerical_features])
test[numerical_features] = pipe.transform(test[numerical_features])

Wall time: 13 ms
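
For intuition, the pipeline above amounts to filling each missing value with the training-set column mean and then standardising with training-set statistics, which are reused unchanged on the test set. The following NumPy-only sketch reproduces that arithmetic on made-up numbers (the arrays are illustrative, not Titanic data).

```python
import numpy as np

train_x = np.array([[22.0, 7.25], [np.nan, 71.28], [26.0, 7.92], [35.0, np.nan]])
test_x = np.array([[np.nan, 8.05], [54.0, 51.86]])

# "fit": all statistics come from the training data only.
col_mean = np.nanmean(train_x, axis=0)
train_filled = np.where(np.isnan(train_x), col_mean, train_x)
col_std = train_filled.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# "transform": the same statistics are applied to both sets.
train_scaled = (train_filled - col_mean) / col_std
test_filled = np.where(np.isnan(test_x), col_mean, test_x)
test_scaled = (test_filled - col_mean) / col_std

print(train_scaled.round(3))
print(test_scaled.round(3))
```

Fitting on train and only transforming test, as the notebook does, keeps test-set information out of the preprocessing statistics.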


# Base Models

## Light GBM

In [27]:
import lightgbm as lgb

def objective(trial):
    
    param = {
        'objective': 'binary',
        'metric': 'binary_logloss',
        'verbosity': -1,
        'boosting_type': 'gbdt',
        'lambda_l1': trial.suggest_float('lambda_l1', 1e-8, 10.0, log=True),
        'lambda_l2': trial.suggest_float('lambda_l2', 1e-8, 10.0, log=True),
        'num_leaves': trial.suggest_int('num_leaves', 2, 32),
        'feature_fraction': trial.suggest_float('feature_fraction', 0.4, 1.0),
        'bagging_fraction': trial.suggest_float('bagging_fraction', 0.4, 1.0),
        'bagging_freq': trial.suggest_int('bagging_freq', 1, 7),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
    }
    
    kf = KFold(5, shuffle = True, random_state = 0)
    kf.split(train)
    
    accuracy_scores = []
    
    for train_ix, test_ix in kf.split(train):
        dtrain = lgb.Dataset(train[features].iloc[train_ix,:], label = train['Survived'].iloc[train_ix])
        
        gbm = lgb.train(param, dtrain)
        preds = np.rint(gbm.predict(train[features].iloc[test_ix]))
        
        accuracy_scores.append(metrics.accuracy_score(train['Survived'].iloc[test_ix], preds))
        
    return np.mean(accuracy_scores)
    

# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=1000)

[I 2021-12-21 18:22:15,338] A new study created in memory with name: no-name-f821181d-7759-4b23-b414-c30db2bc2008
[W 2021-12-21 18:22:15,528] Trial 0 failed because of the following error: ValueError('Series.dtypes must be int, float or bool')
Traceback (most recent call last):
  File "F:\Python\Python37\lib\site-packages\optuna\study\_optimize.py", line 213, in _run_trial
    value_or_values = func(trial)
  File "<ipython-input-27-a6b7f4e955c9>", line 28, in objective
    gbm = lgb.train(param, dtrain)
  File "F:\Python\Python37\lib\site-packages\lightgbm\engine.py", line 272, in train
    booster = Booster(params=params, train_set=train_set)
  File "F:\Python\Python37\lib\site-packages\lightgbm\basic.py", line 2605, in __init__
    train_set.construct()
  File "F:\Python\Python37\lib\site-packages\lightgbm\basic.py", line 1819, in construct
    categorical_feature=self.categorical_feature, params=self.params)
  File "F:\Python\Python37\lib\site-packages\ligh

Traceback (most recent call last):
  File "F:\Python\Python37\lib\site-packages\IPython\core\interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-27-a6b7f4e955c9>", line 38, in <module>
    study.optimize(objective, n_trials=1000)
  File "F:\Python\Python37\lib\site-packages\optuna\study\study.py", line 409, in optimize
    show_progress_bar=show_progress_bar,
  File "F:\Python\Python37\lib\site-packages\optuna\study\_optimize.py", line 76, in _optimize
    progress_bar=progress_bar,
  File "F:\Python\Python37\lib\site-packages\optuna\study\_optimize.py", line 163, in _optimize_sequential
    trial = _run_trial(study, func, catch)
  File "F:\Python\Python37\lib\site-packages\optuna\study\_optimize.py", line 264, in _run_trial
    raise func_err
  File "F:\Python\Python37\lib\site-packages\optuna\study\_optimize.py", line 213, in _run_trial
    value_or_values = func(trial)
  File "<ipython-input-27-a6b7f4e955c9>", lin

TypeError: object of type 'NoneType' has no len()
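
The warning logged for trial 0 reports `ValueError('Series.dtypes must be int, float or bool')`. That is consistent with the label column having been cast to `str` in the preprocessing cell, since LightGBM expects numeric labels. A minimal sketch, on toy data, of checking and restoring a numeric label before it is passed to `lgb.Dataset`; the helper name `as_numeric_label` is an assumption for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def as_numeric_label(label: pd.Series) -> pd.Series:
    """Return the label as integers, converting from strings such as '0'/'1' if needed."""
    if is_numeric_dtype(label):
        return label
    return pd.to_numeric(label).astype(int)


survived = pd.Series([0, 1, 1, 0]).astype(str)  # mimics the string cast done earlier
label = as_numeric_label(survived)
print(label.dtype)
```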

In [94]:
'''
{'lambda_l1': 5.5693205859882666e-08,
 'lambda_l2': 0.0029379573632802307,
 'num_leaves': 218,
 'feature_fraction': 0.4449630393801182,
 'bagging_fraction': 0.6190711470746258,
 'bagging_freq': 1,
 'min_child_samples': 24}
'''

"\n{'lambda_l1': 5.5693205859882666e-08,\n 'lambda_l2': 0.0029379573632802307,\n 'num_leaves': 218,\n 'feature_fraction': 0.4449630393801182,\n 'bagging_fraction': 0.6190711470746258,\n 'bagging_freq': 1,\n 'min_child_samples': 24}\n"

In [53]:
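kf_stacking = KFold(4, shuffle = True, random_state = 0)

# Out-of-fold predictions, indexed like train (a separate Series, so PassengerId is not overwritten)
lv1_lgbm_preds = pd.Series(np.nan, index = train.index, name = 'lgbm_oof')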

for train_ix, test_ix in kf_stacking.split(train):
    lgbm = lgb.train(
        {'lambda_l1': 5.5693205859882666e-08,
         'lambda_l2': 0.0029379573632802307,
         'num_leaves': 218,
         'feature_fraction': 0.4449630393801182,
         'bagging_fraction': 0.6190711470746258,
         'bagging_freq': 1,
         'min_child_samples': 24}, 
        lgb.Dataset(train[features].iloc[train_ix,:], label = train['Survived'].iloc[train_ix]))
    
    lv1_lgbm_preds.iloc[test_ix] = lgbm.predict(train[features].iloc[test_ix])

You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 225
[LightGBM] [Info] Number of data points in the train set: 668, number of used features: 21
[LightGBM] [Info] Start training from score 0.386228
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 225
[LightGBM] [Info] Number of data points in the train set: 668, number of used features: 21
[LightGBM] [Info] Start training from score 0.393713
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 225
[LightGBM] [Info] Number of data points in the train set: 668, number of used features: 21
[LightGBM] [Info] Start training from score 0.383234


You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 224
[LightGBM] [Info] Number of data points in the train set: 669, number of used features: 21
[LightGBM] [Info] Start training from score 0.372197
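
The loop above builds out-of-fold predictions: each row is predicted by a model that never saw it during training, which is what a second-level (stacking) model needs as input. Below is a self-contained sketch of the same pattern on synthetic data, with scikit-learn's `LogisticRegression` standing in for LightGBM (an assumption made only to keep the example light).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

kf = KFold(n_splits=4, shuffle=True, random_state=0)
oof_preds = np.full(len(y), np.nan)  # one slot per training row

for train_ix, valid_ix in kf.split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_ix], y[train_ix])
    # Probability of the positive class for the rows held out of this fold
    oof_preds[valid_ix] = model.predict_proba(X[valid_ix])[:, 1]

print(np.isnan(oof_preds).sum())
```

Starting from an array of NaN makes it easy to verify afterwards that every row received exactly one held-out prediction.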


## Gradient Boosting Machine

In [31]:
train['Survived'] = train['Survived'].astype(int)

In [40]:
from sklearn.ensemble import GradientBoostingClassifier 

def objective(trial):
    
    learning_rate = trial.suggest_float('learning_rate', 1e-6, 1e-2)
    n_estimators = trial.suggest_int('n_estimators', 20, 200)
    subsample = trial.suggest_float('subsample', 0.75, 1)
    min_samples_split = trial.suggest_int('min_samples_split', 2, 20)
    min_samples_leaf = trial.suggest_int('min_samples_leaf', 2, 20)
    max_depth = trial.suggest_int('max_depth', 3, 7)
    min_impurity_decrease = trial.suggest_float('min_impurity_decrease', 0.75, 1)
    max_features = trial.suggest_int('max_features', 3, 20)
    
    gbm = GradientBoostingClassifier(learning_rate = learning_rate,
                                     n_estimators = n_estimators,
                                     subsample = subsample,
                                     min_samples_split = min_samples_split,
                                     min_samples_leaf = min_samples_leaf,
                                     max_depth = max_depth,
                                     min_impurity_decrease = min_impurity_decrease,
                                     random_state = 42,
                                     max_features = max_features)
    
    kf = KFold(5, shuffle = True, random_state = 0)
    
    accuracy_scores = []
    
    for train_ix, test_ix in kf.split(train):
        gbm.fit(train[features].iloc[train_ix,:], train['Survived'].iloc[train_ix])
        # predict() already returns class labels, so no rounding is needed
        preds = gbm.predict(train[features].iloc[test_ix])
        
        accuracy_scores.append(metrics.accuracy_score(train['Survived'].iloc[test_ix], preds))
        
    return np.mean(accuracy_scores)
    

# 3. Create a study object and optimize the objective function.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=1000)

[I 2021-12-21 20:50:09,281] A new study created in memory with name: no-name-a3391b4b-7ab1-4ef8-8a9f-7cde2f1c035e
[I 2021-12-21 20:50:10,362] Trial 0 finished with value: 0.7844830832967171 and parameters: {'learning_rate': 0.0016267243045861315, 'n_estimators': 184, 'subsample': 0.9516362903057367, 'min_samples_split': 7, 'min_samples_leaf': 12, 'max_depth': 5, 'min_impurity_decrease': 0.7822172257788828, 'max_features': 17}. Best is trial 0 with value: 0.7844830832967171.
[I 2021-12-21 20:50:10,497] Trial 1 finished with value: 0.6161634548992531 and parameters: {'learning_rate': 0.0029507164515593194, 'n_estimators': 26, 'subsample': 0.8081284642369604, 'min_samples_split': 17, 'min_samples_leaf': 6, 'max_depth': 4, 'min_impurity_decrease': 0.9862176711089738, 'max_features': 6}. Best is trial 0 with value: 0.7844830832967171.
[... intermediate trials omitted; in the captured lines the reported best is, in order, trial 0 (0.7844830832967171), trial 15 (0.8136714581633294), trial 26 (0.8192706044818279), trial 98 (0.8215554579122466), trial 149 (0.8226790534178645) and trial 213 (0.8238026489234826) ...]
[I 2021-12-21 20:55:17,841] Trial 374 finished with value: 0.8137028435126483 and parameters: {'learning_rate': 0.005402341364597142, 'n_estimators': 153, 'subsample': 0.93464866130703, 'min_samples_split': 8, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7767603747276643, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.
[I 2021-12-21 20:55:18,952] Trial 375 finished with value: 0.7979411210846776 and parameters: {'learning_rate': 0.002471511947388039, 'n_estimators': 158, 'subsample': 0.9400663491285227, 'min_samples_split': 9, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7918549056466897, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.
[32m[I 2021-12-21 20:55:19,896][0m Trial 376 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.007062797530982539, 'n_estimators': 148, 'subsample': 0.9445691846403544, 'min_samples_split': 6, 'min_samples_leaf': 3

[32m[I 2021-12-21 20:55:38,293][0m Trial 396 finished with value: 0.817079907099366 and parameters: {'learning_rate': 0.009262756068036905, 'n_estimators': 173, 'subsample': 0.9348392297098376, 'min_samples_split': 6, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.77153645839194, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:55:39,287][0m Trial 397 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.008499567484484142, 'n_estimators': 164, 'subsample': 0.9391157089304174, 'min_samples_split': 8, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7681352479311384, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:55:40,238][0m Trial 398 finished with value: 0.817079907099366 and parameters: {'learning_rate': 0.009034809151900846, 'n_estimators': 158, 'subsample': 0.9499488117521464, 'min_samples_split': 8, 'min_samples_leaf': 3, 

[32m[I 2021-12-21 20:55:58,733][0m Trial 418 finished with value: 0.8215554579122466 and parameters: {'learning_rate': 0.008783162061710419, 'n_estimators': 154, 'subsample': 0.9427255896576542, 'min_samples_split': 10, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7742649598933085, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:55:59,617][0m Trial 419 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.008782473269896299, 'n_estimators': 143, 'subsample': 0.8363771790166957, 'min_samples_split': 9, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.771332136760455, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:56:00,536][0m Trial 420 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.008900799410851791, 'n_estimators': 148, 'subsample': 0.9366619173075736, 'min_samples_split': 6, 'min_samples_leaf':

[32m[I 2021-12-21 20:56:18,411][0m Trial 440 finished with value: 0.8204318624066286 and parameters: {'learning_rate': 0.0085600851754, 'n_estimators': 158, 'subsample': 0.9371953076362594, 'min_samples_split': 9, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7895790866594372, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:56:19,400][0m Trial 441 finished with value: 0.8159563115937478 and parameters: {'learning_rate': 0.008752100942983601, 'n_estimators': 166, 'subsample': 0.9444274939140839, 'min_samples_split': 7, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7794474076523237, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:56:20,247][0m Trial 442 finished with value: 0.8181783943255289 and parameters: {'learning_rate': 0.008035593201624737, 'n_estimators': 145, 'subsample': 0.9539677609311942, 'min_samples_split': 8, 'min_samples_leaf': 4, '

[32m[I 2021-12-21 20:56:37,942][0m Trial 462 finished with value: 0.8137028435126483 and parameters: {'learning_rate': 0.008973783716833642, 'n_estimators': 149, 'subsample': 0.9419098900622047, 'min_samples_split': 11, 'min_samples_leaf': 11, 'max_depth': 7, 'min_impurity_decrease': 0.7713508460272814, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:56:38,886][0m Trial 463 finished with value: 0.815956311593748 and parameters: {'learning_rate': 0.009418180862784026, 'n_estimators': 161, 'subsample': 0.9373151073532977, 'min_samples_split': 12, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7851168994643094, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:56:39,822][0m Trial 464 finished with value: 0.8204318624066287 and parameters: {'learning_rate': 0.008700571693538921, 'n_estimators': 157, 'subsample': 0.927834437821751, 'min_samples_split': 9, 'min_samples_leaf'

[32m[I 2021-12-21 20:56:58,066][0m Trial 484 finished with value: 0.8137153976523759 and parameters: {'learning_rate': 0.00800370130588815, 'n_estimators': 157, 'subsample': 0.9234173985389661, 'min_samples_split': 9, 'min_samples_leaf': 2, 'max_depth': 5, 'min_impurity_decrease': 0.7929348396287961, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:56:58,954][0m Trial 485 finished with value: 0.8181846713953927 and parameters: {'learning_rate': 0.009189148492444706, 'n_estimators': 141, 'subsample': 0.9523573185721373, 'min_samples_split': 8, 'min_samples_leaf': 4, 'max_depth': 7, 'min_impurity_decrease': 0.7652349351246976, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:56:59,938][0m Trial 486 finished with value: 0.813709120582512 and parameters: {'learning_rate': 0.008767905403815964, 'n_estimators': 163, 'subsample': 0.9315712998874127, 'min_samples_split': 10, 'min_samples_leaf': 

[32m[I 2021-12-21 20:57:17,386][0m Trial 506 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.008349521976298678, 'n_estimators': 155, 'subsample': 0.9429370945067207, 'min_samples_split': 9, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7952742872550995, 'max_features': 17}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:57:18,303][0m Trial 507 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.008689216032143196, 'n_estimators': 144, 'subsample': 0.9508118573241223, 'min_samples_split': 7, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.799266961595352, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:57:19,221][0m Trial 508 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.009013834641567785, 'n_estimators': 155, 'subsample': 0.9325403869118223, 'min_samples_split': 6, 'min_samples_leaf': 

[32m[I 2021-12-21 20:57:37,868][0m Trial 528 finished with value: 0.8125980792166217 and parameters: {'learning_rate': 0.00925740550734669, 'n_estimators': 138, 'subsample': 0.9526520739455435, 'min_samples_split': 7, 'min_samples_leaf': 7, 'max_depth': 7, 'min_impurity_decrease': 0.8160505898197337, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:57:38,809][0m Trial 529 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.008440552718962435, 'n_estimators': 154, 'subsample': 0.9440386048119167, 'min_samples_split': 9, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.8038346592565913, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:57:39,714][0m Trial 530 finished with value: 0.8204318624066286 and parameters: {'learning_rate': 0.008783869806656963, 'n_estimators': 149, 'subsample': 0.9307178932467258, 'min_samples_split': 6, 'min_samples_leaf': 

[32m[I 2021-12-21 20:57:58,151][0m Trial 550 finished with value: 0.8182035026049841 and parameters: {'learning_rate': 0.00895828582689948, 'n_estimators': 161, 'subsample': 0.9238190068011091, 'min_samples_split': 10, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7726818031565332, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:57:59,102][0m Trial 551 finished with value: 0.8215554579122466 and parameters: {'learning_rate': 0.008168062435125144, 'n_estimators': 149, 'subsample': 0.9500200785736045, 'min_samples_split': 7, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7881164833731865, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:58:00,016][0m Trial 552 finished with value: 0.8170861841692298 and parameters: {'learning_rate': 0.009160489985400502, 'n_estimators': 144, 'subsample': 0.9595806539088357, 'min_samples_split': 12, 'min_samples_leaf'

[32m[I 2021-12-21 20:58:18,603][0m Trial 572 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.008106184600177493, 'n_estimators': 145, 'subsample': 0.9454324384102369, 'min_samples_split': 9, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7535018740576255, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:58:19,538][0m Trial 573 finished with value: 0.8193082669010104 and parameters: {'learning_rate': 0.00859229317633687, 'n_estimators': 153, 'subsample': 0.9399360545560271, 'min_samples_split': 6, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7950254482738477, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:58:20,394][0m Trial 574 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.009542822415767151, 'n_estimators': 136, 'subsample': 0.9501203754662213, 'min_samples_split': 8, 'min_samples_leaf': 

[32m[I 2021-12-21 20:58:38,627][0m Trial 594 finished with value: 0.8159625886636117 and parameters: {'learning_rate': 0.008468558383118824, 'n_estimators': 186, 'subsample': 0.8897094785547406, 'min_samples_split': 9, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7877310436589139, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:58:39,510][0m Trial 595 finished with value: 0.8170673529596384 and parameters: {'learning_rate': 0.008766249866385713, 'n_estimators': 149, 'subsample': 0.9105755543025648, 'min_samples_split': 13, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.8153905654516264, 'max_features': 17}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:58:40,470][0m Trial 596 finished with value: 0.8159625886636119 and parameters: {'learning_rate': 0.008953367260594683, 'n_estimators': 174, 'subsample': 0.9274439478490528, 'min_samples_split': 7, 'min_samples_leaf'

[32m[I 2021-12-21 20:58:58,413][0m Trial 616 finished with value: 0.8215617349821104 and parameters: {'learning_rate': 0.009797757606624557, 'n_estimators': 142, 'subsample': 0.962211666528173, 'min_samples_split': 6, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.8005352134991581, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:58:59,303][0m Trial 617 finished with value: 0.8193208210407382 and parameters: {'learning_rate': 0.009632098974552116, 'n_estimators': 141, 'subsample': 0.969446745783037, 'min_samples_split': 8, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.826468245018116, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:58:59,835][0m Trial 618 finished with value: 0.809189630280585 and parameters: {'learning_rate': 0.00950611391771956, 'n_estimators': 67, 'subsample': 0.958664063552098, 'min_samples_split': 6, 'min_samples_leaf': 2, 'ma

[32m[I 2021-12-21 20:59:17,784][0m Trial 638 finished with value: 0.8137153976523759 and parameters: {'learning_rate': 0.009208330211179177, 'n_estimators': 147, 'subsample': 0.9510449297731962, 'min_samples_split': 8, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.8179274898713601, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:59:18,603][0m Trial 639 finished with value: 0.8103446111355218 and parameters: {'learning_rate': 0.008742093480458775, 'n_estimators': 145, 'subsample': 0.9601124381089619, 'min_samples_split': 9, 'min_samples_leaf': 2, 'max_depth': 4, 'min_impurity_decrease': 0.778410934909539, 'max_features': 15}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:59:19,478][0m Trial 640 finished with value: 0.8226790534178645 and parameters: {'learning_rate': 0.009713231336526624, 'n_estimators': 135, 'subsample': 0.9674089936524229, 'min_samples_split': 6, 'min_samples_leaf': 

[32m[I 2021-12-21 20:59:36,484][0m Trial 660 finished with value: 0.8181846713953925 and parameters: {'learning_rate': 0.009291313984539395, 'n_estimators': 151, 'subsample': 0.9457177451225093, 'min_samples_split': 8, 'min_samples_leaf': 4, 'max_depth': 7, 'min_impurity_decrease': 0.7766420493866172, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:59:37,482][0m Trial 661 finished with value: 0.8159437574540205 and parameters: {'learning_rate': 0.006296442153074821, 'n_estimators': 146, 'subsample': 0.957822982401928, 'min_samples_split': 7, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7914282984144622, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:59:38,371][0m Trial 662 finished with value: 0.8181846713953927 and parameters: {'learning_rate': 0.009406910659646254, 'n_estimators': 139, 'subsample': 0.9752731324018185, 'min_samples_split': 15, 'min_samples_leaf':

[32m[I 2021-12-21 20:59:55,824][0m Trial 682 finished with value: 0.8137216747222397 and parameters: {'learning_rate': 0.00976659138874314, 'n_estimators': 131, 'subsample': 0.9805882724471342, 'min_samples_split': 4, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.8987756387937926, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:59:56,684][0m Trial 683 finished with value: 0.8159688657334756 and parameters: {'learning_rate': 0.009084604159499056, 'n_estimators': 129, 'subsample': 0.9736457005548077, 'min_samples_split': 4, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.8151307259335503, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 20:59:57,784][0m Trial 684 finished with value: 0.8092335697696316 and parameters: {'learning_rate': 0.008767421856574427, 'n_estimators': 199, 'subsample': 0.9722214250299555, 'min_samples_split': 10, 'min_samples_leaf':

[32m[I 2021-12-21 21:00:15,460][0m Trial 704 finished with value: 0.8204318624066286 and parameters: {'learning_rate': 0.009068224333171561, 'n_estimators': 139, 'subsample': 0.9635701219109555, 'min_samples_split': 9, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7632686540759455, 'max_features': 19}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:00:16,331][0m Trial 705 finished with value: 0.8159249262444291 and parameters: {'learning_rate': 0.008395333346852081, 'n_estimators': 149, 'subsample': 0.9543210586996888, 'min_samples_split': 9, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7584889917748044, 'max_features': 13}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:00:17,241][0m Trial 706 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.009477300589617018, 'n_estimators': 143, 'subsample': 0.9696053784644663, 'min_samples_split': 10, 'min_samples_leaf'

[32m[I 2021-12-21 21:00:35,512][0m Trial 726 finished with value: 0.8181846713953925 and parameters: {'learning_rate': 0.009136086948701787, 'n_estimators': 148, 'subsample': 0.9317941654201393, 'min_samples_split': 7, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7857280553183565, 'max_features': 17}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:00:36,480][0m Trial 727 finished with value: 0.8193082669010104 and parameters: {'learning_rate': 0.008791285618358784, 'n_estimators': 158, 'subsample': 0.9363487085418205, 'min_samples_split': 8, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7718326809457765, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:00:37,357][0m Trial 728 finished with value: 0.8125918021467579 and parameters: {'learning_rate': 0.008969073298352033, 'n_estimators': 152, 'subsample': 0.9439888910794679, 'min_samples_split': 8, 'min_samples_leaf':

[32m[I 2021-12-21 21:00:55,485][0m Trial 748 finished with value: 0.8159437574540205 and parameters: {'learning_rate': 0.009465539878191294, 'n_estimators': 143, 'subsample': 0.9461615378178644, 'min_samples_split': 14, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.750535452651408, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:00:56,421][0m Trial 749 finished with value: 0.8148138848785387 and parameters: {'learning_rate': 0.008527872571944437, 'n_estimators': 149, 'subsample': 0.9505006082957305, 'min_samples_split': 20, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7597800487940518, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:00:57,290][0m Trial 750 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.009245693491819008, 'n_estimators': 139, 'subsample': 0.9435265119875856, 'min_samples_split': 9, 'min_samples_leaf'

[32m[I 2021-12-21 21:01:21,187][0m Trial 770 finished with value: 0.8204381394764925 and parameters: {'learning_rate': 0.00965825398727561, 'n_estimators': 139, 'subsample': 0.9539554861839884, 'min_samples_split': 11, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7715592248647991, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:01:20,322][0m Trial 771 finished with value: 0.8238026489234824 and parameters: {'learning_rate': 0.009070389444516705, 'n_estimators': 133, 'subsample': 0.9451345684831808, 'min_samples_split': 10, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7667506084407588, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:01:21,230][0m Trial 772 finished with value: 0.8181783943255289 and parameters: {'learning_rate': 0.009142664228468454, 'n_estimators': 128, 'subsample': 0.8521802222378673, 'min_samples_split': 9, 'min_samples_leaf'

[32m[I 2021-12-21 21:01:40,795][0m Trial 792 finished with value: 0.8215554579122466 and parameters: {'learning_rate': 0.009394298565325466, 'n_estimators': 135, 'subsample': 0.9502126277129438, 'min_samples_split': 12, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7760354760189445, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:01:41,839][0m Trial 793 finished with value: 0.8215554579122466 and parameters: {'learning_rate': 0.009422859773380846, 'n_estimators': 139, 'subsample': 0.9600526043168139, 'min_samples_split': 11, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.775122898718492, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:01:42,753][0m Trial 794 finished with value: 0.8193208210407382 and parameters: {'learning_rate': 0.009747441134141725, 'n_estimators': 132, 'subsample': 0.9112414006061482, 'min_samples_split': 11, 'min_samples_leaf

[32m[I 2021-12-21 21:02:01,184][0m Trial 814 finished with value: 0.8204318624066287 and parameters: {'learning_rate': 0.009121686252234437, 'n_estimators': 136, 'subsample': 0.9434793890074566, 'min_samples_split': 10, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7601972596685124, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:02:02,122][0m Trial 815 finished with value: 0.8103571652752496 and parameters: {'learning_rate': 0.008614796906601432, 'n_estimators': 148, 'subsample': 0.9309324034087807, 'min_samples_split': 9, 'min_samples_leaf': 9, 'max_depth': 5, 'min_impurity_decrease': 0.7860097678341411, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:02:03,279][0m Trial 816 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.009525974945427668, 'n_estimators': 131, 'subsample': 0.957385984056856, 'min_samples_split': 12, 'min_samples_leaf'

[32m[I 2021-12-21 21:02:22,341][0m Trial 836 finished with value: 0.8125792480070304 and parameters: {'learning_rate': 0.009975945165795704, 'n_estimators': 126, 'subsample': 0.9353955579809877, 'min_samples_split': 12, 'min_samples_leaf': 11, 'max_depth': 7, 'min_impurity_decrease': 0.7890797615531784, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:02:23,285][0m Trial 837 finished with value: 0.8215554579122466 and parameters: {'learning_rate': 0.008857757604200217, 'n_estimators': 146, 'subsample': 0.9053639574640918, 'min_samples_split': 11, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7667948112679991, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:02:23,981][0m Trial 838 finished with value: 0.8170422446801833 and parameters: {'learning_rate': 0.008438849336458791, 'n_estimators': 152, 'subsample': 0.9689410812143809, 'min_samples_split': 9, 'min_samples_lea

[32m[I 2021-12-21 21:02:41,759][0m Trial 858 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.00840231085819277, 'n_estimators': 158, 'subsample': 0.9703909980823949, 'min_samples_split': 9, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7928592886435151, 'max_features': 17}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:02:42,745][0m Trial 859 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.008755757525525015, 'n_estimators': 151, 'subsample': 0.9426481932292541, 'min_samples_split': 16, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7728959585038317, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:02:43,896][0m Trial 860 finished with value: 0.7552758772205135 and parameters: {'learning_rate': 0.0018125318238350827, 'n_estimators': 137, 'subsample': 0.961442443497516, 'min_samples_split': 10, 'min_samples_leaf'

[32m[I 2021-12-21 21:03:01,999][0m Trial 880 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.009068152950058991, 'n_estimators': 137, 'subsample': 0.9525627344674001, 'min_samples_split': 11, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7723265632665813, 'max_features': 18}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:03:02,935][0m Trial 881 finished with value: 0.8193145439708743 and parameters: {'learning_rate': 0.009504828176727058, 'n_estimators': 139, 'subsample': 0.9634821768426818, 'min_samples_split': 11, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.784731438415291, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:03:03,976][0m Trial 882 finished with value: 0.8148389931579938 and parameters: {'learning_rate': 0.00830014181723378, 'n_estimators': 154, 'subsample': 0.9483081519276383, 'min_samples_split': 10, 'min_samples_leaf'

[32m[I 2021-12-21 21:03:21,971][0m Trial 902 finished with value: 0.8193082669010107 and parameters: {'learning_rate': 0.009542700268857552, 'n_estimators': 133, 'subsample': 0.9647064292004305, 'min_samples_split': 10, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7813017302985632, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:03:22,873][0m Trial 903 finished with value: 0.8204318624066287 and parameters: {'learning_rate': 0.009672165487628507, 'n_estimators': 129, 'subsample': 0.9742250583828858, 'min_samples_split': 12, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.783130259908266, 'max_features': 20}. Best is trial 213 with value: 0.8238026489234826.[0m
[32m[I 2021-12-21 21:03:23,779][0m Trial 904 finished with value: 0.8226790534178645 and parameters: {'learning_rate': 0.009323823374016896, 'n_estimators': 127, 'subsample': 0.9687490942281991, 'min_samples_split': 12, 'min_samples_leaf

[32m[I 2021-12-21 21:03:41,861][0m Trial 924 finished with value: 0.8249262444291006 and parameters: {'learning_rate': 0.009620110160365543, 'n_estimators': 135, 'subsample': 0.9699332268604093, 'min_samples_split': 12, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.7763728807219239, 'max_features': 20}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:03:42,712][0m Trial 925 finished with value: 0.8024919967359236 and parameters: {'learning_rate': 0.009668740774527543, 'n_estimators': 130, 'subsample': 0.9756570244684117, 'min_samples_split': 13, 'min_samples_leaf': 2, 'max_depth': 3, 'min_impurity_decrease': 0.803228896761399, 'max_features': 20}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:03:43,653][0m Trial 926 finished with value: 0.8204318624066287 and parameters: {'learning_rate': 0.00930509373941025, 'n_estimators': 137, 'subsample': 0.9640136845665945, 'min_samples_split': 11, 'min_samples_leaf'

[32m[I 2021-12-21 21:04:01,371][0m Trial 946 finished with value: 0.8024543343167411 and parameters: {'learning_rate': 0.0039172893223174765, 'n_estimators': 122, 'subsample': 0.9742780393020316, 'min_samples_split': 13, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7914854941543356, 'max_features': 20}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:04:02,255][0m Trial 947 finished with value: 0.8226790534178645 and parameters: {'learning_rate': 0.009248429682993721, 'n_estimators': 127, 'subsample': 0.9677939638153237, 'min_samples_split': 12, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7718577272503505, 'max_features': 20}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:04:03,198][0m Trial 948 finished with value: 0.8215554579122465 and parameters: {'learning_rate': 0.009053812042918358, 'n_estimators': 139, 'subsample': 0.9549511686842808, 'min_samples_split': 11, 'min_samples_le

[32m[I 2021-12-21 21:04:21,524][0m Trial 968 finished with value: 0.8204318624066286 and parameters: {'learning_rate': 0.009033628135985634, 'n_estimators': 141, 'subsample': 0.9530008962063653, 'min_samples_split': 11, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.7770863376452538, 'max_features': 19}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:04:22,407][0m Trial 969 finished with value: 0.8215554579122466 and parameters: {'learning_rate': 0.009778356520872888, 'n_estimators': 125, 'subsample': 0.9814299346234919, 'min_samples_split': 12, 'min_samples_leaf': 2, 'max_depth': 7, 'min_impurity_decrease': 0.782759070463546, 'max_features': 20}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:04:23,274][0m Trial 970 finished with value: 0.8170610758897746 and parameters: {'learning_rate': 0.009264197852050913, 'n_estimators': 130, 'subsample': 0.9341472623616297, 'min_samples_split': 12, 'min_samples_leaf

[32m[I 2021-12-21 21:04:39,963][0m Trial 990 finished with value: 0.8148389931579938 and parameters: {'learning_rate': 0.009743591365172312, 'n_estimators': 136, 'subsample': 0.9665818463404542, 'min_samples_split': 13, 'min_samples_leaf': 3, 'max_depth': 7, 'min_impurity_decrease': 0.8650366400987024, 'max_features': 19}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:04:40,866][0m Trial 991 finished with value: 0.8215554579122465 and parameters: {'learning_rate': 0.0094436062437568, 'n_estimators': 122, 'subsample': 0.9705801382560607, 'min_samples_split': 11, 'min_samples_leaf': 3, 'max_depth': 6, 'min_impurity_decrease': 0.7750278590621598, 'max_features': 20}. Best is trial 924 with value: 0.8249262444291006.[0m
[32m[I 2021-12-21 21:04:41,780][0m Trial 992 finished with value: 0.8215554579122465 and parameters: {'learning_rate': 0.009997650461490336, 'n_estimators': 139, 'subsample': 0.8777270194616824, 'min_samples_split': 11, 'min_samples_leaf'

In [None]:
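# Assumption: the tuned names (min_samples_split, min_impurity_decrease, max_features) are scikit-learn GradientBoostingClassifier parameters, not LightGBM ones.
from sklearn.ensemble import GradientBoostingClassifier
gbm = GradientBoostingClassifier(**study.best_params).fit(train[features], train['Survived'])
preds = gbm.predict(test[features]).astype(int)  # labels may be strings because 'Survived' was cast to str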

In [96]:
submission = pd.read_csv('data/submission.csv')

In [97]:
submission['Survived'] = np.abs(np.rint(preds)).astype(int)  # store integer 0/1 labels rather than floats

In [98]:
submission.to_csv("data/submission.csv", index=False)
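
Before uploading, it can help to sanity-check the predictions. The sketch below is not part of the original notebook: `to_labels`, `check_submission` and `write_submission` are hypothetical helper names, and the toy values are made up. The strict `> 0.5` threshold is an assumption chosen to agree with `np.rint` for probabilities in [0, 1], since `np.rint(0.5)` rounds to 0. It uses only the standard library so it runs without the competition data.

```python
import csv
import io


def to_labels(probabilities, threshold=0.5):
    """Map probabilities in [0, 1] to integer class labels 0 or 1."""
    return [1 if p > threshold else 0 for p in probabilities]


def check_submission(passenger_ids, labels):
    """Raise ValueError if the rows cannot form a valid two-column submission."""
    if len(passenger_ids) != len(labels):
        raise ValueError("one prediction is required per passenger")
    if len(set(passenger_ids)) != len(passenger_ids):
        raise ValueError("PassengerId values must be unique")
    if any(type(label) is not int or label not in (0, 1) for label in labels):
        raise ValueError("Survived must be the integer 0 or 1")


def write_submission(passenger_ids, labels, handle):
    """Write PassengerId,Survived rows to an open text handle."""
    check_submission(passenger_ids, labels)
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    writer.writerows(zip(passenger_ids, labels))


# Toy values, not taken from the competition data.
buffer = io.StringIO()
write_submission([892, 893, 894], to_labels([0.12, 0.81, 0.5]), buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the check separate from the real `data/submission.csv`; the same function accepts a file opened with `open(path, "w", newline="")`.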