An overview of boosting tree algorithms, their main differences, performance comparisons, and hyperparameter optimization. In this notebook, we will delve deeper into boosted trees, specifically comparing XGBoost, CatBoost, and LightGBM. We will explore their main differences, parameters in each algorithm, compare their performance on different datasets, assess their CPU and GPU usage, conduct Optuna optimization, and examine SHAP values.

- [This Notebook repository](https://github.com/joaomh/xgboost-catboost-lgbm)
- [Undergraduate GitHub repository](https://github.com/joaomh/study_boosting_optuna_USP_undergraduate_thesis)
- [Link to my Undergraduate thesis in PT-BR](https://bdta.abcd.usp.br/item/003122385)


# Tables of Content:

**1. [Introduction](#Introduction)**

**2. [Ensemble Learning](#Ensemble)**

**3. [AdaBoost](#Ada)** 

**4. [GBMs](#GBM)** 

**5. [XGBoost vs. CatBoost vs. LightGBM](#XGBoost)** 
- 5.1 [Tree Symmetry](#Tree)
- 5.2 [Splitting Method](Splitting)

**6. [Models Performance CPU vs GPU](#Models)** 

**7. [Optuna in XGBoost vs. CatBoost vs. LightGBM](#Optuna)** 

**8. [Shap in XGBoost vs. CatBoost vs. LightGBM](#Shap)** 

**9. [Conclusions](#Conclusion)**

**10. [Bibliography](#Bibliography)**


# Introduction

The purpose of this post is to introduce the fundamentals of boosting algorithms and the main difference between XGBoost, CatBoost and LightGBM. We will give the reader some necessary keys to well understand and use related methods and be able to design adapted solutions when needed.

If we look at the [2022 Kaggle Data Science & ML Survey](https://www.kaggle.com/kaggle-survey-2022), we can see that Gradient Boosting Machines (GBMs) have been widely used in recent years. They are supervised machine learning algorithms that have consistently produced excellent results across a wide range of problems and have won numerous machine learning competitions.


![png1](./img/kaggle_state.png)

They achieve this because boosting algorithms are very effective on tabular datasets and offer the same performance as other state-of-the-art deep learning techniques, but they are easier to implement and cost less in terms of computer resources.

# Ensemble 
Many machine learning models primarily aim for high prediction accuracy using a single model, where boosting algorithms strive to enhance predictions by sequentially training a series of weak models, with each model compensating for the weaknesses of its predecessors.

First of all we need to understand Ensemble Learning, it's based on the idea of combining several simpler prediction models (weak learner), training them for the same task, and producing from them a more complex grouped model (strong learner) that is the sum of its parts.

For example, when creating an ensemble model based on several decision trees, which are simple yet high-variance models (often considered 'weak learners'), we need to aggregate them to enhance their resistance to data variations. Therefore, it makes sense to train the trees separately, allowing each one to adapt to different parts of the dataset. This way, each tree gains knowledge about various data variations, collectively improving the ensemble's predictive performance.

There are various ensemble learning methods, but in this text, we will primarily focus on Boosting, which is used in GBMs, but we can mention three algorithms that aims at combining weak learners:

**Bagging**: It is generally done with homogeneous predictors, each one operating independently in relation to the others, in a parallel manner. The final algorithm is then constructed by aggregating the results obtained from the base models in some form of average. Random Forest is one of the most famous algorithm.

**Boosting**: Generally implemented with homogeneous predictors, applied sequentially where the posterior model depends on the predecessor, and then these models are combined in the final ensemble. GBMs work like this

**Stacking**: It is typically done with heterogeneous predictors, training them in parallel, and then combining their outputs by training a meta-model that generates predictions based on the predictions of the various weak models. Here we can for combine RandomForest with DecisionTree for example.

![png1](./img/boosting_bagging.png) [Image from Ensemble Learning: Bagging & Boosting](https://towardsdatascience.com/ensemble-learning-bagging-boosting-3098079e5422)

# AdaBoost
AdaBoost is a specific Boosting algorithm developed for classification problems hte original AdaBoost algorithm is designed for classification problems, where the output is either −1 or 1, and the final prediction for a given instance is a weighted sum of each generated weak classifier

$$
G(x) = sign\bigr[\sum^M_{m=1}\alpha_m\cdot G_m(x)\bigr]
$$
Here, the weights $\alpha_m$
are computed by the boosting algorithm, and the idea is to increase the influence of weak learners that are more accurate while simultaneously penalizing those that are not.
The weakness is identified by the weak estimator error rate
$$err_m = \frac{\sum_{i=1}^Nw_i\mathbf{I}(y_i\neq G_m(x_i))}{\sum_{i=1}^Nw_i}$$


1. Initialize the observation weights $w_i = 1/N, i = 1, 2, . . . , N .$
2. For $m=1$ to $M$:

    2.1. Fit a classifier $G_m(x)$ to the training data using weights $w_i$

    2.2. Compute $err_m = \frac{\sum_{i=1}^Nw_i\mathbf{1}(y_i\neq G_m(x_i))}{\sum_{i=1}^Nw_i}$

    2.3. Compute $\alpha_m = log((1-err_m)/err_m)$

    2.4. Set $w_i \rightarrow w_i\cdot exp[\alpha_m \cdot \mathbf{1}(y_i\neq G_m(x_i))],i=1,2,...,N$

3. Output $G(x) = sign\bigr[\sum^M_{m=1}\alpha_m\cdot G_m(x)\bigr]$

From [1][2]

![png](img/ada.png) [Marsh, Brendan (2016). Multivariate Analysis of the Vector Boson Fusion Higgs Boson](https://www.researchgate.net/publication/306054843_Multivariate_Analysis_of_the_Vector_Boson_Fusion_Higgs_Boson)

Scikit-Learn have a implementation of AdaBoost

In [1]:
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
clf.predict([[0, 0, 0, 0]])
clf.score(X, y)

0.983

In [2]:
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create an AdaBoostClassifier with a base DecisionTreeClassifier
clf = AdaBoostClassifier(n_estimators=100, random_state=0)

# Fit the classifier to the training data
clf.fit(X_train, y_train)

# Make predictions on the test data
y_pred = clf.predict(X_test)

# Evaluate the model
clf.score(X, y)

0.96

# GBMs
The Gradient Boosting Machines algorithm works by optimizing any given differentiable loss function, using gradient descent [3].

We can write de GBM model as 

$$F_M(x) = F_0(x) + \sum_{m=1}^MF_m(x)$$
$ \beta_mh(x; a_m)$ are the base functions learners, where $\beta_m$ is the weight, and $a_m$ the parameters of the learner $h$. And we have a loss function $L(y_i,F_m(x_i))$, so we would like to find all optimal values of this parameters that would minimize this loss funciton.
$$    \{\beta_m,\alpha_m\}_1^M = {\arg\min}_{\{\beta'_m,\alpha'_m\}_1^M}\sum_{i=1}^n L\Biggl(y^{(i)},\sum_{m=1}^M\beta'_mh(\mathbf{x}^{(i)};\alpha'_m)\Biggl)$$

In this situations where is infeasible we can try a 'greedy-stagewise' approach for $m=1,2,3,...,M$

$$(\beta_m,\alpha_m) = {\arg\min}_{\beta,\alpha}\sum_{i=1}^n L\Biggl(y^{(i)},F_{m-1}\mathbf{x}^{(i)} + \beta h(\mathbf{x}^{(i)};\alpha)\Biggl)$$
And then we can use a vectorized notation and make similar to the gradient descent formula. The learning rate, $\eta$ shrinks the influence of the new learner.
$$F_m(\mathbf{X}) = F_{m-1}(\mathbf{X}) + \eta \Delta_m(X)$$


The gradient of the loss function $L$ with relation to the last estimate $F_{m−1}(x)$ is,
$$-g_m(\mathbf{x}^{(i)}) = -\Bigg[\frac{\partial L(y^{(i)},c^{(i)})}{\partial F(\mathbf{x}^{(i)})}\Bigg]$$


Gradient of the loss function $L$ with respect to the last prediction is sometimes called pseudo-residual, and written as $r_{m−1}$ can be written as
$$\mathbf{r}_{m_1} = \nabla F_{m-1}(\mathbf{X})L(y,F_{m-1}(\mathbf{X})) = \nabla \hat{y}_{m-1}L(y,\hat{y}_{\mathbf{m-1}})$$


1. $F_0(\mathbf{X} = \arg\min_v\sum_{i=1}^n L(y^{(i)},v)$
2. For $m=1$ to $M$:

    2.1. $\mathbf{r}_{m_1} = \nabla \hat{y}_{m-1}L(y,\hat{y}_{\mathbf{m-1}})$ # Train a base learner minimizing squared error

    2.2. $\alpha = {\arg\min}_{\alpha,\beta}\sum_{i=1}^n(\mathbf{r}_{m-1}^{(i)}-\beta h(\mathbf{x}^{(i)};\alpha))^2$

    2.3. $\beta = {\arg\min}_{\beta}\sum_{i=1}^nL(y^{(i)},F_{m-1}(\mathbf{x}^{(i))}+\beta h(\mathbf{x}^{(i))};\alpha_m)$

    2.4. $\Delta_m(X) = \beta_mh(\mathbf{X};\alpha_m)$
    
    2.5 $F_m(\mathbf{X}) = F_{m-1}(\mathbf{X}) + \eta \Delta_m(X)$                                                              

3. Output $F_m$

From [3]


As you can see, it is an iterative algorithm that usually works with decision trees. We train a sequence of decision trees to gradually reduce the training error (each new tree tries to predict the residual error, this is the error at that current iteration and then we multiplied by the learning rate

![boost](img/boosting_tree.png)

As you can see the final prediction is:

initial_prediction + learning_rate*residual_0 + learning_rate*residual_1 _+ ... learning_rate*residual_N

Or

$F_m(\mathbf{X}) = F_{m-1}(\mathbf{X}) + \eta \Delta_m(X)$ 



We also can find Gradient Boosting function in scikit-learn

In [3]:
import pandas as pd
import numpy as np 
from sklearn.metrics import classification_report
from sklearn.model_selection import KFold
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
X = pd.DataFrame(load_breast_cancer()['data'],columns=load_breast_cancer()['feature_names'])
y = pd.DataFrame(load_breast_cancer()['target'],columns=['target'])
kf = KFold(n_splits=5,random_state=42,shuffle=True)
for train_index, index in kf.split(X):
    X_train,X_val = X.iloc[train_index],X.iloc[index],
    y_train,y_val = y.iloc[train_index],y.iloc[index],
gradient_booster = GradientBoostingClassifier()
gradient_booster.fit(X_train,y_train.values.ravel())
print(classification_report(y_val,gradient_booster.predict(X_val)))

              precision    recall  f1-score   support

           0       0.98      0.93      0.96        46
           1       0.96      0.99      0.97        67

    accuracy                           0.96       113
   macro avg       0.97      0.96      0.96       113
weighted avg       0.96      0.96      0.96       113



# XGBoost vs. CatBoost vs. LightGBM

XGBoost, Catboost, and LightGBM are all variations of gradient boosting algorithms, each employing decision trees as weak learners. I strongly recommend reading the papers [4], [5], [6]. Now, I'm going to highlight the main differences in each algorithm.

|   | XGBoost  | CatBoost  | LightGBM  |
|---|---|---|---|
|Developer |DMLC| Yandex| Microsoft|
|Release Year |2014 |2017 |2016|
|Tree Symmetry |Asymmetric: Level-wise tree growth |Symmetric |Asymmetric: Leaf-wise tree growth|
|Splitting Method |Pre-sorted and histogram-based algorithms |Greedy| GOSS|
|Categorical Columns |Support but must use numerical columns, cannot interpret ordinal category| Support |Support|
|Text Columns | Not Support| Support: Bag-of-Words, Naive-Bayes or BM25 to calculate numerical features from text| Not Support| 
|Missing Values | Handle | Handle |Handle|
|Training on|CPU and GPU|CPU and GPU|CPU and GPU|
|Others things| Works with Spark | Easy generate the Learning Curve |Have a RandomForest "boosting method"|

All of the models have different loss functions in their objectives, some of which are as follows:

For Regression:

* L2: mean squared error (default, recovers the mean value)
* L1: mean absolute error (good for outliers)
* MAPE: mean absolute percentage error (good for time series)
* Quantile: predict quantiles
* Poisson

For Classification:
* Logloss for binary classification
* Multiclass and cross-entropy for multi-class problems

For other loss functions, you can refer to the documentation of all three algorithms.


# Tree Symmetry
||||
|---|---|---|
![](https://github.com/joaomh/study_boosting_optuna_USP_undergraduate_thesis/blob/main/LaTeX-overleaf/images/CatBoost.png?raw=true)|![](https://github.com/joaomh/study_boosting_optuna_USP_undergraduate_thesis/blob/main/LaTeX-overleaf/images/XGboost.png?raw=true)|![](https://github.com/joaomh/study_boosting_optuna_USP_undergraduate_thesis/blob/main/LaTeX-overleaf/images/LGBM.png?raw=true)

CatBoost produces symmetric trees (or balanced trees). This refers to the splitting condition across all the nodes at the same depth. On the other hand, XGBoost and LightGBM produce asymmetric trees, meaning that the splitting condition at each node can be different.

Another important thing to note is that LightGBM grows leaf-wise (horizontally), while XGBoost grows level-wise (vertically). The picture below can show in more detail the differences in these growth types. This approach can lead to deeper trees with fewer nodes, potentially making it faster to train but may require more memory.

![](https://www.researchgate.net/publication/353155099/figure/fig2/AS:1044071766310913@1625937515739/Level-wise-vs-leaf-wise-tree-growth.png)

# Splitting Method
This determines how the splitting is determined in each algorithm.

In XGBoost, the pre-sorted algorithm considers all features and sorts them by value. The histogram algorithm groups feature values into discrete bins and finds the split point based on these bins. However, it is slower than GOSS.

CatBoost uses a greedy method where a list of possible candidates for feature splits is assigned to the leaf, and the split that results in the smallest penalty is selected.

In LightGBM, Gradient-based One-Side Sampling (GOSS) retains all the data with large gradients and performs random sampling for data instances with small gradients (small traning error). This results in fewer data instances used to train the model.

# Prevent Overfitting

All of the tree models come equipped with excellent parameters designed to mitigate overfitting. We will utilize many of these parameters in our Optuna hyperparameter optimization, some of them are:

**early_stopping_rounds:** This parameter employs an integer to halt the learning process. It identifies a point at which the validation score no longer improves, and in some cases, it may even start to deteriorate, while the training score continues to improve. This is not a hyperparameter that we intend to tune, but it's a crucial parameter to use, and it's not active by default.

**reg_alpha or lambda_l1:** These parameters represent the coefficient at the L1 regularization term of the cost function.

**reg_lambda or l2_leaf_reg:** These parameters represent the coefficient at the L2 regularization term of the cost function.

**learning_rate:** This setting is used to control the gradient step size and, in turn, affects the overall training time. Smaller values require more iterations for training.

**depth or max_depth:** This parameter limits the maximum depth of the tree model. It is employed to combat overfitting when dealing with small datasets.

**num_leaves or max_leaves:** The maximum number of leafs in the resulting tree

**random_strength:** This parameter determines the amount of randomness applied when scoring splits during the selection of the tree structure. You can adjust this parameter to mitigate the risk of overfitting in your model.

# Hyperparameter Tuning
As you can see all three libraries offer a variety of hyperparameters to tune, and their effectiveness can vary depending on the dataset. We will use Optuna in ours tests

# Models Performance CPU vs GPU

In this section, we are going to use three different datasets: epsilon, higgs, and breast cancer. However, we will not delve deeply into the typical steps of a data science project, such as EDA (Exploratory Data Analysis), pre-processing, handling missing values, plotting some variables, and analyzing correlations. Our primary focus will be the performance of out-of-the-box models, as they are designed to handle certain aspects by default, such as missing values.

## Epsilon
[Epsilon dataset](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#epsilon).
This dataset is best suited for binary classification.

The training dataset contains 400000 objects. Each object is described by 2001 columns. The first column contains the label value, all other columns contain numerical features.

The validation dataset contains 100000 objects. The structure is identical to the training dataset.


In [4]:
from catboost.datasets import epsilon
epsilon_train, epsilon_test = epsilon()

In [5]:
epsilon_train.shape, epsilon_test.shape

((400000, 2001), (100000, 2001))

In [6]:
# convert target -1 to 0
epsilon_train[epsilon_train[0] <= 0] = 0
epsilon_test[epsilon_test[0] <= 0] = 0

In [7]:
X_train = epsilon_train.loc[:,1:]
X_test = epsilon_test.loc[:,1:]
y_train = epsilon_train.loc[:,0]
y_test = epsilon_test.loc[:,0]
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((400000, 2000), (100000, 2000), (400000,), (100000,))

In [8]:
import timeit
from catboost import CatBoostClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

In [9]:
from sklearn.metrics import roc_auc_score
from sklearn import metrics
time_cpu = ['CPU']
results_auc_cpu = []
models = [XGBClassifier(),
          CatBoostClassifier(verbose=False),
          LGBMClassifier(verbose=-1)]
for i in range(len(models)):
    start = timeit.default_timer()
    model_ = models[i].fit(X_train, y_train)
    stop = timeit.default_timer()
    time_cpu.append(stop-start)
    y_prob = model_.predict_proba(X_test)[:,1]
    fpr, tpr, thresholds = metrics.roc_curve(y_test, y_prob)
    results_auc_cpu.append(metrics.auc(fpr, tpr))

In [10]:
time_cpu

['CPU', 82.3515075289979, 160.66510974999983, 26.330691929004388]

In [11]:
results_auc_cpu

[1.0, 1.0, 1.0]

In [12]:
models = [XGBClassifier(tree_method='gpu_hist'),
          CatBoostClassifier(verbose=False,task_type="GPU"),
          LGBMClassifier(verbose=-1,device='gpu')]

time_gpu = ['GPU']
results_auc_gpu = []
for i in range(len(models)):
    start = timeit.default_timer()
    model_ = models[i].fit(X_train, y_train)
    stop = timeit.default_timer()
    time_gpu.append(stop-start)
    y_prob = model_.predict_proba(X_test)[:,1]
    fpr, tpr, thresholds = metrics.roc_curve(y_test, y_prob)
    results_auc_gpu.append(metrics.auc(fpr, tpr))

In [13]:
time_gpu

['GPU', 7.428800901994691, 17.33331783200265, 11.865153876002296]

In [14]:
results_auc_gpu

[1.0, 1.0, 1.0]

## Higgs

[Higgs](https://archive.ics.uci.edu/dataset/280/higgs)
This is a classification problem to distinguish between a signal process which produces Higgs bosons and a background process which does not. 

The training dataset contains 10500000 objects. Each object is described by 29 columns. The first column contains the label value, all other columns contain numerical features.

The validation dataset contains 5000000 objects. The structure is identical to the training dataset.
Method call format

In [15]:
from catboost.datasets import higgs
higgs_train, higgs_test = higgs()

In [16]:
higgs_train.shape, higgs_test.shape

((10500000, 29), (500000, 29))

In [17]:
X_train = higgs_train.loc[:,1:]
X_test = higgs_test.loc[:,1:]
y_train = higgs_train.loc[:,0]
y_test = higgs_test.loc[:,0]
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((10500000, 28), (500000, 28), (10500000,), (500000,))

In [18]:
from sklearn.metrics import roc_auc_score
from sklearn import metrics
time_cpu = ['CPU']
results_auc_cpu = []
models = [XGBClassifier(),
          CatBoostClassifier(verbose=False),
          LGBMClassifier(verbose=-1)]
for i in range(len(models)):
    start = timeit.default_timer()
    model_ = models[i].fit(X_train, y_train)
    stop = timeit.default_timer()
    time_cpu.append(stop-start)
    y_prob = model_.predict_proba(X_test)[:,1]
    fpr, tpr, thresholds = metrics.roc_curve(y_test, y_prob)
    results_auc_cpu.append(metrics.auc(fpr, tpr))

In [19]:
time_cpu

['CPU', 175.2232390040008, 277.0851479319972, 15.447586549002153]

In [20]:
results_auc_cpu

[0.823429407021503, 0.8412804649425808, 0.8118326628507959]

In [21]:
models = [XGBClassifier(tree_method='gpu_hist'),
          CatBoostClassifier(verbose=False,task_type="GPU"),
          LGBMClassifier(verbose=-1,device='gpu')]

time_gpu = ['GPU']
results_auc_gpu = []
for i in range(len(models)):
    start = timeit.default_timer()
    model_ = models[i].fit(X_train, y_train)
    stop = timeit.default_timer()
    time_gpu.append(stop-start)
    y_prob = model_.predict_proba(X_test)[:,1]
    fpr, tpr, thresholds = metrics.roc_curve(y_test, y_prob)
    results_auc_gpu.append(metrics.auc(fpr, tpr))

In [22]:
time_gpu

['GPU', 5.057594018006057, 39.93262233299902, 13.239043381996453]

In [23]:
results_auc_gpu

[0.8237744245413271, 0.8106215247755848, 0.8118326583877677]

## Breast Cancer

[Breast Cancer](https://archive.ics.uci.edu/dataset/17/breast+cancer+wisconsin+diagnostic) 

The breast cancer dataset is a classic and very easy binary classification dataset.

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass.  They describe characteristics of the cell nuclei present in the image

In [24]:
from sklearn.datasets import load_breast_cancer
breast_cancer =  load_breast_cancer(as_frame=True).frame

In [25]:
from sklearn.model_selection import train_test_split
X = breast_cancer.drop(columns=['target'])
y = breast_cancer.loc[:,'target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((398, 30), (171, 30), (398,), (171,))

In [26]:
from sklearn.metrics import roc_auc_score
from sklearn import metrics
time_cpu = ['CPU']
results_auc_cpu = []
models = [XGBClassifier(),
          CatBoostClassifier(verbose=False),
          LGBMClassifier(verbose=-1)]
for i in range(len(models)):
    start = timeit.default_timer()
    model_ = models[i].fit(X_train, y_train)
    stop = timeit.default_timer()
    time_cpu.append(stop-start)
    y_prob = model_.predict_proba(X_test)[:,1]
    fpr, tpr, thresholds = metrics.roc_curve(y_test, y_prob)
    results_auc_cpu.append(metrics.auc(fpr, tpr))

In [27]:
time_cpu

['CPU', 0.01815694000106305, 0.9057804709955235, 0.025702106999233365]

In [28]:
results_auc_cpu

[0.9947089947089948, 0.9972075249853027, 0.9944150499706055]

In [29]:
models = [XGBClassifier(tree_method='gpu_hist'),
          CatBoostClassifier(verbose=False,task_type="GPU"),
          LGBMClassifier(verbose=-1,device='gpu')]

time_gpu = ['GPU']
results_auc_gpu = []
for i in range(len(models)):
    start = timeit.default_timer()
    model_ = models[i].fit(X_train, y_train)
    stop = timeit.default_timer()
    time_gpu.append(stop-start)
    y_prob = model_.predict_proba(X_test)[:,1]
    fpr, tpr, thresholds = metrics.roc_curve(y_test, y_prob)
    results_auc_gpu.append(metrics.auc(fpr, tpr))

In [30]:
time_gpu

['GPU', 0.05239353800425306, 16.459107301001495, 0.4470300850007334]

In [31]:
results_auc_gpu

[0.9938271604938271, 0.9977954144620811, 0.9954438565549677]

# Optuna

Now, let's attempt to utilize Optuna with the three algorithms, applying it to our largest dataset. We will employ early_stopping to determine the optimal number of iterations that minimizes the validation loss, and we'll also consider class weights using the 'balanced' option.

Another valuable aspect to explore is the use of sample weights, which can be passed as an array of shape n_samples. This feature proves exceptionally useful in applications such as churn modeling, where we aim to prevent the churn of high-value customers with greater profitability.

Here, we have a function that calculates numerous classification metrics. While our primary optimization focus will be on AUC, feel free to make adjustments as needed.

In [32]:
import sys
import timeit
import gc
from sklearn import metrics
import optuna
from optuna.visualization import plot_optimization_history
from optuna.visualization import plot_param_importances
from catboost import CatBoostClassifier
from xgboost import XGBClassifier
import lightgbm as lightgbm
from lightgbm import LGBMClassifier

def metrics_validation(y_test, y_prob):
    '''
    Input:
        y_prob: model predict prob
        y_test: target
    Output: Metrics of validation
        auc, ks, log_loss, accuracy
    '''
    fpr, tpr, thresholds = metrics.roc_curve(y_test, y_prob)
    auc = metrics.auc(fpr, tpr)
    log_loss = metrics.log_loss(y_test, y_prob)
    ks = max(tpr - fpr) # Kolmogorov-Smirnov
    accu = metrics.accuracy_score(y_test, y_prob.round())
    precision = metrics.precision_score(y_test, y_prob.round()) # tp / (tp + fp)
    recall = metrics.recall_score(y_test, y_prob.round()) # tp / (tp + fn)
    f1_score = metrics.f1_score(y_test, y_prob.round()) # 2 * (precision * recall) / (precision + recall)
    return auc, log_loss, ks, 

Creating our objective function and the set of hyperparameter space."

In [33]:
def objective(trial, X_train, y_train, X_test, y_test, balanced, method):
    '''
    Input:
        trial: trial of the test
        X_train:
        y_train:
        X_test:
        y_test:
        balanced:balanced or None
        method: XGBoost, CatBoost or LGBM
    Output: Metrics of validation
        auc, ks, log_loss
        metrics_validation(y_test, y_pred)[0]
    '''
    gc.collect()
    if method=='LGBM':
        param_grid = {'learning_rate': trial.suggest_float('learning_rate', 0.0001, 0.1, log=True),
                      'num_leaves': trial.suggest_int('num_leaves', 2, 256),
                      'lambda_l1': trial.suggest_float("lambda_l1", 1e-8, 10.0, log=True),
                      'lambda_l2': trial.suggest_float("lambda_l2", 1e-8, 10.0, log=True),
                      'min_data_in_leaf': trial.suggest_int('min_data_in_leaf', 5, 100),
                      'max_depth': trial.suggest_int('max_depth', 5, 64),
                      'feature_fraction': trial.suggest_float("feature_fraction", 0.4, 1.0),
                      'bagging_fraction': trial.suggest_float("bagging_fraction", 0.4, 1.0),
                      'device':'gpu',
                      'bagging_freq': trial.suggest_int("bagging_freq", 1, 7),
  
                     }
        model = LGBMClassifier(**param_grid)

        print('LGBM - Optimization using optuna')
        model.fit(X_train, y_train)
        
        y_pred = model.predict_proba(X_test)[:,1]

    if method=='CATBoost':
        param_grid = {'learning_rate': trial.suggest_float('learning_rate', 0.0001, 0.1, log=True),
                      'depth': trial.suggest_int("depth", 4, 10),
                      'max_bin': trial.suggest_int('max_bin', 200, 400),
                      'min_data_in_leaf': trial.suggest_int('min_data_in_leaf', 1, 300),
                      'l2_leaf_reg': trial.suggest_float('l2_leaf_reg', 1e-8, 10, log = True),
                      'random_seed': 42,
                      'random_strength': trial.suggest_float("random_strength", 1e-8, 10.0, log=True),
                      'bagging_temperature': trial.suggest_float("bagging_temperature", 0.0, 10.0),
                      'od_type': trial.suggest_categorical("od_type", ["IncToDec", "Iter"]),
                      'task_type':'GPU',
                      'od_wait': trial.suggest_int("od_wait", 10, 50),
                     }
        if len(X_train._get_numeric_data().columns) != len(X_train.columns):
            categorical_features_indices = list(X_train.select_dtypes(exclude='number').columns)
            model = CatBoostClassifier(**param_grid)
            print('CATBoost - Optimization using optuna')
            model.fit(X_train, y_train,cat_features=categorical_features_indices,verbose=False)
            y_pred = model.predict_proba(X_test)[:,1]
        else:
            model = CatBoostClassifier(**param_grid)
            print('CATBoost - Optimization using optuna')
            model.fit(X_train, y_train,verbose=False)
            y_pred = model.predict_proba(X_test)[:,1]
        
    if method=='XGBoost':
        param_grid = {'learning_rate': trial.suggest_float('learning_rate', 0.0001, 0.1, log=True),
                      'max_depth': trial.suggest_int('max_depth', 3, 16),
                      'min_child_weight': trial.suggest_int('min_child_weight', 1, 300),
                      'gamma': trial.suggest_float('gamma', 1e-8, 1.0, log = True),
                      'alpha': trial.suggest_float('alpha', 1e-8, 1.0, log = True),
                      'lambda': trial.suggest_float('lambda', 0.0001, 10.0, log = True),
                      'colsample_bytree': trial.suggest_float('colsample_bytree', 0.1, 0.8),
                      'booster': 'gbtree',
                      'tree_method':'gpu_hist',
                      'random_state': 42,
                     }
        model = XGBClassifier(**param_grid)
        print('XGBoost - Optimization using optuna')
        model.fit(X_train, y_train,verbose=False)
        y_pred = model.predict_proba(X_test)[:,1]
    
    auc_res, log_loss_res, ks_res = metrics_validation(y_test, y_pred)
    print('auc:'+str(auc_res),', log_loss:'+str(log_loss_res),', ks:'+str(ks_res))
    return metrics_validation(y_test, y_pred)[0]

Tuning the model: Here, the study will be created, and an important aspect to note is the **time_max_tuning,** which represents the maximum time in seconds to stop.

In [37]:
def tuning(X_train, y_train, X_test, y_test, balanced, method):
    '''
    Input:
        trial: 
        x_train:
        y_train:
        X_test:
        y_test:
        balanced:balanced or not balanced
        method: XGBoost, CatBoost or LGBM
    Output: Metrics of validation
        auc, ks, log_loss
        metrics_validation(y_test, y_pred)[0]
    '''
    study = optuna.create_study(direction='maximize', study_name=method+' Classifier')
    func = lambda trial: objective(trial, X_train, y_train, X_test, y_test, balanced, method)
    print('Starting the optimization')
    time_max_tuning = 15*60 # max time in seconds to stop
    study.optimize(func, timeout=time_max_tuning)
    return study

Train the model while implementing **early_stopping**, and then return the best model.

In [35]:
def train(X_train, y_train, X_test, y_test, balanced, method):
    '''
    Input:
        X_train:
        y_train:
        X_test:
        y_test:
        balanced:balanced or None
        method: XGBoost, CatBoost or LGBM
    Output: predict model
    '''
    print('Tuning')
    study = tuning(X_train, y_train, X_test, y_test, balanced, method)
    if method=='LGBM':
        model = LGBMClassifier(**study.best_params)
        print('Last Fit')
        model.fit(X_train, y_train, eval_set=[(X_test,y_test)],
                 callbacks = [lightgbm.early_stopping(stopping_rounds=100), lightgbm.log_evaluation(period=5000)])
    if method=='CATBoost':
        model = CatBoostClassifier(**study.best_params)
        if len(X_train._get_numeric_data().columns) != len(X_train.columns):
            categorical_features_indices = list(X_train.select_dtypes(exclude='number').columns)
            print('Last Fit')
            model.fit(X_train, y_train,cat_features=categorical_features_indices, eval_set=[(X_test,y_test)],
                 early_stopping_rounds=100,verbose = False)
        else:
            print('Last Fit')
            model.fit(X_train, y_train, eval_set=[(X_test,y_test)],
                 early_stopping_rounds=100,verbose = False)
    if method=='XGBoost':
        model = XGBClassifier(**study.best_params)
        print('Last Fit')
        model.fit(X_train, y_train, eval_set=[(X_test,y_test)],
                 early_stopping_rounds=100,verbose = False)
    return model, study

In [36]:
X_train = higgs_train.loc[:,1:]
X_test = higgs_test.loc[:,1:]
y_train = higgs_train.loc[:,0]
y_test = higgs_test.loc[:,0]
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((10500000, 28), (500000, 28), (10500000,), (500000,))

In [38]:
xgb_model, study_lgbm = train(X_train, y_train, X_test, y_test, balanced='balanced', method='XGBoost')

[I 2023-10-02 18:19:46,797] A new study created in memory with name: XGBoost Classifier


Tuning
Starting the optimization
XGBoost - Optimization using optuna
auc:0.7823937789568013 , log_loss:0.5783660516621543 , ks:0.4161117447746004


[I 2023-10-02 18:19:54,363] Trial 0 finished with value: 0.7823937789568013 and parameters: {'learning_rate': 0.05371518813357488, 'max_depth': 5, 'min_child_weight': 175, 'gamma': 0.0035416020979473044, 'alpha': 1.8148619430935994e-05, 'lambda': 0.0004628148705354307, 'colsample_bytree': 0.24118386631381142}. Best is trial 0 with value: 0.7823937789568013.


XGBoost - Optimization using optuna
auc:0.7642073236693494 , log_loss:0.6704264957156165 , ks:0.38831003181042867


[I 2023-10-02 18:20:01,994] Trial 1 finished with value: 0.7642073236693494 and parameters: {'learning_rate': 0.0022732346204839453, 'max_depth': 5, 'min_child_weight': 22, 'gamma': 0.0009470349674428265, 'alpha': 1.1868156471657653e-07, 'lambda': 0.6074789905304752, 'colsample_bytree': 0.5526828609424803}. Best is trial 0 with value: 0.7823937789568013.


XGBoost - Optimization using optuna
auc:0.8198535618291272 , log_loss:0.5309685602510851 , ks:0.4765444966127411


[I 2023-10-02 18:20:13,733] Trial 2 finished with value: 0.8198535618291272 and parameters: {'learning_rate': 0.02608567975142055, 'max_depth': 11, 'min_child_weight': 59, 'gamma': 0.0007491402407264414, 'alpha': 2.4029817414900412e-05, 'lambda': 0.0001127104924026119, 'colsample_bytree': 0.7379012814553924}. Best is trial 2 with value: 0.8198535618291272.


XGBoost - Optimization using optuna
auc:0.7917025801503007 , log_loss:0.6301449620067747 , ks:0.4306073105347797


[I 2023-10-02 18:20:22,395] Trial 3 finished with value: 0.7917025801503007 and parameters: {'learning_rate': 0.005217966681059962, 'max_depth': 8, 'min_child_weight': 274, 'gamma': 4.073036354235151e-07, 'alpha': 0.001273442926506005, 'lambda': 1.5785163951806644, 'colsample_bytree': 0.6975204824296858}. Best is trial 2 with value: 0.8198535618291272.


XGBoost - Optimization using optuna
auc:0.7497758238190272 , log_loss:0.6915410864330644 , ks:0.35388767041989394


[I 2023-10-02 18:20:39,348] Trial 4 finished with value: 0.7497758238190272 and parameters: {'learning_rate': 0.00033796972118804345, 'max_depth': 15, 'min_child_weight': 274, 'gamma': 6.517641349767847e-08, 'alpha': 0.009016995617355018, 'lambda': 0.2637332153348816, 'colsample_bytree': 0.16041514843700955}. Best is trial 2 with value: 0.8198535618291272.


XGBoost - Optimization using optuna
auc:0.8109771061946858 , log_loss:0.5693109034195492 , ks:0.4616117789172391


[I 2023-10-02 18:20:50,817] Trial 5 finished with value: 0.8109771061946858 and parameters: {'learning_rate': 0.014987524294242613, 'max_depth': 11, 'min_child_weight': 131, 'gamma': 0.05075632241445482, 'alpha': 4.6291840218966005e-05, 'lambda': 0.00204354534381981, 'colsample_bytree': 0.5715332899722632}. Best is trial 2 with value: 0.8198535618291272.


XGBoost - Optimization using optuna
auc:0.7618527085986158 , log_loss:0.6849587484900231 , ks:0.37722010937860545


[I 2023-10-02 18:21:03,474] Trial 6 finished with value: 0.7618527085986158 and parameters: {'learning_rate': 0.001400767331552854, 'max_depth': 12, 'min_child_weight': 123, 'gamma': 1.0711195624987995e-07, 'alpha': 1.1234980872440672e-05, 'lambda': 0.00343975297527974, 'colsample_bytree': 0.18654226297928445}. Best is trial 2 with value: 0.8198535618291272.


XGBoost - Optimization using optuna
auc:0.7991935978625836 , log_loss:0.6890809606387004 , ks:0.44216887165139074


[I 2023-10-02 18:21:15,235] Trial 7 finished with value: 0.7991935978625836 and parameters: {'learning_rate': 0.00025118191507558503, 'max_depth': 11, 'min_child_weight': 45, 'gamma': 4.6540418799874446e-07, 'alpha': 0.48724054977227305, 'lambda': 1.4999507111959365, 'colsample_bytree': 0.5757643885922734}. Best is trial 2 with value: 0.8198535618291272.


XGBoost - Optimization using optuna
auc:0.7527912489135099 , log_loss:0.6801087082021309 , ks:0.35929397084241765


[I 2023-10-02 18:21:37,730] Trial 8 finished with value: 0.7527912489135099 and parameters: {'learning_rate': 0.0029270149411113593, 'max_depth': 16, 'min_child_weight': 149, 'gamma': 3.552544594583368e-08, 'alpha': 0.00014584099954033188, 'lambda': 0.24531593175596997, 'colsample_bytree': 0.15472358750979268}. Best is trial 2 with value: 0.8198535618291272.


XGBoost - Optimization using optuna
auc:0.8259640287634815 , log_loss:0.5165477435408504 , ks:0.48729112441413747


[I 2023-10-02 18:21:53,662] Trial 9 finished with value: 0.8259640287634815 and parameters: {'learning_rate': 0.05556019785954943, 'max_depth': 14, 'min_child_weight': 295, 'gamma': 0.007876761340863507, 'alpha': 1.9364328486421596e-08, 'lambda': 4.591795485652263, 'colsample_bytree': 0.4519080169255252}. Best is trial 9 with value: 0.8259640287634815.


XGBoost - Optimization using optuna
auc:0.8267147649750772 , log_loss:0.5134224978497866 , ks:0.48896068693450406


[I 2023-10-02 18:22:10,239] Trial 10 finished with value: 0.8267147649750772 and parameters: {'learning_rate': 0.08139036293259214, 'max_depth': 14, 'min_child_weight': 217, 'gamma': 0.8880785730434243, 'alpha': 1.0596805438224355e-08, 'lambda': 8.51454196123011, 'colsample_bytree': 0.3720851222139315}. Best is trial 10 with value: 0.8267147649750772.


XGBoost - Optimization using optuna
auc:0.827227177242662 , log_loss:0.5123200712796782 , ks:0.48983489732236957


[I 2023-10-02 18:22:26,054] Trial 11 finished with value: 0.827227177242662 and parameters: {'learning_rate': 0.08486464526544805, 'max_depth': 14, 'min_child_weight': 221, 'gamma': 0.87974626784109, 'alpha': 1.4140881074174516e-08, 'lambda': 7.685929809523991, 'colsample_bytree': 0.3687155058627552}. Best is trial 11 with value: 0.827227177242662.


XGBoost - Optimization using optuna
auc:0.8219489777036827 , log_loss:0.5219580258794635 , ks:0.48040548348408874


[I 2023-10-02 18:22:42,158] Trial 12 finished with value: 0.8219489777036827 and parameters: {'learning_rate': 0.07166052737281897, 'max_depth': 14, 'min_child_weight': 223, 'gamma': 0.45980386133205564, 'alpha': 1.5537596369520426e-08, 'lambda': 7.933546011793787, 'colsample_bytree': 0.3406093318746448}. Best is trial 11 with value: 0.827227177242662.


XGBoost - Optimization using optuna
auc:0.7897127681422361 , log_loss:0.6016702913874707 , ks:0.42667532117238055


[I 2023-10-02 18:22:50,658] Trial 13 finished with value: 0.7897127681422361 and parameters: {'learning_rate': 0.016949959114790632, 'max_depth': 8, 'min_child_weight': 212, 'gamma': 0.4662137689680311, 'alpha': 5.731958885402894e-07, 'lambda': 8.255365008653042, 'colsample_bytree': 0.347831266042493}. Best is trial 11 with value: 0.827227177242662.


XGBoost - Optimization using optuna
auc:0.8167254385157039 , log_loss:0.5301925852154598 , ks:0.4710839368936671


[I 2023-10-02 18:23:04,601] Trial 14 finished with value: 0.8167254385157039 and parameters: {'learning_rate': 0.07030194869311346, 'max_depth': 13, 'min_child_weight': 228, 'gamma': 0.748335293406293, 'alpha': 5.385245287373647e-07, 'lambda': 0.05247428326378511, 'colsample_bytree': 0.31518384335159233}. Best is trial 11 with value: 0.827227177242662.


XGBoost - Optimization using optuna
auc:0.8335408696397085 , log_loss:0.5021885635034686 , ks:0.5004045579673235


[I 2023-10-02 18:23:25,555] Trial 15 finished with value: 0.8335408696397085 and parameters: {'learning_rate': 0.0982584552406863, 'max_depth': 16, 'min_child_weight': 180, 'gamma': 0.05260267726799461, 'alpha': 1.4729498676309783e-08, 'lambda': 9.424926522981938, 'colsample_bytree': 0.42323510419271393}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8197329385966945 , log_loss:0.5457622564206894 , ks:0.47650922886546204


[I 2023-10-02 18:23:54,653] Trial 16 finished with value: 0.8197329385966945 and parameters: {'learning_rate': 0.02514792228490469, 'max_depth': 16, 'min_child_weight': 101, 'gamma': 0.04280466381651527, 'alpha': 7.461259530348864e-07, 'lambda': 1.7306010237902931, 'colsample_bytree': 0.45472849967155493}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8121309633427647 , log_loss:0.5344949472884848 , ks:0.46435731050915346


[I 2023-10-02 18:24:17,928] Trial 17 finished with value: 0.8121309633427647 and parameters: {'learning_rate': 0.0952593705466923, 'max_depth': 16, 'min_child_weight': 181, 'gamma': 5.598583139201215e-05, 'alpha': 1.7688359929053663e-07, 'lambda': 0.08582451076108771, 'colsample_bytree': 0.23719935773777753}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.751420024724527 , log_loss:0.6359552958993241 , ks:0.3688318182193998


[I 2023-10-02 18:24:24,946] Trial 18 finished with value: 0.751420024724527 and parameters: {'learning_rate': 0.011495299261571968, 'max_depth': 3, 'min_child_weight': 185, 'gamma': 0.04516301362283347, 'alpha': 7.452184123053356e-08, 'lambda': 0.02124618540681466, 'colsample_bytree': 0.43954336715591813}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.7563119754356102 , log_loss:0.624718831675335 , ks:0.3703855377556188


[I 2023-10-02 18:24:34,181] Trial 19 finished with value: 0.7563119754356102 and parameters: {'learning_rate': 0.03393446400395561, 'max_depth': 9, 'min_child_weight': 91, 'gamma': 5.0187152254910854e-05, 'alpha': 2.4896975995898982e-06, 'lambda': 2.485962186414188, 'colsample_bytree': 0.11050525835624464}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.7863348232012412 , log_loss:0.6383280271914523 , ks:0.4207975149605612


[I 2023-10-02 18:24:48,833] Trial 20 finished with value: 0.7863348232012412 and parameters: {'learning_rate': 0.00864837565082739, 'max_depth': 13, 'min_child_weight': 250, 'gamma': 0.08178379581711187, 'alpha': 2.9959675622767465e-06, 'lambda': 0.8557730460526317, 'colsample_bytree': 0.26874146357661405}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8279949376268823 , log_loss:0.5109198398219515 , ks:0.49098380402506714


[I 2023-10-02 18:25:04,992] Trial 21 finished with value: 0.8279949376268823 and parameters: {'learning_rate': 0.08884895795357545, 'max_depth': 14, 'min_child_weight': 198, 'gamma': 0.5693858267390489, 'alpha': 1.0651095173699517e-08, 'lambda': 9.78874174113668, 'colsample_bytree': 0.3910039890228238}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.816360488847377 , log_loss:0.5434453392206499 , ks:0.4705477189303823


[I 2023-10-02 18:25:25,130] Trial 22 finished with value: 0.816360488847377 and parameters: {'learning_rate': 0.035617563330533744, 'max_depth': 15, 'min_child_weight': 187, 'gamma': 0.1489146352809749, 'alpha': 5.5145020332101146e-08, 'lambda': 3.806562457366596, 'colsample_bytree': 0.3879139322085034}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8090317147537218 , log_loss:0.5511324730093403 , ks:0.4581908987466445


[I 2023-10-02 18:25:39,573] Trial 23 finished with value: 0.8090317147537218 and parameters: {'learning_rate': 0.042592033139970696, 'max_depth': 13, 'min_child_weight': 250, 'gamma': 0.17625659389717827, 'alpha': 1.0165742664304093e-08, 'lambda': 3.542740239669613, 'colsample_bytree': 0.29551974625587923}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8327426176787638 , log_loss:0.5036474474493999 , ks:0.4992704128906933


[I 2023-10-02 18:26:00,519] Trial 24 finished with value: 0.8327426176787638 and parameters: {'learning_rate': 0.09393093638104298, 'max_depth': 15, 'min_child_weight': 155, 'gamma': 0.8262248091956066, 'alpha': 4.419055291085465e-08, 'lambda': 9.58658503933874, 'colsample_bytree': 0.40635168537961946}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8194340086251113 , log_loss:0.5442020207716302 , ks:0.4756509400575171


[I 2023-10-02 18:26:21,770] Trial 25 finished with value: 0.8194340086251113 and parameters: {'learning_rate': 0.024625985930555278, 'max_depth': 15, 'min_child_weight': 158, 'gamma': 0.015092289627596235, 'alpha': 7.636507011820832e-08, 'lambda': 9.527451126686696, 'colsample_bytree': 0.4931111196740595}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.815071263713613 , log_loss:0.5388885324978625 , ks:0.46865206895731554


[I 2023-10-02 18:26:45,034] Trial 26 finished with value: 0.815071263713613 and parameters: {'learning_rate': 0.05164152573661999, 'max_depth': 16, 'min_child_weight': 153, 'gamma': 0.24628113044165342, 'alpha': 1.5469246996386793e-07, 'lambda': 0.5032395295396335, 'colsample_bytree': 0.2959859454826089}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8182331605348393 , log_loss:0.5339191355107057 , ks:0.4741393087066892


[I 2023-10-02 18:26:58,206] Trial 27 finished with value: 0.8182331605348393 and parameters: {'learning_rate': 0.04244059596725235, 'max_depth': 12, 'min_child_weight': 131, 'gamma': 0.12344084351916261, 'alpha': 6.468881952112836e-08, 'lambda': 2.264934896550536, 'colsample_bytree': 0.41033836464420814}. Best is trial 15 with value: 0.8335408696397085.


XGBoost - Optimization using optuna
auc:0.8365957634913157 , log_loss:0.49712276241063763 , ks:0.5066125694062924


[I 2023-10-02 18:27:16,754] Trial 28 finished with value: 0.8365957634913157 and parameters: {'learning_rate': 0.08684579563106676, 'max_depth': 15, 'min_child_weight': 197, 'gamma': 0.0184790670508487, 'alpha': 3.800980410024799e-08, 'lambda': 0.8947997949504739, 'colsample_bytree': 0.5074143246162668}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8318858317072431 , log_loss:0.5081083043257892 , ks:0.4978697804688742


[I 2023-10-02 18:27:40,643] Trial 29 finished with value: 0.8318858317072431 and parameters: {'learning_rate': 0.05078400325688043, 'max_depth': 15, 'min_child_weight': 100, 'gamma': 0.013770664402295362, 'alpha': 4.3709578222154863e-07, 'lambda': 0.9116619501033274, 'colsample_bytree': 0.5157032269222086}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8106967353884293 , log_loss:0.5560608961373376 , ks:0.46107280781143856


[I 2023-10-02 18:27:51,094] Trial 30 finished with value: 0.8106967353884293 and parameters: {'learning_rate': 0.01904177266034532, 'max_depth': 10, 'min_child_weight': 171, 'gamma': 0.0023159447708256038, 'alpha': 1.6213596570434002e-06, 'lambda': 2.751885936785615, 'colsample_bytree': 0.6241619178615433}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8336733315278403 , log_loss:0.5040214265351642 , ks:0.5008637587546162


[I 2023-10-02 18:28:14,720] Trial 31 finished with value: 0.8336733315278403 and parameters: {'learning_rate': 0.05859908838840551, 'max_depth': 15, 'min_child_weight': 99, 'gamma': 0.017507575071512363, 'alpha': 3.4024479889353074e-07, 'lambda': 1.0053353450075297, 'colsample_bytree': 0.5018851131266177}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8104782815359604 , log_loss:0.5301213301310123 , ks:0.4618456937667317


[I 2023-10-02 18:28:22,322] Trial 32 finished with value: 0.8104782815359604 and parameters: {'learning_rate': 0.09726166326542013, 'max_depth': 6, 'min_child_weight': 73, 'gamma': 0.02078003054271481, 'alpha': 3.588314294359692e-08, 'lambda': 0.44592228097726433, 'colsample_bytree': 0.48559027067508653}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8333973808532669 , log_loss:0.505165774739421 , ks:0.5001468903202523


[I 2023-10-02 18:28:45,939] Trial 33 finished with value: 0.8333973808532669 and parameters: {'learning_rate': 0.05138252640803841, 'max_depth': 16, 'min_child_weight': 166, 'gamma': 0.004393129986802058, 'alpha': 1.7773139783425032e-07, 'lambda': 1.1754077515778456, 'colsample_bytree': 0.536470183255995}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8296890032954447 , log_loss:0.5212498064452225 , ks:0.4934674189423774


[I 2023-10-02 18:29:27,859] Trial 34 finished with value: 0.8296890032954447 and parameters: {'learning_rate': 0.03202185699502544, 'max_depth': 16, 'min_child_weight': 18, 'gamma': 0.003551329375131336, 'alpha': 2.1980387866704266e-07, 'lambda': 1.22460066233145, 'colsample_bytree': 0.5321441573011773}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8297780944796264 , log_loss:0.5084331644700811 , ks:0.4939455704534717


[I 2023-10-02 18:29:40,958] Trial 35 finished with value: 0.8297780944796264 and parameters: {'learning_rate': 0.05660467890195708, 'max_depth': 12, 'min_child_weight': 129, 'gamma': 0.0017092866931638449, 'alpha': 1.8766105933057282e-07, 'lambda': 0.24472123728729495, 'colsample_bytree': 0.5905901567341558}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8265882928238798 , log_loss:0.5258422092088089 , ks:0.48829695704684695


[I 2023-10-02 18:30:04,626] Trial 36 finished with value: 0.8265882928238798 and parameters: {'learning_rate': 0.025945704642234568, 'max_depth': 16, 'min_child_weight': 200, 'gamma': 0.0005241700516744424, 'alpha': 4.191869973676395e-08, 'lambda': 0.741446760246208, 'colsample_bytree': 0.6223771586617618}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8346973942417686 , log_loss:0.502697965095643 , ks:0.5022779725402662


[I 2023-10-02 18:30:31,634] Trial 37 finished with value: 0.8346973942417686 and parameters: {'learning_rate': 0.05886617503724109, 'max_depth': 15, 'min_child_weight': 48, 'gamma': 0.0050484119621108225, 'alpha': 5.171893386424542e-06, 'lambda': 1.5188226305471852, 'colsample_bytree': 0.5292817285870244}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.82316421660339 , log_loss:0.5266326644421359 , ks:0.4831728976882023


[I 2023-10-02 18:30:48,684] Trial 38 finished with value: 0.82316421660339 and parameters: {'learning_rate': 0.03703959463208744, 'max_depth': 13, 'min_child_weight': 37, 'gamma': 0.027765700621110626, 'alpha': 7.152957220158352e-06, 'lambda': 4.465134829328522, 'colsample_bytree': 0.4959201170347567}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8363085432613456 , log_loss:0.49944906238045417 , ks:0.5058456517076041


[I 2023-10-02 18:31:13,697] Trial 39 finished with value: 0.8363085432613456 and parameters: {'learning_rate': 0.062024269117272095, 'max_depth': 15, 'min_child_weight': 69, 'gamma': 0.006612161537792014, 'alpha': 4.317327071993851e-06, 'lambda': 0.5348521428111821, 'colsample_bytree': 0.5584822111487052}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8193480910633058 , log_loss:0.5934884887002937 , ks:0.47643217510227165


[I 2023-10-02 18:31:41,912] Trial 40 finished with value: 0.8193480910633058 and parameters: {'learning_rate': 0.007636871922544521, 'max_depth': 15, 'min_child_weight': 69, 'gamma': 0.007660813718897632, 'alpha': 2.0127476631307534e-05, 'lambda': 0.30688724105019954, 'colsample_bytree': 0.660684805447978}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8355550200411989 , log_loss:0.5008902733117271 , ks:0.5046730617695612


[I 2023-10-02 18:32:09,087] Trial 41 finished with value: 0.8355550200411989 and parameters: {'learning_rate': 0.0628483128148, 'max_depth': 15, 'min_child_weight': 41, 'gamma': 0.053880697773627785, 'alpha': 5.361330942731109e-06, 'lambda': 1.6050718969092113, 'colsample_bytree': 0.5152501658030217}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8357541783848894 , log_loss:0.4998031271609703 , ks:0.5049529573837275


[I 2023-10-02 18:32:30,429] Trial 42 finished with value: 0.8357541783848894 and parameters: {'learning_rate': 0.06532516367964265, 'max_depth': 14, 'min_child_weight': 39, 'gamma': 0.000545933619310899, 'alpha': 4.868643305703409e-06, 'lambda': 1.5705579533748877, 'colsample_bytree': 0.54638663793835}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8356856474000139 , log_loss:0.5000087950834632 , ks:0.5049910844229542


[I 2023-10-02 18:32:53,299] Trial 43 finished with value: 0.8356856474000139 and parameters: {'learning_rate': 0.0645240919878397, 'max_depth': 14, 'min_child_weight': 36, 'gamma': 0.0005915588764368326, 'alpha': 4.6332218108276404e-05, 'lambda': 1.5776597104884087, 'colsample_bytree': 0.5508978037509017}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8218543768229822 , log_loss:0.5483685802565119 , ks:0.48062336076309153


[I 2023-10-02 18:33:18,274] Trial 44 finished with value: 0.8218543768229822 and parameters: {'learning_rate': 0.019078975200334167, 'max_depth': 14, 'min_child_weight': 4, 'gamma': 0.0005108147741057644, 'alpha': 6.008708223797764e-05, 'lambda': 0.653087504639324, 'colsample_bytree': 0.5677950742489165}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8281706580412338 , log_loss:0.509864209851945 , ks:0.4910528510676372


[I 2023-10-02 18:33:29,702] Trial 45 finished with value: 0.8281706580412338 and parameters: {'learning_rate': 0.06404163070449874, 'max_depth': 11, 'min_child_weight': 25, 'gamma': 0.000161949333377938, 'alpha': 1.2338382244640714e-05, 'lambda': 1.9765351024325604, 'colsample_bytree': 0.5570409320525556}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8218725106520962 , log_loss:0.5262392211915061 , ks:0.4808516545410377


[I 2023-10-02 18:33:43,324] Trial 46 finished with value: 0.8218725106520962 and parameters: {'learning_rate': 0.03999676894566501, 'max_depth': 12, 'min_child_weight': 57, 'gamma': 0.0009239834249527471, 'alpha': 3.5986509533078894e-05, 'lambda': 0.42039825737969766, 'colsample_bytree': 0.47288282153825767}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8358892952610263 , log_loss:0.4985596610338678 , ks:0.504317030893025


[I 2023-10-02 18:34:01,070] Trial 47 finished with value: 0.8358892952610263 and parameters: {'learning_rate': 0.07220505422462492, 'max_depth': 13, 'min_child_weight': 2, 'gamma': 0.0019178257675897129, 'alpha': 0.00032803762550415416, 'lambda': 0.15764123010018016, 'colsample_bytree': 0.595978714789866}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8355330731565783 , log_loss:0.4992249990597801 , ks:0.5040972956778891


[I 2023-10-02 18:34:18,811] Trial 48 finished with value: 0.8355330731565783 and parameters: {'learning_rate': 0.07016887937806793, 'max_depth': 13, 'min_child_weight': 3, 'gamma': 0.0018862966891314622, 'alpha': 0.00019438206239135198, 'lambda': 0.1938533908959018, 'colsample_bytree': 0.5989441267343402}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8299277506251002 , log_loss:0.5164700785104678 , ks:0.4940057310612048


[I 2023-10-02 18:34:42,509] Trial 49 finished with value: 0.8299277506251002 and parameters: {'learning_rate': 0.02904432582357365, 'max_depth': 14, 'min_child_weight': 26, 'gamma': 0.00022515439419978954, 'alpha': 0.00027383846560024666, 'lambda': 0.14731548821038506, 'colsample_bytree': 0.7108260793389256}. Best is trial 28 with value: 0.8365957634913157.


XGBoost - Optimization using optuna
auc:0.8320791374156545 , log_loss:0.5038430738703974 , ks:0.498138605830747


[I 2023-10-02 18:34:55,471] Trial 50 finished with value: 0.8320791374156545 and parameters: {'learning_rate': 0.07280910837799989, 'max_depth': 12, 'min_child_weight': 84, 'gamma': 0.0012198177812870255, 'alpha': 0.00044684863879203296, 'lambda': 0.1355095220732296, 'colsample_bytree': 0.556349011421445}. Best is trial 28 with value: 0.8365957634913157.


Last Fit


In [39]:
cat_model, study_lgbm = train(X_train, y_train, X_test, y_test, balanced='balanced', method='CATBoost')

[I 2023-10-02 18:38:59,284] A new study created in memory with name: CATBoost Classifier


Tuning
Starting the optimization
CATBoost - Optimization using optuna
auc:0.7998754143643698 , log_loss:0.5452211683603745 , ks:0.4438787464411975


[I 2023-10-02 18:40:23,579] Trial 0 finished with value: 0.7998754143643698 and parameters: {'learning_rate': 0.015164908470991641, 'depth': 10, 'max_bin': 294, 'min_data_in_leaf': 47, 'l2_leaf_reg': 0.9145165239547186, 'random_strength': 1.531782669160816e-07, 'bagging_temperature': 8.333557651780637, 'od_type': 'Iter', 'od_wait': 25}. Best is trial 0 with value: 0.7998754143643698.


CATBoost - Optimization using optuna
auc:0.7545641522481075 , log_loss:0.6432285925227534 , ks:0.3692005619994447


[I 2023-10-02 18:41:12,781] Trial 1 finished with value: 0.7545641522481075 and parameters: {'learning_rate': 0.0004159918549688078, 'depth': 7, 'max_bin': 339, 'min_data_in_leaf': 11, 'l2_leaf_reg': 0.010019575051730078, 'random_strength': 0.011947716010460267, 'bagging_temperature': 7.60686171918655, 'od_type': 'Iter', 'od_wait': 11}. Best is trial 0 with value: 0.7998754143643698.


CATBoost - Optimization using optuna
auc:0.831141041525409 , log_loss:0.502755420526781 , ks:0.49559178284414235


[I 2023-10-02 18:42:00,691] Trial 2 finished with value: 0.831141041525409 and parameters: {'learning_rate': 0.07229805214307072, 'depth': 7, 'max_bin': 248, 'min_data_in_leaf': 158, 'l2_leaf_reg': 0.4567372443510217, 'random_strength': 0.002244106060933411, 'bagging_temperature': 0.650258545284349, 'od_type': 'Iter', 'od_wait': 22}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.7979539981900935 , log_loss:0.5436187149014403 , ks:0.44114854417894994


[I 2023-10-02 18:42:35,646] Trial 3 finished with value: 0.7979539981900935 and parameters: {'learning_rate': 0.06099993058861054, 'depth': 4, 'max_bin': 352, 'min_data_in_leaf': 119, 'l2_leaf_reg': 2.114655903402055e-07, 'random_strength': 0.0055972735484440895, 'bagging_temperature': 6.222659858735327, 'od_type': 'Iter', 'od_wait': 46}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.740318192854663 , log_loss:0.6227894451399613 , ks:0.3543830297138743


[I 2023-10-02 18:43:07,229] Trial 4 finished with value: 0.740318192854663 and parameters: {'learning_rate': 0.0012029049956248032, 'depth': 4, 'max_bin': 248, 'min_data_in_leaf': 24, 'l2_leaf_reg': 1.5833932424061307e-06, 'random_strength': 0.4231335301207912, 'bagging_temperature': 6.3022295853677015, 'od_type': 'IncToDec', 'od_wait': 21}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.8002058396926548 , log_loss:0.5463485753681627 , ks:0.4431808379938268


[I 2023-10-02 18:44:29,368] Trial 5 finished with value: 0.8002058396926548 and parameters: {'learning_rate': 0.0038063894575142074, 'depth': 10, 'max_bin': 396, 'min_data_in_leaf': 9, 'l2_leaf_reg': 5.086871611795204e-07, 'random_strength': 0.05637363727068574, 'bagging_temperature': 3.0381702454571604, 'od_type': 'IncToDec', 'od_wait': 25}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.7281847289350338 , log_loss:0.6513772983389806 , ks:0.33982713412013626


[I 2023-10-02 18:45:07,731] Trial 6 finished with value: 0.7281847289350338 and parameters: {'learning_rate': 0.00038778482529993734, 'depth': 5, 'max_bin': 398, 'min_data_in_leaf': 242, 'l2_leaf_reg': 0.003769218323388098, 'random_strength': 6.705219336366326e-05, 'bagging_temperature': 0.22174372838153378, 'od_type': 'Iter', 'od_wait': 10}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.827566862391065 , log_loss:0.5078568007619951 , ks:0.4899872403953969


[I 2023-10-02 18:46:19,521] Trial 7 finished with value: 0.827566862391065 and parameters: {'learning_rate': 0.06361804747837987, 'depth': 9, 'max_bin': 318, 'min_data_in_leaf': 125, 'l2_leaf_reg': 8.511404211735159e-06, 'random_strength': 2.0984130670166948e-06, 'bagging_temperature': 4.17746351513982, 'od_type': 'Iter', 'od_wait': 50}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.736004085132682 , log_loss:0.6649831831864397 , ks:0.3513493444930033


[I 2023-10-02 18:46:58,609] Trial 8 finished with value: 0.736004085132682 and parameters: {'learning_rate': 0.00019955001621704686, 'depth': 6, 'max_bin': 234, 'min_data_in_leaf': 135, 'l2_leaf_reg': 0.4540355164883225, 'random_strength': 8.548675645041327e-08, 'bagging_temperature': 4.951551928233224, 'od_type': 'Iter', 'od_wait': 50}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.7997564932072481 , log_loss:0.545120775585044 , ks:0.44389382950161443


[I 2023-10-02 18:48:12,656] Trial 9 finished with value: 0.7997564932072481 and parameters: {'learning_rate': 0.019307845592998762, 'depth': 10, 'max_bin': 241, 'min_data_in_leaf': 275, 'l2_leaf_reg': 7.98096557488578e-06, 'random_strength': 0.0002977246905740741, 'bagging_temperature': 9.352095920501428, 'od_type': 'IncToDec', 'od_wait': 36}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.7992129015187963 , log_loss:0.5462302061045254 , ks:0.4414171069927582


[I 2023-10-02 18:49:07,557] Trial 10 finished with value: 0.7992129015187963 and parameters: {'learning_rate': 0.0056106203511202945, 'depth': 8, 'max_bin': 201, 'min_data_in_leaf': 204, 'l2_leaf_reg': 7.519243261807048, 'random_strength': 6.425013534723226, 'bagging_temperature': 0.032140931866062594, 'od_type': 'IncToDec', 'od_wait': 36}. Best is trial 2 with value: 0.831141041525409.


CATBoost - Optimization using optuna
auc:0.8347870809351778 , log_loss:0.49811041510636417 , ks:0.5027846320769751


[I 2023-10-02 18:50:08,552] Trial 11 finished with value: 0.8347870809351778 and parameters: {'learning_rate': 0.09193395513931629, 'depth': 8, 'max_bin': 292, 'min_data_in_leaf': 94, 'l2_leaf_reg': 2.4060335322115672e-05, 'random_strength': 1.117086010478237e-05, 'bagging_temperature': 2.8645589742944164, 'od_type': 'Iter', 'od_wait': 39}. Best is trial 11 with value: 0.8347870809351778.


CATBoost - Optimization using optuna
auc:0.8310206605246304 , log_loss:0.5029541733579296 , ks:0.4957847079590007


[I 2023-10-02 18:51:00,567] Trial 12 finished with value: 0.8310206605246304 and parameters: {'learning_rate': 0.07283872747486758, 'depth': 7, 'max_bin': 282, 'min_data_in_leaf': 80, 'l2_leaf_reg': 1.147860019718912e-08, 'random_strength': 2.1594337288172327e-05, 'bagging_temperature': 1.6664819689851402, 'od_type': 'Iter', 'od_wait': 37}. Best is trial 11 with value: 0.8347870809351778.


CATBoost - Optimization using optuna
auc:0.8219309957138494 , log_loss:0.5151226314732229 , ks:0.4800984281764775


[I 2023-10-02 18:52:01,498] Trial 13 finished with value: 0.8219309957138494 and parameters: {'learning_rate': 0.024199478762047436, 'depth': 8, 'max_bin': 271, 'min_data_in_leaf': 193, 'l2_leaf_reg': 0.0002764152298680795, 'random_strength': 0.0004802035165731746, 'bagging_temperature': 1.8277027107502768, 'od_type': 'Iter', 'od_wait': 17}. Best is trial 11 with value: 0.8347870809351778.


CATBoost - Optimization using optuna
auc:0.8343414892210204 , log_loss:0.49873451376957095 , ks:0.5017425434252296


[I 2023-10-02 18:52:56,946] Trial 14 finished with value: 0.8343414892210204 and parameters: {'learning_rate': 0.08985162470484294, 'depth': 8, 'max_bin': 208, 'min_data_in_leaf': 169, 'l2_leaf_reg': 0.00020093403735971233, 'random_strength': 3.1926070198674467e-06, 'bagging_temperature': 2.9550835616794195, 'od_type': 'Iter', 'od_wait': 30}. Best is trial 11 with value: 0.8347870809351778.


CATBoost - Optimization using optuna
auc:0.8319333098981236 , log_loss:0.5018822542548416 , ks:0.49768661286598603


[I 2023-10-02 18:53:52,027] Trial 15 finished with value: 0.8319333098981236 and parameters: {'learning_rate': 0.09777944001405149, 'depth': 8, 'max_bin': 207, 'min_data_in_leaf': 80, 'l2_leaf_reg': 0.00021496491667450296, 'random_strength': 1.1395309760736173e-08, 'bagging_temperature': 3.5176935452677727, 'od_type': 'Iter', 'od_wait': 31}. Best is trial 11 with value: 0.8347870809351778.


CATBoost - Optimization using optuna
auc:0.8118436357833898 , log_loss:0.5289626734569585 , ks:0.46263545659866717


[I 2023-10-02 18:55:05,433] Trial 16 finished with value: 0.8118436357833898 and parameters: {'learning_rate': 0.009377081210354962, 'depth': 9, 'max_bin': 321, 'min_data_in_leaf': 179, 'l2_leaf_reg': 0.00010611211598248217, 'random_strength': 7.326450123117985e-06, 'bagging_temperature': 2.4954379330081213, 'od_type': 'Iter', 'od_wait': 43}. Best is trial 11 with value: 0.8347870809351778.


Last Fit


In [40]:
lgbm_model, study_lgbm = train(X_train, y_train, X_test, y_test, balanced='balanced', method='LGBM')

[I 2023-10-02 18:57:19,240] A new study created in memory with name: LGBM Classifier


Tuning
Starting the optimization
LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.085043 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (250.78 MB) transferred to GPU in 0.073710 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histo

auc:0.7898983089924385 , log_loss:0.6848058696956106 , ks:0.42700184528429935


[I 2023-10-02 18:57:47,245] Trial 0 finished with value: 0.7898983089924385 and parameters: {'learning_rate': 0.0004577801504792331, 'num_leaves': 173, 'lambda_l1': 2.1027251445636707e-08, 'lambda_l2': 1.1747600378139125, 'min_data_in_leaf': 12, 'max_depth': 31, 'feature_fraction': 0.6035724807065478, 'bagging_fraction': 0.8944699443091366, 'bagging_freq': 3}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.080306 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (270.05 MB) transferred to GPU in 0.081826 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.7748585040114045 , log_loss:0.6859171923672249 , ks:0.4001549692375768


[I 2023-10-02 18:58:13,779] Trial 1 finished with value: 0.7748585040114045 and parameters: {'learning_rate': 0.0003163382323306302, 'num_leaves': 85, 'lambda_l1': 3.3992066902339477e-07, 'lambda_l2': 0.0001164532088766597, 'min_data_in_leaf': 70, 'max_depth': 24, 'feature_fraction': 0.8550777091007526, 'bagging_fraction': 0.9631456557452717, 'bagging_freq': 3}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082531 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.754143461485649 , log_loss:0.6835572675421818 , ks:0.3753967464780959


[I 2023-10-02 18:58:28,270] Trial 2 finished with value: 0.754143461485649 and parameters: {'learning_rate': 0.0005255977942091657, 'num_leaves': 28, 'lambda_l1': 0.004762506139590355, 'lambda_l2': 0.2648308159146597, 'min_data_in_leaf': 33, 'max_depth': 16, 'feature_fraction': 0.9013110326880486, 'bagging_fraction': 0.509817344139247, 'bagging_freq': 1}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083892 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (203.27 MB) transferred to GPU in 0.059395 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 18:58:43,200] Trial 3 finished with value: 0.7878588209287359 and parameters: {'learning_rate': 0.03469538765654423, 'num_leaves': 16, 'lambda_l1': 7.463945825675288e-07, 'lambda_l2': 4.397542383691434e-05, 'min_data_in_leaf': 98, 'max_depth': 61, 'feature_fraction': 0.9003447865371764, 'bagging_fraction': 0.7248251591209198, 'bagging_freq': 6}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084181 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (263.05 MB) transferred to GPU in 0.077780 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 18:59:05,531] Trial 4 finished with value: 0.7794157824858057 and parameters: {'learning_rate': 0.0007265862551473972, 'num_leaves': 75, 'lambda_l1': 0.00208603313731418, 'lambda_l2': 0.0008779565882204583, 'min_data_in_leaf': 10, 'max_depth': 38, 'feature_fraction': 0.479242608758704, 'bagging_fraction': 0.9382300253647821, 'bagging_freq': 7}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083575 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (163.17 MB) transferred to GPU in 0.047475 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.775303776305997 , log_loss:0.6823514362885492 , ks:0.40138631978922445


[I 2023-10-02 18:59:24,681] Trial 5 finished with value: 0.775303776305997 and parameters: {'learning_rate': 0.0005318465260717794, 'num_leaves': 80, 'lambda_l1': 1.738576682020406e-06, 'lambda_l2': 0.955850407754849, 'min_data_in_leaf': 38, 'max_depth': 28, 'feature_fraction': 0.842011571334443, 'bagging_fraction': 0.5816503243100073, 'bagging_freq': 3}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.087428 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (159.30 MB) transferred to GPU in 0.046389 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 18:59:42,739] Trial 6 finished with value: 0.7761950621270942 and parameters: {'learning_rate': 0.00013075264982747935, 'num_leaves': 88, 'lambda_l1': 0.03257682469470468, 'lambda_l2': 0.0008248763683191462, 'min_data_in_leaf': 89, 'max_depth': 26, 'feature_fraction': 0.8466106417800919, 'bagging_fraction': 0.5678877504584484, 'bagging_freq': 5}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083707 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (212.23 MB) transferred to GPU in 0.059407 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 18:59:58,986] Trial 7 finished with value: 0.7650149754337711 and parameters: {'learning_rate': 0.004785834632715618, 'num_leaves': 22, 'lambda_l1': 3.0786161404888898e-06, 'lambda_l2': 0.00015516057342932925, 'min_data_in_leaf': 79, 'max_depth': 61, 'feature_fraction': 0.6616122441353125, 'bagging_fraction': 0.7568259157771406, 'bagging_freq': 5}. Best is trial 0 with value: 0.7898983089924385.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.080165 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (240.76 MB) transferred to GPU in 0.069198 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 19:00:27,099] Trial 8 finished with value: 0.7936487149663938 and parameters: {'learning_rate': 0.0015663706015796482, 'num_leaves': 247, 'lambda_l1': 5.972062673443564, 'lambda_l2': 2.717978213367896e-07, 'min_data_in_leaf': 93, 'max_depth': 16, 'feature_fraction': 0.7907479032414989, 'bagging_fraction': 0.8586988182955024, 'bagging_freq': 4}. Best is trial 8 with value: 0.7936487149663938.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.081461 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (275.95 MB) transferred to GPU in 0.080674 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 19:00:49,816] Trial 9 finished with value: 0.776805457913564 and parameters: {'learning_rate': 0.0009564839890942769, 'num_leaves': 51, 'lambda_l1': 3.393619931190932, 'lambda_l2': 0.022708090200074447, 'min_data_in_leaf': 99, 'max_depth': 58, 'feature_fraction': 0.4844369302395759, 'bagging_fraction': 0.9841943614428375, 'bagging_freq': 4}. Best is trial 8 with value: 0.7936487149663938.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083118 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.49 MB) transferred to GPU in 0.032591 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.41 MB) transferred to GPU in 0.034486 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.44 MB) transferred to GPU in 0.034082 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.31 MB) transferred to GPU in 0.034689 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.42 MB) transferred to GPU in 0.033910 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.43 MB) transferred to GPU in 0.033872 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.44 MB) transferred to GPU in 0.033845 secs. 1 sparse feature groups
[LightGBM] [Info

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.51 MB) transferred to GPU in 0.033232 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.39 MB) transferred to GPU in 0.034503 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.39 MB) transferred to GPU in 0.034199 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.44 MB) transferred to GPU in 0.033371 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.47 MB) transferred to GPU in 0.033088 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (117.54 MB) transferred to GPU in 0.033335 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:01:09,227] Trial 10 finished with value: 0.7581999682250344 and parameters: {'learning_rate': 0.0032531925701715396, 'num_leaves': 251, 'lambda_l1': 3.5166182746699146, 'lambda_l2': 5.959012273935866e-08, 'min_data_in_leaf': 61, 'max_depth': 6, 'feature_fraction': 0.992245750902684, 'bagging_fraction': 0.41880411441060167, 'bagging_freq': 1}. Best is trial 8 with value: 0.7936487149663938.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.080655 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (237.39 MB) transferred to GPU in 0.069183 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.7938479075052278 , log_loss:0.6617758931612024 , ks:0.43390853250644956


[I 2023-10-02 19:01:37,069] Trial 11 finished with value: 0.7938479075052278 and parameters: {'learning_rate': 0.002009140739578417, 'num_leaves': 217, 'lambda_l1': 1.1721103632489103e-08, 'lambda_l2': 8.614372756517842, 'min_data_in_leaf': 13, 'max_depth': 42, 'feature_fraction': 0.6759664841513724, 'bagging_fraction': 0.8466530491756354, 'bagging_freq': 3}. Best is trial 11 with value: 0.7938479075052278.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.081554 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (222.63 MB) transferred to GPU in 0.064122 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (222.61 MB) transferred to GPU in 0.063722 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (222.61 MB) transferred to GPU in 0.063189 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (222.61 MB) transferred to GPU in 0.063733 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (222.56 MB) transferred to GPU in 0.063625 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (222.60 MB) transferred to GPU in 0.063472 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (222.61 MB) transferred to GPU in 0.063215 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:02:06,900] Trial 12 finished with value: 0.8001432690978955 and parameters: {'learning_rate': 0.007605267226389075, 'num_leaves': 256, 'lambda_l1': 2.621451244979515e-05, 'lambda_l2': 1.3216661818177138e-07, 'min_data_in_leaf': 44, 'max_depth': 45, 'feature_fraction': 0.7379499469661692, 'bagging_fraction': 0.7939663501062716, 'bagging_freq': 2}. Best is trial 12 with value: 0.8001432690978955.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.081248 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (228.18 MB) transferred to GPU in 0.066091 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (228.16 MB) transferred to GPU in 0.064526 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (228.14 MB) transferred to GPU in 0.065504 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (228.16 MB) transferred to GPU in 0.066840 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (228.09 MB) transferred to GPU in 0.066556 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (228.15 MB) transferred to GPU in 0.064970 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (228.16 MB) transferred to GPU in 0.065469 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:02:35,340] Trial 13 finished with value: 0.7988615753401482 and parameters: {'learning_rate': 0.009758065538427122, 'num_leaves': 182, 'lambda_l1': 2.2393835124039732e-08, 'lambda_l2': 8.058724423321943, 'min_data_in_leaf': 39, 'max_depth': 49, 'feature_fraction': 0.7067387355392624, 'bagging_fraction': 0.813775058569088, 'bagging_freq': 2}. Best is trial 12 with value: 0.8001432690978955.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.081478 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (221.80 MB) transferred to GPU in 0.063150 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (221.78 MB) transferred to GPU in 0.063524 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (221.78 MB) transferred to GPU in 0.063092 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (221.78 MB) transferred to GPU in 0.063751 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (221.73 MB) transferred to GPU in 0.063555 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (221.77 MB) transferred to GPU in 0.064076 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (221.79 MB) transferred to GPU in 0.063448 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:03:02,863] Trial 14 finished with value: 0.7967009634865115 and parameters: {'learning_rate': 0.009339552539629509, 'num_leaves': 156, 'lambda_l1': 8.255434189009596e-05, 'lambda_l2': 2.6099971996075807e-06, 'min_data_in_leaf': 43, 'max_depth': 48, 'feature_fraction': 0.7465834461135326, 'bagging_fraction': 0.7910146397287299, 'bagging_freq': 2}. Best is trial 12 with value: 0.8001432690978955.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084055 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.92 MB) transferred to GPU in 0.054449 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.87 MB) transferred to GPU in 0.054809 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.86 MB) transferred to GPU in 0.056107 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.86 MB) transferred to GPU in 0.055473 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.81 MB) transferred to GPU in 0.056240 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.86 MB) transferred to GPU in 0.055938 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.87 MB) transferred to GPU in 0.055385 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:03:28,890] Trial 15 finished with value: 0.8029318983383145 and parameters: {'learning_rate': 0.016549012391794612, 'num_leaves': 198, 'lambda_l1': 3.831273398389654e-05, 'lambda_l2': 3.0991592375956056e-08, 'min_data_in_leaf': 51, 'max_depth': 52, 'feature_fraction': 0.5827947918693646, 'bagging_fraction': 0.6878842352189563, 'bagging_freq': 2}. Best is trial 15 with value: 0.8029318983383145.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083460 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (182.28 MB) transferred to GPU in 0.053369 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (182.23 MB) transferred to GPU in 0.054450 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (182.21 MB) transferred to GPU in 0.053980 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (182.23 MB) transferred to GPU in 0.054334 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (182.20 MB) transferred to GPU in 0.056434 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (182.22 MB) transferred to GPU in 0.053786 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (182.22 MB) transferred to GPU in 0.057304 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:03:54,287] Trial 16 finished with value: 0.8262199546335564 and parameters: {'learning_rate': 0.0926162561568789, 'num_leaves': 206, 'lambda_l1': 5.811102160944643e-05, 'lambda_l2': 2.0466646913443722e-08, 'min_data_in_leaf': 54, 'max_depth': 53, 'feature_fraction': 0.5690736591462578, 'bagging_fraction': 0.6499282487628893, 'bagging_freq': 2}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084114 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.8233387914724056 , log_loss:0.5136255547864296 , ks:0.482227980332385


[I 2023-10-02 19:04:13,793] Trial 17 finished with value: 0.8233387914724056 and parameters: {'learning_rate': 0.09955552202322622, 'num_leaves': 129, 'lambda_l1': 0.00031480335208679993, 'lambda_l2': 3.097869082225795e-08, 'min_data_in_leaf': 57, 'max_depth': 54, 'feature_fraction': 0.5682156079205974, 'bagging_fraction': 0.6587709197155364, 'bagging_freq': 1}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083890 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.8196309525763916 , log_loss:0.5200245851697117 , ks:0.4760513311860578


[I 2023-10-02 19:04:32,998] Trial 18 finished with value: 0.8196309525763916 and parameters: {'learning_rate': 0.09186749687895979, 'num_leaves': 135, 'lambda_l1': 0.0004097394810761537, 'lambda_l2': 2.1049121676486173e-06, 'min_data_in_leaf': 59, 'max_depth': 56, 'feature_fraction': 0.4255598832378312, 'bagging_fraction': 0.6509748342021678, 'bagging_freq': 1}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082975 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.8213177743971163 , log_loss:0.5164947915498794 , ks:0.4785804852048522


[I 2023-10-02 19:04:52,167] Trial 19 finished with value: 0.8213177743971163 and parameters: {'learning_rate': 0.0912122769831002, 'num_leaves': 116, 'lambda_l1': 1.0218802430268351e-05, 'lambda_l2': 2.8989714794638343e-08, 'min_data_in_leaf': 29, 'max_depth': 64, 'feature_fraction': 0.5633645785010247, 'bagging_fraction': 0.6627360726215891, 'bagging_freq': 1}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083630 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (176.27 MB) transferred to GPU in 0.051106 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (176.19 MB) transferred to GPU in 0.050431 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (176.18 MB) transferred to GPU in 0.050555 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (176.20 MB) transferred to GPU in 0.051549 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (176.19 MB) transferred to GPU in 0.050741 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (176.21 MB) transferred to GPU in 0.051574 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (176.21 MB) transferred to GPU in 0.050508 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:05:15,234] Trial 20 finished with value: 0.8108023088806426 and parameters: {'learning_rate': 0.04038578917720369, 'num_leaves': 124, 'lambda_l1': 0.00017526513194710564, 'lambda_l2': 1.0641126968934062e-08, 'min_data_in_leaf': 72, 'max_depth': 37, 'feature_fraction': 0.6246599817222982, 'bagging_fraction': 0.6284691194866178, 'bagging_freq': 2}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084013 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.8201549934242994 , log_loss:0.518668612604024 , ks:0.47735723103573874


[I 2023-10-02 19:05:35,032] Trial 21 finished with value: 0.8201549934242994 and parameters: {'learning_rate': 0.07775756674046, 'num_leaves': 133, 'lambda_l1': 8.858771723511777e-06, 'lambda_l2': 5.258654050785568e-07, 'min_data_in_leaf': 23, 'max_depth': 64, 'feature_fraction': 0.555830410658552, 'bagging_fraction': 0.7014101339104863, 'bagging_freq': 1}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.081376 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.8132741473437016 , log_loss:0.5305412129971179 , ks:0.4655586167421055


[I 2023-10-02 19:05:53,787] Trial 22 finished with value: 0.8132741473437016 and parameters: {'learning_rate': 0.05195408672935882, 'num_leaves': 113, 'lambda_l1': 0.0005925067662009308, 'lambda_l2': 1.3349986375551982e-08, 'min_data_in_leaf': 29, 'max_depth': 54, 'feature_fraction': 0.574623891056889, 'bagging_fraction': 0.6413391194086346, 'bagging_freq': 1}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.080810 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.823673221934974 , log_loss:0.5136717491977308 , ks:0.48248381375516675


[I 2023-10-02 19:06:14,256] Trial 23 finished with value: 0.823673221934974 and parameters: {'learning_rate': 0.09385376430971233, 'num_leaves': 155, 'lambda_l1': 1.5729504955282714e-05, 'lambda_l2': 1.3615811921315995e-07, 'min_data_in_leaf': 54, 'max_depth': 64, 'feature_fraction': 0.5293179042637854, 'bagging_fraction': 0.7281424146317307, 'bagging_freq': 1}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082152 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (202.33 MB) transferred to GPU in 0.060033 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (202.27 MB) transferred to GPU in 0.058361 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (202.27 MB) transferred to GPU in 0.059341 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (202.29 MB) transferred to GPU in 0.059643 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (202.23 MB) transferred to GPU in 0.059374 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (202.29 MB) transferred to GPU in 0.059596 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (202.29 MB) transferred to GPU in 0.059062 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:06:39,629] Trial 24 finished with value: 0.80383098052371 and parameters: {'learning_rate': 0.02390979088925586, 'num_leaves': 154, 'lambda_l1': 5.522532417736894e-05, 'lambda_l2': 5.628198637693578e-07, 'min_data_in_leaf': 52, 'max_depth': 52, 'feature_fraction': 0.5122582952184049, 'bagging_fraction': 0.721468550684816, 'bagging_freq': 2}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084247 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
auc:0.8126906770380243 , log_loss:0.5348808496345205 , ks:0.4645780368215605


[I 2023-10-02 19:07:00,780] Trial 25 finished with value: 0.8126906770380243 and parameters: {'learning_rate': 0.056333303387932986, 'num_leaves': 214, 'lambda_l1': 0.0002188945582156744, 'lambda_l2': 1.0073300135399803e-07, 'min_data_in_leaf': 60, 'max_depth': 43, 'feature_fraction': 0.40749806241937714, 'bagging_fraction': 0.7474446246223847, 'bagging_freq': 1}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084840 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.98 MB) transferred to GPU in 0.048287 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.89 MB) transferred to GPU in 0.047139 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.89 MB) transferred to GPU in 0.048640 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.90 MB) transferred to GPU in 0.048470 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.91 MB) transferred to GPU in 0.049009 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.89 MB) transferred to GPU in 0.048584 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.91 MB) transferred to GPU in 0.048251 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:07:23,711] Trial 26 finished with value: 0.806859868695724 and parameters: {'learning_rate': 0.027727093161251066, 'num_leaves': 159, 'lambda_l1': 9.649382470487605e-06, 'lambda_l2': 1.0681840264141267e-08, 'min_data_in_leaf': 68, 'max_depth': 57, 'feature_fraction': 0.5288362705354787, 'bagging_fraction': 0.5988592149175365, 'bagging_freq': 2}. Best is trial 16 with value: 0.8262199546335564.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082981 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (189.87 MB) transferred to GPU in 0.053952 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.828677088494562 , log_loss:0.5066110067167572 , ks:0.49104611908738494


[I 2023-10-02 19:07:48,330] Trial 27 finished with value: 0.828677088494562 and parameters: {'learning_rate': 0.0979876550122422, 'num_leaves': 224, 'lambda_l1': 0.0014235559343150825, 'lambda_l2': 6.657688122881207e-06, 'min_data_in_leaf': 81, 'max_depth': 49, 'feature_fraction': 0.6363259048680269, 'bagging_fraction': 0.6770301860680766, 'bagging_freq': 3}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082739 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (212.45 MB) transferred to GPU in 0.061636 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.819980269950896 , log_loss:0.5209227259114113 , ks:0.47601752875686926


[I 2023-10-02 19:08:14,723] Trial 28 finished with value: 0.819980269950896 and parameters: {'learning_rate': 0.05166308742060963, 'num_leaves': 227, 'lambda_l1': 0.006619995619130377, 'lambda_l2': 3.6239871295586607e-06, 'min_data_in_leaf': 81, 'max_depth': 47, 'feature_fraction': 0.6492197223655491, 'bagging_fraction': 0.7575777853911564, 'bagging_freq': 3}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.081881 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (192.21 MB) transferred to GPU in 0.054996 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 19:08:37,711] Trial 29 finished with value: 0.8045242667830236 and parameters: {'learning_rate': 0.01944398548012135, 'num_leaves': 185, 'lambda_l1': 0.0012784888120224269, 'lambda_l2': 6.862541002943279e-07, 'min_data_in_leaf': 85, 'max_depth': 33, 'feature_fraction': 0.6226736619302382, 'bagging_fraction': 0.6853876507458602, 'bagging_freq': 4}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.081661 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (200.28 MB) transferred to GPU in 0.056758 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.8139488613054691 , log_loss:0.5331952544929716 , ks:0.4664225871293414


[I 2023-10-02 19:09:02,551] Trial 30 finished with value: 0.8139488613054691 and parameters: {'learning_rate': 0.03834459542058527, 'num_leaves': 198, 'lambda_l1': 0.05449225789837758, 'lambda_l2': 1.3273433347914857e-05, 'min_data_in_leaf': 77, 'max_depth': 60, 'feature_fraction': 0.6088167752742816, 'bagging_fraction': 0.7141574427862488, 'bagging_freq': 3}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.085233 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (186.63 MB) transferred to GPU in 0.055411 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (186.58 MB) transferred to GPU in 0.056039 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (186.56 MB) transferred to GPU in 0.054725 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (186.58 MB) transferred to GPU in 0.054580 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (186.54 MB) transferred to GPU in 0.054561 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (186.57 MB) transferred to GPU in 0.054656 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (186.57 MB) transferred to GPU in 0.055201 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:09:27,532] Trial 31 finished with value: 0.8251053776555826 and parameters: {'learning_rate': 0.09052374910350319, 'num_leaves': 171, 'lambda_l1': 0.00010005917063063065, 'lambda_l2': 8.459777355892614e-08, 'min_data_in_leaf': 64, 'max_depth': 50, 'feature_fraction': 0.5901267076278998, 'bagging_fraction': 0.6654652653009074, 'bagging_freq': 2}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082058 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (174.91 MB) transferred to GPU in 0.051506 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.8200216569913578 , log_loss:0.5197296891155858 , ks:0.4766015301468394


[I 2023-10-02 19:09:49,986] Trial 32 finished with value: 0.8200216569913578 and parameters: {'learning_rate': 0.06317911965273605, 'num_leaves': 171, 'lambda_l1': 8.78981306308682e-05, 'lambda_l2': 1.6195396537624666e-07, 'min_data_in_leaf': 67, 'max_depth': 51, 'feature_fraction': 0.6075215648126091, 'bagging_fraction': 0.6235728717819533, 'bagging_freq': 3}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083054 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (190.96 MB) transferred to GPU in 0.055904 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (190.91 MB) transferred to GPU in 0.056753 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (190.89 MB) transferred to GPU in 0.056284 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (190.92 MB) transferred to GPU in 0.055884 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (190.87 MB) transferred to GPU in 0.055285 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (190.91 MB) transferred to GPU in 0.055979 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (190.92 MB) transferred to GPU in 0.055914 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:10:16,270] Trial 33 finished with value: 0.8213612389129544 and parameters: {'learning_rate': 0.06373544002428654, 'num_leaves': 233, 'lambda_l1': 2.0753131986270765e-05, 'lambda_l2': 9.111302438605251e-08, 'min_data_in_leaf': 48, 'max_depth': 39, 'feature_fraction': 0.5377935424490893, 'bagging_fraction': 0.6809251422673114, 'bagging_freq': 2}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082094 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (160.11 MB) transferred to GPU in 0.046404 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.8133173672599119 , log_loss:0.535131955844366 , ks:0.46521737282494047


[I 2023-10-02 19:10:38,493] Trial 34 finished with value: 0.8133173672599119 and parameters: {'learning_rate': 0.03401015821417003, 'num_leaves': 206, 'lambda_l1': 2.0829898729463023e-07, 'lambda_l2': 9.259109002678394e-07, 'min_data_in_leaf': 65, 'max_depth': 44, 'feature_fraction': 0.6441931153725518, 'bagging_fraction': 0.5707874749779738, 'bagging_freq': 3}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084069 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (150.24 MB) transferred to GPU in 0.043798 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (150.14 MB) transferred to GPU in 0.043552 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (150.15 MB) transferred to GPU in 0.043310 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (150.15 MB) transferred to GPU in 0.044336 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (150.13 MB) transferred to GPU in 0.044055 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (150.15 MB) transferred to GPU in 0.044566 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (150.15 MB) transferred to GPU in 0.043438 secs. 1 sparse feature groups
[LightGBM] [Info

[I 2023-10-02 19:11:00,632] Trial 35 finished with value: 0.8258984537400638 and parameters: {'learning_rate': 0.09795423339658885, 'num_leaves': 169, 'lambda_l1': 9.82735423989919e-05, 'lambda_l2': 2.489997075381993e-07, 'min_data_in_leaf': 75, 'max_depth': 59, 'feature_fraction': 0.5903855465772945, 'bagging_fraction': 0.5354993319042066, 'bagging_freq': 2}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.088541 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (155.08 MB) transferred to GPU in 0.044046 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

auc:0.8162483477042491 , log_loss:0.5266018227736194 , ks:0.47010531964592966


[I 2023-10-02 19:11:21,904] Trial 36 finished with value: 0.8162483477042491 and parameters: {'learning_rate': 0.04431646458690806, 'num_leaves': 176, 'lambda_l1': 0.0007717158712683269, 'lambda_l2': 5.570873046651371e-06, 'min_data_in_leaf': 74, 'max_depth': 49, 'feature_fraction': 0.6900836313492142, 'bagging_fraction': 0.5528131952612948, 'bagging_freq': 3}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.082156 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (144.36 MB) transferred to GPU in 0.040949 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 19:11:41,963] Trial 37 finished with value: 0.8235591627172117 and parameters: {'learning_rate': 0.06814605416496566, 'num_leaves': 232, 'lambda_l1': 0.00015615592538374367, 'lambda_l2': 1.1391397940686215e-05, 'min_data_in_leaf': 86, 'max_depth': 58, 'feature_fraction': 0.5901973545765397, 'bagging_fraction': 0.5146042521254033, 'bagging_freq': 7}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.084152 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (167.67 MB) transferred to GPU in 0.049113 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 19:12:03,340] Trial 38 finished with value: 0.8106298257510343 and parameters: {'learning_rate': 0.029456506763463605, 'num_leaves': 192, 'lambda_l1': 0.004833154954941355, 'lambda_l2': 3.625092429340071e-07, 'min_data_in_leaf': 92, 'max_depth': 41, 'feature_fraction': 0.6432197846294514, 'bagging_fraction': 0.5977756667534049, 'bagging_freq': 5}. Best is trial 27 with value: 0.828677088494562.


LGBM - Optimization using optuna
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] Using GPU Device: NVIDIA GeForce RTX 4090, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (280.38 MB) transferred to GPU in 0.083036 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Info] 27 dense feature groups (174.60 MB) transferred to GPU in 0.050390 secs. 1 sparse feature groups
[LightGBM] [Info] Size of histogram bin entry: 8
[LightGBM] [Inf

[I 2023-10-02 19:12:24,868] Trial 39 finished with value: 0.8019253626823509 and parameters: {'learning_rate': 0.01631914456021541, 'num_leaves': 169, 'lambda_l1': 7.070420118340646e-05, 'lambda_l2': 4.3850314935049856e-05, 'min_data_in_leaf': 64, 'max_depth': 31, 'feature_fraction': 0.6136836521982938, 'bagging_fraction': 0.6224834798074727, 'bagging_freq': 4}. Best is trial 27 with value: 0.828677088494562.


Last Fit
[LightGBM] [Info] Number of positive: 5564616, number of negative: 4935384
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 6132
[LightGBM] [Info] Number of data points in the train set: 10500000, number of used features: 28
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.529963 -> initscore=0.119997
[LightGBM] [Info] Start training from score 0.119997
Training until validation scores don't improve for 100 rounds
Did not meet early stopping. Best iteration is:
[100]	valid_0's binary_logloss: 0.506611


AUC in the final models

||XGBoost|CatBoost|LightGBM|
|---|---|---|---|
|Number of Trials|50|16|39|
|AUC|0.8365957634913157|0.8347870809351778|0.828677088494562|

## Final hyperparameters in our models

### XGBoost
```
Trial 28 finished with value: 0.8365957634913157 and parameters: {'learning_rate': 0.08684579563106676, 'max_depth': 15, 'min_child_weight': 197, 'gamma': 0.0184790670508487, 'alpha': 3.800980410024799e-08, 'lambda': 0.8947997949504739, 'colsample_bytree': 0.5074143246162668}. Best is trial 28 with value: 0.8365957634913157.
```
### CatBoost
```
Trial 11 finished with value: 0.8347870809351778 and parameters: {'learning_rate': 0.09193395513931629, 'depth': 8, 'max_bin': 292, 'min_data_in_leaf': 94, 'l2_leaf_reg': 2.4060335322115672e-05, 'random_strength': 1.117086010478237e-05, 'bagging_temperature': 2.8645589742944164, 'od_type': 'Iter', 'od_wait': 39}. Best is trial 11 with value: 0.8347870809351778.
```
### LightGBM
```
Trial 27 finished with value: 0.828677088494562 and parameters: {'learning_rate': 0.0979876550122422, 'num_leaves': 224, 'lambda_l1': 0.0014235559343150825, 'lambda_l2': 6.657688122881207e-06, 'min_data_in_leaf': 81, 'max_depth': 49, 'feature_fraction': 0.6363259048680269, 'bagging_fraction': 0.6770301860680766, 'bagging_freq': 3}. Best is trial 27 with value: 0.828677088494562.
```

In [None]:
lgbm_model

# SHAP in XGBoost, CatBoost, LightGBM

## SHAP using CPU

In [None]:
import shap
time_cpu_shap = ['CPU']
start = timeit.default_timer()
explainer = shap.TreeExplainer(xgb_model)
shap_values = explainer.shap_values(X_train)
stop = timeit.default_timer()
time_cpu_shap.append(stop-start)
shap.summary_plot(shap_values[1], X_train,show=False)

In [None]:
import shap
time_cpu_shap = ['CPU']
start = timeit.default_timer()
explainer = shap.TreeExplainer(cat_model)
shap_values = explainer.shap_values(X_train)
stop = timeit.default_timer()
time_cpu_shap.append(stop-start)
shap.summary_plot(shap_values[1], X_train,show=False)

In [None]:
import shap
time_cpu_shap = ['CPU']
start = timeit.default_timer()
explainer = shap.TreeExplainer(lgbm_model)
shap_values = explainer.shap_values(X_train)
stop = timeit.default_timer()
time_cpu_shap.append(stop-start)
shap.summary_plot(shap_values[1], X_train,show=False)

In [None]:
time_cpu_shap

## SHAP using GPU

In [None]:
time_gpu_shap = ['CPU']
start = timeit.default_timer()
explainer = shap.explainers.GPUTree(xgb_model)
stop = timeit.default_timer()
time_cpu_shap.append(stop-start)
shap.summary_plot(shap_values[1], X_train,show=False)

In [None]:
time_gpu_shap = ['CPU']
start = timeit.default_timer()
explainer = shap.explainers.GPUTree(cat_model)
stop = timeit.default_timer()
time_cpu_shap.append(stop-start)
shap.summary_plot(shap_values[1], X_train,show=False)

In [None]:
time_gpu_shap = ['CPU']
start = timeit.default_timer()
explainer = shap.explainers.GPUTree(lgbm_model)
stop = timeit.default_timer()
time_cpu_shap.append(stop-start)
shap.summary_plot(shap_values[1], X_train,show=False)

In [None]:
time_cpu_shap

# Bibliography
You can find all files in [this repository](https://github.com/joaomh/xgboost-catboost-lgbm)

References, and, of course, you can access the documentation for each algorithm. 


[1] - [Schapire, Robert E(1999). A Short Introduction to Boosting](https://cseweb.ucsd.edu/~yfreund/papers/IntroToBoosting.pdf)

[2] - [HASTIE, T.; TIBSHIRANI, R.; FRIEDMAN, J. (2009). The Elements of Statistical Learning](https://hastie.su.domains/Papers/ESLII.pdf)

[3] - [Jerome H. Friedman (2001). GREEDY FUNCTION APPROXIMATION:A GRADIENT BOOSTING MACHINE](https://jerryfriedman.su.domains/ftp/trebst.pdf)

[4] - [Tianqi Chen, Carlos Guestrin (2016).XGBoost: {A} Scalable Tree Boosting System](https://arxiv.org/abs/1603.02754)

[5] - [Anna Veronika Dorogush, Andrey Gulin, Gleb Gusev, Nikita Kazeev, Liudmila Ostroumova Prokhorenkova, Aleksandr Vorobev (2017).CatBoost: unbiased boosting with categorical features](https://arxiv.org/abs/1706.09516)

[6] - [Ke, Guolin and Meng, Qi and Finley, Thomas and Wang, Taifeng and Chen, Wei and Ma, Weidong and Ye, Qiwei and Liu, Tie-Yan (2017).Lightgbm: A highly efficient gradient boosting decision tree](https://proceedings.neurips.cc/paper_files/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf)

[7] - [Anna Veronika Dorogush: Mastering gradient boosting with CatBoost | PyData London 2019](https://www.youtube.com/watch?v=usdEWSDisS0)

[8] - [Pedro Tabacof - Unlocking the Power of Gradient-Boosted Trees (using LightGBM) | PyData London 2022](https://www.youtube.com/watch?v=qGsHlvE8KZM)

[9] - [Pinheiro, J., & Becker, M.. (2023). Um estudo sobre algoritmos de Boosting e a otimização de hiperparâmetros utilizando optuna.](https://bdta.abcd.usp.br/item/003122385)