# Testing multiple LightGBM models
Our last model had an accuracy of 0.8 for both targets.
If we change the model's parameters, can we increase this value?

That's what we are going to find out in this notebook.

As always, let's import the libraries, load the data, and split it into two datasets, just like before.

In [1]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV

In [2]:
df = pd.read_csv("test_data.csv", sep=";")

In [3]:
df = df.drop("Unnamed: 0", axis=1)

In [4]:
df = df.set_index("id")

In [5]:
country_index = list(df["country"].unique())
platform_index = list(df["creation_platform"].unique())
source_index = list(df["source_pulido"].unique())

In [6]:
df_num_values = df.drop(["country", "creation_platform", "source_pulido"], axis=1)
df_num_values["country_index"] = df["country"].apply(lambda i: country_index.index(i))
df_num_values["creation_platform_index"] = df["creation_platform"].apply(lambda i: platform_index.index(i))
df_num_values["source_pulido_index"] = df["source_pulido"].apply(lambda i: source_index.index(i))
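The cells above encode each categorical column by looking up a value's position in a list, which rescans the list for every row. A minimal sketch of the same first-appearance encoding done with a dictionary lookup instead; the `encode_by_first_appearance` helper and the sample country codes are hypothetical and only serve as an illustration:

```python
def encode_by_first_appearance(values):
    """Map each value to the index of its first appearance in `values`."""
    mapping = {}
    codes = []
    for value in values:
        # setdefault assigns the next free index the first time a value is seen
        codes.append(mapping.setdefault(value, len(mapping)))
    return codes, mapping


# Hypothetical sample data, for illustration only
countries = ["BR", "AR", "BR", "MX", "AR"]
codes, mapping = encode_by_first_appearance(countries)
print(codes)    # [0, 1, 0, 2, 1]
print(mapping)  # {'BR': 0, 'AR': 1, 'MX': 2}
```

With pandas, `pd.factorize(df["country"])` should give the same first-appearance codes in one call. Note that it encodes missing values as -1, so check that against your data before swapping it in.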

In [7]:
X_train, X_test, y_train, y_test = train_test_split(df_num_values.drop("target", axis=1), df_num_values.target, test_size=0.2, random_state=42)

In [8]:
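train_data = lgb.Dataset(X_train, label=y_train)
# Build the test Dataset against the training one so both share the same feature binning
test_data = lgb.Dataset(X_test, label=y_test, reference=train_data)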

In the last notebook, we created a model with a single set (dict) of parameters.

Here we are going to define multiple values for each parameter and train the model with every possible combination to discover which parameters perform best.

In [9]:
param_grid = {
    'objective': ['binary'],
    'metric': ['binary_error'],
    'boosting_type': ['gbdt', 'dart'],  # Gradient Boosting Decision Tree or Dropouts meet Multiple Additive Regression Trees
    'num_leaves': [20, 30, 40],  # Maximum tree leaves for base learners
    'learning_rate': [0.01, 0.05, 0.1],  # Learning rate for boosting
    'n_estimators': [50, 100, 200],  # Number of boosting iterations
    'feature_fraction': [0.8, 0.9],  # Fraction of features to be randomly selected for each boosting round
}
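A grid search trains one model per combination of the values above, so it helps to count the combinations before launching it. A small sketch that enumerates them with `itertools.product`; the grid is copied from the cell above, and `cv_folds = 5` is an assumption that matches the `cv=5` passed to `GridSearchCV` later in this notebook:

```python
from itertools import product

param_grid = {
    'objective': ['binary'],
    'metric': ['binary_error'],
    'boosting_type': ['gbdt', 'dart'],
    'num_leaves': [20, 30, 40],
    'learning_rate': [0.01, 0.05, 0.1],
    'n_estimators': [50, 100, 200],
    'feature_fraction': [0.8, 0.9],
}
cv_folds = 5  # assumption: same number of folds as the grid search below

# Cartesian product of all value lists: one tuple per candidate model
combinations = list(product(*param_grid.values()))
n_candidates = len(combinations)
total_fits = n_candidates * cv_folds

print(n_candidates)  # 108
print(total_fits)    # 540

# Rebuild the first candidate as a parameter dict
first_candidate = dict(zip(param_grid.keys(), combinations[0]))
print(first_candidate)
```

Each extra value in any list multiplies the total, so adding a single new parameter with three values would triple the run time.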

In [10]:
lgb_clf = lgb.LGBMClassifier()

To compare the models, we are going to use the `recall` metric.

In [11]:
grid_search = GridSearchCV(lgb_clf, param_grid, cv=5, scoring='recall', verbose=1)
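Recall is the fraction of actual positives that the model finds, TP / (TP + FN). On an imbalanced target, accuracy can look good while recall stays at zero. A hand-rolled sketch with made-up labels (8 positives in 100 rows, a ratio picked only for illustration; the `recall` and `accuracy` helpers are hypothetical stand-ins, not the scikit-learn implementations):

```python
def recall(y_true, y_pred):
    """Fraction of actual positives predicted as positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(y_true, y_pred):
    """Fraction of rows where the prediction equals the label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up labels: 8 positives followed by 92 negatives
y_true = [1] * 8 + [0] * 92

# A model that never predicts the positive class
always_negative = [0] * 100
# A model that finds half of the positives and raises no false alarms
finds_half = [1] * 4 + [0] * 4 + [0] * 92

print(accuracy(y_true, always_negative), recall(y_true, always_negative))  # 0.92 0.0
print(accuracy(y_true, finds_half), recall(y_true, finds_half))            # 0.96 0.5
```

The two models differ by only 0.04 in accuracy but by 0.5 in recall, which is why recall is the more informative score when positives are rare.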

In [12]:
# Warning! This may take a while. Go grab some coffee.
grid_search.fit(X_train, y_train)

Fitting 5 folds for each of 108 candidates, totalling 540 fits
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008069 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006512 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007266 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008023 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006984 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080188 -> initscore=-2.439791
[LightGBM] [Info] Start training from score -2.439791
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007869 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007658 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008036 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006132 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006745 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008383 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006059 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008089 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008230 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006100 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080188 -> initscore=-2.439791
[LightGBM] [Info] Start training from score -2.439791
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007722 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008345 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007431 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007835 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006520 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008269 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007890 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007474 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006312 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008123 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080188 -> initscore=-2.439791
[LightGBM] [Info] Start training from score -2.439791
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006241 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006569 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007860 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006129 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008298 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007374 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006725 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.005950 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007976 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006476 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080188 -> initscore=-2.439791
[LightGBM] [Info] Start training from score -2.439791
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006388 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007513 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006650 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006925 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008212 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007230 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007174 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006827 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007574 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006869 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080188 -> initscore=-2.439791
[LightGBM] [Info] Start training from score -2.439791
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006558 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006418 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007423 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007899 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006938 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007538 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008000 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1471
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007379 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439827
[LightGBM] [Info] Start training from score -2.439827
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007478 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1463
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007480 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080188 -> initscore=-2.439791
[LightGBM] [Info] Start training from score -2.439791
[LightGBM] [Info] Number of positive: 30039, number of negative: 344579
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007278 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1467
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 30039, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006833 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 374617, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439824
[LightGBM] [Info] Start training from score -2.439824
[LightGBM] [Info] Number of positive: 30040, number of negative: 344578
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007483 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1473
[LightGBM] [Info] Number of data points in the train set: 374618, number of used features: 13
[LightGBM] [In

[LightGBM] [Info] Number of positive: 37549, number of negative: 430723
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.007753 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 468272, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439818
[LightGBM] [Info] Start training from score -2.439818


After training for a while, it's time to check the best parameters found:

In [13]:
best_params = grid_search.best_params_
print("Best parameters found:", best_params)

Best parameters found: {'boosting_type': 'gbdt', 'feature_fraction': 0.9, 'learning_rate': 0.1, 'metric': 'binary_error', 'n_estimators': 200, 'num_leaves': 40, 'objective': 'binary'}


Now, let's train the model again using the best parameters found and check its accuracy

In [15]:
num_round = 100
best_model = lgb.train(best_params, train_data, num_round, valid_sets=[test_data])
y_pred = best_model.predict(X_test)



[LightGBM] [Info] Number of positive: 37549, number of negative: 430723
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.009787 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 1470
[LightGBM] [Info] Number of data points in the train set: 468272, number of used features: 13
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.080186 -> initscore=-2.439818
[LightGBM] [Info] Start training from score -2.439818
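
One detail worth checking here: `best_params` comes from the grid search and still contains `n_estimators`, which LightGBM documents as an alias of `num_iterations`. As far as we know, when such an alias is present in `params`, `lgb.train` uses it instead of the number of rounds passed as an argument, so the call above most likely trains 200 rounds and not 100. A minimal sketch of how to make the number of rounds explicit (the name `train_params` is ours, and the final `lgb.train` call is left commented out so the snippet runs without the data):

```python
# Best parameters exactly as printed by the grid search above.
best_params = {
    'boosting_type': 'gbdt', 'feature_fraction': 0.9, 'learning_rate': 0.1,
    'metric': 'binary_error', 'n_estimators': 200, 'num_leaves': 40,
    'objective': 'binary',
}

# Copy the dict so best_params itself stays untouched, then take the
# number of boosting rounds out of the params and pass it explicitly.
train_params = dict(best_params)
num_round = train_params.pop('n_estimators')

print("Rounds:", num_round)
print("Params:", train_params)

# Assumed usage, mirroring the cell above:
# best_model = lgb.train(train_params, train_data, num_round, valid_sets=[test_data])
```

With this, the number of rounds used for the final model is the one the grid search actually selected, and it is visible in the call itself.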


In [16]:
y_pred_binary = (y_pred > 0.5).astype(int)

In [17]:
accuracy = accuracy_score(y_test, y_pred_binary)
print("Accuracy of the best model:", accuracy)

Accuracy of the best model: 0.923130146581474


In [18]:
np.percentile(y_pred[y_test == 0], [0, 30, 50, 70, 100])

array([3.10474731e-04, 3.38785730e-03, 1.43203608e-02, 5.28973724e-02,
       8.97121082e-01])

In [19]:
np.percentile(y_pred[y_test == 1], [0, 30, 50, 70, 100])

array([6.90427725e-04, 1.53569116e-01, 2.63965092e-01, 3.85892985e-01,
       9.27406128e-01])

In [32]:
target_0_preds = y_pred[y_test == 0] <= 0.5
target_1_preds = y_pred[y_test == 1] > 0.5

In [33]:
unique, counts = np.unique(target_0_preds, return_counts=True)
print(unique, counts)
print("Accuracy for negative targets:", counts[1]/counts.sum())

[False  True] [   902 106846]
Accuracy for negative targets: 0.9916286149162862


In [34]:
unique, counts = np.unique(target_1_preds, return_counts=True)
print(unique, counts)
print("Accuracy for positive targets:", counts[1]/counts.sum())

[False  True] [8097 1223]
Accuracy for positive targets: 0.13122317596566524
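
The two per-class figures above are what is usually called specificity (negative targets) and recall (positive targets). The sketch below uses only the counts printed above to show why the overall accuracy of 0.92 looks good despite the 0.13 for positive targets: with about 8% positives, always predicting 0 would already be right about 92% of the time. The variable names are ours and are not used elsewhere in the notebook.

```python
# Counts copied from the two outputs above (threshold = 0.5).
tn, fp = 106846, 902   # negative targets: correctly / incorrectly classified
fn, tp = 8097, 1223    # positive targets: incorrectly / correctly classified

total = tn + fp + fn + tp

accuracy = (tn + tp) / total    # overall accuracy of the model
baseline = (tn + fp) / total    # accuracy of always predicting 0
specificity = tn / (tn + fp)    # "accuracy for negative targets"
recall = tp / (tp + fn)         # "accuracy for positive targets"
precision = tp / (tp + fp)      # share of predicted positives that are real

print(f"accuracy:    {accuracy:.4f}")
print(f"baseline:    {baseline:.4f}")
print(f"specificity: {specificity:.4f}")
print(f"recall:      {recall:.4f}")
print(f"precision:   {precision:.4f}")
```

The gap between `accuracy` and `baseline` is small, which is why the per-class view is the more informative one for this dataset.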


This time we had an even higher accuracy for negative targets, but about the same for positive targets.

Let's do the same as before and find a threshold that increases the accuracy for positive targets

In [35]:
threshold = 0.1
target_0_preds = y_pred[y_test == 0] <= threshold
target_1_preds = y_pred[y_test == 1] > threshold

In [36]:
unique, counts = np.unique(target_0_preds, return_counts=True)
print(unique, counts)
print("Accuracy:", counts[1]/counts.sum())

[False  True] [21215 86533]
Accuracy: 0.8031053940676393


In [37]:
unique, counts = np.unique(target_1_preds, return_counts=True)
print(unique, counts)
print("Accuracy:", counts[1]/counts.sum())

[False  True] [1832 7488]
Accuracy: 0.8034334763948497


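Just like before, lowering the threshold to 0.1 brings the accuracy to about 0.8 for both targets: it rises from 0.13 to 0.80 for positive targets, at the cost of dropping from 0.99 to 0.80 for negative targets.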
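
The threshold of 0.1 was picked by hand. A minimal sketch of how the same search could be automated is shown below: it tries a list of candidate thresholds and keeps the one where the two per-class accuracies are closest. The helper names `per_class_accuracy` and `balanced_threshold` are hypothetical, and the scores are synthetic stand-ins, not the notebook's `y_pred`. Ideally the threshold would be chosen on a validation split rather than on the test set used to report the result.

```python
import numpy as np

def per_class_accuracy(y_true, y_score, threshold):
    """Return (accuracy on negatives, accuracy on positives) at a threshold."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    neg_acc = np.mean(y_score[y_true == 0] <= threshold)
    pos_acc = np.mean(y_score[y_true == 1] > threshold)
    return float(neg_acc), float(pos_acc)

def balanced_threshold(y_true, y_score, candidates):
    """Return the candidate where both per-class accuracies are closest."""
    gaps = []
    for t in candidates:
        neg_acc, pos_acc = per_class_accuracy(y_true, y_score, t)
        gaps.append(abs(neg_acc - pos_acc))
    return float(candidates[int(np.argmin(gaps))])

# Synthetic, imbalanced stand-in data (assumption: NOT the real predictions).
rng = np.random.default_rng(0)
y_true_demo = np.array([0] * 900 + [1] * 100)
y_score_demo = np.where(
    y_true_demo == 1,
    rng.beta(2, 5, 1000),    # positives: scores skewed low, as in the notebook
    rng.beta(1, 12, 1000),   # negatives: scores close to zero
)

candidates = np.linspace(0.01, 0.99, 99)
best_t = balanced_threshold(y_true_demo, y_score_demo, candidates)
print("Threshold:", best_t)
print("Per-class accuracy:", per_class_accuracy(y_true_demo, y_score_demo, best_t))
```

On the real data, the same two functions could be called with `y_test` and `y_pred` in place of the synthetic arrays.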

Let's save the model

In [38]:
import pickle
with open('models/lightgbm_multiple_train_model.pkl', 'wb') as f:
    pickle.dump(best_model, f)
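
Two practical notes on the cell above: `open(..., 'wb')` raises `FileNotFoundError` if the `models/` folder does not exist yet, and the pickle stores only the model, not the threshold of 0.1 that we need at prediction time. The sketch below shows one possible way to handle both. The helpers `save_pickle` and `load_pickle` are hypothetical, and a plain dict stands in for `best_model` so the snippet runs without training anything.

```python
import pickle
import tempfile
from pathlib import Path

def save_pickle(obj, path):
    """Pickle obj to path, creating the parent folder if it is missing."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open('wb') as f:
        pickle.dump(obj, f)
    return path

def load_pickle(path):
    """Load a pickled object back from path."""
    with Path(path).open('rb') as f:
        return pickle.load(f)

# Stand-in bundle (assumption): in the notebook, "model" would be best_model.
bundle = {"model": "placeholder for best_model", "threshold": 0.1}

# A temporary folder is used here only to keep the example self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_pickle(bundle, Path(tmp_dir) / "models" / "demo_model.pkl")
    restored = load_pickle(saved_path)

print(restored["threshold"])
```

LightGBM boosters also have a `save_model` method that writes a text file, which may be preferable when the model has to be loaded with a different Python or library version.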