# Table of Contents:
1. [Apply GPU to LightGBM](#Apply-GPU-to-LightGBM)
1. [Required library](#Required-library)
1. [Initial conditions](#Initial-conditions)
1. [Preprocess](#Preprocess)
1. [Utility Score](#Utility-Score)
1. [LightGBM Training with GPU](#LightGBM-Training-with-GPU)
1. [Optimisation](#Optimisation)
1. [References](#References)

# Apply GPU to LightGBM
## See also [GPU-accelerated LightGBM full](https://www.kaggle.com/dromosys/gpu-accelerated-lightgbm-full)
### Procedure:
1. Change Accelerator to GPU;
1. Change Internet to ON;
1. Add parameters of LightGBM:
    ```
    'device': 'gpu',
    'gpu_platform_id': 0,
    'gpu_device_id': 0
    ```
1. Run the cells below.
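
Step 3 of the procedure can be sketched as a small merge of the three GPU settings into an existing parameter dictionary. `GPU_PARAMS` and `with_gpu` are hypothetical names introduced for illustration, and the base parameter values are placeholders rather than the settings tuned later in this notebook.

```python
# GPU settings named in step 3 of the procedure above.
GPU_PARAMS = {
    'device': 'gpu',
    'gpu_platform_id': 0,
    'gpu_device_id': 0,
}

def with_gpu(params):
    """Return a copy of `params` with the GPU settings added (hypothetical helper)."""
    merged = dict(params)
    merged.update(GPU_PARAMS)
    return merged

base_params = {'objective': 'binary', 'learning_rate': 0.05}  # placeholder values
gpu_params = with_gpu(base_params)
print(gpu_params)
```

Copying the dictionary leaves the CPU parameter set untouched, so the same base settings can be reused for a CPU comparison run.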

In [1]:
%%time
!rm -r /opt/conda/lib/python3.6/site-packages/lightgbm
!git clone --recursive https://github.com/Microsoft/LightGBM

rm: cannot remove '/opt/conda/lib/python3.6/site-packages/lightgbm': No such file or directory
Cloning into 'LightGBM'...
remote: Enumerating objects: 56, done.
remote: Counting objects: 100% (56/56), done.
remote: Compressing objects: 100% (48/48), done.
remote: Total 20886 (delta 22), reused 17 (delta 7), pack-reused 20830
Receiving objects: 100% (20886/20886), 16.48 MiB | 18.01 MiB/s, done.
Resolving deltas: 100% (15242/15242), done.
Submodule 'include/boost/compute' (https://github.com/boostorg/compute) registered for path 'compute'
Submodule 'eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'eigen'
Submodule 'external_libs/fast_double_parser' (https://github.com/lemire/fast_double_parser.git) registered for path 'external_libs/fast_double_parser'
Submodule 'external_libs/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'external_libs/fmt'
Cloning into '/kaggle/working/LightGBM/compute'...
remote: Enumerating objects: 21728, do

In [2]:
%%time
!apt-get install -y -qq libboost-all-dev

CPU times: user 30.1 ms, sys: 9.77 ms, total: 39.9 ms
Wall time: 2.35 s


In [3]:
%%time
%%bash
cd LightGBM
rm -r build
mkdir build
cd build
cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..
make -j$(nproc)

-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - not found
-- Looking for CL_VERSION_2_1
-- Looking for CL_VERSION_2_1 - not found
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking

rm: cannot remove 'build': No such file or directory
cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)

CPU times: user 44.1 ms, sys: 30 ms, total: 74.1 ms
Wall time: 5min 12s


In [4]:
%%time
!cd LightGBM/python-package/;python3 setup.py install --precompile

running install
running build
running build_py
creating build
creating build/lib
creating build/lib/lightgbm
copying lightgbm/engine.py -> build/lib/lightgbm
copying lightgbm/dask.py -> build/lib/lightgbm
copying lightgbm/compat.py -> build/lib/lightgbm
copying lightgbm/basic.py -> build/lib/lightgbm
copying lightgbm/sklearn.py -> build/lib/lightgbm
copying lightgbm/plotting.py -> build/lib/lightgbm
copying lightgbm/callback.py -> build/lib/lightgbm
copying lightgbm/__init__.py -> build/lib/lightgbm
copying lightgbm/libpath.py -> build/lib/lightgbm
running egg_info
creating lightgbm.egg-info
writing lightgbm.egg-info/PKG-INFO
writing dependency_links to lightgbm.egg-info/dependency_links.txt
writing requirements to lightgbm.egg-info/requires.txt
writing top-level names to lightgbm.egg-info/top_level.txt
writing manifest file 'lightgbm.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'build'
writing m

In [5]:
%%time
!mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
!rm -r LightGBM

CPU times: user 8.96 ms, sys: 7.18 ms, total: 16.1 ms
Wall time: 1.45 s
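
Before moving on, it may be worth confirming that the freshly installed package can actually train on the GPU. The sketch below is one possible smoke test, not part of the original notebook: `gpu_smoke_test` is a hypothetical helper that trains a single boosting round on synthetic data and reports failure instead of raising, on the assumption that a CPU-only build or a missing OpenCL device surfaces as an exception.

```python
def gpu_smoke_test():
    """Return True if a one-round GPU training run succeeds, else False."""
    try:
        import numpy as np
        import lightgbm as lgb

        # Small synthetic binary problem; the data itself is irrelevant here.
        rng = np.random.RandomState(0)
        X = rng.rand(500, 5)
        y = (X[:, 0] > 0.5).astype(int)
        params = {
            'objective': 'binary',
            'device': 'gpu',
            'gpu_platform_id': 0,
            'gpu_device_id': 0,
            'verbose': -1,
        }
        lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=1)
        return True
    except Exception as exc:  # missing package, CPU-only build, or no usable device
        print('GPU training unavailable:', exc)
        return False

gpu_ok = gpu_smoke_test()
print('LightGBM GPU build usable:', gpu_ok)
```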


# Required library

In [6]:
%%time
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import numpy as np
import os
import random
from optuna.samplers import TPESampler
import multiprocessing
import lightgbm as lgb
import optuna
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
import pickle
from sklearn.utils import resample

CPU times: user 709 ms, sys: 161 ms, total: 870 ms
Wall time: 1.99 s


# Initial conditions

In [7]:
%%time
n_trials = 3
num_boost_round = 300
early_stopping_rounds = 30
verbose_eval = 30
SEED = 2000

CPU times: user 5 µs, sys: 1 µs, total: 6 µs
Wall time: 11.2 µs


In [8]:
%%time
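# Seed the Python and NumPy random number generators for reproducibility.
def seed_everything(seed):
    random.seed(seed)
    np.random.seed(seed)
    # PYTHONHASHSEED is read at interpreter startup, so setting it here
    # only affects subprocesses launched afterwards.
    os.environ['PYTHONHASHSEED'] = str(seed)
seed_everything(SEED)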

CPU times: user 0 ns, sys: 143 µs, total: 143 µs
Wall time: 147 µs
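
What seeding provides can be checked directly: reseeding with the same value reproduces the same draws, while a different seed does not. The sketch below uses only the standard-library `random` module to stay self-contained; `draw` is a hypothetical helper, and the `np.random.seed` call in `seed_everything` plays the same role for NumPy.

```python
import random

def draw(seed, n=3):
    """Reseed the global generator and return n pseudo-random floats."""
    random.seed(seed)
    return [random.random() for _ in range(n)]

first = draw(2000)
second = draw(2000)   # same seed as above
other = draw(2001)    # different seed

print(first == second)
print(first == other)
```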


# Preprocess

In [9]:
%%time
X_train = pd.read_pickle('../input/research-with-customized-sharp-weighted/X_train.pickle')
y_train = pd.read_pickle('../input/research-with-customized-sharp-weighted/y_train.pickle')

CPU times: user 725 ms, sys: 2.87 s, total: 3.59 s
Wall time: 39.3 s


# [Utility Score](https://www.kaggle.com/satorushibata/optimized-lightgbm-classifier-on-utility-score#Utility-Score)
- See also [Approx. Utility score for Regression Objective Fnc](https://www.kaggle.com/anlgrbz/approx-utility-score-for-regress-weighted-traing#Custom-Objective-Function-Implementations-for-LGBM)

In [10]:
%%time
def mse_modified_1(y_pred, train_data):
    # Custom objective: squared error with an extra penalty when the
    # prediction and the label have opposite signs.
    lmbda = 0.2
    y_true = train_data.get_label()
    residual = (y_true - y_pred).astype("float64")
    signs_matching = (y_true * y_pred) >= 0
    # d/dy_pred of residual**2 is -2 * residual; the mismatch term
    # -lmbda * y_true * y_pred contributes -lmbda * y_true.
    grad = np.where(signs_matching, -2 * residual, -2 * residual - y_true * lmbda)
    # The second derivative is 2 in both branches.
    hess = np.full_like(residual, 2.0)
    return grad, hess

def Eval_mse_modified_1(y_pred, train_data):
    # Matching evaluation metric; the final False marks it as lower-is-better.
    lmbda = 0.2
    y_true = train_data.get_label()
    residual = (y_true - y_pred).astype("float64")
    signs_matching = (y_true * y_pred) >= 0
    mse_action_value = np.where(signs_matching, residual**2, residual**2 - y_true * y_pred * lmbda)
    return "MSE_Modified_1", np.mean(mse_action_value), False

CPU times: user 7 µs, sys: 0 ns, total: 7 µs
Wall time: 12.2 µs
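
The gradient returned by `mse_modified_1` should be the derivative of the per-sample value averaged in `Eval_mse_modified_1`. A minimal finite-difference check, written for scalars with the standard library only, is sketched below; `loss`, `grad`, `numeric_grad`, and the test points are illustrative and not part of the original notebook.

```python
LMBDA = 0.2  # same penalty weight as in mse_modified_1

def loss(y, p):
    """Per-sample value averaged by Eval_mse_modified_1."""
    r = y - p
    return r**2 if y * p >= 0 else r**2 - LMBDA * y * p

def grad(y, p):
    """Per-sample gradient with respect to the prediction p, as in mse_modified_1."""
    r = y - p
    return -2 * r if y * p >= 0 else -2 * r - LMBDA * y

def numeric_grad(y, p, eps=1e-6):
    """Central finite-difference approximation of d loss / d p."""
    return (loss(y, p + eps) - loss(y, p - eps)) / (2 * eps)

# Two sign-matching and two sign-mismatching points, away from the kink at zero.
points = [(1.0, 0.3), (1.0, -0.3), (-0.5, 0.8), (-0.5, -0.8)]
max_error = max(abs(grad(y, p) - numeric_grad(y, p)) for y, p in points)
print(max_error)
```

Both branches are quadratic in the prediction, so the Hessian is the constant 2 in either case, which is what the objective returns.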


# [LightGBM Training](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.train.html) with GPU

In [11]:
%%time
X_train_adv, X_valid_adv, y_train_adv, y_valid_adv = \
    train_test_split(X_train, y_train, test_size=0.33, random_state=int(SEED), shuffle=True)

CPU times: user 2.76 s, sys: 396 ms, total: 3.16 s
Wall time: 3.17 s


In [12]:
%%time
# Make the model of LightGBM and plot results.
def LightGBM(params, X_train_adv, X_valid_adv, y_train_adv, y_valid_adv):
    # Set data
    lgb_train = lgb.Dataset(X_train_adv, y_train_adv)
    lgb_valid = lgb.Dataset(X_valid_adv, y_valid_adv, reference = lgb_train)
    # Training
    model = lgb.train(
        params,
        lgb_train,
        valid_sets = [lgb_train, lgb_valid],
        num_boost_round = num_boost_round,
        early_stopping_rounds = early_stopping_rounds,
        verbose_eval = verbose_eval,
        fobj  = mse_modified_1,
        feval = Eval_mse_modified_1
    )
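    # Prediction
    y_pred = model.predict(X_valid_adv, num_iteration = model.best_iteration)
    # Bootstrap-resample the validation predictions to the length of the training labels.
    # Resampling breaks the row-wise pairing with the labels, so the score below stays
    # near 0.5; roc_auc_score(y_valid_adv, y_pred) would measure validation AUC directly.
    y_pred_boot = resample(y_pred, n_samples = len(y_train_adv))
    # Evaluation
    ROC_AUC_Score = roc_auc_score(y_train_adv, y_pred_boot)
    print('ROC AUC Score of LightGBM =', ROC_AUC_Score)
    return ROC_AUC_Score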

CPU times: user 5 µs, sys: 0 ns, total: 5 µs
Wall time: 8.11 µs
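
The evaluation step in `LightGBM` scores bootstrap-resampled validation predictions against the training labels. Because resampling discards the pairing between each prediction and its label, an AUC computed that way is expected to sit near 0.5 regardless of model quality. The sketch below illustrates this on synthetic data with a small rank-based AUC written in pure Python; `roc_auc` is a hypothetical stand-in for `sklearn.metrics.roc_auc_score`, and the labels and scores are made up.

```python
import random

def roc_auc(labels, scores):
    """Rank-based ROC AUC (Mann-Whitney U); tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = random.Random(2000)
labels = [rng.randint(0, 1) for _ in range(5000)]
scores = [y + rng.gauss(0, 1) for y in labels]      # informative synthetic scores
resampled = [rng.choice(scores) for _ in labels]    # bootstrap draw, pairing lost

auc_paired = roc_auc(labels, scores)
auc_resampled = roc_auc(labels, resampled)
print(auc_paired, auc_resampled)
```

Scoring `y_valid_adv` against the unresampled `y_pred` would keep the pairing intact and give Optuna a signal that actually depends on the hyperparameters.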


# [Optimisation](https://tech.preferred.jp/en/blog/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization/)

In [13]:
%%time
def objective(trial):
    params = {
        'task': 'train',
        'objective': 'binary',
        'boosting_type': 'gbdt',
        'seed': SEED,
        'metric': 'AUC',
        'importance_type': 'gain',
        'lambda_l1': trial.suggest_loguniform('lambda_l1', 0.1, 0.9),
        'lambda_l2': trial.suggest_loguniform('lambda_l2', 0.1, 0.9),
        'num_leaves': trial.suggest_int('num_leaves', 2, 256),
        'feature_fraction': trial.suggest_uniform('feature_fraction', 0.4, 1.0),
        'bagging_fraction': trial.suggest_uniform('bagging_fraction', 0.4, 1.0),
        'bagging_freq': trial.suggest_int('bagging_freq', 1, 7),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
        'learning_rate': trial.suggest_loguniform('learning_rate', 1e-3, 1e-1),  # An additional item
        'verbose': 0,
        'num_threads': multiprocessing.cpu_count(),
        'device': 'gpu',
        'gpu_platform_id': 0,
        'gpu_device_id': 0
    }
    
    return LightGBM(params, X_train_adv, X_valid_adv, y_train_adv, y_valid_adv)

CPU times: user 7 µs, sys: 0 ns, total: 7 µs
Wall time: 11.7 µs
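
The `suggest_loguniform` ranges above describe values whose logarithm is spread evenly over the interval, which suits parameters such as `learning_rate` that span orders of magnitude. The sketch below shows what such a draw looks like using the standard library only; `sample_loguniform` is a hypothetical helper, not Optuna's implementation, and the TPE sampler may concentrate on promising regions in later trials.

```python
import math
import random

def sample_loguniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(2000)
samples = [sample_loguniform(rng, 1e-3, 1e-1) for _ in range(10000)]

# 1e-2 is the geometric midpoint of [1e-3, 1e-1], so roughly half of the
# draws should fall below it.
share_below_midpoint = sum(s < 1e-2 for s in samples) / len(samples)
print(round(share_below_midpoint, 2))
```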


In [14]:
%%time
# Optimize Hyperparameters.
study = optuna.create_study(direction = 'maximize', sampler = TPESampler(seed=int(SEED)))
study.optimize(objective, n_trials = n_trials, n_jobs = multiprocessing.cpu_count(), timeout=60*60*2)

[I 2021-01-09 23:14:29,373] A new study created in memory with name: no-name-64faa418-d60d-4348-8ad1-5bc232bd7786


Training until validation scores don't improve for 30 rounds
Training until validation scores don't improve for 30 rounds
[30]	training's auc: 0.531935	training's MSE_Modified_1: 0.471598	valid_1's auc: 0.529964	valid_1's MSE_Modified_1: 0.472164
[30]	training's auc: 0.558059	training's MSE_Modified_1: 0.26126	valid_1's auc: 0.550707	valid_1's MSE_Modified_1: 0.261711
[60]	training's auc: 0.532275	training's MSE_Modified_1: 0.443152	valid_1's auc: 0.530123	valid_1's MSE_Modified_1: 0.443688
[60]	training's auc: 0.570289	training's MSE_Modified_1: 0.246958	valid_1's auc: 0.558593	valid_1's MSE_Modified_1: 0.247615
[90]	training's auc: 0.532597	training's MSE_Modified_1: 0.418331	valid_1's auc: 0.530403	valid_1's MSE_Modified_1: 0.41884
[90]	training's auc: 0.582245	training's MSE_Modified_1: 0.245071	valid_1's auc: 0.566469	valid_1's MSE_Modified_1: 0.246001
[120]	training's auc: 0.532809	training's MSE_Modified_1: 0.396682	valid_1's auc: 0.530646	valid_1's MSE_Modified_1: 0.397165
[150

[I 2021-01-09 23:20:09,139] Trial 0 finished with value: 0.5006588997279999 and parameters: {'lambda_l1': 0.16596310765880176, 'lambda_l2': 0.7014968236195287, 'num_leaves': 24, 'feature_fraction': 0.8612281686109593, 'bagging_fraction': 0.8740167598632683, 'bagging_freq': 4, 'min_child_samples': 38, 'learning_rate': 0.002277832161590344}. Best is trial 0 with value: 0.5006588997279999.


ROC AUC Score of LightGBM = 0.5006588997279999
[300]	training's auc: 0.632565	training's MSE_Modified_1: 0.239827	valid_1's auc: 0.588873	valid_1's MSE_Modified_1: 0.242957
Did not meet early stopping. Best iteration is:
[300]	training's auc: 0.632565	training's MSE_Modified_1: 0.239827	valid_1's auc: 0.588873	valid_1's MSE_Modified_1: 0.242957


[I 2021-01-09 23:20:55,449] Trial 1 finished with value: 0.49972937371510107 and parameters: {'lambda_l1': 0.3693474339071541, 'lambda_l2': 0.11610430896144414, 'num_leaves': 67, 'feature_fraction': 0.44850996974606466, 'bagging_fraction': 0.5113115123586354, 'bagging_freq': 6, 'min_child_samples': 74, 'learning_rate': 0.04755150399193334}. Best is trial 0 with value: 0.5006588997279999.


ROC AUC Score of LightGBM = 0.49972937371510107
Training until validation scores don't improve for 30 rounds
[30]	training's auc: 0.547703	training's MSE_Modified_1: 0.332735	valid_1's auc: 0.543812	valid_1's MSE_Modified_1: 0.333171
[60]	training's auc: 0.553683	training's MSE_Modified_1: 0.27577	valid_1's auc: 0.548125	valid_1's MSE_Modified_1: 0.27616
[90]	training's auc: 0.558277	training's MSE_Modified_1: 0.25664	valid_1's auc: 0.551244	valid_1's MSE_Modified_1: 0.257047
[120]	training's auc: 0.56319	training's MSE_Modified_1: 0.250032	valid_1's auc: 0.554631	valid_1's MSE_Modified_1: 0.250494
[150]	training's auc: 0.567134	training's MSE_Modified_1: 0.24763	valid_1's auc: 0.556985	valid_1's MSE_Modified_1: 0.248169
[180]	training's auc: 0.571084	training's MSE_Modified_1: 0.246556	valid_1's auc: 0.55974	valid_1's MSE_Modified_1: 0.24717
[210]	training's auc: 0.575084	training's MSE_Modified_1: 0.245944	valid_1's auc: 0.562346	valid_1's MSE_Modified_1: 0.246646
[240]	training's au

[I 2021-01-09 23:23:39,314] Trial 2 finished with value: 0.5000911941447237 and parameters: {'lambda_l1': 0.22362622590522555, 'lambda_l2': 0.5911957446525865, 'num_leaves': 53, 'feature_fraction': 0.5741729577906235, 'bagging_fraction': 0.43216525403335093, 'bagging_freq': 2, 'min_child_samples': 49, 'learning_rate': 0.018322486355592714}. Best is trial 0 with value: 0.5006588997279999.


ROC AUC Score of LightGBM = 0.5000911941447237
CPU times: user 14min 41s, sys: 1min 57s, total: 16min 38s
Wall time: 9min 10s


In [15]:
%%time
# Save the best hyperparameters; the context manager closes the file after writing.
with open('LightGBM_Hyperparameter.pickle', 'wb') as f:
    pickle.dump(study.best_trial.params, f)

CPU times: user 0 ns, sys: 779 µs, total: 779 µs
Wall time: 623 µs
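
A later notebook can read the saved dictionary back and merge it over the fixed settings before retraining. The sketch below round-trips a small dictionary through a temporary directory so that it runs on its own; the two values are copied from trial 0 in the log above for illustration, and `final_params` is a hypothetical name.

```python
import os
import pickle
import tempfile

# Illustrative subset; in the notebook this is study.best_trial.params.
best_params = {'num_leaves': 24, 'bagging_freq': 4}

path = os.path.join(tempfile.mkdtemp(), 'LightGBM_Hyperparameter.pickle')
with open(path, 'wb') as f:
    pickle.dump(best_params, f)

with open(path, 'rb') as f:
    loaded = pickle.load(f)

# Tuned values are merged over the fixed settings before retraining.
final_params = {'objective': 'binary', 'device': 'gpu', **loaded}
print(final_params)
```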


# References:
- [LightGBM Tuner: New Optuna Integration for Hyperparameter Optimization](https://tech.preferred.jp/en/blog/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization/)
- [Optimized LightGBM Classifier on Utility Score](https://www.kaggle.com/satorushibata/optimized-lightgbm-classifier-on-utility-score)
- [Research with Customized Sharp Weighted](https://www.kaggle.com/satorushibata/research-with-customized-sharp-weighted)
- [GPU-accelerated LightGBM full](https://www.kaggle.com/dromosys/gpu-accelerated-lightgbm-full)
- [lightgbm.LGBMModel](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.LGBMModel.html)