# Table of Contents:
1. [Apply GPU to LightGBM](#1)
1. [Required library](#2)
1. [Preprocess](#3)
1. [Initial conditions](#4)
1. [Utility Score](#5)
1. [LightGBM Training with GPU](#6)
1. [Optimisation](#7)
1. [References](#8)

# Apply GPU to LightGBM
## See also [GPU-accelerated LightGBM full](https://www.kaggle.com/dromosys/gpu-accelerated-lightgbm-full)
### Procedure:
1. Change Accelerator to GPU;
1. Change Internet to ON;
1. Add parameters of LightGBM:
    ```
    'device': 'gpu',
    'gpu_platform_id': 0,
    'gpu_device_id': 0
    ```
1. Run execute the cell below.

In [1]:
%%time
!rm -r /opt/conda/lib/python3.6/site-packages/lightgbm
!git clone --recursive https://github.com/Microsoft/LightGBM

rm: cannot remove '/opt/conda/lib/python3.6/site-packages/lightgbm': No such file or directory
Cloning into 'LightGBM'...
remote: Enumerating objects: 92, done.[K
remote: Counting objects: 100% (92/92), done.[K
remote: Compressing objects: 100% (60/60), done.[K
remote: Total 20816 (delta 47), reused 54 (delta 31), pack-reused 20724[K
Receiving objects: 100% (20816/20816), 16.43 MiB | 21.19 MiB/s, done.
Resolving deltas: 100% (15173/15173), done.
Submodule 'include/boost/compute' (https://github.com/boostorg/compute) registered for path 'compute'
Submodule 'eigen' (https://gitlab.com/libeigen/eigen.git) registered for path 'eigen'
Submodule 'external_libs/fast_double_parser' (https://github.com/lemire/fast_double_parser.git) registered for path 'external_libs/fast_double_parser'
Submodule 'external_libs/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'external_libs/fmt'
Cloning into '/kaggle/working/LightGBM/compute'...
remote: Enumerating objects: 21728, d

In [2]:
%%time
!apt-get install -y -qq libboost-all-dev

CPU times: user 32.8 ms, sys: 6.99 ms, total: 39.8 ms
Wall time: 2.6 s


In [3]:
%%time
%%bash
cd LightGBM
rm -r build
mkdir build
cd build
cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda/include/ ..
make -j$(nproc)

-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - not found
-- Looking for CL_VERSION_2_1
-- Looking for CL_VERSION_2_1 - not found
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking

rm: cannot remove 'build': No such file or directory
cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
/u

CPU times: user 47.2 ms, sys: 28.9 ms, total: 76.2 ms
Wall time: 5min 12s


In [4]:
%%time
!cd LightGBM/python-package/;python3 setup.py install --precompile

running install
running build
running build_py
creating build
creating build/lib
creating build/lib/lightgbm
copying lightgbm/sklearn.py -> build/lib/lightgbm
copying lightgbm/compat.py -> build/lib/lightgbm
copying lightgbm/libpath.py -> build/lib/lightgbm
copying lightgbm/callback.py -> build/lib/lightgbm
copying lightgbm/dask.py -> build/lib/lightgbm
copying lightgbm/engine.py -> build/lib/lightgbm
copying lightgbm/plotting.py -> build/lib/lightgbm
copying lightgbm/__init__.py -> build/lib/lightgbm
copying lightgbm/basic.py -> build/lib/lightgbm
running egg_info
creating lightgbm.egg-info
writing lightgbm.egg-info/PKG-INFO
writing dependency_links to lightgbm.egg-info/dependency_links.txt
writing requirements to lightgbm.egg-info/requires.txt
writing top-level names to lightgbm.egg-info/top_level.txt
writing manifest file 'lightgbm.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
no previously-included directories found matching 'build'
writing m

In [5]:
%%time
!mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
!rm -r LightGBM

CPU times: user 8.92 ms, sys: 6.94 ms, total: 15.9 ms
Wall time: 1.57 s


# Required library

In [6]:
%%time
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import numpy as np
import os
import random
from optuna.samplers import TPESampler
import multiprocessing
import lightgbm as lgb
import optuna
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
import pickle
from sklearn.utils import resample

CPU times: user 712 ms, sys: 151 ms, total: 863 ms
Wall time: 1.74 s


# Initial conditions

In [7]:
%%time
n_trials = int(3)
num_boost_round = int(300)
early_stopping_rounds = int(30)
verbose_eval = int(30)
SEED = 2000

CPU times: user 5 µs, sys: 1 µs, total: 6 µs
Wall time: 10.3 µs


In [8]:
%%time
# Function to seed everything
def seed_everything(seed):
    random.seed(seed)
    np.random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
seed_everything(SEED)

CPU times: user 137 µs, sys: 27 µs, total: 164 µs
Wall time: 169 µs


# Preprocess

In [9]:
%%time
X_train = pd.read_pickle('../input/research-with-customized-sharp-weighted/X_train.pickle')
y_train = pd.read_pickle('../input/research-with-customized-sharp-weighted/y_train.pickle')

CPU times: user 819 ms, sys: 2.89 s, total: 3.71 s
Wall time: 26.7 s


# [Utility Score](https://www.kaggle.com/satorushibata/optimized-lightgbm-classifier-on-utility-score#Utility-Score)
- See also [Approx. Utility score for Regression Objective Fnc](https://www.kaggle.com/anlgrbz/approx-utility-score-for-regress-weighted-traing#Custom-Objective-Function-Implementations-for-LGBM)

In [10]:
%%time
def mse_modified_1(y_pred, y_train):
    lmbda = 0.2
    days = y_train.get_weight()
    y_train = y_train.get_label()
    residual = (y_train - y_pred).astype("float64")
    signs_matching = (y_train * y_pred) >= 0    
    grad = np.where(signs_matching, -2 * residual,  -2 * residual - y_train*lmbda)
    hess = np.where(signs_matching, 2 , 2 )
    return grad, hess

def Eval_mse_modified_1(y_pred, y_train):
    lmbda = 0.2
    #weight = y_train.get_weight()
    y_train = y_train.get_label()
    residual = (y_train - y_pred).astype("float")
    signs_matching = (y_train * y_pred) >= 0
    mse_action_value = np.where(signs_matching,  residual**2, residual**2 - y_train*y_pred*lmbda)
    return "MSE_Modified_1", np.mean(mse_action_value), False

CPU times: user 5 µs, sys: 0 ns, total: 5 µs
Wall time: 8.34 µs


# [LightGBM Training](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.train.html) with GPU

In [11]:
%%time
X_train_adv, X_valid_adv, y_train_adv, y_valid_adv = \
    train_test_split(X_train, y_train, test_size=0.33, random_state=int(SEED), shuffle=True)

CPU times: user 2.94 s, sys: 369 ms, total: 3.3 s
Wall time: 3.31 s


In [12]:
%%time
# Make the model of LightGBM and plot results.
def LightGBM(params, X_train_adv, X_valid_adv, y_train_adv, y_valid_adv):
    # Set data
    lgb_train = lgb.Dataset(X_train_adv, y_train_adv)
    lgb_valid = lgb.Dataset(X_valid_adv, y_valid_adv, reference = lgb_train)
    # Training
    model = lgb.train(
        params,
        lgb_train,
        valid_sets = [lgb_train, lgb_valid],
        num_boost_round = num_boost_round,
        early_stopping_rounds = early_stopping_rounds,
        verbose_eval = verbose_eval,
        fobj  = mse_modified_1,
        feval = Eval_mse_modified_1
    )
    # Prediction
    y_pred = model.predict(X_valid_adv, num_iteration = model.best_iteration)
    y_pred_boot = resample(y_pred, n_samples = len(y_train_adv))
    # Evaluation
    ROC_AUC_Score = roc_auc_score(y_train_adv, y_pred_boot)
    print('ROC AUC Score of LightGBM =', ROC_AUC_Score)
    return ROC_AUC_Score

CPU times: user 3 µs, sys: 2 µs, total: 5 µs
Wall time: 8.58 µs


# [Optimisation](https://tech.preferred.jp/en/blog/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization/)

In [13]:
%%time
def objective(trial):
    params = {
        'task': 'train',
        'objective': 'binary',
        'boosting_type': 'gbdt',
        'seed': SEED,
        'metric': 'AUC',
        'importance_type': 'gain',
        'lambda_l1': trial.suggest_loguniform('lambda_l1', 0.1, 0.9),
        'lambda_l2': trial.suggest_loguniform('lambda_l2', 0.1, 0.9),
        'num_leaves': trial.suggest_int('num_leaves', 2, 256),
        'feature_fraction': trial.suggest_uniform('feature_fraction', 0.4, 1.0),
        'bagging_fraction': trial.suggest_uniform('bagging_fraction', 0.4, 1.0),
        'bagging_freq': trial.suggest_int('bagging_freq', 1, 7),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
        'verbose': 0,
        'num_threads': multiprocessing.cpu_count(),
        'device': 'gpu',
        'gpu_platform_id': 0,
        'gpu_device_id': 0
    }
    
    return LightGBM(params, X_train_adv, X_valid_adv, y_train_adv, y_valid_adv)

CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 7.39 µs


In [14]:
%%time
# Optimize Hyperparameters.
study = optuna.create_study(direction = 'maximize', sampler = TPESampler(seed=int(SEED)))
study.optimize(objective, n_trials = n_trials, n_jobs = multiprocessing.cpu_count(), timeout=60*60*2)

[32m[I 2021-01-03 00:43:51,674][0m A new study created in memory with name: no-name-1330de3c-5193-4251-8bc7-1cfe975f5f93[0m


Training until validation scores don't improve for 30 rounds
Training until validation scores don't improve for 30 rounds
[30]	training's auc: 0.617354	training's MSE_Modified_1: 0.241895	valid_1's auc: 0.584778	valid_1's MSE_Modified_1: 0.244105
[30]	training's auc: 0.627747	training's MSE_Modified_1: 0.240572	valid_1's auc: 0.595595	valid_1's MSE_Modified_1: 0.242909
[60]	training's auc: 0.657617	training's MSE_Modified_1: 0.236449	valid_1's auc: 0.601122	valid_1's MSE_Modified_1: 0.240917
[60]	training's auc: 0.670702	training's MSE_Modified_1: 0.234784	valid_1's auc: 0.612927	valid_1's MSE_Modified_1: 0.239551
[90]	training's auc: 0.688369	training's MSE_Modified_1: 0.232513	valid_1's auc: 0.610426	valid_1's MSE_Modified_1: 0.239309
[90]	training's auc: 0.702956	training's MSE_Modified_1: 0.230651	valid_1's auc: 0.62194	valid_1's MSE_Modified_1: 0.237899
[120]	training's auc: 0.713222	training's MSE_Modified_1: 0.229081	valid_1's auc: 0.616837	valid_1's MSE_Modified_1: 0.238136
[12

[32m[I 2021-01-03 00:51:59,099][0m Trial 1 finished with value: 0.49990132082423405 and parameters: {'lambda_l1': 0.8076200712613311, 'lambda_l2': 0.7661088764095041, 'num_leaves': 244, 'feature_fraction': 0.4315111154517529, 'bagging_fraction': 0.6640401673702919, 'bagging_freq': 4, 'min_child_samples': 13}. Best is trial 1 with value: 0.49990132082423405.[0m


ROC AUC Score of LightGBM = 0.49990132082423405


[32m[I 2021-01-03 00:52:24,578][0m Trial 0 finished with value: 0.499274673997011 and parameters: {'lambda_l1': 0.1850879731094282, 'lambda_l2': 0.38161872695118404, 'num_leaves': 254, 'feature_fraction': 0.7254889593129901, 'bagging_fraction': 0.8134460938965464, 'bagging_freq': 3, 'min_child_samples': 42}. Best is trial 1 with value: 0.49990132082423405.[0m


ROC AUC Score of LightGBM = 0.499274673997011
Training until validation scores don't improve for 30 rounds
[30]	training's auc: 0.542892	training's MSE_Modified_1: 0.248771	valid_1's auc: 0.539496	valid_1's MSE_Modified_1: 0.248945
[60]	training's auc: 0.551848	training's MSE_Modified_1: 0.247519	valid_1's auc: 0.546437	valid_1's MSE_Modified_1: 0.247811
[90]	training's auc: 0.558903	training's MSE_Modified_1: 0.246791	valid_1's auc: 0.552001	valid_1's MSE_Modified_1: 0.247204
[120]	training's auc: 0.56553	training's MSE_Modified_1: 0.246105	valid_1's auc: 0.557227	valid_1's MSE_Modified_1: 0.246632
[150]	training's auc: 0.570341	training's MSE_Modified_1: 0.245546	valid_1's auc: 0.560159	valid_1's MSE_Modified_1: 0.246207
[180]	training's auc: 0.574627	training's MSE_Modified_1: 0.245089	valid_1's auc: 0.563185	valid_1's MSE_Modified_1: 0.24586
[210]	training's auc: 0.579154	training's MSE_Modified_1: 0.2446	valid_1's auc: 0.566369	valid_1's MSE_Modified_1: 0.245494
[240]	training's a

[32m[I 2021-01-03 00:54:50,355][0m Trial 2 finished with value: 0.5005389338954468 and parameters: {'lambda_l1': 0.12085936406119104, 'lambda_l2': 0.2078915827751604, 'num_leaves': 15, 'feature_fraction': 0.8317518572850652, 'bagging_fraction': 0.9444576212537554, 'bagging_freq': 2, 'min_child_samples': 55}. Best is trial 2 with value: 0.5005389338954468.[0m


ROC AUC Score of LightGBM = 0.5005389338954468
CPU times: user 17min 17s, sys: 2min 34s, total: 19min 51s
Wall time: 10min 58s


In [15]:
%%time
# Save
pickle.dump(study.best_trial.params, open('LightGBM_Hyperparameter.pickle', 'wb'))

CPU times: user 789 µs, sys: 120 µs, total: 909 µs
Wall time: 701 µs


# References:
- [LightGBM Tuner: New Optuna Integration for Hyperparameter Optimization](https://tech.preferred.jp/en/blog/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization/)
- [Optimized LightGBM Classifier on Utility Score](https://www.kaggle.com/satorushibata/optimized-lightgbm-classifier-on-utility-score)
- [Research with Customized Sharp Weighted](https://www.kaggle.com/satorushibata/research-with-customized-sharp-weighted)
- [GPU-accelerated LightGBM full](https://www.kaggle.com/dromosys/gpu-accelerated-lightgbm-full)