# Bayesian global optimization with Gaussian processes for finding (sub-)optimal parameters of LightGBM

Many fellow Kagglers have asked how I got the LightGBM parameters for the kernel [Customer Transaction Prediction](https://www.kaggle.com/fayzur/customer-transaction-prediction) I published, so I decided to publish a kernel that optimizes the parameters.



In this kernel I use Bayesian global optimization with Gaussian processes to find (sub-)optimal parameters. This optimization attempts to find the maximum value of a black box function in as few iterations as possible. In our case the black box function is a function I will write that trains LightGBM with a given set of parameters and returns the evaluation metric (AUC) on a validation set. The optimizer then searches for the parameters that maximize this validation AUC, in the hope that they also do well on the private leaderboard. The final prediction will be a **rank average of the k-fold cross-validation predictions**.

Continue to the end of this kernel and **upvote it if you find it interesting**.

![image.jpg](https://i.imgur.com/XKS1oqU.jpg)

Image taken from: https://github.com/fmfn/BayesianOptimization

## Notebook Content
0. [Installing Bayesian global optimization library](#0)
1. [Loading the data](#1)
2. [Black box function to be optimized (LightGBM)](#2)
3. [Training LightGBM model](#3)
4. [Rank averaging](#4)
5. [Submission](#5)

<a id="0"></a> <br>
## 0. Installing Bayesian global optimization library

Let's install the latest release with pip:

In [1]:
!pip install bayesian-optimization



<a id="1"></a> <br>
## 1. Loading the data

In [2]:
import pandas as pd
import numpy as np
from sklearn.model_selection import StratifiedKFold
from scipy.stats import rankdata
import lightgbm as lgb
from sklearn import metrics
import gc
import warnings

pd.set_option('display.max_columns', 200)

In [5]:
train_df = pd.read_csv('C:\\PythonScripts\\Kaggle\\dont_overfitt\\train.csv')

test_df = pd.read_csv('C:\\PythonScripts\\Kaggle\\dont_overfitt\\test.csv')

We are given an anonymized dataset containing 300 numeric feature columns named `0` to `299`. Let's have a look at the train dataset:

In [6]:
train_df.head()

Unnamed: 0,id,target,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,...,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299
0,0,1.0,-0.098,2.165,0.681,-0.614,1.309,-0.455,-0.236,0.276,-2.246,1.825,-0.912,-0.107,0.305,0.102,0.826,0.417,0.177,-0.673,-0.503,1.864,0.41,-1.927,0.102,-0.931,1.763,1.449,-1.097,-0.686,-0.25,-1.859,1.125,1.009,-2.296,0.385,-0.876,1.528,-0.144,-1.078,-0.403,0.005,1.405,-0.044,-0.458,0.579,2.929,0.833,0.761,0.737,0.669,0.717,-1.542,-1.847,-0.445,1.238,-0.84,-1.891,-1.531,-0.396,-0.927,2.072,0.946,-1.105,0.008,0.933,-1.41,-0.77,1.74,-1.504,-0.391,-1.551,-1.415,-0.974,0.796,-2.464,-1.424,1.23,0.219,0.13,-0.371,-0.93,1.851,1.292,-0.38,1.318,1.146,-0.399,2.227,0.447,0.87,1.42,-1.675,0.019,0.06,0.768,2.563,0.638,1.164,0.407,...,-2.017,-0.485,1.906,-0.119,0.609,-0.564,0.264,-0.604,-0.733,-2.352,-1.661,0.498,-0.841,0.907,-0.476,0.817,1.372,1.187,0.844,0.028,0.029,-0.808,0.253,1.005,1.413,-0.133,0.655,-0.921,0.231,-1.902,-0.005,-1.73,1.132,-0.194,0.039,1.489,-0.328,0.966,-0.057,-0.181,0.723,-0.313,-0.165,-0.803,0.074,-2.851,-1.021,-0.894,0.967,0.218,-0.692,-0.514,0.754,-1.892,0.203,2.174,-0.755,-1.053,-0.516,-1.109,-0.681,1.25,-0.565,-1.318,-0.923,0.075,-0.704,2.457,0.771,-0.46,0.569,-1.32,-1.516,-2.145,-1.12,0.156,0.82,-1.049,-1.125,0.484,0.617,1.253,1.248,0.504,-0.802,-0.896,-1.793,-0.284,-0.601,0.569,0.867,1.347,0.504,-0.649,0.672,-2.097,1.051,-0.414,1.038,-1.065
1,1,0.0,1.081,-0.973,-0.383,0.326,-0.428,0.317,1.172,0.352,0.004,-0.291,2.907,1.085,2.144,1.54,0.584,1.133,1.098,-0.237,-0.498,0.283,-1.1,-0.417,1.382,-0.515,-1.519,0.619,-0.128,0.866,-0.54,1.238,-0.227,0.269,-0.39,-2.721,1.659,0.106,-0.121,1.719,0.411,-0.303,-0.307,0.38,0.503,-1.32,0.339,-1.102,-0.947,0.267,0.695,0.167,0.188,-1.082,-0.872,0.66,0.051,0.303,-0.553,-0.771,0.588,0.472,1.315,-0.467,-0.064,1.808,0.633,1.221,1.112,1.133,-0.543,-2.144,0.151,-0.813,1.966,-1.19,0.19,-0.473,0.002,1.195,-0.799,1.117,-0.759,-0.661,0.406,-0.846,-0.035,-1.634,-0.011,0.503,0.61,-1.822,-0.03,1.188,-0.006,-0.279,1.914,0.62,-1.495,1.787,...,-0.551,0.003,-0.344,-1.194,-0.106,-0.679,0.009,0.372,0.025,0.066,1.005,-0.822,0.468,0.413,0.004,0.329,1.213,0.216,0.584,-0.761,-0.151,-0.175,-0.603,0.007,0.075,-0.354,-0.124,1.299,0.85,-0.318,-0.141,0.154,-0.441,-0.024,0.793,-1.47,0.386,-2.254,-0.463,0.366,-0.676,0.071,0.504,1.5,-1.16,-0.187,-0.43,-1.151,1.764,1.307,-0.731,-1.234,0.96,1.47,0.652,0.483,-2.015,-1.258,0.63,1.158,0.971,-1.489,0.53,0.917,-0.094,-1.407,0.887,-0.104,-0.583,1.267,-1.667,-2.771,-0.516,1.312,0.491,0.932,2.064,0.422,1.215,2.012,0.043,-0.307,-0.059,1.121,1.333,0.211,1.753,0.053,1.274,-0.612,-0.165,-1.695,-1.257,1.359,-0.808,-1.624,-0.458,-1.099,-0.936,0.973
2,2,1.0,-0.523,-0.089,-0.348,0.148,-0.022,0.404,-0.023,-0.172,0.137,0.183,0.459,0.478,-0.425,0.352,1.095,0.3,-1.044,0.27,-1.038,0.144,-1.658,-0.946,0.633,-0.772,1.786,0.136,-0.103,-1.223,2.273,0.055,-2.032,-0.452,0.064,0.924,-0.692,-0.067,-0.917,1.896,-0.152,1.92,-1.244,-1.704,0.167,1.088,0.068,0.972,-1.554,0.218,-2.677,-1.528,0.613,-1.269,0.516,-0.714,-0.347,-1.025,1.34,0.923,-0.071,0.552,0.837,0.847,-0.807,-0.091,1.424,0.943,0.333,0.593,-0.544,0.154,-1.081,0.409,-0.964,1.91,0.837,-1.252,1.492,-0.971,0.355,1.079,0.758,-0.031,-0.101,1.527,-0.942,-0.496,-0.572,0.533,1.02,-1.488,0.696,0.269,-1.476,0.545,0.636,0.857,-1.796,2.54,...,0.968,-0.738,-1.636,-0.533,-0.353,0.635,0.386,-1.081,0.161,-0.791,0.948,1.67,-0.309,1.662,-0.053,0.307,-0.22,0.269,1.873,-0.395,0.186,0.163,-0.118,0.129,0.301,-0.125,-1.181,-0.671,-0.303,-0.541,-0.285,-0.226,0.751,-1.391,-0.906,0.933,0.773,-1.234,-0.967,-0.01,-0.815,1.0,-0.569,-0.486,2.342,0.779,-0.548,-2.33,2.158,2.165,-0.945,-2.269,0.678,0.468,-0.405,1.059,0.483,2.47,1.459,-0.511,-0.54,-0.299,1.074,-0.748,1.086,-0.766,-0.931,0.432,1.345,-0.491,-1.602,-0.727,0.346,0.78,-0.527,-1.122,-0.208,-0.73,-0.302,2.535,-1.045,0.037,0.02,1.373,0.456,-0.277,1.381,1.843,0.749,0.202,0.013,0.263,-1.222,0.726,1.444,-1.165,-1.544,0.004,0.8,-1.211
3,3,1.0,0.067,-0.021,0.392,-1.637,-0.446,-0.725,-1.035,0.834,0.503,0.274,0.335,-1.148,0.067,-1.01,1.048,-1.442,0.21,0.836,-0.326,0.716,-0.764,0.248,-1.308,2.127,0.365,0.296,-0.808,1.854,0.118,0.38,0.999,-1.171,2.798,0.394,-1.048,1.078,0.401,-0.486,-0.732,-2.241,-0.193,0.336,0.009,0.423,1.07,-0.861,1.32,-0.976,-1.096,-0.912,0.548,0.924,0.053,0.57,0.508,-0.717,-1.133,-0.723,0.645,-1.083,0.287,-0.396,0.178,-0.421,0.196,-0.706,-1.458,1.629,-1.112,-0.479,-0.264,0.205,1.092,0.606,-0.276,1.116,0.272,1.1,-0.811,0.037,0.03,0.312,1.848,0.455,-0.934,0.739,0.286,-0.86,0.29,1.188,-0.604,1.103,-1.823,0.863,-0.447,-1.108,-1.151,-0.919,...,1.477,1.02,0.351,0.186,-0.037,-1.73,0.786,0.656,1.259,0.469,-1.561,-0.719,-1.04,0.142,0.505,1.41,1.042,0.066,0.34,-1.029,-1.382,1.35,0.294,0.036,-0.64,0.168,1.069,-0.235,0.327,-1.878,0.9,1.059,-0.458,1.006,0.898,0.955,0.118,0.054,0.347,0.507,0.526,0.899,1.496,-0.447,1.176,1.852,-0.001,-0.414,1.35,0.027,0.795,-0.056,-0.497,0.814,-1.114,-0.8,1.495,-0.591,0.53,-0.528,-0.083,-0.831,1.251,-0.206,-0.933,-1.215,0.281,0.512,-0.424,0.769,0.223,-0.71,2.725,0.176,0.845,-1.226,1.527,-1.701,0.597,0.15,1.864,0.322,-0.214,1.282,0.408,-0.91,1.02,-0.299,-1.574,-1.618,-0.404,0.64,-0.595,-0.966,0.9,0.467,-0.562,-0.254,-0.533,0.238
4,4,1.0,2.347,-0.831,0.511,-0.021,1.225,1.594,0.585,1.509,-0.012,2.198,0.19,0.453,0.494,1.478,-1.412,0.27,-1.312,-0.322,-0.688,-0.198,-0.285,1.042,-0.315,-0.478,0.024,-0.19,1.656,-0.469,-1.437,-0.581,-0.308,-0.837,-1.739,0.037,0.336,-1.102,2.371,0.554,1.173,-0.122,1.528,-1.22,2.054,-0.318,-0.445,0.344,0.161,0.83,-1.328,0.42,0.666,-0.212,-1.016,-0.312,0.62,0.807,0.301,-0.342,1.556,1.138,2.066,-0.755,-1.172,0.679,-0.787,0.357,1.626,-0.142,1.717,-1.424,0.432,0.732,-0.433,-0.937,-0.473,1.246,-0.93,0.35,0.083,-1.058,-0.187,-0.932,-0.054,-0.289,0.663,-1.218,-0.134,1.333,-0.115,0.218,-1.906,0.892,0.475,0.313,0.518,0.114,0.527,1.438,...,-1.221,0.554,-0.137,-0.174,0.567,0.648,-0.739,-0.143,0.742,-0.572,-0.369,0.91,-1.806,-0.686,0.093,1.96,-0.413,0.11,-0.657,0.53,-1.003,0.222,1.21,2.099,0.527,0.128,0.204,0.796,0.507,-0.126,-0.66,-0.628,-0.453,0.953,-0.993,0.518,0.055,0.159,0.625,0.024,-0.048,-0.693,-0.492,-0.67,-0.233,-1.096,-0.728,0.842,1.914,1.49,-0.462,-0.767,-0.191,0.169,1.273,-0.16,0.393,0.231,-0.906,0.348,-1.05,-0.347,0.904,-1.324,-0.849,3.432,0.222,0.416,0.174,-1.517,-0.337,0.055,-0.464,0.014,-1.073,0.325,-0.523,-0.692,0.19,-0.883,-1.83,1.408,2.319,1.704,-0.723,1.014,0.064,0.096,-0.775,1.845,0.898,0.134,2.415,-0.996,-1.006,1.378,1.246,1.478,0.428,0.253


Test dataset:

In [7]:
test_df.head()

Unnamed: 0,id,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,...,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299
0,250,0.5,-1.033,-1.595,0.309,-0.714,0.502,0.535,-0.129,-0.687,1.291,0.507,-0.317,1.848,-0.232,-0.34,-0.051,0.804,0.764,1.86,0.262,1.112,-0.491,-1.039,-0.492,0.183,-0.671,-1.313,0.149,0.244,1.072,-1.003,0.832,-1.075,1.988,1.201,-2.065,-0.826,-0.016,0.49,0.191,0.732,1.235,-0.867,-0.616,0.34,0.788,-0.044,0.305,-0.819,-0.447,-1.625,-1.005,-0.653,-0.371,1.556,0.754,-0.688,0.061,0.644,0.645,-0.222,-2.174,-0.61,-1.092,0.917,-1.01,-1.021,-0.179,1.732,-0.366,-1.694,1.038,-0.721,0.112,-0.783,0.94,-1.803,1.295,-1.031,0.452,1.198,-0.206,0.051,-1.055,1.74,-0.91,-0.509,-0.987,-1.011,0.718,0.375,0.101,0.137,-1.585,0.532,-1.201,1.21,-0.374,0.3,...,1.578,-0.488,1.424,1.106,0.363,-2.007,-0.091,0.551,0.388,0.422,0.099,0.378,-1.333,-1.102,2.145,0.745,0.345,-0.904,0.425,-0.273,0.547,-0.184,0.458,0.182,0.592,0.966,0.54,-1.382,0.069,0.131,-0.068,-0.4,0.413,-0.03,0.89,1.0,-0.774,0.34,2.345,2.748,0.774,-0.355,0.574,0.027,1.437,-0.877,0.532,-0.348,0.926,1.308,-0.12,-1.46,0.755,0.426,1.667,-0.264,1.266,0.962,1.285,1.176,0.824,0.928,1.372,1.505,0.645,0.641,-1.132,1.009,0.998,0.21,-1.634,1.046,0.114,-0.806,0.301,0.145,-0.684,0.794,-0.29,-1.688,0.313,1.14,0.447,-0.616,1.294,0.785,0.453,1.55,-0.866,1.007,-0.088,-2.628,-0.845,2.078,-0.277,2.132,0.609,-0.104,0.312,0.979
1,251,0.776,0.914,-0.494,1.347,-0.867,0.48,0.578,-0.313,0.203,1.356,-1.086,0.322,0.876,-0.563,-1.394,0.385,1.891,-2.107,-0.636,-0.055,-0.843,0.041,0.253,0.557,0.475,-0.839,-1.146,1.21,1.427,0.347,1.077,-0.194,0.323,0.543,0.894,1.19,0.342,-0.858,0.756,1.35,-0.414,0.748,2.014,0.858,0.025,1.343,0.784,-0.418,-0.515,0.694,-1.097,0.559,-0.799,-0.936,1.483,1.67,1.403,0.457,-1.564,0.049,0.55,-0.085,-0.561,-0.529,-1.563,-0.781,-0.532,0.375,-0.727,-0.053,-0.383,-0.123,1.573,-0.898,-0.07,0.811,-0.036,0.72,1.691,-0.673,-0.421,-1.665,0.099,0.089,2.032,-1.132,-1.827,-0.017,-1.748,-0.717,2.004,1.216,1.547,1.322,0.481,1.819,-0.809,0.617,-0.763,...,-1.27,-0.426,-1.236,-0.036,0.187,0.86,-1.363,-0.279,-0.556,-2.017,-0.651,-1.192,-0.339,0.363,0.416,-0.039,2.421,0.953,1.059,0.512,-0.616,-0.172,1.502,-1.078,-1.196,0.042,0.476,-0.271,0.869,-1.596,1.4,0.148,0.577,1.222,2.069,-0.82,0.443,0.025,0.089,-0.939,-0.643,-0.376,0.297,0.352,0.748,1.493,-2.634,0.368,-0.177,-0.143,0.835,-1.824,-1.452,-0.408,-0.417,0.563,-0.161,-0.494,0.17,-0.257,-1.791,0.122,-0.669,-1.558,-0.244,2.583,-0.829,0.133,-2.746,0.341,-1.145,0.492,0.437,-0.628,0.271,2.639,0.481,-0.687,1.017,1.648,-1.272,-0.797,-0.87,-1.582,-1.987,-0.052,-0.194,0.539,-1.788,-0.433,-0.683,-0.066,0.025,0.606,-0.353,-1.133,-3.138,0.281,-0.625,-0.761
2,252,1.75,0.509,-0.057,0.835,-0.476,1.428,-0.701,-2.009,-1.378,0.167,-0.132,0.459,-0.341,0.014,0.184,-0.46,-0.991,-1.039,0.992,1.036,1.552,-0.83,1.374,-0.914,0.427,0.027,0.327,1.117,0.871,-2.556,-0.036,-0.081,0.744,-1.191,-1.784,0.239,0.5,0.437,0.746,0.999,0.489,0.467,-1.063,-1.333,1.062,0.482,0.984,-0.542,1.295,-1.191,0.755,1.206,-0.558,-1.403,-0.852,0.025,0.835,0.716,0.64,-1.007,0.268,-1.148,1.019,0.905,1.142,-0.529,0.738,-1.881,-0.857,-1.171,1.057,-2.476,2.686,-2.471,-0.153,0.19,1.063,0.117,-1.038,-0.134,-1.03,-0.054,-0.608,-0.333,0.184,0.633,0.024,-0.056,2.202,0.434,0.065,-1.104,-0.455,0.29,0.906,-1.441,0.557,0.243,0.706,...,-1.297,-0.847,-0.511,-0.181,-1.06,-0.205,-1.746,-0.371,0.878,-0.885,-1.128,-0.691,1.2,0.065,1.707,-0.846,1.248,-1.201,-0.48,-0.953,1.403,-0.228,-1.545,-0.085,0.554,-0.626,-0.751,-0.696,0.248,0.059,1.059,1.457,-0.452,-1.058,-0.393,-1.529,1.167,-1.07,-2.563,0.427,0.369,0.011,1.589,0.844,-0.425,-0.572,0.558,-0.49,-0.424,-1.651,0.46,-0.581,0.259,0.982,0.123,-0.723,0.034,1.661,-1.134,-0.643,-1.167,1.009,-0.18,-0.683,-1.383,1.02,0.268,-1.558,0.62,-0.489,-2.09,-0.977,1.672,-0.655,-0.801,-1.846,0.761,-0.846,0.181,0.962,-0.611,1.45,0.021,0.32,-0.951,-2.662,0.761,-0.665,-0.619,-0.645,-0.094,0.351,-0.607,-0.737,-0.031,0.701,0.976,0.135,-1.327,2.463
3,253,-0.556,-1.855,-0.682,0.578,1.592,0.512,-1.419,0.722,0.511,0.567,0.356,-0.06,0.767,-0.196,0.359,0.08,-0.956,0.857,-0.655,-0.09,-0.008,-0.596,-0.413,-1.03,0.173,-0.969,0.998,0.079,0.79,-0.776,-0.374,-1.995,0.572,0.542,0.547,0.307,-0.074,1.703,-0.003,0.818,0.182,0.082,-0.374,-0.475,1.488,-0.556,1.975,0.812,-1.838,1.449,2.116,1.988,-1.516,0.264,-0.232,0.974,-2.0,0.072,-1.553,1.145,-1.038,-1.004,1.348,0.412,1.368,0.754,1.275,1.405,-0.024,0.636,-1.18,0.506,0.932,-0.246,1.051,-0.22,1.111,0.401,0.502,0.315,0.56,-0.569,-1.841,0.83,0.543,0.09,0.062,0.106,1.03,-1.244,-0.237,-1.649,0.405,-2.06,-0.87,1.206,1.49,-0.981,-0.828,...,0.349,-1.032,0.728,-0.691,0.936,1.075,0.602,-1.773,-0.55,1.279,-0.793,0.68,0.263,-0.394,0.121,-0.544,0.91,1.502,-0.817,0.453,-0.019,-1.556,-0.447,-0.076,-0.309,0.307,-1.386,0.637,-1.15,0.54,0.455,-0.948,-1.316,-0.274,-2.316,-0.652,-0.652,-0.611,1.744,0.26,0.051,-0.256,-0.296,-1.297,-1.636,0.023,-0.872,0.243,1.11,-0.104,-0.483,-0.189,-1.274,0.872,1.181,-0.627,0.827,-1.477,0.322,-0.62,-1.029,-0.34,0.052,2.122,-0.136,-1.799,1.45,1.866,-0.273,-0.237,-0.207,-0.196,-1.106,-1.56,-0.934,2.167,0.323,0.583,1.48,-0.685,-0.473,-1.066,-0.271,0.506,-0.753,1.048,-0.45,-0.3,-1.221,0.235,-0.336,-0.787,0.255,-0.031,-0.836,0.916,2.411,1.053,-1.601,-1.529
4,254,0.754,-0.245,1.173,-1.623,0.009,0.37,0.781,-1.763,-1.432,-0.93,-0.098,0.896,0.293,-0.259,0.03,-0.661,0.921,0.006,-0.631,1.284,-1.167,-0.744,-2.184,2.146,1.13,0.017,1.421,-0.59,1.938,-0.194,0.794,0.579,0.521,0.635,-0.023,-0.892,-0.363,-0.36,0.405,0.222,0.346,1.175,-0.252,0.767,0.654,0.339,0.481,0.751,0.611,-0.052,0.389,-0.426,1.95,1.168,-1.277,-0.154,-1.829,1.521,2.195,0.012,1.258,-1.36,0.77,-0.916,-0.198,-1.21,1.643,0.068,0.048,-0.781,0.356,0.335,0.211,-1.321,1.749,0.563,0.02,-0.433,-0.742,1.269,-3.389,-0.291,-1.216,-0.968,1.388,0.934,0.022,1.398,-0.571,-0.056,-0.033,-0.294,1.03,-0.972,-0.655,0.304,-0.028,1.155,1.376,...,1.847,1.718,0.562,-0.162,-0.521,-0.425,-1.888,-0.333,0.21,-0.11,0.827,0.102,-0.832,-0.724,0.624,-0.496,-0.196,0.46,0.214,1.192,-0.09,-0.089,0.811,1.154,1.663,-1.142,-1.592,-0.082,0.14,1.414,0.047,0.343,0.062,0.999,-0.27,0.234,-0.047,-1.567,-0.153,-0.273,-1.316,1.161,-1.568,-2.089,0.892,1.123,-0.862,-0.993,1.095,0.266,-0.455,1.304,0.548,-0.654,0.276,-0.073,-0.86,-0.585,2.169,0.141,-0.486,-0.068,-0.534,-1.322,0.5,0.263,-0.745,0.578,-0.064,0.738,-0.28,0.745,-0.588,-0.429,-0.588,0.154,-1.187,1.681,-0.832,-0.437,-0.038,-1.096,-0.156,3.565,-0.428,-0.384,1.243,-0.966,1.525,0.458,2.184,-1.09,0.216,1.186,-0.143,0.322,-0.068,-0.156,-1.153,0.825


Distribution of target variable

In [8]:
target = 'target'
predictors = train_df.columns.values.tolist()[2:]

In [9]:
train_df.target.value_counts()

1.0    160
0.0     90
Name: target, dtype: int64

The problem is unbalanced! 

In this kernel I hold out **one stratified fold of a 10-fold split** (about 10% of the rows) as the validation set for the parameter search. Later I will use stratified k-fold cross-validation (`nfold = 20`) for the final model fit.

In [56]:
bayesian_tr_index, bayesian_val_index  = list(StratifiedKFold(n_splits=10, shuffle=True, random_state=1).split(train_df, train_df.target.values))[0]

The index arrays `bayesian_tr_index` and `bayesian_val_index` will be used as the training and validation rows of the training dataset during Bayesian optimization.
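To see how many rows the optimizer actually validates on, here is a minimal pure-Python sketch of taking the first of `n_splits` stratified folds. It is not scikit-learn's implementation: the helper name `first_stratified_fold` and the synthetic label list are assumptions made for illustration, and only the class counts are taken from the `value_counts()` output above.

```python
import random

def first_stratified_fold(labels, n_splits, seed=1):
    """Return (train_idx, val_idx); val_idx holds about 1/n_splits of each class.

    Illustrative stand-in for fold 0 of a stratified k-fold split.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    val_idx = []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = len(indices) // n_splits  # same share of every class
        val_idx.extend(indices[:n_val])

    val_set = set(val_idx)
    train_idx = [i for i in range(len(labels)) if i not in val_set]
    return train_idx, sorted(val_idx)

# Synthetic labels with the class counts shown above: 160 ones, 90 zeros.
labels = [1.0] * 160 + [0.0] * 90
train_idx, val_idx = first_stratified_fold(labels, n_splits=10)
n_pos = sum(1 for i in val_idx if labels[i] == 1.0)
print(len(train_idx), len(val_idx), n_pos)  # 225 25 16
```

With `n_splits=10` the validation fold holds only 25 rows, so the validation AUC seen by the optimizer is a noisy estimate.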

<a id="2"></a> <br>
## 2. Black box function to be optimized (LightGBM)

Now that the data is loaded, let's create the black box function that the optimizer will use to search for LightGBM parameters.

In [57]:
def LGB_bayesian(
    num_leaves,  # int
    min_data_in_leaf,  # int
    learning_rate,
    min_sum_hessian_in_leaf,    # float (not cast to int below)
    feature_fraction,
    lambda_l1,
    lambda_l2,
    min_gain_to_split,
    max_depth):
    
    # LightGBM expects the next three parameters to be integers, so cast them
    num_leaves = int(num_leaves)
    min_data_in_leaf = int(min_data_in_leaf)
    max_depth = int(max_depth)

    assert type(num_leaves) == int
    assert type(min_data_in_leaf) == int
    assert type(max_depth) == int

    param = {
        'num_leaves': num_leaves,
        'max_bin': 63,
        'min_data_in_leaf': min_data_in_leaf,
        'learning_rate': learning_rate,
        'min_sum_hessian_in_leaf': min_sum_hessian_in_leaf,
        'bagging_fraction': 1.0,
        'bagging_freq': 5,
        'feature_fraction': feature_fraction,
        'lambda_l1': lambda_l1,
        'lambda_l2': lambda_l2,
        'min_gain_to_split': min_gain_to_split,
        'max_depth': max_depth,
        'save_binary': True, 
        'seed': 1337,
        'feature_fraction_seed': 1337,
        'bagging_seed': 1337,
        'drop_seed': 1337,
        'data_random_seed': 1337,
        'objective': 'binary',
        'boosting_type': 'gbdt',
        'verbose': 1,
        'metric': 'auc',
        'is_unbalance': True,
        'boost_from_average': False,   

    }    
    
    
    xg_train = lgb.Dataset(train_df.iloc[bayesian_tr_index][predictors].values,
                           label=train_df.iloc[bayesian_tr_index][target].values,
                           feature_name=predictors,
                           free_raw_data = False
                           )
    xg_valid = lgb.Dataset(train_df.iloc[bayesian_val_index][predictors].values,
                           label=train_df.iloc[bayesian_val_index][target].values,
                           feature_name=predictors,
                           free_raw_data = False
                           )   

    num_round = 5000
    clf = lgb.train(param, xg_train, num_round, valid_sets = [xg_valid], verbose_eval=250, early_stopping_rounds = 50)
    
    predictions = clf.predict(train_df.iloc[bayesian_val_index][predictors].values, num_iteration=clf.best_iteration)   
    
    score = metrics.roc_auc_score(train_df.iloc[bayesian_val_index][target].values, predictions)
    
    return score

The `LGB_bayesian` function above acts as the black box function for Bayesian optimization. The training and validation datasets for LightGBM are already defined inside `LGB_bayesian`.
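The value returned by the black box function is the validation AUC. As a rough illustration of what that number means, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive row gets the higher score, counting ties as half. The function name `pairwise_auc` is an assumption for this example; the kernel itself uses `metrics.roc_auc_score`.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(y_true, y_score))  # 0.75

# AUC depends only on the ordering of the scores, not on their scale.
print(pairwise_auc(y_true, [10 * s for s in y_score]))  # 0.75
```

Because only the ordering matters, averaging ranks instead of raw probabilities at the end of this kernel does not hurt the metric.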

The `LGB_bayesian` function receives values for `num_leaves`, `min_data_in_leaf`, `learning_rate`, `min_sum_hessian_in_leaf`, `feature_fraction`, `lambda_l1`, `lambda_l2`, `min_gain_to_split`, and `max_depth` from the Bayesian optimization framework. Keep in mind that `num_leaves`, `min_data_in_leaf`, and `max_depth` must be integers for LightGBM, but Bayesian optimization sends continuous values to the function, so I cast them to integers. I am only going to search for optimal values of these nine parameters. The reader may increase or decrease the number of parameters to optimize.
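The integer cast can also be kept in one small helper so the same rule is applied wherever the suggested parameters are used. This is only a sketch: `INT_PARAMS` and `cast_int_params` are hypothetical names, and the input values are illustrative floats of the kind the optimizer proposes.

```python
# Parameters that LightGBM requires to be integers (names taken from LGB_bayesian).
INT_PARAMS = ('num_leaves', 'min_data_in_leaf', 'max_depth')

def cast_int_params(params, int_keys=INT_PARAMS):
    """Return a copy of params with the integer-valued entries truncated by int()."""
    return {k: int(v) if k in int_keys else v for k, v in params.items()}

suggested = {
    'num_leaves': 11.475737771544752,
    'min_data_in_leaf': 20.0,
    'max_depth': 5.0,
    'learning_rate': 0.01,
}
print(cast_int_params(suggested))
# {'num_leaves': 11, 'min_data_in_leaf': 20, 'max_depth': 5, 'learning_rate': 0.01}
```

Note that `int()` truncates rather than rounds, so every suggestion between 11.0 and 11.99 maps to `num_leaves = 11`, and the upper bound of a range is reached only when the optimizer proposes it exactly.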

Now I need to give bounds for these parameters, so that Bayesian optimization only searches inside the bounds.

In [73]:
# Bounded region of parameter space
bounds_LGB = {
    'num_leaves': (5, 20), 
    'min_data_in_leaf': (5, 20),  
    'learning_rate': (0.01, 0.3),
    'min_sum_hessian_in_leaf': (0.00001, 0.01),    
    'feature_fraction': (0.05, 0.5),
    'lambda_l1': (1, 5.0), 
    'lambda_l2': (1, 5.0), 
    'min_gain_to_split': (0, 1.0),
    'max_depth':(2,5),
}
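To make the role of the bounds concrete, the sketch below draws a few random points uniformly inside the same bounded region, which is roughly what a random exploration step does. The helper `sample_point` is a hypothetical name, and uniform sampling is an assumption here; the library's internal sampler may differ in detail.

```python
import random

# Same bounded region as bounds_LGB above.
bounds_LGB = {
    'num_leaves': (5, 20),
    'min_data_in_leaf': (5, 20),
    'learning_rate': (0.01, 0.3),
    'min_sum_hessian_in_leaf': (0.00001, 0.01),
    'feature_fraction': (0.05, 0.5),
    'lambda_l1': (1, 5.0),
    'lambda_l2': (1, 5.0),
    'min_gain_to_split': (0, 1.0),
    'max_depth': (2, 5),
}

def sample_point(bounds, rng):
    """Draw one candidate with every parameter uniform in [low, high]."""
    return {name: rng.uniform(low, high) for name, (low, high) in bounds.items()}

rng = random.Random(13)
points = [sample_point(bounds_LGB, rng) for _ in range(3)]
for point in points:
    print({name: round(value, 4) for name, value in point.items()})
```

Every sampled value is a float, including `num_leaves` and `max_depth`, which is why the black box function has to cast them.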

Let's put all of them in a `BayesianOptimization` object:

In [74]:
from bayes_opt import BayesianOptimization

In [75]:
LGB_BO = BayesianOptimization(LGB_bayesian, bounds_LGB, random_state=13)

Now, let's see the key space (parameters) we are going to optimize:

In [76]:
print(LGB_BO.space.keys)

['feature_fraction', 'lambda_l1', 'lambda_l2', 'learning_rate', 'max_depth', 'min_data_in_leaf', 'min_gain_to_split', 'min_sum_hessian_in_leaf', 'num_leaves']


I have created the `BayesianOptimization` object (`LGB_BO`), but it does nothing until I call `maximize`. Before calling it, I want to explain two parameters that we can pass to `maximize`:
- `init_points`: How many initial steps of **random** exploration we want to perform. In our case `LGB_bayesian` will be called `init_points` times during this phase.
- `n_iter`: How many steps of Bayesian optimization we want to perform after the `init_points` random runs.

Now it's time to call `maximize` from the Bayesian optimization framework. I allow the `LGB_BO` object to run for 10 `init_points` (exploration) and 20 `n_iter` (exploitation).
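The `maximize` call in this kernel passes `acq='ucb'`, the upper confidence bound acquisition. As a minimal sketch of the idea, each candidate is scored as predicted mean plus `kappa` times predicted standard deviation, and the highest score is evaluated next. The candidate names, means, and standard deviations below are made up for illustration, and the `kappa` values are assumptions; in a real run the mean and standard deviation come from the Gaussian process fitted to the points tried so far.

```python
def ucb(mean, std, kappa):
    """Upper confidence bound: predicted value plus an uncertainty bonus."""
    return mean + kappa * std

# Hypothetical posterior summaries for three candidate parameter sets.
candidates = [
    {'name': 'A', 'mean': 0.80, 'std': 0.01},  # looks good, well explored
    {'name': 'B', 'mean': 0.75, 'std': 0.05},  # slightly worse, very uncertain
    {'name': 'C', 'mean': 0.70, 'std': 0.02},
]

def pick_next(candidates, kappa):
    """Return the candidate with the highest UCB score."""
    return max(candidates, key=lambda c: ucb(c['mean'], c['std'], kappa))

print(pick_next(candidates, kappa=0.0)['name'])  # A: pure exploitation
print(pick_next(candidates, kappa=2.0)['name'])  # B: uncertainty is rewarded
```

A larger `kappa` pushes the search toward unexplored regions, while a smaller one keeps refining around the best point found so far.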

In [77]:
init_points = 10
n_iter = 20

In [78]:
print('-' * 130)

with warnings.catch_warnings():
    warnings.filterwarnings('ignore')
    LGB_BO.maximize(init_points=init_points, n_iter=n_iter, acq='ucb', xi=0.0, alpha=1e-6)

----------------------------------------------------------------------------------------------------------------------------------
|   iter    |  target   | featur... | lambda_l1 | lambda_l2 | learni... | max_depth | min_da... | min_ga... | min_su... | num_le... |
-------------------------------------------------------------------------------------------------------------------------------------
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[3]	valid_0's auc: 0.631944
|  1        |  0.6319   |  0.4      |  1.95     |  4.297    |  0.2901   |  4.918    |  11.8     |  0.609    |  0.007758 |  14.62    |
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[16]	valid_0's auc: 0.6875
|  2        |  0.6875   |  0.3749   |  1.14     |  2.194    |  0.02697  |

Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[4]	valid_0's auc: 0.711806
|  24       |  0.7118   |  0.5      |  5.0      |  5.0      |  0.3      |  2.0      |  5.0      |  1.0      |  0.01     |  20.0     |
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[20]	valid_0's auc: 0.701389
|  25       |  0.7014   |  0.5      |  1.0      |  1.0      |  0.01     |  2.0      |  20.0     |  1.0      |  0.01     |  13.65    |
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[18]	valid_0's auc: 0.840278
|  26       |  0.8403   |  0.5      |  1.0      |  5.0      |  0.01     |  5.0      |  20.0     |  1.0

Now that the optimization is done, let's see the maximum value we got.

In [80]:
LGB_BO.max['target']

0.8402777777777778

The best validation AUC is 0.84! Let's see which parameters are responsible for this score :)

In [81]:
LGB_BO.max['params']

{'feature_fraction': 0.5,
 'lambda_l1': 1.0,
 'lambda_l2': 5.0,
 'learning_rate': 0.01,
 'max_depth': 5.0,
 'min_data_in_leaf': 20.0,
 'min_gain_to_split': 1.0,
 'min_sum_hessian_in_leaf': 0.01,
 'num_leaves': 11.475737771544752}

Now we can use these parameters in our final model!

Wait, I want to show one more cool option from the BayesianOptimization library. You can probe the `LGB_bayesian` function directly if you have an idea of the optimal parameters, or if you get **parameters from another kernel** like [mine](https://www.kaggle.com/fayzur/customer-transaction-prediction). I will copy and paste parameters from my other kernel here. You can probe as follows:

In [18]:
# parameters from version 2 of
#https://www.kaggle.com/fayzur/customer-transaction-prediction?scriptVersionId=10522231

LGB_BO.probe(
    params={'feature_fraction': 0.1403, 
            'lambda_l1': 4.218, 
            'lambda_l2': 1.734, 
            'learning_rate': 0.07, 
            'max_depth': 14, 
            'min_data_in_leaf': 17, 
            'min_gain_to_split': 0.1501, 
            'min_sum_hessian_in_leaf': 0.000446, 
            'num_leaves': 6},
    lazy=True,  # evaluate on the next call to maximize
)

OK, by default these points are explored lazily (`lazy=True`), meaning they are evaluated only the next time you call `maximize`. Let's call `maximize` on the `LGB_BO` object.

In [19]:
LGB_BO.maximize(init_points=0, n_iter=0) # remember no init_points or n_iter

|   iter    |  target   | featur... | lambda_l1 | lambda_l2 | learni... | max_depth | min_da... | min_ga... | min_su... | num_le... |
-------------------------------------------------------------------------------------------------------------------------------------
Training until validation scores don't improve for 50 rounds.
[250]	valid_0's auc: 0.86243
[500]	valid_0's auc: 0.880738
[750]	valid_0's auc: 0.887841
[1000]	valid_0's auc: 0.890905
[1250]	valid_0's auc: 0.892214
[1500]	valid_0's auc: 0.892655
Early stopping, best iteration is:
[1466]	valid_0's auc: 0.892672
|  11       |  0.8927   |  0.1403   |  4.218    |  1.734    |  0.07     |  14.0     |  17.0     |  0.1501   |  0.000446 |  6.0      |


Finally, the list of all parameters probed and their corresponding target values is available via the property `LGB_BO.res`.

In [20]:
for i, res in enumerate(LGB_BO.res):
    print("Iteration {}: \n\t{}".format(i, res))

Iteration 0: 
	{'target': 0.8785456504868869, 'params': {'feature_fraction': 0.3999660847582191, 'lambda_l1': 1.1877061001745615, 'lambda_l2': 4.1213926633068425, 'learning_rate': 0.2900672674324699, 'max_depth': 14.67121336685872, 'min_data_in_leaf': 11.801738711259683, 'min_gain_to_split': 0.6090424627612779, 'min_sum_hessian_in_leaf': 0.007757509880902418, 'num_leaves': 14.624200171386038}}
Iteration 1: 
	{'target': 0.8915776403640969, 'params': {'feature_fraction': 0.3749082032826262, 'lambda_l1': 0.17518262050718658, 'lambda_l2': 1.492247354445897, 'learning_rate': 0.026968622645801674, 'max_depth': 13.284731311046386, 'min_data_in_leaf': 10.592810418122113, 'min_gain_to_split': 0.679847951578097, 'min_sum_hessian_in_leaf': 0.002570236693773035, 'num_leaves': 10.21371822728738}}
Iteration 2: 
	{'target': 0.8924701144136037, 'params': {'feature_fraction': 0.05423574653643624, 'lambda_l1': 1.7916689135248487, 'lambda_l2': 4.745470908391052, 'learning_rate': 0.07319071264818977, 'max

We got a better validation score with the probe! Previously I ran `LGB_BO` for only a small number of runs. In practice I increase the number of runs to around 100.
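Since each entry of `LGB_BO.res` is a dict with a `target` and a `params` key, the best run or a top-k shortlist can be pulled out with plain Python. The sketch below works on a hand-written list in the same shape; the numbers are illustrative placeholders, not results from this kernel.

```python
# Hypothetical results in the same shape as LGB_BO.res.
results = [
    {'target': 0.63, 'params': {'num_leaves': 14.6, 'max_depth': 4.9}},
    {'target': 0.69, 'params': {'num_leaves': 10.2, 'max_depth': 3.3}},
    {'target': 0.84, 'params': {'num_leaves': 11.5, 'max_depth': 5.0}},
    {'target': 0.71, 'params': {'num_leaves': 20.0, 'max_depth': 2.0}},
]

# Best single run, which is what the .max property reports.
best = max(results, key=lambda r: r['target'])
print(best['target'], best['params'])

# Top-2 shortlist, useful for checking whether the best runs agree with each other.
top2 = sorted(results, key=lambda r: r['target'], reverse=True)[:2]
print([r['target'] for r in top2])  # [0.84, 0.71]
```

If the top runs sit at very different parameter values, the validation score is probably too noisy to trust a single winner.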

In [82]:
LGB_BO.max['target']

0.8402777777777778

In [83]:
LGB_BO.max['params']

{'feature_fraction': 0.5,
 'lambda_l1': 1.0,
 'lambda_l2': 5.0,
 'learning_rate': 0.01,
 'max_depth': 5.0,
 'min_data_in_leaf': 20.0,
 'min_gain_to_split': 1.0,
 'min_sum_hessian_in_leaf': 0.01,
 'num_leaves': 11.475737771544752}

Let's build a model together using these parameters ;)

<a id="3"></a> <br>
## 3. Training LightGBM model

In [90]:
param_lgb = {
        'num_leaves': int(LGB_BO.max['params']['num_leaves']), # remember to int here
        #'max_bin': 63,
        'min_data_in_leaf': int(LGB_BO.max['params']['min_data_in_leaf']), # remember to int here
        'learning_rate': LGB_BO.max['params']['learning_rate'],
        'min_sum_hessian_in_leaf': LGB_BO.max['params']['min_sum_hessian_in_leaf'],
        #'bagging_fraction': 1.0, 
        #'bagging_freq': 5, 
        'feature_fraction': LGB_BO.max['params']['feature_fraction'],
        'lambda_l1': LGB_BO.max['params']['lambda_l1'],
        'lambda_l2': LGB_BO.max['params']['lambda_l2'],
        'min_gain_to_split': LGB_BO.max['params']['min_gain_to_split'],
        'max_depth': int(LGB_BO.max['params']['max_depth']), # remember to int here
        'save_binary': True,
     #   'seed': 1337,
        #'feature_fraction_seed': 1337,
        #'bagging_seed': 1337,
        #'drop_seed': 1337,
        #'data_random_seed': 1337,
        'objective': 'binary',
        'boosting_type': 'gbdt',
        'verbose': 1,
        'metric': 'auc',
        #'is_unbalance': True,
        'boost_from_average': False,
    }

param_lgb

{'num_leaves': 11,
 'min_data_in_leaf': 20,
 'learning_rate': 0.01,
 'min_sum_hessian_in_leaf': 0.01,
 'feature_fraction': 0.5,
 'lambda_l1': 1.0,
 'lambda_l2': 5.0,
 'min_gain_to_split': 1.0,
 'max_depth': 5,
 'save_binary': True,
 'objective': 'binary',
 'boosting_type': 'gbdt',
 'verbose': 1,
 'metric': 'auc',
 'boost_from_average': False}

As you can see, I assigned `LGB_BO`'s optimal parameters to the `param_lgb` dictionary, and they will be used to train a model with `nfold`-fold cross-validation.
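Before running the cross-validation loop, it may help to see why the out-of-fold array ends up with exactly one prediction per training row. The sketch below uses a simplified, non-stratified split (the hypothetical helper `kfold_indices`) with the same row and fold counts as this kernel; the real loop uses `StratifiedKFold`, which additionally balances the classes in each fold.

```python
import random

def kfold_indices(n_rows, nfold, seed=2019):
    """Yield (train, valid) index lists; each row is in exactly one valid list."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::nfold] for k in range(nfold)]
    for k in range(nfold):
        valid = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, valid

n_rows, nfold = 250, 20
times_validated = [0] * n_rows
valid_sizes = []
for train_index, valid_index in kfold_indices(n_rows, nfold):
    valid_sizes.append(len(valid_index))
    for i in valid_index:
        times_validated[i] += 1

print(min(valid_sizes), max(valid_sizes))  # 12 13
print(set(times_validated))                # {1}
```

With 250 rows and 20 folds each validation fold has only 12 or 13 rows, so large swings in the per-fold AUC are to be expected.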

Number of Kfolds:

In [91]:
nfold = 20

In [92]:
gc.collect()

253

In [93]:
skf = StratifiedKFold(n_splits=nfold, shuffle=True, random_state=2019)

In [94]:
oof = np.zeros(len(train_df))
predictions = np.zeros((len(test_df),nfold))

i = 1
for train_index, valid_index in skf.split(train_df, train_df.target.values):
    print("\nfold {}".format(i))
    xg_train = lgb.Dataset(train_df.iloc[train_index][predictors].values,
                           label=train_df.iloc[train_index][target].values,
                           feature_name=predictors,
                           free_raw_data = False
                           )
    xg_valid = lgb.Dataset(train_df.iloc[valid_index][predictors].values,
                           label=train_df.iloc[valid_index][target].values,
                           feature_name=predictors,
                           free_raw_data = False
                           )   

    
    clf = lgb.train(param_lgb, xg_train, 5000, valid_sets = [xg_valid], verbose_eval=250, early_stopping_rounds = 50)
    oof[valid_index] = clf.predict(train_df.iloc[valid_index][predictors].values, num_iteration=clf.best_iteration) 
    
    predictions[:,i-1] += clf.predict(test_df[predictors], num_iteration=clf.best_iteration)
    i = i + 1

print("\n\nCV AUC: {:<0.2f}".format(metrics.roc_auc_score(train_df.target.values, oof)))


fold 1
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[13]	valid_0's auc: 0.925

fold 2
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[29]	valid_0's auc: 0.7

fold 3
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[40]	valid_0's auc: 0.6

fold 4
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[6]	valid_0's auc: 0.7

fold 5
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[6]	valid_0's auc: 0.7625

fold 6
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[6]	valid_0's auc: 1

fold 7
Training until validation scores don't improve for 50 rounds.
Early stopping, best iteration is:
[6]	valid_0's auc: 0.8

fold 8
Training until validation scores don't improve for 50 rounds.
Early stopping, best iterati

So we got 0.90 AUC in 20-fold cross-validation. The per-fold test predictions look like:

In [95]:
predictions

array([[0.52191212, 0.52273183, 0.53406855, ..., 0.52909054, 0.5099173 ,
        0.52058333],
       [0.5301896 , 0.55323109, 0.54883977, ..., 0.52899379, 0.50856721,
        0.57193042],
       [0.50454907, 0.50679613, 0.4855532 , ..., 0.5233784 , 0.50314271,
        0.51140575],
       ...,
       [0.50023519, 0.50301422, 0.48504433, ..., 0.5044876 , 0.50003892,
        0.49861962],
       [0.53167874, 0.55167153, 0.5934861 , ..., 0.56028022, 0.5153673 ,
        0.5798077 ],
       [0.51816932, 0.5118961 , 0.5223954 , ..., 0.52908129, 0.5028021 ,
        0.50452161]])

If you are still reading, bear with me. I will not take much of your time. :D We are almost done. Let's do a rank averaging of the 20 fold predictions.

<a id="4"></a> <br>
## 4. Rank averaging

In [42]:
print("Rank averaging on", nfold, "fold predictions")
rank_predictions = np.zeros((predictions.shape[0],1))
for i in range(nfold):
    rank_predictions[:, 0] = np.add(rank_predictions[:, 0], rankdata(predictions[:, i].reshape(-1,1))/rank_predictions.shape[0]) 

rank_predictions /= nfold

Rank averaging on 20 fold predictions
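To make the rank averaging step easier to follow, here is a small pure-Python sketch of the same computation on a toy example: each fold's predictions are converted to ranks, divided by the number of rows, and averaged across folds. The helper names `average_ranks` and `rank_average` and the toy predictions are assumptions for illustration; `average_ranks` is meant to mirror the default tie handling of `scipy.stats.rankdata` (ties share the mean rank).

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def rank_average(fold_predictions):
    """Average of rank / n_rows over folds; one list of predictions per fold."""
    n_rows = len(fold_predictions[0])
    totals = [0.0] * n_rows
    for preds in fold_predictions:
        for i, rank in enumerate(average_ranks(preds)):
            totals[i] += rank / n_rows
    return [total / len(fold_predictions) for total in totals]

# Two toy folds on very different scales.
folds = [
    [0.52, 0.53, 0.50, 0.51],
    [0.90, 0.20, 0.10, 0.95],
]
print(rank_average(folds))  # [0.75, 0.75, 0.25, 0.75]
```

The first fold's predictions span only 0.50 to 0.53, yet it carries the same weight as the second fold, which is the point of averaging ranks rather than raw probabilities.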


Let's submit the predictions to Kaggle.

<a id="5"></a> <br>
## 5. Submission

In [44]:
sub_df = pd.DataFrame({"id": test_df.id.values})
sub_df["target"] = rank_predictions
sub_df[:10]

Unnamed: 0,id,target
0,250,0.277034
1,251,0.358053
2,252,0.453792
3,253,0.73859
4,254,0.443934
5,255,0.281243
6,256,0.444399
7,257,0.212629
8,258,0.79233
9,259,0.195742


In [46]:
sub_df.to_csv("C:\\PythonScripts\\Kaggle\\dont_overfitt\\submission_baysian_strat.csv", index=False)

Do not forget to upvote :) Also fork and modify for your own use. ;)