# Introduction

<div class="alert alert-block alert-warning">
<font color=black><br>

**What?** Custom loss function

**Reference:** https://coderzcolumn.com/tutorials/machine-learning/xgboost-an-in-depth-guide-python#6<br>

<br></font>
</div>

# Import modules

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
import xgboost as xgb
import sklearn
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Import dataset

<div class="alert alert-block alert-info">
<font color=black><br>

- Boston Housing Dataset: It's a regression problem dataset which has information about a various attribute of houses in Boston and their price in dollar. 
- This will be used for regression tasks.

<br></font>
</div>

In [6]:
boston = load_boston()

# Print just the lines from 5 to 29
for line in boston.DESCR.split("\n")[5:29]:
    print(line)

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
        - B        1000(Bk - 0.63)^2 where Bk is the

In [7]:
boston_df = pd.DataFrame(data=boston.data, columns = boston.feature_names)
# Add one column with the target
boston_df["Price"] = boston.target
boston_df.head()

Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,Price
0,0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.09,1.0,296.0,15.3,396.9,4.98,24.0
1,0.02731,0.0,7.07,0.0,0.469,6.421,78.9,4.9671,2.0,242.0,17.8,396.9,9.14,21.6
2,0.02729,0.0,7.07,0.0,0.469,7.185,61.1,4.9671,2.0,242.0,17.8,392.83,4.03,34.7
3,0.03237,0.0,2.18,0.0,0.458,6.998,45.8,6.0622,3.0,222.0,18.7,394.63,2.94,33.4
4,0.06905,0.0,2.18,0.0,0.458,7.147,54.2,6.0622,3.0,222.0,18.7,396.9,5.33,36.2


In [8]:
# Splitting the dataset
X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target, train_size=0.90, random_state=42)

In [9]:
print("Train/Test Sizes : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape, "\n")

Train/Test Sizes :  (455, 13) (51, 13) (455,) (51,) 



In [10]:
dmat_train = xgb.DMatrix(X_train, Y_train, feature_names=boston.feature_names)
dmat_test = xgb.DMatrix(X_test, Y_test, feature_names=boston.feature_names)

# Creating a custom-made loss function

<div class="alert alert-block alert-info">
<font color=black><br>

- Below we have created the mean squared error loss function.

<br></font>
</div>

In [13]:
def first_grad(predt, dtrain):
    '''Compute the first derivative for mean squared error.'''
    y = dtrain.get_label() if isinstance(dtrain, xgb.DMatrix) else dtrain
    return 2*(y-predt)

def second_grad(predt, dtrain):
    '''Compute the second derivative for mean squared error.'''
    y = dtrain.get_label() if isinstance(dtrain, xgb.DMatrix) else dtrain
    return [1] * len(predt)

def mean_sqaured_error(predt, dtrain):
    ''''Mean squared error function.'''
    predt[predt < -1] = -1 + 1e-6
    grad = first_grad(predt, dtrain)
    hess = second_grad(predt, dtrain)
    return grad, hess

In [None]:
def mean_absolute_error(preds, dmat):
    actuals = dmat.get_label()
    err = (actuals - preds).sum()
    return "MAE", err

In [14]:
xgb_regressor = xgb.XGBRegressor(max_depth=3, eta=1, objective=mean_sqaured_error)  ## Custom Evaluation Function

xgb_regressor.fit(X_train, Y_train,
                  eval_set=[(X_test, Y_test)], eval_metric="mae",
                  early_stopping_rounds=5,
                  verbose=10)

[0]	validation_0-mae:19.54971
Will train until validation_0-mae hasn't improved in 5 rounds.
[10]	validation_0-mae:15.58860
[20]	validation_0-mae:13.57056
[30]	validation_0-mae:11.92066
[40]	validation_0-mae:10.05207
[50]	validation_0-mae:9.34793
[60]	validation_0-mae:8.23015
[70]	validation_0-mae:7.45686
[80]	validation_0-mae:7.02828
[90]	validation_0-mae:6.30696
[99]	validation_0-mae:5.31595


XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
             colsample_bynode=1, colsample_bytree=1, eta=1, gamma=0, gpu_id=-1,
             importance_type='gain', interaction_constraints='',
             learning_rate=1, max_delta_step=0, max_depth=3, min_child_weight=1,
             missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=0,
             num_parallel_tree=1,
             objective=<function mean_sqaured_error at 0x1399645e0>,
             random_state=0, reg_alpha=0, reg_lambda=1, scale_pos_weight=1,
             subsample=1, tree_method='exact', validate_parameters=1,
             verbosity=None)

In [15]:
print("\nTest  R2 Score : %.2f"%xgb_regressor.score(X_test, Y_test))
print("Train R2 Score : %.2f"%xgb_regressor.score(X_train, Y_train))


Test  R2 Score : 0.27
Train R2 Score : 0.63


# Creating a custom-made evaluation function

<div class="alert alert-block alert-info">
<font color=black><br>

- The function should accept predictions and DMatrix instances as parameters and then calculate metrics based on predictions and actual target values. 
- We have created simple mean_absolute_error()

<br></font>
</div>

In [21]:
def mean_absolute_error(preds, dmat):
    actuals = dmat.get_label()
    err = (actuals - preds).sum()
    return "MAE__", err

In [22]:
booster = xgb.train({'max_depth': 3, 'eta': 1, 'objective': 'reg:squarederror'},
                    dmat_train,
                    evals=[(dmat_test, "test")],
                    feval=mean_absolute_error, ## Custom Evaluation Function
                    num_boost_round=10,
                    early_stopping_rounds=5)

[0]	test-rmse:3.59159	test-MAE__:26.53242
Multiple eval metrics have been passed: 'test-MAE__' will be used for early stopping.

Will train until test-MAE__ hasn't improved in 5 rounds.
[1]	test-rmse:3.26373	test-MAE__:15.24190
[2]	test-rmse:3.12218	test-MAE__:18.01450
[3]	test-rmse:2.94107	test-MAE__:4.76002
[4]	test-rmse:2.75222	test-MAE__:2.55075
[5]	test-rmse:2.78515	test-MAE__:-3.35190
[6]	test-rmse:2.64519	test-MAE__:-2.30084
[7]	test-rmse:2.64290	test-MAE__:-2.01779
[8]	test-rmse:2.58895	test-MAE__:-9.34707
[9]	test-rmse:2.61442	test-MAE__:-5.63952


In [23]:
print("\nTrain RMSE : ",booster.eval(dmat_train))
print("Test  RMSE : ",booster.eval(dmat_test))


Train RMSE :  [0]	eval-rmse:1.965108
Test  RMSE :  [0]	eval-rmse:2.614419


In [24]:
print("\nTest  R2 Score : %.2f"%r2_score(Y_test, booster.predict(dmat_test)))
print("Train R2 Score : %.2f"%r2_score(Y_train, booster.predict(dmat_train)))


Test  R2 Score : 0.89
Train R2 Score : 0.96


# Custom-made loss & metric functions + scikitlearn API

In [25]:
xgb_regressor = xgb.XGBRegressor(max_depth=3, eta=1, objective=mean_sqaured_error)

xgb_regressor.fit(X_train, Y_train,
                  eval_set=[(X_test, Y_test)], eval_metric=mean_absolute_error,
                  early_stopping_rounds=5,
                  verbose=5)


print("Test  R2 Score : %.2f"%xgb_regressor.score(X_test, Y_test))
print("Train R2 Score : %.2f"%xgb_regressor.score(X_train, Y_train))

[0]	validation_0-rmse:20.64506	validation_0-MAE__:-997.03516
Multiple eval metrics have been passed: 'validation_0-MAE__' will be used for early stopping.

Will train until validation_0-MAE__ hasn't improved in 5 rounds.
[5]	validation_0-rmse:18.71795	validation_0-MAE__:889.68512
Stopping. Best iteration:
[0]	validation_0-rmse:20.64506	validation_0-MAE__:-997.03516

Test  R2 Score : -5.83
Train R2 Score : -5.09


# Conclusions
<hr style="border:2px solid black"> </hr>

<div class="alert alert-danger">
<font color=black>

- XGBoost offers 2 APIs, it up to you to decide which one to use. 
- XGBoost allows you to easily use your own custom-made loss and metric functions.

</font>
</div>