
Error using loss and custom metric #9782

Closed
koch-tobias opened this issue Nov 12, 2023 · 4 comments


@koch-tobias

Problem
I have a binary classification task with a highly imbalanced dataset. Therefore, I want to use logloss in combination with a custom function (fbeta score) as the scoring metric. However, I get the error (Unknown metric function <function Classifier.xgb_custom_fbeta_score at 0x0000015F662DADD0>) with the following implementation.

If I use only the custom metric, it works, but then I cannot track the logloss during training and validation.

Implementation

model = XGBClassifier(booster=config["xgb_params_binary"]["boosting_type"], 
                    objective="binary:logistic",
                    disable_default_eval_metric=True,
                    eval_metric = None,
                    max_depth=hp["max_depth"],   
                    learning_rate=hp["lr"],
                    colsample_bytree=hp["colsample_bytree"],
                    scale_pos_weight=weight_factor,
                    gamma=hp["gamma"],
                    n_estimators= config["train_settings"]["n_estimators"],
                    verbosity=0,
                    random_state=42
                    )  
def xgb_custom_fbeta_score(preds, dtrain):
    beta = 2
    labels = dtrain.get_label()

    preds_binary = np.round(preds)

    fbeta = fbeta_score(labels, preds_binary, beta=beta)

    return 'fbeta', fbeta
callbacks = [xgb.callback.EarlyStopping(config["train_settings"]["early_stopping"], metric_name='fbeta')]

metrics = ['logloss', xgb_custom_fbeta_score]
model.fit(X_train, y_train,
        eval_set=[(X_train, y_train), (X_val, y_val)],
        eval_metric=metrics, 
        callbacks=callbacks
       )      
@trivialfis
Member

Hi, due to limitations in the existing implementation, no additional metric names are allowed when a custom metric is used. However, feel free to remove the disable_default_eval_metric parameter, since the default metric is already logloss; then pass the custom metric as:

xgb.XGBClassifier(eval_metric=fbeta_score)

assuming the fbeta_score is similar to the scores in sklearn. Turn it into an error function if you need early stopping.

@pangjac

pangjac commented Nov 15, 2023

Hi @trivialfis ,

Could you help clarify the suggestion below?

"when a custom metric is used, no additional metric names are allowed."

If I want to track both logloss and one custom metric, fbeta, can I write them into a single function such as the one below, or should the custom metric contain only a single metric?

def xg_custom_composed_score(preds, dtrain):
    logloss = function_calculate_logloss(preds, dtrain)
    fbeta = fbeta_score(preds, dtrain, beta)
    return [("logloss", logloss), ("fbeta", fbeta)]

Thank you so much!

@trivialfis
Member

Unfortunately, XGBoost doesn't work with composite metrics. However, you can use the logloss inside XGBoost along with the fbeta score:

import numpy as np
import xgboost as xgb
from sklearn.datasets import load_digits
from sklearn.metrics import fbeta_score

X, y = load_digits(n_class=2, return_X_y=True)


def error(y_true, y_pred, *args, **kwargs) -> float:
    classes = np.repeat(0, y_pred.shape[0])
    classes[y_pred > 0.5] = 1
    y_pred = classes
    return -fbeta_score(y_true, y_pred, *args, **kwargs, beta=1.0)


clf = xgb.XGBClassifier(eval_metric=error)

clf.fit(X, y, eval_set=[(X, y)])
[0]     validation_0-logloss:0.44484    validation_0-error:0.99726
[1]     validation_0-logloss:0.30749    validation_0-error:0.99726
[2]     validation_0-logloss:0.21896    validation_0-error:0.99726

@pangjac

pangjac commented Nov 27, 2023

In case anyone is using the XGBoost native API rather than the sklearn API, here is an example:

from typing import Tuple

import numpy as np
import xgboost as xgb
from sklearn.datasets import load_digits
from sklearn.metrics import fbeta_score
from sklearn.model_selection import train_test_split

X, y = load_digits(n_class=2, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=131)

dtrain = xgb.DMatrix(data=X_train, label=y_train, missing=-999999999)
dvalid = xgb.DMatrix(data=X_test,  label=y_test,  missing=-999999999)

def error_func(predt: np.ndarray, dtrain:xgb.DMatrix) -> Tuple[str, float]:
    y_true = dtrain.get_label()
    classes = np.repeat(0, y_true.shape[0])
    classes[predt > 0.5] = 1
    y_pred = classes
    return "-fbeta", -fbeta_score(y_true, y_pred, beta=0.5)

params = {
    "base_score": 0.5,
    "gamma": 0,
    "eta": 0.1,
    "max_depth": 3,
    "reg_alpha": 0,
    "reg_lambda": 1,
    "objective": "binary:logistic",
}

output = xgb.train(params=params, 
                   dtrain=dtrain, 
                   num_boost_round=100,  # set a high number
                   evals=[(dtrain, "train"),(dvalid, "validation")], 
                   custom_metric=error_func,
                   verbose_eval=True
                   )
    
