Base score for custom multi-output objectives is hard-coded at 0.5 #9840

david-cortes opened this issue Dec 3, 2023 · 5 comments

david-cortes commented Dec 3, 2023

EDIT: sorry, the first version of this issue was wrong; I've now edited it.

When using a custom multi-output objective with multi-output trees, the intercept / base score / starting prediction seems to be 0.5 regardless of what the labels are.

In this example, I am passing a multi-output label in which the second column is the negative of the first one:

import numpy as np, xgboost as xgb
mtcars = np.array([[21,6,160,110,3.9,2.62,16.46,0,1,4,4],
[21,6,160,110,3.9,2.875,17.02,0,1,4,4],
[22.8,4,108,93,3.85,2.32,18.61,1,1,4,1],
[21.4,6,258,110,3.08,3.215,19.44,1,0,3,1],
[18.7,8,360,175,3.15,3.44,17.02,0,0,3,2],
[18.1,6,225,105,2.76,3.46,20.22,1,0,3,1],
[14.3,8,360,245,3.21,3.57,15.84,0,0,3,4],
[24.4,4,146.7,62,3.69,3.19,20,1,0,4,2],
[22.8,4,140.8,95,3.92,3.15,22.9,1,0,4,2],
[19.2,6,167.6,123,3.92,3.44,18.3,1,0,4,4],
[17.8,6,167.6,123,3.92,3.44,18.9,1,0,4,4],
[16.4,8,275.8,180,3.07,4.07,17.4,0,0,3,3],
[17.3,8,275.8,180,3.07,3.73,17.6,0,0,3,3],
[15.2,8,275.8,180,3.07,3.78,18,0,0,3,3],
[10.4,8,472,205,2.93,5.25,17.98,0,0,3,4],
[10.4,8,460,215,3,5.424,17.82,0,0,3,4],
[14.7,8,440,230,3.23,5.345,17.42,0,0,3,4],
[32.4,4,78.7,66,4.08,2.2,19.47,1,1,4,1],
[30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2],
[33.9,4,71.1,65,4.22,1.835,19.9,1,1,4,1],
[21.5,4,120.1,97,3.7,2.465,20.01,1,0,3,1],
[15.5,8,318,150,2.76,3.52,16.87,0,0,3,2],
[15.2,8,304,150,3.15,3.435,17.3,0,0,3,2],
[13.3,8,350,245,3.73,3.84,15.41,0,0,3,4],
[19.2,8,400,175,3.08,3.845,17.05,0,0,3,2],
[27.3,4,79,66,4.08,1.935,18.9,1,1,4,1],
[26,4,120.3,91,4.43,2.14,16.7,0,1,5,2],
[30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2],
[15.8,8,351,264,4.22,3.17,14.5,0,1,5,4],
[19.7,6,145,175,3.62,2.77,15.5,0,1,5,6],
[15,8,301,335,3.54,3.57,14.6,0,1,5,8],
[21.4,4,121,109,4.11,2.78,18.6,1,1,4,2]])
y = mtcars[:, 0]
X = mtcars[:, 1:]

def rmse_obj(predt, dtrain):
    # Print the initial predictions and abort: the point is only to inspect
    # the starting point (base score) of the boosting process.
    print(predt[:5])
    raise ValueError()
    # Squared-error objective (unreachable because of the raise above):
    y = dtrain.get_label().reshape(predt.shape)
    grad = (predt - y).reshape(-1)
    hess = np.ones_like(grad)
    return grad, hess

dm = xgb.DMatrix(data=X, label=np.c_[y, -y])
model = xgb.train(
    dtrain=dm,
    params={
        "tree_method": "hist",
        "multi_strategy": "multi_output_tree",
    },
    num_boost_round=3,
    obj=rmse_obj
)
Output:

[[0.5 0.5]
 [0.5 0.5]
 [0.5 0.5]
 [0.5 0.5]
 [0.5 0.5]]

Both outputs get the same starting score of 0.5, which doesn't look like it would be a better choice than, say, zero.

trivialfis commented:

Indeed, when a custom objective is used, the intercept cannot be fitted by XGBoost, since the intercept is fitted according to the objective. For instance, there is a closed-form solution for MAE (the median). Users need to set the base_score parameter manually when a custom objective is used.
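
For reference, a minimal sketch of what setting the intercept manually might look like with a custom multi-output objective; the data is synthetic, and whether base_margin accepts a 2D array matching the prediction shape here is an assumption rather than something confirmed in this thread:

import numpy as np, xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 2.0, 0.5, -1.0, 0.3])
Y = np.c_[y, -y]

def sq_err_obj(predt, dtrain):
    # Plain squared-error objective: grad = predt - y, hess = 1.
    yy = dtrain.get_label().reshape(predt.shape)
    grad = (predt - yy).reshape(-1)
    hess = np.ones_like(grad)
    return grad, hess

# Option 1: set a scalar base_score explicitly instead of relying on the 0.5 default.
dm = xgb.DMatrix(data=X, label=Y)
model = xgb.train(
    dtrain=dm,
    params={
        "tree_method": "hist",
        "multi_strategy": "multi_output_tree",
        "base_score": 0.0,
    },
    num_boost_round=3,
    obj=sq_err_obj,
)

# Option 2 (assumption: a 2D base_margin matching the prediction shape is accepted):
# supply a per-row, per-output starting margin, e.g. the column means of the labels.
base = np.tile(Y.mean(axis=0), (Y.shape[0], 1))
dm2 = xgb.DMatrix(data=X, label=Y, base_margin=base)
model2 = xgb.train(
    dtrain=dm2,
    params={"tree_method": "hist", "multi_strategy": "multi_output_tree"},
    num_boost_round=3,
    obj=sq_err_obj,
)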

david-cortes commented:

But in that case, why not set it to zero instead of 0.5? Or why not add a parameter (ideally turned on by default) to estimate it through Newton steps when base_score is not supplied by the user?
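
For illustration, a minimal sketch of the suggested one-step Newton estimate, using the squared-error objective as an example; newton_intercept is a hypothetical helper, not an existing XGBoost API:

import numpy as np

def newton_intercept(labels):
    # Hypothetical helper: one Newton step from a zero starting prediction,
    # intercept_j = -sum_i(grad_ij) / sum_i(hess_ij) for each output j.
    predt0 = np.zeros_like(labels)
    grad = predt0 - labels          # d/d(predt) of 0.5 * (predt - y)^2
    hess = np.ones_like(labels)
    return -grad.sum(axis=0) / hess.sum(axis=0)

# For squared error this recovers the per-column mean of the labels, e.g.
# newton_intercept(np.c_[y, -y]) is approximately [mean(y), -mean(y)].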

trivialfis commented Dec 5, 2023

I think the 0.5 choice was due to logistic regression with sigmoid. It's an old default that nobody touches. I added the Newton step in recent releases, but did not change the default value for cases where the Newton step is not applicable. In summary: "historical reasons".

Your suggestion is interesting; I can look into fitting the intercept for custom objectives with a one-step Newton update in the future.

david-cortes commented:

As a quick workaround, wouldn't it be better to at least leave it at zero?

In the case of logistic regression with sigmoid, a raw score of zero is what would output a probability of 0.5, whereas a raw score of 0.5 corresponds to a probability of about 0.62.
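
A quick numeric check of those two figures:

import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
print(sigmoid(0.0))  # 0.5
print(sigmoid(0.5))  # ~0.622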

trivialfis commented Dec 6, 2023

The inverse of the sigmoid is logit, which turns 0.5 into 0.0. That's my guess anyway.

> As a quick workaround, wouldn't it be better to at least leave it at zero?

I think the default value probably has very little meaning for regression outputs, where the mean can be anything. I can work on intercept fitting for custom objectives.
