[R-package] confusing error when using lgb.train() with an unknown metric #3481

jameslamb · 2020-10-25T05:03:55Z

How are you using LightGBM?

LightGBM component: R package

Environment info

Operating System: macOS 10.14

C++ compiler version: gcc 8.1.0

CMake version: 3.17.3

R version: 4.0.2

LightGBM version or commit hash: https://github.com/microsoft/LightGBM/tree/c07644d1d71540204a9b56f26667e8180bd009e2

Reproducible example(s)

Thanks to @Laurae2 for sharing this with me and creating the reproducible example below.

After installing with Rscript build_r.R, code that uses an unrecognized metric, like this:

library(lightgbm)
library(data.table)
set.seed(1)
labels <- sample(2, 100, replace = TRUE) - 1
data <- as.matrix(data.frame(A = runif(100, min = 0, max = 1), B = runif(100, min = 0, max = 1)))
data_train <- data[1:90, ]
data_valid <- data[91:100, ]
labels_train <- labels[1:90]
labels_valid <- labels[91:100]
dtrain_lgb <- lgb.Dataset(data_train, label = labels_train)
dvalid_lgb <- lgb.Dataset.create.valid(dtrain_lgb, data_valid, label = labels_valid)
valids_lgb <- list(valid = dvalid_lgb)

model <- lgb.train(
    obj = "binary",
    params = list(metric = "nonsense"),
    data = dtrain_lgb,
    valids = valids_lgb,
    nrounds = 1,
    verbose = 1,
    num_thread = 1
)

produces this error:

[LightGBM] [Info] Number of positive: 47, number of negative: 43
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000016 seconds.
You can set force_col_wise=true to remove the overhead.
[LightGBM] [Info] Total Bins 62
[LightGBM] [Info] Number of data points in the train set: 90, number of used features: 2
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.522222 -> initscore=0.088947
[LightGBM] [Info] Start training from score 0.088947
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
Error in env$eval_list[[1L]] : subscript out of bounds

How to close this issue

I think the ideal fix is to raise an error from the C++ side for unrecognized metrics, so that all wrappers benefit from the fix. That would mean changing line 64 of LightGBM/src/metric/metric.cpp (as of c07644d),

return nullptr;

to raise an error instead of returning a null pointer.

If this change isn't made on the C++ side, I would add a new .METRIC_ALIASES in https://github.com/microsoft/LightGBM/blob/c07644d1d71540204a9b56f26667e8180bd009e2/R-package/R/aliases.R listing all of the valid metrics from https://lightgbm.readthedocs.io/en/latest/Parameters.html#metric-parameters, and then raise an error in lgb.check.eval() whenever params contains a metric that is not in that list.

@Laurae2 @guolinke @StrikerRUS @btrotta what do you think?


guolinke · 2020-10-26T05:12:13Z

@jameslamb we could reserve a few keys, like 'na', 'nan', and 'empty', to mean "no metric", and return nullptr only in that case.
Any other unrecognized name should throw an error.
The same strategy could be applied to objective functions.
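
To make the strategy above concrete, here is a minimal, self-contained sketch. It is not LightGBM's actual code: the Metric classes, the CreateMetric signature, and the error message are hypothetical stand-ins chosen for illustration, and the real factory in metric.cpp takes a config object and knows many more metric names. The only point is the control flow: reserved keys return nullptr, known names return a metric, and everything else throws.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical stand-ins for illustration; these are NOT LightGBM's real classes.
struct Metric {
  virtual ~Metric() = default;
  virtual std::string Name() const = 0;
};

struct AUCMetric : Metric {
  std::string Name() const override { return "auc"; }
};

struct BinaryLoglossMetric : Metric {
  std::string Name() const override { return "binary_logloss"; }
};

// Assumed factory shape: returns nullptr only for the reserved "no metric"
// keys and throws for any name it does not recognize.
std::unique_ptr<Metric> CreateMetric(const std::string& type) {
  static const std::unordered_set<std::string> kNoMetricKeys = {"na", "nan", "empty"};
  if (kNoMetricKeys.count(type) > 0) {
    return nullptr;  // the caller explicitly asked for no metric
  }
  if (type == "auc") {
    return std::unique_ptr<Metric>(new AUCMetric());
  }
  if (type == "binary_logloss") {
    return std::unique_ptr<Metric>(new BinaryLoglossMetric());
  }
  // Unknown name: fail loudly here instead of handing back a null pointer.
  throw std::invalid_argument("Unknown metric type name: " + type);
}
```

With a factory like this, a call such as CreateMetric("nonsense") fails immediately with a message naming the bad metric, so wrappers such as the R package would no longer reach the point where an empty evaluation list is indexed.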

jameslamb · 2020-10-26T18:22:58Z

I like that idea! I can open a pull request so we can see what it would look like.

I think it would help reduce confusion. Just a few minutes ago, another user ran into this same issue: they used an unsupported metric and got a seemingly unrelated error message (see #3028 (comment)).

jameslamb added the question, r-package, and good first issue labels on Oct 25, 2020
