
[R-package] Fix best_iter and best_score #2159

Merged
merged 7 commits into master from fix-#2029 on May 27, 2019

Conversation

@Laurae2 (Contributor) commented May 8, 2019

This should fix #2158 and #2029.

"best" rule:

  • Take the first iteration where the best score is attained, even if the further iterations have identical scores (do not take the last)
  • When there is no scoring done, the best score becomes NA and the best iteration is -1 (currently)
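
For illustration only, here is a minimal R sketch of that rule; it is not code from this PR, and scores / higher_better are hypothetical names standing in for the recorded metric values and the metric's direction:

# Minimal sketch of the "first best iteration" rule (illustrative names only)
scores <- c(0.50, 0.20, 0.10, 0.10, 0.10)  # identical scores from iteration 3 onward
higher_better <- FALSE                      # l2 is minimized; auc would be maximized

# which.min()/which.max() return the FIRST index attaining the optimum
best_iter <- if (higher_better) which.max(scores) else which.min(scores)
best_score <- scores[best_iter]
best_iter   # [1] 3
best_score  # [1] 0.1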

Later, we should enforce that the metric used for early stopping is only the first one, or failing that, give the user the ability to choose the metric (the default being the first).

Note: @jameslamb, there are spaces introduced by RStudio left to fix.

Example (change to metric = "auc" and max_depth = 3 to test maximization):

library(lightgbm)
data(agaricus.train, package = "lightgbm")
train <- agaricus.train
dtrain <- lgb.Dataset(train$data, label = train$label)
data(agaricus.test, package = "lightgbm")
test <- agaricus.test
dtest <- lgb.Dataset.create.valid(dtrain, test$data, label = test$label)
params <- list(objective = "regression", metric = "l2")
valids <- list(test = dtest)
model <- lgb.train(params,
                   dtrain,
                   100,
                   min_data = 1,
                   learning_rate = 0.5)
model$best_score
model$best_iter
# > model$best_score
# [1] NA
# > model$best_iter
# [1] -1

model <- lgb.train(params,
                   dtrain,
                   100,
                   valids,
                   min_data = 1,
                   learning_rate = 0.5)
model$best_score
model$best_iter
# > model$best_score
# [1] 7.497024e-62
# > model$best_iter
# [1] 100

model <- lgb.train(params,
                   dtrain,
                   100,
                   valids,
                   min_data = 1,
                   learning_rate = 0.5,
                   early_stopping_rounds = 10)
model$best_score
model$best_iter
# > model$best_score
# [1] 7.497024e-62
# > model$best_iter
# [1] 100

model <- lgb.cv(params,
                dtrain,
                10,
                nfold = 5,
                min_data = 1,
                learning_rate = 1)
model$best_score
model$best_iter
# > model$best_score
# [1] 0.0003072197
# > model$best_iter
# [1] 1

model <- lgb.cv(params,
                dtrain,
                10,
                nfold = 5,
                min_data = 1,
                learning_rate = 1,
                early_stopping_rounds = 5)
model$best_score
model$best_iter
# > model$best_score
# [1] 5.369631e-18
# > model$best_iter
# [1] 3

Manual tests done:
* With early stopping + with validation set
* With early stopping + without validation set
* Without early stopping + with validation set
* Without early stopping + without validation set

And with multiple metrics / validation sets.
@Laurae2 Laurae2 requested review from jameslamb and guolinke and removed request for jameslamb May 8, 2019 10:33
@Laurae2 Laurae2 mentioned this pull request May 8, 2019
@StrikerRUS (Collaborator):

> Later, we should enforce that the metric used for early stopping is only the first one, or failing that, give the user the ability to choose the metric (the default being the first).

There are some problems with metrics' order in Python: #2127. I suppose the same is true for R as well.

@Laurae2 (Contributor, Author) commented May 8, 2019

@StrikerRUS In R lists the order is fixed: the first element always remains the first element. You may even have two elements with the exact same name without any conflict.

However, users may provide the same metric multiple times by mistake; in that case, we deduplicate them.

Example:

library(lightgbm)
data(agaricus.train, package = "lightgbm")
train <- agaricus.train
dtrain <- lgb.Dataset(train$data, label = train$label)
data(agaricus.test, package = "lightgbm")
test <- agaricus.test
dtest <- lgb.Dataset.create.valid(dtrain, test$data, label = test$label)
params <- list(objective = "regression", metric = c("l2", "l1", "l2"))
valids <- list(test = dtest)

model <- lgb.train(params,
                   dtrain,
                   100,
                   valids,
                   min_data = 1,
                   learning_rate = 0.5)
str(model$record_evals$test, max.level = 1)
# > str(model$record_evals$test, max.level = 1)
# List of 2
#  $ l2:List of 2
#  $ l1:List of 2

@guolinke (Collaborator) commented May 9, 2019

@Laurae2 would it be better to have a parameter named "first_metric_only" in R as well?

@guolinke (Collaborator) commented May 9, 2019

BTW, if both R and Python have a "first_metric_only" option, I think we should have the same option in the CLI version.

@Laurae2 (Contributor, Author) commented May 12, 2019

@guolinke Yes, I think we should have a first_metric_only option for the CLI, R, and Python.

As the handling would be different for each wrapper, R and Python would have their own implementations using callbacks.
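
As a rough illustration of what such a callback could check, here is a hedged R sketch; should_stop, eval_results, higher_better and stopping_rounds are hypothetical names, not the package's actual callback interface:

# Hedged sketch of a first_metric_only early-stopping decision.
# `eval_results` is assumed to be a named list with one numeric vector of
# per-iteration scores per metric, in the order the user supplied the metrics.
should_stop <- function(eval_results, higher_better, stopping_rounds) {
  scores <- eval_results[[1L]]  # only the first metric drives early stopping
  best_iter <- if (higher_better) which.max(scores) else which.min(scores)
  # stop when the first metric has not improved for `stopping_rounds` iterations
  (length(scores) - best_iter) >= stopping_rounds
}

# Usage: l2 (minimized) is the first metric, so l1 is ignored for stopping
should_stop(list(l2 = c(0.5, 0.2, 0.1, 0.1, 0.1, 0.1),
                 l1 = c(0.6, 0.3, 0.2, 0.1, 0.05, 0.01)),
            higher_better = FALSE, stopping_rounds = 3)
# [1] TRUE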

@Laurae2 (Contributor, Author) commented May 12, 2019

@StrikerRUS do you know why Travis MPI/Python jobs are failing?

@StrikerRUS (Collaborator):

@Laurae2 gcc 9 has been released recently. Hotfix is already in master: abbbbd7.

@guolinke (Collaborator):

> Later, we should enforce that the metric used for early stopping is only the first one, or failing that, give the user the ability to choose the metric (the default being the first).

@Laurae2 is this implemented in this PR?

@guolinke (Collaborator):

Also refer to this:
#2127 (comment)
We should ensure that the CLI, Python, and R packages have the same behavior with first_metric_only.

@Laurae2 (Contributor, Author) commented May 17, 2019

@guolinke This PR uses all metrics. We can do first_metric_only in another PR.

Note that the best score / iteration is taken from the first metric when they were not computed by the model.
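
For illustration (this assumes the record_evals layout shown earlier in the thread, where model$record_evals$test$l2$eval holds one score per iteration; it is not code from this PR), the first metric's best values could be recovered like this:

# Illustrative only: read back best iteration / score from the first metric
scores <- unlist(model$record_evals$test$l2$eval)
best_iter <- which.min(scores)   # use which.max() for metrics where higher is better
best_score <- scores[best_iter]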

@Laurae2 (Contributor, Author) commented May 17, 2019

Closing/reopening for CI

@Laurae2 Laurae2 closed this May 17, 2019
@Laurae2 Laurae2 reopened this May 17, 2019
@StrikerRUS (Collaborator):

@Laurae2

> Closing/reopening for CI

This doesn't work for the branch's CI jobs; it only works for the PR's CI jobs.

@Laurae2 (Contributor, Author) commented May 26, 2019

@jameslamb do we merge it as is for the moment?

@StrikerRUS (Collaborator) commented May 27, 2019

I've updated the branch. Now all checks should be OK.

@jameslamb (Collaborator) left a comment:

Looks good to me! Thank you @Laurae2, apologies for my delayed review.

@jameslamb jameslamb merged commit f70a053 into master May 27, 2019
@StrikerRUS StrikerRUS deleted the fix-#2029 branch May 27, 2019 19:52
@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020

Successfully merging this pull request may close these issues.

[R-package] lgb.cv returns negative best score with an absolute value metric