Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix metric alias #2273

Merged
merged 10 commits into from Jul 25, 2019
Merged

fix metric alias #2273

merged 10 commits into from Jul 25, 2019

Conversation

guolinke
Copy link
Collaborator

refer to #2209 (comment)

@StrikerRUS
Copy link
Collaborator

@guolinke Is it possible to share std::string GetMetricType(const std::string& type) function for objective too? It seems to be that the difference is in l2/l2_root and multiclass/multiclassova only.

@StrikerRUS
Copy link
Collaborator

@guolinke I updated docs accordingly.

@guolinke
Copy link
Collaborator Author

Sorry, didn't get it. Could explain with more details?

@StrikerRUS
Copy link
Collaborator

StrikerRUS commented Jul 20, 2019

@guolinke I mean, we have practically the same code for objective function here
https://github.com/microsoft/LightGBM/blob/master/src/objective/objective_function.cpp as for metric function which you've updated in this PR. So I wonder can we use std::string GetMetricType(const std::string& type) for replacing aliases with main names for both metrics and objectives.

UPD: After that all || in if conditions for objective functions will be removed.

@StrikerRUS
Copy link
Collaborator

StrikerRUS commented Jul 20, 2019

@guolinke In other words, replace

*objective = value;

with

objective = GetMetricType(value);

This will allow to work only with objectives names in further code, without considering aliases.

@guolinke
Copy link
Collaborator Author

@StrikerRUS Thanks, good point! will update it accordingly.

@StrikerRUS
Copy link
Collaborator

@guolinke Don't you want to merge GetMetricType and ParseObjective functions into one function (e.g. ParseFunctionAliases)? And that function can take one additional enum argument function_type with two possible values metric and objective to help with l2/l2_root and multiclass/multiclassova cases, which are different in two current functions.

@guolinke
Copy link
Collaborator Author

@StrikerRUS I tried it, and think it is more easy-to-read to use two functions.
Actually, the duplicated aliases are no many, just two or three.
but introducing the function_type, with caring about the duplicate or not, will increase the complexity of this function.

@StrikerRUS
Copy link
Collaborator

@guolinke OK, got it!

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments, please check.

include/LightGBM/objective_function.h Outdated Show resolved Hide resolved
src/io/config.cpp Outdated Show resolved Hide resolved
src/io/config.cpp Show resolved Hide resolved
src/io/config.cpp Show resolved Hide resolved
src/io/config.cpp Show resolved Hide resolved
@StrikerRUS
Copy link
Collaborator

I updated docs according to new objective names.

@StrikerRUS
Copy link
Collaborator

I think that the following code can be simplified by removing checks for aliases, since we have already done it earlier:

LightGBM/src/io/config.cpp

Lines 198 to 234 in 7189743

bool CheckMultiClassObjective(const std::string& objective) {
return (objective == std::string("multiclass")
|| objective == std::string("multiclassova")
|| objective == std::string("softmax")
|| objective == std::string("multiclass_ova")
|| objective == std::string("ova")
|| objective == std::string("ovr"));
}
void Config::CheckParamConflict() {
// check if objective, metric, and num_class match
int num_class_check = num_class;
bool objective_custom = objective == std::string("none") || objective == std::string("null")
|| objective == std::string("custom") || objective == std::string("na");
bool objective_type_multiclass = CheckMultiClassObjective(objective) || (objective_custom && num_class_check > 1);
if (objective_type_multiclass) {
if (num_class_check <= 1) {
Log::Fatal("Number of classes should be specified and greater than 1 for multiclass training");
}
} else {
if (task == TaskType::kTrain && num_class_check != 1) {
Log::Fatal("Number of classes must be 1 for non-multiclass training");
}
}
for (std::string metric_type : metric) {
bool metric_custom_or_none = metric_type == std::string("none") || metric_type == std::string("null")
|| metric_type == std::string("custom") || metric_type == std::string("na");
bool metric_type_multiclass = (CheckMultiClassObjective(metric_type)
|| metric_type == std::string("multi_logloss")
|| metric_type == std::string("multi_error")
|| (metric_custom_or_none && num_class_check > 1));
if ((objective_type_multiclass && !metric_type_multiclass)
|| (!objective_type_multiclass && metric_type_multiclass)) {
Log::Fatal("Multiclass objective and metrics don't match");
}
}

@StrikerRUS
Copy link
Collaborator

I added new aliases for regression objectives in docs.

@StrikerRUS
Copy link
Collaborator

@guolinke Can we merge?

@guolinke
Copy link
Collaborator Author

Yes

@StrikerRUS StrikerRUS merged commit 5d3a3ea into master Jul 25, 2019
@StrikerRUS StrikerRUS deleted the metric-fix branch July 25, 2019 14:32
matsuken92 added a commit to matsuken92/LightGBM that referenced this pull request Jul 27, 2019
For microsoft#2273 fixed not only the order of metrics in cpp, removing metric check process at callback.py
StrikerRUS pushed a commit that referenced this pull request Sep 15, 2019
* Bug fix for first_metric_only if the first metric is train metric.

* Update bug fix for feval issue.

* Disable feval for first_metric_only.

* Additional test items.

* Fix wrong assertEqual settings & formating.

* Change dataset of test.

* Fix random seed for test.

* Modiry assumed test result due to different sklearn verion between CI and local.

* Remove f-string

* Applying variable  assumed test result for test.

* Fix flake8 error.

* Modifying  in accordance with review comments.

* Modifying for pylint.

* simplified tests

* Deleting error criteria `if eval_metric is None`.

* Delete test items of classification.

* Simplifying if condition.

* Applying first_metric_only for sklearn wrapper.

* Modifying test_sklearn for comforming to python 2.x

* Fix flake8 error.

* Additional fix for sklearn and add tests.

* Bug fix and add test cases.

* some refactor

* fixed lint

* fixed lint

* Fix duplicated metrics scores to pass the test.

* Fix the case first_metric_only not in params.

* Converting metrics aliases.

* Add comment.

* Modify comment for pylint.

* Modify comment for pydocstyle.

* Using split test set for two eval_set.

* added test case for metric aliases and length checks

* minor style fixes

* fixed rmse name and alias position

* Fix the case metric=[]

* Fix using env.model._train_data_name

* Fix wrong test condition.

* Move initial process to _init() func.

* Modify test setting for test_sklearn & training data matching on callback.py

* test_sklearn.py
-> A test case for training is wrong, so fixed.

* callback.py
-> A condition of if statement for detecting test dataset is wrong, so fixed.

* Support composite name metrics.

* Remove metric check process & reduce redundant test cases.

For #2273 fixed not only the order of metrics in cpp, removing metric check process at callback.py

* Revised according to the matters pointed out on a review.

* increased code readability

* Fix the issue of order of validation set.

* Changing to OrderdDict from default dict for score result.

* added missed check in cv function for first_metric_only and feval co-occurrence

* keep order only for metrics but not for datasets in best_score

* move OrderedDict initialization to init phase

* fixed minor printing issues

* move first metric detection to init phase and split can be performed without checks

* split only once during callback

* removed excess code

* fixed typo in variable name and squashed ifs

* use setdefault

* hotfix

* fixed failing test

* refined tests

* refined sklearn test

* Making "feval" effective on early stopping.

* allow feval and first_metric_only for cv

* removed unused code

* added tests for feval

* fixed printing

* add note about whitespaces in feval name

* Modifying final iteration process in case valid set is  training data.
@lock lock bot locked as resolved and limited conversation to collaborators Mar 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants