
Top k multi error #2178

Merged
merged 12 commits into from May 26, 2019

Conversation

@btrotta (Collaborator) commented May 15, 2019

Implements the feature requested in issue #1139. Adds a new parameter, top_k_threshold. When the metric is set to multi_error, top_k_threshold can be used to obtain the top-k multi-error. By default this parameter is set to 1, which gives the usual multi-error.
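A minimal sketch of what this metric computes, as a hypothetical NumPy helper (not LightGBM's actual C++ implementation, and note the parameter is renamed during review below — check the current docs for the final name). This naive argsort version is also tie-sensitive:

```python
import numpy as np

def top_k_multi_error(scores, labels, k=1):
    # Fraction of samples whose true class is not among the k classes
    # with the highest predicted scores.  With k=1 this reduces to the
    # usual multi-class error (argmax != label).
    scores = np.asarray(scores, dtype=float)
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = [label in row for label, row in zip(labels, top_k)]
    return 1.0 - float(np.mean(hits))

scores = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.5, 0.4],
                   [0.2, 0.3, 0.5]])
labels = [0, 2, 2]
```

With k=1 the second sample is misclassified (argmax is class 1), so the error is 1/3; with k=2 every true class is within the top two scores and the error drops to 0.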

@msftclas commented May 15, 2019

CLA assistant check
All CLA requirements met.

@guolinke requested a review from StrikerRUS, May 15, 2019 09:18
@StrikerRUS (Collaborator)

@btrotta Thank you very much for your contribution!

It seems your new tests fail on Python 2.7. Can you please take a look?

Also, please move them to the existing test_engine.py file. Maybe in the future we should divide our tests in a more elegant way, but for now there is no need to create a new separate file for 2 small tests.

@StrikerRUS (Collaborator) commented May 15, 2019

We already have top_k/topk parameter for voting parallel: https://lightgbm.readthedocs.io/en/latest/Parameters.html#top_k.
I suppose we should choose a less confusing name for this parameter. What about multi_error_top_k_threshold or simply multi_error_top_k? cc @guolinke

@guolinke (Collaborator)

yes, I think it is better to change the parameter name.

@StrikerRUS (Collaborator) left a comment

It seems the new behavior differs from the previous one when the model predicts all classes to be equally likely.
Consider topk@1 and score = [0.33, 0.33, 0.33]
Old:
LossOnPoint => 1.0
New:
LossOnPoint => 0.0
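The discrepancy comes down to how tied scores are counted. A small NumPy illustration (not the actual C++ code) of the two tie-handling conventions on this example:

```python
import numpy as np

scores = np.array([0.33, 0.33, 0.33])  # model is maximally uncertain
true_class, k = 0, 1

# Lenient tie handling: only classes scoring STRICTLY higher than the
# true class push it out of the top k, so a three-way tie counts as a hit.
lenient_error = 0.0 if np.sum(scores > scores[true_class]) < k else 1.0

# Strict tie handling: tied classes also count against the true class,
# so a maximally uncertain prediction is counted as an error.
strict_error = 0.0 if np.sum(scores >= scores[true_class]) <= k else 1.0
```

Here the lenient convention reports 0.0 and the strict one reports 1.0, matching the new/old outputs quoted above.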

@btrotta (Collaborator, author) commented May 17, 2019

@StrikerRUS Good point! I've fixed this in the latest commit. The metric is now defined so that the top-k error on a sample is 0 if there are at least num_classes - k predictions strictly less than the prediction on the true class.
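A plain-Python sketch of the per-sample rule just described (illustrative only, not the actual C++ implementation):

```python
import numpy as np

def top_k_error_point(scores, true_class, k):
    # Error is 0 iff at least (num_classes - k) predictions are
    # strictly less than the prediction for the true class; ties with
    # the true class therefore count against it.
    scores = np.asarray(scores, dtype=float)
    num_strictly_less = int(np.sum(scores < scores[true_class]))
    return 0.0 if num_strictly_less >= len(scores) - k else 1.0
```

On the tied example [0.33, 0.33, 0.33] with k=1, no prediction is strictly below the true class's score, so the sample counts as an error, restoring the old behavior.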

@StrikerRUS (Collaborator) commented May 17, 2019

@btrotta Thank you for hotfixing that!
I think we can enhance the metric name to help users orient themselves in the logs.
Just like map or ndcg:

for (auto k : eval_at_) {
  name_.emplace_back(std::string("map@") + std::to_string(k));
}

The only thing I'm in doubt about is that we cannot change multi_error -> multi_error@1, due to backward compatibility with users' existing code which relies on the previous name. So, it should be a special case.
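The suggested scheme, sketched in Python for illustration (the real logic lives in the C++ metric class):

```python
def multi_error_name(k):
    # Follow the map@k / ndcg@k convention, but keep the plain
    # "multi_error" name for k == 1 so existing user code that parses
    # evaluation logs by metric name keeps working.
    return "multi_error" if k == 1 else "multi_error@{}".format(k)
```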

@btrotta (Collaborator, author) commented May 17, 2019

@StrikerRUS Done

@StrikerRUS (Collaborator) left a comment

Thanks! Now the general concept looks good to me. Just a few nitpicks below for consistency with the existing codebase and formatting style.

Review comments on include/LightGBM/config.h and src/metric/multiclass_metric.hpp (resolved).
@btrotta (Collaborator, author) commented May 18, 2019

Formatting issues fixed in the latest commit.

@StrikerRUS (Collaborator) left a comment

@btrotta Thank you very much for your contribution!
LGTM except two comments below!

Review comments on include/LightGBM/config.h (resolved).
@StrikerRUS requested a review from guolinke, May 18, 2019 10:36
@btrotta (Collaborator, author) commented May 18, 2019

@StrikerRUS Thanks so much for your careful review! Those 2 issues with docs are fixed now.

@StrikerRUS (Collaborator)

@guolinke Can you please give a second review to this PR?

@guolinke (Collaborator)

it looks good to me

@StrikerRUS (Collaborator)

@guolinke Should this PR be in 2.2.4 release?

@guolinke (Collaborator)

@StrikerRUS yeah, it can be.

@StrikerRUS (Collaborator)

@guolinke Then I'm merging 😃

@StrikerRUS merged commit b3db9e9 into microsoft:master, May 26, 2019
The lock bot locked the conversation as resolved and limited it to collaborators, Mar 11, 2020