When baseline score is close to 0, set percent improved to nan #1467

dsherry · 2020-11-24T20:53:05Z

Problem
In #1458, @rpeck found a dataset where the baseline model (which predicts the mean) scored exactly 0 on ExpVariance for 2 of the 3 CV folds but ~1e-16 on the third. As a result, the percent improvement score on ExpVariance was enormous for every pipeline, because that computation divides by the baseline score for the objective in question.

Solution
Define a threshold (1e-10 seemed good) below which the percent improvement computation treats the baseline objective score as essentially 0. In that case, the percent improvement for all other pipelines is set to nan.
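For illustration, here is a minimal sketch of the thresholding idea described above. It is not the code from this PR: the function name `percent_improvement`, its parameters, and the choice to compare the absolute value of the baseline against the threshold are all assumptions made for the example. Only the 1e-10 threshold and the nan result come from the description.

```python
import math

# Threshold value suggested in the PR description; the constant name is hypothetical.
BASELINE_ZERO_THRESHOLD = 1e-10


def percent_improvement(score, baseline_score, greater_is_better=True,
                        threshold=BASELINE_ZERO_THRESHOLD):
    """Return the percent improvement of `score` over `baseline_score`.

    Hypothetical helper, not evalml's API. Returns nan when either input is
    nan or when the baseline is so close to 0 that dividing by it would
    produce a meaningless, enormous percentage.
    """
    if math.isnan(score) or math.isnan(baseline_score):
        return math.nan

    # Assumption: use the absolute value so tiny negative baselines are
    # also treated as "essentially 0".
    if abs(baseline_score) <= threshold:
        return math.nan

    difference = score - baseline_score
    if not greater_is_better:
        # For objectives where lower is better, a decrease is an improvement.
        difference = -difference

    return 100 * difference / abs(baseline_score)
```

With a baseline of ~1e-16, as on the third CV fold in #1458, this sketch returns nan instead of a value on the order of 1e17 percent.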

freddyaboulton

Thanks @dsherry ! Looks great and I verified that percent_better is nan on ExpVar in the dataset mentioned in #1458 !

dsherry · 2020-11-24T21:01:40Z

Ah thanks for double-checking that @freddyaboulton ! Sweet.

codecov · 2020-11-24T21:03:39Z

Codecov Report

Merging #1467 (d33d57a) into main (e10cb02) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #1467     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         223      223             
  Lines       15013    15019      +6     
=========================================
+ Hits        15006    15012      +6     
  Misses          7        7

Impacted Files	Coverage Δ
evalml/objectives/objective_base.py	`100.0% <100.0%> (ø)`
...lml/tests/objective_tests/test_standard_metrics.py	`100.0% <100.0%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e10cb02...d33d57a.

dsherry added 2 commits November 24, 2020 15:47

Unit tests

0a0f5eb

Release notes

1196d9e

dsherry marked this pull request as ready for review November 24, 2020 20:54

dsherry requested review from angela97lin, freddyaboulton and rpeck November 24, 2020 20:54

Missed a case

d33d57a

freddyaboulton approved these changes Nov 24, 2020

dsherry merged commit a56ef73 into main Nov 24, 2020

dsherry deleted the ds_1458_expvar_baseline branch November 24, 2020 21:13

dsherry mentioned this pull request Nov 24, 2020

Release v0.16.0 #1468

Merged
