When baseline score is close to 0, set percent improved to nan #1467

Merged (3 commits) on Nov 24, 2020

@dsherry dsherry commented Nov 24, 2020

Fixes #1458

Problem
In #1458 @rpeck found a dataset where the baseline model (which predicts the mean) scored exactly 0 on ExpVariance for 2 of the 3 CV folds, but ~1e-16 on the third. The percent improvement computation divides by the baseline score for the objective in question, so dividing by that near-zero value made the ExpVariance percent improvement enormous for every pipeline.

Solution
Define a threshold (1e-10 seemed good) below which the percent improvement computation treats the baseline objective score as essentially 0. In that case, the percent improvement for all other pipelines is set to nan.
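For illustration, here is a minimal sketch of the thresholding idea. It is not the code from this PR: the function name `percent_improvement`, its parameters, and the constant name `BASELINE_ZERO_THRESHOLD` are hypothetical, and the percent formula itself is an assumption. Only the 1e-10 threshold and the nan result come from the description above.

```python
import math

# Threshold value taken from the PR description; the constant name is hypothetical.
BASELINE_ZERO_THRESHOLD = 1e-10


def percent_improvement(score, baseline_score, greater_is_better=True,
                        threshold=BASELINE_ZERO_THRESHOLD):
    """Percent improvement of `score` over `baseline_score` (illustrative only).

    Returns nan when either input is nan, or when the baseline is so close
    to 0 that dividing by it would produce a meaningless, huge value.
    """
    if math.isnan(score) or math.isnan(baseline_score):
        return math.nan
    # Treat a baseline below the threshold as essentially 0.
    if abs(baseline_score) < threshold:
        return math.nan
    change = (score - baseline_score) / abs(baseline_score)
    # For lower-is-better objectives, a decrease counts as an improvement.
    if not greater_is_better:
        change = -change
    return 100.0 * change
```

With this sketch, a baseline of 1e-16 (the third CV fold in #1458) yields nan rather than a value on the order of 1e17 percent, while ordinary baselines are unaffected.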

@dsherry dsherry marked this pull request as ready for review November 24, 2020 20:54
@freddyaboulton freddyaboulton left a comment

Thanks @dsherry! Looks great, and I verified that percent_better is nan on ExpVar for the dataset mentioned in #1458!

dsherry commented Nov 24, 2020

Ah, thanks for double-checking that, @freddyaboulton! Sweet.

codecov bot commented Nov 24, 2020

Codecov Report

Merging #1467 (d33d57a) into main (e10cb02) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #1467     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         223      223             
  Lines       15013    15019      +6     
=========================================
+ Hits        15006    15012      +6     
  Misses          7        7             
Impacted Files Coverage Δ
evalml/objectives/objective_base.py 100.0% <100.0%> (ø)
...lml/tests/objective_tests/test_standard_metrics.py 100.0% <100.0%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@dsherry dsherry merged commit a56ef73 into main Nov 24, 2020
@dsherry dsherry deleted the ds_1458_expvar_baseline branch November 24, 2020 21:13
@dsherry dsherry mentioned this pull request Nov 24, 2020