Update AutoMLSearch._check_for_high_variance to not emit RuntimeWarning #2024

Merged
angela97lin merged 8 commits into main from 1964_hv on Mar 25, 2021

Conversation

angela97lin
Contributor

@angela97lin angela97lin commented Mar 23, 2021

Closes #1964

The original reproducer emitted a RuntimeWarning because we were dividing 0 by 0 (with NumPy values; plain Python floats raise ZeroDivisionError instead of warning):

Ex:

import numpy as np

a = np.array([0.0])
0.0 / a.mean()  # nan; RuntimeWarning (invalid value encountered)

We also run into a similar RuntimeWarning when dividing a nonzero value by 0:

a = np.array([0.0])
1 / a.mean()  # inf; RuntimeWarning (divide by zero encountered)
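
To make the two cases above easier to check locally, here is a minimal sketch that captures the warnings instead of printing them. The helper name `divide_and_capture` is hypothetical and not part of evalml, and the sketch assumes NumPy's default "warn" error state (it sets it explicitly to be safe).

```python
import warnings

import numpy as np


def divide_and_capture(numerator, scores):
    """Divide numerator by scores.mean(); return (result, warning categories)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Force NumPy's default "warn" behaviour inside this block.
        with np.errstate(divide="warn", invalid="warn"):
            result = numerator / scores.mean()
    return result, [w.category for w in caught]


zero_scores = np.array([0.0])
nan_result, nan_warnings = divide_and_capture(0.0, zero_scores)  # 0 / 0
inf_result, inf_warnings = divide_and_capture(1.0, zero_scores)  # nonzero / 0
print(nan_result, nan_warnings)
print(inf_result, inf_warnings)
```

Both divisions should report a RuntimeWarning; the first yields nan and the second yields inf.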

This PR only calculates high variance, abs(cv_scores.std() / cv_scores.mean()), if cv_scores.std() is not 0. If cv_scores.std() is 0, we don't care what the mean is.

To resolve the second case, where the mean is 0, this PR simply sets high variance to False. This is not ideal, but the reasoning is that if the mean is 0 and the std is not 0 (the std == 0 case is covered above), high variance would be triggered no matter how small the variance is, because we divide by zero. It seems better to avoid false positives and accept false negatives here. I've filed #2023 to track making our implementation more robust against this case.
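
The guard described above can be sketched roughly as follows. This is a standalone illustration, not the code in the diff: the function name `has_high_variance` and the `threshold=0.2` default are assumptions made for the example.

```python
import warnings

import numpy as np


def has_high_variance(cv_scores, threshold=0.2):
    """Hypothetical stand-in for the check; the threshold value is assumed."""
    cv_scores = np.asarray(cv_scores, dtype=float)
    std, mean = cv_scores.std(), cv_scores.mean()
    if std == 0:
        # No spread at all, so the mean is irrelevant.
        return False
    if mean == 0:
        # std / 0 would always exceed the threshold; accept a false negative.
        return False
    return bool(abs(std / mean) > threshold)


with warnings.catch_warnings():
    warnings.simplefilter("error")  # any RuntimeWarning would raise here
    all_zero = has_high_variance([0.0, 0.0, 0.0])    # std == 0
    zero_mean = has_high_variance([-1.0, 0.0, 1.0])  # mean == 0, std != 0
    stable = has_high_variance([0.80, 0.81, 0.79])
    noisy = has_high_variance([0.1, 0.9, 0.5])
```

Running the calls with warnings promoted to errors is one way to confirm that neither zero case reaches the division.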

For fun, on the original reproducer:
image

@angela97lin angela97lin self-assigned this Mar 23, 2021
@angela97lin angela97lin changed the title Update _check_for_high_variance to not emit RuntimeWarning if cv_scores.std() == 0 Update _check_for_high_variance to not emit RuntimeWarning Mar 23, 2021
@angela97lin angela97lin changed the title Update _check_for_high_variance to not emit RuntimeWarning Update AutoMLSearch._check_for_high_variance to not emit RuntimeWarning Mar 23, 2021
@codecov
codecov bot commented Mar 23, 2021

Codecov Report

Merging #2024 (37c95cb) into main (0e0ffe9) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #2024     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         278      278             
  Lines       22748    22761     +13     
=========================================
+ Hits        22739    22752     +13     
  Misses          9        9             
Impacted Files                              Coverage Δ
evalml/automl/automl_search.py              100.0% <100.0%> (ø)
evalml/tests/automl_tests/test_automl.py    100.0% <100.0%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@ParthivNaresh ParthivNaresh left a comment

Great description of the problem and an accompanying solution! Thanks for filing an issue to track cv_scores.mean() being 0.

Contributor

@bchen1116 bchen1116 left a comment

Nice! The writeup is 👌

Contributor

@freddyaboulton freddyaboulton left a comment

Looks good @angela97lin !

@angela97lin angela97lin merged commit cfcfe68 into main Mar 25, 2021
@angela97lin angela97lin deleted the 1964_hv branch March 25, 2021 20:29
@chukarsten chukarsten mentioned this pull request Apr 6, 2021
Successfully merging this pull request may close these issues.

F1 objective causes a Runtime warning in AutoMLSearch
4 participants