Skip to content

Conversation

@freddyaboulton
Copy link
Contributor

Pull Request Description

Fixes #3334


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@freddyaboulton freddyaboulton force-pushed the 3334-add-test-size-class-imbalance branch from 3c7213c to 50e7bdc Compare February 17, 2022 22:55
@codecov
Copy link

codecov bot commented Feb 17, 2022

Codecov Report

Merging #3341 (370104d) into main (f4a671d) will decrease coverage by 0.7%.
The diff coverage is 100.0%.

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #3341     +/-   ##
=======================================
- Coverage   99.6%   99.0%   -0.6%     
=======================================
  Files        329     329             
  Lines      31977   32000     +23     
=======================================
- Hits       31847   31649    -198     
- Misses       130     351    +221     
Impacted Files Coverage Δ
evalml/data_checks/class_imbalance_data_check.py 100.0% <100.0%> (ø)
...ta_checks_tests/test_class_imbalance_data_check.py 100.0% <100.0%> (ø)
evalml/automl/pipeline_search_plots.py 17.9% <0.0%> (-82.1%) ⬇️
...l/tests/automl_tests/test_pipeline_search_plots.py 23.3% <0.0%> (-76.7%) ⬇️
...ests/automl_tests/test_automl_search_regression.py 74.3% <0.0%> (-20.4%) ⬇️
.../automl_tests/test_automl_search_classification.py 83.8% <0.0%> (-12.6%) ⬇️
evalml/tests/automl_tests/test_automl_utils.py 91.1% <0.0%> (-8.9%) ⬇️
...lml/tests/automl_tests/test_iterative_algorithm.py 92.3% <0.0%> (-7.7%) ⬇️
evalml/automl/utils.py 98.5% <0.0%> (-1.5%) ⬇️
evalml/automl/automl_search.py 99.6% <0.0%> (-0.1%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f4a671d...370104d. Read the comment docs.

@freddyaboulton freddyaboulton force-pushed the 3334-add-test-size-class-imbalance branch from 50e7bdc to fd66543 Compare February 17, 2022 23:48
Copy link
Contributor

@chukarsten chukarsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this increasing of the targets make me wonder if there's any consolidation that can be done in the evalml/tests/data_checks_tests/test_class_imbalance_data_check.py module to perhaps push some of those synthetic targets out of the individual tests and either to the module level or to a conftest.py further up. But other than that, looks gucci.

then we consider this severely imbalanced. Must be greater than 0. Defaults to 100.
num_cv_folds (int): The number of cross-validation folds. Must be positive. Choose 0 to ignore this warning. Defaults to 3.
test_size (None, float, int): Percentage of test set size. Used to calculate class imbalance prior to splitting the
data into training and validation/test sets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hyper nit - won't you need a Raises: block for the new ValueError? I guess not given the existing ValueError isn't complained about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to ValueError, lame that it's not caught by our linter because we move our docstring from init to our class docstring (IIRC for docs) so it doesn't automatically pick it up 😢

Copy link
Contributor Author

@freddyaboulton freddyaboulton Mar 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that darglint doesn't check for raises blocks in constructors. @angela97lin do you know?

Lol Angela and I commented at the same time. Adding the missing Raises block!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah @freddyaboulton, I don't think it does a good job for class docstrings in general (I think I recently came across a case where we added a new parameter but it didn't complain about the missing docstring).

If we had this docstring in the constructor (init) it might actually pick it up, but IIRC, then our docs won't look like this: https://evalml.alteryx.com/en/stable/autoapi/evalml/data_checks/class_imbalance_data_check/index.html#evalml.data_checks.class_imbalance_data_check.ClassImbalanceDataCheck

We'd have to add back init and click on the init method link specifically to see the parameters, rather than just at the class page.

Copy link
Contributor

@angela97lin angela97lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever use of numpy array multiplication to update the tests, LGTM! Thanks @freddyaboulton 😁

then we consider this severely imbalanced. Must be greater than 0. Defaults to 100.
num_cv_folds (int): The number of cross-validation folds. Must be positive. Choose 0 to ignore this warning. Defaults to 3.
test_size (None, float, int): Percentage of test set size. Used to calculate class imbalance prior to splitting the
data into training and validation/test sets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to ValueError, lame that it's not caught by our linter because we move our docstring from init to our class docstring (IIRC for docs) so it doesn't automatically pick it up 😢

@freddyaboulton freddyaboulton enabled auto-merge (squash) March 1, 2022 21:20
@freddyaboulton freddyaboulton merged commit f7853d5 into main Mar 1, 2022
@chukarsten chukarsten mentioned this pull request Mar 3, 2022
@freddyaboulton freddyaboulton deleted the 3334-add-test-size-class-imbalance branch May 13, 2022 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add optional test_size parameter to ClassImbalanceDataCheck

4 participants