Refactor and fix make_pipeline tests to assert correct order and address new Woodwork Unknown type #2572
Codecov Report

@@           Coverage Diff           @@
##            main   #2572   +/-   ##
=======================================
- Coverage    99.9%   99.9%   -0.0%
=======================================
  Files         295     295
  Lines       27093   26811    -282
=======================================
- Hits        27047   26765    -282
  Misses         46      46

Continue to review full report at Codecov.
    component.name for component in expected_components
]

@pytest.mark.parametrize("input_type", ["pd", "ww"])
@pytest.mark.parametrize("problem_type", ProblemTypes.all_problem_types)
def test_make_pipeline_no_datetimes(input_type, problem_type):
I didn't see value in this compared to test_make_pipeline_text_columns, so deleting :)
@@ -181,7 +180,7 @@ def make_pipeline(
     problem_type,
     parameters=None,
     sampler_name=None,
-    extra_components=[],
+    extra_components=None,
I think it'd be a good idea to change this default value to None, rather than an empty list. Right now, we don't modify this parameter but if we did, it could lead to unexpected behavior:
From https://docs.python.org/3/reference/compound_stmts.html#function-definitions:
> Default parameter values are evaluated from left to right when the function definition is executed. This means that the expression is evaluated once, when the function is defined, and that the same "pre-computed" value is used for each call. This is especially important to understand when a default parameter is a mutable object, such as a list or a dictionary: if the function modifies the object (e.g. by appending an item to a list), the default value is in effect modified. This is generally not what was intended. A way around this is to use None as the default, and explicitly test for it in the body of the function...
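A minimal sketch (mine, not from the PR) of the pitfall the docs describe, and the None idiom that avoids it:

```python
def append_bad(item, items=[]):
    # `items` defaults to ONE list created when the function is defined,
    # so every call that omits `items` shares (and mutates) that same list.
    items.append(item)
    return items

def append_good(item, items=None):
    # The None-default idiom: create a fresh list on each call.
    if items is None:
        items = []
    items.append(item)
    return items

print(append_bad(1))   # [1]
print(append_bad(2))   # [1, 2]  <- surprise: the "default" list was mutated
print(append_good(1))  # [1]
print(append_good(2))  # [2]
```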
Absolutely! Thankfully this didn't cause any bugs since it's not used
if is_time_series(problem_type):
    delayed_features = [DelayedFeatureTransformer]
if estimator_class.model_family == ModelFamily.LINEAR_MODEL:
    estimator_components = [StandardScaler, estimator_class]
elif estimator_class.model_family == ModelFamily.CATBOOST:
    estimator_components = [estimator_class]
else:
I found myself being confused by the different if/else statements (some of which are related, others which are not!) so I thought this would be a clearer breakdown but let me know if you have thoughts/opinions!
Hot take: We shouldn't support plain numpy arrays anymore.
if estimator_class.model_family == ModelFamily.ARIMA:
    expected_components = [Imputer] + estimator_components

@pytest.fixture
def get_test_data_from_configuration():
Fixture that contains the data for the make_pipeline test!
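For readers unfamiliar with the pattern, a hedged sketch of a fixture-as-factory (the fixture name comes from the PR; the body and column handling are illustrative, not evalml's actual implementation):

```python
import pandas as pd
import pytest

@pytest.fixture
def get_test_data_from_configuration():
    # The fixture returns a function, so each test can request data
    # built from its own configuration of column names.
    def _get_data(column_names):
        data = {}
        if "numerical" in column_names:
            data["numerical"] = [1, 2, 3]
        if "categorical" in column_names:
            data["categorical"] = pd.Series(["a", "b", "a"], dtype="category")
        if "all_null" in column_names:
            data["all_null"] = [None, None, None]
        return pd.DataFrame(data)

    return _get_data

def test_uses_configuration(get_test_data_from_configuration):
    X = get_test_data_from_configuration(["numerical", "all_null"])
    assert list(X.columns) == ["numerical", "all_null"]
```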
Love this, anything that consolidates fixtures is good.
This is awesome!
if is_time_series(problem_type):
    parameters = {
        "pipeline": {"date_index": "soem dates", "gap": 1, "max_delay": 1},
        "Time Series Baseline Estimator": {
Not using Time Series Baseline Estimator so this is unnecessary!
("email with other features", ["email", "numerical", "categorical"]), | ||
], | ||
) | ||
def test_make_pipeline_master( |
test_description isn't being used by the code, but since we no longer have the test titles to tell us what is going on, I thought this would be useful to quickly understand what each test case is for.
Can we use it in the assert error message?
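Something like this would do it (illustrative snippet; the variable names are hypothetical, not from the PR):

```python
def assert_components(test_description, observed_names, expected_names):
    # Surfacing test_description in the assert message makes a failing
    # parametrized case identifiable at a glance in the pytest output.
    assert observed_names == expected_names, (
        f"Wrong component order for case '{test_description}': "
        f"got {observed_names}, expected {expected_names}"
    )

assert_components("all types", ["Imputer"], ["Imputer"])  # passes
```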
Solid, solid PR. I love the mass deletion and consolidation into the fixture. Very clean, very nice.
[OneHotEncoder]
if estimator_class.model_family != ModelFamily.CATBOOST
and (
    "categorical" in column_names
    or (
        any(ltype in column_names for ltype in ["url", "email"])
        and input_type == "ww"
    )
this formatting gave me a stroke!
"this is another string", | ||
"this is just another string", | ||
"evalml should handle string input", | ||
"cats are gr8", |
you're a woman of both taste and distinction
@angela97lin Thank you for this!
("email with other features", ["email", "numerical", "categorical"]), | ||
], | ||
) | ||
def test_make_pipeline_master( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use it in the assert error message?
@pytest.mark.parametrize(
This is awesome! Non-blocking food-for-thought comment hehe but did you consider including the expected answer in the parametrization rather than computing the answer within the test?
Pro: It'd be much easier to add future test cases that are independent of the other ones. I remember having to "fight" with the answer-generation-logic when I was adding test cases for URL and Email featurizers.
Con: There are a lot of test cases so it's a lot of data to specify. There will also be some duplicate answers across these tests.
I think this is great as-is but just wanted to put that out there.
Gosh, I love this idea a lot. The reason I decided to go with logic, though, is because the estimator also affects what the pipeline looks like, and we currently iterate through each of the estimators in the test. There's room to argue whether or not it's really necessary to loop through all of the estimators as we do or if we should just write out test cases for the special estimators such as CatBoost which can handle categorical cols natively and thus don't require a OneHotEncoder... but maybe this will be worth another iteration of refactoring.
I do agree though, I had to fight and wrangle the logic quite a bit to get all of these tests to pass, and if we add more logic, I'm not sure how well everything will continue to hold up 😂
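For the record, a hedged sketch of the answer-in-parametrization alternative discussed above (the component names and the pipeline-building stand-in are hypothetical, not evalml's actual API):

```python
import pytest

def make_pipeline_names(column_names):
    # Stand-in for the real make_pipeline: returns component names only.
    names = ["Imputer"]
    if "categorical" in column_names:
        names.append("One Hot Encoder")
    names.append("Random Forest Classifier")
    return names

@pytest.mark.parametrize(
    "column_names, expected",
    [
        # Each case carries its own expected answer; adding a case never
        # requires touching shared answer-generation logic.
        (["numerical"], ["Imputer", "Random Forest Classifier"]),
        (["categorical"], ["Imputer", "One Hot Encoder", "Random Forest Classifier"]),
    ],
)
def test_component_order(column_names, expected):
    assert make_pipeline_names(column_names) == expected
```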
Great work, I love the parametrization of test cases to reduce code and improve readability!
"test_description, column_names", | ||
[ | ||
("all nan is not categorical", ["all_null", "numerical"]), | ||
("all types", ["all_null", "categorical", "dates", "numerical"]), |
Why doesn't all_types also include text, email, and url? They're all being tested anyways so definitely not a big deal, but should the test_description be changed?
Love the cleanup you did here! The new test is very thorough. Left a comment on some extra simplification, but also left a comment just for my understanding.
]

@pytest.mark.parametrize(
    "test_description, column_names",
Is test_description just there to allow people who look at the test to understand what each test param iteration is testing?
Yup! Now that we can't rely on the test names, I thought having test_description would make it easier to see at a quick glance what each test case is for 😁
Closes #2553.

- infer_feature_type will convert text columns to the Unknown type, which returns a different pipeline.
- LogTransformer component that could be added to pipeline.
- make_pipeline tests had similar logic, but different data. Having to update these tests for the updates in make_pipeline logic meant I had to update it in many places. It may be hard to tell from the diff, but I refactored all the tests into one parameterized test and made a fixture for the data to use. Although having one large parameterized method may seem scary, it's definitely easier to add new test cases via new columns in the df. I primarily kept all of the existing tests as-is with some touchups but open to changing / removing :)
- Time Series Baseline Estimator in parameters dict passed: thanks to @bchen1116's great work, a warning was raised indicating that the parameters for this component weren't used since it's not in the pipeline! 😁
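To illustrate the first point, a sketch assuming evalml's infer_feature_type import path and a newer Woodwork release (exact inference can vary by version):

```python
import pandas as pd
from evalml.utils import infer_feature_type

# Short free-text columns that older Woodwork versions inferred as
# NaturalLanguage may now come back as Unknown, which changes the
# components make_pipeline selects (e.g. no text featurization).
X = pd.DataFrame(
    {"text": ["cats are gr8", "this is another string", "evalml should handle string input"]}
)
X = infer_feature_type(X)
print(X.ww.logical_types)  # may show {'text': Unknown} on newer Woodwork
```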