
Update tests to use new pipeline API instead of defining custom pipeline classes #3172

Merged: 32 commits from 2184_test_classes into main on Jan 10, 2022

Conversation

angela97lin (Contributor) commented Dec 23, 2021

Closes #2184. evalml/tests/conftest.py is a good place to look at the changes made, since most of the other changes simply follow from the fixture changes there.

Also does some general cleanup and removes fixtures from test parameter lists where they are not used. Happy to argue or discuss any place where we may favor the previous implementation!

In some cases, such as the time series pipeline classes and dask pipelines in evalml/tests/automl_tests/dask_test_utils.py, I kept the test classes because I thought it was beneficial: Time series pipeline classes require specific parameters for initialization that we may not always want as part of the fixture definition, and the dask classes override methods to test specific functionality.

There were a lot of classes in evalml/tests/model_understanding_tests/test_permutation_importance.py that I wanted to replace, but it was surprisingly difficult to get a clean implementation because one of the classes uses Target Encoder, and Target Encoder is only available with our non-core dependencies. In the current setup, the classes are not evaluated until they are instantiated, so we don't run into an issue. However, if we replace these pipeline classes with instances, we actually evaluate the pipeline initialization, and in the Target Encoder case with only core dependencies installed we get the error seen here: https://github.com/alteryx/evalml/runs/4710317986?check_suite_focus=true
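For illustration, a minimal sketch of the distinction (the two-component graph here is made up, not one of the actual test pipelines):

```python
from evalml.pipelines import BinaryClassificationPipeline

# Keeping only the graph definition (or a class-style definition) defers the
# component lookup until a test body actually builds the pipeline.
component_graph_with_target_encoder = [
    "Target Encoder",
    "Random Forest Classifier",
]

# Building the instance at module or parametrize scope resolves and constructs
# each component right away, so collecting the test module already fails when
# the optional Target Encoder dependency is not installed.
pipeline_with_target_encoder = BinaryClassificationPipeline(
    component_graph=component_graph_with_target_encoder
)
```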

I still wonder if there's a way to clean this up, but for now it's not urgent.

I have mild trepidation about using fixtures that return instances and then calling fit/transform in individual tests. We can always call .new() to create a fresh copy of the pipeline instance, but it turns out the tests don't step on each other's toes anyway. According to https://docs.pytest.org/en/6.2.x/fixture.html: "Fixtures are created when first requested by a test, and are destroyed based on their scope: function: the default scope, the fixture is destroyed at the end of the test."

That means each test function gets its own fixture instance, so we don't need to worry about the same instance being shared across multiple tests!
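A minimal sketch of the pattern (the fixture name and component graph are illustrative rather than the exact conftest.py definitions):

```python
import pytest

from evalml.pipelines import BinaryClassificationPipeline


@pytest.fixture  # default "function" scope: a fresh pipeline per test
def logistic_regression_binary_pipeline():
    return BinaryClassificationPipeline(
        component_graph=["Imputer", "Logistic Regression Classifier"],
        parameters={"Logistic Regression Classifier": {"n_jobs": 1}},
    )


def test_fit(logistic_regression_binary_pipeline, X_y_binary):
    X, y = X_y_binary
    # Each test gets its own fixture instance, so fitting here cannot leak
    # state into other tests; .new() can still be used when an unfitted copy
    # is needed within the same test.
    logistic_regression_binary_pipeline.fit(X, y)
    logistic_regression_binary_pipeline.predict(X)
```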

codecov bot commented Dec 23, 2021

Codecov Report

Merging #3172 (4cd54fe) into main (b469733) will increase coverage by 0.1%.
The diff coverage is 100.0%.


@@           Coverage Diff           @@
##            main   #3172     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        326     326             
  Lines      31390   31135    -255     
=======================================
- Hits       31286   31039    -247     
+ Misses       104      96      -8     
| Impacted Files | Coverage Δ |
| --- | --- |
| ...lml/tests/automl_tests/test_iterative_algorithm.py | 100.0% <ø> (ø) |
| evalml/tests/pipeline_tests/test_pipeline_utils.py | 99.7% <ø> (ø) |
| evalml/tests/automl_tests/test_automl.py | 99.6% <100.0%> (+0.1%) ⬆️ |
| evalml/tests/automl_tests/test_automl_algorithm.py | 97.3% <100.0%> (-<0.1%) ⬇️ |
| .../automl_tests/test_automl_search_classification.py | 100.0% <100.0%> (ø) |
| ...ests/automl_tests/test_automl_search_regression.py | 100.0% <100.0%> (ø) |
| evalml/tests/automl_tests/test_automl_utils.py | 100.0% <100.0%> (ø) |
| evalml/tests/automl_tests/test_engine_base.py | 100.0% <100.0%> (ø) |
| evalml/tests/component_tests/test_components.py | 99.3% <100.0%> (ø) |
| evalml/tests/conftest.py | 95.9% <100.0%> (-0.4%) ⬇️ |

... and 13 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b469733...4cd54fe.

@@ -1393,11 +1387,7 @@ def fit(self, X, y=None):


@pytest.mark.parametrize("component_class", all_components())
def test_component_equality_all_components(
component_class,
logistic_regression_binary_pipeline_class,
angela97lin (Contributor Author):

Removing because unused

angela97lin self-assigned this Jan 5, 2022
freddyaboulton (Contributor) left a comment

@angela97lin Thank you for this! I think it's cool that we're using the new API for tests and that we can reduce the number of lines in the test files as well.

Left some minor comments on ways to remove some of the remaining class-style API uses, as well as some typos in the component names in some parameter dicts.

}[problem_type_value]
baseline_pipeline_class = {
ProblemTypes.BINARY: "evalml.pipelines.BinaryClassificationPipeline",
ProblemTypes.MULTICLASS: "evalml.pipelines.MulticlassClassificationPipeline",
ProblemTypes.REGRESSION: "evalml.pipelines.RegressionPipeline",
ProblemTypes.TIME_SERIES_REGRESSION: "evalml.pipelines.TimeSeriesRegressionPipeline",
}[problem_type_value]
pipeline_class = _get_pipeline_base_class(problem_type_value)

class DummyPipeline(pipeline_class):
freddyaboulton (Contributor):

I think we can get rid of the old class-style API definition here?

angela97lin (Contributor Author):

@freddyaboulton Ah yeah I was trying to figure out a way to replace this but was having a hard time 😭

I think the reason was related to these lines:

    mock_score_1 = MagicMock(return_value={objective.name: pipeline_scores[0]})
    mock_score_2 = MagicMock(return_value={objective.name: pipeline_scores[1]})
    Pipeline1.score = mock_score_1
    Pipeline2.score = mock_score_2

If the pipeline classes are removed, it's hard to set each of the mock score methods for each pipeline, gah. Will take another stab at it and see if there's a way around this still.
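Roughly, the class-style definitions give each searched pipeline its own class object to patch, separate from the baseline. A simplified sketch with illustrative scores:

```python
from unittest.mock import MagicMock

from evalml.pipelines import BinaryClassificationPipeline


class Pipeline1(BinaryClassificationPipeline):
    """Distinct class object so its score can be patched independently."""


class Pipeline2(BinaryClassificationPipeline):
    """Distinct class object so its score can be patched independently."""


# Each searched pipeline class gets its own mocked score, while the baseline
# (a plain BinaryClassificationPipeline) stays unpatched. With shared
# instances of one class, a single class-level patch would hit everything.
Pipeline1.score = MagicMock(return_value={"Log Loss Binary": 0.3})
Pipeline2.score = MagicMock(return_value={"Log Loss Binary": 0.4})
```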

angela97lin (Contributor Author):

Hmm, yeah, I think it gets pretty tricky in some of these cases: we mock the searched pipelines to return a specific score and then compare them to the baseline scores. If we don't have a specific class to mock, the searched pipeline classes and the baseline class become the same and are hard to mock separately.

X_y_binary,
AutoMLTestEnv,
):
# Test that percent-better-than-baseline is correctly computed when scores differ across folds
X, y = X_y_binary

class DummyPipeline(dummy_binary_pipeline_class):
class DummyPipeline(BinaryClassificationPipeline):
freddyaboulton (Contributor):

Same here, I don't think we need DummyPipeline?

angela97lin (Contributor Author):

(Same as above with difficulty mocking pipeline vs baseline without a specific pipeline class)

@@ -25,7 +25,6 @@

class DoubleColumns(Transformer):
"""Custom transformer for testing permutation importance implementation.

freddyaboulton (Contributor):

I think it's ok if we keep the class-style API for test_fast_permutation_importance_matches_slow_output. Just curious if you've tried this (rough sketch below):

- Mark the test as needing non-core dependencies (pytest.mark.noncore_dependency)
- Refactor test_cases so that it's a tuple of (component_graph, parameters)
- Instantiate the pipeline class with the tuple in test_cases
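A rough sketch of what that could look like (the marker usage, fixture, and component graphs here are illustrative, not the actual test):

```python
import pytest

from evalml.pipelines import BinaryClassificationPipeline

# Parametrize on (component_graph, parameters) tuples and build the pipeline
# inside the test body, so only the Target Encoder case needs the marker.
test_cases = [
    (["Imputer", "Random Forest Classifier"], {}),
    pytest.param(
        ["Imputer", "Target Encoder", "Random Forest Classifier"],
        {},
        marks=pytest.mark.noncore_dependency,
    ),
]


@pytest.mark.parametrize("component_graph,parameters", test_cases)
def test_fast_permutation_importance_matches_slow_output(
    component_graph, parameters, X_y_binary
):
    X, y = X_y_binary
    pipeline = BinaryClassificationPipeline(component_graph, parameters=parameters)
    pipeline.fit(X, y)
    # ...compare fast and slow permutation importance implementations here...
```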

angela97lin (Contributor Author):

@freddyaboulton I tried to do something along the lines of f480a93

I guess where it failed was that I was initializing the pipelines in the parametrization. I then tried to mark just the case with Target Encoder as a noncore_dependency, rather than the whole test. I can go back and try to use a (component_graph, parameters) tuple instead--I chose to use pipeline classes because we also have a regression pipeline / graph that we test, which might be hard to capture using just the component graph.

angela97lin (Contributor Author):

Alas, I tried again, trying to keep the pipeline instances as parameters so that we don't have to determine which pipeline class to use to initialize the parameters and component graph. It doesn't work because Target Encoder is a non-core dependency--I'll just keep the class-style API :')

eccabay (Contributor) left a comment

Love it! 🚢


assert len(automl.rankings) == 2
assert len(automl.full_rankings) == 2
assert 0.1234 in automl.rankings["mean_cv_score"].values

with env.test_context(score_return_value={"Log Loss Binary": 0.5678}):
test_pipeline_2 = dummy_binary_pipeline_class(
test_pipeline_2 = dummy_binary_pipeline.new(
Contributor:

Nice use of new!
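For context, .new() builds a fresh, unfitted pipeline with the same component graph. A minimal sketch with an illustrative empty parameters dict:

```python
# Returns an unfitted copy of the pipeline with the same component graph,
# leaving the fixture-provided instance untouched.
test_pipeline_2 = dummy_binary_pipeline.new(parameters={})
```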

ParthivNaresh (Contributor) left a comment

Solid work!

chukarsten (Contributor) left a comment

LGTM!

chukarsten merged commit 02c2021 into main on Jan 10, 2022
angela97lin deleted the 2184_test_classes branch on January 10, 2022