Move pipeline building into `IterativeAlgorithm` #2854

jeremyliweishih · 2021-09-28T20:43:44Z

Fixes #2656. This PR mainly moves the pipeline-building logic out of AutoMLSearch and into _create_pipelines() in IterativeAlgorithm. I will comment on the PR with my decisions, but most test changes stem from the API changes in IterativeAlgorithm or from how pipelines are passed into IterativeAlgorithm.

One of the requirements of #2656 is:

> move algorithm specific tests out of test_automl.py, test_automl_search_classification.py and test_automl_search_regression.py by mocking out next_batch and algorithm specific methods

Due to the length of this PR, I will make another issue for that specific requirement.

codecov · 2021-09-28T21:27:40Z

Codecov Report

Merging #2854 (28ade4e) into main (322dcc0) will decrease coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #2854     +/-   ##
=======================================
- Coverage   99.7%   99.7%   -0.0%     
=======================================
  Files        302     302             
  Lines      28256   28296     +40     
=======================================
+ Hits       28164   28200     +36     
- Misses        92      96      +4

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| ...valml/automl/automl_algorithm/default_algorithm.py | `100.0% <ø> (ø)` | |
| evalml/automl/automl_algorithm/automl_algorithm.py | `100.0% <100.0%> (ø)` | |
| ...lml/automl/automl_algorithm/iterative_algorithm.py | `100.0% <100.0%> (ø)` | |
| evalml/automl/automl_search.py | `99.9% <100.0%> (-<0.1%)` | ⬇️ |
| ...ts/automl_tests/parallel_tests/test_automl_dask.py | `100.0% <100.0%> (ø)` | |
| evalml/tests/automl_tests/test_automl.py | `99.5% <100.0%> (-<0.1%)` | ⬇️ |
| ...lml/tests/automl_tests/test_iterative_algorithm.py | `100.0% <100.0%> (ø)` | |
| evalml/tests/conftest.py | `98.3% <100.0%> (-0.3%)` | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
jeremyliweishih · 2021-10-01T15:34:18Z

evalml/automl/automl_algorithm/iterative_algorithm.py

@@ -129,6 +193,88 @@ def __init__(
                        " and Real!"
                    )

+    def _create_pipelines(self):


This logic is ripped out of AutoMLSearch, and the API changes in IterativeAlgorithm accommodate this. DefaultAlgorithm and IterativeAlgorithm now have more similar APIs.

angela97lin (Oct 6, 2021): DefaultAlgorithm doesn't have the same _create_pipelines API, right? Or do you just mean that, because we moved the logic around, the two now have more similar dependencies / parameter expectations?

jeremyliweishih (Oct 6, 2021): I meant the IterativeAlgorithm.__init__() parameters!
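For readers skimming the thread, here is a rough sketch of the shape of the change: the algorithm receives search parameters in its constructor and builds its own pipelines, instead of being handed a prebuilt list by the search object. `ToyPipeline` and `ToyIterativeAlgorithm` are hypothetical stand-ins and do not mirror evalml's real classes or signatures.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPipeline:
    """Hypothetical stand-in for a pipeline; it only carries a name."""
    name: str


@dataclass
class ToyIterativeAlgorithm:
    """Toy algorithm that owns pipeline construction (not evalml's class)."""
    problem_type: str
    allowed_model_families: list
    allowed_pipelines: list = field(default_factory=list)

    def __post_init__(self):
        # Before the move, the caller (the search object) would build the
        # pipelines and pass them in. Here the algorithm builds them itself.
        self._create_pipelines()

    def _create_pipelines(self):
        # Only build when the caller did not supply pipelines explicitly.
        if not self.allowed_pipelines:
            self.allowed_pipelines = [
                ToyPipeline(f"{family} {self.problem_type} pipeline")
                for family in self.allowed_model_families
            ]


algo = ToyIterativeAlgorithm("binary", ["random_forest", "linear_model"])
names = [p.name for p in algo.allowed_pipelines]
```

The point of the sketch is only the ownership: the caller passes parameters, and the pipeline list is an internal detail of the algorithm.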

jeremyliweishih · 2021-10-01T18:30:14Z

evalml/automl/automl_algorithm/iterative_algorithm.py

+            raise ValueError("No allowed pipelines to search")
+
+        if self.ensembling and len(self.allowed_pipelines) == 1:
+            self.logger.warning(


I also opted to move all the logging into IterativeAlgorithm. From my understanding of the logger, there will be no change in output.
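A minimal sketch of the kind of check shown in the hunk above, assuming a standard `logging.Logger`. The `ValueError` text comes from the diff; the warning text, the helper name `check_allowed_pipelines`, and returning an "effective ensembling" flag are assumptions made for illustration, not the PR's actual code.

```python
import logging


def check_allowed_pipelines(allowed_pipelines, ensembling, logger):
    """Validate the pipeline list and report whether ensembling can run.

    Hypothetical helper: the real logic lives inside IterativeAlgorithm.
    """
    if not allowed_pipelines:
        raise ValueError("No allowed pipelines to search")

    if ensembling and len(allowed_pipelines) == 1:
        # Assumed wording; the actual warning message is not shown in the diff.
        logger.warning(
            "Ensembling was requested, but only one pipeline is allowed, "
            "so ensembling will be skipped."
        )
        return False
    return ensembling


logger = logging.getLogger("toy_automl")
effective_ensembling = check_allowed_pipelines(["only_pipeline"], True, logger)
```

Because the logger is passed in, moving the check from one class to another should not change what gets emitted, as long as the same logger instance is used.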

jeremyliweishih · 2021-10-01T18:30:37Z

evalml/automl/automl_algorithm/iterative_algorithm.py

@@ -279,3 +481,27 @@ def _transform_parameters(self, pipeline, proposed_parameters):
                        component_parameters[param_name] = value
            parameters[name] = component_parameters
        return parameters
+
+    def _catch_warnings(self, warning_list):


This was only used in pipeline building so I moved it in as well.
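The body of `_catch_warnings` is not shown in this thread, so the following is only a guess at the general pattern: record warnings raised while building pipelines, then forward them to a logger. The function name `log_recorded_warnings` and the warning text are hypothetical; only the standard-library `warnings` and `logging` calls are real.

```python
import logging
import warnings


def log_recorded_warnings(warning_list, logger):
    """Forward each recorded warning's message to the logger.

    Returns the messages so callers (and tests) can inspect them.
    """
    messages = [str(w.message) for w in warning_list]
    for message in messages:
        logger.warning(message)
    return messages


logger = logging.getLogger("toy_automl")

# record=True collects warnings in a list instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn("component X is not installed")

messages = log_recorded_warnings(caught, logger)
```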

jeremyliweishih · 2021-10-01T18:31:33Z

evalml/automl/automl_search.py

-        )
-        self.logger.debug(
-            f"allowed_model_families set to {self.allowed_model_families}"
+        text_in_ensembling = (


Opted to leave this out as both IterativeAlgorithm and DefaultAlgorithm take in text_in_ensembling as an argument.

jeremyliweishih · 2021-10-01T19:10:38Z

evalml/tests/conftest.py

@@ -193,6 +193,8 @@ def assert_allowed_pipelines_equal_helper():
    def assert_allowed_pipelines_equal_helper(
        actual_allowed_pipelines, expected_allowed_pipelines
    ):
+        actual_allowed_pipelines.sort(key=lambda p: p.name)


Because pipelines used to be built in AutoMLSearch, tests compared against an unsorted list of pipelines. I changed this because self.allowed_pipelines in IterativeAlgorithm is now sorted according to _ESTIMATOR_FAMILY_ORDER.

angela97lin (Oct 6, 2021): If we want to do this, do we have tests in place to confirm that the order of allowed_pipelines is as expected? My concern here is that by sorting both lists, we're able to confirm that the types of pipelines match, but not the order in which they're executed anymore.

jeremyliweishih (Oct 6, 2021): Yes, I agree, but from what I could tell assert_allowed_pipelines_equal_helper isn't used in any tests that check for order, and we have tests in test_iterative_algorithm.py, like test_iterative_algorithm_first_batch_order_param, that do account for that case!

jeremyliweishih (Oct 6, 2021): To clarify: since most of the tests in test_automl etc. use the default iterative algorithm, the pipelines will be sorted in the order defined by _ESTIMATOR_FAMILY_ORDER in the iterative algorithm. But since these tests compare directly against make_pipeline, the pipelines being compared are in a different order. Another solution would be to sort the expected pipelines into _ESTIMATOR_FAMILY_ORDER, but I opted for the simpler solution.
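To make the trade-off concrete, here is a small sketch of the two options discussed: comparing pipelines after sorting both lists by name (order-insensitive), versus sorting the expected list into the family order. The `Pipeline` tuple, `FAMILY_ORDER`, and both helper names are hypothetical; the real `_ESTIMATOR_FAMILY_ORDER` contents are not shown here. Unlike the `.sort()` call in the diff, this sketch uses `sorted()` so the inputs are not mutated.

```python
from collections import namedtuple

# Hypothetical stand-in for a pipeline object.
Pipeline = namedtuple("Pipeline", ["name", "family"])

# Assumed ordering for illustration only.
FAMILY_ORDER = ["linear_model", "random_forest", "xgboost"]


def same_pipelines_any_order(actual, expected):
    """Check that both lists hold the same pipelines, ignoring order."""
    return sorted(actual, key=lambda p: p.name) == sorted(
        expected, key=lambda p: p.name
    )


def in_family_order(pipelines):
    """Return the pipelines sorted by their family's position in FAMILY_ORDER."""
    return sorted(pipelines, key=lambda p: FAMILY_ORDER.index(p.family))


built = [
    Pipeline("XGBoost Classifier", "xgboost"),
    Pipeline("Elastic Net Classifier", "linear_model"),
    Pipeline("Random Forest Classifier", "random_forest"),
]
algorithm_order = in_family_order(built)
```

The name-sorted comparison passes for `built` versus `algorithm_order` even though the two lists differ in order, which is exactly why execution order needs its own dedicated tests.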

angela97lin

Looks good--left some smaller comments to address before merging, but otherwise pretty excited by this cleanup and separation!

angela97lin · 2021-10-06T15:24:40Z

evalml/automl/automl_algorithm/iterative_algorithm.py

@@ -129,6 +193,88 @@ def __init__(
                        " and Real!"
                    )

+    def _create_pipelines(self):


DefaultAlgorithm doesn't have the same _create_pipelines API, right? Or do you just mean because we moved the logic around so we have more similar dependencies / parameter expectations?

angela97lin · 2021-10-06T16:45:45Z

evalml/tests/conftest.py

@@ -193,6 +193,8 @@ def assert_allowed_pipelines_equal_helper():
    def assert_allowed_pipelines_equal_helper(
        actual_allowed_pipelines, expected_allowed_pipelines
    ):
+        actual_allowed_pipelines.sort(key=lambda p: p.name)


If we want to do this, do we have tests in place to confirm that the order of allowed_pipelines is as expected? My concern here is that by sorting both lists, we're able to confirm that the types of pipelines match, but not the order in which they're executed anymore.

bchen1116

LGTM! Left two nits about docs

evalml/automl/automl_algorithm/automl_algorithm.py

evalml/automl/automl_algorithm/default_algorithm.py

jeremyliweishih added 3 commits September 28, 2021 12:22

Move pipeline logic to iterative algorithm and begin fixing tests

943615a

Fix iterativ algorithm tests

d8322fe

Merge branch 'main' of github.com:alteryx/evalml into js_2656_pipeline_building

b2c025a

jeremyliweishih changed the title ~~Js 2656 pipeline building~~ Move pipeline building into IterativeAlgorithm Sep 28, 2021

jeremyliweishih added 14 commits September 28, 2021 17:35

Fix iterative algorithm tests

77d7d06

Fix automl tests

e998dcd

Fix core-dependencies test

e1a258c

Merge branch 'main' of github.com:alteryx/evalml into js_2656_pipeline_building

e14543f

lint

2870174

lint

ed143c3

Add docstrings

1c9c6bd

Lint

6a1fd8a

Lint again

4409e4f

Merge branch 'main' of github.com:alteryx/evalml into js_2656_pipeline_building

fae4839

lint

c254d67

Fix 3.7 test failing

c3289bc

Remove random import

384b63d

lint

119d9f6

jeremyliweishih commented Oct 1, 2021

jeremyliweishih added 5 commits October 1, 2021 13:06

Move more ensembling out of search

d7585f9

Merge branch 'main' of github.com:alteryx/evalml into js_2656_pipeline_building

0a36542

Fix broken iteration logic

0ef7df0

Fix broken test due to mocking

73c534a

RL

636821e

jeremyliweishih commented Oct 1, 2021

evalml/automl/automl_search.py (outdated, resolved)

Revert back self.allowed_pipelines call

000949c

jeremyliweishih commented Oct 1, 2021

Remove uncessary make_data_type calls

de45fef

jeremyliweishih marked this pull request as ready for review October 1, 2021 19:26

auto-assign bot assigned jeremyliweishih Oct 1, 2021

jeremyliweishih requested review from angela97lin, bchen1116, dsherry, christopherbunn, chukarsten, eccabay, freddyaboulton and ParthivNaresh October 1, 2021 19:46

eccabay reviewed Oct 4, 2021

evalml/automl/automl_algorithm/iterative_algorithm.py (outdated, resolved)

evalml/automl/automl_algorithm/iterative_algorithm.py (outdated, resolved)

jeremyliweishih added 3 commits October 5, 2021 14:38

Merge branch 'main' into js_2656_pipeline_building

777dea3

Merge branch 'main' into js_2656_pipeline_building

e88ae2d

Fix docstrings

7f00190

angela97lin approved these changes Oct 6, 2021

bchen1116 approved these changes Oct 6, 2021

evalml/automl/automl_algorithm/automl_algorithm.py (resolved)

evalml/automl/automl_algorithm/default_algorithm.py (resolved)

jeremyliweishih added 6 commits October 7, 2021 14:50

Merge branch 'main' of github.com:alteryx/evalml into js_2656_pipeline_building

508c57c

Fix merge

510500f

lint

bdbc348

Fix docstrings

0ff3591

Address comments

a3a182c

Fix other comments

28ade4e

jeremyliweishih merged commit 023babf into main Oct 7, 2021

chukarsten mentioned this pull request Oct 14, 2021

Release v0.35.0 #2918

Merged

freddyaboulton deleted the js_2656_pipeline_building branch May 13, 2022 15:03
