
Training and scoring pipelines with AutoMLSearch #1913

Merged (16 commits into main on Mar 8, 2021)

Conversation

@freddyaboulton (Contributor, Author) commented Mar 2, 2021:

Pull Request Description

Fixes #1729


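For context, a rough usage sketch of what this PR enables (method names follow the rename to train_pipelines/score_pipelines discussed later in this review; the exact signatures and return shapes are illustrative assumptions, not code taken from the PR):

import evalml
from evalml.automl import AutoMLSearch

X, y = evalml.demos.load_breast_cancer()
X_train, X_holdout, y_train, y_holdout = evalml.preprocessing.split_data(
    X, y, problem_type="binary", test_size=0.2)

automl = AutoMLSearch(X_train=X_train, y_train=y_train, problem_type="binary")
automl.search()

# Train a batch of pipelines found during search, then score them on holdout data.
pipelines = [automl.get_pipeline(i) for i in automl.rankings["id"][:3]]
trained = automl.train_pipelines(pipelines)  # assumed: dict of trained pipelines keyed by name
scores = automl.score_pipelines(list(trained.values()), X_holdout, y_holdout,
                                objectives=["Log Loss Binary", "F1"])
# scores: dict keyed by pipeline name, mapping to {objective name: score}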

@freddyaboulton force-pushed the 1729-score-fit-pipelines-on-holdout-with-automl branch from a083332 to 435330a on March 3, 2021 15:11

codecov bot commented Mar 3, 2021:

Codecov Report

Merging #1913 (3437792) into main (22e158e) will increase coverage by 0.1%.
The diff coverage is 100.0%.


@@            Coverage Diff            @@
##             main    #1913     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         265      265             
  Lines       21737    21936    +199     
=========================================
+ Hits        21731    21930    +199     
  Misses          6        6             
Impacted Files Coverage Δ
evalml/automl/automl_search.py 100.0% <100.0%> (ø)
evalml/automl/engine/engine_base.py 100.0% <100.0%> (ø)
evalml/automl/engine/sequential_engine.py 100.0% <100.0%> (ø)
evalml/pipelines/pipeline_base.py 100.0% <100.0%> (ø)
evalml/tests/automl_tests/test_automl.py 100.0% <100.0%> (ø)
.../automl_tests/test_automl_search_classification.py 100.0% <100.0%> (ø)
evalml/tests/automl_tests/test_engine_base.py 100.0% <100.0%> (ø)
...valml/tests/automl_tests/test_sequential_engine.py 100.0% <100.0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

random_seed=self.random_seed)
self._best_pipeline.fit(X_train, y_train)
tune_binary_threshold(self._best_pipeline, self.objective, self.problem_type, X_threshold_tuning, y_threshold_tuning)
best_pipeline = self._engine.train_pipeline(best_pipeline, self.X_train, self.y_train,

@freddyaboulton (Contributor, Author) commented Mar 3, 2021:

This line will change once #1814 is merged. We'll have to subset to the ensemble indices before calling train_pipeline.

@freddyaboulton force-pushed the 1729-score-fit-pipelines-on-holdout-with-automl branch from 3d0e31d to b49aee4 on March 3, 2021 19:26
@freddyaboulton marked this pull request as ready for review on March 3, 2021 20:04
@freddyaboulton requested a review from dsherry on March 3, 2021 20:04
@freddyaboulton force-pushed the 1729-score-fit-pipelines-on-holdout-with-automl branch from b49aee4 to 18bb3ef on March 3, 2021 22:59

@bchen1116 (Contributor) left a comment:

Left some questions and suggested some other tests that I think would be good to add! It would definitely be nice to discuss the implications of passing None to train and score.

objective (ObjectiveBase): Objective used in threshold tuning.

Returns:
PipelineBase - trained pipeline.

Contributor:

nitpick: can we do

"""
Returns:
    pipeline (PipelineBase): Trained pipeline.
"""

Or something similar?

y (ww.DataTable, pd.DataFrame): Data to score on.
objectives (list(ObjectiveBase), list(str)): Objectives to score on.
Returns:
Dict: Dict containining scores for all objectives for all pipelines. Keyed by pipeline name.

Contributor:

typo: containing

ValueError if any pipeline names are duplicated.
"""
seen_names = set([])
duplicate_names = set([])

Contributor:

Is there a reason to do set([]) versus set()? Just curious, I've only ever used set().

@freddyaboulton (Contributor, Author):

I believe set() is better, hehe. I'll make the change.


@patch('evalml.pipelines.BinaryClassificationPipeline.score')
@patch('evalml.pipelines.BinaryClassificationPipeline.fit')
def test_train_batch_score_batch(mock_fit, mock_score, dummy_binary_pipeline_class, X_y_binary):

Contributor:

What happens if we try to train or score before AutoMLSearch has been run? Or if we try to score before we've trained a pipeline batch? Can we add test coverage?

Contributor:

Also, what happens if the user passes in a stacked ensemble pipeline (in both cases where ensembling runs and ensembling doesn't run)?

@freddyaboulton (Contributor, Author) commented Mar 3, 2021:

Great questions @bchen1116 ! train_batch and score_batch are adjacent to search and mainly help users leverage the underlying engine for fitting and scoring. In my mind it makes the most sense to run this after search with pipelines you get via get_pipeline but I don't think we need to enforce that.

So to answer your questions:

  1. Should work as expected
  2. If you score a pipeline that hasn't been fit the exception will be in the log and the score will be nan for all objectives.
  3. Should work as expected, i.e. the pipeline is either fit/scored successfully.

I'll add coverage 😄
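
A rough sketch of what the coverage for point 2 could look like (scoring a never-fit pipeline logs the exception and reports NaN for every objective); the fixture names, AutoMLSearch arguments, and asserted log text are assumptions, not the PR's actual test code:

import numpy as np
from evalml.automl import AutoMLSearch

def test_score_unfitted_pipeline_is_nan(caplog, X_y_binary, dummy_binary_pipeline_class):
    X, y = X_y_binary
    automl = AutoMLSearch(X_train=X, y_train=y, problem_type="binary", max_iterations=1)
    # Score a pipeline that was never fit; the engine should catch the error instead of raising.
    scores = automl.score_pipelines([dummy_binary_pipeline_class({})], X, y, ["Log Loss Binary"])
    # One entry per pipeline, keyed by pipeline name; every objective score should be NaN.
    for per_objective_scores in scores.values():
        assert all(np.isnan(value) for value in per_objective_scores.values())
    # The underlying exception ends up in the log (exact wording assumed).
    assert caplog.text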

Contributor:

Thanks for the clarification, sounds good to me!

evalml/tests/automl_tests/test_engine_base.py (resolved review thread)
seen_names.add(pipeline.name)

if duplicate_names:
    raise ValueError(f"All pipeline names must be unique. The names {', '.join(duplicate_names)} were repeated.")

Contributor:

Nitpick, but can we add quotation marks around each name? Maybe something like:

duplicates = "', '".join(duplicate_names)
f"All pipeline names must be unique. The names {duplicates} were repeated."

just for easier readability

@freddyaboulton (Contributor, Author):

Good call

@freddyaboulton force-pushed the 1729-score-fit-pipelines-on-holdout-with-automl branch from 231d4dd to 6fe721c on March 4, 2021 16:30
@freddyaboulton requested a review from bchen1116 on March 4, 2021 18:18

@bchen1116 (Contributor) left a comment:

@freddyaboulton thanks for the fixes! Everything looks great! I just left a comment on adding quotation marks around the name for clarity, since it seems like my original suggestion didn't work, but looks good otherwise!

class Pipeline2(dummy_binary_pipeline_class):
    custom_name = "My Pipeline"

with pytest.raises(ValueError, match="All pipeline names must be unique. The names My Pipeline were repeated."):

Contributor:

Oh sad, the quotation marks didn't make it in. Can we add quotes to the pipeline names? Might need to do \" or something.

@freddyaboulton (Contributor, Author):

Done!
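
For reference, one way the quoting could look (a quick sketch, not necessarily the exact line that was merged):

duplicate_names = {"My Pipeline", "Other Pipeline"}
quoted = ", ".join(f"'{name}'" for name in sorted(duplicate_names))
message = f"All pipeline names must be unique. The names {quoted} were repeated."
# message == "All pipeline names must be unique. The names 'My Pipeline', 'Other Pipeline' were repeated."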

assert exception in caplog.text

# Test training before search is run
train_batch_and_check()

Contributor:

🎉

@freddyaboulton force-pushed the 1729-score-fit-pipelines-on-holdout-with-automl branch from e96ce50 to 3501ac5 on March 4, 2021 19:07
@@ -10,6 +10,8 @@ Release Notes
* Added utility method to create list of components from a list of ``DataCheckAction`` :pr:`1907`
* Updated ``validate`` method to include a ``action`` key in returned dictionary for all ``DataCheck``and ``DataChecks`` :pr:`1916`
* Aggregating the shap values for predictions that we know the provenance of, e.g. OHE, text, and date-time. :pr:`1901`
* Added ``score_batch`` and ``train_batch`` methods to ``AutoMLSearch`` :pr:`1913`

Collaborator:

I think this should be train_pipelines and score_pipelines.

@freddyaboulton (Contributor, Author):

You're totally right 🙈

@chukarsten (Contributor) left a comment:

I like this PR and I think it's really going to start to push some solid refactoring of the Engines and AutoMLSearch. I left a few comments; I think the only blocking ones are:

  1. Resolution of placement of EngineBase.tune_threshold() static method.
  2. Some docstring stuff.
  3. Pipeline cloning discussion and resolution.

evalml/automl/engine/engine_base.py (resolved review thread)
evalml/automl/engine/engine_base.py (outdated, resolved review thread)
"""
X_threshold_tuning = None
y_threshold_tuning = None
if EngineBase.tune_threshold(pipeline, optimize_thresholds, objective):

Contributor:

Yea, I dunno, tune_threshold seems like it might better belong in the utils from whence tune_binary_threshold came. What do you think?

evalml/automl/engine/engine_base.py (resolved review thread)
evalml/automl/engine/engine_base.py (resolved review thread)
evalml/automl/utils.py (outdated, resolved review thread)
evalml/tests/automl_tests/test_automl.py (resolved review thread)
evalml/tests/automl_tests/test_engine_base.py (resolved review thread)
evalml/tests/automl_tests/test_engine_base.py (resolved review thread)
@freddyaboulton force-pushed the 1729-score-fit-pipelines-on-holdout-with-automl branch from 4464df1 to b71837c on March 8, 2021 16:29
@freddyaboulton force-pushed the 1729-score-fit-pipelines-on-holdout-with-automl branch from 9511aa5 to 3437792 on March 8, 2021 20:42

@chukarsten (Contributor) left a comment:

Looks good!

@freddyaboulton merged commit c4306f6 into main on Mar 8, 2021
@freddyaboulton deleted the 1729-score-fit-pipelines-on-holdout-with-automl branch on March 8, 2021 22:07
@dsherry mentioned this pull request on Mar 11, 2021
Successfully merging this pull request may close: Make it easy to score automl search pipelines on holdout (#1729).