
Pass X_train, y_train in Engine.submit_scoring_job for time series #2786

Merged
freddyaboulton merged 4 commits into main from 2785-fix-score-pipelines-for-automl-search on Sep 16, 2021

Conversation

freddyaboulton (Contributor)

Pull Request Description

Fixes #2785


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:`123`.
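
For context, here is a rough sketch of the workflow this fix enables: scoring time series pipelines found by AutoMLSearch against a holdout set. The dataset, column names, and problem_configuration values below are illustrative assumptions, not code from this PR.

```python
import pandas as pd
from evalml.automl import AutoMLSearch

# Toy time series data; the actual contents don't matter for the sketch.
X = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=100),
    "feature": range(100),
})
y = pd.Series(range(100), name="target")
X_train, y_train = X.iloc[:80], y.iloc[:80]
X_holdout, y_holdout = X.iloc[80:], y.iloc[80:]

automl = AutoMLSearch(
    X_train=X_train,
    y_train=y_train,
    problem_type="time series regression",
    # Illustrative configuration; key names assumed from the evalml API of this era.
    problem_configuration={"date_index": "date", "gap": 0, "max_delay": 2},
    max_batches=1,
)
automl.search()

# Time series pipelines need the training data to build delayed features for the
# holdout rows, so the engine has to ship X_train/y_train along with X/y.
scores = automl.score_pipelines(
    [automl.best_pipeline], X_holdout, y_holdout, objectives=["MedianAE"]
)
```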


codecov bot commented Sep 15, 2021

Codecov Report

Merging #2786 (23aeeb5) into main (d54173a) will increase coverage by 0.8%.
The diff coverage is 100.0%.


@@           Coverage Diff           @@
##            main   #2786     +/-   ##
=======================================
+ Coverage   99.0%   99.8%   +0.8%     
=======================================
  Files        298     298             
  Lines      27646   27681     +35     
=======================================
+ Hits       27364   27613    +249     
+ Misses       282      68    -214     
| Impacted Files | Coverage Δ |
|---|---|
| evalml/automl/automl_search.py | 99.9% <100.0%> (+0.2%) ⬆️ |
| evalml/automl/engine/cf_engine.py | 100.0% <100.0%> (ø) |
| evalml/automl/engine/dask_engine.py | 100.0% <100.0%> (ø) |
| evalml/automl/engine/engine_base.py | 100.0% <100.0%> (ø) |
| evalml/automl/engine/sequential_engine.py | 100.0% <100.0%> (ø) |
| evalml/tests/automl_tests/dask_test_utils.py | 100.0% <100.0%> (ø) |
| ...ts/automl_tests/parallel_tests/test_automl_dask.py | 100.0% <100.0%> (ø) |
| evalml/tests/automl_tests/test_automl.py | 99.7% <0.0%> (+0.1%) ⬆️ |
| evalml/automl/utils.py | 100.0% <0.0%> (+1.7%) ⬆️ |
| ... and 6 more | |


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d54173a...23aeeb5.


@pytest.mark.parametrize(
    "engine_str",
    engine_strs + ["sequential"],
)
freddyaboulton (Contributor, Author) commented:

@chukarsten Check it out - threaded engines respect mocks
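
Illustrative sketch (not the PR's test code) of why thread-based engines see unittest.mock patches: the worker threads run in the same, already-patched process, whereas process-based workers re-import the module unpatched.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from unittest import mock

def compute():
    # Calls the (possibly patched) module-level function.
    return math.sqrt(4)

with mock.patch("math.sqrt", return_value=-1):
    with ThreadPoolExecutor() as executor:
        # The thread worker shares the patched process, so the mock is respected.
        assert executor.submit(compute).result() == -1
```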

ParthivNaresh (Contributor) left a comment:

Solid catch, looks good!

angela97lin (Contributor) left a comment:

Looks good! 😁

@@ -159,6 +163,7 @@ def submit_scoring_job(self, automl_config, pipeline, X, y, objectives):
        X_schema = X.ww.schema
        y_schema = y.ww.schema
        X, y = self.send_data_to_cluster(X, y)
        X_train, y_train = self.send_data_to_cluster(X_train, y_train)
A reviewer (Contributor) commented:

Just for my own curiosity: theoretically, if send_data_to_cluster supported more arguments, this could have been combined with the line above, right? 🤔
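
A hypothetical sketch of that idea, not the existing engine API: if send_data_to_cluster accepted a variable number of datasets, the two calls above could collapse into one. The *data signature and the use of Client.scatter are assumptions for illustration.

```python
def send_data_to_cluster(self, *data):
    """Hypothetical variadic version: scatter any number of datasets to the cluster."""
    return tuple(self.client.scatter(d, broadcast=True) for d in data)

# The two calls above would then become:
# X, y, X_train, y_train = self.send_data_to_cluster(X, y, X_train, y_train)
```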

Comment on lines 293 to 296
    X_train, y_train = X[:50], y[:50]
    X_test, y_test = X[50:], y[50:]
    X_train, y_train = pd.DataFrame(X_train), pd.Series(y_train)
    X_test, y_test = pd.DataFrame(X_test), pd.Series(y_test)
angela97lin (Contributor) commented:

Omega nitpick: could probably combine these lines to:

X_train, y_train = pd.DataFrame(X[:50]), pd.Series(y[:50])
X_test, y_test = pd.DataFrame(X[50:]), pd.Series(y[50:])

But might just be personal preference 😅

Side note: This is probably a task for the larger test refactoring/cleanup PR, but I wonder if it's worth making our fixtures dataframes, since we've slowly been moving away from explicitly supporting numpy arrays lol
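
A hedged sketch of that side note: a pytest fixture that hands tests a DataFrame/Series pair directly instead of numpy arrays. The fixture name and data are illustrative, not the repo's actual fixtures.

```python
import numpy as np
import pandas as pd
import pytest

@pytest.fixture
def X_y_binary_df():
    # pandas inputs, closer to what users actually pass in.
    rng = np.random.default_rng(0)
    X = pd.DataFrame(
        rng.standard_normal((100, 5)), columns=[f"col_{i}" for i in range(5)]
    )
    y = pd.Series(rng.integers(0, 2, size=100), name="target")
    return X, y
```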

freddyaboulton (Contributor, Author) replied:

@angela97lin Agreed that having our most common fixtures be numpy arrays is not ideal - it means we may be testing our code with inputs that aren't representative of what a user would actually pass in!

@freddyaboulton force-pushed the 2785-fix-score-pipelines-for-automl-search branch from 2ecef68 to a654370 on September 16, 2021 14:35
@freddyaboulton force-pushed the 2785-fix-score-pipelines-for-automl-search branch from a654370 to 23aeeb5 on September 16, 2021 16:25
@freddyaboulton merged commit d1e6afb into main on Sep 16, 2021
@freddyaboulton deleted the 2785-fix-score-pipelines-for-automl-search branch on September 16, 2021 17:08
@chukarsten mentioned this pull request on Oct 1, 2021

Successfully merging this pull request may close these issues.

AutoMLSearch.score_pipelines does not work for time series pipelines