Add TimeSeriesRegression problem type #1386

freddyaboulton · 2020-10-30T21:26:38Z

Pull Request Description

After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

codecov · 2020-10-30T21:34:18Z

Codecov Report

Merging #1386 (f4ba7d8) into main (6a80b40) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #1386     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         213      213             
  Lines       13940    13946      +6     
=========================================
+ Hits        13933    13939      +6     
  Misses          7        7

Impacted Files	Coverage Δ
evalml/automl/automl_search.py	`99.7% <100.0%> (ø)`
evalml/model_understanding/graphs.py	`100.0% <100.0%> (ø)`
...alml/objectives/binary_classification_objective.py	`100.0% <100.0%> (ø)`
.../objectives/multiclass_classification_objective.py	`100.0% <100.0%> (ø)`
evalml/objectives/objective_base.py	`100.0% <100.0%> (ø)`
evalml/objectives/regression_objective.py	`100.0% <100.0%> (ø)`
evalml/objectives/utils.py	`100.0% <100.0%> (ø)`
evalml/pipelines/binary_classification_pipeline.py	`100.0% <100.0%> (ø)`
evalml/pipelines/pipeline_base.py	`100.0% <100.0%> (ø)`
evalml/problem_types/problem_types.py	`100.0% <100.0%> (ø)`
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6a80b40...f4ba7d8.

bchen1116 · 2020-11-03T16:11:00Z

evalml/tests/problem_type_tests/test_problem_types.py

@@ -16,7 +16,7 @@ def correct_problem_types():


 def test_handle_string(correct_problem_types):
-    problem_types = ['regression', ProblemTypes.MULTICLASS, 'binary']
+    problem_types = ['regression', ProblemTypes.MULTICLASS, 'binary', ProblemTypes.TIME_SERIES_REGRESSION]


How did you add ProblemTypes.TIME_SERIES_REGRESSION here without adding anything to correct_problem_types?

Hm, I think we actually need to add to correct_problem_types for this test to be fully updated. The reason it passes right now is that zip(problem_types, correct_problem_types) only iterates while both lists still have items, so it stops after the 3 elements of correct_problem_types and never checks the new fourth entry.

Good catch, guys! I'll update this.

👏 great reviewing @bchen1116 @angela97lin !
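
The truncation described in this thread can be reproduced in isolation. The snippet below is a minimal sketch, not the actual evalml test: `ProblemTypes` is a stand-in enum, and `expected` is a hypothetical stale fixture that still has three entries after a fourth input was added.

```python
from enum import Enum


class ProblemTypes(Enum):
    """Stand-in for evalml's enum; only the members used below."""
    BINARY = 'binary'
    MULTICLASS = 'multiclass'
    REGRESSION = 'regression'
    TIME_SERIES_REGRESSION = 'time series regression'


inputs = ['regression', ProblemTypes.MULTICLASS, 'binary',
          ProblemTypes.TIME_SERIES_REGRESSION]
# Hypothetical stale fixture: three entries, although there are four inputs.
expected = [ProblemTypes.REGRESSION, ProblemTypes.MULTICLASS, ProblemTypes.BINARY]

# zip stops at the shorter iterable, so the fourth input is never compared.
pairs = list(zip(inputs, expected))
compared_inputs = [inp for inp, _ in pairs]

# An explicit length check makes the mismatch visible before any comparison.
lengths_match = len(inputs) == len(expected)
```

Asserting `len(inputs) == len(expected)` before the loop is one way to guard against this. On Python 3.10 and later, `zip(inputs, expected, strict=True)` raises a `ValueError` on a length mismatch instead of silently truncating.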

angela97lin

LGTM! Just a few small comments about updating the test completely and the docstrings for each of the objective classes, but otherwise looks good 😁

angela97lin · 2020-11-03T22:58:30Z

docs/source/release_notes.rst

+        * Added a problem type for time series regression :pr:`1386`
+        * Added a ``is_defined_for_problem_type`` method to ``ObjectiveBase`` :pr:`1386`


angela97lin · 2020-11-03T23:03:27Z

evalml/automl/automl_search.py

                raise ValueError("Additional objective {} is not compatible with a {} problem.".format(obj.name, self.problem_type.value))

        for pipeline in self.allowed_pipelines or []:
-            if not pipeline.problem_type == self.problem_type:
+            if pipeline.problem_type != self.problem_type:


angela97lin · 2020-11-03T23:10:49Z

evalml/objectives/regression_objective.py

@@ -9,5 +9,5 @@ class RegressionObjective(ObjectiveBase):
    problem_type (ProblemTypes): Type of problem this objective is. Set to ProblemTypes.REGRESSION.


We should update the docstring for each of the objective classes!

Great point!

angela97lin · 2020-11-03T23:12:13Z

evalml/objectives/regression_objective.py

@@ -9,5 +9,5 @@ class RegressionObjective(ObjectiveBase):
    problem_type (ProblemTypes): Type of problem this objective is. Set to ProblemTypes.REGRESSION.
    """

-    problem_type = ProblemTypes.REGRESSION
+    problem_types = [ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION]


For my own curiosity: Are all regression objectives time-series applicable? Is there ever the case where an objective should be ProblemTypes.TIME_SERIES_REGRESSION and not ProblemTypes.REGRESSION? :o

I would say all of our current regression objectives would also work for a time series problem. But in the future, we will add objectives that are time-series-specific, e.g.:

https://en.wikipedia.org/wiki/Mean_absolute_scaled_error

I think this is the right call.

All our current regression objectives are valid for timeseries regression. So allowing REGRESSION and TIME_SERIES_REGRESSION is great.

When we add the first time-series-only objective, we can override problem_types in each of those impls to only allow TIME_SERIES_REGRESSION. We could also then choose to define TimeseriesRegressionObjective to facilitate this, but probably not necessary.

We can follow the same pattern for binary/multiclass.
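
The pattern discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not the evalml source: the body of `is_defined_for_problem_type` is a guess at the simplest membership check, and `TimeSeriesOnlyObjective` is a hypothetical class standing in for a future time-series-specific objective.

```python
from enum import Enum


class ProblemTypes(Enum):
    """Stand-in for evalml's enum; only the members used below."""
    REGRESSION = 'regression'
    TIME_SERIES_REGRESSION = 'time series regression'


class ObjectiveBase:
    # Each objective lists every problem type it supports.
    problem_types = []

    @classmethod
    def is_defined_for_problem_type(cls, problem_type):
        # Assumed implementation: a plain membership test on the class list.
        return problem_type in cls.problem_types


class RegressionObjective(ObjectiveBase):
    # Current regression objectives are valid for both problem types.
    problem_types = [ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION]


class TimeSeriesOnlyObjective(RegressionObjective):
    """Hypothetical future objective that only makes sense for time series."""
    # Overriding the list narrows support without touching the base class.
    problem_types = [ProblemTypes.TIME_SERIES_REGRESSION]
```

With this layout, a time-series-only objective restricts itself by overriding one class attribute, which is why a dedicated `TimeseriesRegressionObjective` base class is optional.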

CLAassistant · 2020-11-04T14:41:57Z

All committers have signed the CLA.

dsherry

🚢 !

dsherry · 2020-11-06T19:56:33Z

evalml/objectives/binary_classification_objective.py

    can_optimize_threshold (bool): Determines if threshold used by objective can be optimized or not.
    """

-    problem_type = ProblemTypes.BINARY
+    problem_types = [ProblemTypes.BINARY]


dsherry · 2020-11-06T20:06:30Z

evalml/problem_types/problem_types.py

@@ -11,11 +11,14 @@ class ProblemTypes(Enum):
    """Multiclass classification problem."""
    REGRESSION = 'regression'
    """Regression problem."""
+    TIME_SERIES_REGRESSION = 'time_series_regression'


This is a nit-pick, but: in our docs examples we typically type out problem_type='multiclass' etc. Is it easier for people to type/remember problem_type='time series regression' or problem_type='time_series_regression'?

Good point! Avoiding the underscores seems easier to type so I changed it to that hehe.
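
To illustrate why the enum's string value matters to users, here is a small sketch of resolving a user-supplied string to an enum member by value. `to_problem_type` is a hypothetical helper written for this example; evalml's own string handling may work differently.

```python
from enum import Enum


class ProblemTypes(Enum):
    """Stand-in for evalml's enum, using the value from the final commit."""
    REGRESSION = 'regression'
    TIME_SERIES_REGRESSION = 'time series regression'


def to_problem_type(problem_type):
    """Hypothetical helper: accept an enum member or its string value."""
    if isinstance(problem_type, ProblemTypes):
        return problem_type
    if isinstance(problem_type, str):
        # Enum lookup by value; raises ValueError for unknown strings.
        return ProblemTypes(problem_type.lower())
    raise TypeError('problem_type must be a str or a ProblemTypes member')


resolved = to_problem_type('time series regression')

# With this value, the underscore spelling is not accepted.
try:
    to_problem_type('time_series_regression')
    underscore_accepted = True
except ValueError:
    underscore_accepted = False
```

Whichever spelling is chosen becomes the string users type in `problem_type=...`, so the value is effectively part of the public interface.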

dsherry · 2020-11-06T20:11:21Z

evalml/tests/automl_tests/test_automl.py

+    if isinstance(objective, RegressionObjective):
+        objective_type = ProblemTypes.REGRESSION
+    elif isinstance(objective, MulticlassClassificationObjective):
+        objective_type = ProblemTypes.MULTICLASS


This is fine for now. But once we add objectives which aren't specific to one problem type (like timeseries), we'll have to remember to change this. Is there anything we can do to future-proof this test?

One option: add problem_type to the parametrize. Then add

if not objective.is_defined_for_problem_type(problem_type): pytest.skip()

(syntax prob wrong)

No prob if you don't get to this. I hope that in the wild, we'll always know the problem_type when we're working with objectives, so this doesn't worry me outside of our tests.

Made this change! Good point - it does make the test simpler! And when we add time series to automl, we just need to update the list of values that's parametrized so that's great.
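
The idea in this thread, crossing objectives with problem types and skipping unsupported pairs, can be sketched without a test framework. The code below is an assumption-laden illustration: `FakeObjective` and the two objective names are hypothetical, and the plain loop stands in for what `pytest.mark.parametrize` plus `pytest.skip()` would do in a real test.

```python
from enum import Enum
from itertools import product


class ProblemTypes(Enum):
    """Stand-in for evalml's enum; only the members used below."""
    BINARY = 'binary'
    MULTICLASS = 'multiclass'
    REGRESSION = 'regression'


class FakeObjective:
    """Hypothetical objective carrying a list of supported problem types."""

    def __init__(self, name, problem_types):
        self.name = name
        self.problem_types = problem_types

    def is_defined_for_problem_type(self, problem_type):
        return problem_type in self.problem_types


OBJECTIVES = [
    FakeObjective('regression-only', [ProblemTypes.REGRESSION]),
    FakeObjective('classification-only',
                  [ProblemTypes.BINARY, ProblemTypes.MULTICLASS]),
]
PROBLEM_TYPES = [ProblemTypes.BINARY, ProblemTypes.MULTICLASS,
                 ProblemTypes.REGRESSION]


def plan_cases(objectives, problem_types):
    """Split every (objective, problem type) pair into run and skipped."""
    run, skipped = [], []
    for objective, problem_type in product(objectives, problem_types):
        case = (objective.name, problem_type.value)
        if not objective.is_defined_for_problem_type(problem_type):
            skipped.append(case)  # a real pytest test would call pytest.skip()
            continue
        run.append(case)
    return run, skipped


run, skipped = plan_cases(OBJECTIVES, PROBLEM_TYPES)
```

Because the test no longer infers the problem type from the objective's class, adding a new problem type only means extending the parametrized list.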

freddyaboulton self-assigned this Oct 30, 2020

freddyaboulton force-pushed the 1378-time-series-regression-problem-type branch from 06caaa9 to 5b586b0 Compare November 2, 2020 20:32

freddyaboulton marked this pull request as ready for review November 2, 2020 21:13

freddyaboulton requested review from dsherry, angela97lin, christopherbunn, eccabay, bchen1116 and jeremyliweishih November 2, 2020 21:14

bchen1116 reviewed Nov 3, 2020

angela97lin approved these changes Nov 3, 2020

freddyaboulton force-pushed the 1378-time-series-regression-problem-type branch 2 times, most recently from 2a32c8c to 7921feb Compare November 5, 2020 16:35

dsherry approved these changes Nov 6, 2020

freddyaboulton added 7 commits November 6, 2020 15:31

Add TimeSeriesRegression problem type. Had to update objective problem types to support multiple problem types.

a295044

Adding PR 1386 to the release notes.

e07e7ea

Adding is_defined_for_problem_type method.

e36ea6e

Add breaking change about ObjectiveBase.problem_types to release notes.

b026510

Updating docstrings for Objective classes to use problem_types instead of problem_type. Updating problem types pytest fixture.

f1eb074

Fixing objective docs.

0ac8c2b

Changing TIME_SERIES_REGRESSION problem type to 'time series regression'

f4ba7d8

freddyaboulton force-pushed the 1378-time-series-regression-problem-type branch from 7921feb to f4ba7d8 Compare November 6, 2020 20:32

freddyaboulton merged commit 216c8a1 into main Nov 6, 2020

freddyaboulton deleted the 1378-time-series-regression-problem-type branch November 6, 2020 20:48

dsherry mentioned this pull request Nov 24, 2020

Release v0.16.0 #1468

Merged
