Add STLDecomposer to multiseries pipelines #4299

remyogasawara · 2023-09-02T00:53:30Z

Resolves #4298

Acceptance Criteria (AC)

For multiseries time series regression, the pipelines that AutoMLSearch generates via make_pipeline include one pipeline with the STLDecomposer and one without, following the with/without pattern we already use for time series regression problems.
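
The with/without pattern in this criterion can be pictured with a small sketch. It is not evalml's implementation: `candidate_pipelines`, the component names, and the position of the decomposer in the list are hypothetical placeholders chosen only for illustration.

```python
from typing import List


def candidate_pipelines(base_components: List[str], decomposer: str) -> List[List[str]]:
    """Return two component lists: one with the decomposer, one without.

    Hypothetical helper. The last entry of base_components is assumed to be
    the estimator, and the decomposer is placed directly before it.
    """
    with_decomposer = base_components[:-1] + [decomposer, base_components[-1]]
    without_decomposer = list(base_components)
    return [with_decomposer, without_decomposer]


pipelines = candidate_pipelines(
    ["Time Series Featurizer", "Regressor"],  # illustrative names only
    "STL Decomposer",
)
for components in pipelines:
    print(" -> ".join(components))
```

Both variants share every other component, so any difference in their scores can be attributed to the decomposer.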

codecov · 2023-09-02T01:01:26Z

Codecov Report

Patch coverage: 100.0% and project coverage change: +0.1% 🎉

Comparison is base (1329988) 99.7% compared to head (28e2cdb) 99.7%.

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #4299     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        357     357             
  Lines      39577   39587     +10     
=======================================
+ Hits       39457   39467     +10     
  Misses       120     120

Files Changed	Coverage Δ
evalml/pipelines/component_graph.py	`99.8% <ø> (ø)`
...nents/transformers/preprocessing/stl_decomposer.py	`100.0% <100.0%> (ø)`
evalml/pipelines/time_series_pipeline_base.py	`100.0% <100.0%> (ø)`
evalml/pipelines/utils.py	`99.7% <100.0%> (+0.1%)`	⬆️
...valml/tests/automl_tests/test_default_algorithm.py	`100.0% <100.0%> (ø)`
...lml/tests/automl_tests/test_iterative_algorithm.py	`100.0% <100.0%> (ø)`
evalml/tests/pipeline_tests/test_pipeline_utils.py	`99.6% <100.0%> (ø)`


eccabay · 2023-09-08T13:40:22Z

evalml/pipelines/component_graph.py

@@ -806,10 +806,11 @@ def graph(self, name=None, graph_format=None):
                    [
                        key + " : " + "{:0.2f}".format(val)
                        if (isinstance(val, float))
-                        else key + " : " + str(val)
+                        else key + " : " + str(val).replace("{", "").replace("}", "")


This is hard to follow 😅 Can you add an explanatory comment?
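
For context, here is a minimal sketch of what the changed branch does to a parameter value, pulled out into a standalone function (`format_param` is a hypothetical name, not part of evalml). The likely motivation, stated here as an assumption, is that `{` and `}` are structural characters in Graphviz record-shaped node labels, so a dict-valued parameter printed verbatim could disturb the node layout.

```python
def format_param(key, val):
    """Format one 'key : value' entry for a graph node label (hypothetical helper)."""
    if isinstance(val, float):
        # Floats are shown with two decimal places.
        return key + " : " + "{:0.2f}".format(val)
    # Strip curly braces so dict-valued parameters cannot be read as
    # label structure by the graph renderer.
    return key + " : " + str(val).replace("{", "").replace("}", "")


print(format_param("degree", 0.5))
print(format_param("periods", {"a": 7, "b": 12}))
```

The second call prints the dict contents without the surrounding braces, which is the only behavior the diff changes.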

eccabay · 2023-09-08T13:49:01Z

evalml/pipelines/components/transformers/preprocessing/stl_decomposer.py

@@ -442,6 +442,7 @@ def inverse_transform(
            y.append(y_series)
        y_df = pd.DataFrame(y).T
        y_df.index = original_index
+        y_df.columns = y_t.columns


Out of curiosity, why is this necessary? What was the situation where the columns weren't the same?

remyogasawara · 2023-09-08

The predictions weren't getting the corresponding series ID values as their column names, which the decomposer needs in order to select the correct values. Without them, the decomposer was returning NaN values. @christopherbunn figured that out, so he might have more info.

christopherbunn · 2023-09-08

The predictions that are generated do not have the series ID values as their column names. Copying these names over is required so that we can call inverse_transform on the decomposer.
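
The NaN behavior described in this thread can be reproduced with plain pandas. The sketch below is not the STLDecomposer code. The series IDs `store_a` and `store_b`, the values, and the variable names other than `y_df` are made up for illustration. It only shows that a frame rebuilt from unnamed per-series results gets integer column labels, so a lookup by series ID finds nothing until the names are copied over.

```python
import pandas as pd

index = pd.date_range("2023-01-01", periods=3, freq="D")

# Stand-in for stored components, keyed by series ID (made-up IDs).
components = pd.DataFrame(
    {"store_a": [1.0, 2.0, 3.0], "store_b": [10.0, 20.0, 30.0]},
    index=index,
)

# Per-series results without names, rebuilt the same way as in the diff.
per_series = [pd.Series([0.1, 0.2, 0.3]), pd.Series([0.4, 0.5, 0.6])]
y_df = pd.DataFrame(per_series).T
y_df.index = index

# Columns are 0 and 1 here, so selecting by series ID yields only NaN.
lookup_before = y_df.reindex(columns=components.columns)

# Copying the series IDs over lets pandas align the two frames.
y_df.columns = components.columns
restored = y_df + components

print(lookup_before.isna().all().all())
print(restored.isna().any().any())
```

pandas aligns on labels rather than positions, which is why the single assignment to `y_df.columns` is enough to remove the NaN values in this toy case.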

eccabay · 2023-09-08T14:03:05Z

evalml/pipelines/utils.py

+        seasonal_period = STLDecomposer.determine_periodicity(
+            X,
+            y,
+            rel_max_order=order,
+        )
+        if seasonal_period is not None and seasonal_period <= DECOMPOSER_PERIOD_CAP:


The way determine_periodicity is set up, we're currently detecting a "period" on the single stacked target data column. I'm worried that's too brittle and could cause weird issues in the future. Could you put this in a conditional branch so that we only run it in the single series case, and for now just always add the decomposer for multiseries? We'll have to come back and revisit this, but that should be OK for the MVP.
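
One way the requested branch could look, as a sketch under stated assumptions rather than the code that was merged: `should_add_decomposer` is a hypothetical helper, `detect_period` stands in for the `STLDecomposer.determine_periodicity(X, y, rel_max_order=order)` call from the diff, and `PERIOD_CAP` is a placeholder value, not evalml's `DECOMPOSER_PERIOD_CAP`.

```python
from typing import Callable, Optional


def should_add_decomposer(
    is_multiseries: bool,
    detect_period: Callable[[], Optional[int]],
    period_cap: int,
) -> bool:
    """Decide whether a pipeline gets a decomposer (hypothetical helper)."""
    if is_multiseries:
        # Do not run period detection on the stacked target column;
        # for the MVP the decomposer is always added for multiseries.
        return True
    # Single series: keep the existing period-based check.
    seasonal_period = detect_period()
    return seasonal_period is not None and seasonal_period <= period_cap


PERIOD_CAP = 100  # placeholder, not evalml's DECOMPOSER_PERIOD_CAP

print(should_add_decomposer(True, lambda: None, PERIOD_CAP))
print(should_add_decomposer(False, lambda: 7, PERIOD_CAP))
print(should_add_decomposer(False, lambda: None, PERIOD_CAP))
```

Passing the detection step as a callable makes it explicit that the multiseries branch never triggers it.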

remyogasawara added 2 commits September 1, 2023 17:45

init commit

a7f704a

update release notes

39f3e80

remyogasawara and others added 9 commits September 5, 2023 17:12

add decomposer to tests

4689a95

Merge branch 'main' into 4298_add_stl_to_ms_pipeline

98e1119

handle duplicates

6aa8a25

Remove nan values - NOT FINISHED

6bea453

handle series and df

189eb49

fix stl graph

4a9ab0e

fix if statements

2fd85fe

revert utils

639408a

comments

12e1771

remyogasawara marked this pull request as ready for review September 8, 2023 00:40

auto-assign bot assigned remyogasawara Sep 8, 2023

remyogasawara requested review from christopherbunn, jeremyliweishih, MichaelFu512, eccabay and chukarsten September 8, 2023 00:40

eccabay reviewed Sep 8, 2023


remyogasawara added 2 commits September 8, 2023 09:34

add comments and conditional branch

8ab7fdf

fix condition for adding decomposer

28e2cdb

remyogasawara requested a review from eccabay September 8, 2023 18:34

jeremyliweishih approved these changes Sep 8, 2023


christopherbunn approved these changes Sep 8, 2023


remyogasawara merged commit 81abfca into main Sep 8, 2023
24 checks passed

remyogasawara deleted the 4298_add_stl_to_ms_pipeline branch September 8, 2023 20:40
