Cap seasonal period for inclusion of STL in search #4146

eccabay · 2023-04-12T20:24:07Z

The larger a dataset's seasonal period, the longer STL takes to train. There's some wiggle room here, as STL takes longer to fit on a random dataset than on a perfectly seasonal one given the same period, but the relationship holds in general.

We have a few example datasets where the detected period is around 1800, but visual inspection shows no real seasonality at all. The pipelines that include STL also perform worse on average than those without the decomposer.

Therefore, we should cap the seasonal period: the STL decomposer should only be included in pipelines when the detected period is below that cap.

eccabay · 2023-04-12T21:06:25Z

(The graph does have axes; you have to click the image to see them.)
Here is a plot of the time it takes to run STLDecomposer.fit() in the worst-case and best-case scenarios: totally random data (worst) and perfectly seasonal data (best).

With a standard run of search for time series, we run 10 datasets with the STLDecomposer, and within each pipeline we train it 3 times (once per CV split). With a dataset where it takes 30 seconds to fit the STLDecomposer once, we will spend 15 minutes (10 × 3 × 30 s = 900 s) fitting the decomposer across all datasets. For a random dataset, we hit that at a period of around 3700.

I propose that we set the bar much lower, at a seasonal period of 1000. From my testing, that takes 3s/fit in the best case and 8s/fit in the worst. That means we will spend at most 1.5 minutes (best case) to 4 minutes (worst case) in total just fitting the STLDecomposer.
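For anyone who wants to check the numbers, here is a minimal sketch of the arithmetic. The function and parameter names are made up for illustration and are not part of the library; it only assumes the 10 STL runs per search and 3 CV splits described above.

```python
def total_fit_minutes(seconds_per_fit, n_stl_runs=10, n_cv_splits=3):
    """Total time spent fitting the STL decomposer in one search, in minutes.

    n_stl_runs and n_cv_splits default to the figures quoted in this thread
    (10 runs that include STL, each fit once per CV split).
    """
    total_fits = n_stl_runs * n_cv_splits
    return seconds_per_fit * total_fits / 60


# Per-fit timings quoted in the thread.
scenarios = {
    "period ~3700, random data": 30,
    "period 1000, best case": 3,
    "period 1000, worst case": 8,
}

for label, seconds in scenarios.items():
    print(f"{label}: {total_fit_minutes(seconds)} min")
```

With the per-fit times from the thread this gives 15, 1.5 and 4 minutes respectively.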

Conceptually, this limit is reasonable as seasonal periods that large are rare. We will need to run performance tests on this before merging, but I'm confident this will be exclusively a performance improvement.
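The check itself could look something like the sketch below. This is only an illustration under assumptions: `MAX_STL_PERIOD` and `should_include_stl` are hypothetical names, and treating a period of exactly 1000 as still allowed, and a missing period as "skip STL", are guesses rather than decisions made in this issue.

```python
# Hypothetical constant: the cap of 1000 proposed in this thread.
MAX_STL_PERIOD = 1000


def should_include_stl(seasonal_period, max_period=MAX_STL_PERIOD):
    """Decide whether to add the STL decomposer to search pipelines.

    Assumptions (not confirmed by the thread): a period equal to the cap
    is still accepted, and an undetected period (None) excludes STL.
    """
    if seasonal_period is None:
        return False
    return seasonal_period <= max_period


print(should_include_stl(7))     # small weekly-style period
print(should_include_stl(1800))  # the ~1800 period seen on the example datasets
```

Under these assumptions the datasets with a detected period around 1800 would skip STL entirely.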

exalate-issue-sync bot assigned eccabay Apr 12, 2023

eccabay mentioned this issue Apr 13, 2023

Add cap to seasonal period to include decomposition #4147

Merged

chukarsten closed this as completed in #4147 Apr 18, 2023
