[timeseries] Minor bugfixes & improvements for local forecasting models #3252

shchur · 2023-05-31T14:35:33Z

Description of changes:

Expose use_fallback_model as an optional hyperparameter for all local models (default True). When set to False, the fallback model is disabled and any exception in the underlying model propagates. This is important for testing: one model had been failing on every run because of a bug, but CI did not catch it because the fallback model masked the failure.
Fix typos in docstrings
All local models are now trained on at most the last 2500 entries of each time series. This significantly reduces training time without degrading accuracy.
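To illustrate the first item, here is a minimal sketch of how a fallback flag can hide or expose a failing model. It is not the AutoGluon implementation: `forecast_with_fallback`, `naive_forecast`, and `broken_model` are hypothetical names, and a last-value naive forecast stands in for whatever fallback model the library actually uses.

```python
from typing import Callable, List, Sequence


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Stand-in fallback (assumption): repeat the last observed value."""
    return [float(history[-1])] * horizon


def forecast_with_fallback(
    model_fn: Callable[[Sequence[float], int], List[float]],
    history: Sequence[float],
    horizon: int,
    use_fallback_model: bool = True,
) -> List[float]:
    """Run model_fn; on failure either use the fallback or re-raise."""
    try:
        return model_fn(history, horizon)
    except Exception:
        if not use_fallback_model:
            # Re-raise the original exception so a test run can surface the bug.
            raise
        # Silently replace the failed forecast with the fallback forecast.
        return naive_forecast(history, horizon)


def broken_model(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical model with a bug that makes it fail on every input."""
    raise RuntimeError("model bug")
```

With `use_fallback_model=True`, `forecast_with_fallback(broken_model, [1.0, 2.0, 3.0], 2)` returns the fallback forecast and the bug goes unnoticed. With `use_fallback_model=False`, the same call raises `RuntimeError`, which is the behavior a unit test can rely on.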

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

tonyhoo · 2023-05-31T16:36:42Z

Thank you for addressing the issue. I have a question: why doesn't our CI detect the local model's failure and subsequent use of the naive model, given that this occurs consistently? Does this suggest that the naive model's performance is on par with the local model, or is the issue specific to the dataset being used?

tonyhoo · 2023-05-31T16:38:57Z

timeseries/src/autogluon/timeseries/models/local/abstract_local_model.py

@@ -71,6 +68,10 @@ def __init__(
            self.n_jobs = n_jobs
        else:
            raise ValueError(f"n_jobs must be a float between 0 and 1 or an integer (received n_jobs = {n_jobs})")
+        # Default values, potentially overridden inside _fit()
+        self.use_fallback_model = True
+        self.max_ts_length = 2500


Shall we adjust it based on the freq? For example, for minutely data, 2500 observations cover less than 2 days, which is not enough to capture weekly trends or seasonality.

+1 on at least making it customizable

@tonyhoo The models that we currently use (AutoETS, AutoARIMA, Theta, ETS, ARIMA, Naive, SeasonalNaive) cannot capture multiple seasonalities anyway; they only capture the seasonality at the seasonal_period that we provide. This is at most 24 * 60 = 1440 for minutely data, but usually much smaller (<= 24).

I think that the example you described would apply to models like MSTL that consider multiple seasonalities, and I agree that we would need to increase the max_ts_length for such models if we add them.

@gradientsky I've moved the parameter override code from the _fit method to __init__.
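The arithmetic in this thread can be checked with a short sketch. It assumes each series is a plain list of values keyed by an item name. `truncate_series`, `covered_span`, and `full_cycles` are hypothetical helpers written for illustration and are not part of AutoGluon.

```python
from datetime import timedelta
from typing import Dict, List

MAX_TS_LENGTH = 2500  # default value from the diff above


def truncate_series(
    series_by_item: Dict[str, List[float]], max_ts_length: int = MAX_TS_LENGTH
) -> Dict[str, List[float]]:
    """Keep only the last max_ts_length values of every series."""
    return {item: values[-max_ts_length:] for item, values in series_by_item.items()}


def covered_span(step: timedelta, max_ts_length: int = MAX_TS_LENGTH) -> timedelta:
    """Wall-clock time covered by max_ts_length observations spaced by step."""
    return step * max_ts_length


def full_cycles(seasonal_period: int, max_ts_length: int = MAX_TS_LENGTH) -> int:
    """Number of complete seasonal cycles that fit into the truncated series."""
    return max_ts_length // seasonal_period


# Minutely data: 2500 observations span 1 day, 17 hours, 40 minutes.
print(covered_span(timedelta(minutes=1)))
# A daily period (1440 minutes) fits once; a weekly period (10080 minutes) does not fit.
print(full_cycles(24 * 60), full_cycles(7 * 24 * 60))
```

Both points in the thread hold at once. For minutely data the cap leaves less than two days of history, so a weekly pattern is out of reach, while a single seasonal_period of at most 1440 still fits inside the 2500 retained observations.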

tonyhoo · 2023-05-31T16:40:49Z

timeseries/src/autogluon/timeseries/models/local/statsforecast.py

        Number of CPU cores used to fit the models in parallel.
        When set to a float between 0.0 and 1.0, that fraction of available CPU cores is used.
        When set to a positive integer, that many cores are used.
        When set to -1, all CPU cores are used.
+    max_ts_length : int, default = 2500


gradientsky · 2023-05-31T18:50:35Z

timeseries/src/autogluon/timeseries/models/local/abstract_local_model.py

@@ -71,6 +68,10 @@ def __init__(
            self.n_jobs = n_jobs
        else:
            raise ValueError(f"n_jobs must be a float between 0 and 1 or an integer (received n_jobs = {n_jobs})")
+        # Default values, potentially overridden inside _fit()
+        self.use_fallback_model = True
+        self.max_ts_length = 2500


+1 on at least making it customizable


github-actions · 2023-05-31T19:03:40Z

Job PR-3252-05a8937 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3252/05a8937/index.html

shchur · 2023-06-01T08:07:08Z

timeseries/tests/unittests/models/test_local.py

@@ -3,6 +3,7 @@
 import pandas as pd
 import pytest


@tonyhoo Regarding the tests for model performance: Currently, CI for time series does not include any regression tests. The reasoning was that the models could change quite frequently, and keeping track of the performance ranges for individual models would be tedious. However, I agree that we should probably look into adding these tests now that the model set is becoming more stable. Do you think it's fine if we add these tests after v0.8, or do you think this has higher priority and we should do it asap?

We can add these post 0.8 and make sure they are incorporated into the benchmark and dashboard project.
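As a rough illustration of what a per-model performance regression test could look like, the sketch below checks that a seasonal naive forecast stays under an error bound on a synthetic series. Everything in it is an assumption made for illustration: the helper names, the synthetic data, and the `MAX_ALLOWED_MAE` threshold. A real test would call the library's models and use bounds taken from benchmark runs.

```python
import math
from typing import List, Sequence

MAX_ALLOWED_MAE = 0.5  # hypothetical bound; real values would come from benchmarks


def seasonal_naive(history: Sequence[float], horizon: int, seasonal_period: int) -> List[float]:
    """Repeat the last observed season for the whole forecast horizon."""
    last_season = list(history[-seasonal_period:])
    return [last_season[i % seasonal_period] for i in range(horizon)]


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def test_seasonal_naive_error_within_expected_range() -> None:
    period, horizon = 12, 12
    # Synthetic, perfectly periodic series, so the expected error is close to zero.
    series = [10.0 + 5.0 * math.sin(2 * math.pi * t / period) for t in range(120)]
    train, test = series[:-horizon], series[-horizon:]
    error = mean_absolute_error(test, seasonal_naive(train, horizon, period))
    # The test fails if the model's error drifts above the recorded bound.
    assert error < MAX_ALLOWED_MAE
```

A test of this shape would fail if a model silently degraded, for example by always falling back to a simpler forecast on data where it previously performed well.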

github-actions · 2023-06-01T10:35:44Z

Job PR-3252-156ba1e is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3252/156ba1e/index.html

github-actions · 2023-06-01T10:40:16Z

Job PR-3252-bc47956 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3252/bc47956/index.html

github-actions · 2023-06-01T13:16:22Z

Job PR-3252-5c3cf7f is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3252/5c3cf7f/index.html

Bugfixes & improvements for local models

749f5c7

shchur added the "module: timeseries" label May 31, 2023

shchur requested a review from tonyhoo May 31, 2023 14:35

shchur mentioned this pull request May 31, 2023

[timeseries] Implement prediction caching and refactor prediction logic in AbstractTimeSeriesTrainer #3237

Merged

Fix tests

05a8937

tonyhoo reviewed May 31, 2023


gradientsky reviewed May 31, 2023


shchur added 2 commits June 1, 2023 07:58

Address PR comments

156ba1e

Use % formatter

bc47956

shchur commented Jun 1, 2023


Fix typo

5c3cf7f

tonyhoo approved these changes Jun 1, 2023


shchur merged commit 5f51edc into autogluon:master Jun 1, 2023
28 checks passed

shchur deleted the fix-local-models branch June 1, 2023 17:13
