[BUG] [timeseries] Some forecasting models fail during fit if an S3 path is used #4212

Open · 3 tasks done
shchur opened this issue May 21, 2024 · 2 comments
Assignee: shchur
Labels: bug (Something isn't working), module: timeseries (related to the timeseries module)
Milestone: 1.2 Release
shchur (Collaborator) commented May 21, 2024

Bug Report Checklist

  • I provided code that demonstrates a minimal reproducible example.
  • I confirmed the bug exists on the latest mainline of AutoGluon via a source install.
  • I confirmed the bug exists on the latest stable version of AutoGluon.

Describe the bug
Setting TimeSeriesPredictor(path="s3://my-bucket/my-predictor") causes some forecasting models to fail during training.

Related issue: awslabs/gluonts#3171

Expected behavior
All models train successfully, same as if a local path was used.

To Reproduce

from autogluon.timeseries import TimeSeriesPredictor

pred = TimeSeriesPredictor(path="s3://my-bucket/my-predictor").fit(
    "https://autogluon.s3.amazonaws.com/datasets/timeseries/m4_hourly_tiny/train.csv"
)
shchur added the bug and module: timeseries labels and removed the bug: unconfirmed and Needs Triage labels on May 21, 2024
shchur added this to the 1.2 Release milestone on May 21, 2024
shchur self-assigned this on May 21, 2024
Innixma (Contributor) commented May 21, 2024

Open question: Should we support S3 paths?

In early versions of AutoGluon, I supported S3 paths for TabularPredictor. I eventually found that once I mixed in Ray and started using optimized model-saving techniques, it became hard to ensure the code worked properly in both scenarios.

Because of this difficulty, I eventually stopped supporting S3 paths for predictor artifacts.

One workaround could be to use a local directory for all training / prediction, and then upload the local files to the S3 location after fitting is complete. Similarly, download the artifact first when loading via .load. I haven't implemented this approach, but it could work (see the sketch below).
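
A rough sketch of that workaround. The helper names fit_and_upload / download_and_load are hypothetical, and it assumes boto3 is installed with AWS credentials configured:

import os
import tempfile

import boto3  # assumption: boto3 is available and AWS credentials are configured
from autogluon.timeseries import TimeSeriesPredictor

def fit_and_upload(train_data, s3_bucket, s3_prefix):
    # Hypothetical helper: train into a local scratch directory, then mirror it to S3.
    local_dir = tempfile.mkdtemp()
    predictor = TimeSeriesPredictor(path=local_dir).fit(train_data)
    s3 = boto3.client("s3")
    for root, _, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            # Use forward slashes in S3 keys regardless of the local OS separator.
            rel = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            s3.upload_file(local_path, s3_bucket, f"{s3_prefix}/{rel}")
    return predictor

def download_and_load(s3_bucket, s3_prefix):
    # Hypothetical helper: mirror the saved artifact back to a local directory,
    # then load the predictor from disk.
    local_dir = tempfile.mkdtemp()
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=s3_bucket, Prefix=s3_prefix):
        for obj in page.get("Contents", []):
            rel = os.path.relpath(obj["Key"], s3_prefix)
            local_path = os.path.join(local_dir, rel)
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            s3.download_file(s3_bucket, obj["Key"], local_path)
    return TimeSeriesPredictor.load(local_dir)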

If you have found a way to make s3 paths work with relative ease, I'd be interested in knowing how you did it.

shchur (Collaborator, Author) commented May 22, 2024

@Innixma I assumed that TabularPredictor already supported S3 paths since it worked fine with the medium_quality preset, but now I understand that this might not be the case for high_quality / best_quality, where Ray is used.

I will check how much work it would take to make S3 paths work in all scenarios. At the very least, we should raise an informative error explaining that only local paths are supported (a sketch of such a guard follows). Currently, with the TimeSeriesPredictor we just observe some model failures in the middle of the training process, which is a bad customer experience.
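
For illustration, such a guard could be as simple as the following. validate_predictor_path is a hypothetical helper, not AutoGluon's actual API:

def validate_predictor_path(path: str) -> str:
    # Hypothetical guard: fail fast at predictor construction instead of
    # letting individual models fail midway through training.
    if str(path).startswith("s3://"):
        raise ValueError(
            f"Received path={path!r}, but only local paths are currently supported. "
            "Train to a local directory and upload the saved artifacts to S3 afterwards."
        )
    return path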
