Forecasting in EVA #969

americast · 2023-08-26T17:59:07Z

Implemented standalone forecasting in EVA (using the statsforecast package). You can run it via the following commands:

DROP TABLE IF EXISTS AirData;

CREATE TABLE AirData (
    unique_id TEXT(30),
    ds TEXT(30),
    y INTEGER);

LOAD CSV 'data/forecasting/air-passengers.csv' INTO AirData;

DROP UDF IF EXISTS Forecast;

CREATE UDF Forecast
FROM (SELECT unique_id, ds, y FROM AirData)
TYPE Forecasting
'predict' 'y';

SELECT Forecast(12) FROM AirData;

Here Forecast(12) signifies a horizon length of 12.
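
To illustrate what a horizon of 12 means, here is a minimal sketch in plain Python. It is not the statsforecast model used in this PR: it swaps in a simple seasonal-naive rule, and `seasonal_naive_forecast`, `season_length`, and the sample values are hypothetical names and data chosen only for illustration.

```python
def seasonal_naive_forecast(history, horizon, season_length=12):
    """Predict `horizon` future steps by repeating the last observed season.

    This is a stand-in for a real forecasting model; it only shows that the
    horizon controls how many future points are returned.
    """
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Cycle through the last season until `horizon` values are produced.
    return [last_season[i % season_length] for i in range(horizon)]


# Two years of made-up monthly values (illustrative only, not the CSV data).
history = list(range(100, 124))
forecast = seasonal_naive_forecast(history, horizon=12)
print(len(forecast))
```

With monthly data such as the air-passengers series, a horizon of 12 therefore corresponds to one year of predicted values.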

Thanks!

xzdandy · 2023-08-27T18:04:27Z

> Implemented standalone forecasting in EVA (using statsforecast package). You can run it via the following commands:
>
> DROP TABLE IF EXISTS AirData;
>
> CREATE TABLE AirData (
>     unique_id TEXT(30),
>     ds TEXT(30),
>     y INTEGER);
>
> LOAD CSV 'data/forecasting/air-passengers.csv' INTO AirData;
>
> DROP UDF IF EXISTS Forecast;
>
> CREATE UDF Forecast IMPL 'evadb/udfs/forecast.py';
>
> SELECT Forecast(unique_id, ds, y) FROM AirData;
>
> I plan to add more features to this. Tests and documentation are still pending.
>
> Thanks!

Hi @americast, the design looks great. Is the idea that training will be implicit here? That is, when the user runs SELECT Forecast(unique_id, ds, y) FROM AirData;, EvaDB will train and forecast together under the hood. The rationale is that a trained forecast model can only be applied to the data source it was trained on.

americast · 2023-08-27T23:00:14Z

Thanks @xzdandy for your review. As of now, yes the training is implicit. Since statsforecast trains really fast, that should be fine. However, if we were to incorporate DL-based forecasting into this, we might want to train explicitly in the background.

jyotigoyal09 · 2023-08-30T04:04:19Z

Hi @americast, I see you are using Forecast(unique_id, ds, y) to forecast a single time series. I was wondering whether you also support forecasting for panel data. If so, how would you make the final model selection when the groups in the panel do not share the same seasonality/trend pattern?

gaurav274 · 2023-08-30T20:53:00Z

Hi Jyoti, thanks for showing interest in EvaDB. We are in the early phases of adding forecasting. @americast @xzdandy Thoughts?

americast · 2023-08-31T15:13:49Z

Hi @jyotigoyal09. Thanks for your interest! Support for forecasting panel data would be very useful. In the case of multivariate forecasting, we can find the lowest common seasonality and use it for all series. However, we could also turn our attention to deep learning models that do not require a seasonality to be specified upfront, such as transformer-based models.

It would be great if we could discuss this in detail in an issue.

americast · 2023-08-31T15:16:49Z

@xzdandy I have pushed some commits, and the training now follows the Ludwig style (#935). Training occurs when the UDF is created, and the trained model is stored under a unique model file name generated from the data. Whenever the UDF is called, the user only needs to specify the horizon. I have updated the commands in #969 (comment) accordingly.
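
The train-on-create, forecast-on-call flow described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the thread does not say how the file name is derived, so hashing the rows with SHA-256 is an assumption, the "model" is a placeholder that just stores the mean of `y`, and `model_path_for`, `train`, `create_forecast_function`, and `forecast` are hypothetical helper names.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def model_path_for(rows, model_dir):
    """Derive a file name from the training data (assumed scheme: SHA-256)."""
    digest = hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()
    return Path(model_dir) / f"forecast_{digest}.pkl"


def train(rows):
    """Placeholder for real model fitting: remember the mean of y."""
    ys = [y for _unique_id, _ds, y in rows]
    return {"mean": sum(ys) / len(ys)}


def create_forecast_function(rows, model_dir):
    """Train once at creation time; reuse the stored model for the same data."""
    path = model_path_for(rows, model_dir)
    if not path.exists():
        with open(path, "wb") as f:
            pickle.dump(train(rows), f)
    return path


def forecast(path, horizon):
    """At call time only the horizon is needed; the model is loaded from disk."""
    with open(path, "rb") as f:
        model = pickle.load(f)
    return [model["mean"]] * horizon


# Made-up rows in the (unique_id, ds, y) layout used by AirData.
model_dir = tempfile.mkdtemp()
rows = [("AirPassengers", "1949-01", 1), ("AirPassengers", "1949-02", 2),
        ("AirPassengers", "1949-03", 3)]
path = create_forecast_function(rows, model_dir)
print(forecast(path, horizon=12))
```

Because the file name depends only on the data, creating the function again over the same rows resolves to the same file and skips retraining in this sketch.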

xzdandy · 2023-09-01T05:19:58Z

Thanks for the contribution! The implementation looks good to me at a high level.

Remaining items:

- Add a long integration test case to verify that it works end-to-end.
- Clean up the code (remove redundant prints and imports).

americast · 2023-09-02T09:23:21Z

@xzdandy Thanks. I have added the test and the docs.

xzdandy

Thanks for adding the tests and documentation. They look great.

Minor improvements:

- Fix the linter: bash script/test/test.sh -m LINTER
- Add the statsforecast dependency in setup.py.
- Install the statsforecast dependency in .circleci/config.yml, so that the test runs as part of the long integration tests.

americast · 2023-09-02T20:24:50Z

Thanks. I have updated them. Please let me know if everything looks good.

xzdandy

Great work! Verified the long integration test works on python3.10.

americast · 2023-09-05T16:18:08Z

Awesome, thanks!

Implemented standalone forecasting in EVA (using [statsforecast](https://nixtla.github.io/statsforecast) package). You can run it via the following commands:

```sql
DROP TABLE IF EXISTS AirData;

CREATE TABLE AirData (
    unique_id TEXT(30),
    ds TEXT(30),
    y INTEGER);

LOAD CSV 'data/forecasting/air-passengers.csv' INTO AirData;

DROP UDF IF EXISTS Forecast;

CREATE UDF Forecast
FROM (SELECT unique_id, ds, y FROM AirData)
TYPE Forecasting
'predict' 'y';

SELECT Forecast(12) FROM AirData;
```

Here `Forecast(12)` signifies a horizon length of `12`.

Thanks!

---------

Co-authored-by: xzdandy <xzdandy@gmail.com>

americast added 2 commits August 26, 2023 13:48

Add simple forecasting

1c361d1

Add data for forecasting

57a125d

xzdandy added the High Priority ⚡️ label Aug 27, 2023

xzdandy added this to the v0.3.3 milestone Aug 27, 2023

gaurav274 modified the milestones: v0.3.3, v0.3.4 Aug 29, 2023

americast added 2 commits August 30, 2023 13:20

Add Forecasting as a separate TYPE

444b95d

Merge branch 'staging' into forecasting_dev

2ff33a0

americast added 3 commits August 31, 2023 03:50

WIP: ludwig style

2419096

Switched to Ludwig-style training

fe5bf4d

Specify horizon during UDF call

b3d4506

americast added 2 commits September 1, 2023 11:43

Added test

31634e4

more robust, removed redundant imports

4c2c151

americast marked this pull request as ready for review September 2, 2023 09:19

Add docs

04c859b

americast force-pushed the forecasting_dev branch from 7dae25b to 04c859b Compare September 2, 2023 09:22

xzdandy requested changes Sep 2, 2023

americast added 2 commits September 2, 2023 16:18

Added linting

e8b8b74

setup and ci

534afb3

xzdandy added 2 commits September 5, 2023 03:21

Fix all doc issues

66d942d

Merge branch 'staging' into forecasting_dev

12eb41b

xzdandy added 3 commits September 5, 2023 03:33

Fix Linter

6c4c7fa

Merge branch 'forecasting_dev' of github.com:americast/eva into forecasting_dev

c3b5acb

Fix setup.py and .circleci/config.yml

c98aebc

xzdandy approved these changes Sep 5, 2023

xzdandy added 2 commits September 5, 2023 03:46

Merge branch 'fix_staging_doc' into forecasting_dev

ddfb5ef

Merge branch 'staging' into forecasting_dev

07bd627

xzdandy merged commit 0f88555 into georgia-tech-db:staging Sep 5, 2023
0 of 4 checks passed

americast mentioned this pull request Sep 8, 2023

Improve the univariate statsforecast function in EvaDB #1081

Closed

6 tasks
