Refactor/fit args #161

grll · 2020-07-21T12:59:10Z

Fixes #DARTS-164.

Summary

remove the target_indices parameter from all fit methods on multivariate models
remove the component_index parameter from all fit methods on univariate models
introduce covariate_series and target_series parameters to replace the previous syntax
adapt the existing code base to pass the tests

Note: the backtesting tests are not passing, but they will soon be refactored, so I didn't invest time in them.

New vs Old API:

# old syntax:
multivariate_model.fit(multivariate_series, target_indices=[0, 1])

# new syntax:
multivariate_model.fit(multivariate_series, multivariate_series[["0", "1"]])

# old syntax:
univariate_model.fit(multivariate_series, component_index=2)

# new syntax:
univariate_model.fit(multivariate_series["2"])

Co-authored-by: Julien Herzen <julien.herzen@unit8.co>

hrzn

This looks good, I think it's a good move. The only drawback I can see is that it makes the specification of validation time series for fit() of Torch models a bit more complex. @pennfranc what's your opinion?

hrzn · 2020-07-22T19:24:18Z

darts/models/forecasting_model.py

+        covariate_series
+            The training time series on which to fit the model (can be multivariate or univariate).
+        target_series
+            The target values used ad dependent variables when training the model


The target can also be multivariate or univariate. I would unify the comments here.

hrzn · 2020-07-22T19:27:10Z

darts/models/forecasting_model.py

-
-        self.target_indices = target_indices
-        super().fit(series)
+        raise_if_not(len(covariate_series) == len(target_series), "covariate_series and target_series musth have same "


You should probably compare the time indexes here instead of the lengths.
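A minimal sketch of why the suggestion above matters, using plain Python lists of timestamps rather than the library's TimeSeries type. The helper name `same_time_index` is hypothetical and is not part of this PR; it only illustrates that two series can have equal lengths while covering different dates.

```python
from datetime import datetime, timedelta
from typing import Sequence


def same_time_index(a: Sequence[datetime], b: Sequence[datetime]) -> bool:
    """Return True only if both indexes hold the same timestamps in the same order."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


start = datetime(2020, 1, 1)
covariate_index = [start + timedelta(days=i) for i in range(5)]
# Same length, but shifted by one day: a length-only check would accept this pair.
shifted_index = [start + timedelta(days=i + 1) for i in range(5)]

lengths_match = len(covariate_index) == len(shifted_index)
indexes_match = same_time_index(covariate_index, shifted_index)
```

Here the length comparison passes while the index comparison fails, which is the misalignment a length check cannot detect.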

hrzn · 2020-07-22T19:30:38Z

darts/models/torch_forecasting_model.py

-            Optionally, a validation time series, which will be used to compute the validation loss
-            throughout training and keep track of the best performing models.
+            Optionally, 2 validation time series (covariate and target), which will be used to compute the validation
+            loss throughout training and keep track of the best performing models.
        verbose
            Optionally, whether to print progress.
        target_indices


hrzn · 2020-07-22T19:41:08Z

darts/models/torch_forecasting_model.py

-            target_indices: Optional[List[int]] = None) -> None:
+            covariate_series: TimeSeries,
+            target_series: Optional[TimeSeries] = None,
+            val_series: Optional[Tuple[TimeSeries]] = None,


How about splitting in two:

val_covariate_series: Optional[TimeSeries]
val_target_series: Optional[TimeSeries]

That would be more consistent with the first two arguments.

I did it like this at first, but having a Tuple makes it easier to check that the value provided is either None or of length 2, as opposed to having two separate args.

I would say that it's more important to have a good signature here, even if the check is slightly more complex behind the scenes.
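For illustration, the check with two separate arguments could look roughly like the sketch below. The function name `check_validation_args` is hypothetical, and the arguments are typed as plain objects so the example runs without the library; it is not the code that was merged.

```python
from typing import Optional


def check_validation_args(val_covariate_series: Optional[object],
                          val_target_series: Optional[object]) -> bool:
    """Return True if validation data was supplied, False if none was.

    Raises ValueError when only one of the two series is given.
    """
    # Exactly one argument being None means the pair is incomplete.
    if (val_covariate_series is None) != (val_target_series is None):
        raise ValueError(
            "val_covariate_series and val_target_series must be provided together"
        )
    return val_covariate_series is not None
```

The check stays a single comparison, so the cleaner signature costs little extra complexity.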

pennfranc · 2020-07-24T14:29:44Z

darts/models/forecasting_model.py

-        series
-            the training time series on which to fit the model
+        Implements behavior that should happen when calling the `fit` method of every forcasting model regardless of
+        wether they are univariate or multivariate.


whether (sorry)

No problem :) You can directly propose edits for such typos.

pennfranc · 2020-07-24T14:31:58Z

darts/models/forecasting_model.py

        """
-        raise_if_not(len(series) >= self.min_train_series_length,


As far as I can see this is currently not being tested for the univariate model.

Good catch, I moved this check into the parent class.
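A rough sketch of what moving the check into the parent class can look like. `BaseModel` and `MeanModel` are hypothetical stand-ins, not the library's classes; only the attribute name `min_train_series_length` comes from the diff above.

```python
class BaseModel:
    """Hypothetical parent class holding the shared training-length check."""

    min_train_series_length = 3

    def fit(self, series):
        # Shared check: every subclass that calls super().fit() gets it.
        if len(series) < self.min_train_series_length:
            raise ValueError(
                f"Train series needs at least {self.min_train_series_length} points"
            )
        self.training_series = series


class MeanModel(BaseModel):
    """Hypothetical univariate model that only stores the training mean."""

    def fit(self, series):
        super().fit(series)  # the length check runs before any fitting
        self.mean = sum(series) / len(series)
```

With this layout, a single test against the parent's `fit` covers the check for univariate and multivariate subclasses alike.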

grll · 2020-07-24T14:32:53Z

darts/models/forecasting_model.py

-        series
-            the training time series on which to fit the model
+        Implements behavior that should happen when calling the `fit` method of every forcasting model regardless of
+        wether they are univariate or multivariate.


Suggested change

wether they are univariate or multivariate.

whether they are univariate or multivariate.

I didn't know you could do this!

pennfranc · 2020-07-24T14:38:13Z

darts/models/torch_forecasting_model.py

            The time series to be included in the dataset.
+        target_series
+            The time series used has target.


pennfranc · 2020-07-24T14:48:15Z

Overall I really like this change, just a couple of things I'm not quite sure about:

I kind of feel like the names of target_series and covariate_series should be switched. I think covariate_series should be the optional one since we always want to predict a time series, but optionally we want to include other time series as features that won't be predicted, right?
Kind of related to the above, personally I don't see why we don't simply include the target series in the training data by default. Because if it's not included by default, the user will have to manually make sure it is included in both arguments (target_series and covariate_series) at the same time (if they have other time series components that they just want to include as features and not predict).
I hope this makes sense..

Edit: Ok looking at your example above I think I get why you did it this way, this provides the user with all possible combinations and with an example it's pretty intuitive! I still think covariate_series should be the optional one though and set to target_series if not present.
The thing is though, and maybe that's just me, the name covariate to me somehow suggests that it is additional, meaning on top of target_series. I guess that's why I would have included target_series in the training data by default.

@hrzn I agree that it requires a bit more effort to list all the fit arguments now, but I think this is a price worth paying.
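The default proposed in this comment (covariate_series optional, falling back to target_series) could be sketched as follows. The helper name `resolve_fit_series` is hypothetical and the series are plain lists; this only illustrates the suggestion and is not taken from the PR's code.

```python
from typing import Optional, Sequence, Tuple


def resolve_fit_series(
    target_series: Sequence[float],
    covariate_series: Optional[Sequence[float]] = None,
) -> Tuple[Sequence[float], Sequence[float]]:
    """Return (covariate_series, target_series), defaulting covariates to the target."""
    if covariate_series is None:
        # No extra features given: train on the target series itself.
        covariate_series = target_series
    return covariate_series, target_series


target = [1.0, 2.0, 3.0]
covariates, resolved_target = resolve_fit_series(target)
```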

grll · 2020-07-31T13:58:50Z

I renamed covariate_timeseries to training_timeseries; maybe it helps to understand what does what, wdyt?

* add .DS_Store to .gitignore
* add proposal.md
* add draft version of backtest forcasting
* add backtest to model (simple refactoring)
* extract backtest sanity checks in a method
* extract building fit_kwargs and predict_kwargs in a method
* minor fix import comment and assertion
* refactor all backtest factoring tests
* update progress on proposal.md
* add coverage.sh
* fix permission on coverage.sh
* improve coverage sh script
* add coverage.xml to .igtignore
* improve doc on coverage.sh
* fix doc
* fix doc for real
* univariate fcast model only support univariate ts
* MultivariateFcasModel fits on the whole training ts
* refactor torch forcasting model to use covariate_series
* fix unused imports
* allow to specify only covaraite_series
* enforce covariate_series and target_series inputs for multivariate model
* adapt torch datasets to use covariate / target series
* adapt validation series provided as a Tuple
* fix typo
* adapt create_dataset on tcn model
* remove component index from fit function
* adapt tests to new syntax
* add proposal.md
* add draft version of backtest forcasting
* add backtest to model (simple refactoring)
* extract backtest sanity checks in a method
* extract building fit_kwargs and predict_kwargs in a method
* minor fix import comment and assertion
* refactor all backtest factoring tests
* update progress on proposal.md
* fix doc
* fix doc for real
* fix typos and remove diagram in backtest doc
* WIP add residuals
* add decorator for sanity checks
* clean forecasting_model
* add start multitype parameter support
* fix check on undefined param in sanity checks
* add comments
* fix(backtesting, tests): fixed bugs so that all forecasting backtest tests pass, corrected some typos
* feature(backtesting): changed handling of residuals (re-introduced own function instead of being by-product of backtest)
* fix(test_forecasting_model): deleted old file that was renamed due to type
* feat(backtesting): moved gridsearch to ForecastingModel, removed functions from backtesting module that have been moved to ForecastingModel class, adapted tests
* feat(backtesting): adapted docstring of gridsearch function
* fix(Theta): adapted FourTheta model to use new gridsearch function
* fix(forecasting_model, torch_forecasting_model): fixed docstrings
* feat(backtesting): moved backtest_regression to regression model class
* fix(forecasting_model): renamed covariate_series to training_series
* fix(forecasting_model): fixed residuals function
* fix(style): linter
* feat(backtesting): renamed backtest_gridsearch to gridsearch
* fix(tests): fixed residuals test case
* feat(backtesting): moved residuals plotting function to statistics module
* feat(backtesting): removed backtesting module
* fix(style): linter
* fix(style): linter
* fix(torch_forecasting_model): fixed check in predict function
* fix(forecasting_model): fixed backtest sanity check
* fix(torch_forecasting_model): removed unnecessary (and bug-causing) sanity check method
* feat(examples): refactored notebooks to support new function signatures
* fix(style): linter
* updated PROPOSAL.md
* feat(forecasting_model): improved documentation
* fix(torch_forecasting_model): removed redundant function
* style(torch_forecasting_model): linter
* fix(torch_forecasting_model): fixed docstring typo
* fix(torch_forecasting_model, tests): clean up old comments
* fix(statistics): improved docstrings
* fix(forecasting_model, regression_model): improved variable names, fixed documentation
* fix(tests): fixed old variable name in backtesting tests
* removed PROPOSAL.md
* feat(regression_model): added stride functionality to backtest method
* fix(forecasting_model, regression_model): improved documentation
* fix(forecasting_model): improved documentation
* fix(forecasting_model): improved start parameter documentation
* fix(forecasting_model, regression_model): cleaned up code, improved docstrings, added missing checks
* feat(forecasting_model): improved backtest docstring
* fix(forecasting_model, tests): improved backtest sanity checks, added corresponding test cases
* feat(backtesting): replaced 'num_predictions' parameter by 'start' parameter in 'ForecastingModel.gridsearch'
* fix(examples): updated notebooks

Co-authored-by: Guillaume Raille <guillaume.raille@unit8.co>
Co-authored-by: pennfranc <flaessig@student.ethz.ch>

grll and others added 14 commits July 15, 2020 18:03

add support for columns to the TimeSeries object

425a479

add colum support indexing to timeseries

beb6432

fix wrong docstring

20064aa

refactor indexing, fix docstring, columns as last arg

8c9f224

clean indexing method

a8021ed

refactor indexing only based on loc and iloc

4bde24b

Update darts/timeseries.py

f9d89a8

Co-authored-by: Julien Herzen <julien.herzen@unit8.co>

use underlying columns by default

9c0c9e6

fix column added on intern _df and use self.freq_str

cd6df5d

fix parameter position in from_times_and_values

6bef192

fix the tests to use str columns

fb8b78d

fix docstring timeseries

9ad5c46

remove None check on df that should exists

1cde216

Merge branch 'develop' into features/indexing

66105c7

grll requested review from hrzn and TheMP as code owners July 21, 2020 12:59

grll changed the base branch from develop to features/indexing July 22, 2020 07:04

hrzn reviewed Jul 22, 2020

hrzn requested a review from pennfranc July 22, 2020 19:50

pennfranc reviewed Jul 24, 2020

grll commented Jul 24, 2020

pennfranc reviewed Jul 24, 2020

TheMP and others added 6 commits July 28, 2020 08:17

Merge branch 'develop' into features/indexing

8ddc228

add comment for clarifying that _df is a copy

cade004

add separate function to process columns

38635ab

Merge branch 'features/indexing' of github.com:unit8co/darts into fea…

f96f169

…tures/int-indexing

Merge branch 'develop' into features/indexing

28173d1

adapt map with str col indexing

b515be0

grll added 20 commits July 30, 2020 15:57

univariate fcast model only support univariate ts

0556c97

MultivariateFcasModel fits on the whole training ts

6b61b3f

refactor torch forcasting model to use covariate_series

3fb1bb2

fix unused imports

8166e27

allow to specify only covaraite_series

6b73a2a

enforce covariate_series and target_series inputs for multivariate model

4d0304e

adapt torch datasets to use covariate / target series

dad9607

adapt validation series provided as a Tuple

0de6892

fix typo

71d6c46

adapt create_dataset on tcn model

1ebb94b

remove component index from fit function

bca4ac0

adapt tests to new syntax

530af67

refacotr metaclasses

b671085

abstract a new method make fitable series

c04fd3b

adapt torchforcastingmodel to parent class changes

1d914e1

keep covariate/target seires for Multivariate models only

b4111e2

fix typos with new implementation

2ae9a1c

move series length check in forcasting model

7544010

rename covariate into training series

8a9aa8b

adapt old backtesting to support the new fit args syntax

d909155

grll force-pushed the refactor/fit-args branch from 189c097 to d909155 Compare July 30, 2020 20:30

grll changed the base branch from features/indexing to develop August 4, 2020 17:27

Merge branch 'develop' into refactor/fit-args

f598f80

hrzn approved these changes Sep 9, 2020

guillaumeraille and others added 2 commits September 17, 2020 11:43

Merge branch 'develop' into refactor/fit-args

29abe9a

pennfranc merged commit 2977f4f into develop Sep 17, 2020

LeoTafti deleted the refactor/fit-args branch October 15, 2020 08:40
