& ris-bali [ENH] statsmodels DynamicFactor interface #2859

lbventura · 2022-06-24T10:47:45Z

Reference Issues/PRs

Addresses #2349. Multiple tests fail due to the behavior of the _predict method, but further exploration in testfile.ipynb suggests that the method's behavior is actually correct. Could it be that the tests are wrong?

What does this implement/fix? Explain your changes.

Deeper look into #2349 failed tests using check_estimator.

Does your contribution introduce a new dependency? If yes, which one?

What should a reviewer concentrate their feedback on?

Any other comments?

(At least) Two outstanding points:

Implement the predict_proba method;
Create an example notebook - one cool example would be to forecast economic variables;

PR checklist

For all contributions

I've added myself to the list of contributors.
Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
I've added unit tests and made sure they pass locally.
The PR title starts with either [ENH], [MNT], [DOC], or [BUG] indicating whether the PR topic is related to enhancement, maintenance, documentation, or bug.

For new estimators

I've added the estimator to the online documentation.
I've updated the existing example notebooks or provided a new one to showcase how my estimator works.

fkiraly · 2022-06-24T12:19:28Z

I think I understand the problem now
the test is fine, it intentionally checks for indices that do not start at zero
the problem is with the (inner, statsmodels) dynamic factor model, it discards indices and produces forecasts starting at len(y) rather than at y.index[-1]
the solution would be to manually correct for that and simply overwrite the indices of what comes out of the model
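A minimal sketch of that correction, using only pandas and assuming an integer index with unit steps. The `raw_forecast` frame is a hand-built stand-in for the inner model's output, and `overwrite_forecast_index` is a hypothetical helper, not code from this PR:

```python
import numpy as np
import pandas as pd

# Training data whose integer index deliberately does not start at zero.
y = pd.DataFrame(
    {"a": np.arange(5.0), "b": np.arange(5.0) * 2},
    index=pd.Index([3, 4, 5, 6, 7]),
)

# Stand-in for a 3-step forecast from the inner model: its rows are
# labelled from len(y), i.e. 5, 6, 7, instead of continuing at 8, 9, 10.
raw_forecast = pd.DataFrame(
    {"a": [5.0, 6.0, 7.0], "b": [10.0, 12.0, 14.0]},
    index=pd.RangeIndex(start=len(y), stop=len(y) + 3),
)


def overwrite_forecast_index(forecast, y_index):
    """Relabel forecast rows so they continue from the last training index.

    Hypothetical helper; assumes an integer index with unit steps.
    """
    steps = np.arange(1, len(forecast) + 1)
    corrected = forecast.copy()
    corrected.index = pd.Index(y_index[-1] + steps)
    return corrected


fixed = overwrite_forecast_index(raw_forecast, y.index)
```

The values are untouched; only the row labels change. A real implementation would presumably take the target labels from the forecasting horizon rather than assume unit steps.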

lbventura · 2022-07-10T16:09:31Z

Weirdly, pre-commit is failing in "Install and test / code-quality (pull_request)" due to flake8, but it passed on my local machine? See the latter below

fkiraly · 2022-07-10T16:47:36Z

You probably don't have all the linting checks installed locally then?
You can always go on "details" here to see what's wrong:

(I know what's wrong, and it is easy to fix - but kindly have a look yourself 😄 )

sktime/forecasting/dynamic_factor.py

fkiraly

Looks good!

Blocking comments:

we need to remove the notebook testfile.ipynb.
the estimator should be added to the API reference in an appropriate section.

I also left a number of small, non-blocking comments, above.

sktime/forecasting/dynamic_factor.py

lbventura · 2022-07-11T16:37:46Z

Looks good!

Blocking comments:
* we need to remove the notebook `testfile.ipynb`.
* the estimator should be added to the API reference in an appropriate section.

I also left a number of small, non-blocking comments, above.

On the second point, what exactly should be done? One example would be helpful :)

fkiraly · 2022-07-11T16:44:22Z

On the second point, what exactly should be done? One example would be helpful :)

Go to docs.source.api_reference.forecasting and add an entry, following the pattern there.
If not sure, jump quickly on discord.

fkiraly · 2022-07-11T16:48:30Z

the right section, is that perhaps "Structural time series models"?

lbventura · 2022-07-11T17:23:37Z

the right section, is that perhaps "Structural time series models"?

Only saw this now but yes, it is where I placed it.

fkiraly · 2022-07-11T17:32:18Z

Only saw this now but yes, it is where I placed it.

Well, you managed to do it incorrectly, as the failed doctest tells us. Let me check...

fkiraly · 2022-07-11T17:33:16Z

... odd, looks correct.

fkiraly · 2022-07-11T17:34:54Z

ah, looks like a typo in the "references" section of the new estimator. See "details" next to the doc build, maybe you can find out what exactly the typo is.

lbventura · 2022-07-11T17:34:55Z

... odd, looks correct.

It is complaining that the output is too long? Could it be due to the fact that I changed full_output to True?

fkiraly · 2022-07-11T21:54:56Z

hm, estimator tests are failing now - any idea why? Perhaps you would like to change it back to the last state where they were not?

lbventura · 2022-07-12T16:43:50Z

hm, estimator tests are failing now - any idea why? Perhaps you would like to change it back to the last state where they were not?

The only reasons I can think of are:

Having changed the scitype from multivariate to both. I reverted back to multivariate;
Having changed the outputs of the model (the full_output point discussed above).

Let us see if the new commit addresses the problem.

fkiraly · 2022-07-12T17:05:22Z

The only reasons I can think of are:

Having changed the scitype from multivariate to both. I reverted back to multivariate;

But, should it not be both? This estimator can deal with univariate data, can it not?

Having changed the outputs of the model (the full_output point discussed above).

Why did you do this?

The docs are still not building, btw, kindly check for typos.

lbventura · 2022-07-12T17:17:52Z

The only reasons I can think of are:

Having changed the scitype from multivariate to both. I reverted back to multivariate;

But, should it not be both? This estimator can deal with univariate data, can it not?

This was just a quick edit to see whether the problem comes from here. EDIT: it does, once one changes the scitype to "both", there are several tests in check_estimator which fail. See a few examples below:

I think it should also work for univariate, but I will check. EDIT: it also does not work in statsmodels. y is just a 1D series generated with _make_series.

Having changed the outputs of the model (the full_output point discussed above).

Why did you do this?

The docs were building fine in a previous commit, so I reverted to see whether this was the change that broke them, but apparently not.

The docs are still not building, btw, kindly check for typos.

Will do

fkiraly · 2022-07-12T20:44:33Z

I think it should also work for univariate, but I will check. EDIT: it also does not work in statsmodels. y is just a 1D series generated with _make_series.

Ah, ok - then the value "multivariate" that you originally had was correct.
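For readers unfamiliar with the tag under discussion, here is a rough sketch of how a "scitype:y" value could gate the input. Only the tag key and its values ("univariate", "multivariate", "both") come from sktime; the class and its `check_y` method are hypothetical stand-ins, not the library's actual validation code:

```python
import pandas as pd


class MultivariateOnlyForecaster:
    """Hypothetical stand-in for an estimator that needs 2+ variables."""

    # Same tag key as discussed above; "both" would accept any input.
    _tags = {"scitype:y": "multivariate"}

    def check_y(self, y):
        """Reject inputs whose number of variables contradicts the tag."""
        n_vars = 1 if isinstance(y, pd.Series) else y.shape[1]
        scitype = self._tags["scitype:y"]
        if scitype == "multivariate" and n_vars < 2:
            raise ValueError("y must have at least two variables")
        if scitype == "univariate" and n_vars != 1:
            raise ValueError("y must have exactly one variable")
        return y


# A two-column frame passes; a 1D series would raise ValueError.
y_ok = MultivariateOnlyForecaster().check_y(
    pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
)
```

With the tag set to "both", the test suite also feeds univariate series to the estimator, which is consistent with the failures reported above.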

lbventura · 2022-07-13T18:21:36Z

Tried to replicate arima.py forecaster docstring, hopefully this solves the readthedocs problem. Unfortunately the build logs are not very clear.

fkiraly · 2022-07-13T21:46:17Z

hopefully this solves the readthedocs problem. Unfortunately the build logs are not very clear.

Yes, looks like it! All green!

fkiraly

Nice, thanks!

lbventura · 2022-07-14T15:53:06Z

Nice, thanks!

Great, on to the next topic (which should be implementing predict_proba if I'm not lazy)!

Implements `predict_interval` for `DynamicFactor`, which was missing in the original PR [#2859](#2859). Interfaces the `get_forecast` and `conf_int` methods predefined in the `statsmodels` package.
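A rough sketch of the reshaping step such an interface needs. It assumes that `conf_int` returns one "lower <name>" and one "upper <name>" column per variable, and that the target layout has three column levels (variable, coverage, lower/upper). The input frame is hand-built rather than produced by `statsmodels`, and `to_interval_frame` is a hypothetical helper:

```python
import pandas as pd

# Hand-built stand-in for the frame returned by conf_int(alpha=0.1):
# one "lower <name>" and one "upper <name>" column per forecast variable.
conf = pd.DataFrame(
    {
        "lower a": [0.5, 0.4],
        "lower b": [1.5, 1.4],
        "upper a": [1.5, 1.6],
        "upper b": [2.5, 2.6],
    },
    index=[8, 9],
)


def to_interval_frame(conf, var_names, coverage):
    """Rearrange conf_int-style columns into (variable, coverage, bound)."""
    data = {
        (name, coverage, bound): conf[f"{bound} {name}"]
        for name in var_names
        for bound in ("lower", "upper")
    }
    # Tuple keys become a three-level MultiIndex on the columns.
    return pd.DataFrame(data, index=conf.index)


coverage = 0.9  # corresponds to alpha = 1 - coverage = 0.1
intervals = to_interval_frame(conf, ["a", "b"], coverage)
```

Each requested coverage would need its own `conf_int` call with `alpha = 1 - coverage`, with the resulting frames concatenated along the columns.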

Added ris-bali dynamic_factor. Added prints to forecasting testing an…

8fa0795

…d wrote small ipynb test file using check_estimator

lbventura requested review from fkiraly and aiwalter as code owners June 24, 2022 10:47

lbventura and others added 2 commits July 10, 2022 18:04

Passed all local check_estimator checks

792960e

Merge branch 'main' into dynamic-factor-tests

d7c8ec7

lbventura added 2 commits July 10, 2022 19:03

Removed pesky prints in forecasting.py

d13c386

Merge branch 'dynamic-factor-tests' of github.com:lbventura/sktime in…

50d3e44

…to dynamic-factor-tests

fkiraly changed the title ~~Added ris-bali dynamic_factor. Added prints to forecasting testing an…~~ [ENH] statsmodels DynamicFactorModel interface Jul 10, 2022

fkiraly changed the title ~~[ENH] statsmodels DynamicFactorModel interface~~ [ENH] statsmodels DynamicFactor interface Jul 10, 2022