Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

& ris-bali [ENH] statsmodels DynamicFactor interface #2859

Merged
merged 8 commits into from
Jul 14, 2022

Conversation

lbventura
Copy link
Contributor

@lbventura lbventura commented Jun 24, 2022

Reference Issues/PRs

Addresses #2349. There are multiple failed tests due to the behavior of the _predict method, but further exploration in testfile.ipynb suggests that the method's behavior is actually correct. Could be that tests are wrong?

What does this implement/fix? Explain your changes.

Deeper look into #2349 failed tests using check_estimator.

Does your contribution introduce a new dependency? If yes, which one?

What should a reviewer concentrate their feedback on?

Any other comments?

(At least) Two outstanding points:

  1. Implement the predict_proba method;
  2. Create an example notebook - one cool example would be to forecast economic variables;

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG] indicating whether the PR topic is related to enhancement, maintenance, documentation, or bug.
For new estimators
  • I've added the estimator to the online documentation.
  • I've updated the existing example notebooks or provided a new one to showcase how my estimator works.

…d wrote small ipynb test file using check_estimator
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 24, 2022

I think I understand the problem now
the test is fine, it intentionally checks for indices that do not start at zero
the problem is with the (inner, statsmodels) dynamic factor model, it discards indices and produces forecasts starting at len(y) rather than at y.index[-1]
the solution would be to manually correct for that and simply overwrite the indices of what comes out of the model

@lbventura
Copy link
Contributor Author

Weirdly pre-commit is failing in "Install and test / code-quality (pull_request)" due to flake8, but worked in my local machine? See the latter below
Failing_Test

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 10, 2022

You probably don't have all the linting checks installed locally then?
You can always go on "details" here to see what's wrong:
image

(I know what's wrong, and it is easy to fix - but kindly have a look yourself 😄 )

@fkiraly fkiraly changed the title Added ris-bali dynamic_factor. Added prints to forecasting testing an… [ENH] statsmodels DynamicFactorModel interface Jul 10, 2022
@fkiraly fkiraly changed the title [ENH] statsmodels DynamicFactorModel interface [ENH] statsmodels DynamicFactor interface Jul 10, 2022
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Blocking comments:

  • we need to remove the notebook testfile.ipynb.
  • the estimator should be added to the API reference in an appropriate section.

I also left a number of small, non-blocking comments, above.

@lbventura
Copy link
Contributor Author

the estimator should be added to the API reference in an appropriate section.

Looks good!

Blocking comments:

* we need to remove the notebook `testfile.ipynb`.

* the estimator should be added to the API reference in an appropriate section.

I also left a number of small, non-blocking comments, above.

On the second point, what exactly should be done? One example would be helpful :)

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 11, 2022

On the second point, what exactly should be done? One example would be helpful :)

goto docs.source.api_reference.forecasting and add an entry, following the pattern.
If not sure, jump quickly on discord.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 11, 2022

the right section, is that perhaps "Structural time series models"?

@lbventura
Copy link
Contributor Author

the right section, is that perhaps "Structural time series models"?

Only saw this now but yes, it is where I placed it.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 11, 2022

Only saw this now but yes, it is where I placed it.

Well, you managed to do it incorrectly, as the failed doctest tells us. Let me check...

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 11, 2022

... odd, looks correct.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 11, 2022

ah, looks like a typo in the "references" section of the new estimator. See "details" next to the doc build, maybe you can find out what exactly the typo is.

@lbventura
Copy link
Contributor Author

lbventura commented Jul 11, 2022

... odd, looks correct.

It is complaining that the output is too long? Could it be due to the fact that I changed full_output to True?

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 11, 2022

hm, estimator tests are failing now - any idea why? Perhaps you would like to change it back to the last state where they were not?

@lbventura
Copy link
Contributor Author

hm, estimator tests are failing now - any idea why? Perhaps you would like to change it back to the last state where they were not?

The only reasons I can think of are:

  1. Having changed the sciptype from multivariate to both. I reverted back to multivariate;
  2. Having changed the outputs of the model (the full_output point discussed above).

Let us see if the new commit addresses the problem.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 12, 2022

The only reasons I can think of are:

  1. Having changed the sciptype from multivariate to both. I reverted back to multivariate;

But, should it not be both? This estimator can deal with univariate data, can it not?

  1. Having changed the outputs of the model (the full_output point discussed above).

Why did you do this?

The docs are still not building, btw, kindly check for typos.

@lbventura
Copy link
Contributor Author

lbventura commented Jul 12, 2022

The only reasons I can think of are:

  1. Having changed the sciptype from multivariate to both. I reverted back to multivariate;

But, should it not be both? This estimator can deal with univariate data, can it not?

This was just a quick edit to see whether the problem comes from here. EDIT: it does, once one changes the scitype to "both", there are several tests in check_estimator which fail. See a few examples below:

image

I think it should also work for univariate, but I will check. EDIT: it also does not work in statsmodels. y is just a 1D series generated with _make_series.

image

  1. Having changed the outputs of the model (the full_output point discussed above).

Why did you do this?

The docs were being built fine in a previous commit, reverted to see if this was the change that broke it, but aparently not.

The docs are still not building, btw, kindly check for typos.

Will do

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 12, 2022

I think it should also work for univariate, but I will check. EDIT: it also does not work in statsmodels. y is just a 1D series generated with _make_series.

Ah, ok - then the value "multivariate" that you originally had was correct.

@lbventura
Copy link
Contributor Author

Tried to replicate arima.py forecaster docstring, hopefully this solves the readthedocs problem. Unfortunately the build logs are not very clear.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 13, 2022

hopefully this solves the readthedocs problem. Unfortunately the build logs are not very clear.

Yes, looks like it! All green!

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks!

@fkiraly fkiraly changed the title [ENH] statsmodels DynamicFactor interface & ris-bail [ENH] statsmodels DynamicFactor interface Jul 14, 2022
@fkiraly fkiraly changed the title & ris-bail [ENH] statsmodels DynamicFactor interface & ris-bali [ENH] statsmodels DynamicFactor interface Jul 14, 2022
@fkiraly fkiraly merged commit ef4c6d2 into sktime:main Jul 14, 2022
@lbventura
Copy link
Contributor Author

lbventura commented Jul 14, 2022

Nice, thanks!

Great, on to the next topic (which should be implementing predict_proba if I'm not lazy)!

@fkiraly fkiraly mentioned this pull request Jul 23, 2022
7 tasks
fkiraly pushed a commit that referenced this pull request Jul 30, 2022
Implements `predict_interval` for `DynamicFactor`, which was missing in the original PR [#2859](#2859). Interfaces the `get_forecast` and `conf_int` methods predefined in the `statsmodels` package.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants