Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add rsquared for statespace model #4734 #6620

Open
wants to merge 39 commits into
base: main
Choose a base branch
from

Conversation

BenjaminLiuPenrose
Copy link

Notes:

  • It is essential that you add a test when making code changes. Tests are not
    needed for doc changes.
  • When adding a new function, test values should usually be verified in another package (e.g., R/SAS/Stata).
  • When fixing a bug, you must add a test that would produce the bug in master and
    then show that it is fixed with the new code.
  • New code additions must be well formatted. Changes should pass flake8. If on Linux or OSX, you can
    verify you changes are well formatted by running
    git diff upstream/master -u -- "*.py" | flake8 --diff --isolated
    
    assuming flake8 is installed. This command is also available on Windows
    using the Windows System for Linux once flake8 is installed in the
    local Linux environment. While passing this test is not required, it is good practice and it help
    improve code quality in statsmodels.
  • Docstring additions must render correctly, including escapes and LaTeX.

@pep8speaks
Copy link

pep8speaks commented Apr 4, 2020

Hello @BenjaminLiuPenrose! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-04-17 23:11:02 UTC

@coveralls
Copy link

coveralls commented Apr 4, 2020

Coverage Status

Coverage increased (+0.02%) to 87.707% when pulling 4bead54 on BenjaminLiuPenrose:my_change into 35b04e5 on statsmodels:master.

@codecov
Copy link

codecov bot commented Apr 4, 2020

Codecov Report

Merging #6620 into master will decrease coverage by <.01%.
The diff coverage is 25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6620      +/-   ##
==========================================
- Coverage   85.31%    85.3%   -0.01%     
==========================================
  Files         646      646              
  Lines      103922   103926       +4     
  Branches    11311    11311              
==========================================
+ Hits        88657    88658       +1     
- Misses      12806    12809       +3     
  Partials     2459     2459
Impacted Files Coverage Δ
statsmodels/tsa/statespace/mlemodel.py 87.14% <25%> (-0.2%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35b04e5...7c37307. Read the comment docs.

@BenjaminLiuPenrose
Copy link
Author

where should I write the test case to pass the codecov/patch

@ChadFulton
Copy link
Member

ChadFulton commented Apr 6, 2020

Thanks very much for submitting this PR, @BenjaminLiuPenrose! This will be a really nice feature to have.

I have a couple of suggestions:

  1. For computing the SSE, you're currently using the large-sample approximation of (5.5.3), because self.resid corresponds to v_t and not to \tilde v_t (which is self.standardized_forecasts_error). Instead, why don't we use the finite prediction error variance defined in (5.5.2), so that you would compute the SSE as in (5.5.13):

    d = np.maximum(self.loglikelihood_burn, self.nobs_diffuse)
    srss = np.sum(self.standardized_forecasts_error[0, d:]**2)
    f_T = self.forecasts_error_cov[0, 0, -1]
    sse = f_T * srss
  2. Since there are multiple ways to define the R^2 depending on the baseline model used as a comparison, how about defining one new method, get_rsquared that accepts one argument baseline. baseline could accept either "mean", which is what you currently have as the rsquared attribute, but could also accept "rwdrift", which is what you currently have as rsquared_difference. I think the default value should be "rwdrift". Eventually (or in this PR if you like), we could add a seasonal argument to allow specifying the R^2 that Harvey has in (5.5.17).

  3. Then the two properties that you added would just call that method with the argument baseline="mean", and baseline="rwdrift". I also suggest that you rename the attributes in MLEResults to be rsquared_mean, rsquared_rwdrift.

  4. I would suggest having the rsquared property raise a NotImplementedError, with a message letting users know that in state space models there is not a single comparison model that will always be useful, and suggesting that they use rsquared_rwdrift. You can also mention that rsquared_mean exists, but is not generally recommended. I think this will be helpful, since the basic rsquared_mean is not usually so helpful in time series contexts.

  5. We need to make sure that these don't return erroneous numbers for multivariate models. For this PR, if you want to just support univariate models, that's okay. In that case, you should raise a NotImplementedError if self.model.k_endog > 1. If you want to try to support multivariate models, that's great too!

Thanks again!

@BenjaminLiuPenrose
Copy link
Author

Thanks very much for submitting this PR, @BenjaminLiuPenrose! This will be a really nice feature to have.

I have a couple of suggestions:

  1. For computing the SSE, you're currently using the large-sample approximation of (5.5.3), because self.resid corresponds to v_t and not to \tilde v_t (which is self.standardized_forecasts_error). Instead, why don't we use the finite prediction error variance defined in (5.5.2), so that you would compute the SSE as in (5.5.13):
    d = np.maximum(self.loglikelihood_burn, self.nobs_diffuse)
    srss = np.sum(self.standardized_forecasts_error[0, d:]**2)
    f_T = self.forecasts_error_cov[0, 0, -1] * srss
    sse = f_T * rss
  2. Since there are multiple ways to define the R^2 depending on the baseline model used as a comparison, how about defining one new method, get_rsquared that accepts one argument baseline. baseline could accept either "mean", which is what you currently have as the rsquared attribute, but could also accept "rwdrift", which is what you currently have as rsquared_difference. I think the default value should be "rwdrift". Eventually (or in this PR if you like), we could add a seasonal argument to allow specifying the R^2 that Harvey has in (5.5.17).
  3. Then the two properties that you added would just call that method with the argument baseline="mean", and baseline="rwdrift". I also suggest that you rename the attributes in MLEResults to be rsquared_mean, rsquared_rwdrift.
  4. I would suggest having the rsquared property raise a NotImplementedError, with a message letting users know that in state space models there is not a single comparison model that will always be useful, and suggesting that they use rsquared_rwdrift. You can also mention that rsquared_mean exists, but is not generally recommended. I think this will be helpful, since the basic rsquared_mean is not usually so helpful in time series contexts.
  5. We need to make sure that these don't return erroneous numbers for multivariate models. For this PR, if you want to just support univariate models, that's okay. In that case, you should raise a NotImplementedError if self.model.k_endog > 1. If you want to try to support multivariate models, that's great too!

Thanks again!

Sure. Will work on that

@ChadFulton
Copy link
Member

Note: there was typo in my code computing the sse, but I have corrected it above.

@BenjaminLiuPenrose
Copy link
Author

@ChadFulton how about now?

@ChadFulton
Copy link
Member

That's great, thanks! I will have a change to make a couple of minor comments on the code itself hopefully later today. The only major thing remaining is coming up with some unit tests.

@BenjaminLiuPenrose
Copy link
Author

That's great, thanks! I will have a change to make a couple of minor comments on the code itself hopefully later today. The only major thing remaining is coming up with some unit tests.

sure;)

@BenjaminLiuPenrose
Copy link
Author

@ChadFulton any updates?

statsmodels/tsa/statespace/mlemodel.py Outdated Show resolved Hide resolved
statsmodels/tsa/statespace/mlemodel.py Outdated Show resolved Hide resolved
statsmodels/tsa/statespace/mlemodel.py Outdated Show resolved Hide resolved
@ChadFulton
Copy link
Member

I have a couple of comments, but the main thing now is to get some unit tests. Are you familiar with unit testing?

Do you know of any resources we could test this against, or have you thought about other ways that we could write some tests?

@BenjaminLiuPenrose
Copy link
Author

BenjaminLiuPenrose commented Apr 14, 2020

I have a couple of comments, but the main thing now is to get some unit tests. Are you familiar with unit testing?

Do you know of any resources we could test this against, or have you thought about other ways that we could write some tests?

@ChadFulton guess I will add test_summary_rsquared in \statsmodels\statsmodels\tsa\statespace\tests\test_mlemodel.py,

but what I can do for now is to check r2 is displayed but I don't know the 'ground true' value for each of r2 wrt to the dummy_model

also, can you also point me out how to extend r2 metric for multi-endog variables problem?

@BenjaminLiuPenrose
Copy link
Author

@ChadFulton test case added, it is a trivial case

@BenjaminLiuPenrose
Copy link
Author

@ChadFulton not sure how to print R2 for multivariate case

@ChadFulton
Copy link
Member

This is looking good, thanks!

not sure how to print R2 for multivariate case

I agree that this is a tough call. Currently for other output, like the test statistics, I just print the list, but that is not very pretty. One option would be to create a table in the summary output that displays the R2 for each endog variable in the multivariate case.

@BenjaminLiuPenrose
Copy link
Author

I'm now printing it as np array

@BenjaminLiuPenrose
Copy link
Author

any ideas why the ci/appveyor/pr test failed?

@ChadFulton
Copy link
Member

any ideas why the ci/appveyor/pr test failed?

I'm not sure, but it seems like it's unrelated.

On a side note - this is looking good, and I'm sorry for the delay - it will probably take me a day or two to get back to this, based on other time commitments.

@BenjaminLiuPenrose
Copy link
Author

any ideas why the ci/appveyor/pr test failed?

I'm not sure, but it seems like it's unrelated.

On a side note - this is looking good, and I'm sorry for the delay - it will probably take me a day or two to get back to this, based on other time commitments.

sure, np

@BenjaminLiuPenrose
Copy link
Author

@ChadFulton updates?

Copy link
Member

@ChadFulton ChadFulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some more comments. The only big thing that we need to do before this can be merged is to add unit tests for the rwdrift and seasonal cases.

@@ -2894,6 +2894,76 @@ def zvalues(self):
"""
return self.params / self.bse

def get_rsquared(self, baseline="rwdrift", **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add kwargs later if necessary, but we should avoid it unless we actually need to capture unknown keyword arguments (it can lead to problems with, e.g., misspelled keyword arguments).

statsmodels/tsa/statespace/mlemodel.py Show resolved Hide resolved
statsmodels/tsa/statespace/mlemodel.py Show resolved Hide resolved
statsmodels/tsa/statespace/mlemodel.py Show resolved Hide resolved
endog = np.array([1, 2, 4, 8, 16])
exog = np.array([1, 2, 4, 8, 16])

mod = sarimax.SARIMAX(endog, exog, order=(0, 0, 0), trend='c')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that trend='c' is doing anything here, so we may as well remove it.

exog = np.array([1, 2, 4, 8, 16])

mod = sarimax.SARIMAX(endog, exog, order=(0, 0, 0), trend='c')
res = mod.fit(disp=-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to fit the model - instead, you could put this after you fit the benchmark model and use:

res = mod.smooth(benchmark_res.params)

def test_summary_rsquared():
from statsmodels.regression.linear_model import OLS
endog = np.array([1, 2, 4, 8, 16])
exog = np.array([1, 2, 4, 8, 16])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a perfect fitting model, so you should make exog not identical to endog. Also, so that the OLS R^2 matches, exog needs to include a constant column.

endog = np.array([1, 2, 4, 8, 16])
exog = np.array([1, 2, 4, 8, 16])

mod = sarimax.SARIMAX(endog, exog, order=(0, 0, 0), trend='c')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add concentrate_scale=True so you don't have to estimate the variance parameter.

@BenjaminLiuPenrose
Copy link
Author

cases

Yea, I agreed. What do you think will be a good test case for rwdrift and seasonal?

rsquared = 1. - sse / ssm
elif baseline == "seasonal":
from statsmodels.regression.linear_model import OLS
from statsmodels.tools.tools import add_constant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just use AutoReg which directly supports this model specification at a high level.

statsmodels.tsa.ar_model.AutoReg(endog, 0, trend='c', seasonal=True, periods=seasonal)

However, we have implmented `rsquared_rwdrift` and `rsquared_mean`
It is recommended to use `rsquared_rwdrift`
"""
return self.get_rsquared('')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this produce an error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the error is intentional, you should directly raise NotImplementedError here. You can pull the common message outside the class to share it.

@cache_readonly
def rsquared_mean(self):
"""
(float or array) conventional R-squared, 1 - sse/ssm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring all have the wrong format. Could you please update to NumPy style?

@bashtage
Copy link
Member

Overall pretty close and a useful contribution. It would be good to get this across the line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Add rsquared and variants to state space models
5 participants