Tests for RegressionModel #121

Open
Nosferican opened this issue Nov 30, 2017 · 7 comments

Comments

@Nosferican

Nosferican commented Nov 30, 2017

I wanted to know if this package would be a good place to develop a suite of tests for Regression Models or if it should be a different one.

The idea would be to provide a few tests such as:

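struct WaldTest <: HypothesisTest
    value::Float64   # test statistic
    dist::FDist      # reference distribution under the null
    # Inner constructor: builds the test from any fitted RegressionModel.
    function WaldTest(model::StatsBase.RegressionModel;
                      restrictions::Union{Matrix,Vector{Symbol}} = coefnames(model))
        ...
    end
end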

These tests would make use of the methods provided by StatsBase as much as possible. A few that might be implemented are:

Hypothesis tests

[ ] Wald test
[ ] Lagrange multiplier test (LM test)
[ ] Likelihood ratio test (LR test)
[ ] Sargan-Hansen test
[ ] Durbin-Wu-Hausman test
[ ] Chow test

Whenever possible it would compute the robust version.
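
For reference, the textbook form of the Wald statistic for $q$ linear restrictions $H_0: R\beta = r$ is sketched below. The symbols $R$, $r$, $q$ and $\hat V$ are notation assumed here rather than anything fixed in this thread, with $\hat V$ standing for whatever `vcov(model)` returns:

$$
W = (R\hat\beta - r)^\top \left(R \hat V R^\top\right)^{-1} (R\hat\beta - r), \qquad F = \frac{W}{q}
$$

Asymptotically $W$ is compared with a $\chi^2_q$ distribution, while $F$ is commonly compared with an $F(q, \nu)$ distribution where $\nu$ is the residual degrees of freedom. Plugging in a heteroskedasticity-robust $\hat V$ gives the robust version from the same formula.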

@andreasnoack
Member

I don't know if the scope of this package has ever been defined, but I've typically thought of it as a collection of functions for testing hypotheses on fairly "simple" parameters, mostly related to the mean. However, the time series section doesn't really follow this "rule" and is also more econometrics inspired. If we add regression tests to this package, a regression package (most likely GLM) should probably be added as a dependency, so maybe this is not the right place for these.

An alternative could be to make a more batteries-included econometrics package with all of the typical econometrics textbook tools handy. While I believe 1-3 are generally used across branches of statistics (although 2 usually goes by a different name, the score test), I haven't seen 4-6 used outside econometrics.

@Nosferican
Author

Well, the idea is to have them work for any StatsBase.RegressionModel, so hopefully no regression package would need to be added as a dependency. Packages would just have to implement the methods in StatsBase and maybe one or two additional methods at most. I could probably write a RegressionTests package to host the regression / econometrics / time series tests. Where applicable, those could call methods defined here.

Would it be possible to define testname and pvalue in StatsBase, and maybe a type hierarchy for tests there if it makes sense?

@nalimilan
Member

I don't think that's the right package. At least for tests as general as LRT and Wald, I think it would be fine to have them in StatsBase, or maybe in StatsModels if we decide it's not just about translating formulas into model matrices. For others, I'm not sure. Could they be defined using only a limited set of methods that are part of the RegressionModel interface?

@Nosferican
Author

I believe the tests could be structs whose constructors take a RegressionModel (and maybe some keyword arguments) and which have two fields, a value and a distribution, plus other values or methods for summary data (testname, restrictions, null hypothesis, etc.). The regression tests usually run some regression and then test some restrictions, which results in a Wald test or LR test. Thus StatsBase could have the hypothesis tests (the first three), and the regression tests would be wrappers that construct one of the basic ones. The same goes for Breusch-Pagan and a few others already implemented.

@nalimilan
Member

Yes, but what methods do they need the input model object to implement?

Also see how the F-test has been implemented recently in GLM: JuliaStats/GLM.jl#182. We should probably take a similar approach for the LRT, while other tests could work differently since there are no models to compare IIUC.
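
For context, the likelihood ratio test compares a restricted model nested in an unrestricted one. In notation assumed here ($\ell_R$ and $\ell_U$ for the two maximized log-likelihoods, $q$ for the difference in the number of free parameters), the usual statistic is:

$$
LR = 2\,(\ell_U - \ell_R) \overset{a}{\sim} \chi^2_q
$$

In StatsBase terms these quantities would presumably come from `loglikelihood` and `dof` on each fitted model, which is why the test needs two models rather than one.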

@Nosferican
Author

I would have to work a bit on those and get back to you. For most, I believe modelmatrix, residuals, dof_residual, coef, coefnames, vcov, deviance, loglikelihood, etc. would be enough; for other tests we might have to feed in multiple RegressionModel objects. I implemented a Wald test before GLM got theirs, so I haven't checked theirs yet. For all tests I tend to go for the robust version (usually à la Wooldridge). In many cases it is a generalization, so the two coincide under the spherical-error and independence assumptions, and the robust one remains correct when using a robust vcov.
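
As a rough illustration of how little of the interface a Wald test needs, here is a minimal sketch. `ToyOLS`, `wald_fstat` and the data are hypothetical, and the local `coef`, `vcov`, `residuals` and `dof_residual` are stand-ins for the StatsBase methods of the same names, defined here only so the snippet runs with the standard library alone. It is not the GLM implementation.

```julia
using LinearAlgebra

# Hypothetical stand-in for a fitted model. A real package would subtype
# StatsBase.RegressionModel and extend the StatsBase methods instead.
struct ToyOLS
    X::Matrix{Float64}
    y::Vector{Float64}
    β::Vector{Float64}
end

# Local stand-ins for the StatsBase methods of the same names (assumption).
coef(m::ToyOLS) = m.β
residuals(m::ToyOLS) = m.y - m.X * m.β
dof_residual(m::ToyOLS) = size(m.X, 1) - size(m.X, 2)
# Classical (spherical-error) covariance estimate of the coefficients.
vcov(m::ToyOLS) = (sum(abs2, residuals(m)) / dof_residual(m)) * inv(m.X' * m.X)

# Wald F statistic for H0: R*β = r. Only coef, vcov and dof_residual are used;
# a robust covariance matrix can be passed through the keyword V.
function wald_fstat(m, R::AbstractMatrix, r::AbstractVector = zeros(size(R, 1));
                    V::AbstractMatrix = vcov(m))
    d = R * coef(m) - r              # deviation from the null
    q = size(R, 1)                   # number of restrictions
    W = dot(d, (R * V * R') \ d)     # Wald quadratic form
    return (F = W / q, dof = (q, dof_residual(m)))
end

# Made-up data: intercept plus one regressor.
X = [ones(6) 1.0:6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
m = ToyOLS(X, y, X \ y)

# Test the single restriction "slope = 0".
res = wald_fstat(m, [0.0 1.0])
```

With a single restriction the F statistic equals the squared t statistic of that coefficient, which makes a handy sanity check. If Distributions is available, the p-value would be `ccdf(FDist(res.dof...), res.F)`.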

@Nosferican
Author

@lbittarello I saw that your package also had a Hausman test implementation.


3 participants