
BUG/ENH: constrained/collinear fit and degrees of freedom for terms, df_terms, anova #8512

Open
josef-pkt opened this issue Nov 8, 2022 · 0 comments

josef-pkt commented Nov 8, 2022

Follow-up question to #8506 and #8336.

If we (semi-)automatically drop exog columns (e.g. fit_colinear), estimate regularized params (pinv), or fit under constraints, then this can affect hypothesis tests for terms.

The question is what the relevant number of constraints is for a joint hypothesis on a term when the term's columns are affected.
Hypothesis tests on individual columns or on all params will effectively not be affected, I think. Those hypotheses we already have now.

With categorical exog and two-way effects, especially nested effects, we might have empty cells, i.e. a column of zeros in exog.
If we drop those, then the df of the term will be reduced.
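A minimal sketch of the zero-cell case, using plain numpy rather than any statsmodels API. The data, the treatment coding, and the names `nominal_df` and `effective_df` are made up for illustration. With one empty cell the interaction term nominally has 2 columns, but only 1 is nonzero.

```python
import numpy as np

# Hypothetical data: factor a has 2 levels, factor b has 3 levels,
# and the cell (a=1, b=2) has no observations.
a = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
b = np.array([0, 0, 1, 1, 2, 2, 0, 0, 1, 1])

# Treatment-coded interaction columns for a[1]:b[1] and a[1]:b[2].
inter = np.column_stack([(a == 1) & (b == j) for j in (1, 2)]).astype(float)

# Nominal df of the term is its number of columns.
nominal_df = inter.shape[1]

# Columns that are identically zero would be dropped, reducing the df.
keep = np.any(inter != 0, axis=0)
effective_df = int(keep.sum())

print(nominal_df, effective_df)
```

A joint test for the interaction that still used 2 as df would be based on the wrong reference distribution here.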

AFAIR, we don't have any support for keeping track of df, i.e. the number of effective parameters, for terms (except that, AFAIR, we added it to GAM).

That is, we need the effective number of parameters for each term.

im_ratio might apply for pinv OLS
When we only drop columns, then we would just need to keep track of "nonzero" or unrestricted columns.

I don't have any idea (for now) what would be useful for general affine transformation of params as in fit_constrained.

In general, maybe just looking at the matrix rank of the sub-exog of each term would be enough.
I think I already do this in some specification and LM tests, using the rank to determine df_constrained for the chi2 test.
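A rough sketch of the rank idea, again numpy only. The helper `term_df_by_rank` and the `term_slices` mapping are hypothetical, not existing statsmodels functions. It reports two candidates per term: the rank of the term's own sub-exog, and the incremental rank the term adds to the remaining columns. Which of the two is the right df for a given test is an open question here.

```python
import numpy as np


def term_df_by_rank(exog, term_slices):
    """Return {term: (rank of sub-exog, incremental rank)} (hypothetical helper)."""
    rank_full = np.linalg.matrix_rank(exog)
    out = {}
    for name, sl in term_slices.items():
        # Rank of the columns that belong to this term alone.
        sub_rank = np.linalg.matrix_rank(exog[:, sl])
        # Rank lost when the term's columns are removed from exog.
        mask = np.ones(exog.shape[1], dtype=bool)
        mask[sl] = False
        incr = rank_full - np.linalg.matrix_rank(exog[:, mask])
        out[name] = (int(sub_rank), int(incr))
    return out


# Made-up 2 x 3 design with the cell (a=1, b=2) empty.
a = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
b = np.array([0, 0, 1, 1, 2, 2, 0, 0, 1, 1])
exog = np.column_stack([
    np.ones(len(a)),          # const
    a == 1,                   # a[1]
    b == 1, b == 2,           # b[1], b[2]
    (a == 1) & (b == 1),      # a[1]:b[1]
    (a == 1) & (b == 2),      # a[1]:b[2], all zeros
]).astype(float)

term_slices = {"a": slice(1, 2), "b": slice(2, 4), "a:b": slice(4, 6)}
df_terms = term_df_by_rank(exog, term_slices)
print(df_terms)
```

In this example both rank measures agree, and the interaction gets df 1 instead of the nominal 2. They can differ when a term is collinear with columns of other terms.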
