ENH: add multilink models and distribution.dfamilies #7793
base: main
Hello @josef-pkt! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-10-14 17:42:06 UTC
This pull request introduces 1 alert when merging 20f8bc0 into a5ec3cb - view on LGTM.com
This pull request introduces 1 alert when merging 438f79c into a5ec3cb - view on LGTM.com
Older scipy versions don't have stats.betabinom, e.g. scipy==1.3.3. Not worth fixing; solution for now: add the attribute to scipy. Otherwise only a PEP 8 style check failure.
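As an illustration of the version issue, here is a minimal sketch of a guarded lookup with a pure-Python fallback. It is a different workaround from attaching an attribute to scipy. The names `HAVE_BETABINOM` and `betabinom_logpmf` are hypothetical and not part of this PR; the fallback uses the standard beta-binomial formula via log-gamma.

```python
import math

try:
    # stats.betabinom is missing in older scipy (e.g. scipy==1.3.3)
    from scipy import stats
    HAVE_BETABINOM = hasattr(stats, "betabinom")
except ImportError:  # keeps the sketch runnable without scipy
    stats = None
    HAVE_BETABINOM = False


def _log_beta(a, b):
    """Log of the beta function, computed with log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, a, b):
    """Log pmf of the beta-binomial (hypothetical helper, scalar inputs)."""
    if HAVE_BETABINOM:
        return float(stats.betabinom.logpmf(k, n, a, b))
    # fallback: log C(n, k) + log B(k + a, n - k + b) - log B(a, b)
    log_comb = (math.lgamma(n + 1) - math.lgamma(k + 1)
                - math.lgamma(n - k + 1))
    return log_comb + _log_beta(k + a, n - k + b) - _log_beta(a, b)
```

With `a = b = 1` the beta-binomial is uniform on 0, ..., n, which gives a quick sanity check of either branch.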
All green here.
This pull request introduces 1 alert when merging a523aad into a5ec3cb - view on LGTM.com
Funny idea: a constant-like ilink/transformation function that returns a constant. That would be one way of encoding args with a fixed value. But there are no params in that case, i.e. linpred would be empty.
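A rough sketch of what such a constant inverse link could look like. The classes `ConstantLink`, `IdentityLink`, `LogLink` and the helper `dist_args` are hypothetical stand-ins written for this illustration; they are not the link classes used in this PR.

```python
import math


class ConstantLink:
    """Hypothetical link whose inverse ignores the linear predictor."""

    def __init__(self, value):
        self.value = float(value)

    def inverse(self, linpred):
        # linpred is unused: a fixed arg has no params behind it
        return self.value

    def inverse_deriv(self, linpred):
        # the fixed value does not change with linpred
        return 0.0


class IdentityLink:
    def inverse(self, linpred):
        return linpred


class LogLink:
    def inverse(self, linpred):
        return math.exp(linpred)


def dist_args(links, linpreds):
    """Map one linear predictor per distribution arg through its inverse link."""
    return tuple(link.inverse(lp) for link, lp in zip(links, linpreds))


# Example: t distribution args (df, loc, scale) with df fixed at 5.
# The "empty" linpred of the fixed arg is represented by None here.
links = (ConstantLink(5), IdentityLink(), LogLink())
args = dist_args(links, (None, 1.2, 0.0))
```

The fixed arg still occupies a slot in the tuple of distribution args, but contributes nothing to the parameter vector.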
We need a For other models we have to decide whether to provide a predicted mean. Not too difficult for scipy distributions.
This pull request introduces 1 alert when merging 2c30dea into 123cca1 - view on LGTM.com
I copied most parts over from #7778; not included is the Het base model, which is mostly obsolete given the more flexible MultiLinkModel. We might create a subclass of MultiLinkModel for loc/mean - scale HET models, mainly for name recognition and start_params. This still includes the full THet model because it's the only one with a fixed arg. Adjustments are for changed paths and for subclassing MultiLinkModel instead of the Het model.
A unit test fails on one machine: statsmodels\regression\tests\test_quantile_regression.py:319
Stata has hetregress (MLE and two-step FGLS), but only since version 15. I only have version 14.
In R, crch (https://search.r-project.org/CRAN/refmans/crch/html/crch.html) looks useful; it estimates by MLE or by another method that I have never heard of.
Large-sample results compared to GLSHet are in #647 (comment).
Parking an example dataset for two-way ANOVA here (while shutting down Windows),
from Martin J. Crowder, 1978, Beta-Binomial Anova for Proportions.
statsmodels/othermod/base_model.py
Outdated
# TODO: here or in __init__
self.k_vars = self.exog.shape[1]
self.k_params = (self.exog.shape[1] + self.exog_scale.shape[1] +
                 self.k_extra)
This assumes we don't have exog for the extra params.
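One way to drop that assumption, sketched under the assumption that every design matrix is a 2-D array-like: count one parameter per column over all exogs, and add only those extra parameters that have no exog of their own. The function `count_params` and its argument names are hypothetical, not code from this PR.

```python
def count_params(exogs, k_extra_without_exog=0):
    """Total number of parameters.

    exogs : sequence of 2-D array-likes (nested lists or numpy arrays),
        one design matrix per distribution arg that has regressors.
    k_extra_without_exog : number of extra params without a design matrix.
    """
    n_cols = [len(x[0]) for x in exogs]  # number of columns per matrix
    return sum(n_cols) + k_extra_without_exog


exog = [[1.0, 0.5], [1.0, 1.5]]        # 2 columns for loc
exog_scale = [[1.0], [1.0]]            # 1 column for scale
exog_extra = [[1.0, 2.0], [1.0, 3.0]]  # extra param that has its own exog

k_no_extra_exog = count_params([exog, exog_scale], k_extra_without_exog=1)
k_with_extra_exog = count_params([exog, exog_scale, exog_extra])
```

In the first call the extra parameter is a single scalar; in the second it is modeled through two columns, so the total grows accordingly.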
This pull request introduces 2 alerts when merging 76e9b75 into 63a32e9 - view on LGTM.com
If we want to generically support a Binomial count MLE model, then MultiLinkModel would also need to support the case where we have only a single darg, link and exog. Aside: the Zipf/Zipfian distribution is also a one-parameter count distribution, with possibly finite support 0, ... n.
decision: (i)links go into the
I might still add a model.links attribute as a shortcut and for consistency with models that don't have (d)families. I'm still not sure how to handle a variable number of exogs and offsets. If I put them in a list or tuple, then super in base data handling will not handle missing values across data arrays. As reference implementation:
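To illustrate the missing-value concern with a list or tuple of exogs (a hypothetical sketch, not the reference implementation mentioned above): a row has to be dropped from all arrays as soon as any one of them has a NaN in that row. The helper `drop_missing_jointly` is an assumed name, and plain nested lists are used only to keep the sketch dependency-free; real code would more likely build a boolean mask with numpy.

```python
import math


def _row_has_nan(row):
    """True if any float entry of the row is NaN."""
    return any(isinstance(v, float) and math.isnan(v) for v in row)


def drop_missing_jointly(arrays):
    """Drop every row index that has a NaN in any of the arrays.

    arrays : sequence of 2-D nested lists with the same number of rows.
    Returns the cleaned arrays and the list of kept row indices.
    """
    n_rows = len(arrays[0])
    if any(len(a) != n_rows for a in arrays):
        raise ValueError("all arrays need the same number of rows")
    keep = [i for i in range(n_rows)
            if not any(_row_has_nan(a[i]) for a in arrays)]
    cleaned = [[a[i] for i in keep] for a in arrays]
    return cleaned, keep


nan = float("nan")
exog_loc = [[1.0, 2.0], [1.0, nan], [1.0, 4.0]]
exog_scale = [[1.0], [1.0], [nan]]
cleaned, keep = drop_missing_jointly([exog_loc, exog_scale])
```

Here row 1 is missing in `exog_loc` and row 2 in `exog_scale`, so only row 0 survives in both arrays.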
I think the way forward here is to keep this PR and make new PRs with refactored versions.
Postponed for now, likely for 0.15. (I had applied for a small grant, but my proposal wasn't good enough to get it.)
closes #7778
This will be a continuation of #7778
now starting with distribution dfamilies
similar to genmod families but focused on MLE with scipy optimizers
I'm planning to add cleaner parts of #7778 based on MultiLinkModel to this PR
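For readers new to the idea, a minimal sketch of a multi-link log-likelihood, assuming a normal distribution with an identity link for loc and a log link for scale. The function names and the parameter layout (loc params first, then scale params) are assumptions made for this illustration and are not the MultiLinkModel API.

```python
import math


def normal_logpdf(y, loc, scale):
    """Log density of the normal distribution at y."""
    z = (y - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2 * math.pi)


def dot(row, params):
    """Linear predictor for one observation."""
    return sum(x * p for x, p in zip(row, params))


def loglike(params, endog, exog_loc, exog_scale):
    """Sum of log densities, with one linear predictor per distribution arg."""
    k_loc = len(exog_loc[0])
    params_loc, params_scale = params[:k_loc], params[k_loc:]
    total = 0.0
    for y, x_loc, x_scale in zip(endog, exog_loc, exog_scale):
        loc = dot(x_loc, params_loc)                  # identity inverse link
        scale = math.exp(dot(x_scale, params_scale))  # log link keeps scale > 0
        total += normal_logpdf(y, loc, scale)
    return total


endog = [0.0, 1.0]
exog_loc = [[1.0], [1.0]]
exog_scale = [[1.0], [1.0]]
llf = loglike([0.5, 0.0], endog, exog_loc, exog_scale)
```

In practice the negative of such a function would be passed to a scipy optimizer, for example `scipy.optimize.minimize`, to obtain the MLE.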