SUMM: generic enhancements GLM, discrete #4759

Open
josef-pkt opened this issue Jun 23, 2018 · 1 comment

Comments

@josef-pkt
Member

josef-pkt commented Jun 23, 2018

This is just to keep track of current additions, because I have lost track of what is being added where.

  • BUG: GLM score and hessian #4620 fix GLM hessian factor, clarify role of scale and concentrated loglike, no ENH
    test cases GLM vs discrete
  • Penalized mle scad rebased2 #4576 penalized Mixin and penalties
    • base Model _fit_zeros and _fit_collinear
    • penalized mixin: classes in base, but no public classes yet for specific Models
    • base/tests/test_penalized.py cases for Logit, Probit, Poisson and GLM counterparts
    • needs review for cov_params
  • ENH: add ultra-high screening with SCAD #4683 screening
    • base._screening, no public functions, needs public Penalized function for Penalized mle scad rebased2 #4576
    • base/tests/test_screening.py cases Poisson, Logit, GLM-Poisson GLM-Logit, GLM-Gaussian
    • ENH: add discrete poisson resid_pearson
  • Score/LM conditional moment tests #2096 score/lm and cm tests
    • stats/_diagnostic_other.py base functionality for score and conditional moment tests
      still includes Poisson over/excess dispersion tests
    • base/_parameter_inference.py score/lm test for attaching to model results classes
    • tests/test_score_test.py only GLM-Poisson
    • stats/tests/test_diagnostic_other.py only OLS
    • ENH: discrete Poisson: add family, score_factor, hessian_factor
    • ENH: GPP: add resid_response
  • ENH: add GLMInfluence #4732 GLMInfluence
    • stats/outliers_influence.py MLEInfluence, GLMInfluence
    • ENH: GLM: add _deriv_mean_dparams, _deriv_score_obs_dendog derivatives for MLEInfluence
    • ENH: GLMResults: add get_hat_matrix_diag and get_influence
    • stats/tests/test_influence.py cases GLM-Logit, GLM-Binomial, GLM-Gaussian
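To make the `score_factor`, `hessian_factor` and hat matrix items above concrete, here is a minimal NumPy sketch for a Poisson model with log link. It is not the statsmodels implementation: the function names only mirror the ones in the list, the sign convention (hessian = -X' diag(hessian_factor) X) is an assumption, and the data are simulated.

```python
import numpy as np

# Simulated Poisson data; sizes and coefficients are arbitrary choices.
rng = np.random.default_rng(0)
nobs = 200
exog = np.column_stack([np.ones(nobs), rng.normal(size=(nobs, 2))])
endog = rng.poisson(np.exp(exog @ np.array([0.5, 0.3, -0.2])))


def loglike(params):
    # Poisson log-likelihood without the log(y!) term, which is constant in params.
    linpred = exog @ params
    return np.sum(endog * linpred - np.exp(linpred))


def score_factor(params):
    # Derivative of loglike_i with respect to the linear predictor: y_i - mu_i.
    return endog - np.exp(exog @ params)


def hessian_factor(params):
    # Negative second derivative with respect to the linear predictor: mu_i.
    return np.exp(exog @ params)


def score(params):
    # Chain rule: the score is exog' times the per-observation factor.
    return exog.T @ score_factor(params)


def hessian(params):
    # Assumed sign convention: hessian = -exog' diag(hessian_factor) exog.
    return -(exog.T * hessian_factor(params)) @ exog


# Newton-Raphson, started at the constant-only solution
# (assumes column 0 of exog is the constant).
params = np.zeros(exog.shape[1])
params[0] = np.log(endog.mean())
for _ in range(50):
    step = np.linalg.solve(hessian(params), score(params))
    params = params - step
    if np.max(np.abs(step)) < 1e-10:
        break

# Diagonal of the weighted hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2),
# using the hessian factor at the estimate as the weights W.
weights = hessian_factor(params)
xtwx_inv = np.linalg.inv((exog.T * weights) @ exog)
hat_matrix_diag = weights * np.einsum("ij,jk,ik->i", exog, xtwx_inv, exog)
```

The point of the two factors is that score and hessian of any single-index model can be assembled from them and `exog` alone, which is what generic influence and score test code needs. The hat matrix diagonal sums to the number of parameters, which is a convenient sanity check.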

still open:

  • ENH: add score_factor, hessian_factor to discrete #4716 score tests, discrete
    • ENH discrete BinaryModel add offset and fit_constrained
    • discrete/tests/test_constrained.py case Logit
    • GLMResults: add score_test method
    • discrete/tests/test_sandwich_cov.py unit tests comparing GLM and discrete counterparts
    • ENH: add score_factor and hessian_factor to remaining models (except Negbin) ?
    • ...
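As an illustration of what a score (LM) test for an omitted variable computes, here is a minimal NumPy sketch for Poisson. It does not use or reproduce the statsmodels `score_test` method; `fit_poisson`, `score_test_omitted` and `chi2_sf_df1` are hypothetical helper names, the data are simulated, and the p-value helper only covers one degree of freedom.

```python
import math

import numpy as np


def fit_poisson(endog, exog, maxiter=100, tol=1e-10):
    """Newton-Raphson for Poisson regression with log link."""
    params = np.zeros(exog.shape[1])
    params[0] = np.log(endog.mean())  # assumes column 0 is the constant
    for _ in range(maxiter):
        mu = np.exp(exog @ params)
        score = exog.T @ (endog - mu)
        info = (exog.T * mu) @ exog  # negative hessian
        step = np.linalg.solve(info, score)
        params = params + step
        if np.max(np.abs(step)) < tol:
            break
    return params


def score_test_omitted(endog, exog, exog_extra):
    """Score test that the coefficients of exog_extra are zero."""
    exog_extra = np.asarray(exog_extra).reshape(len(endog), -1)
    params_restricted = fit_poisson(endog, exog)
    # Evaluate score and information of the full model at the restricted
    # estimate, with the extra coefficients set to zero.
    exog_full = np.column_stack([exog, exog_extra])
    params_c = np.concatenate([params_restricted, np.zeros(exog_extra.shape[1])])
    mu = np.exp(exog_full @ params_c)
    score = exog_full.T @ (endog - mu)
    info = (exog_full.T * mu) @ exog_full
    stat = score @ np.linalg.solve(info, score)
    return stat, exog_extra.shape[1]


def chi2_sf_df1(stat):
    # Survival function of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


# Simulated example in which the omitted variable x2 matters.
rng = np.random.default_rng(12345)
nobs = 500
x1, x2 = rng.normal(size=(2, nobs))
endog = rng.poisson(np.exp(0.5 + 0.3 * x1 + 0.5 * x2))
exog = np.column_stack([np.ones(nobs), x1])

stat, df = score_test_omitted(endog, exog, x2)
pvalue = chi2_sf_df1(stat)
```

Only the restricted model is estimated, which is the main attraction of score tests as a basis for diagnostic tests: the extended model never has to be fit.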

targets:

  • add missing pieces to discrete classes
  • add score_test to no-extra params discrete classes Logit, Probit, Poisson
  • add get_influence with MLEInfluence for Logit, Probit, Poisson
  • extend score_test and get_influence to extra params models NBP and GPP
  • ignore NegativeBinomial ?
  • future: two part models, ZeroInflated, BetaRegression.
  • fit_constrained for extra params models ?
  • public, official Penalized classes, GAM ?
  • diagnostic tests based on score/lm test
@josef-pkt
Member Author

(notebooks mostly on my computer, some in gists, just for me to keep track of open notebooks that I'm shutting down, roughly from newest to oldest in recent work)

influence_glm_logit_short.ipynb basis for notebook in statsmodels, with development code
influence_glm_gaussian.ipynb
try_glm_reset_score_test.ipynb : score and wald test versions of reset test, cov_type HC, Poisson example
ex_cmt_gmm.ipynb GMM version, notebook based on a 2014 script (I think it was the basis for the included unit test for OLS)
ex_cmt_gmm-poisson.ipynb cmt_gmm rewritten for Poisson
score_test_monte_carlo.ipynb Monte Carlo and development code for GLM-Poisson score_test
try_screening_class.ipynb SCAD screening in Poisson, nobs, k_vars = 100, 5000, smaller version in unit tests ?
try_screening_class-cleaned.ipynb cleaned version, should be in gist
try_screening_class-cleaned-PR.ipynb cleaned version at end of PR ?
try_glm_screening_class-cleaned.ipynb another version ? with GLMPenalized
scad_penalty.ipynb plot SCAD function and derivatives, written while shifting the function down so that SCADSmoothed(0) = 0
binomial_count.ipynb notebook to check extras for GLM-Binomial count, e.g. fit_constrained, fit_regularized (IIRC I fixed some bugs while using this)
try_zero_collinear.ipynb converted from old script, used for checking fit_collinear for Poisson, NegBin and OLS
try_poisson_ultrahigh_screening_1.ipynb converted old script before rewriting _screening, plus some influence experiments
try_poisson_spline_truncpower_2.ipynb converted from old script with many different tries, doesn't work well enough: converting selected knots in the truncated power basis to B-splines doesn't look very good in plots. Poisson example with SCAD penalization, outdated code, SCAD is no longer the default penalty.
try_poisson_fused_categorical_2.ipynb converted from old script, also uses earlier version of PenalizedMixin with SCAD default, looks good. "Poisson with pairwise categorical penalization"
try_penalized_glm.ipynb early version of Poisson and GLM-Poisson penalized, outdated as previous
ex_theil_ridge.ipynb converted from old script, various example cases for structured L2 penalization, might work with current statsmodels. with comparison of different selection criteria and cov_types
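For reference on what scad_penalty.ipynb plots, here is a minimal sketch of the standard (unsmoothed) SCAD penalty and its derivative. It is not the statsmodels `SCADSmoothed` class, whose smoothing near zero is not reproduced here; the names `scad_penalty`, `scad_deriv`, `tau` and `c` are assumptions chosen for this example, with `c=3.7` being the commonly used default.

```python
import numpy as np


def scad_penalty(params, tau=1.0, c=3.7):
    """Standard SCAD penalty: linear, then quadratic, then constant."""
    abs_p = np.abs(np.asarray(params, dtype=float))
    linear = tau * abs_p
    quadratic = (2 * c * tau * abs_p - abs_p**2 - tau**2) / (2 * (c - 1))
    constant = tau**2 * (c + 1) / 2
    return np.where(abs_p <= tau, linear,
                    np.where(abs_p <= c * tau, quadratic, constant))


def scad_deriv(params, tau=1.0, c=3.7):
    """Derivative of the SCAD penalty away from zero.

    The penalty has a kink at zero; np.sign returns 0 there, which is only
    a placeholder value and not a true derivative.
    """
    p = np.asarray(params, dtype=float)
    abs_p = np.abs(p)
    middle = (c * tau - abs_p) / (c - 1)
    magnitude = np.where(abs_p <= tau, tau,
                         np.where(abs_p <= c * tau, middle, 0.0))
    return np.sign(p) * magnitude


# Values on a grid, e.g. as input for a plot.
grid = np.linspace(-5, 5, 201)
penalty_values = scad_penalty(grid)
deriv_values = scad_deriv(grid)
```

The penalty is flat beyond `c * tau`, so large coefficients are not shrunk, while the kink at zero is what produces exact zeros.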

two that were still open after closing tabs:

try_screening_class-cleaned-Copy1.ipynb looks like an extended version of another notebook
openhub_stats.ipynb unrelated to issue, linear trend in LOC count using monthly openhub data.
