SUMM: generic enhancements GLM, discrete #4759

josef-pkt · 2018-06-23T21:31:23Z

This is just to keep track of current additions, because I have somewhat lost the overview of what is being added where.

BUG: GLM score and hessian #4620 fix GLM hessian factor, clarify role of scale and concentrated loglike, no ENH
- test cases GLM vs discrete
Penalized mle scad rebased2 #4576 penalized Mixin and penalties
- base Model _fit_zeros and _fit_collinear
- penalized mixin: classes in base, but no public classes yet for specific Models
- base/tests/test_penalized.py cases for Logit, Probit, Poisson and GLM counterparts
- needs review for cov_params
ENH: add ultra-high screening with SCAD #4683 screening
- base._screening, no public functions, needs public Penalized function for Penalized mle scad rebased2 #4576
- base/tests/test_screening.py cases Poisson, Logit, GLM-Poisson, GLM-Logit, GLM-Gaussian
- ENH: add discrete poisson resid_pearson
Score/LM conditional moment tests #2096 score/lm and cm tests
- stats/_diagnostic_other.py base functionality for score and conditional moment tests
  still includes Poisson over/excess dispersion tests
- base/_parameter_inference.py score/lm test for attaching to model results classes
- tests/test_score_test.py only GLM-Poisson
- stats/tests/test_diagnostic_other.py only OLS
- ENH: discrete Poisson: add family, score_factor, hessian_factor
- ENH: GPP: add resid_response
ENH: add GLMInfluence #4732 GLMInfluence
- stats/outliers_influence.py MLEInfluence, GLMInfluence
- ENH: GLM: add _deriv_mean_dparams, _deriv_score_obs_dendog derivatives for MLEInfluence
- ENH: GLMResults: add get_hat_matrix_diag and get_influence
- stats/tests/test_influence.py cases GLM-Logit, GLM-Binomial, GLM-Gaussian

still open:

ENH: add score_factor, hessian_factor to discrete #4716 score tests, discrete
- ENH discrete BinaryModel add offset and fit_constrained
- discrete/tests/test_constrained.py case Logit
- GLMResults: add score_test method
- discrete/tests/test_sandwich_cov.py unit tests comparing GLM and discrete counterparts
- ENH: add score_factor and hessian_factor to remaining models (except Negbin) ?
- ...

targets:

add missing pieces to discrete classes
add score_test to no-extra params discrete classes Logit, Probit, Poisson
add get_influence with MLEInfluence for Logit, Probit, Poisson
extend score_test and get_influence to extra params models NBP and GPP
ignore NegativeBinomial ?
future: two part models, ZeroInflated, BetaRegression.
fit_constrained for extra params models ?
public, official Penalized classes, GAM ?
diagnostic tests based on score/lm test
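
For reference, the idea behind the score_test targets above can be sketched in a few lines. This is only a minimal illustration, assuming a Poisson model with log link whose null model has just an intercept, plus a single candidate regressor `z`. The function name `poisson_score_test_added_variable` is hypothetical and this is not the statsmodels implementation. In that special case the restricted estimate of the mean is the sample mean, so the statistic has a closed form.

```python
import math


def poisson_score_test_added_variable(y, z):
    """Score (LM) test for adding regressor z to an intercept-only Poisson model.

    Hypothetical helper for illustration only, not a statsmodels function.
    Returns (statistic, pvalue); the reference distribution is chi2 with 1 df.
    """
    n = len(y)
    if n != len(z) or n < 2:
        raise ValueError("y and z need the same length, at least 2")
    mu = sum(y) / n  # restricted MLE of the mean under the null
    z_bar = sum(z) / n
    # score for the coefficient of z, evaluated at the restricted estimate
    score = sum((zi - z_bar) * (yi - mu) for zi, yi in zip(z, y))
    # information for z after partialling out the intercept (Poisson: var = mu)
    info = mu * sum((zi - z_bar) ** 2 for zi in z)
    if info <= 0:
        raise ValueError("z is constant or the mean of y is zero")
    stat = score ** 2 / info
    # survival function of chi2(1): P(Z**2 > stat) = erfc(sqrt(stat / 2))
    pvalue = math.erfc(math.sqrt(stat / 2))
    return stat, pvalue


stat, pvalue = poisson_score_test_added_variable([1, 2, 2, 4, 6], [0, 1, 2, 3, 4])
```

Only the restricted (null) model has to be estimated here, which is what makes the score test attractive as a basis for diagnostic tests.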


josef-pkt · 2018-06-24T00:31:59Z

(notebooks mostly on my computer, some in gists, just for me to keep track of open notebooks that I'm shutting down, roughly from newest to oldest in recent work)

influence_glm_logit_short.ipynb basis for notebook in statsmodels, with development code
influence_glm_gaussian.ipynb
try_glm_reset_score_test.ipynb : score and wald test versions of reset test, cov_type HC, Poisson example
ex_cmt_gmm.ipynb GMM version, notebook based on a 2014 script (I think it was the basis for the included unit test for OLS)
ex_cmt_gmm-poisson.ipynb cmt_gmm rewritten for Poisson
score_test_monte_carlo.ipynb Monte Carlo and development code for GLM-Poisson score_test
try_screening_class.ipynb SCAD screening in Poisson, nobs, k_vars = 100, 5000, smaller version in unit tests ?
try_screening_class-cleaned.ipynb cleaned version, should be in gist
try_screening_class-cleaned-PR.ipynb cleaned version at end of PR ?
try_glm_screening_class-cleaned.ipynb another version ? with GLMPenalized
scad_penalty.ipynb plot SCAD function and derivatives, written while shifting down so that SCADSmoothed(0) = 0
binomial_count.ipynb notebook to check extras for GLM-Binomial count, e.g. fit_constrained, fit_regularized (IIRC I fixed some bugs while using this)
try_zero_collinear.ipynb converted from old script, used for checking fit_collinear for Poisson, NegBin and OLS
try_poisson_ultrahigh_screening_1.ipynb converted old script before rewriting _screening, plus some influence experiments
try_poisson_spline_truncpower_2.ipynb converted from old script with many different tries, doesn't work well enough: converting selected knots in the truncated power basis to B-splines doesn't look very good in plots. Poisson example with SCAD penalization; outdated code, SCAD is not the default penalty anymore.
try_poisson_fused_categorical_2.ipynb converted from old script, also uses earlier version of PenalizedMixin with SCAD default, looks good. "Poisson with pairwise categorical penalization"
try_penalized_glm.ipynb early version of Poisson and GLM-Poisson penalized, outdated as previous
ex_theil_ridge.ipynb converted from old script, various example cases for structured L2 penalization, might work with current statsmodels. Includes a comparison of different selection criteria and cov_types.
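
For orientation on scad_penalty.ipynb in the list above, the textbook SCAD penalty and its derivative (Fan and Li, 2001) can be written as below. This is a minimal sketch, assuming the standard three-piece definition with `a = 3.7` as the default. The function names `scad` and `scad_deriv` are hypothetical, and the smoothed variant (SCADSmoothed) is not reproduced here.

```python
def scad(theta, lam, a=3.7):
    """Plain SCAD penalty for a scalar theta (illustrative, not statsmodels code)."""
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # L1 region: linear in |theta|
        return lam * t
    if t <= a * lam:
        # quadratic transition region
        return (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))
    # flat region: large coefficients get a constant penalty
    return lam ** 2 * (a + 1) / 2


def scad_deriv(theta, lam, a=3.7):
    """Derivative of the plain SCAD penalty with respect to theta."""
    if lam <= 0 or a <= 2:
        raise ValueError("SCAD requires lam > 0 and a > 2")
    t = abs(theta)
    # the penalty has a kink at 0; sign is 0 there, so 0 (a valid
    # subgradient) is returned by convention
    sign = (theta > 0) - (theta < 0)
    if t <= lam:
        return lam * sign
    if t <= a * lam:
        return (a * lam - t) / (a - 1) * sign
    return 0.0


# penalty values on a small grid, e.g. as input for a plot
grid = [i / 10 for i in range(-50, 51)]
values = [scad(x, lam=1.0) for x in grid]
```

The three pieces join continuously at |theta| = lam and |theta| = a * lam, which is easy to check numerically with the functions above.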

two that were still open after closing tabs:

try_screening_class-cleaned-Copy1.ipynb looks like an extended version of another notebook
openhub_stats.ipynb unrelated to issue, linear trend in LOC count using monthly openhub data.

josef-pkt added type-enh comp-genmod comp-discrete labels Jun 23, 2018

josef-pkt mentioned this issue Sep 4, 2018

Use of Breusch-Godfrey test with any residue series and exog as parameter #5098

Open

josef-pkt mentioned this issue Sep 21, 2018

is sm.open_help ever defined? #5134

Closed

josef-pkt mentioned this issue Dec 31, 2019

SUMM/TST Monte Carlo verification for score tests #6377

Open
