BUG Negbin fit regularized #1623

Merged
merged 14 commits into from Jul 22, 2014

Conversation

@josef-pkt (Member) commented Apr 22, 2014

adds fit_regularized method to NegativeBinomial
fixes #1453 and #1454

note: the code is mostly copy-paste, see #1615.
It took me a while to figure out the copy/paste/adjust errors, see for example #1617.

one issue is not clear:
NegativeBinomial transforms parameters in some cases; I chose not to here: self._transparams = False

TODO: how can I adjust the docstring?
By default, if alpha is a scalar, the penalization is not applied to the extra shape parameter alpha of nb1 and nb2.

unit tests follow the same pattern as the other L1 tests: compare a reduced unregularized estimation with a regularized estimation that doesn't penalize the original parameters (see the sketch below).
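A minimal sketch of that test pattern, assuming the randhie dataset used in the existing L1 tests; the penalty values and the choice m = 3 are illustrative, not the ones in the PR:

import numpy as np
import statsmodels.api as sm

rand_data = sm.datasets.randhie.load_pandas()
rand_exog = sm.add_constant(np.asarray(rand_data.exog), prepend=True)
endog = np.asarray(rand_data.endog)
m = 3  # number of unpenalized beta params

# reduced model, no regularization
res_unreg = sm.NegativeBinomial(endog, rand_exog[:, :m]).fit(disp=0)

# full model with one L1 penalty weight per parameter: zero on the first m
# coefficients, heavy on the rest, zero on the NB shape parameter (last entry)
alpha = 10 * np.ones(rand_exog.shape[1] + 1)
alpha[:m] = 0
alpha[-1] = 0
res_reg = sm.NegativeBinomial(endog, rand_exog).fit_regularized(
    method='l1', alpha=alpha, disp=0)

# the unpenalized coefficients should agree closely across the two fits
print(res_unreg.params[:m])
print(res_reg.params[:m])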

@@ -524,37 +524,45 @@ class CheckL1Compatability(object):
    def test_params(self):
        m = self.m

@jseabold (Member) Apr 22, 2014

Can you use self.res_reg.model.k_exog here to make sure it's being used systematically?

@josef-pkt (Author, Member) Apr 22, 2014

m is a setting for the test; here it is the number of unpenalized beta params:
m = k_exog in the reduced version, but m < k_exog in the regularized version.

In my interpreter session I don't have a model.k_exog. Where is k_exog set?

@jseabold (Member) Apr 22, 2014

Right. From the description I thought that the tests were testing the unregularized vs. regularized without any actual regularization.

It looks like k_exog is only systematically set in ARIMA so far. Should be moved up to base. Still TODO in #487.

        assert_equal(self.res_unreg.df_resid, self.res_reg.df_resid)
        extra = len(self.res_reg.params) == (self.res_reg.model.exog.shape[1] + 1)
        assert_equal(self.res_unreg.df_model + extra, self.res_reg.df_model)
        assert_equal(self.res_unreg.df_resid - extra, self.res_reg.df_resid)

@josef-pkt (Author, Member) Apr 22, 2014

this looks wrong: both reg and unreg have the extra parameter.
L1 counts the shape parameter in df_model and df_resid, NegativeBinomial doesn't.
?

@josef-pkt (Member, Author) commented Apr 22, 2014

I'm giving up on #1624 for this PR

temporary solution:
add k_extra as an attribute to the NegativeBinomial model, and adjust df_model and df_resid in L1CountResults so the extra parameter isn't included.
That leaves df_model and df_resid at their current non-L1 definition for nb1 and nb2 (see the sketch below).
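A minimal, self-contained sketch of that accounting; l1_degrees_of_freedom is a hypothetical helper for illustration, not the code in the PR:

import numpy as np

def l1_degrees_of_freedom(trimmed, nobs, k_extra=0):
    """Hypothetical helper: df_model and df_resid for an L1 fit.

    trimmed is a boolean mask of parameters set to zero by L1 trimming;
    k_extra counts extra shape parameters (1 for nb1/nb2) that should
    not be included in df_model.
    """
    nnz_params = int((~np.asarray(trimmed)).sum())
    df_model = nnz_params - 1 - k_extra   # exclude constant and shape parameter
    df_resid = float(nobs - nnz_params) + k_extra
    return df_model, df_resid

# 5 exog coefficients (incl. constant) + 1 shape parameter, 2 trimmed to zero
print(l1_degrees_of_freedom([False, False, False, True, True, False],
                            nobs=100, k_extra=1))  # (2, 97.0)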

@josef-pkt (Member, Author) commented Apr 22, 2014

TravisCI fails in all versions.
Looks like an optimization problem in the unregularized fit with nb2, which gets a nan in params.
The second failure is a small precision issue.

two possible problems in this PR for optimization:

  • we use the full Poisson fit as starting values, which might not be very good if there are many extra variables (see the sketch after this list).
  • we don't transform alpha during optimization; I haven't checked whether we avoid negative alpha.
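A sketch of that Poisson-based starting-value strategy, assuming the randhie data; the penalty weight and the 0.1 starting value for the shape parameter are arbitrary illustrative choices:

import numpy as np
import statsmodels.api as sm

rand_data = sm.datasets.randhie.load_pandas()
exog = sm.add_constant(np.asarray(rand_data.exog), prepend=True)
endog = np.asarray(rand_data.endog)

# beta start values from the regularized Poisson fit, plus a guess for alpha
res_poi = sm.Poisson(endog, exog).fit_regularized(method='l1', alpha=0.1, disp=0)
start_params = np.append(res_poi.params, 0.1)

res_nb = sm.NegativeBinomial(endog, exog).fit_regularized(
    method='l1', alpha=0.1, start_params=start_params, disp=0)
print(res_nb.params)
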
@josef-pkt (Member, Author) commented Apr 22, 2014

still red.
The precision failure is 2405.88524487 / 2405.85274708 - 1 = 1.3507805097123793e-05 ("amended").
Why do we check f_test anyway? Just to make sure some results are correct (we use the normal distribution).

No idea about the optimization problem; which param is nan?

@josef-pkt (Member, Author) commented Apr 22, 2014

All green when I cheat with the start_params.

@josef-pkt referenced this pull request Apr 22, 2014: Poisson start_params #1511 (Open)

@josef-pkt (Member, Author) commented Apr 22, 2014

Ready to merge; I leave the start_params to another PR.

According to the network graph, this didn't branch off from master; it might need a rebase before merging.

@josef-pkt added PR labels May 22, 2014

@josef-pkt added this to the 0.6 milestone May 22, 2014

@coveralls commented Jul 21, 2014

Coverage Status

Coverage increased (+1.63%) when pulling 0739786 on josef-pkt:negbin_fit_regularized into b52bc09 on statsmodels:master.

@josef-pkt (Member, Author) commented Jul 21, 2014

That's better: after standardizing exog (mean=0, std=1), TravisCI manages to estimate correctly. No more exceptions.
Note: the data transformation changes the penalization here because I didn't correct for it by adjusting alpha (see the sketch below).

#1649 is an example with a convergence or optimization failure.

After another review of this PR, it is ready to merge.
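A sketch of that standardization, assuming the randhie data; note that rescaling exog changes the effective L1 penalty on each coefficient unless alpha is rescaled to match:

import numpy as np
import statsmodels.api as sm

rand_data = sm.datasets.randhie.load_pandas()
rand_exog = np.asarray(rand_data.exog)

# standardize each column to mean 0, std 1, then prepend the constant
rand_exog_st = (rand_exog - rand_exog.mean(0)) / rand_exog.std(0)
exog = sm.add_constant(rand_exog_st, prepend=True)

res = sm.NegativeBinomial(np.asarray(rand_data.endog), exog).fit_regularized(
    method='l1', alpha=0.1, disp=0)
print(res.params)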

            full_output=full_output, disp=disp, callback=callback,
            alpha=alpha, trim_mode=trim_mode, auto_trim_tol=auto_trim_tol,
            size_trim_tol=size_trim_tol, qc_tol=qc_tol, **kwargs)
        if method in ['l1', 'l1_cvxopt_cp']:

@josef-pkt (Author, Member) Jul 21, 2014

input/argument checking should be at the beginning before we calculate anything.


        if self.loglike_method.startswith('nb') and (np.size(alpha) == 1 and
                alpha != 0):
            # don't penalize alpha if alpha is scalar

@josef-pkt (Author, Member) Jul 21, 2014

2 different alphas here: the L1 penalty weight and the NB shape parameter!

alpha_p = alpha[:-1] if (self.k_extra and np.size(alpha) > 1) else alpha
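A small sketch of that scalar-alpha handling; expand_penalty is a hypothetical helper for illustration, not the function in the PR:

import numpy as np

def expand_penalty(alpha, k_exog, k_extra):
    """Hypothetical helper: build a per-parameter L1 penalty vector."""
    if np.size(alpha) == 1 and alpha != 0 and k_extra > 0:
        # scalar penalty weight: apply it to the beta coefficients only and
        # leave the extra shape parameter(s) unpenalized
        return np.concatenate([float(alpha) * np.ones(k_exog), np.zeros(k_extra)])
    return np.asarray(alpha)

print(expand_penalty(0.5, k_exog=4, k_extra=1))  # [0.5 0.5 0.5 0.5 0. ]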

self._transparams = False
if start_params == None:

@josef-pkt (Author, Member) Jul 21, 2014

is None (use 'start_params is None' instead of '== None')

alpha=0, trim_mode='auto', auto_trim_tol=0.01, size_trim_tol=1e-4,
qc_tol=0.03, **kwargs):

if self.loglike_method.startswith('nb') and (np.size(alpha) == 1 and

@josef-pkt (Author, Member) Jul 21, 2014

We now have self.k_extra as a counter for extra params; it can be used instead of self.loglike_method.startswith('nb').

start_params[-1] = cls.res_unreg.params[-1]

mod_reg = sm.NegativeBinomial(rand_data.endog, rand_exog)
cls.res_reg = mod_reg.fit_regularized(#start_params=start_params,

@josef-pkt (Author, Member) Jul 21, 2014

commented-out start_params: clean up.
start_params are not used here anymore.

@coveralls commented Jul 22, 2014

Coverage Status

Coverage increased (+0.02%) when pulling 6544287 on josef-pkt:negbin_fit_regularized into 40bd7e0 on statsmodels:master.

josef-pkt added a commit that referenced this pull request Jul 22, 2014

Merge pull request #1623 from josef-pkt/negbin_fit_regularized
BUG Negbin fit regularized missing

@josef-pkt merged commit a40711c into statsmodels:master Jul 22, 2014

2 checks passed

continuous-integration/appveyor: AppVeyor build succeeded
continuous-integration/travis-ci: The Travis CI build passed

@josef-pkt deleted the josef-pkt:negbin_fit_regularized branch Jul 22, 2014

PierreBdR pushed a commit to PierreBdR/statsmodels that referenced this pull request Sep 2, 2014
