ENH: Allow start_params in GLM #1603

Merged
jseabold merged 4 commits into statsmodels:master from jseabold:glm-start-params on Apr 18, 2014

@jseabold
Member

No description provided.

@josef-pkt josef-pkt and 1 other commented on an outdated diff Apr 18, 2014
statsmodels/genmod/generalized_linear_model.py
@@ -338,7 +338,8 @@ def predict(self, params, exog=None, linear=False):
return self.family.fitted(np.dot(exog, params) + exposure + \
offset)
- def fit(self, maxiter=100, method='IRLS', tol=1e-8, scale=None):
+ def fit(self, maxiter=100, method='IRLS', tol=1e-8, scale=None,
+ start_params=None):
@josef-pkt
josef-pkt Apr 18, 2014 Member

We don't guarantee that keyword arguments stay in the same position, so move start_params to second place, right after self.

@jseabold
jseabold Apr 18, 2014 Member

News to me. Ok.
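
The point about keyword position can be illustrated with a small sketch. The two functions below are hypothetical stand-ins, not the statsmodels method: they only echo the arguments they receive, to show that moving start_params to the front is harmless for callers who pass keywords by name but changes the meaning of a positional call.

```python
def fit_before(maxiter=100, method='IRLS', tol=1e-8, scale=None):
    """Hypothetical stand-in for the old signature; echoes its arguments."""
    return {'maxiter': maxiter, 'method': method, 'tol': tol, 'scale': scale}


def fit_after(start_params=None, maxiter=100, method='IRLS', tol=1e-8,
              scale=None):
    """Hypothetical stand-in with start_params moved to the front."""
    return {'start_params': start_params, 'maxiter': maxiter,
            'method': method, 'tol': tol, 'scale': scale}


# Callers that name their keywords are unaffected by the reordering.
by_name_before = fit_before(maxiter=50)
by_name_after = fit_after(maxiter=50)

# A positional call now binds the first value to start_params instead.
by_position_after = fit_after(50)
```

Only callers who relied on the position of maxiter would see a change, which is the behaviour the review comment says is not guaranteed.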

@josef-pkt
Member

I think we should also adjust the fit history with the start_params
history = dict(params = [None, None], deviance=[np.inf,dev])

@jseabold
Member

Should I give this a pep-8 scrubbing in this PR?

@josef-pkt
Member

> Should I give this a pep-8 scrubbing in this PR?

Yes no problem (just a separate commit). There are no outstanding PRs for this module AFAIR.

@josef-pkt
Member

A better unit test: check that convergence requires at most one iteration when starting from the optimal parameters. I don't think we need to test everything again.
(Some developers complain about the time it takes to run the test suite.)
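
A minimal sketch of that test idea, assuming a hand-rolled NumPy IRLS for a Poisson model with log link rather than the statsmodels GLM class. The helpers irls_poisson and poisson_deviance, the default start values, and the simulated data are all hypothetical. The sketch also seeds the convergence check with the deviance at start_params, in the spirit of the fit-history comment earlier in the thread; without that, the first iteration is always compared against np.inf and a warm start can never stop after one step.

```python
import numpy as np


def poisson_deviance(endog, mu):
    """Poisson deviance, treating y * log(y / mu) as 0 when y == 0."""
    ratio = np.where(endog > 0, endog / mu, 1.0)
    return 2.0 * np.sum(endog * np.log(ratio) - (endog - mu))


def irls_poisson(exog, endog, start_params=None, maxiter=100, tol=1e-8):
    """Toy IRLS for a Poisson GLM with log link (hypothetical helper).

    Returns (params, n_iter), where n_iter counts weighted least-squares steps.
    """
    if start_params is None:
        # Assumed default start: pull each observation toward the sample mean.
        mu = (endog + endog.mean()) / 2.0
        dev_old = np.inf
    else:
        # The inverse link (exp) maps the linear predictor to the mean.
        mu = np.exp(exog @ start_params)
        # Seed the convergence check with the deviance at start_params.
        dev_old = poisson_deviance(endog, mu)
    eta = np.log(mu)
    for n_iter in range(1, maxiter + 1):
        z = eta + (endog - mu) / mu   # working response for the log link
        sw = np.sqrt(mu)              # square root of the IRLS weights
        params = np.linalg.lstsq(exog * sw[:, None], z * sw, rcond=None)[0]
        eta = exog @ params
        mu = np.exp(eta)
        dev = poisson_deviance(endog, mu)
        if abs(dev - dev_old) < tol:
            break
        dev_old = dev
    return params, n_iter


# Simulated data (an assumption for illustration, not from the PR).
rng = np.random.default_rng(0)
exog = np.column_stack([np.ones(200), rng.normal(size=200)])
endog = rng.poisson(np.exp(exog @ np.array([0.5, 0.3]))).astype(float)

# Cold fit, then a warm fit started from the converged parameters.
params_cold, n_cold = irls_poisson(exog, endog)
params_warm, n_warm = irls_poisson(exog, endog, start_params=params_cold)
```

A test along these lines would assert that n_warm is at most 1 and that the warm and cold parameters agree, which is much cheaper than re-running a full results comparison.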

@coveralls

Coverage Status

Changes Unknown when pulling f510afa on jseabold:glm-start-params into * on statsmodels:master*.

@jseabold
Member

Time to copy and paste a test was more important here.

|10 $ nosetests test_glm.py:Test_start_params
.............
----------------------------------------------------------------------
Ran 13 tests in 0.008s

OK

@coveralls

Coverage Status

Changes Unknown when pulling 77851b8 on jseabold:glm-start-params into * on statsmodels:master*.

@jseabold
Member

I think this should be mu = self.family.fitted(np.dot(exog, start_params)), so that the start values go through the inverse transform first and we're not handing linear predictors to a family that expects the transformed mu. With that change we can fit problems like #1604 by giving start_params, though it doesn't fix the bad default start_params or the silently failing fit.
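
To make the mu versus eta distinction concrete, here is a small NumPy sketch. It assumes a logit link purely for illustration (the thread does not say which family is involved), and logit_inverse is a hypothetical helper playing the role that family.fitted plays in the comment above.

```python
import numpy as np


def logit_inverse(eta):
    """Inverse logit link: maps any real linear predictor into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-eta))


# Illustrative design matrix and starting values (assumptions, not PR data).
exog = np.array([[1.0, -2.0],
                 [1.0, 0.0],
                 [1.0, 3.0]])
start_params = np.array([0.5, 1.0])

eta = exog @ start_params   # linear predictor: unbounded real values
mu = logit_inverse(eta)     # mean on the response scale: probabilities

# Applying the link to mu recovers eta, so the two scales are consistent.
eta_roundtrip = np.log(mu / (1.0 - mu))
```

Passing eta where mu is expected would hand values outside (0, 1) to code that treats them as probabilities, which is the mistake the transform avoids.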

@josef-pkt
Member

I think you are right about mu = self.family.fitted(np.dot(exog, start_params)).
I never remember which is mu and which is eta (one-letter Greek names).
Also, a test that doesn't check the number of iterations might not catch this.

@jseabold
Member

I added the test from #1604 which fails without the correct transform. I'm not really sure what else to do about the degenerate case here where the probability on one of the observations is pushed to 1 or 0.

This should be good to merge, and I'll leave that as an open problem. Hilbe comments in that blog post that they have an improved IRLS algorithm that doesn't fail, but I don't know what it is.

@coveralls

Coverage Status

Changes Unknown when pulling d52f144 on jseabold:glm-start-params into * on statsmodels:master*.

@jseabold jseabold merged commit d3c2add into statsmodels:master Apr 18, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
@jseabold jseabold deleted the jseabold:glm-start-params branch Apr 18, 2014
@josef-pkt josef-pkt added the PR label Aug 11, 2014