Negative Binomial Rebased #795
Conversation
I think this is ready to merge. I'll leave it for another look though.
Giving the docs a once-over. Any idea why the equations under loglike would show up as right-justified? My LaTeX-fu is not strong, and I don't recall ever seeing this with sphinx before.
What are the many trailing backslashes for? In general, equations align on the equal sign in a list of equations (at least in Scientific Workplace).
BTW: when there is this much LaTeX, I sometimes use r''' to make the docstring a raw string and avoid doubling the backslashes.
Trailing backslashes indicate a newline. I can put them in an align environment and they should align; I just found it odd that they show up right-aligned. Never seen this, though I guess it might show up in other places I just don't know about. Yeah, I changed it to a raw string.
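For reference, a minimal sketch of what that docstring fix might look like; the equation shown is a placeholder for illustration, not the actual negative binomial log-likelihood:

```python
# Minimal sketch: a raw docstring (r""") so LaTeX backslashes need no
# doubling, plus an align environment so equations line up on "=" rather
# than rendering right-justified. The equation is a placeholder only.
def loglike(params):
    r"""Log-likelihood (placeholder equation, for illustration only).

    .. math::
       :nowrap:

       \begin{align}
       \ln L &= \sum_i \ln f(y_i \mid \mu_i) \\
             &= \sum_i \left[y_i \ln \mu_i - \mu_i - \ln(y_i!)\right]
       \end{align}
    """
```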
Ok. Now it should be good to go. Outstanding issue: the NB1 fit is slow. I think we should look into Cython for approx_hess, etc. Presumably these get called a lot, and we might get decent speed-ups in the optimizations that use them. Need to do some profiling first though. Merge?
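Before reaching for Cython, a quick profile can confirm whether helpers like approx_hess actually dominate the runtime. A hedged sketch, with `model.fit` standing in for the actual NB1 fit call:

```python
import cProfile
import io
import pstats

def profile_fit(model):
    # Profile a fit() call and report the ten most expensive functions
    # by cumulative time; `model` is any object with a fit() method.
    pr = cProfile.Profile()
    pr.enable()
    model.fit()
    pr.disable()
    out = io.StringIO()
    pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()
```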
hess_arr[i,j] = np.sum(-exog[:,i,None]*exog[:,j,None] *\
                       const_arr, axis=0)
hess_arr[i,j] = np.sum(-exog[:,i,None] * exog[:,j,None] *
                       const_arr, axis=0)
hess_arr[np.triu_indices(dim, k=1)] = hess_arr.T[np.triu_indices(dim,
Getting the triu_indices into a temp variable would make this nicer (and saves calculating them twice).
Done. I just rebased, so I'm not going to push until right before I merge if you're still looking things over.
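The suggested refactor might look like this; the array contents are arbitrary dummy data, only the indexing pattern comes from the diff:

```python
import numpy as np

# Mirror the lower triangle into the upper triangle, computing the
# indices once instead of twice. hess_arr here is dummy data standing
# in for the actual hessian array.
dim = 4
hess_arr = np.tril(np.arange(dim * dim, dtype=float).reshape(dim, dim))
tri_idx = np.triu_indices(dim, k=1)
hess_arr[tri_idx] = hess_arr.T[tri_idx]
```

Fancy indexing on the right-hand side makes a copy before assignment, so reading from the transpose view of the same array is safe here.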
The hessian looks painful (did you use sympy to help?). Why did you switch back to bfgs? Why does the transform not affect the hessian (except for the call to transform)? You can merge whenever, after Travis has reported in.
Yes, the Hessian was painful. I used sympy in the end for d2 lnL / (dalpha dalpha) for NB1. I kept screwing up somewhere and got tired of trying. I should've used it from the start, but luckily I'd already done the confusing indexing for the Hessian for NB2, so that wasn't too bad. Newton doesn't converge well in the NB1 case for alpha. I don't see why, but it's likely a problem with the step-size in our newton. The Hessian is calculated in Model.fit, so it gets lnalpha but returns the Hessian in terms of alpha, so it's fine. Actually, I can remove the double calculation for NB1. It's not needed anymore.
Put another way, the Hessian is now always the hessian of lnL wrt alpha, even if you give it lnalpha or alpha, provided that transparams correctly handles lnalpha. Actually, come to think of it, we need to set transparam in fit not
Ok, I guess I roughly understand: transparams controls where we evaluate the hessian, but the hessian is always wrt alpha. Nice work (and quite a lot of it).
Maybe back to this: Newton expects that the hessian is with respect to lnalpha. If we want to use the hessian during optimization, then we need it wrt lnalpha, not alpha. For cov_params we use the hessian wrt alpha, as you have it now (with transparams).
D'oh. Yeah, that makes complete sense. So...we could set _transparams=False for Newton; then we'll get alpha in terms of exp(alpha). This works fine and fast in the test case. Given that the sign is correct in the Hessian, my intuition says that it will always converge to the exp(alpha) with alpha > 1. It's kind of a hack, though. Thoughts?
Yeah, it seems to work okay. I think it's alright for now, and we can leave the default as bfgs. We'll need to support transforms like this in a more general way at some point, but I'd rather leave it for the optimize refactor I've been wanting to do anyway.
I think using the chain rule with alpha = exp(lnalpha), we should be able to get the transformation, i.e. the additional terms for the hessian wrt lnalpha. With some quick calculation, the cross derivative would just need to be multiplied by alpha, but the second derivative wrt lnalpha now includes an additional first-derivative term:

d2 f / (d lnalpha d lnalpha) = d2 f / (d alpha d alpha) * alpha**2 + d f / d alpha * alpha

where d2 f / (d alpha d alpha) is the current hessian[-1, -1]. Not sure, no intuition on the second-derivative wrt lnalpha adjustment.
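A quick numerical sanity check of that chain-rule identity, using an arbitrary smooth f as a stand-in for the log-likelihood:

```python
import numpy as np

# Verify d2f/dlnalpha2 == d2f/dalpha2 * alpha**2 + df/dalpha * alpha
# for f(a) = a**3 - 2a (an arbitrary stand-in, not the NB log-likelihood).
def f(a):
    return a**3 - 2.0 * a

lnalpha = 0.3
alpha = np.exp(lnalpha)

# analytic right-hand side: f''(a) = 6a, f'(a) = 3a**2 - 2
rhs = 6.0 * alpha * alpha**2 + (3.0 * alpha**2 - 2.0) * alpha

# central finite difference of g(l) = f(exp(l)) wrt lnalpha
h = 1e-5
g = lambda l: f(np.exp(l))
lhs = (g(lnalpha + h) - 2.0 * g(lnalpha) + g(lnalpha - h)) / h**2
```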
The hack seems to work okay. We can fix it for real later unless you want to submit a patch. I'm not going to go back through the math; I have to change gears for some work I have due next week.
It's clean/correct but fragile: if we have other optimizers that use the hessian, then they would also need to be included. What about the score/gradient? Does it calculate wrt lnalpha depending on transparams? (I haven't checked yet.)
Newton-conjugate gradient is the only other one that uses the Hessian and needs to be accounted for.
ncg is still pretty robust. It works both ways, but it's better if it's exempted like newton.
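A hedged sketch of the workaround discussed above, exempting the hessian-consuming optimizers; the attribute name `_transparams` follows the discussion, but the class and fit signature are simplified stand-ins, not the actual statsmodels API:

```python
# Skip the exp(lnalpha) transform for optimizers that consume the
# analytic hessian (newton, ncg), so the hessian matches the
# parameterization being stepped in. Simplified stand-in, not the
# real model class.
HESSIAN_OPTIMIZERS = ("newton", "ncg")

class NegativeBinomialSketch:
    def fit(self, method="bfgs"):
        self._transparams = method not in HESSIAN_OPTIMIZERS
        return self._transparams
```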
I'm going to go ahead and merge shortly. We can file a ticket for the score/hessians, but I'm not going to work on it now. Thinking about it some more, it does seem like the score and hessians should return wrt lnalpha, since this is what the optimizer needs to step in the direction of. It still seems to work decently for the gradient-based optimizers, which is what's confusing me. We might get a bit faster convergence and slightly higher-precision results if we fix it.
Hmm, actually, never mind about merging. This may be why we get such low precision on alpha in most of the tests, and it should probably be fixed.
Are you getting good precision with newton? Without the transform it should be correct and converge to a nice result (if or when it converges).
Now that I'm looking at it briefly again, I think it's the test results, not us. We agree with Stata to high precision, just not with the R results from the unpublished package. I'll update the parameters there and see. It's likely that R is using a hessian approximation or that the optimization needed to be tweaked.
Yeah, we agree with Stata up to 8 decimal places. So I'll merge and file a ticket to look at the score/hessian when I get back into this 3 years down the road...?
ENH: Add Negative Binomial Model.
Rebased. Fixed merge conflicts. Slight style refactor. Renamed ll -> loglike_method. Fixed alpha vs lnalpha standard error.