streamline linear algebra for linear model #1081
df_model uses a separate svd to get rank of exog
we can use svd/pinv or qr directly to find the rank of exog, but it would need to be moved to fit instead of initialize.
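A minimal sketch of the idea, not the statsmodels implementation: one SVD can return both the pseudoinverse and the numerical rank, so no separate rank computation is needed. The helper name `pinv_with_rank`, the tolerance rule (the same form that `np.linalg.matrix_rank` uses by default), and the example data are assumptions made for illustration.

```python
import numpy as np


def pinv_with_rank(x, tol=None):
    """Return the pseudoinverse of x and its numerical rank from one SVD.

    Hypothetical helper for illustration only.
    """
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    if tol is None:
        # Assumed cutoff: largest singular value * max(shape) * machine epsilon
        tol = s.max() * max(x.shape) * np.finfo(s.dtype).eps
    keep = s > tol
    rank = int(keep.sum())
    # Invert only the singular values that are numerically nonzero
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    pinv = (vt.T * s_inv) @ u.T
    return pinv, rank


rng = np.random.default_rng(0)
exog = rng.standard_normal((50, 3))
# Append a column that is an exact linear combination of the first two
exog = np.column_stack([exog, exog[:, 0] + exog[:, 1]])

pinv_exog, rank = pinv_with_rank(exog)
print(rank, pinv_exog.shape)
```

With this pattern the rank is a by-product of the decomposition that is already needed for the parameter estimates.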
do we need to check jacobian during WLS iteration in GLM and RLM ?
Related: For nonlinear models we know the rank of the Jacobian only in fit, not during
I fixed a minor duplicate, but I see the following cases as easily improvable:
@josef-pkt if this is what you had in mind, I can submit a PR that does all three.
Thanks. That sounds good to me. We've punted on this, but we might also think about switching to QR by default. I can't remember the exact performance comparisons, but I think QR is
@guyrt Thanks for looking into this.
Yes, the 3 points are what I had in mind.
A PR would be very good for this. I don't expect we have problems in linear models.
As a later followup, we can look at how it affects other possible subclasses, and classes that reuse the linear model in their estimation.
One difference, after moving df_model to the pinv in fit, is that the rank will be based on the whitened exog, wexog, which might be a more appropriate choice if df_model based on exog is not the same as df_model based on wexog.
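A contrived sketch of how the two ranks can differ, assuming WLS-style whitening in which each row of exog is scaled by the square root of its weight. The data and the zero weights are made up for illustration; they are not taken from the issue.

```python
import numpy as np

# Constant plus a dummy variable that is nonzero only in the last two rows
exog = np.column_stack([np.ones(6), [0, 0, 0, 0, 1, 1]])

# Assumed weights: the two rows where the dummy is 1 get zero weight
weights = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0])

# Whitening for weighted least squares: scale each row by sqrt(weight)
wexog = np.sqrt(weights)[:, None] * exog

rank_exog = np.linalg.matrix_rank(exog)
rank_wexog = np.linalg.matrix_rank(wexog)
print(rank_exog, rank_wexog)
```

Here the dummy column vanishes after whitening, so a df_model computed from wexog would be smaller than one computed from exog.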
Switching to QR as the default is a separate issue, and not just an internal change: it has consequences for the user that are backwards incompatible. We can switch to it, but it will break with singular matrices, which are currently allowed.
More recent scipy versions (>=0.10 IIRC) have rank-revealing pivoting QR.
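A minimal sketch of reading the rank off a pivoted QR, using `scipy.linalg.qr` with `pivoting=True`. The tolerance rule and the example data are assumptions for illustration, and the diagonal of R is only a heuristic rank indicator, not a guarantee.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
exog = rng.standard_normal((50, 3))
# Append a column that is an exact linear combination of the first two
exog = np.column_stack([exog, exog[:, 0] + exog[:, 1]])

# With column pivoting, exog[:, piv] == q @ r and |diag(r)| is non-increasing
q, r, piv = linalg.qr(exog, mode="economic", pivoting=True)

diag_r = np.abs(np.diag(r))
# Assumed cutoff, analogous to the usual SVD-based rank tolerance
tol = diag_r[0] * max(exog.shape) * np.finfo(float).eps
rank = int((diag_r > tol).sum())
print(rank, piv)
```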
Ok, I've pushed a set of changes that reduces the number of calls to svd or eig from 3 to 1.
Ignore my previous comment about QR: computing the svd of R is very cheap. In particular, since R is square in the number of variables, the complexity of svd does not depend on nobs.
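A small sketch of why this works, with made-up data: in the reduced QR decomposition, Q has orthonormal columns, so R has the same singular values as the original matrix, and R is only k by k regardless of nobs. The tolerance rule is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
exog = rng.standard_normal((50, 3))
# Append a column that is an exact linear combination of the first two
exog = np.column_stack([exog, exog[:, 0] + exog[:, 1]])

# Reduced QR: q is (nobs, k), r is (k, k)
q, r = np.linalg.qr(exog)

# SVD of the small square factor; its cost depends on k, not on nobs
s_r = np.linalg.svd(r, compute_uv=False)

# Singular values of the full matrix, computed only for comparison
s_x = np.linalg.svd(exog, compute_uv=False)

# Assumed cutoff for the numerical rank
tol = s_r.max() * max(exog.shape) * np.finfo(float).eps
rank = int((s_r > tol).sum())
print(r.shape, rank)
```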