add_constant incorrectly detects constant column #1025
@ajmarks Out of curiosity: What's the reason that you are z-scoring?
I have wondered several times whether we should offer it, or whether we should make standardized beta coefficients available in the results class, but until now I haven't seen any strong reason to do it. (Not strong enough to put it at the top of my priorities, even though I have large parts of the code.)
When I'm running exploratory multiple regressions, I like to do it all z-scored so that I can see quickly how much of the variation in the endog comes from which exog. In particular, my current dataset has variables varying hugely in scale (e.g. some on the order of 1e9 and some on the order of 1e2), so this saves me from having to look up each time whether a parameter of 5e-7 is tiny or huge relative to its variable. The ironic thing is that since any linear regression with a constant will pass through [mean(y), ...mean(x_i)...]^t, the const regression parameter will be zero when it's run on z-scored data, but some of my other code assumes the constant is there.
It also helps when I want to plot a bunch of stuff next to each other. For dataframes, it's super easy to z-score:
import numpy as np

def zscore(df):
    # A zero-variance column would lead to division by zero below.
    if np.any(df.var(0) == 0):
        raise Exception("Ain't gonna work")
    return (df - df.mean()) / df.std()
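The point about the constant being zero on z-scored data can be checked numerically. This is only a sketch under stated assumptions: the data are synthetic, the names (`zscore_array`, `x_big`, `x_small`) are made up for illustration, and the fit uses plain `np.linalg.lstsq` rather than statsmodels so that it runs with NumPy alone.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed; synthetic data only
n = 200

# Two hypothetical regressors on very different scales (~1e9 and ~1e2).
x_big = rng.normal(5e9, 1e9, n)
x_small = rng.normal(300.0, 100.0, n)
y = 2e-9 * x_big + 0.01 * x_small + rng.normal(0.0, 1.0, n)

def zscore_array(a):
    # Array analogue of the DataFrame helper: center, then scale by the sample std.
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

Xz = zscore_array(np.column_stack([x_big, x_small]))
yz = zscore_array(y)

# Least-squares fit with an explicit column of ones as the constant.
design = np.column_stack([np.ones(n), Xz])
params, *_ = np.linalg.lstsq(design, yz, rcond=None)
const, slopes = params[0], params[1:]

print("const:", const)    # zero up to floating-point error
print("slopes:", slopes)  # directly comparable across regressors
```

The constant is still estimated, so downstream code that expects it keeps working; it just carries no information.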
Thanks for the explanation, I'm always interested in hearing different approaches and practices for this (which vary widely across fields).
Ok, that sounds like a case where standardized beta coefficients in the results would be useful. Although, if z-scoring is easy and you don't need the original and the standardized params at the same time, then it's not really important to have it.
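For reference, standardized betas can also be recovered from a fit on the original data by rescaling each slope with std(x_j) / std(y), which gives the same numbers as refitting on z-scored data. The sketch below illustrates that relationship with NumPy only; `standardized_betas` is a hypothetical helper, not an existing statsmodels function, and the data are synthetic.

```python
import numpy as np

def standardized_betas(slopes, X, y):
    # Hypothetical helper: rescale raw slopes by std(x_j) / std(y).
    return slopes * X.std(axis=0, ddof=1) / y.std(ddof=1)

rng = np.random.default_rng(1)  # arbitrary seed; synthetic data only
n = 300
X = np.column_stack([rng.normal(2e3, 1e3, n), rng.normal(5.0, 2.0, n)])
y = 1.0 + 3e-3 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 1.0, n)

# Fit on the original scale (constant in the first column).
raw_params, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), X]), y, rcond=None)
betas_from_raw = standardized_betas(raw_params[1:], X, y)

# Fit on z-scored data for comparison.
Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
yz = (y - y.mean()) / y.std(ddof=1)
z_params, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Xz]), yz, rcond=None)
betas_from_z = z_params[1:]

print(betas_from_raw)
print(betas_from_z)  # agrees with the line above up to rounding
```

Doing it this way keeps the original params and the standardized ones available from a single fit.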
Aside: standardized beta coefficients show how much of the variation in the endog comes from each exog only if the exog/design matrix is orthogonal; otherwise, correlation between exogs prevents an exact variance decomposition.
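A small numerical illustration of the aside, as a sketch on synthetic data: with exactly uncorrelated exogs the squared standardized betas add up to R-squared, and with correlated exogs they do not. The helper `std_betas_and_r2` and the mixing matrix are assumptions made up for this example.

```python
import numpy as np

def std_betas_and_r2(X, y):
    # Fit on z-scored data; no constant is needed because it would be zero.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    resid = yz - Xz @ beta
    r2 = 1.0 - (resid @ resid) / (yz @ yz)
    return beta, r2

rng = np.random.default_rng(2)  # arbitrary seed; synthetic data only
n = 500

# Build two columns that are exactly uncorrelated in the sample:
# center random columns, then orthonormalize them with QR.
raw = rng.normal(size=(n, 2))
Q, _ = np.linalg.qr(raw - raw.mean(axis=0))
X_orth = Q * np.sqrt(n)

# Mix the orthogonal columns so that the two exogs have correlation 0.8.
X_corr = X_orth @ np.array([[1.0, 0.8], [0.0, 0.6]])

noise = rng.normal(size=n)
coef = np.array([1.0, 0.5])

beta_o, r2_o = std_betas_and_r2(X_orth, X_orth @ coef + noise)
beta_c, r2_c = std_betas_and_r2(X_corr, X_corr @ coef + noise)

print("orthogonal:", (beta_o ** 2).sum(), "vs R^2", r2_o)  # the two agree
print("correlated:", (beta_c ** 2).sum(), "vs R^2", r2_c)  # the two differ
```

In the correlated case the gap is the cross term 2 * r12 * beta_1 * beta_2, which cannot be attributed to either exog alone.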