We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
based on a stackoverflow question
because of numerical precision problems the pinv singular treatment doesn't kick in early enough for essentially singular design matrices.
The below script return huge non-singular numbers for about a third of the runs. Note: we are printing the warning in summary()
summary()
for example
>>> np.linalg.pinv(mod.exog).dot(mod.endog) array([ 2.5 , 3.59375 , 4.296875 , 2.9140625, 3.8046875]) >>> np.linalg.pinv(mod.exog, rcond=1.5e-15).dot(mod.endog) array([ 1.6, 2.6, 3.6, 3.4, 4.4])
qr produces similar ill determined results
qr
possible solution:
# -*- coding: utf-8 -*- """ Created on Wed Sep 16 00:44:55 2015 Author: http://stackoverflow.com/questions/32599159/robustness-issue-of-statsmodel-linear-regression-ols-python """ import numpy as np import pandas as pd from statsmodels.formula.api import ols seed = np.random.randint(100000) #seed = 62802 #df_model=4, bad numbers #seed = 46269 #df_model=4, bad numbers38680 print(seed) np.random.seed(seed) nbData = 600 rand1 = np.random.uniform(size=nbData) rand2 = np.random.uniform(size=nbData) a = 1 * (rand1 <= (1.0/3.0)) b = 1 * (((1.0/3.0)< rand1) & (rand1< (4/5.0))) c = 1-b-a d = 1 * (rand2 <= (3.0/5.0)) e = 1-d weigths = [1,2,3,1,2] y = a+2*b+3*c+4*d+5*e df = pd.DataFrame({'y':y, 'a':a, 'b':b, 'c':c, 'd':d, 'e':e}) mod = ols(formula='y ~ a + b + c + d + e - 1', data=df) res = mod.fit() print(res.summary()) # try changing pinv threshold print(np.linalg.pinv(mod.exog).dot(mod.endog)) print(np.linalg.pinv(mod.exog, rcond=1.5e-15).dot(mod.endog))
The text was updated successfully, but these errors were encountered:
No branches or pull requests
based on a stackoverflow question
because of numerical precision problems the pinv singular treatment doesn't kick in early enough for essentially singular design matrices.
The below script return huge non-singular numbers for about a third of the runs.
Note: we are printing the warning in
summary()
for example
qr
produces similar ill determined resultspossible solution:
The text was updated successfully, but these errors were encountered: