
iolib not found #66

Closed
agramfort opened this issue Sep 1, 2011 · 15 comments

@agramfort

/Users/alex/local/lib/python2.7/site-packages/scikits/statsmodels/genmod/generalized_linear_model.py in summary(self, yname, xname, title, returns)
659 """
660 import time as Time
--> 661 from iolib import SimpleTable
662 from stattools import jarque_bera, omni_normtest, durbin_watson
663

ImportError: No module named iolib

It looks like a relative-import problem.

@josef-pkt
Member

The short answer: the import path needs to be adjusted, and I think Skipper already did this in the pandas-integration branch.

The longer answer: Summary() for the other models is what Vincent was working on at the end in his branch. I haven't had time to look at it (it is still on Launchpad and needs a manual merge), and I think there are no tests for any summary methods in the test suite.

I still need to check the status of summary for GLM and RLM.

@josef-pkt
Member

from scikits.statsmodels.iolib import SimpleTable
from scikits.statsmodels.stats.stattools import jarque_bera, omni_normtest, durbin_watson

but the version in 0.3 looks unfinished and has two extra '=='

@agramfort
Author

Thanks for the feedback. I was looking for a way to get p-values on the regression coefficients out of a logistic regression. If you have a simple solution for an urgent need, that would be great.

@josef-pkt
Member

results.pvalues? For example, binom_results.tvalues and binom_results.pvalues, using examples/example_glm.py.

The parameter summary table works for GLM after fixing the import path.

It should also be available using Logit in the discrete module.
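Under the hood, those attributes are two-sided Wald tests on the coefficients. A minimal numpy/scipy sketch of the computation (the `params`, `bse`, and `df_resid` values here are made-up stand-ins for a fitted model's output, not statsmodels internals):

```python
import numpy as np
from scipy import stats

# made-up stand-ins for a fitted model's coefficients and standard errors
params = np.array([2.5, -0.3, 0.8])
bse = np.array([0.5, 0.4, 0.2])
df_resid = 20  # illustrative residual degrees of freedom

tvalues = params / bse                               # Wald statistics
pvalues = stats.t.sf(np.abs(tvalues), df_resid) * 2  # two-sided p-values

print(tvalues)
print(pvalues)
```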

@ghost assigned josef-pkt Sep 5, 2011
@josef-pkt
Member

I'm working on it here, but it will still take some time:
https://github.com/josef-pkt/statsmodels/commits/summary-refactoring

@josef-pkt
Member

While comparing summary methods with R, I saw that the p-values in R are based on the normal distribution, while the p-values in statsmodels GLM are based on the t distribution; the t-values are identical.

Example from the R help file.

Values in R:

>>> stats.norm.sf(np.abs(res.tvalues))*2
array([  5.42677102e-71,   1.00000000e+00,   1.00000000e+00,
         2.46471164e-02,   1.28486515e-01])

Values in statsmodels:

>>> stats.t.sf(np.abs(res.tvalues), res.df_resid)*2
array([  5.83392507e-05,   1.00000000e+00,   1.00000000e+00,
         8.79477411e-02,   2.03120034e-01])

Very small sample: 9 observations, 5 regressors including the constant, so the difference between t and norm is pretty large.

>>> res.df_resid
4
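The size of that t-versus-normal gap for a fixed t-value can be checked directly with scipy (the t-statistic and df values here are illustrative, not taken from the example above):

```python
from scipy import stats

tval = 2.5  # illustrative t-statistic
for df in [4, 30, 1000]:
    p_t = stats.t.sf(tval, df) * 2      # t-based two-sided p-value
    p_norm = stats.norm.sf(tval) * 2    # normal-based two-sided p-value
    print(df, round(p_t, 4), round(p_norm, 4))
```

With df = 4 the two p-values differ by a factor of about five; by df = 1000 they agree to three decimals.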

@agramfort
Author

Maybe a naive question, but it seems I get crazy p-values with:

import numpy as np
import scikits.statsmodels.api as sm
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target
X = X[y != 2]
y = y[y != 2]
X = sm.add_constant(X)
results = sm.GLM(y, X, family=sm.families.Binomial()).fit()
print results.pvalues

In [42]: run test_logreg_pvalues.py
[ 0.9997972 0.99971177 0.99951518 0.99963353 0.99996525]

What am I missing? Thanks.

@josef-pkt
Member

I don't seem to have scikits.learn available right now on my computer.
Looking at the csv files, the effect of the features looks big. If you check the means between groups 0 and 1, they all look pretty different. So I would expect the p-values to be small, but 0.999 is very large, much larger than I would expect.

Just one guess: statsmodels' binomial has a problem if there is perfect prediction. (I will have to look up the details for this case.) Are there any observations that are misclassified? Or what is the fraction of misclassified observations?

I don't have any other ideas until I look at the data.
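The fraction of misclassified observations can be checked directly from the fitted probabilities. A minimal sketch (the arrays are illustrative; `fitted` stands in for something like `results.fittedvalues`):

```python
import numpy as np

# illustrative fitted probabilities and observed 0/1 labels
fitted = np.array([0.01, 0.20, 0.95, 0.99])
y = np.array([0, 0, 1, 1])

# threshold at 0.5 and compare against the observed labels
misclassified = (fitted > 0.5).astype(int) != y
print(misclassified.sum(), misclassified.mean())  # 0 0.0
```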

@agramfort
Author

you can access the data here:

http://mldata.org/repository/data/viewslug/iris/

@josef-pkt
Member

>>> np.max(np.abs(results.fittedvalues - y))
3.3864987480924924e-09

Looks like a perfect fit.

I also tried discrete.Logit, but the numbers there don't seem to make sense.

I just had a very fast look, so it's still possible that something else is going on.

In the complete-separation case the likelihood function has problems: it is not finite, or has the wrong curvature.
We discussed this but haven't done anything about it. If I remember correctly, except for warning the user (and stopping the maximization) there is not much we can do.

@agramfort
Author

Thanks for taking a look. A warning would definitely be helpful. Out of curiosity, how does R behave in this degenerate case? They might have a trick.

@josef-pkt
Member

I never looked at this case in R.
Bruce provided links to the SAS documentation: they stop the maximization after detecting the problem, without waiting for it to go off to infinity or hit maxiter for the optimization, and they warn the user.
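A SAS-style check along those lines could compare the fitted probabilities against the observed outcomes and warn before reporting results. A sketch (the function name and tolerance are my own assumptions, not statsmodels API):

```python
import warnings
import numpy as np

def check_perfect_prediction(fitted, y, tol=1e-8):
    # warn if the fitted probabilities reproduce the outcomes exactly,
    # which signals complete separation
    if np.max(np.abs(fitted - y)) < tol:
        warnings.warn("Perfect separation detected; "
                      "maximum likelihood results are not reliable.")
        return True
    return False

y = np.array([0, 0, 1, 1])
perfect = np.array([1e-10, 2e-10, 1 - 1e-10, 1 - 1e-10])
mixed = np.array([0.2, 0.3, 0.7, 0.8])

print(check_perfect_prediction(perfect, y))  # True (with a warning)
print(check_perfect_prediction(mixed, y))    # False
```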

@josef-pkt
Member

The p-values look more reasonable after misclassifying some observations:

pvalues with y[:5] = 1
[ 0.80703843  0.39039945  0.18388126  0.91594413  0.86159542]
pvalues with y[:10] = 1
[ 0.32384945  0.71572876  0.07024508  0.95266349  0.47546027]

@agramfort
Author

Better indeed. It would be great to reproduce the SAS behavior.

@josef-pkt
Member

Summary refactoring has been merged; see #76 and related tickets.

For perfect separation, see old ticket #39.
I added a warning text to the summary method of Logit and Probit.

A full resolution with a warning or exception needs more refactoring; see the perfect-prediction branch.
