ml_lag example failing? #33
I should add that the example is copied directly from this test, and that by "every time" I mean in new environments on three different machines (Linux/macOS).
I can replicate this locally.
I can look into this today or tomorrow. Just out of curiosity, why are you using ML_Lag instead of GM_Lag? We didn't pay much attention to the ML code because the only situation in which GM is not preferable is when the data are small and happen to be normally distributed, which is quite a rare combination of events. =P
I wasn't actually after ML_Lag specifically, just following up on https://gist.github.com/knaaptime/9b56bf676f116472efc20cde03e6c4ec to see if I could estimate any model (and it turns out no: almost all the model classes use pearsonr, so this bug affects everything I tried except OLS).
@pedrovma it doesn't matter; it happens with GM_Lag, GM_Error, GM_Combo, ML_Error... anywhere we use pearsonr. @knaaptime, to really reproduce the issue, use the following:

```python
from spreg import BaseML_Lag
from scipy.stats import pearsonr

basemodel = BaseML_Lag(y, x, w=w)
pearsonr(basemodel.y, basemodel.predy)
```

If you flatten the arrays, it works:

```python
pearsonr(basemodel.y.flatten(), basemodel.predy.flatten())
```

It looks like scipy has started getting strict about column vectors versus flat vectors. This is indicated in the documentation and appears to have been the case for a while, but I'm not sure why the behavior finally changed.
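(For illustration, a self-contained sketch of the same shape issue using synthetic data, so it can be run without fitting a model; the exact error text depends on the scipy version.)

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
y = rng.normal(size=(20, 1))                   # (n, 1) column vector, as spreg stores it
predy = y + rng.normal(scale=0.1, size=(20, 1))

# pearsonr(y, predy)                           # with (n, 1) inputs this reportedly fails on scipy 1.4.x
r, p = pearsonr(y.flatten(), predy.flatten())  # works once the arrays are 1-D
print(r)
```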
I've had other issues with the column vector rule every now and then... We do require in spreg that all of our column vectors have two explicit dimensions. This was done basically centuries ago to make sure no silly np.dot or sparse multiply would go wrong. Given the way scipy behaves now, and that a pandas column vector also has only one dimension, do you think it may be time to rethink this two-dimensions requirement?
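(A small illustration of the mismatch being discussed here, using a toy DataFrame: pandas columns come out as 1-D arrays, while spreg currently expects explicit (n, 1) column vectors.)

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"y": [1.0, 2.0, 3.0, 4.0]})
print(df["y"].to_numpy().shape)        # (4,)   -- what a pandas column yields
y = df["y"].to_numpy().reshape(-1, 1)
print(y.shape)                         # (4, 1) -- what spreg requires today
```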
I figured it was something like that. But why in the world are the tests passing? And don't these lines handle the reshaping the way scipy needs?
I didn't mean to close this, just merge. =(
It might be useful to relax the 2-dim requirement. My impression is that most people will have their data in (geo)DataFrames, so the most convenient interface is one that either (a) accepts dfs directly, or (b) accepts data that's easily converted from a df. A pandas Series is a 1d array, so allowing 1d input would help there. Alternatively, we could leave things as they are and handle some of the conversion in something like the patsy dispatcher (the simple version of which is now working great after the
Related: do you think this warrants a point release?
I think a point release makes sense. I'll try to cut one today or tomorrow. If another maintainer can do it faster, have at it! Also, I think forcing n,1 is fine, but we should use the .squeeze() method any time we call into a function like the rsquared diagnostics.
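(A minimal sketch of that .squeeze() idea with made-up arrays: keep (n, 1) vectors internally and flatten them only at the call site of functions like pearsonr.)

```python
import numpy as np
from scipy.stats import pearsonr

y = np.arange(10, dtype=float).reshape(-1, 1)          # internal (n, 1) shape
predy = y + np.random.normal(scale=0.1, size=y.shape)

r, _ = pearsonr(y.squeeze(), predy.squeeze())          # 1-D views only at the call site
print(r)
```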
I've submitted an "in-between" approach to the n,1 issue: the verification class no longer requires an n,1 array, but it will try to reshape any single-dimension array to n,1. This way we make life easier for [geo]pandas users without changing our main code. I have no idea how to make a point release, so if you could do it, I would really appreciate it. =|
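(A hypothetical sketch of such an "in-between" check; the actual helper name and location in spreg may differ.)

```python
import numpy as np

def coerce_to_column(arr, name="y"):
    """Accept 1-D input (e.g. a pandas Series) and reshape it to (n, 1);
    leave proper (n, 1) arrays untouched. Hypothetical helper for illustration."""
    arr = np.asarray(arr, dtype=float)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)       # promote 1-D input to an explicit column
    if arr.ndim != 2 or arr.shape[1] != 1:
        raise ValueError(f"{name} must be a 1-D sequence or an (n, 1) array")
    return arr
```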
Awesome. I don't think I have maintainer rights on PyPI, so I can't do the release, but if you want to add me I'll be happy to do it.
Interesting note while I prepare this: v1.0.4 is the latest release on PyPI and conda, but we're on 1.1.0 (which is currently on RTD). This release will bump us to 1.1.1.
Wow. The latest changelog in git is also 1.0.4 (https://github.com/pysal/spreg/blob/master/changelog_1.0.4.md).
1.1.0 was released on GitHub; it just didn't make it to PyPI.
Everything required by a release is scriptable, so I'm pretty sure I can wrap that logic up in a makefile. Will try some experiments this weekend.
So here's something weird. In a brand new environment (numpy 1.18.1, scipy 1.4.1, spreg 1.0.4) I have all tests passing locally. But if I try to recreate one of the examples, it will fail when calling scipy.stats.pearsonr. From my own testing, that error usually comes from passing pearsonr arrays that are the wrong shape, but everything is correct in the spreg code and hasn't been touched, and as I said before, the tests are passing. Can others reproduce this? It happens for me every time.
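(For reference, a hedged sketch of the kind of example that triggers this; the dataset and columns below are illustrative, not necessarily the exact example copied from the test suite.)

```python
import numpy as np
import libpysal
from spreg import ML_Lag

# columbus is a small example dataset shipped with libpysal; any similar
# setup should hit the same pearsonr call inside the model diagnostics.
db = libpysal.io.open(libpysal.examples.get_path("columbus.dbf"), "r")
y = np.array(db.by_col("CRIME")).reshape(-1, 1)
x = np.array([db.by_col("INC"), db.by_col("HOVAL")]).T
w = libpysal.weights.Queen.from_shapefile(libpysal.examples.get_path("columbus.shp"))
w.transform = "r"

model = ML_Lag(y, x, w=w)   # with scipy 1.4.x this reportedly failed inside pearsonr
print(model.betas)
```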