-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH]: option to return series from lag_spatial #512
Comments
mmmm, I understand the use case. but I'm generally not a fan of a single function returning different objects based on args. It drives me crazy in cuML which returns an object based on an input (so if a cudf object is given it returns cudf, if cupy it returns cupy array...). It sounds like a nice idea but when creating something on top of that, it is a massive pain that you don't know which object comes back and you end up with endless if/else mess. Controlling this behaviour with a keyword is slightly more predictable but still feels weird. Can we just ensure that the array is properly sorted? Or opt-in for always pandas but that deviates from most of PySAL. |
id be curious to hear what the other devs think. Currently the array is properly sorted (according to the ordering of the W), but that can be very confusing if users dont realize their weights and dfs are not aligned. We cant always give back a series because it will break backwards compatibility
in general im with you, but this feels like a very different case where the return is explicit--you need to ask for a series to be provided a series (and the default is false). So building on top feels pretty stable to me? We use this pattern throughout the library (e.g. alpha shapes have at least three potential returns |
to be clear, i think it would be preferable to always return a series, but i know thats a non-starter for legacy code |
As a tangential discussion on |
My sense is we have, historically, decided to assume that in That said, we haven't really provided much in the way of help for the user to be able to check that assumption or to reorder With the new API, the keys of the I'm generally not in favor of a function returning different return types based on different kw settings, so even with the new suggestions above, I would keep the return from |
roundtripping a
W
through an adjacency list will create a valid but re-ordered matrix. That is, all the relationships are preserved with the proper weights, but the ordering of the keys may be different from the index of the dataframe it was created from. In general thats not a problem because all of the internal logic of the W still works fine.But creating a spatial lag from a combo like this (where W and dataframe are paired but in different orders) will generate a series of values that no longer match the df. A simple fix (and a useful enhancement more broadly) would be to add a kwarg to
lag_spatial
that optionally returns a pandas series indexed onW.neighbors.keys
. Should make things a little easier and more explicit without breaking backwards compatThe text was updated successfully, but these errors were encountered: