Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KDtree type mismatch in knnW #771

Closed
ljwolf opened this issue Mar 23, 2016 · 7 comments
Closed

KDtree type mismatch in knnW #771

ljwolf opened this issue Mar 23, 2016 · 7 comments
Assignees

Comments

@ljwolf
Copy link
Member

ljwolf commented Mar 23, 2016

From pysal-dev:

When I am running the following code for weight generation

import pysal
import numpy as np
x,y = np.indices((5,5))
x.shape = y.shape = (25,1)
data = np.hstack([x,y])
wknn3 = pysal.knnW(data, k=3)
wknn3.neighbors[0]
wknn.s0

then I am getting the following error:

Traceback (most recent call last):
  File "C:\Python27\FOSS4G NA 2015 - Spatial Data Analysis in Python\PySAL_Workshop\mine\knnweight.py", line 9, in <module>
    wknn3 = pysal.knnW(data, k = 3)
  File "C:\Python27\FOSS4G NA 2015 - Spatial Data Analysis in Python\PySAL_Workshop\mine\pysal\weights\Distance.py", line 86, in knnW
    nnq = kdtree.query(data, k=k+1, p=p)
 AttributeError: 'numpy.ndarray' object has no attribute 'query'

Why? Please help!

@ljwolf
Copy link
Member Author

ljwolf commented Mar 23, 2016

The error actually occurs right at the knnW weights constructor.
It looks like commit a38c7d2c removed dispatch on type of input for the knnW constructor.

It used to accept numpy arrays and silently build the KDTree behind the scenes. Now, it will fail immediately if the user passes a previously-valid input.

Potential issues with this practice.

  1. We never issued any deprecation warnings on this change.
  2. This thumps a 2015 FOSS4G workshop's code
  3. Previously, proccessing ndarrays in a reasonable way as what the user intends is a KDTree obeys Postel's Law, but definitely includes unnecessary chrome around the core algorithm. (this is where I like multiple dispatch :)

But, I see nothing wrong with @jlaura's commit itself. It's just an API change that was not smoothed by a deprecation warning.

I'd mark this as notabug+wontfix, but perspective is needed.

@schmidtc
Copy link
Member

The OP's code was copy/pasted from the user guide: http://pysal.readthedocs.org/en/latest/users/tutorials/weights.html#k-nearest-neighbor-weights

(Larger issue, are the examples the user guide no longer tested?)

@ljwolf
Copy link
Member Author

ljwolf commented Mar 23, 2016

Did they used to get tested? I'm unaware of any tests that run aside from the unittests. I think even doctests & coveralls are currently turned off.

@ljwolf
Copy link
Member Author

ljwolf commented Mar 23, 2016

Further complicating things, we do a check in every constructor in Distance.py for whether something is already a KDTree. Commit b39a43db replaced all of our default constructors with cKDTrees, rather than KDTrees.

While it's reasonable to assume that cKDTree <: KDTree, it's not the case:

1]> import scipy.spatial as spat; import pysal as ps

2]> df = ps.pdio.read_files(ps.examples.get_path('columbus.shp'))

3]> data = df[['X', 'Y']].values 

4]> print(type(ps.cg.KDTree(data)))
4|   ckdtree.cKDTree

5]> issubclass(type(spat.cKDTree), spat.KDTree)
5|   False

6]> issubclass(type(spat.KDTree), spat.cKDTree)
6|   False

which means that we've now got this class of errors:

ps.Kernel(ps.cg.KDTree(data))
TypeError                                 Traceback (most recent call last)
<ipython-input-13-8e6cf2c300ce> in <module>()
----> 1 ps.Kernel(ps.cg.KDTree(data))

/home/ljw/.local/lib/python2.7/site-packages/pysal/weights/Distance.pyc in __init__(self, data, bandwidth, fixed, k, function, eps, ids, diagonal)
    271         else:
    272             self.data = data
--> 273             self.kdt = KDTree(self.data)
    274         self.k = k + 1
    275         self.function = function.lower()

/home/ljw/.local/lib/python2.7/site-packages/pysal/cg/kdtree.pyc in KDTree(data, leafsize, distance_metric, radius)
    242             return scipy.spatial.KDTree(data, leafsize)
    243         else:
--> 244             return scipy.spatial.cKDTree(data, leafsize)
    245     elif distance_metric == 'Arc':
    246         return Arc_KDTree(data, leafsize, radius)

ckdtree.pyx in ckdtree.cKDTree.__init__ (scipy/spatial/ckdtree/ckdtree.cxx:7474)()

/home/ljw/.local/lib/python2.7/site-packages/numpy/core/numeric.pyc in ascontiguousarray(a, dtype)
    559 
    560     """
--> 561     return array(a, dtype, copy=False, order='C', ndmin=1)
    562 
    563 def asfortranarray(a, dtype=None):

TypeError: float() argument must be a string or a number

AFAICT, this hits every constructor in Distance.py that dispatches on the input type. Fortunately, the fix is clear: add cKDTree to the set of kdtree types we check in the input.

Again, the main issue here is that the inheritance diagram is forked: pysal.cg.Arc_KDTree and the output of pysal.cg.KDTree have no common parent type beyond object.

@sjsrey
Copy link
Member

sjsrey commented Mar 24, 2016

At some point the tests in the documentation were turned off. I think the trail links to #593

@ljwolf
Copy link
Member Author

ljwolf commented Apr 28, 2016

This was resolved in #786 I believe.

@ljwolf
Copy link
Member Author

ljwolf commented Apr 28, 2016

Indeed:

1]> import pysal as ps, numpy as np
2]> x,y = np.indices((5,5))
3]> x.shape = y.shape = (25,1)
4]> data = np.hstack((x,y))
5]> wknn3 = ps.knnW(data, k=3)
6]> ps.version
6|  '1.11.2dev'

proceeds without a hitch.

@ljwolf ljwolf closed this as completed Apr 28, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants