arc distance in knnW #646
Comments
What about adding the [...] I can make this change if it looks like it will serve.
+1 on the change. I think this would allow knnW to use arc distances if needed.
Now that I am digging in, I wanted to bounce some ideas around:
Some options:
I think adding the optional keyword parameter radius=None is the cleanest solution. If a radius is specified, arc distance is used. I'm not sure about the second point you are raising. Presumably, if the user wants to use arc distances, the tree should be built with ArcKDTree; they wouldn't have constructed a regular KDTree and then passed that in as the first argument to knnW. Maybe I'm missing what you are getting at?
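To make the radius=None idea concrete, here is a minimal, self-contained sketch. It is not PySAL code: `knn_neighbors`, `euclidean`, and `arc_distance` are hypothetical stand-ins, the search is brute force rather than tree based, and the coordinates are assumed to be (lon, lat) pairs in degrees.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two coordinate pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def arc_distance(p, q, radius):
    """Great-circle (haversine) distance between two (lon, lat) pairs in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))

def knn_neighbors(points, k=2, radius=None):
    """Hypothetical stand-in for knnW: ids of the k nearest neighbors of each point."""
    if radius is None:
        dist = euclidean                                  # default: planar distance
    else:
        dist = lambda p, q: arc_distance(p, q, radius)    # radius given: arc distance
    neighbors = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: dist(p, points[j]))
        neighbors[i] = others[:k]
    return neighbors

# Points 0 and 2 sit on either side of the antimeridian.
points = [(179.0, 0.0), (170.0, 0.0), (-179.0, 0.0)]
euclid = knn_neighbors(points, k=1)
arc = knn_neighbors(points, k=1, radius=6371.0)  # approximate mean Earth radius in km
```

With these three points, the Euclidean call pairs point 0 with point 1 (9 degrees away on paper), while the arc call pairs it with point 2, which is only 2 degrees away across the antimeridian.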
On the second point, we support passing in a KDTree. Should the logic be that if data is a KDTree (not an ArcKDTree) and radius is specified, we throw an error? Or should we try to handle that by unpacking the existing KDTree and repacking it into an ArcKDTree? Assuming we want to support that (for the confused user), do we also run the coincident point check during the unpack and repack process?
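The "throw an error" branch of that question could look roughly like the sketch below. The classes `EuclideanTree` and `ArcTree` and the function `resolve_tree` are hypothetical placeholders used only to show the check; they are not the PySAL tree types, and the repack alternative is deliberately left out.

```python
class EuclideanTree:
    """Hypothetical stand-in for a tree built with a Euclidean metric."""
    def __init__(self, points):
        self.points = list(points)

class ArcTree:
    """Hypothetical stand-in for a tree built with an arc-distance metric."""
    def __init__(self, points, radius):
        self.points = list(points)
        self.radius = radius

def resolve_tree(data, radius=None):
    """Return a tree for `data`, refusing a Euclidean tree combined with a radius."""
    if isinstance(data, ArcTree):
        # The tree already carries its own metric; use it as is.
        return data
    if isinstance(data, EuclideanTree):
        if radius is not None:
            raise ValueError(
                "radius was given, but data is a Euclidean tree; "
                "build an arc-distance tree from the raw coordinates instead"
            )
        return data
    # Raw coordinates: pick the tree type from the radius argument.
    return EuclideanTree(data) if radius is None else ArcTree(data, radius)
```

Failing loudly keeps the function from silently rebuilding a tree the caller constructed on purpose, at the cost of being less forgiving to the confused user.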
Maybe Luc's original suggestion is cleaner?
Given the philosophy of PySAL as a library, I think it would be cleaner to let knnW just build the weights, given a KDTree as the argument. How that KDTree is constructed (Euclidean or Arc) is not considered by knnW. In fact, this is pretty much what happens in practice when one of the knnW_ user functions is invoked. They take care of the logic of array to KDTree conversion using the correct distance metric. So I would argue for restricting knnW proper to only allow a KDTree as an argument, making it agnostic to how that tree is constructed, and adding a couple of specific array-to-tree conversion functions (to the extent that they are not there already). I think right now there is a bit of tension between the objective of knnW being a core ("base") function and its design as more of a user function (with all the checks). What we've tried to do in spreg (although it's by no means perfect) is to limit the options and arguments for the base functions to the absolute minimum, and do all the checking and alternative entry points in the user function/class.
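A rough sketch of that split, under stated assumptions: `BruteForceTree` is a toy stand-in for a KDTree, and `tree_from_array` and `knn_base` are hypothetical names for the array-to-tree helper and the base function. None of this is the actual PySAL implementation; coordinates are assumed to be (lon, lat) pairs in degrees.

```python
import math

class BruteForceTree:
    """Toy stand-in for a KDTree: stores points plus the metric it was built with."""
    def __init__(self, points, metric):
        self.data = list(points)
        self.metric = metric

    def query(self, point, k):
        """Return indices of the k stored points nearest to `point`."""
        order = sorted(range(len(self.data)),
                       key=lambda j: self.metric(point, self.data[j]))
        return order[:k]

def tree_from_array(points, radius=None):
    """Hypothetical array-to-tree helper: the only place that knows about the metric."""
    if radius is None:
        def metric(p, q):
            return math.hypot(p[0] - q[0], p[1] - q[1])
    else:
        def metric(p, q):
            # Haversine great-circle distance for (lon, lat) pairs in degrees.
            lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
            a = (math.sin((lat2 - lat1) / 2) ** 2
                 + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
            return 2 * radius * math.asin(math.sqrt(a))
    return BruteForceTree(points, metric)

def knn_base(tree, k=2):
    """Base function: accepts only a tree and never inspects its metric."""
    neighbors = {}
    for i, p in enumerate(tree.data):
        # Ask for k + 1 hits because the query point itself is among the results.
        hits = tree.query(p, k + 1)
        neighbors[i] = [j for j in hits if j != i][:k]
    return neighbors

# At high latitude a degree of longitude is much shorter than a degree of latitude.
points = [(0.0, 80.0), (10.0, 80.0), (0.0, 85.0)]
flat = knn_base(tree_from_array(points), k=1)
curved = knn_base(tree_from_array(points, radius=6371.0), k=1)
```

Here `knn_base` is identical for both calls; only the helper changes. On these points the Euclidean tree makes point 2 the nearest neighbor of point 0, while the arc tree picks point 1.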
This example demonstrates how we can currently first build a KDTree using spherical coordinates and then pass it to knnW. If you load the linked files into QGIS and follow along, you will see that the calculations are correct in terms of distances. So changing the API of knnW proper as @lanselin suggests seems a good way to go.
I may be missing something, but it seems to me that knnW in Distance.py is assuming that the distances are Euclidean and does not allow for arc distances. In practice, this will likely seldom come up as a problem, since, e.g., knnW_from_array (in user.py) takes a radius argument and passes a distance_metric argument to pysal.cg.KDTree before calling the worker bee knnW function, thus passing the correct KDTree.
However, when knnW is called directly (in Distance.py) there doesn't seem to be a way to pass a distance_metric argument and when only an array of coordinates is provided, the KDTree is called with only one argument (u) - on line 122.
Does this need to be addressed or is it taken care of in another way? One way to deal with this is to restrict knnW to always take a KDTree as an argument and remove the option to pass an array. This would then require a small helper function that creates the tree and deals with the distance_metric argument.