Querying a vector of NaN occasionally results in invalid indices #125
Comments
I am also seeing large returned indices that do not exist in the data. I will have to investigate further.
Clearly, something is going bananas somewhere. I think this just has to be honestly debugged with print statements and whatnot to find out where things go bad.
I finally had a chance to look at the code. Apologies, I don't have a concrete answer or a working patch yet, but here's a theory about what may be happening. The relevant code is NearestNeighbors.jl/src/knn.jl, line 36 in ac0338c.
At this point in the code, the indices array has already been initialized (see also NearestNeighbors.jl/src/kd_tree.jl, line 208 in ac0338c).
I believe the ultimate reason this ends up happening is because any tests with a distance of …
I think the solution for that may actually involve some decisions about how the whole thing can behave. If the metric function was returning …
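To make the theory above concrete, here is a minimal sketch of how a NaN distance can leave a pre-initialized index untouched. This is not the library's code: `naive_nn` is a hypothetical linear scan, and the `sentinel` default of `-1` is an assumption chosen only for illustration, not the value the package actually uses.

```julia
# Hypothetical linear-scan nearest neighbour, written only to illustrate
# the NaN comparison problem. `sentinel` is an assumed placeholder value.
function naive_nn(data::AbstractMatrix{<:Real}, query::AbstractVector{<:Real};
                  sentinel::Int = -1)
    best_idx = sentinel
    best_dist = Inf
    for j in axes(data, 2)
        # Squared Euclidean distance between the query and column j.
        d = sum(abs2, view(data, :, j) .- query)
        # Any ordered comparison involving NaN is false, so a NaN
        # distance never replaces the initial values.
        if d < best_dist
            best_idx = j
            best_dist = d
        end
    end
    return best_idx, best_dist
end

data = [0.0 1.0 2.0;
        0.0 1.0 2.0]

finite_result = naive_nn(data, [0.9, 1.1])  # index of the closest column
nan_result = naive_nn(data, [NaN, NaN])     # sentinel and Inf come back unchanged
```

If the initial value is not a valid index, whatever it happens to be is what the caller receives for a NaN query.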
Handling of nothing/null/NaN seems to be a never-ending story. Initialization with … A dirty compromise (still an Int, but throwing out-of-bounds exceptions) might be returning … Is there a Julian way to deal with NaN/nothings?
I think initializing with 1 should be just fine: any match might be considered "good" for NaN or Inf distances, as long as we have that distance value along with the result to judge what happened. It's a good thing if we guarantee always-valid indices. The other approach is a neat handling of NaN as an optional class, either returning … I don't believe there really is a Julian way to do this, because part of it is about application-domain decisions. Although using …
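The two approaches discussed in this comment could look roughly like the sketch below. Both functions are hypothetical and independent of the package. Using `nothing` as the "no result" value is my assumption about what an optional return might be, and the sketch assumes `data` has at least one column.

```julia
# Option A (hypothetical): start from index 1 so the result is always in
# bounds. The caller inspects the returned distance to judge the match.
function nn_always_valid(data::AbstractMatrix{<:Real}, query::AbstractVector{<:Real})
    best_idx, best_dist = 1, Inf
    for j in axes(data, 2)
        d = sum(abs2, view(data, :, j) .- query)
        if d < best_dist
            best_idx, best_dist = j, d
        end
    end
    return best_idx, best_dist
end

# Option B (hypothetical): report "no comparable neighbour" explicitly
# instead of handing back an index that carries no meaning.
function nn_or_nothing(data, query)
    idx, dist = nn_always_valid(data, query)
    return isfinite(dist) ? (idx, dist) : nothing
end
```

Option A keeps the return type stable and never produces an out-of-bounds index. Option B forces callers to handle the missing case, at the cost of a `Union` return type.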
I was able to reproduce the issue with the following snippet:
In some situations the function returns (1, Inf), which is just fine. Sometimes, though, the returned index will be an invalid number, e.g.:
Ideally the returned indices should always be valid, in my opinion.
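Until valid indices are guaranteed, calling code can guard against the problem itself. A minimal sketch, where `is_valid_neighbor` and `has_nan` are hypothetical helpers and points are assumed to be stored as the columns of `data`:

```julia
# Hypothetical helper: true when `idx` can safely index a column of `data`.
is_valid_neighbor(data::AbstractMatrix, idx::Integer) = idx in axes(data, 2)

# Hypothetical helper: true when the query contains a NaN, in which case
# no distance comparison against it will be meaningful.
has_nan(query) = any(isnan, query)
```

Checking `has_nan(query)` before querying, or `is_valid_neighbor(data, idx)` before indexing, avoids an out-of-bounds access on the returned index.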