How to get the exact nearest neighbors? #48

Closed
hexiaoyupku opened this issue Oct 22, 2020 · 1 comment

hexiaoyupku commented Oct 22, 2020

Hi,
The code comments say that we can get the exact nearest neighbors by tuning parameters, but I haven't figured out how to do it.
I tried setting tau to 0.0, but didn't get the exact result. I think the reason is that KNN.searchIndices on the topTree doesn't always return all the possible partition ids that contain nearest neighbors.
Of course, I could rewrite KNN.searchIndices to make sure all possible partition ids are included, but I would still like to know whether there is a way to get exact nearest neighbors just by tuning parameters in the existing code.

Thanks!
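
The suspicion above, that a partition lookup can skip the partition holding the true nearest neighbor, can be illustrated with a toy model. This is a minimal sketch in plain Scala and not spark-knn's actual search logic: the object `PartitionMiss`, the 1-D pivots and points, and all helper names are hypothetical and chosen only for illustration.

```scala
object PartitionMiss {
  // Toy 1-D setup (hypothetical data): two pivots, and each training
  // point belongs to the partition of its closest pivot.
  val pivots: Seq[Double] = Seq(0.0, 10.0)
  val points: Seq[Double] = Seq(-4.0, 5.5)

  // Index of the pivot closest to x.
  def closestPivot(x: Double): Int =
    pivots.indices.minBy(i => math.abs(pivots(i) - x))

  // Search only the partition whose pivot is closest to the query.
  def nearestInClosestPartition(q: Double): Option[Double] = {
    val part = closestPivot(q)
    val candidates = points.filter(p => closestPivot(p) == part)
    if (candidates.isEmpty) None
    else Some(candidates.minBy(p => math.abs(p - q)))
  }

  // Ground truth: scan every point regardless of partition.
  def nearestOverall(q: Double): Double =
    points.minBy(p => math.abs(p - q))
}
```

For the query 4.9, the closest pivot is 0.0, so the single-partition search can only see -4.0, while the true nearest point 5.5 sits in the other partition. Any search that inspects a limited set of partitions needs some slack around partition boundaries to avoid this.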


hexiaoyupku commented Oct 29, 2020

I figured out a way to produce the exact top k nearest neighbors without changing any code:

val model = new KNN()
  .setBalanceThreshold(0.0)
  .setFeaturesCol(featuresCol)
  .fit(trainData)

val predictData = model
  .setBufferSize(Double.MaxValue)
  .setDistanceCol(distanceCol)
  .setNeighborsCol(neighborsCol)
  .setK(k)
  .transform(testData)
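
One way to check that settings like these really return exact neighbors is to compare the output against a brute-force scan on a small sample. The following is a minimal sketch in plain Scala with no Spark dependency; `BruteForceKnn`, `euclidean`, and `exactKnn` are hypothetical helper names, not part of the library, and Euclidean distance is an assumption here.

```scala
object BruteForceKnn {
  // Euclidean distance between two vectors of equal dimension.
  def euclidean(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }

  // Exact k nearest neighbors found by scanning every training point.
  // Returns (index into train, distance) pairs, closest first.
  def exactKnn(
      train: Seq[Array[Double]],
      query: Array[Double],
      k: Int
  ): Seq[(Int, Double)] =
    train.zipWithIndex
      .map { case (p, i) => (i, euclidean(p, query)) }
      .sortBy(_._2)
      .take(k)
}
```

Running `exactKnn` for a handful of test rows and comparing the returned indices and distances with the neighbors and distance columns produced by the model gives a quick sanity check. The scan costs one distance computation per training point per query, so it is only practical on a sample.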
