weighted distance #22

Closed
naarkhoo opened this issue Mar 14, 2018 · 3 comments

Comments

@naarkhoo

naarkhoo commented Mar 14, 2018

I enjoy using the package and wonder whether you could add a weight feature to the nn2 function, so that distances are normalised/adjusted according to the weights? (Meanwhile, if there is any hack I can use, I would appreciate any comments.)

@jefferis
Member

@naarkhoo thanks for your interest in the package. I am not exactly sure what weighting scheme you had in mind – perhaps you can explain. It is unlikely that I can add this, since the underlying ANN library has no support for any weighting feature.

@naarkhoo
Author

In my dataset, rows are customers and columns are covariates. I have three covariates: 1) age, 2) category of movie, 3) whether they have bought a movie for someone else. The variance of covariate 3 is much lower: 95% of cases are 0 and a few are 1, whereas the variation in covariate 1 is higher.

So I am thinking of a scheme where one can give a weight to each covariate when calculating the distances among rows (customers). If I calculate the distance as it is, I am assuming all covariates are equally important, which I don't think is the case. I believe this package https://cran.r-project.org/web/packages/distances/distances.pdf provides a weight parameter and has some good examples. Weights can either be based on the variance of each covariate or be set manually.

I hope I have explained it well. Let me know if anything is unclear. Thank you again.

@jefferis
Member

@naarkhoo thanks for the explanation. However, this normalisation is something that you need to do before you pass your points to RANN::nn2, by scaling each column appropriately. This is exactly what the distances package seems to do. So I am sorry, but it doesn't make sense for RANN to do this.
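
The column scaling described here can be sketched as follows. This is a minimal illustration, not code from the thread: the customer data, the column names and the weights `w` are all invented, and the movie category is coded as a plain number only to keep the example short. It relies on the fact that a weighted Euclidean distance, sqrt(sum_j w_j (x_j - y_j)^2), equals the ordinary Euclidean distance after multiplying column j by sqrt(w_j).

```r
library(RANN)

# Hypothetical data: 6 customers, 3 covariates (values invented for illustration)
X <- cbind(
  age        = c(23, 35, 41, 29, 52, 37),
  category   = c(1, 3, 2, 1, 3, 2),
  gift_buyer = c(0, 0, 1, 0, 0, 0)
)

# Hypothetical per-column weights; a larger weight makes a column count for more
w <- c(age = 1, category = 2, gift_buyer = 5)

# Multiply column j by sqrt(w[j]) so that the plain Euclidean distance on Xw
# equals the weighted distance on X
Xw <- sweep(X, 2, sqrt(w), `*`)

# Nearest neighbours on the scaled points; with the default query (the data
# itself), the first neighbour of each row is the row itself at distance 0
nn <- nn2(Xw, k = 3)

# Cross-check one entry against a direct weighted-distance computation
j <- nn$nn.idx[1, 2]
d_direct <- sqrt(sum(w * (X[1, ] - X[j, ])^2))
all.equal(nn$nn.dists[1, 2], d_direct)
```

For variance-based weighting, one option under the same assumptions is to pass `scale(X)` to `nn2` instead, which divides each column by its standard deviation (equivalent to weights of 1/variance); the centring that `scale` also applies does not change any distances.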
