Introduce k-nearest neighbors estimators #38

Merged
merged 6 commits into from Jan 24, 2022


iamDecode
Owner

This PR adds support for k-nearest neighbors classification and regression, along with a number of distance metrics. Metrics supported by only one side, either scikit-learn (e.g., mahalanobis) or PMML (e.g., squaredEuclidean), are left out for now. Because scikit-learn's implementation accepts a callable as the metric, support for these could be added in the future.
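
To illustrate the callable-metric idea, here is a minimal pure-Python sketch, not the code in this PR. The helper names (`knn_predict`, `squared_euclidean`, `euclidean`) and the toy data are hypothetical; in scikit-learn the analogous hook is the `metric` parameter of the neighbors estimators, which accepts a callable taking two 1-D arrays.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def squared_euclidean(a, b):
    """Squared Euclidean distance (hypothetical stand-in for PMML's squaredEuclidean)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(X, y, query, k, metric):
    """Majority vote among the k training rows closest to `query` under `metric`."""
    # Rank training rows by their distance to the query point.
    order = sorted(range(len(X)), key=lambda i: metric(X[i], query))
    votes = Counter(y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data: three points near the origin labelled "a", two far away labelled "b".
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
y = ["a", "a", "a", "b", "b"]
query = [0.5, 0.5]

pred_sq = knn_predict(X, y, query, 3, squared_euclidean)
pred_eu = knn_predict(X, y, query, 3, euclidean)
```

With unweighted voting, both metrics pick the same neighbors here, since squaring preserves the distance ordering. With distance-weighted voting the two could differ, so a real implementation would need to match the PMML semantics.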

Since distance metrics generally work on either numerical or categorical columns, but not a mix of the two, "categorical support" proved a bit challenging. Some work is being done to address this, but until then I chose to leave out categorical support. In addition, I think one-hot encoding is a bad approach here, since it would give more weight to categorical columns in the distance.
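
One way to read the one-hot concern, as a small sketch under stated assumptions: the numeric column is assumed to be scaled to [0, 1], the distance is assumed to be squared Euclidean, and the `one_hot` helper and the color column are hypothetical.

```python
def one_hot(value, categories):
    """Encode `value` as a 0/1 vector over `categories`."""
    return [1.0 if value == c else 0.0 for c in categories]


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


colors = ["red", "green", "blue"]

# Each row: one numeric feature (assumed scaled to [0, 1]) plus a one-hot color.
row_a = [0.0] + one_hot("red", colors)
row_b = [1.0] + one_hot("red", colors)   # largest possible numeric gap, same color
row_c = [0.0] + one_hot("blue", colors)  # same numeric value, different color

numeric_gap = squared_euclidean(row_a, row_b)
category_gap = squared_euclidean(row_a, row_c)
```

A category mismatch flips two one-hot slots, so `category_gap` is 2.0 while the largest possible numeric difference contributes only 1.0. Under these assumptions, a single categorical column outweighs a fully scaled numeric one.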

@iamDecode iamDecode merged commit c7ba3cb into master Jan 24, 2022
@iamDecode iamDecode deleted the decode/knn branch January 24, 2022 20:31