Adding implementation of Hartigan's K-Means #16123

Open
assaftibm opened this issue Jan 14, 2020 · 1 comment


@assaftibm

Hi,

I'm implementing Hartigan's K-Means in C++ with a Cython wrapper, and when it's done I'd be glad to contribute it to scikit-learn. The implementation follows the pseudocode described in the IJCAI '13 paper by Slonim, Aharoni and Crammer (https://dl.acm.org/doi/10.5555/2540128.2540369), plus some optimizations of my own that make the run time comparable to that of Lloyd's K-Means.
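For readers unfamiliar with the method, here is a rough pure-Python sketch of the single-point update that usually goes by the name Hartigan's K-Means. It is neither the C++ implementation mentioned above nor the paper's exact pseudocode, and it includes none of the optimizations. The names `hartigan_kmeans` and `sq_dist` are hypothetical, and the sketch assumes at least `k` points and a random initial partition.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def hartigan_kmeans(points, k, max_passes=100, seed=0):
    """Hypothetical helper: return (labels, centroids) after single-point moves."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial partition; every label 0..k-1 appears at least once.
    labels = list(range(k)) + [rng.randrange(k) for _ in range(n - k)]
    rng.shuffle(labels)
    sizes = [0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for x, a in zip(points, labels):
        sizes[a] += 1
        for d in range(dim):
            sums[a][d] += x[d]
    centroids = [[s / sizes[j] for s in sums[j]] for j in range(k)]

    for _ in range(max_passes):
        moved = False
        for i, x in enumerate(points):
            a = labels[i]
            if sizes[a] == 1:
                continue  # never empty a cluster
            # Drop in within-cluster sum of squares if x leaves cluster a.
            best_b = a
            best_cost = sizes[a] / (sizes[a] - 1) * sq_dist(x, centroids[a])
            for b in range(k):
                if b == a:
                    continue
                # Rise in within-cluster sum of squares if x joins cluster b.
                cost = sizes[b] / (sizes[b] + 1) * sq_dist(x, centroids[b])
                if cost < best_cost:
                    best_b, best_cost = b, cost
            if best_b != a:
                # Move x and refresh both centroids immediately.
                for d in range(dim):
                    sums[a][d] -= x[d]
                    sums[best_b][d] += x[d]
                sizes[a] -= 1
                sizes[best_b] += 1
                centroids[a] = [s / sizes[a] for s in sums[a]]
                centroids[best_b] = [s / sizes[best_b] for s in sums[best_b]]
                labels[i] = best_b
                moved = True
        if not moved:
            break  # a full pass with no move: local optimum reached
    return labels, centroids


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(hartigan_kmeans(pts, 2))
```

Unlike Lloyd's batch update, a point is moved only when the move strictly lowers the total within-cluster sum of squares, and the centroids are refreshed after every single move.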

I'd like to know if the community welcomes this addition.

Thank you.

@ogrisel
Member

ogrisel commented Jan 15, 2020

I was not familiar with Hartigan's K-Means but it looks interesting.

However, we would rather not add any more C++ to the scikit-learn codebase and would prefer to focus on Cython.

But before considering implementing Hartigan's K-Means in Cython, let's focus on finishing the new implementation of Lloyd's in #11950, which is significantly more memory-efficient and scales better on machines with many CPU cores.
