Add MiniBatchKMeans #55

Closed
Arkoniak opened this issue Apr 13, 2020 · 0 comments

Assignees
Labels
enhancement New feature or request

Comments

@Arkoniak
Collaborator

It would be nice to have MiniBatchKMeans, along the lines of the scikit-learn implementation:

MiniBatchKMeans: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.MiniBatchKMeans.html

Paper: https://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf

Abstract: We present two modifications to the popular k-means clustering algorithm to address the extreme requirements for latency, scalability, and sparsity encountered in user-facing web applications. First, we propose the use of mini-batch optimization for k-means clustering. This reduces computation cost by orders of magnitude compared to the classic batch algorithm while yielding significantly better solutions than online stochastic gradient descent. Second, we achieve sparsity with projected gradient descent, and give a fast ϵ-accurate projection onto the L1-ball. Source code is freely available: http://code.google.com/p/sofia-ml
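
For reference, here is a rough sketch of the mini-batch update from the paper (sample a batch, cache each point's nearest center, then move each center toward its points with a per-center learning rate of 1/count). It is written in plain Python only to illustrate the idea and is not meant as the API for this package; the names `mini_batch_kmeans`, `nearest_center`, `batch_size`, `n_iter`, and `seed` are my own assumptions, and the L1 sparsity projection from the second half of the paper is left out.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_center(centers, x):
    """Index of the center closest to point x."""
    return min(range(len(centers)), key=lambda j: squared_distance(centers[j], x))


def mini_batch_kmeans(points, k, batch_size=32, n_iter=100, seed=0):
    """Illustrative mini-batch k-means (hypothetical helper, not a library API).

    points: list of equal-length numeric tuples/lists.
    Returns a list of k centers (each a list of floats).
    """
    rng = random.Random(seed)
    # Initialize centers with k distinct points drawn from the data.
    centers = [list(p) for p in rng.sample(points, k)]
    # Per-center counts of points seen so far; they set the learning rate.
    counts = [0] * k

    for _ in range(n_iter):
        # Draw a mini-batch uniformly at random (with replacement).
        batch = [rng.choice(points) for _ in range(batch_size)]
        # Cache the assignments before any center is moved in this iteration.
        assignments = [nearest_center(centers, x) for x in batch]
        for x, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate
            # Gradient step: c <- (1 - eta) * c + eta * x
            centers[j] = [(1.0 - eta) * c + eta * xi for c, xi in zip(centers[j], x)]
    return centers
```

With the 1/count learning rate each center is the running mean of every point ever assigned to it, so later batches move the centers less and less. Each iteration costs `batch_size * k` distance evaluations rather than one per data point per center.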

@Arkoniak Arkoniak assigned Arkoniak and PyDataBlog and unassigned Arkoniak Apr 13, 2020
@PyDataBlog PyDataBlog added the enhancement New feature or request label Apr 23, 2020
@PyDataBlog PyDataBlog mentioned this issue May 11, 2020
3 tasks
@PyDataBlog PyDataBlog mentioned this issue Apr 5, 2021
3 tasks
2 participants