Add MiniBatchKMeans #55

Arkoniak · 2020-04-13T13:55:55Z

It would be nice to have MiniBatchKMeans, similar to the implementation in scikit-learn:

MiniBatchKMeans: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.MiniBatchKMeans.html

Paper: https://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf

Abstract: We present two modifications to the popular k-means clustering algorithm to address the extreme requirements for latency, scalability, and sparsity encountered in user-facing web applications. First, we propose the use of mini-batch optimization for k-means clustering. This reduces computation cost by orders of magnitude compared to the classic batch algorithm while yielding significantly better solutions than online stochastic gradient descent. Second, we achieve sparsity with projected gradient descent, and give a fast ϵ-accurate projection onto the L1-ball. Source code is freely available: http://code.google.com/p/sofia-ml
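For reference, here is a rough sketch of the mini-batch update the paper describes: sample a small batch, cache each point's nearest center, then move each center toward its assigned points with a per-center learning rate of 1/count. It is pure Python for illustration only and is not this package's API; the names `nearest` and `mini_batch_kmeans` and the parameters `batch_size`, `n_iter`, and `seed` are assumptions, and the sparsity (L1 projection) part of the paper is left out.

```python
import random


def nearest(centers, x):
    """Return the index of the center closest to x (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for j, c in enumerate(centers):
        d = sum((ci - xi) ** 2 for ci, xi in zip(c, x))
        if d < best_d:
            best, best_d = j, d
    return best


def mini_batch_kmeans(X, k, batch_size=10, n_iter=100, seed=0):
    """Illustrative mini-batch k-means; X is a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialize centers with k distinct samples drawn from the data.
    centers = [list(x) for x in rng.sample(X, k)]
    counts = [0] * k  # number of points each center has absorbed so far

    for _ in range(n_iter):
        # Draw a mini-batch uniformly at random (with replacement).
        batch = [rng.choice(X) for _ in range(batch_size)]
        # Cache assignments before moving any center.
        assignments = [nearest(centers, x) for x in batch]
        for x, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate
            # Gradient step: convex combination of the old center and the point.
            centers[j] = [(1 - eta) * c + eta * xi for c, xi in zip(centers[j], x)]
    return centers


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Two synthetic blobs around (0, 0) and (10, 10).
    X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
    X += [(data_rng.gauss(10, 1), data_rng.gauss(10, 1)) for _ in range(200)]
    print(mini_batch_kmeans(X, k=2))
```

Because every update is a convex combination, each center stays inside the bounding box of the data. The quality of the result still depends on initialization, which is one reason a seedable RNG is worth exposing.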


Arkoniak assigned Arkoniak and PyDataBlog, then unassigned Arkoniak, on Apr 13, 2020

PyDataBlog added the "enhancement" (New feature or request) label on Apr 23, 2020

Arkoniak mentioned this issue on Apr 24, 2020 in: Expose the RNG as a hyper-parameter #66 (closed)

PyDataBlog mentioned this issue on May 11, 2020 in: Release 1.0.0 #78 (closed, 3 tasks)

PyDataBlog closed this as completed Apr 5, 2021

PyDataBlog mentioned this issue on Apr 5, 2021 in: Release 0.2.1 #105 (closed, 3 tasks)
