updated the fit function to conform to the upstream changes #1

NightMachinery · 2022-01-21T12:31:44Z

No description provided.

NightMachinery · 2022-01-21T12:33:07Z

PS: How is it that this algorithm is faster than the vanilla one? Aren't we calling super().fit(X)? So shouldn't this algorithm be strictly slower than the vanilla version?

gittar

makes sense. Thx

gittar · 2022-01-21T19:09:19Z

Breathing k-means in this implementation is not strictly slower than the vanilla version (i.e. scikit-learn's k-means++) because of the default value n_init=1 (scikit-learn uses n_init=10). This reduces the initial running time by about 90% for the initialization (including the initial run of Lloyd's algorithm). However, this advantage is reduced (possibly even turned into a disadvantage) by the time it takes to execute the breathing steps. As a result, breathing k-means can indeed be faster than the employed k-means++ algorithm; whether it is depends on the data set used and the value of k (n_clusters). The running time is also positively correlated with the parameter m ("breathing depth").
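To make the n_init arithmetic concrete, here is a minimal pure-Python sketch. It is not the code of this repository or of scikit-learn: `lloyd` and `kmeans_restarts` are hypothetical helpers, the toy data is made up, and plain random initialization stands in for k-means++ to keep it short. It only shows that one start does a tenth of the restart work of ten starts, which is where the roughly 90% figure comes from before any breathing steps are added.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd(points, centers, max_iter=100):
    """Plain Lloyd iterations from the given centers; returns (centers, sse, n_iter)."""
    for n_iter in range(1, max_iter + 1):
        # Assignment step: group points by their nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its old center).
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else old
            for g, old in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse, n_iter


def kmeans_restarts(points, k, n_init, seed=0):
    """Keep the best of n_init independent runs and count the runs performed."""
    rng = random.Random(seed)
    best = None
    runs = 0
    for _ in range(n_init):
        init = rng.sample(points, k)  # random init for brevity, not k-means++
        centers, sse, _ = lloyd(points, init)
        runs += 1
        if best is None or sse < best[1]:
            best = (centers, sse)
    return best[0], best[1], runs


# Toy data (an assumption for illustration): three Gaussian blobs in 2-D.
data_rng = random.Random(42)
points = [
    (data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(50)
]

_, sse_1, runs_1 = kmeans_restarts(points, k=3, n_init=1)
_, sse_10, runs_10 = kmeans_restarts(points, k=3, n_init=10)
print(runs_1, runs_10)
print(f"restart work saved by a single start: {1 - runs_1 / runs_10:.0%}")
```

The sketch counts runs rather than wall-clock time, since actual timings depend on the data set, k, and the breathing depth m, as noted above.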

updated the fit function to conform to the upstream changes

6c0557c

gittar approved these changes Jan 21, 2022


gittar merged commit d78bb2a into gittar:main Jan 21, 2022
