Zero assigned clusters leading to zero means #5

queenp · 2017-10-07T15:58:45Z

The current design chooses the first k points as starting values.

If any of these data points are identical, the first is assigned all the points and the second is assigned none. The second centre's mean is then computed over 0 members, which yields NaN and derails the whole clustering algorithm.
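A minimal sketch of the failure, in Python rather than the library's own code. The `assign` helper and the sample data are hypothetical, and it assumes that distance ties go to the lowest-indexed centre, which matches the behaviour described above.

```python
def assign(points, centres):
    """Return, for each point, the index of its nearest centre.

    Assumption: ties are broken in favour of the lowest index.
    """
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
        labels.append(dists.index(min(dists)))
    return labels


# Hypothetical data: the first two points are identical.
points = [(1.0, 1.0), (1.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
k = 2
centres = points[:k]  # "first k points" initialisation

labels = assign(points, centres)
sizes = [labels.count(j) for j in range(k)]
print(sizes)  # [4, 0]
```

The second cluster is empty, so its mean would be a zero sum divided by a zero count, which is NaN in IEEE floating-point arithmetic.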

There are 2 solutions I can think of to avoid this condition:

1. Select the first k distinct points for centres.
2. Move any centre which ends up with a cluster of size 0 to a random other point.

The first one seems simpler and more predictably performant, so it is the one to start with.
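Both options could look roughly like the following Python sketch. The function names `first_k_distinct` and `reseed_empty` are hypothetical and not part of the library. The second helper also assumes that "a random other point" means a data point that does not coincide with any non-empty centre.

```python
import random


def first_k_distinct(points, k):
    """Option 1: return the first k pairwise-distinct points, in input order."""
    seen = set()
    centres = []
    for p in points:
        key = tuple(p)
        if key not in seen:
            seen.add(key)
            centres.append(p)
            if len(centres) == k:
                return centres
    raise ValueError(f"need {k} distinct points, found only {len(centres)}")


def reseed_empty(centres, labels, points, rng=random):
    """Option 2: move each centre with no assigned points to a random data point.

    Assumption: the replacement must differ from every centre already in use.
    """
    used = set(labels)
    taken = {tuple(c) for j, c in enumerate(centres) if j in used}
    result = []
    for j, c in enumerate(centres):
        if j not in used:
            candidates = [p for p in points if tuple(p) not in taken]
            if not candidates:
                raise ValueError("no unused distinct point left to reseed with")
            c = rng.choice(candidates)
            taken.add(tuple(c))
        result.append(c)
    return result


# Hypothetical data with a duplicated first point.
points = [(1.0, 1.0), (1.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
print(first_k_distinct(points, 2))  # [(1.0, 1.0), (5.0, 5.0)]
```

Option 1 is a single deterministic pass over the data before clustering starts. Option 2 has to run after every assignment step and makes the result depend on the random generator, which is one reason to try option 1 first.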

Stunkymonkey · 2021-07-04T16:00:07Z

The same problem appeared for me today. I have not tested #6 yet, but @huonw, please look into it.
