Not really same size k-means #6

Closed
zamarajev opened this issue May 1, 2019 · 1 comment

zamarajev commented May 1, 2019

Hi,

I am facing an issue where the resulting clusters differ considerably in size.
Please find my dataset and code enclosed.
For the initial centroid initialization, I am using kmeans++.
All of the requirements from requirements.txt are fulfilled.

Here is the output:
Cluster 1 : 13
Cluster 2 : 13
Cluster 3 : 10
Cluster 4 : 13
Cluster 5 : 12
Cluster 6 : 13
Cluster 7 : 13
Cluster 8 : 13
Cluster 9 : 0
Cluster 10 : 0
Total : 100

Is this normal behavior, or is it a bug?
In my case the cluster size is crucial, so I need the sizes to differ by at most 1-2 objects.
Is this possible?
Can a more balanced sizing be achieved?

Thanks in advance!

dataset and code.zip

zamarajev commented May 2, 2019

Sorry, I was passing the wrong number of clusters to the constructor.
This is the output now:

Cluster 1 : 10
Cluster 2 : 10
Cluster 3 : 10
Cluster 4 : 10
Cluster 5 : 10
Cluster 6 : 10
Cluster 7 : 10
Cluster 8 : 10
Cluster 9 : 10
Cluster 10 : 10
Total : 100
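
For anyone hitting the same symptom, a quick sanity check on the label assignments can catch this kind of mismatch early. The sketch below is only an illustration: `cluster_sizes` and `is_balanced` are hypothetical helper names, not part of this project's API, and it assumes the labels are integers in the range 0..n_clusters-1.

```python
from collections import Counter


def cluster_sizes(labels, n_clusters):
    """Return the size of every cluster 0..n_clusters-1, including empty ones."""
    counts = Counter(labels)
    return [counts.get(k, 0) for k in range(n_clusters)]


def is_balanced(sizes, tolerance=1):
    """True if the largest and smallest cluster differ by at most `tolerance`."""
    return max(sizes) - min(sizes) <= tolerance


# Sizes copied from the two outputs reported in this thread.
first_run = [13, 13, 10, 13, 12, 13, 13, 13, 0, 0]
second_run = [10] * 10

for name, sizes in (("first run", first_run), ("second run", second_run)):
    empty = [i + 1 for i, s in enumerate(sizes) if s == 0]
    print(name, "balanced:", is_balanced(sizes, tolerance=2), "empty clusters:", empty)
```

Empty clusters in the report, as in the first run, are a hint that the number of clusters given to the model does not match the number being printed.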
