[pyclustering.cluster.kmeans] Optimization via numpy #403

Closed
annoviko opened this issue Jan 14, 2018 · 1 comment
Labels: Optimization (tasks related to code optimization)

Comments

@annoviko
Owner

Introduction
The current implementation does not use numpy. When CCORE is not used, numpy should be used instead to improve the performance of the K-Means algorithm.

Description

  • In the pure Python implementation, numpy should be used for the calculations.
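
For illustration, a numpy-based K-Means loop might look like the sketch below. This is not the pyclustering implementation: the function name `kmeans_numpy` and the parameters `tolerance` and `max_iterations` are assumptions made for this example only. It shows the general idea of replacing per-point Python loops with array operations for the distance and assignment steps.

```python
import numpy as np


def kmeans_numpy(points, initial_centers, tolerance=1e-3, max_iterations=100):
    """Hypothetical sketch of K-Means with vectorized distance computation."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(initial_centers, dtype=float).copy()

    for _ in range(max_iterations):
        # Squared Euclidean distances, shape (n_points, n_centers),
        # computed with broadcasting instead of nested Python loops.
        diff = points[:, np.newaxis, :] - centers[np.newaxis, :, :]
        distances = np.sum(diff ** 2, axis=2)

        # Assign every point to its nearest center.
        labels = distances.argmin(axis=1)

        # Move each center to the mean of its points; a center with
        # no assigned points stays where it is.
        new_centers = centers.copy()
        for index in range(len(centers)):
            members = points[labels == index]
            if len(members) > 0:
                new_centers[index] = members.mean(axis=0)

        # Stop when no center moved by more than the tolerance.
        shift = np.max(np.linalg.norm(new_centers - centers, axis=1))
        centers = new_centers
        if shift < tolerance:
            break

    # Labels come from the last assignment step.
    return labels, centers
```

Only the loop over centers remains in Python, and the number of centers is normally much smaller than the number of points.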
@annoviko annoviko added the Optimization Tasks related to code optimization label Jan 14, 2018
@annoviko annoviko self-assigned this Jan 14, 2018
@annoviko annoviko added this to To Do in 0.8.0 Optimization via automation Jan 14, 2018
@annoviko annoviko moved this from To Do to In Progress in 0.8.0 Optimization Jan 19, 2018
@annoviko
Owner Author

annoviko commented Jan 19, 2018

The first version of K-Means using numpy is implemented. Results:
With numpy optimization:

Execution time (1000 2D-points): 1.1295836647650328
Execution time (2000 2D-points): 3.0258542384768683
Execution time (3000 2D-points): 3.934701825511546
Execution time (4000 2D-points): 13.279997776200211
Execution time (5000 2D-points): 14.245599869993242
Execution time (10000 2D-points): 16.167380278317097
Execution time (20000 2D-points): 72.8136846366796

Without numpy optimization:

Execution time (1000 2D-points): 0.0009254428355157932
Execution time (2000 2D-points): 0.001377045254325718
Execution time (3000 2D-points): 0.0019560885072316255
Execution time (4000 2D-points): 0.0025690589620557033
Execution time (5000 2D-points): 0.00314924262511012
Execution time (10000 2D-points): 0.006151372341062462
Execution time (20000 2D-points): 0.012956769902295356

The numpy-based implementation is faster than the current one. The optimization is accepted.
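
Timings in this format could be collected with a small harness such as the sketch below. It is not the benchmark that produced the numbers above: the helper names `measure` and `nearest_center`, the random 2D data, and the number of centers and repeats are all assumptions made for this example.

```python
import time

import numpy as np


def nearest_center(points, centers):
    """Placeholder workload: index of the nearest center for every point."""
    diff = points[:, np.newaxis, :] - centers[np.newaxis, :, :]
    return np.sum(diff ** 2, axis=2).argmin(axis=1)


def measure(function, sizes, amount_centers=4, repeats=3, seed=0):
    """Return the best wall-clock time in seconds for each sample size."""
    generator = np.random.default_rng(seed)
    results = {}
    for size in sizes:
        points = generator.random((size, 2))      # random 2D points
        centers = points[:amount_centers].copy()  # first points as centers
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            function(points, centers)
            best = min(best, time.perf_counter() - start)
        results[size] = best
    return results


if __name__ == "__main__":
    for size, seconds in measure(nearest_center, [1000, 2000, 3000]).items():
        print("Execution time (%d 2D-points): %s" % (size, seconds))
```

Taking the best of several repeats reduces noise from other processes. Any callable with the same `(points, centers)` signature can be passed to `measure` to compare implementations on identical data.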
