pyclustering.cluster.cure.cure.get_clusters() function Problem #384

Closed
PrachiPrakash opened this issue Nov 8, 2017 · 5 comments

Comments

@PrachiPrakash

The get_clusters() function for the CURE algorithm returns a list of clusters, but each cluster is a list of points rather than a list of indexes, which is inconsistent with the other implementations.

Thanks

@annoviko
Owner

annoviko commented Nov 8, 2017

Hello, @PrachiPrakash,

This is a known issue, but thank you for reporting it anyway. I agree that get_clusters() should be unified across all clustering algorithms in the library and that cure::get_clusters() should return indexes instead of points. I will update the method in both the Python and C/C++ implementations.
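
Until a release with index-based output is available, point-based clusters can be mapped back to indexes on the caller's side. The following is a minimal sketch in plain Python, not code from the library: the helper name points_to_indexes and the toy dataset are hypothetical, and it assumes each returned point is an exact copy of a point in the input data.

```python
from collections import defaultdict, deque


def points_to_indexes(point_clusters, data):
    """Map clusters given as lists of points to lists of indexes into data."""
    # Remember every position at which each point occurs in the dataset.
    positions = defaultdict(deque)
    for index, point in enumerate(data):
        positions[tuple(point)].append(index)

    index_clusters = []
    for cluster in point_clusters:
        # Consume positions in order so duplicate points get distinct indexes.
        index_clusters.append(
            [positions[tuple(point)].popleft() for point in cluster]
        )
    return index_clusters


# Hypothetical toy dataset, not taken from the library's samples.
data = [[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [5.2, 4.9]]
point_clusters = [[[1.0, 1.0], [1.2, 0.9]], [[5.0, 5.1], [5.2, 4.9]]]

print(points_to_indexes(point_clusters, data))  # [[0, 1], [2, 3]]
```

The lookup relies on exact equality of coordinates, so it raises an IndexError if a clustered point does not appear in the dataset.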

@annoviko annoviko self-assigned this Nov 8, 2017
@annoviko annoviko added the Refactoring Tasks related to code refactoring label Nov 8, 2017
@annoviko annoviko added this to the 0.7 (release point) milestone Nov 8, 2017
@annoviko annoviko added this to In Progress in 0.7.x Feature Development Nov 8, 2017
@annoviko
Owner

annoviko commented Nov 8, 2017

@PrachiPrakash, the changes have been delivered to the 0.7.dev branch and will be available in the next release, 0.7.3.

@annoviko annoviko moved this from In Progress to Done in 0.7.x Feature Development Nov 8, 2017
annoviko added a commit that referenced this issue Nov 8, 2017
@annoviko annoviko closed this as completed Nov 8, 2017
@himanshu94

The same problem exists for X-Means.

Thanks

@annoviko
Owner

annoviko commented Nov 9, 2017

Hello, @himanshu94,

Could you please explain, or share the code, where you found this issue with X-Means?

There are tests for both the Python and C/C++ implementations, and both return indexes instead of points. I have written an example that demonstrates this:

from pyclustering.cluster.xmeans import xmeans
from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer

from pyclustering.utils import read_sample

from pyclustering.samples.definitions import SIMPLE_SAMPLES

# Read dataset 'SAMPLE_SIMPLE2'
sample = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE2)
initial_centers = kmeans_plusplus_initializer(sample, 3).initialize()

# Use Python implementation
xmeans_instance = xmeans(sample, initial_centers)
xmeans_instance.process()
clusters = xmeans_instance.get_clusters()

# Display allocated clusters
print(clusters)

# Use C/C++ implementation
xmeans_instance = xmeans(sample, initial_centers, ccore=True)
xmeans_instance.process()
clusters = xmeans_instance.get_clusters()

# Display allocated clusters
print(clusters)

The output of the code is:

[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [15, 16, 17, 18, 19, 20, 21, 22], [10, 11, 12, 13, 14]]
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [15, 16, 17, 18, 19, 20, 21, 22], [10, 11, 12, 13, 14]]

There is also a way to identify which kind of representation is used (it returns type_encoding.CLUSTER_INDEX_LIST_SEPARATION):

representation = xmeans_instance.get_cluster_encoding()

https://codedocs.xyz/annoviko/pyclustering/classpyclustering_1_1cluster_1_1encoder_1_1type__encoding.html
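
To make the index-list representation concrete, here is a minimal sketch in plain Python that turns the clusters printed above into one cluster label per point. It does not use the library: the helper name index_lists_to_labels is hypothetical, and the dataset size of 23 points is inferred from the largest index in the output above.

```python
def index_lists_to_labels(clusters, n_points):
    """Convert clusters given as lists of point indexes to per-point labels."""
    # None marks points that do not belong to any cluster.
    labels = [None] * n_points
    for cluster_id, cluster in enumerate(clusters):
        for point_index in cluster:
            labels[point_index] = cluster_id
    return labels


# Clusters copied from the output shown in the thread.
clusters = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [15, 16, 17, 18, 19, 20, 21, 22],
    [10, 11, 12, 13, 14],
]

labels = index_lists_to_labels(clusters, 23)
print(labels)
```

Each position in labels corresponds to the point with the same index in the dataset, so points 10 to 14 carry label 2 because they form the third cluster in the list.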

@himanshu94

Thanks, @annoviko. The link you provided was helpful.
