Kmeans clustering exception #259
Comments
Did you validate your input for every element? Just do the following.
After you check the values, report your result here.
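The snippet referred to above did not survive in this thread. As a rough stand-in, a per-element check could look like the sketch below. It assumes the data is a jagged `double[][]` (as passed to `cluster.Compute`); the `InputCheck` class and `FindInvalid` method are hypothetical names for illustration and are not part of the framework.

```csharp
using System;

public static class InputCheck
{
    // Hypothetical helper: returns the position of the first NaN or infinite
    // entry, or null when every entry is a finite number.
    public static (int Row, int Col)? FindInvalid(double[][] inputs)
    {
        for (int i = 0; i < inputs.Length; i++)
        {
            for (int j = 0; j < inputs[i].Length; j++)
            {
                double v = inputs[i][j];
                if (double.IsNaN(v) || double.IsInfinity(v))
                    return (i, j);
            }
        }
        return null;
    }

    public static void Main()
    {
        // Made-up sample data with one bad entry at row 1, column 1.
        double[][] inputs =
        {
            new[] { 0.0, 0.5 },
            new[] { 1.0, double.NaN },
        };

        var bad = FindInvalid(inputs);
        Console.WriteLine(bad.HasValue
            ? $"Invalid value at [{bad.Value.Row}][{bad.Value.Col}]"
            : "All values are finite");
    }
}
```

Running the check before clustering at least rules out NaN and infinity as the cause, which narrows the problem down to the algorithm itself.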
Yes, I did validate them. What I noticed is that when I use k-means on a 1000 x 15000 double array, it works just fine.
Hi all, this issue may well have been fixed in the latest release of the framework (3.2, released a few days ago). If possible, could you let us know whether you are still experiencing it? Thanks! Regards,
Hi Cesar, I'm actually on version 3.3 and experienced the same issue as described above when clustering data with anywhere from 1,000 to 2,000 dimensions. The values I'm trying to cluster are in the range 0 to 50. Thanks!
I reduced the dimensions, and it worked...
Hi Khan, thank you very much for your reply. Reducing the dimensions did work; however, in my particular situation, I'm unfortunately unable to incur that loss in fidelity. That being said, I switched to uniform seeding (as opposed to the default kmeans++) and didn't have any issues, even at rather high dimensionality. Thanks again,
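For context on why this workaround helps: uniform seeding picks the starting centroids directly from the data without looking at distances, so no probability weights are ever computed. The sketch below illustrates the idea in plain C#. It is not the framework's implementation, and `UniformSeeding` and `Pick` are hypothetical names. In Accord.NET itself the strategy is, to my knowledge, selected through the `UseSeeding` property of `KMeans` (for example `Seeding.Uniform`); treat that as an assumption and check it against the documentation for your version.

```csharp
using System;
using System.Linq;

public static class UniformSeeding
{
    // Hypothetical helper: picks k distinct rows uniformly at random and
    // returns copies of them as the initial centroids.
    public static double[][] Pick(double[][] points, int k, Random rng)
    {
        if (k < 1 || k > points.Length)
            throw new ArgumentOutOfRangeException(nameof(k));

        // Partial Fisher-Yates shuffle over the row indices: after k swaps,
        // the first k entries are a uniform sample without replacement.
        int[] idx = Enumerable.Range(0, points.Length).ToArray();
        for (int i = 0; i < k; i++)
        {
            int j = rng.Next(i, idx.Length);
            (idx[i], idx[j]) = (idx[j], idx[i]);
        }

        return idx.Take(k).Select(i => (double[])points[i].Clone()).ToArray();
    }

    public static void Main()
    {
        // Made-up sample data: five points in two dimensions.
        double[][] points = Enumerable.Range(0, 5)
            .Select(i => new double[] { i, 10.0 * i })
            .ToArray();

        foreach (double[] c in Pick(points, 3, new Random(42)))
            Console.WriteLine(string.Join(", ", c));
    }
}
```

The trade-off is that uniform seeds can land close together, so the clustering may need more iterations or converge to a poorer solution than with k-means++ seeding.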
I had the same issue. This exception is originally thrown by … The exception will be thrown if … Version: 3.4.0
… distances to probabilities in the K-Means++ initialization. Updates GH-259: K-means clustering exception
I have not been able to reproduce the error myself yet, but this is probably happening due to a loss of precision when computing the discrete probability weights, making the weight vector not sum up to one. One of the possible reasons for that is the probabilities for each point becoming too small. I have added some handling to sidestep this issue and also present some better error messages. Regards,
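To make the failure mode concrete, here is a minimal sketch of turning squared distances into sampling weights for a k-means++ style draw, with two safeguards against the problem described above. This is not the change that went into the framework (that code is not shown in this thread); `SeedingWeights` and `ToProbabilities` are hypothetical names, and the fallback to uniform weights is my own assumption about reasonable handling.

```csharp
using System;
using System.Linq;

public static class SeedingWeights
{
    // Hypothetical helper: converts squared distances to the nearest chosen
    // centroid into probabilities that sum to one (up to rounding).
    public static double[] ToProbabilities(double[] squaredDistances)
    {
        if (squaredDistances.Length == 0)
            throw new ArgumentException("At least one distance is required.");
        if (squaredDistances.Any(d => double.IsNaN(d) || double.IsInfinity(d) || d < 0))
            throw new ArgumentException("Distances must be finite and non-negative.");

        int n = squaredDistances.Length;
        double max = squaredDistances.Max();

        // Safeguard 1: if every distance is zero there is nothing to weight
        // by, so fall back to a uniform distribution.
        if (max == 0)
            return Enumerable.Repeat(1.0 / n, n).ToArray();

        // Safeguard 2: divide by the maximum first, so the largest scaled
        // value is exactly 1 and the sum lies between 1 and n.
        double[] scaled = squaredDistances.Select(d => d / max).ToArray();
        double sum = scaled.Sum();
        return scaled.Select(s => s / sum).ToArray();
    }

    public static void Main()
    {
        // Made-up, very small distances to exercise the scaling step.
        double[] p = ToProbabilities(new[] { 1e-200, 3e-200 });
        Console.WriteLine(string.Join(", ", p));
        Console.WriteLine($"Sum = {p.Sum()}");
    }
}
```

A sampler that consumes these weights should still tolerate a sum that differs from one by a few units of rounding error rather than testing for exact equality.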
Should have been fixed in release 3.6.0.
Hello, I want an array of clustering data points, like array[10] = {1,2,3,4,5,6,7,8,9}
I am getting the following exception
I believe my input is totally correct. I was computing cosine similarity over documents with the TF-IDF algorithm.
The input to cluster.Compute() is a double[][] where 0 <= input[i][j] <= 1
int[] index = cluster.Compute(inputs);
using https://github.com/primaryobjects/TFIDF for tfidf