Add seed and k++ for k-means #803

Closed
jkoschinsky opened this issue May 8, 2017 · 4 comments

@jkoschinsky
Collaborator

jkoschinsky commented May 8, 2017

For k-means:

  • Add a seed for replicability: set it as a global seed across maps.
  • Add k-means++ (check with Maia): 50-100 initial runs to get a better starting point. Allow users to specify the number of iterations to run.
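
As a rough illustration of the two requests above, here is a minimal Python sketch (not GeoDa code) of seeded restarts: one random generator, created from a single seed, drives every initial run, and the run with the lowest within-cluster sum of squares is kept. The function names `lloyd` and `kmeans_restarts`, the default seed value, and the default of 50 runs are assumptions made for this example only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def lloyd(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members))
                if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wss


def kmeans_restarts(points, k, n_init=50, seed=123456789):
    """Run n_init random starts from one seed and keep the best result."""
    rng = random.Random(seed)  # a single generator drives every restart
    best = None
    for _ in range(n_init):
        start = rng.sample(points, k)  # k distinct data points as centers
        result = lloyd(points, start)
        if best is None or result[2] < best[2]:
            best = result
    return best
```

Calling `kmeans_restarts(points, k, seed=s)` twice with the same `s` returns identical labels, centers, and sum of squares, which is the replicability the seed is meant to provide.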

Rename the Distance Functions (same for the Hierarchical Clustering Settings):

Distance:
--Euclidean
--city block

Correlation:
--Pearson
--absolute

Cosine:
--signed
--un-signed

Rank:
--Kendall
--Spearman

(Also, in the Hierarchical Clustering Settings, remove the indent of Method: average-linkage.)

@lixun910
Member

k-means++ (Arthur and Vassilvitskii, 2007) comes with a theoretical guarantee: in expectation, its solution is O(log k)-competitive with the optimal k-means solution. The seeding works as follows:

  1. Choose one center uniformly at random from among the data points.
  2. For each data point x, compute D(x), the distance between x and the nearest center that has already been chosen.
  3. Choose one new data point at random as a new center, using a weighted probability distribution where a point x is chosen with probability proportional to D(x)^2.
  4. Repeat Steps 2 and 3 until k centers have been chosen.
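
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GeoDa implementation: the function name `kmeanspp_init`, the `seed` parameter, and the representation of points as tuples are all made up for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_init(points, k, seed=None):
    """Pick k starting centers from `points` using k-means++ seeding."""
    rng = random.Random(seed)  # a fixed seed makes the choice reproducible

    # Step 1: the first center is a data point chosen uniformly at random.
    centers = [rng.choice(points)]

    # Step 2: D(x)^2, the squared distance to the nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    # Step 4: repeat until k centers have been chosen.
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center (duplicate data),
            # so the weights are all zero; fall back to a uniform draw.
            new_center = rng.choice(points)
        else:
            # Step 3: draw a point with probability proportional to D(x)^2.
            new_center = rng.choices(points, weights=d2, k=1)[0]
        centers.append(new_center)

        # Update D(x)^2 against the newly added center only.
        d2 = [min(d, sq_dist(p, new_center)) for p, d in zip(points, d2)]

    return centers
```

The returned centers would then be handed to the usual k-means iterations as the starting point. Because already-chosen points have D(x)^2 = 0, they carry zero weight in the draw, which is what pushes new centers away from existing ones.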

@YidanJ-Wu

YidanJ-Wu commented May 26, 2017

1.8.16.29
Windows 8
How do I change the Cosine and Rank settings?
Edit: never mind, found them.
image

@lixun910
Member

lixun910 commented May 26, 2017 via email

@jkoschinsky
Collaborator Author

GeoDa 1.8.16.33 — Mac OSX
Verified.


3 participants