This package implements the DP-means algorithm introduced by Kulis and Jordan in their article "Revisiting k-means: New Algorithms via Bayesian Nonparametrics". Instead of specifying the number of clusters to partition the data into, as one would with k-means, the user specifies a penalty parameter λ, which controls whether and when new clusters are created during the iterations.
The algorithm starts with a single cluster and then passes over the data points, creating a new cluster whenever a point is too far (relative to λ) from every existing center. It then updates the cluster centers and repeats until convergence.
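The loop described above can be sketched in a few lines of base R. This is an illustrative sketch only, not the package's implementation: the function name `dp_means_sketch`, the `max_iter` argument, and the returned list are hypothetical. It assumes squared Euclidean distance and a global-mean starting center, following Algorithm 1 in Kulis and Jordan's paper.

```r
# Illustrative DP-means sketch (hypothetical helper, NOT the package's dp_means()).
# Assumption: squared Euclidean distance, as in Kulis and Jordan's Algorithm 1.
dp_means_sketch <- function(x, lambda, max_iter = 100) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Start with a single cluster centered at the global mean.
  centers <- matrix(colMeans(x), nrow = 1)
  cluster <- rep(1L, n)
  for (iter in seq_len(max_iter)) {
    old_cluster <- cluster
    for (i in seq_len(n)) {
      # Squared distance from point i to every current center.
      d <- rowSums(sweep(centers, 2, x[i, ])^2)
      if (min(d) > lambda) {
        # Farther than lambda from every center: open a new cluster here.
        centers <- rbind(centers, x[i, ])
        cluster[i] <- nrow(centers)
      } else {
        cluster[i] <- which.min(d)
      }
    }
    # Drop empty clusters by relabelling to 1..k, then recompute the means.
    cluster <- as.integer(factor(cluster))
    centers <- rowsum(x, cluster) / as.vector(table(cluster))
    # Converged once a full pass leaves every assignment unchanged.
    if (identical(cluster, old_cluster)) break
  }
  list(cluster = cluster, centers = centers, iter = iter)
}

# Usage on two simulated, well-separated groups (simulated data, for illustration).
set.seed(1)
x_sim <- rbind(matrix(rnorm(40, 0, 0.1), ncol = 2),
               matrix(rnorm(40, 5, 0.1), ncol = 2))
fit <- dp_means_sketch(x_sim, lambda = 4)
nrow(fit$centers)  # number of clusters found
```

A larger λ makes opening a new cluster more costly and so tends to yield fewer clusters; a smaller λ tends to yield more.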
# install.packages("remotes")
remotes::install_github("bearloga/dpmclust")

dp_means() returns an object with the same class and components as the one returned by kmeans(), which makes it easy to use with other packages that support kmeans objects (e.g. autoplot() in the ggfortify package).
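For reference, the components of a kmeans object can be inspected with base R alone. The sketch below uses stats::kmeans() on simulated data; the names `x_demo` and `km` are illustrative. Going by the statement above, the same fields should be available on the result of dp_means(), though that is not checked here.

```r
# Base R only: inspect the structure of a "kmeans" object.
# x_demo and km are illustrative names; the data are simulated.
set.seed(42)
x_demo <- rbind(matrix(rnorm(50, mean = 0), ncol = 2),
                matrix(rnorm(50, mean = 4), ncol = 2))
km <- kmeans(x_demo, centers = 2)

class(km)          # "kmeans"
names(km)          # component names, e.g. cluster, centers, size
table(km$cluster)  # number of points assigned to each cluster
km$centers         # one row per cluster center
```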
y <- dp_means(x, lambda = 1)
# y$cluster

Still to do: implement the lambda means algorithm for choosing the optimal λ.
