AlbertPlaPlanas/size_constrained_clustering

 
 

Size Constrained Clustering Solver

Implementation of Deterministic Annealing Size Constrained Clustering. Size constrained clustering can be treated as an optimization problem. Details can be found in the reference papers.

This is a fork of https://github.com/jingw2/size_constrained_clustering that solves installation issues and maintains only the Deterministic Annealing clustering.

Installation

Requirements: Python >= 3.6, NumPy >= 1.13

  • Install from PyPI:
pip install light-size-constrained-clustering

Methods

  • Deterministic Annealing Algorithm: takes a target cluster size distribution as input and returns the corresponding clusters

Usage:

Deterministic Annealing

# setup
from light_size_constrained_clustering import da
import numpy as np

n_samples = 40  # number of cells in the spot
n_clusters = 4  # number of distinct cell types
distribution = [0.4, 0.3, 0.2, 0.1]  # fraction of each cell type (from deconvolution)
seed = 17

print(np.sum(distribution))  # the fractions should sum to 1
np.random.seed(seed)

X = np.random.rand(n_samples, 2)
# distribution is the target distribution of cluster sizes
model = da.DeterministicAnnealing(n_clusters, distribution=distribution, random_state=seed)

model.fit(X)
centers = model.cluster_centers_
labels = model.labels_
print("Labels:")
print(labels)
print("Elements in cluster 0: ", np.count_nonzero(labels == 0))
print("Elements in cluster 1: ", np.count_nonzero(labels == 1))
print("Elements in cluster 2: ", np.count_nonzero(labels == 2))
print("Elements in cluster 3: ", np.count_nonzero(labels == 3))

If the provided distribution is not respected because the algorithm did not converge, it can be enforced with the parameter enforce_cluster_distribution:

model.fit(X, enforce_cluster_distribution=True)
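To decide whether enforcement is needed, the fitted labels can be compared against the requested distribution. The following is a minimal sketch of such a check, not part of the package: the helper names cluster_fractions and respects_distribution are hypothetical, and labels is assumed to be an array of non-negative integer cluster indices like model.labels_ in the usage example.

```python
import numpy as np

def cluster_fractions(labels, n_clusters):
    """Return the fraction of samples assigned to each cluster index."""
    counts = np.bincount(np.asarray(labels), minlength=n_clusters)
    return counts / counts.sum()

def respects_distribution(labels, distribution, tol=1e-9):
    """Hypothetical check: do the observed cluster fractions match the target?"""
    observed = cluster_fractions(labels, len(distribution))
    if observed.shape[0] != len(distribution):
        # A label outside the expected range means the target cannot match.
        return False
    return bool(np.allclose(observed, distribution, atol=tol))

# Synthetic labels standing in for model.labels_ (assumption for illustration).
balanced = np.repeat([0, 1, 2, 3], [16, 12, 8, 4])
uniform = np.repeat([0, 1, 2, 3], [10, 10, 10, 10])

print(respects_distribution(balanced, [0.4, 0.3, 0.2, 0.1]))
print(respects_distribution(uniform, [0.4, 0.3, 0.2, 0.1]))
```

If the check fails on real output, refitting with enforce_cluster_distribution=True as shown above is the option the package provides.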

Cluster sizes in the figure above are 16, 12, 8 and 4, which correspond to the distribution [0.4, 0.3, 0.2, 0.1] over 40 samples.
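The arithmetic behind these sizes can be sketched as follows. This is an illustration only and does not claim to reproduce how the package rounds internally: target_cluster_sizes is a hypothetical helper, and the largest-remainder rule used for fractions that do not divide evenly is an assumption.

```python
import numpy as np

def target_cluster_sizes(n_samples, distribution):
    """Turn fractional cluster sizes into integers that sum to n_samples."""
    fractions = np.asarray(distribution, dtype=float)
    if not np.isclose(fractions.sum(), 1.0):
        raise ValueError("distribution must sum to 1")
    # Round away floating-point noise before taking the floor.
    raw = np.round(fractions * n_samples, 9)
    sizes = np.floor(raw).astype(int)
    # Largest-remainder rule (assumption): give leftover samples to the
    # clusters with the biggest fractional parts.
    leftover = n_samples - sizes.sum()
    order = np.argsort(-(raw - sizes), kind="stable")
    sizes[order[:leftover]] += 1
    return sizes

sizes = target_cluster_sizes(40, [0.4, 0.3, 0.2, 0.1])
print(sizes)  # [16 12  8  4]
```

With 40 samples every fraction maps to a whole number, so no leftover samples need to be redistributed.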

Copyright

Copyright (c) 2023 Jing Wang & Albert Pla. Released under the MIT License.

Third-party copyright in this distribution is noted where applicable.

Reference
