Algorithms

Affinity Propagation Algorithm

Overview

Affinity Propagation is a clustering algorithm that identifies a set of "exemplars" among the data points and forms clusters around them. Unlike other clustering methods (e.g., K-Means), it does not require the number of clusters to be specified beforehand. Instead, it exchanges messages between pairs of data points until a stable set of exemplars, and the clusters around them, emerges.

This example demonstrates how to perform Affinity Propagation clustering using the scikit-learn library.

Code

# Import required libraries
from numpy import unique
from numpy import where
from sklearn.datasets import make_classification
from sklearn.cluster import AffinityPropagation
from matplotlib import pyplot

# Define the dataset
X, _ = make_classification(
    n_samples=1000, 
    n_features=2, 
    n_informative=2, 
    n_redundant=0, 
    n_clusters_per_class=1, 
    random_state=4
)

# Define the model
model = AffinityPropagation(damping=0.9)

# Fit the model
model.fit(X)

# Assign a cluster to each example
yhat = model.predict(X)

# Retrieve unique clusters
clusters = unique(yhat)

# Create scatter plot for samples from each cluster
for cluster in clusters:
    # Get row indexes for samples with this cluster (where() returns a tuple, so take its first element)
    row_ix = where(yhat == cluster)[0]
    # Create scatter plot of these samples
    pyplot.scatter(X[row_ix, 0], X[row_ix, 1])

# Show the plot
pyplot.show()
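
To make the message exchange described in the overview more concrete, here is a minimal NumPy sketch of the two message types usually called "responsibility" and "availability". It is an illustration under stated assumptions, not scikit-learn's implementation: the function name `affinity_propagation_sketch` is hypothetical, similarity is taken to be negative squared Euclidean distance, the preference defaults to the median similarity, and the loop runs for a fixed number of iterations with no convergence check.

```python
import numpy as np


def affinity_propagation_sketch(X, damping=0.9, n_iter=200, preference=None):
    """Hypothetical helper: returns (exemplar indices, label per sample)."""
    n = X.shape[0]
    # Assumption: similarity s(i, k) is the negative squared Euclidean distance
    diff = X[:, None, :] - X[None, :, :]
    S = -np.sum(diff ** 2, axis=2)
    if preference is None:
        preference = np.median(S)
    # s(k, k) is the "preference": how willing point k is to be an exemplar
    np.fill_diagonal(S, preference)

    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)

    for _ in range(n_iter):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        AS = A + S
        best = np.argmax(AS, axis=1)
        first_max = AS[rows, best]
        AS[rows, best] = -np.inf
        second_max = np.max(AS, axis=1)
        R_new = S - first_max[:, None]
        R_new[rows, best] = S[rows, best] - second_max
        R = damping * R + (1 - damping) * R_new

        # a(i, k) = min(0, r(k, k) + sum over i' not in {i, k} of max(0, r(i', k)))
        # a(k, k) = sum over i' != k of max(0, r(i', k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())  # r(k, k) is not clipped
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new

    # Point k is an exemplar when a(k, k) + r(k, k) > 0
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Every other point joins its most similar exemplar
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels


# Two small, well-separated groups of points
X_demo = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
exemplars, labels = affinity_propagation_sketch(X_demo)
print("exemplar indices:", exemplars)
print("labels:", labels)
```

The number of clusters is never passed in. It falls out of the preference value: a lower preference makes points less willing to act as exemplars and tends to yield fewer clusters. The damping factor only smooths the updates to reduce oscillation.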

Birch Clustering

Overview

Birch (Balanced Iterative Reducing and Clustering using Hierarchies) is a clustering algorithm designed for large datasets. It incrementally builds a clustering feature tree (CF tree) that summarizes the data as compact subclusters, then clusters those subclusters rather than the raw samples. Because only the summaries need to be kept in memory, it is particularly efficient for datasets with a large number of samples.

This example demonstrates how to perform Birch clustering using the scikit-learn library.

Code

# Import required libraries
from numpy import unique
from numpy import where
from sklearn.datasets import make_classification
from sklearn.cluster import Birch
from matplotlib import pyplot

# Define the dataset
X, _ = make_classification(
    n_samples=1000, 
    n_features=2, 
    n_informative=2, 
    n_redundant=0, 
    n_clusters_per_class=1, 
    random_state=4
)

# Define the model
model = Birch(threshold=0.01, n_clusters=2)

# Fit the model
model.fit(X)

# Assign a cluster to each example
yhat = model.predict(X)

# Retrieve unique clusters
clusters = unique(yhat)

# Create scatter plot for samples from each cluster
for cluster in clusters:
    # Get row indexes for samples with this cluster (where() returns a tuple, so take its first element)
    row_ix = where(yhat == cluster)[0]
    # Create scatter plot of these samples
    pyplot.scatter(X[row_ix, 0], X[row_ix, 1])

# Show the plot
pyplot.show()
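
Because the CF tree is built incrementally, the data does not have to be presented in a single call. The sketch below feeds the samples to scikit-learn's `Birch` in chunks through `partial_fit`. The dataset (three synthetic blobs from `make_blobs`), the chunk count, and the `threshold=0.5` and `n_clusters=3` settings are assumptions chosen for illustration and are not taken from the example above.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

# Assumed toy data: three well-separated blobs
X, _ = make_blobs(
    n_samples=600,
    centers=[[-5, -5], [0, 0], [5, 5]],
    cluster_std=0.5,
    random_state=0,
)

model = Birch(threshold=0.5, n_clusters=3)

# Feed the data in six chunks; each call inserts the new samples into the CF tree
for chunk in np.array_split(X, 6):
    model.partial_fit(chunk)

# Subcluster centroids held in the leaves of the CF tree
print("subcluster centers:", model.subcluster_centers_.shape)

# Final labels come from clustering those subclusters into n_clusters groups
labels = model.predict(X)
print("clusters found:", np.unique(labels))
```

A smaller `threshold` keeps more, tighter subclusters in the tree, which costs more memory but gives a finer summary. A larger value merges more samples into each subcluster.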
