5 Clustering

Valerio Bonometti edited this page Apr 7, 2020 · 7 revisions

Mini Batch k-means parameters

Minimum number of considered k: 2
Maximum number of considered k: 10
Batch Size: 128
Maximum Number of Iterations: 2000 
Number of Initializations: 500
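
As a rough illustration of how these settings could be applied, the sketch below sweeps k over the stated range and records the inertia of each fit. It assumes scikit-learn's `MiniBatchKMeans`, which this page does not name as the implementation actually used; `inertia_curve` and `random_state` are hypothetical additions for the example. Note that in scikit-learn, `n_init` only selects the best of several candidate initializations and the algorithm itself is run once.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def inertia_curve(X, k_min=2, k_max=10, batch_size=128, max_iter=2000,
                  n_init=500, random_state=0):
    """Fit Mini Batch k-means for every k in [k_min, k_max].

    Defaults mirror the parameters listed above; the function name and
    random_state are assumptions made for this sketch.
    Returns the list of k values and the inertia of each fitted model.
    """
    n_clusters = list(range(k_min, k_max + 1))
    inertias = []
    for k in n_clusters:
        model = MiniBatchKMeans(
            n_clusters=k,
            batch_size=batch_size,
            max_iter=max_iter,
            n_init=n_init,
            random_state=random_state,
        )
        model.fit(X)
        # inertia_ is the sum of squared distances to the closest centroid.
        inertias.append(float(model.inertia_))
    return n_clusters, inertias
```

The two returned lists have the shape expected by an elbow detector that takes the explored k values and their inertias.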

The algorithm we employed for detecting the elbow:

import numpy as np

def auto_elbow(n_clusters, inertias):
    '''
    Args:
    n_clusters: list of integers, numbers of centroids explored
    inertias: list of floats, inertia associated with each number of centroids

    Returns:
    optimal_k: integer, number of centroids corresponding to the elbow
    '''
    y = (inertias[0], inertias[-1])
    x = (n_clusters[0], n_clusters[-1])

    alpha, beta = np.polyfit(
        x,
        y,
        1
    )
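    grad_line = [beta+alpha*k for k in n_clusters]
    # The elbow is the point of the curve furthest below the straight line.
    elbow_index = np.argmax([l - i for l, i in zip(grad_line, inertias)])
    # Map the index back to its k; index + 1 is only right when k starts at 1.
    optimal_k = n_clusters[elbow_index]
    return optimal_k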
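
To make the idea concrete, here is a small self-contained sketch of the same chord-based heuristic: draw a straight line between the first and last points of the inertia curve and pick the k at which the curve lies furthest below it. The name `chord_elbow` is hypothetical and the inertia values are made up purely for illustration; they are not results from this project.

```python
import numpy as np


def chord_elbow(n_clusters, inertias):
    """Return the k whose inertia lies furthest below the end-to-end chord."""
    ks = np.asarray(n_clusters, dtype=float)
    values = np.asarray(inertias, dtype=float)
    # Slope of the line through the first and last points of the curve.
    slope = (values[-1] - values[0]) / (ks[-1] - ks[0])
    chord = values[0] + slope * (ks - ks[0])
    # Vertical gap between the chord and the curve at each k.
    gaps = chord - values
    # Look the winning index up in n_clusters so the result is a k value.
    return int(n_clusters[int(np.argmax(gaps))])


# Made-up inertia values for k = 2..10, for illustration only.
ks = list(range(2, 11))
inertias = [1000.0, 600.0, 350.0, 300.0, 270.0, 250.0, 235.0, 225.0, 220.0]
print(chord_elbow(ks, inertias))  # prints 4
```

In this toy curve the largest gap (455) occurs at index 2 of the list, which corresponds to k = 4 because the sweep starts at k = 2.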

HMS

Targets Embedding Visualization

Clusters Embedding Visualization

Clusters Traces Visualization

HMG

Targets Embedding Visualization

Clusters Embedding Visualization

Clusters Traces Visualization

JC3

Targets Embedding Visualization

Clusters Embedding Visualization

Clusters Traces Visualization

JC4

Targets Embedding Visualization

Clusters Embedding Visualization

Clusters Traces Visualization

LIS

Targets Embedding Visualization

Clusters Embedding Visualization

Clusters Traces Visualization

LISBF

Targets Embedding Visualization

Clusters Embedding Visualization

Clusters Traces Visualization