# K-Medians

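As we mentioned before, K-Means is only guaranteed to converge when Euclidean distance is used, because the mean of a cluster is the point that minimizes the sum of squared Euclidean distances to its members. Using other distance measures with K-Means is therefore strongly discouraged.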

If another distance measure must be used, the algorithm needs to be tweaked slightly. One example is K-Medians, a variant of K-Means that uses Manhattan distance instead of Euclidean distance.
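To make the difference between the two measures concrete, here is a minimal sketch. The two functions below are illustrative stand-ins written for this example; they are not the helpers from `utils.distances`, whose exact signatures are not shown here.

```python
import numpy as np


def euclidean(a, b):
    # Straight-line distance: square root of the sum of squared differences.
    return float(np.sqrt(np.sum((a - b) ** 2)))


def manhattan(a, b):
    # "City block" distance: sum of absolute differences per dimension.
    return float(np.sum(np.abs(a - b)))


a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])

d_euclidean = euclidean(a, b)  # 5.0
d_manhattan = manhattan(a, b)  # 7.0
```

Manhattan distance adds up the per-dimension differences without squaring them, so a large difference in a single dimension weighs less heavily than it does under Euclidean distance.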

import sys ; sys.path.append("D:/source/skratch/source")
import copy

import numpy as np

from utils.distances import euclidean


class KMeans:

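    def __init__(self, k=3, seed=None, n_runs=10, max_iters=300):

        self.max_iters = max_iters  # upper bound on the number of iterations
        self.k = k  # number of clusters
        self.rnd = np.random.RandomState(seed)  # seeded generator for reproducibility
        self.n_runs = n_runs  # number of runs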

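To ensure convergence, the definition of a centroid changes slightly. In K-Medians, a centroid is the component-wise median of a cluster: in each dimension, its coordinate is the median of that coordinate over all points in the cluster. Just as the mean minimizes the sum of squared Euclidean distances, the component-wise median minimizes the sum of Manhattan distances to the points in the cluster.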
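A small numerical check illustrates why the median is the right centroid for Manhattan distance. This is only a sketch on made-up data; `total_manhattan` is a hypothetical helper introduced for the example.

```python
import numpy as np


def total_manhattan(points, center):
    # Sum of Manhattan distances from every point to the candidate center.
    return float(np.abs(points - center).sum())


# Toy cluster with one strong outlier (values chosen for illustration only).
points = np.array([
    [1.0, 1.0],
    [2.0, 2.0],
    [3.0, 3.0],
    [4.0, 4.0],
    [100.0, 100.0],
])

median_center = np.median(points, axis=0)  # [3, 3]
mean_center = points.mean(axis=0)          # [22, 22]

cost_median = total_manhattan(points, median_center)  # 202.0
cost_mean = total_manhattan(points, mean_center)      # 312.0
```

The median yields the lower total Manhattan distance, and it is barely moved by the outlier, whereas the mean is dragged towards it.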

    def _compute_centroids(self, X, labels):

        centroids = []

        for i in range(self.k):

            # Component-wise median of all points assigned to cluster i
            centroid = np.median(X[labels == i], axis=0)
            centroids.append(centroid)

        return np.array(centroids)

Thankfully, because K-Medians and K-Means share so much of their logic, this is all that is required to implement K-Medians!
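For readers who want to see the whole loop in one place, here is a minimal, self-contained sketch of K-Medians. It is not the class used in this post: the function name `k_medians`, the initialization from randomly chosen data points, and the handling of empty clusters are all assumptions made for the example.

```python
import numpy as np


def k_medians(X, k=3, seed=None, max_iters=300):
    rnd = np.random.RandomState(seed)

    # Assumption: initialize centroids with k distinct data points.
    centroids = X[rnd.choice(len(X), size=k, replace=False)].astype(float)

    def assign(centroids):
        # Manhattan distance from every point to every centroid, shape (n, k),
        # then the index of the closest centroid for each point.
        distances = np.abs(X[:, None, :] - centroids[None, :, :]).sum(axis=2)
        return distances.argmin(axis=1)

    for _ in range(max_iters):
        labels = assign(centroids)

        new_centroids = centroids.copy()
        for i in range(k):
            members = X[labels == i]
            # Assumption: an empty cluster keeps its previous centroid.
            if len(members) > 0:
                new_centroids[i] = np.median(members, axis=0)

        # Stop once the centroids no longer move.
        if np.array_equal(new_centroids, centroids):
            break
        centroids = new_centroids

    # Final assignment so the labels match the returned centroids.
    return centroids, assign(centroids)


# Two well-separated toy groups of three points each.
X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
])
centroids, labels = k_medians(X, k=2, seed=0)
```

On this toy data the two centroids should settle on (0, 0) and (10, 10), in some order, with each group of three points sharing one label. Compared with K-Means, only two lines differ: the distance uses absolute differences, and the centroid update uses `np.median` instead of the mean.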

![](https://skratch.valentincalomme.com/wp-content/uploads/2018/08/kmedians.gif)