Below is the simple data set we use for testing k-means, hierarchical clustering, and Gaussian mixture models.  It consists of N samples of D-dimensional, spherical normal random variables, drawn in equal parts from three clusters with different means.

In [9]:
import numpy as np
import matplotlib.pyplot as plt

def make_data():
    """Return 900 samples: 300 from each of three unit-variance spherical Gaussians."""
    s = 4  # separation of the clusters
    D = 2  # dimension
    mu1 = np.array([0, 0])
    mu2 = np.array([s, s])
    mu3 = np.array([0, s])

    N = 900  # number of samples
    X = np.zeros((N, D))
    X[:300, :] = np.random.randn(300, D) + mu1
    X[300:600, :] = np.random.randn(300, D) + mu2
    X[600:, :] = np.random.randn(300, D) + mu3
    return X
    

In k-means, each sample is first randomly assigned to a cluster.  Then, until some stopping criterion is met:
    Set each cluster center to the mean of the points assigned to that cluster
    Re-assign each point to the nearest cluster center
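
The two alternating steps above can be sketched in plain NumPy. This is a minimal illustration rather than what `sklearn` does internally: the function name `kmeans`, the demo array `X_demo`, the fixed seed, the Euclidean distance, the stopping rule (assignments stop changing), and the re-seeding of empty clusters are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Plain k-means, starting from a random assignment of samples to clusters."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(X))  # random initial assignment
    centers = np.zeros((k, X.shape[1]))
    for _ in range(max_iter):
        # Step 1: each center becomes the mean of the points assigned to it.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
            else:
                # Assumption: an empty cluster is re-seeded at a random sample.
                centers[j] = X[rng.integers(len(X))]
        # Step 2: re-assign every point to its nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Assumed stopping criterion: the assignments no longer change.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return centers, labels

# Demo data shaped like the notebook's: three spherical clusters of 300 points.
rng = np.random.default_rng(0)
means = np.array([[0, 0], [4, 4], [0, 4]])
X_demo = np.vstack([rng.standard_normal((300, 2)) + m for m in means])
centers, labels = kmeans(X_demo, k=3)
print(centers)
```

Like any k-means variant, this can settle into a local optimum, so the recovered centers depend on the initial assignment.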
    
In soft k-means, points are partially assigned to each cluster based on their distance from it, and cluster centers are recalculated as a weighted average of all points.  K-means can be thought of as soft k-means with weights that are either 0 or 1.
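The responsibility of cluster $k$ for a given sample point $n$ is
$$
r(k,n)=\frac{\exp(-\beta d(c_k,  x_n))}{\sum_j \exp(-\beta d(c_j, x_n)) }
$$
where $d$ is a distance function and $\beta > 0$ controls how sharply the responsibility falls off with distance.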
Then the new cluster position is 
$$
c_k = \frac{\sum_n r(k,n)  x_n}{\sum_n r(k,n)}
$$
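
The two update equations above translate almost line for line into NumPy. The sketch below is illustrative only: the name `soft_kmeans`, the choice of Euclidean distance for $d$, the value `beta=1.0`, the initialization from random samples, and the convergence test are assumptions, not something fixed by the text. Note that the array `r` is indexed as `r[n, k]`, the transpose of $r(k,n)$.

```python
import numpy as np

def soft_kmeans(X, k, beta=1.0, max_iter=100, seed=0):
    """Soft k-means with exponential responsibilities."""
    rng = np.random.default_rng(seed)
    # Assumption: centers start at k distinct, randomly chosen samples.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # d[n, k] = Euclidean distance between sample n and center k (assumed metric).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        # Responsibilities: exp(-beta * d), normalized over clusters.
        # Subtracting each row's minimum leaves the ratio unchanged but avoids underflow.
        w = np.exp(-beta * (d - d.min(axis=1, keepdims=True)))
        r = w / w.sum(axis=1, keepdims=True)
        # New centers: responsibility-weighted average of all points.
        new_centers = (r.T @ X) / r.sum(axis=0)[:, None]
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:  # assumed stopping criterion
            break
    return centers, r

# Demo data shaped like the notebook's: three spherical clusters of 300 points.
rng = np.random.default_rng(0)
means = np.array([[0, 0], [4, 4], [0, 4]])
X_demo = np.vstack([rng.standard_normal((300, 2)) + m for m in means])
soft_centers, r = soft_kmeans(X_demo, k=3, beta=1.0)
print(soft_centers)
```

As `beta` grows, each row of `r` approaches a one-hot vector, and the update approaches the hard k-means step.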

In [26]:
from sklearn.cluster import KMeans
model = KMeans(n_clusters=3, max_iter=100)
X = make_data()
model.fit(X)
model.cluster_centers_



array([[-0.13557347,  4.12198146],
       [-0.03261627, -0.01462501],
       [ 4.00109127,  4.09559269]])

Meanwhile, in Gaussian mixture models, the assumption is that the data come from a mixture of $K$ Gaussians:
$$
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N} \left(x \mid \mu_k,\Sigma_k \right)
$$
where the mixing weights $\pi_k$ are non-negative and sum to 1.
One must choose the number of clusters, and whether the covariance matrices will be spherical, diagonal, or full (or tied, in which case all components share a single full matrix).
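
To make the four covariance options concrete, the sketch below fits `sklearn`'s `GaussianMixture` once per `covariance_type` and records the shape of the fitted `covariances_` attribute. The demo array `X_demo`, the `shapes` dictionary, and the fixed `random_state` are assumptions introduced for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Demo data shaped like the notebook's: three spherical clusters of 300 points.
rng = np.random.default_rng(0)
means = np.array([[0, 0], [4, 4], [0, 4]])
X_demo = np.vstack([rng.standard_normal((300, 2)) + m for m in means])

shapes = {}
for cov_type in ["spherical", "diag", "tied", "full"]:
    gmm = GaussianMixture(n_components=3, covariance_type=cov_type, random_state=0)
    gmm.fit(X_demo)
    # The layout of covariances_ depends on the covariance type.
    shapes[cov_type] = gmm.covariances_.shape
    print(cov_type, shapes[cov_type])
```

With 3 components in 2 dimensions, the shapes are `(3,)` for spherical (one variance per component), `(3, 2)` for diag (one variance per component and dimension), `(2, 2)` for tied (one shared full matrix), and `(3, 2, 2)` for full (one full matrix per component).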


In [32]:
from sklearn.mixture import GaussianMixture as gm
model = gm(n_components=3, covariance_type='diag')
model.fit(X)
model.means_

array([[ 3.98227795,  4.09817004],
       [-0.03332268, -0.0318029 ],
       [-0.12023829,  4.07358926]])