In [None]:
import numpy as np
import matplotlib.pyplot as plt
import nibabel as nib

<h2>Unsupervised Learning and Image Segmentation with Gaussian Mixture Models</h2>

In this last section we will see an algorithm for unsupervised learning and apply it to segmenting brain MRI. This will be a rather different application of machine learning on biomedical images than the ones we have seen earlier. 

In unsupervised learning, the goal is not to predict a certain label from the features but to approximate the distribution of the features themselves. Mathematically, we are interested in estimating the distribution $p(x)$, where $x$ is the vector of features and $p(\cdot)$ represents the probability distribution. 

There are many methods to do this. For instance, when you fit a Gaussian distribution to your sample, i.e. estimate the mean and variance using sample statistics, you are applying one of the most basic unsupervised learning algorithms. In this section, we will see an algorithm that is slightly more complicated and very widely used in biomedical image analysis.
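As a small illustration of this most basic case, here is a minimal sketch that fits a 1D Gaussian to a sample using sample statistics. The data are synthetic, and the true mean and standard deviation (2.0 and 0.5) as well as the helper `gaussian_pdf` are assumptions made only for this example.

```python
import numpy as np

# Synthetic 1D sample; the true parameters below are illustrative assumptions
rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=0.5, size=10_000)

# Fitting a Gaussian amounts to computing two sample statistics
mu_hat = sample.mean()
var_hat = sample.var(ddof=1)  # unbiased sample variance

def gaussian_pdf(x, mu, var):
    """Density of a 1D Gaussian with mean mu and variance var."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

print('estimated mean:', mu_hat)
print('estimated variance:', var_hat)
print('estimated density at x=2:', gaussian_pdf(2.0, mu_hat, var_hat))
```

The fitted density is our estimate of $p(x)$. It works well when the data form a single cluster, which is exactly the assumption that breaks down in the examples that follow.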

Let us see a 2D dataset to better motivate the need for more complicated models. 

In [None]:
features = np.loadtxt('data/features_clustering.txt')
plt.figure()
plt.scatter(features[:,0], features[:,1])
plt.grid(True)
plt.xlabel('feature 1')
plt.ylabel('feature 2')
plt.show()

As usual, the plot shows the sample points in our dataset. Notice that the samples are gathered around three clusters. This scenario is not uncommon. Let us look at an MRI image (pre-processed for segmentation analysis) and examine its intensity histogram. 

In [None]:
I = nib.load('data/segm_mri.nii.gz')
V = I.get_fdata()  # get_data() is deprecated in nibabel; get_fdata() returns a float array
VI = V[:,:,110]
plt.figure()
plt.imshow(VI,cmap='gray')
plt.show()
plt.figure()
plt.hist(VI.flatten(), bins=100)
plt.show()

The histogram is dominated by background pixels, which have zero intensity. Let us look at only the foreground pixels.

In [None]:
VIf = VI.flatten()
nonzero_VIf = VIf[VIf>0]
plt.figure()
plt.hist(nonzero_VIf, bins=100)
plt.show()

Observe that in this histogram, intensities seem to cluster around several centers as well. 

One important and widely used method for modeling such data is the <b>Gaussian Mixture Model</b>. The main idea is to model the probability distribution of the features as a weighted sum of multiple Gaussian distributions: 

$p(x) = \sum_{k=1}^K p(l=k) p(x|l=k) = \sum_{k=1}^K p(l=k)\mathcal{N}(x;\mu_k, \sigma_k)$

where $l$ is a <b>latent variable</b> or a <b>latent label</b>, $\mathcal{N}(x;\mu_k, \sigma_k)$ are Gaussian distributions, called the mixture components, and $p(l=k)$ are the mixture weights, which satisfy $\sum_{k=1}^K p(l=k) = 1$. $K$ is the number of components that we use in the model. In the basic application of mixture models, $K$ is a user-defined parameter. However, in more advanced techniques it can be estimated from the data as well. 
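To make the formula concrete, here is a minimal sketch that evaluates a 1D mixture density with $K=3$. The weights, means and standard deviations are hypothetical values chosen only for illustration, and $\sigma_k$ is assumed to denote a standard deviation. The helpers `gaussian_pdf` and `mixture_pdf` are not part of any library.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Hypothetical parameters for K = 3 components
weights = np.array([0.2, 0.5, 0.3])    # p(l=k), must sum to 1
means = np.array([30.0, 70.0, 110.0])  # mu_k
sigmas = np.array([8.0, 10.0, 6.0])    # sigma_k (standard deviations)

def mixture_pdf(x):
    """p(x) = sum_k p(l=k) N(x; mu_k, sigma_k), evaluated for each entry of x."""
    x = np.atleast_1d(x)[:, np.newaxis]  # shape (n, 1), broadcasts against (K,)
    return (weights * gaussian_pdf(x, means, sigmas)).sum(axis=1)

# Sanity check: a density should integrate to approximately 1
grid = np.linspace(-50.0, 200.0, 5001)
density = mixture_pdf(grid)
area = np.sum(density) * (grid[1] - grid[0])  # simple Riemann sum
print('area under the mixture density:', area)
```

Because the weights sum to one and each Gaussian integrates to one, the weighted sum is again a valid probability density.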

A Gaussian mixture model implicitly assigns a label to each data point based on the posterior distribution $p(l=k|x)$. These implicit assignments can be used to cluster data and also to segment brain MRI. 
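By Bayes' rule, the posterior is $p(l=k|x) = \frac{p(l=k)\,\mathcal{N}(x;\mu_k, \sigma_k)}{\sum_{j=1}^K p(l=j)\,\mathcal{N}(x;\mu_j, \sigma_j)}$. The sketch below checks this on synthetic 1D data: it computes the posterior by hand from the fitted parameters and compares it with scikit-learn's `predict_proba`. The data, the two-component setting and the name `gmm_demo` are assumptions made for this example only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1D data from two well-separated groups (illustrative assumption)
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 300),
                    rng.normal(8.0, 1.0, 300)])[:, np.newaxis]

gmm_demo = GaussianMixture(n_components=2, random_state=0).fit(x)

# Posterior p(l=k|x) and hard labels as computed by scikit-learn
posteriors = gmm_demo.predict_proba(x)  # shape (n_samples, 2)
hard_labels = gmm_demo.predict(x)

# The same posterior computed by hand with Bayes' rule
w = gmm_demo.weights_                         # p(l=k)
mu = gmm_demo.means_.ravel()                  # mu_k
sd = np.sqrt(gmm_demo.covariances_.ravel())   # sigma_k (1D, 'full' covariance)
lik = np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
manual = w * lik / (w * lik).sum(axis=1, keepdims=True)

print('manual posterior matches predict_proba:', np.allclose(manual, posteriors))
print('labels are the most probable component:',
      np.array_equal(hard_labels, posteriors.argmax(axis=1)))
```

The hard label of a point is simply the component with the largest posterior probability, which is what we will use for clustering and segmentation.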

Gaussian mixture models are implemented in scikit-learn. Let us first apply one to the 2D example we saw at the beginning. 

In [None]:
# Import the required module
from sklearn.mixture import GaussianMixture

# let us set the number of components to 3
gmm = GaussianMixture(n_components=3)
# fitting the mixture model
gmm.fit(features)
# now let us predict the latent label for each sample point
lat_labels = gmm.predict(features)

# Plotting
plt.figure()
plt.plot(features[lat_labels==0,0], features[lat_labels==0,1], 'bo')
plt.plot(features[lat_labels==1,0], features[lat_labels==1,1], 'ro')
plt.plot(features[lat_labels==2,0], features[lat_labels==2,1], 'go')
plt.grid(True)
plt.xlabel('feature 1')
plt.ylabel('feature 2')
plt.show()

We can apply the same algorithm to the intensities of the MRI image. 

In [None]:
# let us set the number of components to 3
gmm_mri = GaussianMixture(n_components=3)
# fitting the mixture model
gmm_mri.fit(nonzero_VIf[:,np.newaxis])
# now let us predict the latent label for each sample point
lat_labels_mri = gmm_mri.predict(nonzero_VIf[:,np.newaxis])

# Plotting
lat_labels_im = -1*np.ones(VIf.shape)
lat_labels_im[VIf > 0] = lat_labels_mri
plt.figure(figsize=(10,8))
plt.subplot(1,2,1)
plt.imshow(VI,cmap='gray')
plt.subplot(1,2,2)
plt.imshow(lat_labels_im.reshape(VI.shape))  # reshape back to the slice dimensions
plt.show()

<h3>Exercise 8:</h3>

In this exercise, you will vary the number of components in the mixture model and observe how the clustering of the 2D dataset and the segmentation of the brain MRI change. 
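As a possible starting point, the sketch below loops over several values of $K$ on a synthetic 2D dataset and reports the cluster sizes together with the Bayesian information criterion (BIC) returned by scikit-learn's `bic` method. The toy data and the choice of BIC as a comparison score are assumptions made for this sketch and are only one of several ways to compare models. You can replace `toy` with `features` or with the MRI intensities.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 2D data with three clusters (illustrative assumption)
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 6.0]])
toy = np.vstack([c + rng.normal(size=(200, 2)) for c in centers])

bics = {}
for k in range(1, 6):
    # fit a mixture with k components
    model = GaussianMixture(n_components=k, random_state=0).fit(toy)
    # lower BIC indicates a better trade-off between fit and model size
    bics[k] = model.bic(toy)
    # number of samples assigned to each component
    counts = np.bincount(model.predict(toy), minlength=k)
    print('K =', k, '| BIC =', round(bics[k], 1), '| cluster sizes =', counts)
```

Look at how the cluster sizes change when $K$ is smaller or larger than the number of visible clusters.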

In [None]:
# TODO