## Spectral clustering
* Spectral clustering makes no assumption about the shape of the clusters
    * It can handle intertwined or spiral-shaped clusters, for example

#### Construct a matrix representation of the graph
* Build a Laplacian matrix of the graph
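
As a rough illustration of this step, the sketch below builds the unnormalized Laplacian `L = D - A` for a small made-up graph. The six-node graph (two triangles joined by one edge) and all variable names are assumptions chosen for the example; they are not taken from the figures in these notes.

```python
import numpy as np

# Hypothetical 6-node graph: triangles 0-1-2 and 3-4-5, joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n_nodes = 6

# Symmetric adjacency matrix A: 1 where two nodes share an edge.
adjacency = np.zeros((n_nodes, n_nodes))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

# Degree matrix D: number of edges touching each node, on the diagonal.
degree = np.diag(adjacency.sum(axis=1))

# Unnormalized graph Laplacian L = D - A.
laplacian = degree - adjacency
print(laplacian)
```

Each row of the Laplacian sums to zero, which is a quick sanity check when building one by hand.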

<img src='./notes/spectral-cluster-0..png'>

#### Compute eigenvalues and eigenvectors of the matrix
* Find the eigenvalues and eigenvectors of the Laplacian matrix
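
A minimal sketch of this step, assuming a small hand-written adjacency matrix (two triangles joined by one edge, a made-up example). Because the Laplacian is symmetric, `np.linalg.eigh` can be used; it returns the eigenvalues in ascending order.

```python
import numpy as np

# Hypothetical graph: triangles 0-1-2 and 3-4-5, joined by the edge 2-3.
adjacency = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

laplacian = np.diag(adjacency.sum(axis=1)) - adjacency

# eigh is meant for symmetric matrices; eigenvalues come back sorted ascending,
# and eigenvectors[:, k] is the eigenvector for eigenvalues[k].
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)

print(np.round(eigenvalues, 3))
```

For a connected graph the smallest eigenvalue is zero (up to rounding) with a constant eigenvector, so the useful cluster information starts at the second eigenvector.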

<img src='./notes/spectral-cluster-1.png'>

#### Map each point to a low-dimensional representation based on one or more eigenvectors

<img src='./notes/spectral-cluster-2.png'>

#### Assign points to clusters based on the new representation
* Look at the components of the eigenvector(s) and determine which nodes belong to which cluster
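
For two clusters, one common way to do this is to map each node to its entry in the second-smallest eigenvector (often called the Fiedler vector) and split on the sign. The sketch below assumes a made-up graph of two triangles joined by one edge; for more than two clusters, one would typically keep several eigenvectors and run k-means on the rows instead.

```python
import numpy as np

# Hypothetical graph: triangles 0-1-2 and 3-4-5, joined by the edge 2-3.
adjacency = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)

# One-dimensional representation: each node becomes one number.
embedding = eigenvectors[:, 1]

# Assign clusters by the sign of that number.
labels = (embedding > 0).astype(int)

print(np.round(embedding, 3))
print(labels)
```

The overall sign of an eigenvector is arbitrary, so which triangle ends up labelled 0 and which 1 may differ between runs or libraries; only the grouping is meaningful.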


<img src='./notes/spectral-cluster-5.png'>


<a href='https://youtu.be/uxsDKhZHDcc'>Video</a> | <a href='https://youtu.be/zkgm0i77jQ8'>Video</a> | <a href='https://youtu.be/cxTmmasBiC8'>Video</a>


In [45]:
import numpy as np
import matplotlib.pyplot as plt

from sklearn.cluster import SpectralClustering
from sklearn.feature_extraction import image

#### Drawing a circle

In [68]:
# set the size of frame [100 x 100]
l = 100

# Return an array representing the indices of a grid.
rows, cols = np.indices((l, l))

# where do you want to center the circle in the [100 x 100] frame
center = (45, 30)
# what is the radius of the circle 
radius = 15

# Draw the circle : 
circle = (rows - center[0])**2 + (cols - center[1])**2 < radius**2

# plot the circle
plt.matshow(circle.T)
plt.scatter(*center, marker='x', c='r');

<img src='./plots/how-to-draw-circle.png'>

### Let's draw some circles

What is a voxel?
* Voxel: (in computer-based modelling or graphic simulation) each of an array of elements of volume that constitute a notional three-dimensional space.
* In 3D computer graphics, a voxel represents a value on a regular grid in three-dimensional space.
* As with pixels in a 2D bitmap, voxels themselves do not typically have their position (i.e. coordinates) explicitly encoded with their values. Instead, rendering systems infer the position of a voxel based upon its position relative to other voxels.

`wikipedia`

*"Voxel is an image of a three-dimensional space region limited by given sizes, which has its own nodal point coordinates in an accepted coordinate system, its own form, its own state parameter that indicates its belonging to some modeled object, and has properties of modeled region."*
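
To make the "position is implied, not stored" idea concrete, here is a small sketch with a made-up 4 x 4 x 4 NumPy array standing in for a voxel grid. The array holds only values; a voxel's coordinates are simply its index in the array.

```python
import numpy as np

# Hypothetical 4 x 4 x 4 volume: each entry is one voxel's value.
# No coordinates are stored anywhere in the array.
volume = np.zeros((4, 4, 4))
volume[1, 2, 3] = 1.0  # set a single voxel

# The position is recovered from where the value sits in the grid.
coords = np.argwhere(volume > 0)
print(coords)  # [[1 2 3]]
```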

In [69]:

l = 100
x, y = np.indices((l, l))

center1 = (28, 24)
center2 = (40, 50)
center3 = (67, 58)
center4 = (24, 70)

radius1, radius2, radius3, radius4 = 16, 14, 15, 14

circle1 = (x - center1[0]) ** 2 + (y - center1[1]) ** 2 < radius1**2
circle2 = (x - center2[0]) ** 2 + (y - center2[1]) ** 2 < radius2**2
circle3 = (x - center3[0]) ** 2 + (y - center3[1]) ** 2 < radius3**2
circle4 = (x - center4[0]) ** 2 + (y - center4[1]) ** 2 < radius4**2

# combine the circles (adding boolean arrays acts as a logical OR) and plot them
circles = circle1 + circle2 + circle3 + circle4
plt.matshow(circles);

<img src='./plots/draw-four-circles.png'>

## Spectral clustering for image segmentation
* In this example, an image with connected circles is generated and spectral clustering is used to separate the circles.

<br>

### Normalized graph cuts
* In this setting, the spectral clustering approach solves the problem known as `‘normalized graph cuts’`
* The image is seen as a graph of connected voxels, and the spectral clustering algorithm amounts to choosing graph cuts defining regions while minimizing the ratio of the gradient along the cut, and the volume of the region.

<br>

* <div style='color:salmon'>As the algorithm tries to balance the volume (i.e., balance the region sizes), the segmentation fails if we take circles with different sizes.</div>
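
A small sketch of the quantity being minimized may help. It assumes the usual normalized-cut definition, `cut(A, B) / vol(A) + cut(A, B) / vol(B)`, where `cut` is the total weight of the edges crossing the split and `vol` is the total degree of a region. The function name `normalized_cut` and the toy graph are made up for illustration.

```python
import numpy as np

def normalized_cut(weights, in_a):
    """Normalized-cut value for the split A (in_a is True) versus B (the rest)."""
    in_a = np.asarray(in_a, dtype=bool)
    in_b = ~in_a
    cut = weights[np.ix_(in_a, in_b)].sum()  # weight of edges crossing the cut
    vol_a = weights[in_a].sum()              # total degree of the nodes in A
    vol_b = weights[in_b].sum()              # total degree of the nodes in B
    return cut / vol_a + cut / vol_b

# Hypothetical graph: triangles 0-1-2 and 3-4-5, joined by the edge 2-3.
weights = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

balanced = normalized_cut(weights, [True, True, True, False, False, False])
lopsided = normalized_cut(weights, [True, False, False, False, False, False])

print(balanced, lopsided)
```

Splitting between the two triangles cuts one edge and leaves two regions of equal volume, so it scores much lower than peeling off a single node. This preference for regions of similar volume is what makes the method struggle with circles of very different sizes.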

<br>

<div style='color:green'>
We use a mask that limits to the foreground: <br>The problem that we are
interested in here is not separating the objects from the background,
but separating them one from the other.
</div>

In [71]:
# create an image with four circles
img = circle1 + circle2 + circle3 + circle4

# Create a mask for limiting the foreground
mask = img.astype(bool)

img = img.astype(float)

# add a constant offset and Gaussian noise (standard deviation 0.2)
img += 1 + 0.2 * np.random.randn(*img.shape)


fig, ax = plt.subplots(nrows=1, ncols=2)
ax[0].matshow(img)
ax[0].set(title='Image')
ax[1].matshow(mask)
ax[1].set(title='Mask');

<img src='./plots/img_and_mask.png'>

### Convert the image into a graph | impose connectivity in estimators
* Why are we converting the image into a graph?
    * <div style='color:steelblue'>In spectral clustering, the image is seen as a graph of connected voxels, and the spectral clustering algorithm amounts to choosing graph cuts defining regions while minimizing the ratio of the gradient along the cut, and the volume of the region.</div>
    * <div style='color:steelblue'>For two clusters, SpectralClustering solves a convex relaxation of the normalized cuts problem on the similarity graph: cutting the graph in two so that the weight of the edges cut is small compared to the weights of the edges inside each cluster.</div>

<br>

* Several estimators in scikit-learn can use connectivity information between features or samples.
    * For instance, Ward clustering (hierarchical clustering) can cluster together only neighboring pixels of an image, thus forming contiguous patches.

<br>

* **For this purpose, the estimators use a `‘connectivity’ matrix`, giving which samples are connected.**

* The function `img_to_graph` returns such a matrix from a 2D or 3D image.
* The function `grid_to_graph` builds a connectivity matrix for images, given the shape of these images.
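
The sketch below shows both functions on a made-up 2 x 3 image (`tiny_img` is an assumption for illustration, not data from this notebook). Each pixel becomes one node, so both results are 6 x 6 sparse matrices.

```python
import numpy as np
from sklearn.feature_extraction.image import grid_to_graph, img_to_graph

# Hypothetical 2 x 3 image with a sharp jump in the last column.
tiny_img = np.array([[0.0, 1.0, 5.0],
                     [0.0, 1.0, 5.0]])

# Connectivity only: which pixels are neighbors on the grid.
connectivity = grid_to_graph(*tiny_img.shape)

# Same neighbor structure, but edges carry the intensity difference
# between the two pixels they connect.
gradient_graph = img_to_graph(tiny_img)

print(connectivity.shape)
print(gradient_graph.toarray())
```

Pixels are numbered in row-major order, so pixel `(0, 1)` is node 1 and pixel `(0, 2)` is node 2; the entry linking them reflects the jump from 1.0 to 5.0.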

<div style='color:green'>These matrices can be used to impose connectivity in estimators that use connectivity information, such as Ward clustering (Hierarchical clustering), but also to build precomputed kernels, or similarity matrices.</div>



In [46]:
graph = image.img_to_graph(img, mask=mask)

<div style='color:salmon'>
Warning: transforming distances to well-behaved similarities<br>
<br>
Note that if the values of your similarity matrix are not well distributed, the spectral problem will be singular and not solvable.
<h4>This happens, e.g., with negative values or with a distance matrix rather than a similarity matrix.</h4> In that case, it is advised to apply a transformation to the entries of the matrix:

<h5 style='color:seagreen'>similarity = np.exp(-beta * distance / distance.std())</h5>

</div>

In [48]:
graph.data = np.exp(-graph.data / graph.data.std())
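
To see what this transformation does, here is a minimal sketch on a handful of made-up distances. The helper name `heat_kernel` and the sample values are assumptions for illustration; `beta` is the parameter from the formula above (the cell above effectively uses `beta = 1`).

```python
import numpy as np

def heat_kernel(distance, beta=1.0):
    """Turn distances into similarities in (0, 1]; larger distance, smaller similarity."""
    return np.exp(-beta * distance / distance.std())

# Hypothetical edge "distances" (e.g. intensity gradients between neighboring pixels).
distance = np.array([0.0, 0.5, 1.0, 4.0])
similarity = heat_kernel(distance)

print(np.round(similarity, 3))
```

A distance of zero maps to a similarity of exactly 1, and the similarity decays towards 0 as the distance grows, so strong gradients become weak links in the graph.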

### Find the clusters

In [51]:
spectral = SpectralClustering(n_clusters=4, eigen_solver='arpack', affinity='precomputed')
spectral.fit(graph)

In [65]:
# start with -1 (background) everywhere, then write the cluster labels into the foreground
labeled_img = np.full(mask.shape, -1.0)
labeled_img[mask] = spectral.labels_

In [74]:
fig, ax = plt.subplots(nrows=1, ncols=2)
ax[0].imshow(img)
ax[0].set(title='Image')
ax[1].imshow(labeled_img, cmap=plt.cm.Paired)
ax[1].set(title='Clusters');

<img src='./plots/img_and_cluster.png'>