# Image Segmentation Using K-means Clustering

In an image, clusters of similar pixel values can be thought of as representing one phase, that is, one color or grey level. Segmentation can therefore be treated as a clustering problem, as in machine learning: the pixel values are grouped into classes, and each class becomes a segment.

In [1]:
import numpy as np
import cv2

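img = cv2.imread("images/BSE_Image.jpg")
# cv2.imread returns None instead of raising an error if the file cannot be read
if img is None:
    raise FileNotFoundError("Could not read images/BSE_Image.jpg")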

If we look at the image, we can observe roughly 4 clearly visible regions that we can try to isolate.

Before applying K-means we need to flatten the image, because the K-means algorithm works on a list of samples (one row per pixel) rather than on a 2D image array. K-means clustering in OpenCV also requires the samples to be of `np.float32` type (see the documentation for more info: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_ml/py_kmeans/py_kmeans_opencv/py_kmeans_opencv.html), so we need to convert the image data to floating point.

In [2]:
# Convert the MxNx3 image into a Kx3 array, where K = MxN
img2 = img.reshape((-1,3))  # -1 lets NumPy infer the first dimension, here MxN

# Convert the uint8 values to float32, as required by OpenCV's k-means
img2 = np.float32(img2)
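
To make the flattening step concrete, here is a minimal sketch on a tiny synthetic array. The array `demo` and its size are made up for illustration and stand in for the real image. It shows that each pixel becomes one row of three channel values, and that reshaping back recovers the original layout.

```python
import numpy as np

# Hypothetical 2x3 image with 3 channels (uint8), standing in for the real image
demo = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# One row per pixel, one column per channel: shape (M*N, 3)
samples = demo.reshape((-1, 3)).astype(np.float32)

print(demo.shape)     # (2, 3, 3)
print(samples.shape)  # (6, 3)
print(samples.dtype)  # float32

# Pixels are stored in row-major order, so reshaping back restores the image
restored = samples.astype(np.uint8).reshape(demo.shape)
print(np.array_equal(restored, demo))  # True
```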

Now we define the termination criteria and the number of clusters, and then apply k-means. When the criteria are satisfied, the algorithm stops iterating.

`cv2.TERM_CRITERIA_EPS`: stop iterating once the specified accuracy, epsilon, is reached.

`cv2.TERM_CRITERIA_MAX_ITER`: stop after the specified number of iterations, max_iter.

`cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER`: stop when either of the above conditions is met.

The criteria are passed as a tuple `(type, max_iter, epsilon)`. In this example, max_iter is 10 and epsilon, the required accuracy, is 1.0.

In [3]:
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)

# Number of clusters
k = 4
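
To illustrate how the two stopping conditions interact, here is a simplified k-means loop written in plain NumPy. It is only a sketch and not OpenCV's implementation: the helpers `assign` and `kmeans_sketch` are hypothetical, and it assumes that epsilon is compared against the largest distance any center moved in one iteration.

```python
import numpy as np

def assign(samples, centers):
    """Label each sample with the index of its nearest center."""
    dists = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_sketch(samples, k, max_iter=10, epsilon=1.0, seed=0):
    """Toy k-means loop (hypothetical helper, not OpenCV's implementation)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct samples chosen at random
    centers = samples[rng.choice(len(samples), size=k, replace=False)].copy()
    n_iter = 0
    for n_iter in range(1, max_iter + 1):          # MAX_ITER: hard cap on iterations
        labels = assign(samples, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = samples[labels == j]
            if len(members) > 0:                   # leave an empty cluster's center alone
                new_centers[j] = members.mean(axis=0)
        # Largest distance moved by any center in this iteration
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift <= epsilon:                       # EPS: centers barely moved, so stop
            break
    return assign(samples, centers), centers, n_iter

# Two well-separated synthetic blobs of 3-channel "pixels"
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 1, (50, 3)),
                 rng.normal(100, 1, (50, 3))]).astype(np.float32)

labels, centers, n_iter = kmeans_sketch(pts, k=2, max_iter=10, epsilon=1.0)
print(n_iter)         # number of iterations actually run, at most 10
print(centers.shape)  # (2, 3)
```

With easy data like this the epsilon test ends the loop after a few iterations. On harder data the iteration cap is what guarantees termination.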

**Number of attempts:** the number of times the algorithm is executed using different initial labelings. The algorithm then returns the labels that yield the best compactness.

*Compactness:* the sum of squared distances from each point to its corresponding center.

In [4]:
attempts = 10
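
As a small worked example of compactness, the sketch below computes it by hand for four made-up points and two made-up centers. The function `compactness` is a hypothetical helper, not part of OpenCV. A labeling that matches the natural grouping gives a much smaller value, which is why the attempt with the lowest compactness is the one that is kept.

```python
import numpy as np

def compactness(samples, labels, centers):
    """Sum of squared distances from each sample to its assigned center."""
    diffs = samples - centers[labels]
    return float((diffs ** 2).sum())

# Four made-up points on a line and two made-up centers
samples = np.array([[0, 0, 0], [2, 0, 0], [10, 0, 0], [12, 0, 0]], dtype=np.float32)
centers = np.array([[1, 0, 0], [11, 0, 0]], dtype=np.float32)

good = np.array([0, 0, 1, 1])  # each point assigned to its nearest center
bad = np.array([0, 1, 0, 1])   # two points assigned to the far center

print(compactness(samples, good, centers))  # 4.0
print(compactness(samples, bad, centers))   # 164.0
```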

**Other flags needed as inputs for K-means:**
 - How the initial centers (seeds) are chosen. There are two options: `cv2.KMEANS_PP_CENTERS`, which uses k-means++ initialization, or `cv2.KMEANS_RANDOM_CENTERS`, which picks random initial centers.

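`cv2.kmeans` returns three values:

1. Compactness.
2. Labels: the label array, with one cluster index per pixel.
3. Centers: the array of cluster centers. For k=4 we will have 4 centers.

NOTE: For a color image each center has one value per channel, so the center array has shape 4x3, for a total of 4x3 = 12 values. Keep in mind that `cv2.imread` loads the channels in BGR order rather than RGB.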

In [5]:
ret,label,center=cv2.kmeans(img2, k, None, criteria, attempts, cv2.KMEANS_PP_CENTERS)

Now convert center values from `float32` back into `uint8` so that we can plot the image.

In [6]:
center = np.uint8(center) 

Next, we use the labels to rebuild the clustered image: each pixel is replaced by the center of the cluster it belongs to.

In [7]:
res = center[label.flatten()]  # Look up the center color for each pixel's label
res2 = res.reshape(img.shape)  # Reshape the result to the size of the original image
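
The indexing step can be easier to follow on a toy example. The sketch below uses made-up centers and labels for a hypothetical 2x2 image. The labels have shape (N, 1), which is the shape `cv2.kmeans` returns, so they are flattened before being used as an index.

```python
import numpy as np

# Two made-up cluster centers (one color each), already converted to uint8
centers = np.array([[10, 20, 30], [200, 210, 220]], dtype=np.uint8)

# One label per pixel for a hypothetical 2x2 image, as a column of shape (4, 1)
labels = np.array([[0], [1], [1], [0]], dtype=np.int32)

# Indexing with the flat labels picks one center row per pixel: shape (4, 3)
flat = centers[labels.flatten()]

# Restore the image layout: 2 rows, 2 columns, 3 channels
image = flat.reshape((2, 2, 3))

print(flat.shape)   # (4, 3)
print(image[0, 1])  # [200 210 220]
print(image[1, 1])  # [10 20 30]
```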


Now let us visualize the output result. 

**Use ImageJ to verify the centroids of each segmented region**

In [10]:
cv2.imwrite("images/segmented.jpg", res2)

cv2.imshow("Original Image",img)  # show the original uint8 image, not the flattened float32 array img2
cv2.imshow("Segmented Image",res2)

cv2.waitKey(0)
cv2.destroyAllWindows()

# import matplotlib.pyplot as plt
# figure_size = 15
# plt.figure(figsize=(figure_size,figure_size))
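# plt.subplot(1,2,1),plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
# plt.title('Original Image'), plt.xticks([]), plt.yticks([])
# plt.subplot(1,2,2),plt.imshow(cv2.cvtColor(res2, cv2.COLOR_BGR2RGB))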
# plt.title('Segmented Image when K = %i' % k), plt.xticks([]), plt.yticks([])
# plt.show()

Later we can also apply various morphological operations, such as opening and closing, to clean up the segments.

Also try experimenting with the number of clusters. For example, re-run the code above with 5 clusters.