## PERFORMANCE EVALUATION OF CLUSTERING TECHNIQUES FOR IMAGE SEGMENTATION

Image segmentation is a core technique in digital image processing and analysis, and clustering has been applied to it many times in the literature. This notebook implements several clustering algorithms for image segmentation and evaluates the performance of each.
Clustering is an unsupervised technique: the model groups data points by the similarity of their statistical features, without using labels during training as supervised techniques do. Because there are no labels or targets to score against, the evaluation of clustering techniques is often overlooked.
To address this, we take the "ground truth" approach, in which a ground-truth mask supplied with the dataset serves as the reference for measuring performance.

#### Methodology for finding the predicted and the ground truth binary mask

When clustering for image segmentation, we use only the image itself, without the depth or mask data, so the clustering algorithm partitions the data points into clusters based on their feature vectors alone. Depth would have been used here if we were not applying clustering. For performance evaluation, the predicted binary mask is compared with the ground truth binary mask. The predicted mask is obtained by running the clustering algorithm, taking the resulting cluster labels, and converting them to a binary mask. The ground truth binary mask comes either from manual annotation of the dataset or from an existing labelled dataset.
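The pipeline described above can be sketched on a small synthetic image. This is only an illustration under stated assumptions: the image, the ground-truth mask, and the rule "the brighter cluster is the foreground" are all made up for the example and are not taken from the Carvana data. Each pixel is treated as one sample whose only feature is its intensity.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 20x20 grayscale image: a bright square (the "object") on a dark background.
rng = np.random.default_rng(0)
image = rng.normal(loc=30, scale=5, size=(20, 20))
image[5:15, 5:15] = rng.normal(loc=200, scale=5, size=(10, 10))

# Ground-truth mask for the synthetic image (1 = object, 0 = background).
gt_mask = np.zeros((20, 20), dtype=np.uint8)
gt_mask[5:15, 5:15] = 1

# Each pixel is one sample with a single feature (its intensity).
pixels = image.reshape(-1, 1)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# Cluster IDs are arbitrary; as an assumption, the brighter cluster is treated as foreground.
foreground = int(np.argmax(kmeans.cluster_centers_.ravel()))
pred_mask = (kmeans.labels_ == foreground).astype(np.uint8).reshape(image.shape)

# Fraction of pixels on which the predicted and ground-truth masks agree.
pixel_agreement = float((pred_mask == gt_mask).mean())
```

On real photographs a single intensity feature is rarely enough, so the feature vector per pixel (colour channels, position, texture) is a design choice in its own right.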

## Dataset

We use the widely used Carvana dataset for computer vision, which consists of car images and their corresponding masks, with the masks also stored as images. For detailed information and guidelines on the dataset, see the official page:
https://www.kaggle.com/c/carvana-image-masking-challenge

### Implementation

This notebook is implemented mainly in Python, using the PyTorch framework for deep learning and the scikit-learn (sklearn) library for classical machine learning algorithms.

### Importing relevant libraries to prepare dataset

In [2]:
import skimage

In [1]:
import torch
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor, transforms, InterpolationMode
import os
import numpy as np
from PIL import Image
from skimage import io
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

In [2]:
root_dir =  "./car-segmentation"
image_file = "./car-segmentation/images"
mask_file = "./car-segmentation/masks"

### Creating a custom dataset class for the Carvana dataset

In [3]:
class CustomCarvanaDataset(Dataset):
    def __init__(self, root_dir, image_file, mask_file, transform = None):
        self.root_dir = root_dir
        self.image_file = image_file
        self.mask_file = mask_file
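        # os.listdir returns files in arbitrary order; sort both lists so that
        # index i of the images lines up with index i of the masks
        self.image_file_paths = [os.path.join(image_file, img) for img in sorted(os.listdir(image_file))]
        self.mask_file_paths = [os.path.join(mask_file, mask) for mask in sorted(os.listdir(mask_file))]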
        self.transform = transform
        
    def __len__(self):
        return len(self.image_file_paths)
    
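    def ground_truth_bin_mask(self):
        # Read the masks in the mask directory and binarize them (non-zero -> 1).
        # Note: only the last mask read from the top-level directory is returned.
        for root, _, files in os.walk(self.mask_file):
            for file_name in files:
                gif_path = os.path.join(root, file_name)
                image_m = io.imread(gif_path)
                image_m = np.array(image_m)
                image_m[image_m > 0] = 1
            return image_m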
        
    def __getitem__(self, idx):
        image_path = self.image_file_paths[idx]
        mask = self.mask_file_paths[idx]
        image = np.array(Image.open(image_path).convert("L").resize((400, 400)))
        image_mask = np.array(Image.open(mask).resize((400, 400)))
        image_mask[image_mask > 0 ] = 1        
    
        return image, image_mask

We now apply data augmentation using transforms. Augmentation increases the effective size of the dataset and can improve model performance. Augmentations such as rotations and flips can also help the model learn to be invariant to those transforms. We will implement the resize, rotation, horizontal flip and vertical flip transforms.
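For segmentation, any geometric augmentation has to be applied identically to the image and its mask, otherwise the two no longer line up. The sketch below illustrates this with plain NumPy as a stand-in for the torchvision transforms; the helper `augment_pair` and its parameters are hypothetical, and it covers only flips and 90-degree rotations (not resize or arbitrary-angle rotation).

```python
import numpy as np

def augment_pair(image, mask, k_rot90=0, hflip=False, vflip=False):
    """Apply the same flips and 90-degree rotations to an image and its mask."""
    if hflip:
        image, mask = np.fliplr(image), np.fliplr(mask)
    if vflip:
        image, mask = np.flipud(image), np.flipud(mask)
    # Rotate both arrays counter-clockwise by k_rot90 * 90 degrees.
    image, mask = np.rot90(image, k_rot90), np.rot90(mask, k_rot90)
    return image.copy(), mask.copy()

# Toy example: the mask marks every pixel whose value is greater than 5.
image = np.arange(12).reshape(3, 4)
mask = (image > 5).astype(np.uint8)

aug_image, aug_mask = augment_pair(image, mask, k_rot90=1, hflip=True)
```

Because both arrays go through the same transform, the augmented mask still marks exactly the pixels of the augmented image whose value is greater than 5.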

In [5]:
dataset = CustomCarvanaDataset(root_dir = root_dir, image_file = image_file, mask_file = mask_file)

In [6]:
len(dataset[0])

2

In [None]:
ground_bin_mask = dataset.ground_truth_bin_mask()
ground_bin_mask.shape

In [None]:
first_batch = next(iter(DataLoader(dataset)))  # assumption: first batch from a default DataLoader
inputs = first_batch

In [None]:
inputs[0].shape

## APPLYING THE CLUSTERING ALGORITHMS

We now apply seven clustering techniques to the above dataset to see which performs best against the ground-truth masks. The clustering algorithms we implement are:
- K-means clustering
- Fuzzy C-means clustering
- Hierarchical clustering
- DBSCAN
- Mean-Shift clustering
- Spectral clustering
- Gaussian mixture models.

We then evaluate each technique using the following metrics:
- Jaccard index
- Rand index
- Fowlkes-Mallows index
- Precision, Recall, F1-score.
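For binary masks, the Jaccard index, precision, recall and F1-score can all be derived from the counts of true positives, false positives and false negatives. The sketch below writes them out in NumPy to make the definitions explicit; the helper name `binary_mask_metrics` and the toy arrays are illustrative, and the notebook itself relies on the sklearn functions instead.

```python
import numpy as np

def binary_mask_metrics(gt, pred):
    """Compute overlap metrics between two binary masks (1 = foreground)."""
    gt = np.asarray(gt).astype(bool).ravel()
    pred = np.asarray(pred).astype(bool).ravel()

    tp = int(np.sum(gt & pred))    # foreground in both masks
    fp = int(np.sum(~gt & pred))   # predicted foreground, actually background
    fn = int(np.sum(gt & ~pred))   # missed foreground

    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"jaccard": jaccard, "precision": precision, "recall": recall, "f1": f1}

gt = np.array([1, 1, 1, 0, 0, 0])
pred = np.array([1, 1, 0, 1, 0, 0])
metrics = binary_mask_metrics(gt, pred)  # tp = 2, fp = 1, fn = 1
```

In this toy case the Jaccard index is 2 / 4 = 0.5, while precision, recall and F1 are all 2 / 3.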

## Importing the libraries for the metrics for evaluation

In [None]:
from sklearn.metrics import jaccard_score, adjusted_rand_score, fowlkes_mallows_score, precision_score, recall_score, f1_score

## K-means clustering

In [None]:
from sklearn.cluster import KMeans

In [None]:
# collecting every image and mask from the dataset
X_train = []
y_train = []
for i in range(len(dataset)):
    x, y = dataset[i]
    X_train.append(x)
    y_train.append(y)
    
X_train = np.array(X_train)
y_train = np.array(y_train)

In [None]:
X_train.shape, y_train.shape

#### Elbow method to determine k

In [None]:
def determine_best_k(data, max_k):
    distortions = []
    for k in range(1, max_k + 1):
        kmeans = KMeans(n_clusters = k, random_state = 0)
        kmeans.fit(data)
        distortions.append(kmeans.inertia_)
    
    # plot elbow
    plt.plot(range(1, max_k + 1), distortions, marker = "o")
    plt.xlabel("Number of clusters (k)")
    plt.ylabel("Distortion")
    plt.title("elbow")
    plt.show()
    
    #determine best k based on elbow
    deltas = np.diff(distortions)
    acceleration = np.diff(deltas)
    best_k = acceleration.argmax() + 2
    
    return best_k
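The `+ 2` offset in the elbow rule above follows from how `np.diff` shifts indices: with distortions recorded for k = 1, 2, ..., the second difference at index `j` is centred on k = `j + 2`. A small worked example, using made-up distortion values rather than results from the dataset:

```python
import numpy as np

# Illustrative distortions for k = 1..5 (not measured on the Carvana data).
distortions = np.array([1000.0, 400.0, 300.0, 250.0, 220.0])

deltas = np.diff(distortions)    # change from k to k+1: [-600, -100, -50, -30]
acceleration = np.diff(deltas)   # change of the change: [500, 50, 20]

# The largest second difference marks where the curve bends most sharply.
# acceleration[0] is centred on k = 2, hence the + 2 offset.
best_k = int(acceleration.argmax()) + 2
```

Here the drop in distortion flattens most abruptly after k = 2, so the rule selects 2. Note that this rule can never select k = 1 or k = max_k, because the second difference is undefined at the ends of the range.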

In [None]:
max_k = 5

In [None]:
# flatten each image into a 1D feature vector before clustering
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1] * X_train.shape[2]))

In [None]:
best_k = determine_best_k(X_train, max_k)

In [None]:
# create the kmeans model; k is fixed at 2 (car vs. background) so that the
# cluster labels map directly onto a binary mask
kmeans = KMeans(n_clusters=2, random_state=0)

In [None]:
kmeans.fit(X_train)

In [None]:
cluster_labels = kmeans.labels_

In [None]:
cluster_labels

#### converting cluster labels to binary mask

In [None]:
image_shape = (508800, 100)

In [None]:
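# assumes cluster_labels holds one label per pixel, so it can be reshaped to image_shape
labels_2d = np.asarray(cluster_labels).reshape(image_shape)
pred_binary_mask = np.zeros(image_shape, dtype = np.uint8)
for label, pixel_value in enumerate(np.unique(cluster_labels)):
    pred_binary_mask[labels_2d == pixel_value] = label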
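Cluster IDs are arbitrary, so cluster 1 is not guaranteed to be the car. Label-sensitive metrics such as the Jaccard index, precision and recall can therefore look very poor simply because the two labels are swapped. One possible remedy, sketched below under the assumption that the ground truth may be used to pick the mapping at evaluation time, is to flip the predicted mask when the flipped version overlaps the ground truth better. The helper `align_binary_labels` is hypothetical.

```python
import numpy as np

def align_binary_labels(gt_mask, pred_mask):
    """Flip the predicted labels if the flipped mask overlaps the ground truth better."""
    gt = np.asarray(gt_mask).astype(bool)
    pred = np.asarray(pred_mask).astype(bool)

    def iou(a, b):
        union = np.logical_or(a, b).sum()
        return np.logical_and(a, b).sum() / union if union else 1.0

    if iou(gt, ~pred) > iou(gt, pred):
        pred = ~pred
    return pred.astype(np.uint8)

gt_mask = np.array([[0, 1, 1], [0, 1, 1]])
pred_mask = np.array([[1, 0, 0], [1, 0, 0]])  # same partition, labels swapped
aligned = align_binary_labels(gt_mask, pred_mask)
```

Permutation-invariant scores such as the adjusted Rand index and the Fowlkes-Mallows score do not need this step, since they compare partitions rather than label values.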

In [None]:
plt.imshow(pred_binary_mask)

In [None]:
ground_bin_mask

In [None]:
pred_binary_mask.shape

In [None]:
ground_bin_mask.shape

#### resizing the predicted binary mask to the ground binary mask

In [None]:
from skimage.transform import resize

In [None]:
target_size = ground_bin_mask.shape

In [None]:
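# preserve_range=True keeps the 0/1 values; without it, resize rescales uint8 input to floats in [0, 1]
resized_pred_bin_mask = resize(pred_binary_mask, target_size, order=0, mode="constant", anti_aliasing=False, preserve_range=True).astype(np.uint8)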

In [None]:
resized_pred_bin_mask.shape
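Nearest-neighbour interpolation (`order=0`) is used for masks because it only copies existing pixel values, so a binary mask stays binary after resizing, whereas linear or cubic interpolation would introduce intermediate values along the edges. The idea can be sketched in NumPy as follows; `resize_nearest` is a hypothetical helper with a simplified index mapping and is not meant to reproduce the exact sampling of `skimage.transform.resize`.

```python
import numpy as np

def resize_nearest(mask, target_shape):
    """Resize a 2D mask by copying the nearest source pixel for each target pixel."""
    mask = np.asarray(mask)
    # Map each target row/column index back to a source index.
    rows = (np.arange(target_shape[0]) * mask.shape[0] / target_shape[0]).astype(int)
    cols = (np.arange(target_shape[1]) * mask.shape[1] / target_shape[1]).astype(int)
    return mask[np.ix_(rows, cols)]

small = np.array([[0, 1],
                  [1, 0]], dtype=np.uint8)
large = resize_nearest(small, (4, 6))
```

Every value in `large` is copied from `small`, so the result contains only 0s and 1s and keeps the `uint8` dtype.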

In [None]:
from scipy.ndimage import zoom

In [None]:
#resized_ground_mask = zoom(ground_bin_mask, (508800/1280, 100/1918))

In [None]:
#resized_ground_mask

In [None]:
#resized_ground_mask.shape

In [None]:
# flattening the arrays to 1D (the resized prediction is re-binarized first)
ground_bin_mask_flat = ground_bin_mask.ravel()
resized_pred_bin_mask_flat = (resized_pred_bin_mask > 0).astype(np.uint8).ravel()

#### jaccard index evaluation

In [None]:
jaccard_index = jaccard_score(ground_bin_mask_flat, resized_pred_bin_mask_flat)

In [None]:
jaccard_index

#### adjusted Rand index

In [None]:
rand_index = adjusted_rand_score(ground_bin_mask_flat, resized_pred_bin_mask_flat)

In [None]:
rand_index

#### Fowlkes-Mallows score

In [None]:
fowlkes_score = fowlkes_mallows_score(ground_bin_mask_flat, resized_pred_bin_mask_flat)

In [None]:
fowlkes_score
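Both the Rand index and the Fowlkes-Mallows score are pair-counting measures: they look at every pair of pixels and ask whether the two labelings agree on putting the pair in the same cluster or in different clusters. The sketch below computes both from a contingency table; the helper `pair_counting_scores` is illustrative. It computes the plain Rand index, whereas `adjusted_rand_score` used in this notebook additionally corrects for chance agreement, so its value will generally differ.

```python
import numpy as np

def pair_counting_scores(labels_true, labels_pred):
    """Return (Rand index, Fowlkes-Mallows score) for two flat labelings."""
    labels_true = np.asarray(labels_true).ravel()
    labels_pred = np.asarray(labels_pred).ravel()
    n = labels_true.size

    # Contingency table: table[i, j] = number of samples in true class i and cluster j.
    _, t_idx = np.unique(labels_true, return_inverse=True)
    _, p_idx = np.unique(labels_pred, return_inverse=True)
    table = np.zeros((t_idx.max() + 1, p_idx.max() + 1), dtype=np.int64)
    np.add.at(table, (t_idx, p_idx), 1)

    def n_pairs(x):
        return x * (x - 1) // 2

    tp = int(n_pairs(table).sum())                     # same group in both labelings
    same_true = int(n_pairs(table.sum(axis=1)).sum())  # same group in labels_true
    same_pred = int(n_pairs(table.sum(axis=0)).sum())  # same group in labels_pred
    fp, fn = same_pred - tp, same_true - tp
    total = n * (n - 1) // 2
    tn = total - tp - fp - fn

    rand = (tp + tn) / total
    fm = tp / np.sqrt(same_true * same_pred) if same_true and same_pred else 0.0
    return rand, float(fm)

rand, fm = pair_counting_scores([0, 0, 1, 1], [0, 0, 1, 0])
```

For this toy labeling there are six pairs with TP = 1, TN = 2, FP = 2 and FN = 1, giving a Rand index of 3 / 6 = 0.5 and a Fowlkes-Mallows score of 1 / sqrt(2 * 3).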

#### precision score

In [None]:
precision = precision_score(ground_bin_mask_flat, resized_pred_bin_mask_flat)

In [None]:
precision

#### recall score

In [None]:
recall = recall_score(ground_bin_mask_flat, resized_pred_bin_mask_flat)

In [None]:
recall

#### f1 score

In [None]:
# store the result under a new name so the imported f1_score function is not overwritten
f1 = f1_score(ground_bin_mask_flat, resized_pred_bin_mask_flat)

In [None]:
f1

## Fuzzy C-means clustering