## LAB Assignment
Please finish the **Exercise** and answer the **Questions**.
### Exercise 
In this lab, we will write a program to segment different objects in a video using *K-means* clustering. There are several steps:

1. Load the video and extract frames
2. Implement *K-means* clustering
3. Write the result back to a video


#### Implement K-means
In this lab, you need to implement K-means. The rough procedure is:

1. **initialize centroids** of different classes

   In the simplest case, choose the centroids at random from the original data.

2. **calculate distances** between samples (pixels) and centroids

   Since each sample (pixel) has 3 channels, you can use the sum of the squared per-channel differences between the sample and a centroid as the distance.
   $$
   dist(S,C) = \sum_{i=1}^3(C_i-S_i)^2 \\
   \left\{
   \begin{aligned}
   &dist(S,C): \text{distance between a sample S and a centroid C}\\
   &C: \text{a centroid}\\
   &S: \text{a sample}\\
   &S_i: \text{the } i^{th} \text{ channel's value of S}\\
   &C_i: \text{the } i^{th} \text{ channel's value of C}
   \end{aligned}
   \right.
   $$
   
3. **classify** every sample

   A sample belongs to the class whose centroid is closest to it among all centroids.
   $$
   cls(S) = \arg\min_{k}\sum_{i=1}^3(C_i^k-S_i)^2,\ \ k=1,2,...,K\\
   \left\{
   \begin{aligned}
   &cls(S): \text{class of a sample S}\\
   &K: \text{number of classes}\\
   &C^k: \text{centroid of } k^{th} \text{ class}\\
   \end{aligned}
   \right.
   $$

4. **update centroid**

   Use the mean of all samples in a class as the new centroid of that class.
   $$
   C^k_i =\frac{1}{n^k}\sum^{n^k}_{n=1}S^k_{in},\ \  i=1,2,3\\
   \left\{
   \begin{aligned}
   &C^k_i: \text{the } i^{th} \text{ channel's value of the centroid of the } k^{th} \text{ class} \\
   &n^k: \text{the number of samples in the }  k^{th} \text{ class}\\
   &S^k_{in}: \text{the } i^{th} \text{ channel's value of the } n^{th} \text{ sample in the } k^{th} \text{ class}
   \end{aligned}
   \right.
   $$
   
5. **loop** until the classification result no longer changes
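
As a small worked example of steps 1 to 4, the sketch below runs a single iteration on four made-up "pixels". The data and the fixed choice of initial centroids are assumptions made for illustration only; the lab itself picks centroids at random.

```python
import numpy as np

# Hypothetical toy data: four "pixels" with 3 channels each.
samples = np.array([[0, 0, 0],
                    [10, 10, 10],
                    [200, 200, 200],
                    [220, 220, 220]], dtype=np.float32)

# Step 1: initial centroids (fixed here instead of random, so the run is reproducible).
centroids = samples[[0, 2]].copy()

# Step 2: squared distance between every sample and every centroid, shape (4, 2).
diff = samples[:, np.newaxis, :] - centroids[np.newaxis, :, :]
dist = np.sum(diff ** 2, axis=2)

# Step 3: each sample joins the class of its nearest centroid.
labels = np.argmin(dist, axis=1)

# Step 4: each centroid moves to the per-channel mean of its samples.
for k in range(len(centroids)):
    centroids[k] = samples[labels == k].mean(axis=0)

print(labels)
print(centroids)
```

The two dark pixels end up in one class and the two bright pixels in the other. Step 5 would repeat steps 2 to 4 until `labels` stops changing.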



In addition, you will find code like this:

```python
while ret:
    frame = np.float32(frame)
    h, w, c = frame.shape
    ...
```

This conversion matters: if you don't convert the `dtype`, K-means hardly converges, which means it can easily get stuck in an infinite loop.
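
One likely reason, shown here as a minimal sketch with made-up pixel values, is that `uint8` arithmetic wraps around modulo 256, so differences between a pixel and a centroid come out wrong unless the data is converted to a floating-point type first.

```python
import numpy as np

pixel = np.array([10, 10, 10], dtype=np.uint8)      # hypothetical pixel
centroid = np.array([40, 40, 40], dtype=np.uint8)   # hypothetical centroid

# uint8 cannot hold negative numbers, so 10 - 40 wraps around to 226.
wrapped = pixel - centroid

# After converting to float32 the difference is the expected -30.
correct = np.float32(pixel) - np.float32(centroid)

print(wrapped)
print(correct)
```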



After you finish K-means, you will find that the output video is hard to watch because the **color** assigned to a region **changes between adjacent frames almost all the time**. Try to find a way to alleviate this yourself.

**This part is optional**; try it if you want.
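
One possible approach, offered only as a sketch and not as the intended solution, is to renumber the clusters of every frame by centroid brightness, so that the darkest cluster always gets color 0, the next one color 1, and so on. The helper `relabel_by_brightness` is hypothetical, and it assumes your K-means also returns the final centroids. Another option is to initialize each frame's centroids with those of the previous frame.

```python
import numpy as np


def relabel_by_brightness(labels, centers):
    """Renumber clusters so that label 0 is the darkest centroid, label 1 the next, etc."""
    order = np.argsort(centers.sum(axis=1))   # old cluster ids sorted dark -> bright
    rank = np.empty_like(order)
    rank[order] = np.arange(len(order))       # rank[old_id] = new_id
    return rank[labels], centers[order]


# Hypothetical result of K-means on one frame: 3 centroids, 4 pixel labels.
centers = np.array([[200, 200, 200],
                    [10, 10, 10],
                    [90, 90, 90]], dtype=np.float32)
labels = np.array([0, 1, 2, 1])

new_labels, new_centers = relabel_by_brightness(labels, centers)
print(new_labels)
print(new_centers)
```

Because the numbering no longer depends on the random initialization, the same kind of region tends to keep the same color from frame to frame, as long as the relative brightness of the clusters stays stable.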






#### Import some libraries

```python
import os

import cv2
import numpy as np
import tqdm

# Color of each cluster. Note that OpenCV stores pixels in BGR order.
GBR = [[0, 0, 255],
       [0, 128, 255],
       [255, 0, 0],
       [128, 0, 128],
       [255, 0, 255]]

# path configuration
project_root = os.path.abspath('.')
input_path = os.path.abspath('.')
output_path = os.path.join(os.path.abspath('.'), 'output')
os.makedirs(output_path, exist_ok=True)


def kmeans(data: np.ndarray, n_cl: int):
    """
        K-means

    :param data:    original data, shape (n_samples, channel)
    :param n_cl:    number of classes
    :return:        label of every sample, shape (n_samples,)
    """
    n_samples, channel = data.shape

    # initialize centroids with n_cl distinct samples chosen at random
    args = np.random.choice(n_samples, n_cl, replace=False)
    centers = data[args].astype(np.float32)

    # -1 is never a valid label, so the first comparison below always fails
    old_labels = np.full(n_samples, -1)
    while True:
        # distance between every sample and every centroid, shape (n_samples, n_cl)
        distance = np.linalg.norm(data[:, np.newaxis, :] - centers[np.newaxis, :, :], axis=2)

        # assign each sample to its nearest centroid
        new_labels = np.argmin(distance, axis=1)

        # stop once the assignment no longer changes
        if np.all(new_labels == old_labels):
            break
        old_labels = new_labels

        # move each centroid to the per-channel mean of its samples
        for i in range(n_cl):
            members = data[new_labels == i]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centers[i] = members.mean(axis=0)

    return new_labels
```

#### Load video and detect
We use `opencv` to read the video.
<font color=red>Pay attention</font>: the data type of `frame` is `uint8`, not `int`. In this lab, each frame has 3 channels.
If you don't convert the frame back to `uint8` before writing it, the output video will look strange; you can try it and see.
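
As a minimal illustration of that conversion, the sketch below turns a made-up floating-point frame back into `uint8`. Rounding and clipping to the range 0 to 255 first is an extra precaution assumed here; it is not something the lab requires.

```python
import numpy as np

# Hypothetical 1x1 frame with 3 channels, stored as float32.
frame_float = np.array([[[0.0, 127.6, 300.0]]], dtype=np.float32)

# Round, clip to the valid range, then convert to uint8.
frame_u8 = np.clip(np.rint(frame_float), 0, 255).astype(np.uint8)

print(frame_u8.dtype)
print(frame_u8)
```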

```python
def detect(video, n_cl=2):
    # load video, get the number of frames, the frame rate and the frame size
    cap = cv2.VideoCapture(video)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))

    # instantiate a video writer that keeps the frame rate of the input
    video_writer = cv2.VideoWriter(
        os.path.join(output_path, "result_with_%dclz.mp4" % n_cl),
        cv2.VideoWriter_fourcc(*'mp4v'),
        fps,
        size,
        isColor=True
    )

    # lookup table that maps a label to the color of its cluster
    palette = np.array(GBR)

    # read the first frame
    ret, frame = cap.read()

    print("Begin clustering with %d classes:" % n_cl)
    bar = tqdm.tqdm(total=n_frames)  # progress bar
    while ret:
        frame = np.float32(frame)
        h, w, c = frame.shape

        # k-means on the flattened pixels
        data = frame.reshape((h * w, c))
        labels = kmeans(data, n_cl=n_cl)

        # dye every pixel with the color of its cluster, then convert to uint8
        new_frame = palette[labels].reshape((h, w, c)).astype("uint8")
        video_writer.write(new_frame)

        ret, frame = cap.read()
        bar.update()

    # release resources
    bar.close()
    video_writer.release()
    cap.release()
    cv2.destroyAllWindows()


video_sample = os.path.join(input_path, "road_video.MOV")
detect(video_sample, n_cl=5)
```

Begin clustering with 5 classes:


100%|██████████| 35/35 [06:21<00:00, 10.91s/it]


#### Sample Result
<div  align="center"> <img src="images/image-20220804142902993.png"  alt="image-20220804142902993" width=600 align=center /></div>

<div  align="center"> <img src="images/image-20220804143125976.png"   width=600 align=center /></div>

### Questions

1. What are the strengths of K-means; when does it perform well?
2. What are the weaknesses of K-means; when does it perform poorly?
3. What makes K-means a good candidate for the clustering problem, if you have enough knowledge about the data?

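Answers:

1. K-means is an unsupervised method whose principle is simple and easy to implement. It is efficient and scalable: its computational complexity is close to linear, O(NKt), where N is the number of samples, K is the number of clusters, and t is the number of iterations. It converges quickly and is easy to understand and interpret. Filtering out noise beforehand helps improve the result, and a well-chosen K may also give better results.
2. It is sensitive to the initial centroids and to outliers, so the clustering result may be a local optimum rather than the global optimum. K is a hyperparameter that usually has to be chosen by experience, and each sample can only be assigned to a single class.
3. The task is unsupervised, and the classes in the dataset are known to be nearly convex.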