### K-Means → Image Segmentation

Segmentation is the process of dividing an image into its fundamental (atomic) parts.

It is useful for detecting basic objects in images — we first identify the objects and then classify them (e.g., determining whether a region is a face or not).

Its primary applications are in computer graphics, but it is also widely used in other fields.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn import preprocessing
from PIL import Image
import urllib.request

## Perform K-Means clustering on data extracted from an image (segmentation).

Each pixel is represented as a point in $\mathbb{R}^5$, where:

- the first 3 coordinates are the RGB color values,
- the last 2 coordinates are the pixel's position (x, y) in the image.



Draw the segmented image by coloring all pixels in each cluster with a single representative color.

In [None]:
# Raw GitHub URL
img_url = "https://raw.githubusercontent.com/AdamChwila/Predictive-AI/main/stinkbug.png"

# Open the URL and load image
with urllib.request.urlopen(img_url) as url:
    img = Image.open(url)
    img = np.array(img)  # convert to numpy array

# Display the image
plt.imshow(img)
plt.axis('off')
plt.show()

In [None]:
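# img.shape is (rows, columns, channels)
dim_x, dim_y, _ = img.shape
# One row per pixel: its color channels followed by its (row, column) position
img_2d = np.array([np.append(img[i][j], [i, j]) for i in range(dim_x) for j in range(dim_y)])
img_2d.shape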
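
The loop above calls `np.append` once per pixel, which can be slow for large images. As a hedged alternative, the same matrix can be assembled with array operations. The sketch below uses a small random array as a stand-in for the image (`demo_img`, `features_loop`, and `features_vec` are illustrative names, not part of the notebook) and checks that both constructions agree.

```python
import numpy as np

# Hypothetical stand-in for the image: 4 rows, 6 columns, 3 color channels (uint8).
rng = np.random.default_rng(0)
demo_img = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
dim_x, dim_y, n_channels = demo_img.shape

# Loop-based construction, as in the cell above.
features_loop = np.array(
    [np.append(demo_img[i][j], [i, j]) for i in range(dim_x) for j in range(dim_y)]
)

# Vectorized construction: flatten the colors and attach row/column index grids.
rows, cols = np.indices((dim_x, dim_y))
features_vec = np.column_stack(
    [demo_img.reshape(-1, n_channels), rows.ravel(), cols.ravel()]
)

print(features_vec.shape)  # (24, 5)
print(np.array_equal(features_loop, features_vec))
```

Both versions list the pixels in row-major order, so a label array produced from either one can be reshaped back to `(dim_x, dim_y)`.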

Let's see how a single point of the image looks as a 5-element vector.

In [None]:
# representation of the pixel at index 1 (the second pixel, since indexing starts at 0)
img_2d[1]

In [None]:
# representation of the pixel at index 100,000
img_2d[100_000]

### We try to divide the pixels into clusters, but the result is not very good

In [None]:
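k = 2
# n_init and random_state are set explicitly so that the result is reproducible
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(img_2d)
# labels_ follows the row-major pixel order of img_2d, so it reshapes back to the image grid
seg_img = kmeans.labels_.reshape(dim_x, dim_y)
plt.imshow(seg_img, cmap=plt.cm.viridis)
plt.show()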

The reason is that RGB color values (the first three components of the vector) are on a much smaller scale than the pixel coordinates (x, y positions in the image).

As a result, the spatial positions dominate the K-means distance calculations, causing the algorithm to largely ignore the color information.
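
To make the scale argument concrete, the sketch below measures how much each feature contributes to the total variance; features with larger variance contribute more to the squared Euclidean distances that K-means minimizes. It uses hypothetical data, not the notebook's image: a 300 x 400 low-contrast image whose channel values are assumed to lie in a narrow band (100 to 130). The actual numbers for a real image depend on its size and contrast.

```python
import numpy as np

# Hypothetical data (not the notebook's image): a 300 x 400 low-contrast image
# whose channel values are assumed to lie in the narrow band 100..130.
rng = np.random.default_rng(0)
n_rows, n_cols = 300, 400
colors = rng.integers(100, 131, size=(n_rows * n_cols, 3))
rows, cols = np.indices((n_rows, n_cols))
features = np.column_stack([colors, rows.ravel(), cols.ravel()])

# Per-feature variance and its share of the total variance.
variances = features.var(axis=0)
share = variances / variances.sum()

for name, s in zip(["R", "G", "B", "row", "col"], share):
    print(f"{name}: {s:.1%} of total variance")

position_share = share[3:].sum()
print(f"position features together: {position_share:.1%}")
```

Under these assumptions almost all of the variance comes from the two position features, so a clustering based on raw distances mostly partitions the image by location.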

### Let's rescale the data so that the pixel coordinates and colors have a similar impact on the final result

### What is **Standard Scaler**?

It transforms each feature to have mean 0 and standard deviation 1:

$$
x_{\text{scaled}} = \frac{x - \mu}{\sigma}
$$

- `μ` = mean of the feature  
- `σ` = standard deviation of the feature
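
As a quick check of the formula, the sketch below standardizes a small made-up matrix by hand and compares the result with `StandardScaler`. The data and variable names are illustrative only. Note that `StandardScaler` uses the population standard deviation (`ddof=0`), which is also NumPy's default for `std`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: three features on very different scales.
X = np.array([
    [10.0, 200.0, 1.0],
    [20.0, 400.0, 3.0],
    [30.0, 900.0, 5.0],
    [40.0, 500.0, 7.0],
])

# Manual standardization: subtract the column mean, divide by the column std.
mu = X.mean(axis=0)
sigma = X.std(axis=0)  # population standard deviation (ddof=0)
X_manual = (X - mu) / sigma

X_sklearn = StandardScaler().fit_transform(X)

print(np.allclose(X_manual, X_sklearn))
print(X_sklearn.mean(axis=0).round(6), X_sklearn.std(axis=0).round(6))
```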


### Why Use It?

Features measured on different scales (e.g., age: 0–100, income: 0–1M) end up on the same scale after standardization, so no single feature dominates distance-based algorithms such as K-means.


In [None]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(img_2d)
img_2d_standard_scaler = scaler.transform(img_2d)

### Let's compare how the vectors of our pixels look before and after scaling

In [None]:
# before scaling
img_2d[1]

In [None]:
# after scaling
img_2d_standard_scaler[1]

In [None]:
# before scaling
img_2d[100_000]

In [None]:
# after scaling
img_2d_standard_scaler[100_000]

In [None]:
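k = 2
# n_init and random_state are set explicitly so that the result is reproducible
kmeans_standard_scaler = KMeans(n_clusters=k, n_init=10, random_state=0).fit(img_2d_standard_scaler)
# labels_ follows the row-major pixel order of img_2d, so it reshapes back to the image grid
seg_img_standard_scaler = kmeans_standard_scaler.labels_.reshape(dim_x, dim_y)
plt.imshow(seg_img_standard_scaler, cmap=plt.cm.viridis)
plt.show()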
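
The plots above color each pixel by its cluster index. To color each cluster with a representative color instead, one option is to map the cluster centers back to the original scale and use their RGB part as a palette. The following is a minimal sketch on a small synthetic image (the image, `demo_img`, `palette`, and `seg_rgb` are assumptions made for illustration), not the notebook's own method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in image: dark top half, bright bottom half, plus mild noise.
rng = np.random.default_rng(0)
n_rows, n_cols = 20, 30
demo_img = np.full((n_rows, n_cols, 3), 40.0)
demo_img[n_rows // 2:] = 200.0
demo_img += rng.normal(0, 5, size=demo_img.shape)

# 5-D features per pixel: R, G, B, row, column.
rows, cols = np.indices((n_rows, n_cols))
features = np.column_stack([demo_img.reshape(-1, 3), rows.ravel(), cols.ravel()])

scaler = StandardScaler()
features_scaled = scaler.fit_transform(features)

k = 2
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features_scaled)

# Map the cluster centers back to the original units and keep the RGB part.
centers = scaler.inverse_transform(kmeans.cluster_centers_)
palette = np.clip(centers[:, :3], 0, 255).astype(np.uint8)

# Replace every pixel with the representative color of its cluster.
seg_rgb = palette[kmeans.labels_].reshape(n_rows, n_cols, 3)
print(seg_rgb.shape)
print(palette)
```

`seg_rgb` is an ordinary RGB array, so it can be passed to `plt.imshow` like the original image.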

### What happens if we increase the number of clusters in the above picture from 2 to a larger number? Do larger numbers always improve the final result?