 # K-Means from Scratch Experiment



 This notebook demonstrates the use of the K-Means clustering algorithm implemented from scratch.



 We will load the processed penguins dataset, use the elbow method to choose the number of clusters, train the K-Means model, and visualize the clusters if the data is 2D.

In [None]:
import os
import sys

import matplotlib.pyplot as plt
import numpy as np

# Set project root directory and add it to the system path
project_root = os.path.abspath(os.path.join(os.getcwd(), "..", "..", ".."))
sys.path.append(project_root)

from src.scratch.models.K_Means import KMeans
from src.scratch.utils.viz_utils import plot_clusters, plot_elbow_method

# Load the processed data
X = np.load("../../../data/processed/kmeans_penguins_data.npy")

 ## Data Exploration



 The processed penguins dataset contains numeric measurements of individual penguins, which serve as the features for clustering.

In [None]:
# Print data shape
print(f"Data shape: {X.shape}")


 ## Elbow Method



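 We use the elbow method to choose the number of clusters: we plot inertia (the within-cluster sum of squared distances) against k and look for the point where adding more clusters stops reducing inertia substantially.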

In [None]:
# Compute inertias for different k values
k_range = range(1, 11)
model = KMeans(n_init=10, max_iter=300, tol=1e-4, verbose=False)
inertias = model.compute_inertias(X, k_range)

# Plot the elbow curve
plot_elbow_method(k_range, inertias, title="Elbow Method for K-Means", show_fig=True)
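
 To make the quantity on the y-axis concrete, here is a minimal NumPy sketch of how inertia can be computed for a given assignment. The helper `inertia` and the toy arrays are illustrative assumptions and are not taken from the project's `KMeans` class, whose `compute_inertias` method may work differently.

```python
import numpy as np


def inertia(X, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    diffs = X - centroids[labels]  # one row per point: offset from its own centroid
    return float(np.sum(diffs ** 2))


# Toy data (hypothetical): two tight pairs of points, far apart along the x-axis
X_toy = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

# k = 1: every point belongs to a single centroid at the overall mean
labels_k1 = np.zeros(len(X_toy), dtype=int)
centroids_k1 = X_toy.mean(axis=0, keepdims=True)
inertia_k1 = inertia(X_toy, labels_k1, centroids_k1)

# k = 2: one centroid per pair
labels_k2 = np.array([0, 0, 1, 1])
centroids_k2 = np.array([[0.0, 1.0], [10.0, 1.0]])
inertia_k2 = inertia(X_toy, labels_k2, centroids_k2)

print(f"k=1 inertia: {inertia_k1}, k=2 inertia: {inertia_k2}")
```

 On this toy data the inertia falls from 104 at k=1 to 4 at k=2. A sharp drop like this, followed by much smaller gains, is the "elbow" to look for in the plot.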


 ## Model Training



 Based on the elbow plot, we choose k=3 (this may vary depending on the plot).

In [None]:
# Initialize and train the KMeans model with the k chosen from the elbow plot
k = 3
model = KMeans(n_clusters=k, n_init=10, max_iter=300, tol=1e-4, verbose=True)
model.fit(X)

# Predict cluster labels
labels = model.predict(X)
centroids = model.centroids_
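
 For readers who want to see what a fit-and-predict cycle involves, the following is a minimal sketch of the standard Lloyd iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). The function `kmeans_sketch`, its `seed` parameter, and the toy data are assumptions made for illustration. This is not the code in `src.scratch.models.K_Means`, and it omits the `n_init` restarts used above.

```python
import numpy as np


def kmeans_sketch(X, n_clusters, max_iter=300, tol=1e-4, seed=0):
    """Single-run Lloyd iteration; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as distinct data points chosen at random
    start = rng.choice(len(X), size=n_clusters, replace=False)
    centroids = X[start].astype(float)

    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Update step: move each centroid to the mean of its members
        new_centroids = centroids.copy()
        for j in range(n_clusters):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)

        # Stop once the centroids have effectively stopped moving
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift <= tol:
            break

    # Final assignment against the final centroids
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


# Toy data (hypothetical): two well-separated groups along the x-axis
X_demo = np.array(
    [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [100.0, 0.0], [101.0, 0.0], [102.0, 0.0]]
)
demo_centroids, demo_labels = kmeans_sketch(X_demo, n_clusters=2)
print(demo_centroids)
print(demo_labels)
```

 In this sketch the loop stops only when the centroid shift is at most `tol`, so a negative `tol` would never trigger the early exit and every run would take the full `max_iter` iterations.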

 ## Visualization



 If the data is 2D, we visualize the clusters and their centroids.

In [None]:
# Visualize clusters if data is 2D
if X.shape[1] == 2:
    plot_clusters(X, labels, centroids, title="K-Means Clustering (Scratch)")
else:
    print("Data is not 2D, cannot plot clusters directly.")