# K-MEANS CLUSTERING FROM SCRATCH

## INTRODUCTION & OBJECTIVES

IN THIS PROJECT, WE IMPLEMENT THE **K-MEANS CLUSTERING ALGORITHM** ENTIRELY FROM FIRST PRINCIPLES. WE WILL NOT USE `SCIKIT-LEARN` OR ANY BLACK-BOX CLUSTERING UTILITIES. THE GOAL IS TO DEMONSTRATE A DEEP MATHEMATICAL AND ALGORITHMIC UNDERSTANDING OF UNSUPERVISED LEARNING.

### CORE OBJECTIVES
* **MATHEMATICAL RIGOR**: IMPLEMENT EUCLIDEAN, MANHATTAN, AND COSINE DISTANCE METRICS FROM SCRATCH.
* **MODULAR ARCHITECTURE**: DESIGN A ROBUST `KMEANS` CLASS WITH EXPLICIT HYPERPARAMETERS.
* **OPTIMIZATION LOGIC**: IMPLEMENT THE ITERATIVE ASSIGNMENT-UPDATE LOOP (LLOYD'S ALGORITHM, A HARD-ASSIGNMENT ANALOGUE OF EXPECTATION-MAXIMIZATION) MANUALLY.
* **CONVERGENCE ANALYSIS**: VISUALIZE LOSS CURVES AND CENTROID STABILITY.
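
AS A ROUGH PREVIEW OF THE FIRST AND THIRD OBJECTIVES, THE SKETCH BELOW COMPUTES THE THREE DISTANCE METRICS WITH NUMPY BROADCASTING AND RUNS ONE ASSIGNMENT-UPDATE ITERATION. THE NAMES `pairwise_distance` AND `lloyd_step` ARE HYPOTHETICAL AND CHOSEN ONLY FOR ILLUSTRATION; THIS IS NOT NECESSARILY HOW THE FINAL `KMEANS` CLASS IS STRUCTURED. KEEPING AN EMPTY CLUSTER'S CENTROID UNCHANGED IS ALSO AN ASSUMPTION, ONE OF SEVERAL REASONABLE CHOICES.

```python
import numpy as np

def pairwise_distance(X, C, metric="euclidean"):
    """
    RETURN AN (n_samples, n_centroids) MATRIX OF DISTANCES FROM EACH ROW OF X
    TO EACH ROW OF C. HYPOTHETICAL HELPER, FOR ILLUSTRATION ONLY.
    """
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    if metric == "euclidean":
        diff = X[:, None, :] - C[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        diff = X[:, None, :] - C[None, :, :]
        return np.abs(diff).sum(axis=2)
    if metric == "cosine":
        # COSINE DISTANCE = 1 - COSINE SIMILARITY; THE FLOOR GUARDS AGAINST ZERO VECTORS
        x_norm = np.linalg.norm(X, axis=1, keepdims=True)
        c_norm = np.linalg.norm(C, axis=1, keepdims=True)
        similarity = (X @ C.T) / np.maximum(x_norm * c_norm.T, 1e-12)
        return 1.0 - similarity
    raise ValueError(f"unknown metric: {metric}")

def lloyd_step(X, C, metric="euclidean"):
    """
    ONE ITERATION: ASSIGN EACH POINT TO ITS NEAREST CENTROID, THEN MOVE EACH
    CENTROID TO THE MEAN OF ITS MEMBERS.
    """
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    # ASSIGNMENT STEP
    labels = pairwise_distance(X, C, metric).argmin(axis=1)
    # UPDATE STEP (ASSUMPTION: AN EMPTY CLUSTER KEEPS ITS OLD CENTROID)
    new_C = C.copy()
    for k in range(C.shape[0]):
        members = X[labels == k]
        if len(members) > 0:
            new_C[k] = members.mean(axis=0)
    return labels, new_C
```

REPEATING `lloyd_step` UNTIL THE CENTROIDS STOP MOVING (OR A MAXIMUM ITERATION COUNT IS REACHED) GIVES THE FULL LOOP. NOTE THAT THE MEAN UPDATE MINIMIZES THE WITHIN-CLUSTER SUM OF SQUARED EUCLIDEAN DISTANCES; WITH MANHATTAN OR COSINE DISTANCE THE SAME UPDATE IS A HEURISTIC.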

### RESTRICTIONS
* **NO SCIKIT-LEARN**.
* **ONLY NUMPY, PANDAS, MATPLOTLIB, SEABORN**.
* **MANUAL SCALING AND CALCULATIONS**: FEATURE SCALING, DISTANCES, AND CENTROID UPDATES ARE ALL COMPUTED BY HAND.
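
TO ILLUSTRATE WHAT MANUAL SCALING COULD LOOK LIKE WITHOUT `scikit-learn`, HERE IS A MINIMAL Z-SCORE STANDARDIZATION SKETCH. THE FUNCTION NAME `standardize` IS HYPOTHETICAL, AND Z-SCORING IS AN ASSUMPTION: THE PROJECT MAY USE MIN-MAX OR ANOTHER SCHEME INSTEAD.

```python
import numpy as np

def standardize(X):
    """
    SCALE EACH COLUMN TO ZERO MEAN AND UNIT VARIANCE.
    RETURNS THE SCALED DATA PLUS THE MEAN AND STD USED, SO THE SAME
    TRANSFORM CAN BE APPLIED TO NEW POINTS.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # CONSTANT COLUMNS HAVE STD 0; DIVIDE BY 1 INSTEAD SO THEY BECOME ALL ZEROS
    std_safe = np.where(std == 0, 1.0, std)
    return (X - mean) / std_safe, mean, std_safe
```

SCALING MATTERS FOR K-MEANS BECAUSE DISTANCE-BASED ASSIGNMENT IS DOMINATED BY WHICHEVER FEATURE HAS THE LARGEST NUMERIC RANGE.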

## IMPORTS & SETUP

### LIBRARIES

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

### VISUALIZATION CONFIGURATION

In [3]:
# SETTING GLOBAL PARAMS TO ENSURE ALL PLOTS FOLLOW THE REQUIRED STYLE
plt.rcParams['figure.figsize'] = (8, 6)
plt.rcParams['figure.dpi'] = 500
plt.rcParams['axes.grid'] = True
plt.rcParams['grid.linestyle'] = '--'
plt.rcParams['grid.alpha'] = 0.7
plt.rcParams['font.weight'] = 'bold'
plt.rcParams['axes.labelweight'] = 'bold'
plt.rcParams['axes.titleweight'] = 'bold'
plt.rcParams['xtick.labelsize'] = 14
plt.rcParams['ytick.labelsize'] = 14
plt.rcParams['axes.labelsize'] = 16
plt.rcParams['axes.titlesize'] = 16
plt.rcParams['legend.fontsize'] = 16

def enforce_bold_ticks(ax):
    """
    SET EVERY X- AND Y-AXIS TICK LABEL ON THE GIVEN AXES `ax` TO BOLD.
    CALL THIS AFTER PLOTTING, ONCE THE TICK LABELS EXIST.
    """
    for label in ax.get_xticklabels() + ax.get_yticklabels():
        label.set_fontweight('bold')

print("LIBRARIES LOADED. VISUALIZATION STYLE CONFIGURED.")

LIBRARIES LOADED. VISUALIZATION STYLE CONFIGURED.
