# 🛍️ Mall Customers – K-Means Clustering
### 🧩 Problem Definition
##### We want to segment mall customers into distinct groups based on their income and spending patterns using K-Means clustering.
##### The resulting segments help identify customer types, such as high-income low-spending customers and average customers, and point to target segments for marketing.

### 🎯 Objective
##### Use unsupervised learning (K-Means) to:
##### 1- Load and inspect the Mall_Customers.csv dataset.
##### 2- Visualize relationships between customer attributes.
##### 3- Use the Elbow Method to find the optimal number of clusters.
##### 4- Train a K-Means model and label each cluster.
##### 5- Visualize the clusters in 2D.
##### 6- Interpret and save the results.

### 📁 Dataset
##### - File: Mall_Customers.csv
##### - Source: Kaggle – Mall Customer Segmentation Data
##### - Columns:
#####     - CustomerID
#####     - Gender
#####     - Age
#####     - Annual Income (k$)
#####     - Spending Score (1-100)


### Import libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

### Load data

In [None]:
df = pd.read_csv("./data/Mall_Customers.csv")
df.head()

### Explore data

In [None]:
df.info()
df.describe()

### Select features

In [None]:
X = df[["Annual Income (k$)", "Spending Score (1-100)"]]
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
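
To make the scaling step concrete, the sketch below reproduces what `StandardScaler` computes on a tiny made-up array. The values in `toy` are illustrative assumptions, not rows from the dataset. Each column has its mean subtracted and is then divided by its population standard deviation (`ddof=0`), so both features contribute on a comparable scale to the distances K-Means uses.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy values standing in for income (k$) and spending score.
toy = np.array([[15.0, 39.0],
                [40.0, 60.0],
                [65.0, 81.0]])

# Manual standardization: z = (x - column mean) / column std (population std, ddof=0).
manual = (toy - toy.mean(axis=0)) / toy.std(axis=0)

# StandardScaler applies the same transformation column by column.
scaled = StandardScaler().fit_transform(toy)

print(np.allclose(manual, scaled))
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```

After scaling, each column has mean 0 and standard deviation 1, which is why the cluster plot axes are labeled as scaled values.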

### Elbow Method

In [None]:
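inertia = []  # within-cluster sum of squared distances for each k
K = range(1, 11)
for k in K:
    # n_init is set explicitly so results do not depend on the scikit-learn version's default
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
    kmeans.fit(X_scaled)
    inertia.append(kmeans.inertia_)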

plt.plot(K, inertia, 'bo-')
plt.title("Elbow Method for Optimal k")
plt.xlabel("Number of clusters")
plt.ylabel("Inertia")
plt.show()
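
The quantity plotted in the elbow curve, inertia, is the sum of squared distances from each point to its assigned centroid. The sketch below recomputes it by hand on synthetic data and compares it with `inertia_`. The blob centers, noise scale, and variable names are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2D data: three well-separated blobs (illustrative only).
rng = np.random.default_rng(42)
centers_true = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers_true])

model = KMeans(n_clusters=3, n_init=10, random_state=42).fit(points)

# Inertia = sum of squared distances from each point to its assigned centroid.
assigned = model.cluster_centers_[model.labels_]
manual_inertia = float(((points - assigned) ** 2).sum())

print(manual_inertia, model.inertia_)
```

Because inertia keeps falling as k grows, the elbow is read as the point where adding another cluster stops reducing it by much, not as the minimum of the curve.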

### K-Means with optimal k (e.g. k=5)

In [None]:
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
df["Cluster"] = kmeans.fit_predict(X_scaled)
df.head()
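
One possible way to interpret the clusters is to map the centroids back to the original units and look at cluster sizes. The sketch below does this on a synthetic stand-in for the two features, so the numbers it prints say nothing about the real dataset. The names `demo`, `demo_scaler`, and `demo_kmeans` are hypothetical and chosen to avoid overwriting the notebook's own variables.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

cols = ["Annual Income (k$)", "Spending Score (1-100)"]

# Synthetic stand-in for the two features (illustrative values only).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    cols[0]: rng.uniform(15, 140, size=200),
    cols[1]: rng.uniform(1, 100, size=200),
})

demo_scaler = StandardScaler()
demo_scaled = demo_scaler.fit_transform(demo[cols])

demo_kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
demo["Cluster"] = demo_kmeans.fit_predict(demo_scaled)

# Map centroids from scaled space back to k$ and spending-score units.
centers = pd.DataFrame(
    demo_scaler.inverse_transform(demo_kmeans.cluster_centers_),
    columns=cols,
)
# Number of customers assigned to each cluster, ordered by cluster label.
centers["Size"] = demo["Cluster"].value_counts().sort_index().values

print(centers.round(1))
```

Applied to the notebook's own `scaler` and `kmeans`, the same pattern would give one row per cluster that can be read directly, for example a centroid with high income and a low spending score.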

### Visualize clusters

In [None]:
plt.figure(figsize=(8,6))
plt.scatter(X_scaled[:,0], X_scaled[:,1], c=df["Cluster"], cmap="viridis")
plt.title("Customer Clusters")
plt.xlabel("Annual Income (scaled)")
plt.ylabel("Spending Score (scaled)")
plt.show()

### Export results

In [None]:
df.to_csv("./data/mall_customers_clustered.csv", index=False)
print("✅ Clustered data saved as mall_customers_clustered.csv")