In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
data=pd.read_csv("sales_data_sample.csv",encoding="unicode_escape")
data.head()

In [None]:
data.shape

In [None]:
data.isnull().sum()

In [None]:
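# Drop every row that has a missing value in any column,
# not only in the three columns used for clustering below.
data.dropna(inplace=True)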

In [None]:
data.isnull().sum()

In [None]:
data.info()

In [None]:
X=data[['SALES','QUANTITYORDERED','PRICEEACH']]
X

In [None]:
from sklearn.preprocessing import StandardScaler

sc=StandardScaler()

X_scaled=sc.fit_transform(X)
X_scaled

## The Elbow Method in K-Means Clustering

The **Elbow Method** is a technique used to determine the optimal number of clusters in *K-Means Clustering*, an unsupervised learning algorithm often used for data segmentation.

### How the Elbow Method Works

1. **Choose a range of values for $ k $**: Run the K-Means clustering algorithm multiple times, each time with a different number of clusters $ k $, usually starting from $ k=1 $ and going up to a larger number (such as 10 or 20).

2. **Calculate the Within-Cluster Sum of Squares (WCSS)** for each value of $ k $: The WCSS measures how spread out the points are inside their clusters. For each cluster, it adds up the squared distances between every point and the cluster's centroid, and the results are then summed over all clusters. Lower WCSS values indicate that points are closer to their cluster centroids, which is generally desirable.

 
   $$ \text{WCSS} = \sum_{i=1}^{k} \sum_{x \in C_i} \| x - \mu_i \|^2 $$
   

   where:
   - $ k $ is the number of clusters,
   - $ C_i $ represents each cluster,
   - $ x $ is a data point in cluster $ C_i $, and
   - $ \mu_i $ is the centroid of cluster $ C_i $.

3. **Plot $ k $ vs. WCSS**: As $ k $ increases, WCSS typically decreases (as more clusters can better "fit" the data points within each cluster). However, after a certain point, the marginal decrease in WCSS becomes minimal.

4. **Identify the "Elbow" Point**: Look for the point on the plot where the rate of decrease slows sharply, creating an "elbow" shape. The value of $ k $ at this "elbow" is taken as a good choice for the number of clusters, since increasing $ k $ beyond it yields diminishing returns in cluster compactness.
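
To make the WCSS formula concrete, here is a minimal sketch that evaluates it by hand. The four points and the cluster assignments are made up for illustration and are not produced by K-Means; the helper `wcss` is a hypothetical name, not part of any library.

```python
import numpy as np

def wcss(points, labels):
    """Within-cluster sum of squares for a given assignment of points to clusters."""
    total = 0.0
    for cluster_id in np.unique(labels):
        members = points[labels == cluster_id]
        centroid = members.mean(axis=0)              # mu_i: mean of the cluster's points
        total += ((members - centroid) ** 2).sum()   # sum of ||x - mu_i||^2 over the cluster
    return float(total)

# Four illustrative points forming two obvious groups (made-up data).
points = np.array([[1.0, 0.0], [3.0, 0.0], [10.0, 0.0], [12.0, 0.0]])

wcss_k1 = wcss(points, np.array([0, 0, 0, 0]))  # one cluster, centroid (6.5, 0)
wcss_k2 = wcss(points, np.array([0, 0, 1, 1]))  # two clusters, centroids (2, 0) and (11, 0)

print(wcss_k1)  # 85.0
print(wcss_k2)  # 4.0
```

Going from one cluster to two drops the WCSS from 85 to 4, which is the kind of steep decrease the plot shows before the elbow.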

### Why the Elbow Point is Optimal

The elbow point indicates a balance between two competing factors:
   - **Minimizing WCSS**: Having well-defined clusters where points are close to their centroids.
   - **Avoiding Overfitting**: Not using so many clusters that natural groupings in the data get split into arbitrary sub-clusters.

Choosing $ k $ at the elbow helps achieve meaningful clustering while maintaining model simplicity.
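
The elbow is usually read off the plot by eye, but a rough numeric check can help. The sketch below is one simple heuristic (not the only one, and not something the K-Means algorithm itself provides): it picks the $ k $ where the second difference of the WCSS curve is largest, meaning the drop before that $ k $ is much bigger than the drop after it. The function name and the WCSS values are hypothetical.

```python
import numpy as np

def elbow_by_second_difference(k_values, wcss_values):
    """Return the k at which the WCSS curve bends most sharply (crude heuristic)."""
    k_values = np.asarray(k_values)
    w = np.asarray(wcss_values, dtype=float)
    # For each interior k: (drop before k) - (drop after k) = w[i-1] - 2*w[i] + w[i+1]
    second_diff = w[:-2] - 2 * w[1:-1] + w[2:]
    return int(k_values[1:-1][np.argmax(second_diff)])

# Made-up WCSS values for k = 1..6, for illustration only.
example_k = [1, 2, 3, 4, 5, 6]
example_wcss = [1000.0, 600.0, 250.0, 200.0, 170.0, 150.0]

print(elbow_by_second_difference(example_k, example_wcss))  # 3
```

On real data the curve is often smoother than this, so the result is best treated as a suggestion to confirm against the plot.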

### Inertia in K-Means Clustering

In the context of K-Means clustering and the Elbow Method, **inertia** is another term for the *Within-Cluster Sum of Squares (WCSS)*. It measures the sum of squared distances between each data point and the centroid of the cluster to which it has been assigned, so lower inertia means tighter clusters. In scikit-learn, it is available as the `inertia_` attribute of a fitted `KMeans` model.
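
As a quick sanity check of this equivalence, the sketch below fits `KMeans` on synthetic data and compares `inertia_` with a WCSS computed directly from the fitted labels and centroids. The data comes from `make_blobs` and is an assumption for illustration; it is not the sales dataset used in this notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic 2-D data with three groups (illustrative only).
X_demo, _ = make_blobs(n_samples=200, centers=3, random_state=0)

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_demo)

# Look up each point's assigned centroid, then sum the squared distances.
assigned_centroids = model.cluster_centers_[model.labels_]
manual_wcss = ((X_demo - assigned_centroids) ** 2).sum()

# The two values should agree up to floating-point error.
print(manual_wcss, model.inertia_)
```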

In [None]:
from sklearn.cluster import KMeans

In [None]:
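# Elbow method: fit K-Means for k = 1..10 and record the inertia (WCSS) of each fit.
# Lower inertia means points lie closer to their assigned centroids.

inertia=[]
K=range(1,11)
for k in K:
    # A fixed random_state makes the curve reproducible;
    # n_init=10 keeps the best of 10 random initializations.
    kmeans=KMeans(n_clusters=k,n_init=10,random_state=42)
    kmeans.fit(X_scaled)
    inertia.append(kmeans.inertia_)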

In [None]:
plt.figure(figsize=(12,8))
plt.plot(K,inertia,'bo-')
plt.xlabel("Number of clusters (k)")
plt.ylabel("Inertia (WCSS)")
plt.title("Elbow Method for K-Means")
plt.show()

In [None]:
data.columns

In [None]:
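# k chosen by inspecting the elbow plot above
optimal_k=4
# Fixed random_state so the cluster labels are reproducible across runs
kmeans=KMeans(n_clusters=optimal_k,n_init=10,random_state=42)
data['clusters']=kmeans.fit_predict(X_scaled)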

data[['SALES','QUANTITYORDERED','PRICEEACH','clusters']]

In [None]:
data['clusters'].values