# Mall Customer Segmentation Using K-Means Clustering

## Project Overview

This project segments mall customers into distinct groups with the K-means clustering algorithm, using features such as annual income and spending score. The goal is to identify customer segments that can be targeted with discount strategies. The elbow method is used to choose the number of clusters.

## Dataset

The dataset `Mall_Customers.csv` contains the following columns:
- `CustomerID`: Unique ID for each customer
- `Gender`: Gender of the customer
- `Age`: Age of the customer
- `Annual Income (k$)`: Annual income of the customer, in thousands of dollars
- `Spending Score (1-100)`: Score from 1 to 100 assigned by the mall based on customer behavior and spending habits
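
As a minimal sketch of how the columns listed above might be checked on load, the snippet below reads a CSV and fails early if a column is missing. The helper `load_customers`, the constant `EXPECTED_COLUMNS`, and the three sample rows are hypothetical and invented for illustration; they are not taken from the project's notebook or from the real dataset.

```python
import io

import pandas as pd

# Hypothetical sample rows; the values are invented for illustration only.
SAMPLE_CSV = """CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)
1,Male,21,18,40
2,Female,35,60,75
3,Female,48,95,12
"""

# Column names as listed in the Dataset section.
EXPECTED_COLUMNS = [
    "CustomerID",
    "Gender",
    "Age",
    "Annual Income (k$)",
    "Spending Score (1-100)",
]


def load_customers(source):
    """Read the CSV and raise ValueError if an expected column is missing."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df


# In the project this would presumably be load_customers("Mall_Customers.csv").
customers = load_customers(io.StringIO(SAMPLE_CSV))
print(customers.dtypes)
```

Checking the schema up front keeps later steps from failing with a less obvious `KeyError` if the file has been renamed or edited.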

## Analysis

The analysis was conducted using the following steps:

1. **Data Preprocessing**:
   - Load the dataset and perform basic cleaning and preprocessing.
   - Handle missing values and encode categorical variables if necessary.

2. **Feature Selection**:
   - Select relevant features for clustering, such as `Annual Income (k$)` and `Spending Score (1-100)`.

3. **Elbow Method**:
   - Use the elbow method to determine the optimal number of clusters.
   - Plot the within-cluster sum of squares (WCSS) against the number of clusters to find the "elbow point".

4. **K-Means Clustering**:
   - Apply the K-means clustering algorithm with the optimal number of clusters.
   - Assign cluster labels to each customer.

5. **Cluster Analysis**:
   - Analyze the characteristics of each cluster.
   - Visualize the clusters using scatter plots.
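
The preprocessing and feature selection steps above could look roughly like the sketch below. The sample rows, the median fill for missing values, and the 0/1 encoding of `Gender` are assumptions chosen for illustration; the project's notebook may handle these differently.

```python
import numpy as np
import pandas as pd

# Hypothetical rows, invented for illustration; one income value is missing.
df = pd.DataFrame(
    {
        "CustomerID": [1, 2, 3, 4],
        "Gender": ["Male", "Female", "Female", "Male"],
        "Age": [21, 35, 48, 29],
        "Annual Income (k$)": [18.0, 60.0, np.nan, 77.0],
        "Spending Score (1-100)": [40, 75, 12, 88],
    }
)

# Step 1a: drop exact duplicate rows and renumber the index.
df = df.drop_duplicates().reset_index(drop=True)

# Step 1b: fill missing numeric values with each column's median (an assumed strategy).
numeric_cols = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# Step 1c: encode the categorical Gender column as integers (assumed mapping).
df["Gender"] = df["Gender"].map({"Male": 0, "Female": 1})

# Step 2: keep only the two features named in the Feature Selection step.
feature_cols = ["Annual Income (k$)", "Spending Score (1-100)"]
X = df[feature_cols].to_numpy()
print(X.shape)
```

`CustomerID` is left out of `X` because it is an identifier rather than a behavioral feature, so it should not influence distances between customers.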

## Results

The key findings from the analysis are as follows:
- The optimal number of clusters was determined using the elbow method.
- The K-means algorithm successfully segmented customers into distinct clusters.
- Each cluster represents a group of customers with similar features.
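
One way to inspect what each cluster represents is to summarize the features per cluster label. The sketch below does this on synthetic data with two artificial groups; the data, the choice of two clusters, and the `Cluster` and `Customers` column names are assumptions for illustration and say nothing about the project's actual results.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

feature_cols = ["Annual Income (k$)", "Spending Score (1-100)"]

# Synthetic data: two well-separated groups of 20 points each (invented values).
rng = np.random.default_rng(0)
low = rng.normal(loc=[25, 20], scale=3, size=(20, 2))
high = rng.normal(loc=[85, 80], scale=3, size=(20, 2))
df = pd.DataFrame(np.vstack([low, high]), columns=feature_cols)

# Fit K-means and attach the cluster label to every row.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
df["Cluster"] = kmeans.fit_predict(df[feature_cols])

# Per-cluster profile: mean of each feature plus the number of customers.
profile = df.groupby("Cluster")[feature_cols].mean()
profile["Customers"] = df.groupby("Cluster").size()
print(profile.round(1))
```

On the real dataset, the same `groupby` summary would give a compact description of each segment, such as its average income and spending score.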

## Visualizations

![Elbow Method](attachment:image.png)
*Figure 1: Elbow Method for Determining Optimal Number of Clusters*


![Clusters](attachment:image-2.png)
*Figure 2: Clusters Based on Selected Features*

## How to Use

1. **Dependencies**: Ensure you have the following Python libraries installed:
   - pandas
   - numpy
   - matplotlib
   - seaborn
   - scikit-learn

2. **Running the Analysis**:
   - Load the dataset using pandas (imported as `pd`): `df = pd.read_csv('Mall_Customers.csv')`
   - Follow the steps in the provided Jupyter Notebook or Python script to perform the analysis.

3. **Elbow Method**:
   - Fit K-means for a range of cluster counts and plot the WCSS against the number of clusters to find the elbow point.

4. **K-Means Clustering**:
   - Apply the K-means clustering algorithm with the number of clusters chosen from the elbow plot.

5. **Visualize Clusters**:
   - Visualize the clusters using scatter plots.
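
The elbow, clustering, and visualization steps above can be sketched as follows. This is not the project's notebook code: the data is synthetic (five invented groups standing in for income and spending score), and `chosen_k = 5` is an assumption that follows from how the synthetic data was built. On the real dataset the value should be read off the elbow plot.

```python
import matplotlib

# Non-interactive backend so the sketch also runs without a display;
# this line can be dropped inside a Jupyter notebook.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the two selected features (invented values).
rng = np.random.default_rng(42)
centers = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [88, 82]])
X = np.vstack([rng.normal(loc=c, scale=4, size=(40, 2)) for c in centers])

# Elbow method: fit K-means for k = 1..10 and record the WCSS (inertia_).
k_values = list(range(1, 11))
wcss = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(X)
    wcss.append(model.inertia_)

fig_elbow, ax_elbow = plt.subplots()
ax_elbow.plot(k_values, wcss, marker="o")
ax_elbow.set_xlabel("Number of clusters (k)")
ax_elbow.set_ylabel("WCSS")
ax_elbow.set_title("Elbow Method")

# K-means with the chosen k (assumed here; read it off the elbow plot in practice).
chosen_k = 5
kmeans = KMeans(n_clusters=chosen_k, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# Scatter plot of the clusters, with centroids marked.
fig_clusters, ax_clusters = plt.subplots()
ax_clusters.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=20)
ax_clusters.scatter(
    kmeans.cluster_centers_[:, 0],
    kmeans.cluster_centers_[:, 1],
    c="red",
    marker="X",
    s=120,
    label="Centroids",
)
ax_clusters.set_xlabel("Annual Income (k$)")
ax_clusters.set_ylabel("Spending Score (1-100)")
ax_clusters.set_title("Customer Clusters")
ax_clusters.legend()
```

Fixing `random_state` makes the centroid initialization reproducible, so the cluster labels and plots stay the same between runs.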
     

## Conclusion

This project demonstrates the use of the K-means clustering algorithm and the elbow method to segment mall customers based on their spending score and annual income. The resulting segments can inform targeted discount strategies aimed at improving customer satisfaction and business profitability.

## Author

- SAURABH

