# Customer Segmentation Analysis with Pandas

Your marketing team has tasked you with understanding customer behaviors based on their purchasing patterns over time. The goal is to segment customers into distinct groups to personalize marketing efforts effectively.

## Scenario

Your boss has noticed that the company's marketing campaigns are not as effective as they could be. To address this, the marketing team wants to understand different customer behaviors and segment customers into groups such as loyal customers, one-time buyers, and seasonal shoppers. This will help in tailoring marketing strategies to each group.

You have been provided with a dataset containing the purchase history of 5 customers over a period of time. Your task is to perform customer segmentation using K-Means Clustering.

## Step 1: Load the Dataset

Let's start by loading the dataset and taking a quick look at the first few rows to understand its structure.
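As a minimal sketch of this step, the snippet below reads the data with `pd.read_csv` and prints the first rows. The real file name is not given here, so the inline `SAMPLE_CSV` text and the `load_customers` helper are hypothetical stand-ins, and the values in the sample rows are made up purely for illustration.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the real file; the values are made up.
SAMPLE_CSV = """Customer ID,Purchase Frequency,Total Amount Spent,Last Purchase Date
1,12,1500.50,2024-03-15
2,1,45.00,2023-11-02
3,5,620.75,2024-01-20
"""


def load_customers(source):
    """Read the purchase history from a file path or a file-like object."""
    # parse_dates converts the date column to datetime64 while loading.
    return pd.read_csv(source, parse_dates=["Last Purchase Date"])


df = load_customers(io.StringIO(SAMPLE_CSV))

# Quick look at the first few rows.
print(df.head())
```

With the actual dataset, you would pass its path to `load_customers` instead of the in-memory sample.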

The dataset has the following columns:

- **Customer ID**: Unique identifier for each customer.  
- **Purchase Frequency**: Number of times the customer has made a purchase.  
- **Total Amount Spent**: Total monetary value spent by the customer.  
- **Last Purchase Date**: The most recent date the customer made a purchase. 

## Step 2: Data Exploration

Before diving into clustering, it's important to explore the dataset to understand the distribution of data. Let's check for any missing values and get a summary of the dataset.
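A quick way to do this with pandas might look like the sketch below. The small DataFrame is a made-up stand-in for the real data, with two missing values added on purpose so the checks have something to report.

```python
import pandas as pd

# Made-up stand-in rows; two values are deliberately missing.
df = pd.DataFrame({
    "Customer ID": [1, 2, 3, 4],
    "Purchase Frequency": [12, 1, 5, 3],
    "Total Amount Spent": [1500.5, 45.0, None, 300.0],
    "Last Purchase Date": pd.to_datetime(
        ["2024-03-15", "2023-11-02", "2024-01-20", None]
    ),
})

# Column types and non-null counts.
df.info()

# Number of missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Summary statistics for the numeric features.
summary = df[["Purchase Frequency", "Total Amount Spent"]].describe()
print(summary)
```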

## Step 3: Data Preprocessing

For clustering, we need to preprocess the data. This includes handling missing values and converting date information into a numerical format that can be used in clustering. Normalization follows in Step 4.
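One possible version of this step is sketched below. It rests on a few assumptions that are not part of the dataset description: rows with missing values are dropped rather than imputed, the date is turned into a recency feature measured from the most recent purchase in the data, and the new column is named `Days Since Last Purchase`. The rows themselves are made up.

```python
import pandas as pd

# Made-up stand-in rows; the last one has no purchase date.
df = pd.DataFrame({
    "Customer ID": [1, 2, 3, 4],
    "Purchase Frequency": [12, 1, 5, 3],
    "Total Amount Spent": [1500.5, 45.0, 620.75, 300.0],
    "Last Purchase Date": ["2024-03-15", "2023-11-02", "2024-01-20", None],
})

# Parse the dates; anything that cannot be parsed becomes NaT.
df["Last Purchase Date"] = pd.to_datetime(df["Last Purchase Date"], errors="coerce")

# Assumption: drop incomplete rows (imputation would be an alternative).
required = ["Purchase Frequency", "Total Amount Spent", "Last Purchase Date"]
df = df.dropna(subset=required).copy()

# Assumption: measure recency from the latest purchase date in the data.
reference_date = df["Last Purchase Date"].max()
df["Days Since Last Purchase"] = (reference_date - df["Last Purchase Date"]).dt.days

print(df[["Customer ID", "Days Since Last Purchase"]])
```

Using a fixed reference date (for example, the date of the analysis) instead of the maximum in the data would shift every recency value by the same amount and leave the clustering unchanged after scaling.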

## Step 4: Data Normalization

To ensure that each feature contributes equally to the distance calculations in K-Means, we need to normalize the data.
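A minimal sketch using scikit-learn's `StandardScaler`, which rescales each feature to zero mean and unit variance. This is only one choice of scaler (`MinMaxScaler` would be another), and the feature values and the `Days Since Last Purchase` column are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up feature values; "Days Since Last Purchase" is an assumed recency column.
features = pd.DataFrame({
    "Purchase Frequency": [12, 1, 5, 3],
    "Total Amount Spent": [1500.5, 45.0, 620.75, 300.0],
    "Days Since Last Purchase": [0, 134, 55, 90],
})

# Fit the scaler and transform the features in one step.
scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# Wrap the result in a DataFrame to keep the column names.
scaled_df = pd.DataFrame(scaled, columns=features.columns, index=features.index)
print(scaled_df.round(2))
```

Without scaling, `Total Amount Spent` would dominate the Euclidean distances simply because its values are numerically larger.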

## Step 5: K-Means Clustering

Now that our data is preprocessed and normalized, we can apply K-Means Clustering to segment the customers. We'll start by determining the optimal number of clusters using the Elbow Method.
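The Elbow Method can be sketched as fitting K-Means for a range of K values and recording the inertia of each fit. In the snippet below, synthetic data from `make_blobs` stands in for the scaled customer features, and the range of K and the `random_state` are arbitrary choices made for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the scaled features (not the real customer data).
X, _ = make_blobs(n_samples=200, centers=4, n_features=3, random_state=42)

k_values = range(1, 9)
inertias = []

for k in k_values:
    # n_init is set explicitly so the result does not depend on the
    # library version's default.
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(X)
    # inertia_ is the within-cluster sum of squared distances.
    inertias.append(model.inertia_)

for k, inertia in zip(k_values, inertias):
    print(f"K={k}: inertia={inertia:.1f}")
```

Plotting `inertias` against `k_values` as a line chart gives the elbow curve discussed next.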

#### **Finding the "Elbow" Point**
- Inertia (the within-cluster sum of squared distances) drops sharply over the first few values of K.  
- The **elbow** is where the decrease slows down, forming a bend.  

#### **Optimal Number of Clusters**
- The best choice is at the **elbow**, where adding more clusters **barely reduces inertia**.  
- In this plot, **K = 3 or 4** seems optimal.  

#### **Why Stop at the Elbow?**
- Before the elbow: More clusters **significantly** reduce inertia.  
- After the elbow: Diminishing returns, risk of **overfitting**.  

Thus, **3 or 4 clusters** are the best choice for segmentation.  

## Step 6: Apply K-Means

Based on the Elbow Method, we choose `4` as the number of clusters and apply K-Means to segment the customers.
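A sketch of this step is shown below, again on synthetic stand-in data rather than the real customers. The feature column names and the `random_state` are assumptions; the labels are stored in a `cluster` column.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the scaled features (not the real customer data).
X, _ = make_blobs(n_samples=200, centers=4, n_features=3, random_state=42)
feature_cols = ["Purchase Frequency", "Total Amount Spent", "Days Since Last Purchase"]
df = pd.DataFrame(X, columns=feature_cols)

# Fit K-Means with 4 clusters and assign each row its cluster label.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
df["cluster"] = kmeans.fit_predict(df[feature_cols])

# Number of rows assigned to each cluster.
print(df["cluster"].value_counts().sort_index())
```

Note that the numeric labels (0 to 3) are arbitrary: a different `random_state` can assign the same groups different numbers.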

## Step 7: Visualize the Clusters

To better understand the customer segments, let's visualize the clusters using a scatter plot.

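First, let's convert the `cluster` column to `str` so that the plot treats the cluster labels as discrete categories rather than a continuous numeric scale.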
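In pandas this is a single `astype` call. The tiny DataFrame here is a made-up stand-in for the clustered data.

```python
import pandas as pd

# Made-up stand-in for the clustered data.
df = pd.DataFrame({
    "Customer ID": [1, 2, 3, 4],
    "cluster": [0, 1, 2, 3],
})

# Convert the integer labels to strings.
df["cluster"] = df["cluster"].astype(str)

print(df["cluster"].tolist())
```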

Let's use `plotly` to visualize our customer segments.

### **Interpreting Customer Segments**  

1. **Cluster 0 (Green) - One-Time Buyers**  
   - Low purchase frequency, low spending.  
   - **Strategy:** Re-engagement campaigns, discount offers.  

2. **Cluster 1 (Blue) - Loyal Customers**  
   - High purchase frequency, high spending.  
   - **Strategy:** Loyalty programs, exclusive discounts.  

3. **Cluster 2 (Red) - Moderate Buyers**  
   - Medium frequency, moderate spending.  
   - **Strategy:** Personalized recommendations, limited-time offers.  

4. **Cluster 3 (Purple) - Seasonal Shoppers**  
   - Infrequent purchases, high spending.  
   - **Strategy:** Seasonal promotions, event-based campaigns.  
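One way to check labels like these is to compare the average feature values per cluster. The sketch below does this with `groupby`. The numbers are made up to mirror the four descriptions above and are not taken from the real dataset.

```python
import pandas as pd

# Made-up rows chosen to mirror the four segment descriptions.
df = pd.DataFrame({
    "Purchase Frequency": [1, 2, 20, 22, 8, 10, 2, 4],
    "Total Amount Spent": [40.0, 60.0, 2000.0, 2400.0, 600.0, 800.0, 1800.0, 2200.0],
    "cluster": [0, 0, 1, 1, 2, 2, 3, 3],
})

# Average frequency and spending for each cluster.
profile = df.groupby("cluster")[["Purchase Frequency", "Total Amount Spent"]].mean()
print(profile)
```

A cluster with low averages on both features would match the one-time buyers profile, while low frequency combined with high spending would match the seasonal shoppers profile.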

### **Business Insights**  
- **Nurture loyal customers** with rewards.  
- **Convert one-time buyers** into repeat customers.  
- **Target seasonal shoppers** before peak sales.  
- **Encourage moderate buyers** with incentives.  
