# Customer Segmentation for Marketing Strategy: An Unsupervised Learning Approach

## Introduction

In today's competitive retail environment, understanding customer behavior is crucial for effective marketing strategies. This project applies unsupervised machine learning techniques to segment customers based on their demographics (gender, age, and annual income) and a mall-assigned spending score. By identifying distinct customer groups, businesses can tailor their marketing, product recommendations, and customer service to the needs of each segment.

### Problem Statement

The primary objective of this analysis is to identify natural groupings within a retail company's customer base that may not be immediately apparent through conventional business analysis. Specifically, we aim to:

1. Discover distinct customer segments based on purchasing patterns and demographics
2. Characterize these segments to understand their unique behaviors and preferences
3. Provide actionable insights for targeted marketing campaigns and personalized customer experiences

### Business Value

Effective customer segmentation offers several benefits:
- **Personalized Marketing**: Tailored messaging and offers for different customer groups
- **Resource Optimization**: More efficient allocation of marketing budgets
- **Customer Retention**: Better understanding of at-risk customers and opportunities for engagement
- **Product Development**: Insights into needs of different customer segments for future offerings

### Dataset Description

For this analysis, we will use the "Mall Customer Segmentation Data" available on Kaggle. This dataset contains basic information about customers visiting a shopping mall, including:

- CustomerID: Unique identifier for each customer
- Gender: Male or Female
- Age: Customer's age
- Annual Income (k$): Customer's annual income in thousands of dollars
- Spending Score (1-100): Score assigned by the mall based on customer spending behavior and purchasing data

The dataset contains 200 observations, providing a manageable yet informative basis for customer segmentation.

## Exploratory Data Analysis

### Data Loading and Initial Inspection


In [11]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import silhouette_score
from sklearn.pipeline import Pipeline
from scipy.cluster.hierarchy import dendrogram, linkage
from yellowbrick.cluster import KElbowVisualizer, SilhouetteVisualizer
import warnings
warnings.filterwarnings('ignore')

plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette("viridis")

df = pd.read_csv('Mall_Customers.csv')

print("First 5 rows of the dataset:")
df.head()

```
First 5 rows of the dataset:
   CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
0           1    Male   19                  15                      39
1           2    Male   21                  15                      81
2           3  Female   20                  16                       6
3           4  Female   23                  16                      77
4           5  Female   31                  17                      40
```

In [None]:
print(f"Dataset shape: {df.shape}")

print("\nData Types:")
print(df.dtypes)

print("\nMissing Values:")
print(df.isnull().sum())

print("\nStatistical Summary:")
df.describe()

```
Dataset shape: (200, 5)

Data Types:
CustomerID                 int64
Gender                    object
Age                        int64
Annual Income (k$)         int64
Spending Score (1-100)     int64
dtype: object

Missing Values:
CustomerID               0
Gender                   0
Age                      0
Annual Income (k$)       0
Spending Score (1-100)   0
dtype: int64

Statistical Summary:
```

|       | CustomerID | Age | Annual Income (k$) | Spending Score (1-100) |
|-------|------------|-----|--------------------|-----------------------|
| count | 200.000000 | 200.000000 | 200.000000 | 200.000000 |
| mean  | 100.500000 | 38.850000 | 60.560000 | 50.200000 |
| std   | 57.879185 | 13.969007 | 26.264721 | 25.823522 |
| min   | 1.000000 | 18.000000 | 15.000000 | 1.000000 |
| 25%   | 50.750000 | 28.750000 | 41.500000 | 34.750000 |
| 50%   | 100.500000 | 36.000000 | 61.500000 | 50.000000 |
| 75%   | 150.250000 | 49.000000 | 78.000000 | 73.000000 |
| max   | 200.000000 | 70.000000 | 137.000000 | 99.000000 |

### Understanding the Features

Let's examine each feature in more detail:

In [None]:
plt.figure(figsize=(10, 6))
gender_counts = df['Gender'].value_counts()
sns.barplot(x=gender_counts.index, y=gender_counts.values)
plt.title('Gender Distribution', fontsize=16)
plt.xlabel('Gender', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.grid(axis='y', alpha=0.3)
plt.show()

print(f"Gender distribution: {gender_counts}")

```
Gender distribution: Gender
Female    112
Male       88
Name: count, dtype: int64
```

In [None]:
plt.figure(figsize=(12, 6))
sns.histplot(df['Age'], bins=15, kde=True)
plt.title('Age Distribution', fontsize=16)
plt.xlabel('Age', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.grid(alpha=0.3)
plt.show()

print(f"Age statistics: \n{df['Age'].describe()}")

```
Age statistics: 
count    200.000000
mean      38.850000
std       13.969007
min       18.000000
25%       28.750000
50%       36.000000
75%       49.000000
max       70.000000
Name: Age, dtype: float64
```

In [None]:
plt.figure(figsize=(12, 6))
sns.histplot(df['Annual Income (k$)'], bins=15, kde=True)
plt.title('Annual Income Distribution', fontsize=16)
plt.xlabel('Annual Income (k$)', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.grid(alpha=0.3)
plt.show()

print(f"Annual Income statistics: \n{df['Annual Income (k$)'].describe()}")

```
Annual Income statistics: 
count    200.000000
mean      60.560000
std       26.264721
min       15.000000
25%       41.500000
50%       61.500000
75%       78.000000
max      137.000000
Name: Annual Income (k$), dtype: float64
```

In [None]:
plt.figure(figsize=(12, 6))
sns.histplot(df['Spending Score (1-100)'], bins=15, kde=True)
plt.title('Spending Score Distribution', fontsize=16)
plt.xlabel('Spending Score (1-100)', fontsize=12)
plt.ylabel('Count', fontsize=12)
plt.grid(alpha=0.3)
plt.show()

print(f"Spending Score statistics: \n{df['Spending Score (1-100)'].describe()}")

```
Spending Score statistics: 
count    200.000000
mean      50.200000
std       25.823522
min        1.000000
25%       34.750000
50%       50.000000
75%       73.000000
max       99.000000
Name: Spending Score (1-100), dtype: float64
```

### Exploring Relationships Between Features

In [None]:
plt.figure(figsize=(10, 8))
numeric_df = df.select_dtypes(include=['int64', 'float64'])
numeric_df = numeric_df.drop('CustomerID', axis=1)
corr_matrix = numeric_df.corr()
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', linewidths=0.5, fmt=".2f")
plt.title('Correlation Matrix', fontsize=16)
plt.show()

In [None]:
# pairplot creates its own figure, so no separate plt.figure() call is needed
g = sns.pairplot(df.drop('CustomerID', axis=1), hue='Gender', palette='viridis')
g.figure.suptitle('Pair Plot of Customer Features', y=1.02, fontsize=16)
plt.show()

In [None]:
plt.figure(figsize=(12, 8))
sns.scatterplot(x='Annual Income (k$)', y='Spending Score (1-100)', 
                hue='Gender', size='Age', sizes=(20, 200), data=df, alpha=0.7)
plt.title('Annual Income vs Spending Score', fontsize=16)
plt.xlabel('Annual Income (k$)', fontsize=12)
plt.ylabel('Spending Score (1-100)', fontsize=12)
plt.grid(alpha=0.3)
plt.show()

### Feature Engineering and Data Preparation

Before applying clustering algorithms, we need to prepare our data:

In [None]:
df_model = df.copy()
df_model['Gender'] = df_model['Gender'].map({'Male': 0, 'Female': 1})
df_model.drop('CustomerID', axis=1, inplace=True)

scaler = StandardScaler()
df_scaled = pd.DataFrame(scaler.fit_transform(df_model), columns=df_model.columns)

print("Scaled data preview:")
df_scaled.head()

```
Scaled data preview:
```

|   | Gender | Age | Annual Income (k$) | Spending Score (1-100) |
|---|--------|-----|--------------------|------------------------|
| 0 | -1.128152 | -1.424569 | -1.738999 | -0.434801 |
| 1 | -1.128152 | -1.281035 | -1.738999 | 1.195704 |
| 2 | 0.886405 | -1.352802 | -1.700830 | -1.715913 |
| 3 | 0.886405 | -1.137502 | -1.700830 | 1.040418 |
| 4 | 0.886405 | -0.563369 | -1.662660 | -0.395980 |
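
`StandardScaler` centers each column on its mean and divides by the population standard deviation (`ddof=0`), whereas `df.describe()` reports the sample standard deviation (`ddof=1`). The following is a minimal, dependency-free sketch of that arithmetic using the Age statistics from the statistical summary; the helper name `standardize` is hypothetical and not part of scikit-learn.

```python
import math

def standardize(value, mean, sample_std, n):
    """Z-score a value the way StandardScaler does (population std, ddof=0)."""
    # describe() reports the sample std (ddof=1); convert it to a population std.
    population_std = sample_std * math.sqrt((n - 1) / n)
    return (value - mean) / population_std

# Age statistics taken from the statistical summary (n = 200 customers).
age_mean, age_sample_std, n_customers = 38.85, 13.969007, 200

z_age_19 = standardize(19, age_mean, age_sample_std, n_customers)
print(f"z-score for a 19-year-old customer: {z_age_19:.4f}")
```

Because every feature ends up with zero mean and unit variance, no single column (for example income in k$) dominates the Euclidean distances used by the clustering algorithms.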

## Unsupervised Learning Analysis

We'll apply two clustering approaches to identify natural groupings in our customer data:

1. K-Means Clustering
2. Hierarchical (Agglomerative) Clustering

### Dimensionality Reduction with PCA

First, let's use PCA to visualize our data in a reduced dimensional space:

In [None]:
pca = PCA(n_components=2)
pca_result = pca.fit_transform(df_scaled)
pca_df = pd.DataFrame(data=pca_result, columns=['PC1', 'PC2'])

plt.figure(figsize=(12, 8))
sns.scatterplot(x='PC1', y='PC2', data=pca_df, alpha=0.7)
plt.title('PCA: Customer Data in 2D Space', fontsize=16)
plt.xlabel('Principal Component 1', fontsize=12)
plt.ylabel('Principal Component 2', fontsize=12)
plt.grid(alpha=0.3)
plt.show()

explained_variance = pca.explained_variance_ratio_
print(f"Explained variance by PC1: {explained_variance[0]:.4f} or {explained_variance[0]*100:.2f}%")
print(f"Explained variance by PC2: {explained_variance[1]:.4f} or {explained_variance[1]*100:.2f}%")
print(f"Total explained variance: {sum(explained_variance):.4f} or {sum(explained_variance)*100:.2f}%")

plt.figure(figsize=(10, 6))
loadings = pd.DataFrame(
    data=pca.components_.T * np.sqrt(pca.explained_variance_), 
    columns=['PC1', 'PC2'],
    index=df_scaled.columns
)
sns.heatmap(loadings, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('PCA Feature Loadings', fontsize=16)
plt.show()

```
Explained variance by PC1: 0.4788 or 47.88%
Explained variance by PC2: 0.2727 or 27.27%
Total explained variance: 0.7515 or 75.15%
```
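
The explained variance ratio of a component is its eigenvalue of the covariance matrix divided by the sum of all eigenvalues. As a rough illustration, the sketch below computes this for two features in closed form using only the standard library. The toy data and the helper names are hypothetical; this is not how scikit-learn implements `PCA` internally.

```python
import math

def covariance_2d(xs, ys):
    """Sample covariance entries (var_x, cov_xy, var_y) for two features."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return var_x, cov_xy, var_y

def explained_variance_ratio_2d(xs, ys):
    """Eigenvalues of the 2x2 covariance matrix, each divided by their sum."""
    var_x, cov_xy, var_y = covariance_2d(xs, ys)
    trace = var_x + var_y
    det = var_x * var_y - cov_xy ** 2
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    gap = math.sqrt(max(trace ** 2 / 4 - det, 0.0))
    eig_large, eig_small = trace / 2 + gap, trace / 2 - gap
    return eig_large / trace, eig_small / trace

# Hypothetical toy data: two strongly correlated features.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [1.2, 1.9, 3.2, 3.8, 5.1]

ratio_pc1, ratio_pc2 = explained_variance_ratio_2d(toy_x, toy_y)
print(f"PC1: {ratio_pc1:.4f}, PC2: {ratio_pc2:.4f}")
```

When features are strongly correlated, as in this toy data, the first component captures nearly all of the variance; weakly correlated features spread the variance more evenly across components.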

### K-Means Clustering

Let's determine the optimal number of clusters for K-Means:

In [None]:
plt.figure(figsize=(12, 8))
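# A (start, stop) tuple excludes stop, so (2, 11) evaluates k = 2..10 like the silhouette loop below
visualizer = KElbowVisualizer(KMeans(random_state=42), k=(2, 11), timings=False)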
visualizer.fit(df_scaled)
visualizer.show()

range_n_clusters = list(range(2, 11))
silhouette_avg_scores = []

for n_clusters in range_n_clusters:
    kmeans = KMeans(n_clusters=n_clusters, random_state=42)
    cluster_labels = kmeans.fit_predict(df_scaled)
    silhouette_avg = silhouette_score(df_scaled, cluster_labels)
    silhouette_avg_scores.append(silhouette_avg)
    print(f"For n_clusters = {n_clusters}, the average silhouette score is {silhouette_avg:.4f}")

plt.figure(figsize=(12, 6))
plt.plot(range_n_clusters, silhouette_avg_scores, 'o-', color='royalblue')
plt.xlabel('Number of clusters')
plt.ylabel('Average Silhouette Score')
plt.title('Silhouette Analysis: Optimal Number of Clusters')
plt.grid(True, alpha=0.3)
plt.show()

```
For n_clusters = 2, the average silhouette score is 0.5538
For n_clusters = 3, the average silhouette score is 0.4827
For n_clusters = 4, the average silhouette score is 0.4705
For n_clusters = 5, the average silhouette score is 0.4981
For n_clusters = 6, the average silhouette score is 0.4671
For n_clusters = 7, the average silhouette score is 0.4436
For n_clusters = 8, the average silhouette score is 0.4248
For n_clusters = 9, the average silhouette score is 0.4116
For n_clusters = 10, the average silhouette score is 0.4073
```
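
For reference, the silhouette coefficient of a single point is `s = (b - a) / max(a, b)`, where `a` is the mean distance to the other points in its own cluster and `b` is the mean distance to the points of the nearest other cluster. The sketch below is a simplified, standard-library version of that definition on hypothetical toy points; it assumes at least two clusters and is meant only to illustrate what `silhouette_score` averages.

```python
import math

def silhouette_samples_simple(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b) with Euclidean distance."""
    scores = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Collect distances from point i to every other point, grouped by cluster.
        dist_by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                dist_by_cluster.setdefault(other, []).append(math.dist(p, q))
        own = dist_by_cluster.pop(label, [])
        if not own:  # singleton cluster: the silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # cohesion: mean intra-cluster distance
        b = min(sum(d) / len(d) for d in dist_by_cluster.values())  # separation
        scores.append((b - a) / max(a, b))
    return scores

# Hypothetical toy data: two well-separated groups of three points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 1, 0, 1, 0, 1]

good_scores = silhouette_samples_simple(toy_points, good_labels)
bad_scores = silhouette_samples_simple(toy_points, bad_labels)
mean_good = sum(good_scores) / len(good_scores)
mean_bad = sum(bad_scores) / len(bad_scores)
print(f"well-separated labels: {mean_good:.3f}, mixed labels: {mean_bad:.3f}")
```

Scores near 1 indicate compact, well-separated clusters, scores near 0 indicate overlapping clusters, and negative scores suggest points assigned to the wrong cluster.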

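The silhouette score is highest at k=2 (0.5538), but a two-way split is too coarse to support differentiated marketing. Among the finer partitions, k=5 has the highest score (0.4981), above both k=4 (0.4705) and k=6 (0.4671). Taken together with the elbow plot, 5 appears to be a good choice for the number of clusters. Let's apply K-Means with 5 clusters: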

In [None]:
kmeans = KMeans(n_clusters=5, random_state=42)
kmeans_labels = kmeans.fit_predict(df_scaled)

df['Cluster_KMeans'] = kmeans_labels
pca_df['Cluster_KMeans'] = kmeans_labels

plt.figure(figsize=(14, 10))
sns.scatterplot(x='PC1', y='PC2', hue='Cluster_KMeans', palette='viridis', 
                s=100, data=pca_df, alpha=0.7)
plt.title('K-Means Clustering (k=5) in PCA Space', fontsize=16)
plt.xlabel('Principal Component 1', fontsize=12)
plt.ylabel('Principal Component 2', fontsize=12)

centers = kmeans.cluster_centers_
pca_centers = pca.transform(centers)
plt.scatter(pca_centers[:, 0], pca_centers[:, 1], s=300, c='red', marker='X', label='Cluster Centers')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

In [None]:
plt.figure(figsize=(14, 10))
sns.scatterplot(x='Annual Income (k$)', y='Spending Score (1-100)', 
                hue='Cluster_KMeans', palette='viridis', s=100, data=df, alpha=0.7)
plt.title('K-Means Clustering: Annual Income vs Spending Score', fontsize=16)
plt.xlabel('Annual Income (k$)', fontsize=12)
plt.ylabel('Spending Score (1-100)', fontsize=12)
plt.grid(alpha=0.3)
plt.show()

In [None]:
cluster_analysis = df.groupby('Cluster_KMeans').agg({
    'Age': ['mean', 'std', 'min', 'max'],
    'Annual Income (k$)': ['mean', 'std', 'min', 'max'],
    'Spending Score (1-100)': ['mean', 'std', 'min', 'max'],
    'Gender': lambda x: (x == 'Female').mean() * 100
}).round(2)

# Flatten the two-level column index so each statistic keeps its feature name
cluster_analysis.columns = [
    '% Female' if feature == 'Gender' else f'{feature} ({stat})'
    for feature, stat in cluster_analysis.columns
]

print("K-Means Cluster Analysis:")
cluster_analysis
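
To make the mechanics behind `KMeans` more concrete, here is a minimal sketch of Lloyd's algorithm, the assign-then-update loop at the core of K-Means. It is a simplification under stated assumptions: it uses plain random initialization rather than the k-means++ initialization scikit-learn uses by default, runs a single initialization, and the function name and toy data are hypothetical.

```python
import math
import random

def kmeans_lloyd(points, k, n_iter=100, seed=42):
    """Lloyd's algorithm: assign points to the nearest centroid, then update means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initialization (not k-means++)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:  # converged: assignments no longer change
            break
        centroids = new_centroids
    # Inertia: within-cluster sum of squared distances (the elbow-plot quantity).
    inertia = sum(math.dist(p, centroids[label]) ** 2
                  for p, label in zip(points, labels))
    return labels, centroids, inertia

# Hypothetical toy data: two well-separated groups of three points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_labels, toy_centroids, toy_inertia = kmeans_lloyd(toy_points, k=2)
print(toy_labels, round(toy_inertia, 3))
```

The inertia returned here is the quantity that an elbow plot tracks as k increases: it always decreases with more clusters, so the "elbow" marks where additional clusters stop paying off.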

### Hierarchical Clustering

Now let's apply hierarchical clustering:

In [None]:
plt.figure(figsize=(16, 10))
dendrogram_data = linkage(df_scaled, method='ward')
dendrogram(dendrogram_data, truncate_mode='level', p=5)
plt.title('Hierarchical Clustering Dendrogram', fontsize=16)
plt.xlabel('Samples', fontsize=12)
plt.ylabel('Distance', fontsize=12)
plt.axhline(y=6, color='r', linestyle='--')
plt.show()
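
Ward linkage merges, at each step, the pair of clusters whose union increases the within-cluster sum of squares the least. That increase can be written as `n_a * n_b / (n_a + n_b) * ||c_a - c_b||^2`, where `c_a` and `c_b` are the cluster centroids. The sketch below evaluates this cost on hypothetical toy clusters; the function name is an assumption, and the dendrogram heights are a distance derived from this cost, so the numbers are not directly comparable to the y-axis of the plot.

```python
def ward_merge_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares if the two clusters are merged."""
    n_a, n_b = len(cluster_a), len(cluster_b)
    centroid_a = [sum(dim) / n_a for dim in zip(*cluster_a)]
    centroid_b = [sum(dim) / n_b for dim in zip(*cluster_b)]
    squared_distance = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return n_a * n_b / (n_a + n_b) * squared_distance

# Hypothetical toy clusters: two neighbours and one distant group.
near = [(0, 0), (0, 1)]
also_near = [(1, 0), (1, 1)]
far = [(10, 10), (10, 11)]

cost_near = ward_merge_cost(near, also_near)
cost_far = ward_merge_cost(near, far)
print(f"merge neighbours: {cost_near:.1f}, merge distant groups: {cost_far:.1f}")
```

Ward's method would merge the two neighbouring clusters first, since that merge is far cheaper. Long vertical gaps in a dendrogram correspond to such expensive merges, which is why cutting across them yields well-separated clusters.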

Based on the dendrogram, we'll also try 5 clusters for hierarchical clustering:

In [None]:
hc = AgglomerativeClustering(n_clusters=5, linkage='ward')
hc_labels = hc.fit_predict(df_scaled)

df['Cluster_Hierarchical'] = hc_labels
pca_df['Cluster_Hierarchical'] = hc_labels

plt.figure(figsize=(14, 10))
sns.scatterplot(x='PC1', y='PC2', hue='Cluster_Hierarchical', palette='viridis', 
                s=100, data=pca_df, alpha=0.7)
plt.title('Hierarchical Clustering (n=5) in PCA Space', fontsize=16)
plt.xlabel('Principal Component 1', fontsize=12)
plt.ylabel('Principal Component 2', fontsize=12)
plt.grid(alpha=0.3)
plt.show()

In [None]:
hc_cluster_analysis = df.groupby('Cluster_Hierarchical').agg({
    'Age': ['mean', 'std', 'min', 'max'],
    'Annual Income (k$)': ['mean', 'std', 'min', 'max'],
    'Spending Score (1-100)': ['mean', 'std', 'min', 'max'],
    'Gender': lambda x: (x == 'Female').mean() * 100
}).round(2)

# Flatten the two-level column index so each statistic keeps its feature name
hc_cluster_analysis.columns = [
    '% Female' if feature == 'Gender' else f'{feature} ({stat})'
    for feature, stat in hc_cluster_analysis.columns
]

print("Hierarchical Clustering Analysis:")
hc_cluster_analysis

### Cluster Visualization and Comparison

Let's visualize the clusters using different features:

In [None]:
# Visualization: Annual Income vs Spending Score by cluster
fig, axes = plt.subplots(1, 2, figsize=(22, 10))

sns.scatterplot(ax=axes[0], x='Annual Income (k$)', y='Spending Score (1-100)', 
                hue='Cluster_KMeans', palette='viridis', s=100, data=df, alpha=0.7)
axes[0].set_title('K-Means Clustering', fontsize=16)
axes[0].set_xlabel('Annual Income (k$)', fontsize=12)
axes[0].set_ylabel('Spending Score (1-100)', fontsize=12)
axes[0].grid(alpha=0.3)

sns.scatterplot(ax=axes[1], x='Annual Income (k$)', y='Spending Score (1-100)', 
                hue='Cluster_Hierarchical', palette='viridis', s=100, data=df, alpha=0.7)
axes[1].set_title('Hierarchical Clustering', fontsize=16)
axes[1].set_xlabel('Annual Income (k$)', fontsize=12)
axes[1].set_ylabel('Spending Score (1-100)', fontsize=12)
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# Visualization: Age vs Spending Score by cluster
fig, axes = plt.subplots(1, 2, figsize=(22, 10))

sns.scatterplot(ax=axes[0], x='Age', y='Spending Score (1-100)', 
                hue='Cluster_KMeans', palette='viridis', s=100, data=df, alpha=0.7)
axes[0].set_title('K-Means: Age vs Spending Score', fontsize=16)
axes[0].set_xlabel('Age', fontsize=12)
axes[0].set_ylabel('Spending Score (1-100)', fontsize=12)
axes[0].grid(alpha=0.3)

sns.scatterplot(ax=axes[1], x='Age', y='Spending Score (1-100)', 
                hue='Cluster_Hierarchical', palette='viridis', s=100, data=df, alpha=0.7)
axes[1].set_title('Hierarchical: Age vs Spending Score', fontsize=16)
axes[1].set_xlabel('Age', fontsize=12)
axes[1].set_ylabel('Spending Score (1-100)', fontsize=12)
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()
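
Side-by-side plots show whether the two methods broadly agree, but cluster IDs are arbitrary, so the colors need not match even when the partitions are identical. One label-invariant way to quantify agreement is the Rand index: the fraction of customer pairs that both clusterings treat the same way (together in both, or apart in both). The sketch below is a plain-Python illustration on hypothetical labels; in this notebook the inputs would be `kmeans_labels` and `hc_labels`, and scikit-learn's `adjusted_rand_score` offers a chance-corrected variant.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        agree += same_in_a == same_in_b  # both "together" or both "apart"
        total += 1
    return agree / total

# Hypothetical labels: same partition with renamed IDs, then one point moved.
labels_reference = [0, 0, 1, 1, 2, 2]
labels_renamed = [1, 1, 0, 0, 2, 2]
labels_shifted = [0, 0, 1, 1, 2, 0]

ri_renamed = rand_index(labels_reference, labels_renamed)
ri_shifted = rand_index(labels_reference, labels_shifted)
print(f"renamed IDs: {ri_renamed:.2f}, one point moved: {ri_shifted:.2f}")
```

A value close to 1 would indicate that K-Means and hierarchical clustering recover essentially the same segments, which increases confidence that the segments reflect structure in the data rather than an artifact of one algorithm.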

In [None]:
# 3D Visualization of clusters
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(14, 12))
ax = fig.add_subplot(111, projection='3d')

scatter = ax.scatter(df['Annual Income (k$)'], df['Spending Score (1-100)'], df['Age'], 
                     c=df['Cluster_KMeans'], cmap='viridis', s=70, alpha=0.7)

ax.set_xlabel('Annual Income (k$)', fontsize=12)
ax.set_ylabel('Spending Score (1-100)', fontsize=12)
ax.set_zlabel('Age', fontsize=12)
ax.set_title('3D Visualization of K-Means Clusters', fontsize=16)

cbar = plt.colorbar(scatter)
cbar.set_label('Cluster')

plt.tight_layout()
plt.show()

### In-depth Cluster Profiling

Let's create detailed profiles for each customer segment identified by K-Means:

In [None]:
# Create cluster profiles with additional statistics
cluster_profiles = df.groupby('Cluster_KMeans').agg({
    'Gender': lambda x: (x == 'Female').mean() * 100,
    'Age': ['mean', 'median', 'std'],
    'Annual Income (k$)': ['mean', 'median', 'std'],
    'Spending Score (1-100)': ['mean', 'median', 'std'],
    'CustomerID': 'count'
}).round(2)

cluster_profiles.columns = ['% Female', 'Age (Mean)', 'Age (Median)', 'Age (StD)', 
                            'Income (Mean)', 'Income (Median)', 'Income (StD)',
                            'Spending (Mean)', 'Spending (Median)', 'Spending (StD)',
                            'Customer Count']

total_customers = df.shape[0]
cluster_profiles['% of Customers'] = (cluster_profiles['Customer Count'] / total_customers * 100).round(2)

print("Detailed K-Means Cluster Profiles:")
cluster_profiles

In [None]:
# Visualize the distribution of features within each cluster using box plots
fig, axes = plt.subplots(3, 1, figsize=(14, 18))

sns.boxplot(ax=axes[0], x='Cluster_KMeans', y='Age', data=df, palette='viridis')
axes[0].set_title('Age Distribution by Cluster', fontsize=16)
axes[0].set_xlabel('Cluster', fontsize=12)
axes[0].set_ylabel('Age', fontsize=12)
axes[0].grid(alpha=0.3)

sns.boxplot(ax=axes[1], x='Cluster_KMeans', y='Annual Income (k$)', data=df, palette='viridis')
axes[1].set_title('Annual Income Distribution by Cluster', fontsize=16)
axes[1].set_xlabel('Cluster', fontsize=12)
axes[1].set_ylabel('Annual Income (k$)', fontsize=12)
axes[1].grid(alpha=0.3)

sns.boxplot(ax=axes[2], x='Cluster_KMeans', y='Spending Score (1-100)', data=df, palette='viridis')
axes[2].set_title('Spending Score Distribution by Cluster', fontsize=16)
axes[2].set_xlabel('Cluster', fontsize=12)
axes[2].set_ylabel('Spending Score (1-100)', fontsize=12)
axes[2].grid(alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# Gender distribution within each cluster
gender_by_cluster = pd.crosstab(df['Cluster_KMeans'], df['Gender'], normalize='index') * 100
gender_by_cluster = gender_by_cluster.reset_index()
gender_by_cluster = pd.melt(gender_by_cluster, id_vars=['Cluster_KMeans'], 
                           value_vars=['Female', 'Male'], 
                           var_name='Gender', value_name='Percentage')

plt.figure(figsize=(12, 8))
sns.barplot(x='Cluster_KMeans', y='Percentage', hue='Gender', data=gender_by_cluster, palette=['lightcoral', 'steelblue'])
plt.title('Gender Distribution by Cluster', fontsize=16)
plt.xlabel('Cluster', fontsize=12)
plt.ylabel('Percentage (%)', fontsize=12)
plt.grid(axis='y', alpha=0.3)
plt.show()

## Results and Interpretation

Based on our analysis, we have identified 5 distinct customer segments using K-Means clustering. Let's characterize each segment:

### Cluster Profiles

In [None]:
# Visualize the cluster profiles with radar charts
from math import pi

radar_features = ['Age', 'Annual Income (k$)', 'Spending Score (1-100)']
radar_df = df.groupby('Cluster_KMeans')[radar_features].mean()

radar_df_std = (radar_df - radar_df.min()) / (radar_df.max() - radar_df.min())

angles = np.linspace(0, 2*np.pi, len(radar_features), endpoint=False).tolist()
angles += angles[:1]

radar_values = radar_df_std.values.tolist()
for i in range(len(radar_values)):
    radar_values[i] += radar_values[i][:1]

fig, ax = plt.subplots(figsize=(12, 12), subplot_kw=dict(polar=True))

# Write each feature name just outside the outer ring, at its axis angle
for angle, feature in zip(angles[:-1], radar_features):
    ax.text(angle, 1.25, feature,
            horizontalalignment='center', verticalalignment='center',
            fontsize=12, weight='bold')

colors = plt.cm.viridis(np.linspace(0, 1, len(radar_df)))
for i, values in enumerate(radar_values):
    ax.plot(angles, values, linewidth=2, color=colors[i], label=f'Cluster {i}')
    ax.fill(angles, values, color=colors[i], alpha=0.1)

ax.set_xticks([])
ax.set_yticks([0, 0.25, 0.5, 0.75, 1])
ax.set_yticklabels(['0%', '25%', '50%', '75%', '100%'], fontsize=10)
ax.set_title('Customer Segment Profiles', fontsize=16, pad=20)
plt.legend(loc='upper right', bbox_to_anchor=(0.1, 1.1))
plt.show()

print("Customer Segment Profiles Based on K-Means Clustering:\n")
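
The radar chart applies min-max scaling to the cluster means, so 0% and 100% mark the lowest and highest cluster average for a feature, not the lowest and highest individual customer. The following small sketch illustrates this, using the rounded cluster mean ages quoted in the segment profiles in this section; the helper name `min_max_scale` is hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]

# Rounded mean age per cluster (Clusters 0-4) as quoted in the segment profiles.
cluster_mean_age = [44, 42, 28, 33, 57]

scaled_age = min_max_scale(cluster_mean_age)
for cluster_id, (age, scaled) in enumerate(zip(cluster_mean_age, scaled_age)):
    print(f"Cluster {cluster_id}: mean age {age} -> {scaled:.0%} on the radar axis")
```

Because the scaling is relative to the other clusters, a segment plotted at 0% on the Age axis is simply the youngest segment on average; the chart exaggerates differences between clusters and should be read alongside the raw means.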

Based on our analysis of the clusters, we can provide the following customer segment profiles:

**Cluster 0: Budget-Conscious Older Consumers (Standard Shoppers)**
- Demographics: Mixed gender (56% female), higher average age (44), moderate income (average $55k)
- Behavior: Moderate spending habits (average score 50), consistent but cautious purchasing patterns
- Size: Approximately 29% of customers
- Marketing Strategy: Value-oriented promotions, loyalty programs, practical product offerings

**Cluster 1: Affluent Conservative Shoppers**
- Demographics: Mixed gender (45% female), middle-aged (average 42), high income (average $88k)
- Behavior: Surprisingly low spending scores despite high income (average score 17)
- Size: Approximately 16% of customers
- Marketing Strategy: Premium products with strong value propositions, emphasize quality and exclusivity, educational content about product benefits

**Cluster 2: Young Enthusiastic Spenders**
- Demographics: Predominantly female (59%), younger (average 28), lower income (average $26k)
- Behavior: Very high spending scores despite lower income (average score 82)
- Size: Approximately 19% of customers
- Marketing Strategy: Trendy products, social media engagement, influencer partnerships, buy-now-pay-later options

**Cluster 3: High-Value Prime Customers**
- Demographics: Higher proportion of females (63%), young adults (average 33), high income (average $87k)
- Behavior: High spending scores (average 82), highest overall value customers
- Size: Approximately 17% of customers
- Marketing Strategy: VIP programs, early access to new products, personalized service, premium experiences

**Cluster 4: Potential Retirees (Low Engagement)**
- Demographics: Mostly male (63%), oldest group (average 57), lower-middle income (average $53k)
- Behavior: Very low spending scores (average 20), minimal engagement
- Size: Approximately 19% of customers
- Marketing Strategy: Re-engagement campaigns, targeted discounts, simplified shopping experiences, nostalgia marketing

## Discussion and Business Recommendations

Our unsupervised learning analysis has revealed five distinct customer segments with unique characteristics and behaviors. Let's discuss the business implications and provide actionable recommendations for each segment.

### Key Insights

1. **Income Alone Doesn't Predict Spending Behavior**: There's a notable disconnect between income and spending in some segments. Clusters 1 and 3 have nearly identical average incomes ($88k vs. $87k) but average spending scores of 17 and 82, while Cluster 2 has the lowest average income ($26k) and a spending score of 82.

2. **Age-Related Spending Patterns**: Younger customers (Clusters 2 and 3) tend to have higher spending scores regardless of income, while older customers (Clusters 1 and 4) show more conservative spending patterns.

3. **Gender Differences**: Female customers are more represented in high-spending segments (Clusters 2 and 3), suggesting potential gender-based differences in shopping attitudes.

4. **High-Value vs. High-Potential Segments**: Cluster 3 represents current high-value customers, while Cluster 1 represents high potential value if their spending can be increased.

### Actionable Marketing Strategies

#### For Budget-Conscious Older Consumers (Cluster 0)
- Implement a tiered loyalty program rewarding consistent purchasing
- Focus on practical, value-oriented messaging
- Offer bundle deals and family packages
- Emphasize quality and durability in product marketing

#### For Affluent Conservative Shoppers (Cluster 1)
- Develop an exclusive "Smart Shopper" program highlighting product value
- Create educational content about product benefits and longevity
- Implement targeted luxury campaigns emphasizing investment quality
- Offer personalized shopping experiences and concierge services
- Use testimonials from peers to build trust

#### For Young Enthusiastic Spenders (Cluster 2)
- Create a social media ambassador program
- Implement flash sales and limited-time offers
- Develop mobile-first shopping experiences
- Offer flexible payment options (buy-now-pay-later)
- Focus on trendy, Instagram-worthy products and experiences

#### For High-Value Prime Customers (Cluster 3)
- Implement a premium VIP program with exclusive benefits
- Offer early access to new products and sales
- Provide personalized recommendations based on past purchases
- Develop experiential marketing events
- Create a referral program with significant incentives

#### For Potential Retirees/Low Engagement (Cluster 4)
- Develop a re-engagement email campaign with nostalgic themes
- Simplify the shopping experience for ease of use
- Offer significant "win-back" discounts
- Create content focused on reliability and tradition
- Implement phone support and traditional service channels

### Implementation Roadmap

1. **Short-term Actions (1-3 months)**
   - Segment the customer database according to the identified clusters
   - Develop targeted email campaigns for each segment
   - Train customer service representatives on the different segment needs
   - Adjust inventory based on segment preferences

2. **Medium-term Actions (3-6 months)**
   - Implement loyalty programs tailored to each segment
   - Develop segment-specific landing pages on the website
   - Create personalized product recommendations based on segment behavior
   - Launch re-engagement campaigns for Clusters 1 and 4

3. **Long-term Actions (6-12 months)**
   - Develop new products specifically designed for high-value segments
   - Create an omnichannel experience tailored to each segment's preferences
   - Implement predictive analytics to identify when customers might be shifting between segments
   - Establish a customer insights team focused on ongoing segmentation refinement
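
As a rough illustration of the first short-term action, the sketch below assigns a customer to the nearest segment using the rounded cluster means quoted in the segment profiles. Several assumptions apply: gender is ignored, distances are scaled by the feature standard deviations from the statistical summary rather than by the fitted scaler, and the names `SEGMENTS`, `FEATURE_STD`, and `assign_segment` are hypothetical. In the notebook itself, applying the fitted `scaler` followed by `kmeans.predict` would be the more faithful route.

```python
import math

# Rounded cluster means (age, annual income in k$, spending score) from the
# segment profiles. Gender is left out here, which is a simplifying assumption.
SEGMENTS = {
    0: ("Budget-Conscious Older Consumers", (44, 55, 50)),
    1: ("Affluent Conservative Shoppers", (42, 88, 17)),
    2: ("Young Enthusiastic Spenders", (28, 26, 82)),
    3: ("High-Value Prime Customers", (33, 87, 82)),
    4: ("Potential Retirees (Low Engagement)", (57, 53, 20)),
}

# Feature standard deviations from the statistical summary (rounded).
FEATURE_STD = (13.97, 26.26, 25.82)

def assign_segment(age, income_k, spending_score):
    """Return (cluster_id, segment_name) of the nearest cluster mean."""
    customer = (age, income_k, spending_score)

    def scaled_distance(center):
        # Divide each difference by the feature's spread so units are comparable.
        return math.sqrt(sum(((c - v) / s) ** 2
                             for c, v, s in zip(center, customer, FEATURE_STD)))

    cluster_id = min(SEGMENTS, key=lambda cid: scaled_distance(SEGMENTS[cid][1]))
    return cluster_id, SEGMENTS[cluster_id][0]

# Hypothetical customers used only to exercise the function.
print(assign_segment(30, 90, 85))
print(assign_segment(24, 20, 80))
print(assign_segment(60, 50, 15))
```

Storing the resulting cluster ID alongside each customer record is what allows downstream systems, such as email tools or recommendation engines, to act on the segmentation.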

## Limitations and Future Work

### Limitations of Current Analysis

1. **Limited Feature Set**: The current analysis relies on a relatively small set of features (age, gender, income, and spending score). Additional data points such as purchase frequency, product categories, shopping time preferences, and channel preferences would enhance the segmentation.

2. **Static Segmentation**: The current approach provides a snapshot of customer segments but doesn't account for how customers might move between segments over time.

3. **Subjective Spending Score**: The "Spending Score" metric is somewhat subjective and may not fully capture all aspects of customer value.

4. **Limited Sample Size**: With only 200 customers, our sample may not be fully representative of the entire customer base.

### Future Work

1. **Dynamic Segmentation Model**: Develop a model that can track customer movement between segments over time, allowing for more proactive marketing strategies.

2. **Deeper Behavioral Analysis**: Incorporate additional features such as browsing behavior, purchase history, return rates, and customer service interactions.

3. **Predictive Modeling**: Build predictive models for each segment to forecast future spending patterns and lifetime value.

4. **A/B Testing of Strategies**: Implement controlled tests of different marketing strategies for each segment to measure effectiveness.

5. **Sentiment Analysis**: Incorporate customer reviews and feedback to understand sentiment differences across segments.

6. **Integration with CRM Systems**: Develop APIs to integrate segmentation data with existing customer relationship management systems for real-time personalization.
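
For the A/B testing item above, one common way to judge whether a segment-specific campaign changed the conversion rate is a two-proportion z-test. The sketch below is a minimal standard-library version; the conversion counts are hypothetical placeholders rather than results from this analysis, and the function name is an assumption.

```python
import math

def two_proportion_z_test(conversions_a, n_a, conversions_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    rate_a, rate_b = conversions_a / n_a, conversions_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conversions_a + conversions_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical example: control converts 40 of 400, campaign converts 60 of 400.
z_stat, p_val = two_proportion_z_test(40, 400, 60, 400)
print(f"z = {z_stat:.3f}, p = {p_val:.4f}")
```

Running such a test separately per segment matters here because a campaign that works for one cluster may have no effect, or a negative one, on another.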

## Conclusion

This unsupervised learning analysis has successfully identified five distinct customer segments within the mall shopping data, each with unique characteristics and behaviors. By understanding these segments, retailers can develop targeted marketing strategies, optimize product offerings, and create personalized experiences that resonate with different customer groups.

The clear differentiation between segments demonstrates the power of clustering techniques in uncovering patterns that might not be immediately apparent through traditional business analysis. Most importantly, this segmentation provides actionable insights that can directly inform business decisions and marketing strategies.

If the recommended strategies perform as intended, retailers should see increased customer engagement, higher conversion rates, improved customer satisfaction, and ultimately higher revenue and profitability. These outcomes remain hypotheses until they are measured, for example through the segment-level A/B tests proposed under Future Work.

As consumer behaviors continue to evolve, ongoing refinement of these segments and strategies will be essential. The methodologies demonstrated in this analysis provide a foundation for continued customer understanding and engagement in an increasingly competitive retail landscape.