### ✅ Cluster Validation Summary

To determine the optimal number of clusters (K), we used two validation techniques: the Elbow Method and the Silhouette Score. Both visualizations are shown below.

#### 📉 Elbow Method
![Elbow Method](../img/elbow_method.png)

- The curve shows a sharp drop in inertia up to **K = 3**, after which the improvements become marginal.
- This suggests that **K = 3** is a solid choice, offering a good trade-off between model simplicity and within-cluster compactness.
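
As a rough illustration of what the elbow curve measures, the sketch below runs a plain k-means on a tiny made-up dataset and records the inertia (the sum of squared distances from each point to its assigned centroid) for several values of K. The data points, the function names, and the deterministic farthest-first seeding are assumptions made for this example; they are not the project's actual pipeline.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from those chosen."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign to nearest centroid, then recompute means."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def inertia(points, centroids, labels):
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical (income, spending score) pairs forming three visible groups.
points = [
    (15, 20), (16, 22), (17, 18),
    (50, 50), (52, 48), (51, 52),
    (85, 80), (86, 82), (88, 79),
]

inertias = {k: inertia(points, *kmeans(points, k)) for k in range(1, 6)}
for k, value in inertias.items():
    print(f"K={k}: inertia={value:.2f}")
```

On this toy data the inertia falls steeply until K = 3 (one centroid per visible group) and changes little afterwards, which is the "elbow" shape described above.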

#### 📈 Silhouette Scores
![Silhouette Scores](../img/silhouette-scores.png)

- The highest silhouette score occurs at **K = 2**, but **K = 3** still achieves a relatively high value.
- Since K = 3 aligns with business expectations and segmentation clarity, it remains the preferred option.
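
For readers unfamiliar with the metric, here is a minimal, dependency-free sketch of the mean silhouette coefficient. For each point, `a` is the mean distance to the other members of its own cluster and `b` is the lowest mean distance to any other cluster; the point's score is `(b - a) / max(a, b)`. The data and both labelings are invented for illustration, and the function name is hypothetical.

```python
from math import dist


def mean_silhouette(points, labels):
    """Average silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # dist(p, p) is 0, so summing over the whole cluster and dividing
        # by (size - 1) gives the mean distance to the *other* members.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical (income, spending score) pairs forming three visible groups.
points = [
    (15, 20), (16, 22), (17, 18),
    (50, 50), (52, 48), (51, 52),
    (85, 80), (86, 82), (88, 79),
]
good_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]   # one label per visible group
mixed_labels = [0, 1, 2, 0, 1, 2, 0, 1, 2]  # deliberately scrambled

good_score = mean_silhouette(points, good_labels)
mixed_score = mean_silhouette(points, mixed_labels)
print(f"good labeling:  {good_score:.3f}")
print(f"mixed labeling: {mixed_score:.3f}")
```

A score near 1 means points sit much closer to their own cluster than to the next one; scores near 0 or below indicate overlapping or misassigned clusters.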

#### ✅ Final Conclusion
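Taking both metrics together, we conclude that **K = 3** offers a well-balanced solution: the elbow points to K = 3, and the silhouette score at K = 3 remains relatively high, so the result maintains **cluster cohesion**, **separation**, and **practical interpretability**.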

### 🧩 Cluster Summary – Averages and Distribution

Below are the average profiles and size distribution of each cluster identified in the segmentation. Together, they provide a quantitative overview of customer groups.

<div style="display: flex; gap: 20px; justify-content: center;">

<div>
<strong>Cluster Averages</strong><br>
<img src="../img/cluster-averages.png" width="500"/>
</div>

<div>
<strong>Cluster Distribution</strong><br>
<img src="../img/cluster-distribution.png" width="400"/>
</div>

</div>

#### ✅ Cluster Summary Interpretation

- **Cluster Averages**:  
  Each cluster shows distinct patterns in average age, income, and spending score, indicating meaningful behavioral differences.

- **Cluster Distribution**:  
  The number of clients per cluster is relatively balanced, with no dominant or underrepresented group.

- **Conclusion**:  
  The segmentation into 3 clusters is consistent and actionable for marketing strategies.
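
The two summaries above boil down to a per-cluster mean and a per-cluster share of clients. The sketch below shows that computation with the standard library only; the six customer records are hypothetical and the units are arbitrary. With pandas, the same result is typically obtained through a `groupby` on the cluster column.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical labelled customers: (age, annual income, spending score, cluster).
customers = [
    (22, 20, 15, 0), (26, 24, 25, 0),
    (45, 55, 80, 1), (51, 61, 90, 1),
    (28, 90, 50, 2), (32, 100, 60, 2),
]
FEATURES = ("age", "income", "spending")

# Group the feature rows by cluster label.
by_cluster = defaultdict(list)
for *features, cluster in customers:
    by_cluster[cluster].append(features)

# Cluster averages: mean of each feature column within a cluster.
averages = {
    cluster: dict(zip(FEATURES, (mean(column) for column in zip(*rows))))
    for cluster, rows in by_cluster.items()
}

# Cluster distribution: fraction of all customers that fall in each cluster.
shares = {
    cluster: len(rows) / len(customers) for cluster, rows in by_cluster.items()
}

for cluster in sorted(averages):
    print(cluster, averages[cluster], f"{shares[cluster]:.0%}")
```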


---

## PCA Cluster Visualization 

![PCA cluster plot](../img/pca_clusters.png)


## 📊 Key Insights


🟢 **Cluster 0 (green)**
- Located in the lower region of the PCA plot.
- Clearly separated from the other clusters.
- Represents a more homogeneous group, likely composed of younger clients with lower income.
- Matches the profile seen in boxplots and radar chart (low spending behavior).

🟠 **Cluster 1 (orange)**
- Positioned near the center, acting as a bridge between extremes.
- Represents a more diverse group with middle-aged clients, average income, and high spending.
- Likely the ideal target for retention and premium offers.

🔵 **Cluster 2 (blue)**
- Distributed in the upper-right region, with moderate separation.
- Likely consists of younger, high-income clients with untapped purchasing potential.
- Great target for upselling and aspirational campaigns.


### 🧠 Cluster Summary and Strategic Recommendations

| Cluster | Main Profile                        | Suggested Strategy                                     |
|---------|-------------------------------------|--------------------------------------------------------|
| 0       | Young, low income, low spending     | Loyalty programs, entry-level or accessible products   |
| 1       | Middle-aged, average income, high spending | Premium offers, personalized experiences               |
| 2       | Young, high income, moderate spending | Upselling, aspirational marketing, unique experiences  |

---

### 📊 Variable Distribution by Cluster

The boxplots below illustrate the distribution of the three key variables — age, annual income, and spending score — for each cluster. These visualizations help interpret behavioral and economic differences between the groups.

![Boxplots by Cluster](../img/boxplots.png)

#### 🔍 Interpretation:

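- **Age**: Highlights which clusters contain younger or older clients (Clusters 0 and 2 skew younger; Cluster 1 skews older).
- **Annual Income**: Reveals groups with higher or lower purchasing power (Cluster 2 is highest, Cluster 0 lowest).
- **Spending Score**: Indicates which clusters have a higher tendency to spend (Cluster 1 spends the most, Cluster 0 the least).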

These patterns provide actionable insights into customer behavior and segmentation strategy.


---

### 🔗 Variable Relationships by Cluster – Pairplot

The scatter matrix below displays the relationships between age, annual income, and spending score across all clusters. It provides a multidimensional view of how the groups differ from each other.

![Pairplot by Cluster](../img/pairplot.png)

#### 🧠 What insights can be drawn from this plot:

📌 **Age vs. Annual Income**:
- Cluster 2 tends to group **younger clients with higher incomes**.
- Cluster 1 includes **older clients with moderate incomes**.
- Cluster 0 is more **scattered**, covering various ages but generally lower income levels.

📌 **Age vs. Spending Score**:
- Moderate-to-high spending scores appear in **both younger (Cluster 2)** and **older (Cluster 1)** clients.
- This suggests spending potential across age ranges, possibly driven by different motivations (e.g., disposable income vs. lifestyle).

📌 **Annual Income vs. Spending Score**:
- Higher income does not always correlate with higher spending.
- This disconnect reveals **strategic opportunities**, like promoting spending incentives for high-income but low-engagement clients.
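
One way to check the income/spending observation numerically is a Pearson correlation, followed by a simple filter for high-income, low-spending clients. The sketch below is only an illustration: the five records are invented, and the thresholds `HIGH_INCOME` and `LOW_SPENDING` are hypothetical cut-offs that would need tuning on real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (annual income, spending score) records.
clients = [(20, 60), (40, 20), (60, 80), (80, 30), (100, 55)]
incomes = [income for income, _ in clients]
spending = [score for _, score in clients]

r = pearson(incomes, spending)
print(f"income vs. spending correlation: {r:.2f}")

# Hypothetical cut-offs for "high income but low engagement".
HIGH_INCOME = 70
LOW_SPENDING = 40
untapped = [
    (income, score)
    for income, score in clients
    if income >= HIGH_INCOME and score <= LOW_SPENDING
]
print("high-income, low-spending clients:", untapped)
```

A coefficient near zero, as in this made-up sample, means income alone says little about spending, so the filtered list is a more direct way to find candidates for spending incentives.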


---

### 🕸️ Normalized Cluster Profiles – Radar Chart

The radar chart below compares the average (normalized) values of age, income, and spending score across each cluster. This visualization highlights key behavioral patterns and helps define strategies per segment.

![Radar Chart](../img/radar-chart.png)
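
The normalization behind a chart like this can be sketched as follows. The source does not say which scaling was used, so min-max scaling of the cluster averages is an assumption here, and the averages themselves are hypothetical values chosen to mirror the profiles described in this report.

```python
# Hypothetical cluster averages (not the project's real numbers).
cluster_means = {
    0: {"age": 24, "income": 22, "spending": 20},
    1: {"age": 48, "income": 58, "spending": 85},
    2: {"age": 30, "income": 95, "spending": 55},
}


def min_max_normalize(profiles):
    """Rescale each feature to [0, 1] across clusters (assumed method)."""
    features = list(next(iter(profiles.values())))
    bounds = {
        f: (
            min(p[f] for p in profiles.values()),
            max(p[f] for p in profiles.values()),
        )
        for f in features
    }
    normalized = {}
    for cluster, profile in profiles.items():
        normalized[cluster] = {}
        for f in features:
            low, high = bounds[f]
            # A constant feature carries no contrast; map it to 0.
            normalized[cluster][f] = (profile[f] - low) / (high - low) if high > low else 0.0
    return normalized


normalized = min_max_normalize(cluster_means)
for cluster, profile in normalized.items():
    print(cluster, {f: round(v, 2) for f, v in profile.items()})
```

Scaling the averages this way always pushes the lowest cluster to 0 and the highest to 1 on every axis, which makes contrasts easy to see but can exaggerate small differences; scaling against the range of the raw data is a common alternative.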

#### 🔍 Cluster Insights

🔵 **Cluster 0**  
- Younger clients with lower income and low spending.  
- 🧠 Strategy: Entry-level products, loyalty programs, and affordable offers.

🟠 **Cluster 1**  
- Older clients with medium income and the **highest spending levels**.  
- 🧠 Strategy: Premium offerings, value-added services, and retention-focused campaigns.

🟢 **Cluster 2**  
- Young clients with high income and moderate spending.  
- 🧠 Strategy: Aspirational marketing and exclusive experiences to boost spending potential.


---