# Country Development Clustering: Identifying Priority Nations for Aid  
## Third Notebook: Objective 

Building on the clustering analysis performed in Notebook 2, this third notebook focuses on **translating structural patterns among countries into actionable insights** for HELP International. While the previous work identified groups of nations with similar socio-economic and health characteristics, this notebook aims to **interpret these clusters in a policy-relevant context**, highlighting where targeted aid or development programs could have the most impact.

Specifically, we will:

- **Analyze the composition of each cluster** to understand the development profile of countries (e.g., low vs. high development, trade patterns, health indicators).  
- **Identify potential priority nations for intervention** based on cluster membership and key socio-economic challenges.  
- **Provide next steps to identify actionable recommendations** for resource allocation, program design, and monitoring strategies.  
- **Visualize clusters in a clear, communicable format** to support decision-making by stakeholders.

This notebook bridges **data-driven insights** with **practical recommendations**, ensuring that clustering results can inform strategic aid planning effectively.

**Author:** J-F Jutras  
**Date:** September 2025  
**Dataset:** *Unsupervised Learning on Country Data – Kaggle*


## 3.1-Cluster Analysis

The analysis initially explored GDP-relative ratios (% of GDP), which provided a useful practical exercise. However, for the final clustering and recommendations, we focus exclusively on the **Amounts dataset** (absolute per-capita values).  

**Why:**  
- Ratios reflect economic structure but not actual resource capacity; countries with similar ratios can have vastly different levels of wealth.  
- Absolute values are more actionable for aid allocation and avoid redundancy that can distort clustering results.  
- Among the clustering algorithms tested, **K-Means was the only one whose clusters both scored acceptably on the evaluation metrics and remained interpretable**, which supports using absolute values for the final analysis.  

The final insights and recommendations are based on absolute per-capita economic and health indicators, using K-Means clustering to ensure meaningful and actionable groupings for HELP International.
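As a minimal sketch of the approach described above, the snippet below converts GDP-relative ratios into per-capita amounts, standardizes them, and fits K-Means. The toy rows, the column names (`exports`, `imports`, `health`, `income`, `gdpp`), and the choice of two clusters are illustrative assumptions, not the notebook's actual data or settings. Countries A and B share identical ratios yet differ greatly in wealth, which is the situation the ratio-based view cannot distinguish.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical toy rows; column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "country": ["A", "B", "C", "D", "E", "F"],
        "exports": [30.0, 30.0, 25.0, 28.0, 40.0, 35.0],  # % of GDP
        "imports": [35.0, 35.0, 30.0, 33.0, 38.0, 36.0],  # % of GDP
        "health": [6.0, 6.0, 5.0, 7.0, 7.0, 8.0],         # % of GDP
        "income": [1500, 45000, 1200, 1800, 40000, 52000],
        "gdpp": [600, 40000, 500, 800, 38000, 48000],      # GDP per capita
    }
).set_index("country")

# Convert GDP-relative ratios into absolute per-capita amounts.
amounts = df.copy()
for col in ["exports", "imports", "health"]:
    amounts[col] = df[col] / 100 * df["gdpp"]

# Standardize so no single indicator dominates the Euclidean distance.
X = StandardScaler().fit_transform(amounts)

# Two clusters are enough for this toy example.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
score = silhouette_score(X, labels)

amounts["cluster"] = labels
print(amounts)
print(f"silhouette: {score:.3f}")
```

Although A and B have the same ratios, their per-capita amounts place them in different clusters. The silhouette score is one common evaluation metric; the source does not state which metrics were used in Notebook 2.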



## 3.2-Identify Potential Priority Nations for Intervention

![image.png](attachment:bdd21b0d-f61f-4508-9b3f-0b6978ea753c.png)

### Cluster 0 (80 countries)
**Description:** Slightly above-average exports and imports, moderate health spending, slightly below-average income and GDP. Mostly mid-level countries in economic and health measures.  

**Countries:**  
Albania, Algeria, Antigua and Barbuda, Argentina, Armenia, Azerbaijan, Barbados, Belarus, Belize, Bhutan, Bolivia, Bosnia and Herzegovina, Botswana, Brazil, Bulgaria, Cape Verde, Chile, China, Colombia, Costa Rica, Croatia, Dominican Republic, Ecuador, Egypt, El Salvador, Equatorial Guinea, Estonia, Fiji, Gabon, Georgia, Grenada, Guatemala, Guyana, Hungary, Indonesia, Iran, Iraq, Jamaica, Jordan, Kazakhstan, Kyrgyz Republic, Latvia, Lebanon, Libya, Lithuania, Macedonia, FYR, Malaysia, Maldives, Mauritius, Micronesia, Fed. Sts., Moldova, Mongolia, Montenegro, Morocco, Namibia, Panama, Paraguay, Peru, Philippines, Poland, Romania, Russia, Samoa, Serbia, Seychelles, South Africa, Sri Lanka, St. Vincent and the Grenadines, Suriname, Thailand, Tonga, Tunisia, Turkey, Turkmenistan, Ukraine, Uruguay, Uzbekistan, Vanuatu, Venezuela, Vietnam

### Cluster 1 (39 countries)
**Description:** High exports, imports, health spending, income, and GDP. Represents the wealthiest and most developed countries.  

**Countries:**  
Australia, Austria, Bahamas, Bahrain, Belgium, Brunei, Canada, Cyprus, Czech Republic, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Japan, Kuwait, Luxembourg, Malta, Netherlands, New Zealand, Norway, Oman, Portugal, Qatar, Saudi Arabia, Singapore, Slovak Republic, Slovenia, South Korea, Spain, Sweden, Switzerland, United Arab Emirates, United Kingdom, United States

### Cluster 2 (47 countries)
**Description:** Very low exports, imports, health spending, income, and GDP. Represents the least developed countries.  

**Countries:**  
Afghanistan, Angola, Bangladesh, Benin, Burkina Faso, Burundi, Cambodia, Cameroon, Central African Republic, Chad, Comoros, Congo, Dem. Rep., Congo, Rep., Côte d'Ivoire, Eritrea, Gambia, Ghana, Guinea, Guinea-Bissau, Haiti, India, Kenya, Kiribati, Lao, Lesotho, Liberia, Madagascar, Malawi, Mali, Mauritania, Mozambique, Myanmar, Nepal, Niger, Pakistan, Rwanda, Senegal, Sierra Leone, Solomon Islands, Sudan, Tajikistan, Tanzania, Timor-Leste, Togo, Uganda, Yemen, Zambia

### Cluster 3 (1 country)
**Description:** Nigeria forms a single-country cluster because its profile differs sharply from those of all other groups; the radar plot below compares it with the other cluster averages.  

**Country:**  
Nigeria
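Membership lists like the ones above can be derived from the fitted labels with a simple group-by. The sketch below uses a handful of countries and their cluster ids taken from the lists above; the DataFrame name `labeled` and its column names are assumptions for illustration.

```python
import pandas as pd

# A few (country, cluster) pairs taken from the lists above.
labeled = pd.DataFrame(
    {
        "country": ["Albania", "Brazil", "Canada", "Japan", "Chad", "Nepal", "Nigeria"],
        "cluster": [0, 0, 1, 1, 2, 2, 3],
    }
)

# One sorted list of country names per cluster.
members = labeled.groupby("cluster")["country"].apply(sorted)
# Number of countries per cluster, ordered by cluster id.
sizes = labeled["cluster"].value_counts().sort_index()

for cluster_id, countries in members.items():
    print(f"Cluster {cluster_id} ({sizes[cluster_id]}): {', '.join(countries)}")
```

On the full dataset, the four cluster sizes reported above (80, 39, 47, and 1) sum to 167 countries, which is a quick sanity check that every country received exactly one label.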


![image.png](attachment:639c6b54-4bb2-46a3-9b89-1ca8b09ca511.png)

The K-Means clustering analysis revealed that Nigeria forms its own distinct cluster (Cluster 3) in the Amounts dataset. This separation is justified by its exceptionally high values across several key economic indicators compared to other countries in the dataset.

**Key Observations from the Radar Plot:**
- Nigeria's per-capita exports, imports, health spending, income, and GDP are significantly higher than the averages of Cluster 0 (mid-level countries) and Cluster 2 (lower-level countries).
- The radar plot clearly shows that Nigeria's profile is extreme along multiple dimensions simultaneously, which makes it an outlier in terms of absolute economic and health measures.
- Other clusters, such as Cluster 0 and Cluster 2, contain countries with more balanced or moderate values, highlighting the stark contrast with Nigeria.

**Interpretation:**
- Nigeria's exceptional economic and health indicators justify its placement in a standalone cluster.
- From a clustering perspective, K-Means minimizes within-cluster variance, so a sufficiently extreme point can end up in a cluster of its own: assigning it to a larger group would pull that group's centroid away from its other members.
- Treating Nigeria as its own cluster keeps the centroids of Clusters 0, 1, and 2 representative of their members, so the analysis of those clusters remains meaningful and actionable for aid allocation.
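The singleton-cluster behavior can be reproduced on synthetic data. The sketch below is not the notebook's data: it builds three tight, hypothetical groups plus one extreme point and shows that K-Means with four clusters assigns the extreme point to a cluster by itself.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Three tight synthetic groups of 20 points each (purely illustrative).
centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
groups = [c + rng.normal(scale=0.3, size=(20, 2)) for c in centers]

# One extreme point far from every group, appended as the last row.
outlier = np.array([[40.0, 40.0]])
X = np.vstack(groups + [outlier])

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
sizes = np.bincount(labels)          # number of points in each cluster
outlier_cluster = labels[-1]         # cluster id assigned to the extreme point

print("cluster sizes:", sizes)
print("outlier is alone:", sizes[outlier_cluster] == 1)
```

Merging the extreme point into any of the tight groups would greatly increase the within-cluster sum of squares, so the lowest-cost solution gives it a dedicated centroid.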

![image.png](attachment:ef593d23-969a-4dca-9fc2-e66007548a45.png)

The bar chart of average standardized values per cluster provides a clear overview of the development profiles across the groups of countries. Cluster 1 stands out as the wealthiest group, with high per-capita exports, imports, health spending, income, and GDP. Cluster 0 represents mid-level countries with slightly above-average trade activity, moderate health spending, and slightly below-average income and GDP. Cluster 2 includes the least developed nations, characterized by very low values across all economic and health indicators. Finally, Cluster 3, consisting solely of Nigeria, highlights its unique profile, which is extreme on several indicators compared to other countries.
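A cluster profile of this kind can be computed by standardizing each feature and averaging within clusters. The values below are hypothetical and the column names are assumptions; only the mechanics are meant to carry over.

```python
import pandas as pd

features = ["exports", "imports", "health", "income", "gdpp"]

# Hypothetical per-capita amounts with cluster labels already attached.
data = pd.DataFrame(
    {
        "exports": [2500, 3000, 20000, 25000, 150, 200],
        "imports": [2800, 3200, 18000, 22000, 300, 350],
        "health": [400, 500, 3500, 4500, 30, 40],
        "income": [11000, 13000, 42000, 50000, 1500, 2000],
        "gdpp": [6000, 7000, 40000, 48000, 600, 900],
        "cluster": [0, 0, 1, 1, 2, 2],
    }
)

# z-score each feature across all countries (population standard deviation).
scaled = (data[features] - data[features].mean()) / data[features].std(ddof=0)

# Average standardized value per cluster: one row per cluster, one column per feature.
profile = scaled.groupby(data["cluster"]).mean()
print(profile.round(2))
```

With matplotlib installed, `profile.plot(kind="bar")` would draw one group of bars per cluster, comparable in spirit to the chart above. Positive values mean a cluster sits above the overall mean on that feature, negative values below.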

## 3.3-Provide Next Steps to Identify Actionable Recommendations

The cluster analysis provides a structured way to identify groups of countries with similar development profiles and serves as a first-pass “pool” of potential priority nations for aid. However, this grouping alone does not determine the exact allocation of resources. To translate these insights into actionable interventions with real impact, the following steps are recommended:

**Prioritize Clusters as a Starting Point**  
   Use clusters to highlight countries that are generally low in absolute per-capita economic and health measures (Cluster 2 and parts of Cluster 0) as initial candidates for aid programs. These clusters represent a first-order view of vulnerability.
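One way to turn "Cluster 2 and parts of Cluster 0" into a concrete shortlist is sketched below. The data are hypothetical, and the rule for the Cluster 0 subset (income at or below that cluster's 25th percentile) is an assumption chosen for illustration, not a threshold taken from the analysis.

```python
import pandas as pd

# Hypothetical per-capita figures; cluster ids follow the notebook's numbering.
df = pd.DataFrame(
    {
        "country": ["P", "Q", "R", "S", "T", "U"],
        "cluster": [2, 2, 0, 0, 0, 1],
        "income": [900, 1500, 4000, 9000, 15000, 45000],
        "health": [30, 45, 150, 400, 700, 4000],
    }
).set_index("country")

# All of Cluster 2 is included.
in_cluster2 = df["cluster"] == 2

# Assumed rule: Cluster 0 countries at or below that cluster's 25th income percentile.
cutoff = df.loc[df["cluster"] == 0, "income"].quantile(0.25)
weak_cluster0 = (df["cluster"] == 0) & (df["income"] <= cutoff)

candidates = df[in_cluster2 | weak_cluster0].sort_values("income")
print(candidates)
```

The cutoff and the indicator used for it are policy choices; the same pattern works with health spending or any other indicator.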

**Quantify Key Drivers with Statistical and Predictive Analysis**  
   Determine which features most strongly influence critical outcomes such as child mortality, life expectancy, or income:

   - **Correlation Analysis:** Identify variables most associated with poor outcomes.  
   - **Regression Models:** Quantify the contribution of each feature to key health and economic indicators.  
   - **Feature Importance / Machine Learning Models:** Use Random Forests or XGBoost, optionally with SHAP values, to rank variables by predictive importance.

   This step allows prioritization not just by overall development, but by the specific dimensions where interventions can make the most difference.
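A minimal sketch of the correlation and feature-importance steps follows, using a Random Forest (one of the options named above). The data are synthetic: the outcome `child_mort` is constructed to depend strongly on `income`, weakly on `health`, and not at all on `inflation`, so the expected ranking is known in advance. All column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 200

# Synthetic predictors on a 0-1 scale (illustrative only).
X = pd.DataFrame(
    {
        "income": rng.uniform(0, 1, n),
        "health": rng.uniform(0, 1, n),
        "inflation": rng.uniform(0, 1, n),
    }
)

# Synthetic outcome: driven mostly by income, partly by health, plus noise.
child_mort = 100 - 60 * X["income"] - 20 * X["health"] + rng.normal(0, 3, n)

# Correlation analysis: linear association of each feature with the outcome.
corr = X.corrwith(child_mort).sort_values()

# Feature importance from a Random Forest regressor.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, child_mort)
importance = pd.Series(model.feature_importances_, index=X.columns)
importance = importance.sort_values(ascending=False)

print(corr.round(2))
print(importance.round(3))
```

Both views rank `income` first here because the data were built that way. On real data, importance scores describe predictive association rather than causal effect, so they are best read as a guide to where closer investigation is warranted.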

**Identify Country-Specific Needs Beyond Cluster Averages**  
   Even within higher-performing clusters, some countries may have severe deficiencies in particular areas. For example, a country in Cluster 0 may have moderate economic indicators but very low health spending or high child mortality. Statistical insights can highlight these outliers, ensuring aid targets the most pressing gaps rather than cluster membership alone.
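Within-cluster outliers of this kind can be flagged by standardizing each indicator relative to the country's own cluster. The sketch below uses hypothetical values and an assumed threshold of two standard deviations; country H has mid-level cluster membership but very low health spending and high child mortality.

```python
import pandas as pd

# Hypothetical values; H is deliberately out of line with the rest of Cluster 0.
df = pd.DataFrame(
    {
        "country": list("ABCDEFGHIJK"),
        "cluster": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
        "health": [400, 420, 380, 410, 390, 405, 395, 60, 4000, 4200, 3900],
        "child_mort": [18, 20, 17, 19, 21, 18, 20, 70, 4, 5, 3],
    }
).set_index("country")

cols = ["health", "child_mort"]
grouped = df.groupby("cluster")[cols]

# z-score of each country relative to its own cluster's mean and std.
z = (df[cols] - grouped.transform("mean")) / grouped.transform("std")

# Assumed rule: flag unusually low health spending or unusually high child mortality.
flagged = df[(z["health"] < -2) | (z["child_mort"] > 2)]
print(flagged)
```

Because the comparison is made within each cluster, a country is flagged for being weak relative to its peers, not relative to the whole dataset. The threshold is a tuning choice and would need to be looser for very small clusters.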

**Link Feature-Level Insights to Project Planning**  
   Aid programs and projects should be designed to directly address the features identified as most critical. For instance:

   - If health spending and child mortality are top drivers, projects could include vaccination campaigns, maternal health programs, or healthcare infrastructure improvements.  
   - If income and GDP per capita are critical, programs could focus on vocational training, microfinance, or local enterprise support.

   This ensures that project-level decisions are aligned with the data-driven understanding of which interventions can maximize impact.

**Allocate Resources Strategically and Monitor Impact**  
   Combine cluster-based prioritization with feature-level insights to distribute funds where they can achieve measurable improvements. Monitor KPIs tied to the targeted features and adjust programs iteratively as new data informs the effectiveness of interventions.

Clustering provides an initial identification of priority countries, but meaningful impact depends on feature-level analysis and project-specific planning. By linking country clusters to targeted projects addressing the most critical socio-economic or health indicators, HELP International can ensure that aid programs are both strategic and results-driven.
