## Project Overview

This project explores international debt data sourced from the World Bank, focusing on debt obligations held by developing countries across multiple financial indicators. My objective is to use SQL to perform structured exploratory data analysis, apply aggregations, and extract comparative insights from a multi-country financial dataset.

The dataset contains country-level debt metrics (in current USD) categorized by specific debt indicators. Using SQL, I compare borrowing and repayment amounts across countries to identify which nations carry the largest and smallest obligations.

---

## Analytical Objectives

The analysis is driven by the following key business questions:

1. How many distinct countries are represented in the dataset?
2. Which country holds the highest total amount of debt (aggregated across indicators)?
3. Which country has the lowest principal repayment amount (indicator `DT.AMT.DLXF.CD`)?

To answer these questions, I leverage:

- `COUNT(DISTINCT ...)` for entity-level validation  
- `SUM()` aggregations for financial consolidation  
- `GROUP BY` for country-level summarization  
- `ORDER BY` and ranking logic for comparative analysis  
- Filtering and conditional logic to isolate repayment indicators  
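
The techniques listed above can be sketched together on a tiny in-memory table. This is a minimal sketch using Python's built-in `sqlite3` module, not the project's actual environment. The sample rows, the country names, and the `XX.SAMPLE.CD` code are invented for illustration; only the table name, the column names, and `DT.AMT.DLXF.CD` come from this project.

```python
import sqlite3

# In-memory database so the sketch runs without any external file.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE international_debt (
        country_name   TEXT,
        country_code   TEXT,
        indicator_name TEXT,
        indicator_code TEXT,
        debt           REAL
    )
    """
)

# Hypothetical rows: names, amounts, and XX.SAMPLE.CD are invented.
sample_rows = [
    ("Country A", "AAA", "Principal repayments (sample)", "DT.AMT.DLXF.CD", 300.0),
    ("Country A", "AAA", "Other indicator (sample)", "XX.SAMPLE.CD", 700.0),
    ("Country B", "BBB", "Principal repayments (sample)", "DT.AMT.DLXF.CD", 50.0),
    ("Country B", "BBB", "Other indicator (sample)", "XX.SAMPLE.CD", 150.0),
    ("Country C", "CCC", "Principal repayments (sample)", "DT.AMT.DLXF.CD", 400.0),
]
conn.executemany(
    "INSERT INTO international_debt VALUES (?, ?, ?, ?, ?)", sample_rows
)

# COUNT(DISTINCT ...): how many distinct countries appear.
num_countries = conn.execute(
    "SELECT COUNT(DISTINCT country_name) FROM international_debt"
).fetchone()[0]

# SUM + GROUP BY + ORDER BY: consolidate per country, then rank.
totals = conn.execute(
    """
    SELECT country_name, SUM(debt) AS total_debt
    FROM international_debt
    GROUP BY country_name
    ORDER BY total_debt DESC
    """
).fetchall()

# WHERE filter: isolate one repayment indicator before ranking.
lowest_repayment = conn.execute(
    """
    SELECT country_name, debt
    FROM international_debt
    WHERE indicator_code = 'DT.AMT.DLXF.CD'
    ORDER BY debt
    LIMIT 1
    """
).fetchone()

print(num_countries)
print(totals)
print(lowest_repayment)
```

On the sample rows, the ranking puts Country A first because its two indicator rows sum to the largest total, while the filtered query considers only the principal repayment rows.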

---

## Dataset Description

### Table: `international_debt`

| Column | Definition | Data Type |
|--------|------------|-----------|
| `country_name` | Name of the country | `varchar` |
| `country_code` | Code representing the country | `varchar` |
| `indicator_name` | Description of the debt indicator | `varchar` |
| `indicator_code` | Code representing the debt indicator | `varchar` |
| `debt` | Value of the debt indicator for the given country (in current US dollars) | `float` |

Each row represents a country–indicator pair, meaning that total country-level debt must be computed using aggregation across multiple indicators.
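
Because totals depend on one row per country and indicator, it can help to check that assumption before summing. The sketch below uses Python's built-in `sqlite3` module with invented sample rows (the country names, the `XX.ONE.CD` and `XX.TWO.CD` codes, and all amounts are hypothetical) and includes one deliberate duplicate to show how it would surface and inflate a total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE international_debt (
        country_name   TEXT,
        country_code   TEXT,
        indicator_name TEXT,
        indicator_code TEXT,
        debt           REAL
    )
    """
)

# Hypothetical rows; the last row deliberately repeats a country-indicator pair.
sample_rows = [
    ("Country A", "AAA", "Indicator one (sample)", "XX.ONE.CD", 100.0),
    ("Country A", "AAA", "Indicator two (sample)", "XX.TWO.CD", 250.0),
    ("Country B", "BBB", "Indicator one (sample)", "XX.ONE.CD", 40.0),
    ("Country B", "BBB", "Indicator one (sample)", "XX.ONE.CD", 40.0),
]
conn.executemany(
    "INSERT INTO international_debt VALUES (?, ?, ?, ?, ?)", sample_rows
)

# Validation: any pair appearing more than once would be double counted by SUM().
duplicates = conn.execute(
    """
    SELECT country_code, indicator_code, COUNT(*) AS n
    FROM international_debt
    GROUP BY country_code, indicator_code
    HAVING COUNT(*) > 1
    """
).fetchall()

# Aggregation: collapse the indicator rows into one total per country.
per_country = conn.execute(
    """
    SELECT country_name, COUNT(*) AS indicator_rows, SUM(debt) AS total_debt
    FROM international_debt
    GROUP BY country_name
    ORDER BY country_name
    """
).fetchall()

print(duplicates)
print(per_country)
```

An empty result from the duplicate check would support summing `debt` directly; here the repeated pair doubles Country B's total.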

---

## Technical Focus

This project demonstrates:

- Relational data exploration using SQL  
- Financial metric aggregation across categorical indicators  
- Comparative country-level analysis  
- Data validation and summarization techniques  
- Writing clean, structured, and reproducible SQL queries  

The goal is to simulate a real-world financial analysis scenario in which a data analyst must summarize large-scale economic data and extract meaningful insights for stakeholders.

```sql
-- num_distinct_countries
SELECT COUNT(DISTINCT country_name) AS total_distinct_countries
FROM international_debt;
```

| total_distinct_countries |
|--------------------------|
| 124 |


```sql
-- highest_debt_country
SELECT country_name, SUM(debt) AS total_debt
FROM international_debt
GROUP BY country_name
ORDER BY total_debt DESC
LIMIT 1;
```

| country_name | total_debt |
|--------------|------------|
| China | 285793500000.0 |


```sql
-- lowest_principal_repayment
SELECT country_name, indicator_name, debt AS lowest_repayment
FROM international_debt
WHERE indicator_code = 'DT.AMT.DLXF.CD'
ORDER BY lowest_repayment
LIMIT 1;
```

| country_name | indicator_name | lowest_repayment |
|--------------|----------------|------------------|
| Timor-Leste | Principal repayments on external debt, long-te... | 825000 |
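
`ORDER BY ... LIMIT 1` returns a single row, so a tie for the minimum would be hidden, and in engines that sort `NULL` first in ascending order (SQLite is one) a missing value could take the top spot. Whether the real dataset contains ties or missing values is not stated here, so the following is only a hedged sketch of an alternative that compares against `MIN(debt)`. It uses Python's built-in `sqlite3` module, a reduced set of columns, and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified table: only the columns this query needs.
conn.execute(
    "CREATE TABLE international_debt "
    "(country_name TEXT, indicator_code TEXT, debt REAL)"
)

# Hypothetical rows: one tie for the minimum and one missing value.
sample_rows = [
    ("Country A", "DT.AMT.DLXF.CD", 500.0),
    ("Country B", "DT.AMT.DLXF.CD", 90.0),
    ("Country C", "DT.AMT.DLXF.CD", 90.0),  # ties with Country B
    ("Country D", "DT.AMT.DLXF.CD", None),  # missing repayment value
]
conn.executemany("INSERT INTO international_debt VALUES (?, ?, ?)", sample_rows)

# MIN() ignores NULLs, and the equality comparison keeps every tied country.
lowest = conn.execute(
    """
    SELECT country_name, debt AS lowest_repayment
    FROM international_debt
    WHERE indicator_code = 'DT.AMT.DLXF.CD'
      AND debt = (
          SELECT MIN(debt)
          FROM international_debt
          WHERE indicator_code = 'DT.AMT.DLXF.CD'
      )
    ORDER BY country_name
    """
).fetchall()

print(lowest)
```

With a unique, non-null minimum, this form returns the same single row as the `LIMIT 1` query.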
