#  Evaluating GDP per Capita using CASE and IF

##  Learning Objectives
By the end of this training, you should be able to:
- Understand the concept of **GDP per capita** and its importance in measuring economic performance.
- Calculate **GDP per capita** using SQL.
- Apply **CASE** and **IF** statements to categorize countries based on GDP per capita.



##  Overview
In this notebook, we will explore how to evaluate **Gross Domestic Product (GDP) per capita** using SQL.  

GDP per capita measures the total output of a country divided by its population — giving the **average economic output per person**.  
It’s a useful metric for comparing living standards and economic health across different countries and time periods.

We’ll be using the `united_nations.Access_to_Basic_Services` table, which includes:
- Estimated GDP in billions  
- Estimated population in millions  
- Country and time period information  

Let's begin by calculating GDP per capita for each country.


In [1]:
%load_ext sql

In [2]:
%sql mysql+pymysql://root:Mysql%40505@localhost:3306/united_nations
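
The `%40` in the connection string above is the URL-encoded form of `@`: a password that contains reserved characters has to be percent-encoded before it goes into a database URL. The following is a minimal Python sketch of building such a URL, assuming the password is supplied through a hypothetical `DB_PASSWORD` environment variable instead of being typed into the notebook. The helper name `build_mysql_url` and the sample password are illustrative and not part of the original notebook.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database):
    """Return a SQLAlchemy-style MySQL URL with the password percent-encoded."""
    # quote_plus turns reserved characters such as '@' into '%40'
    return f"mysql+pymysql://{user}:{quote_plus(password)}@{host}:{port}/{database}"


# DB_PASSWORD is a hypothetical variable name; the fallback is a made-up sample.
password = os.environ.get("DB_PASSWORD", "example@pass")
url = build_mysql_url("root", password, "localhost", 3306, "united_nations")
```

The resulting `url` string could then be passed to the `%sql` magic in place of a hard-coded connection string.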

##  Task 1 — Calculate GDP per Capita

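To calculate **GDP per capita**, we divide GDP by population. Because GDP is recorded in billions and population in millions, the raw ratio is in thousands per person, so we multiply by 1000 to express it per person: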

\[
GDP\_per\_capita = \frac{Est\_gdp\_in\_billions}{Est\_population\_in\_millions} \times 1000
\]
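
As a quick check of the unit conversion in this formula, here is a small Python sketch (the function name `gdp_per_capita` is illustrative). Billions divided by millions leaves a factor of 1,000, which is why the ratio is multiplied by 1000.

```python
def gdp_per_capita(gdp_in_billions, population_in_millions):
    """GDP per person, given GDP in billions and population in millions."""
    # (gdp * 1e9) / (pop * 1e6) simplifies to (gdp / pop) * 1000
    return round((gdp_in_billions / population_in_millions) * 1000, 2)


# Sample inputs: GDP of 184.39 billion, population of 17.542806 million
print(gdp_per_capita(184.39, 17.542806))  # 10510.86
```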

We’ll extract the following columns:
- `Country_name`
- `Time_period`
- `Est_population_in_millions`
- `Est_gdp_in_billions`
- And create a new calculated column `GDP_per_capita`.

This gives us a view of each country's economic output per person.


In [3]:
%%sql

SELECT
    Country_name,
    Time_period,
    Est_population_in_millions,
    Est_gdp_in_billions,
    ROUND((Est_gdp_in_billions / Est_population_in_millions) * 1000, 2) AS GDP_per_capita
FROM united_nations.Access_to_Basic_Services
WHERE Est_population_in_millions IS NOT NULL
  AND Est_gdp_in_billions IS NOT NULL
LIMIT 10;


 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Time_period,Est_population_in_millions,Est_gdp_in_billions,GDP_per_capita
Kazakhstan,2015,17.542806,184.39,10510.86
Kazakhstan,2016,17.794055,137.28,7714.94
Kazakhstan,2017,18.037776,166.81,9247.81
Kazakhstan,2018,18.276452,179.34,9812.63
Kazakhstan,2019,18.513673,181.67,9812.75
Kazakhstan,2020,18.755666,171.08,9121.51
Tajikistan,2015,8.524063,8.27,970.19
Tajikistan,2016,8.725318,6.99,801.12
Tajikistan,2017,8.925525,7.54,844.77
Tajikistan,2018,9.128132,7.77,851.21


##  Task 2 — Add Poverty Line

Next, we’ll create a new column called **Poverty_line**.  

Using the **IF function**, we’ll assign:
- `1.90` if the `Time_period` is **before 2017**
- `2.50` otherwise.

This helps us account for changes in global poverty standards over time.
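
The `IF(condition, value_if_true, value_if_false)` call described above behaves like a conditional expression in most languages. Here is a small Python sketch of the same rule, mainly to make the boundary explicit: 2016 is the last year that receives 1.90, and 2017 is the first that receives 2.50. The function name `poverty_line` is illustrative.

```python
def poverty_line(time_period):
    """Mirror of IF(Time_period < 2017, 1.90, 2.50)."""
    return 1.90 if time_period < 2017 else 2.50


# Years on either side of the cut-off
for year in (2015, 2016, 2017, 2018):
    print(year, poverty_line(year))
```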


In [4]:
%%sql

SELECT
    Country_name,
    Time_period,
    Est_population_in_millions,
    Est_gdp_in_billions,
    ROUND((Est_gdp_in_billions / Est_population_in_millions) * 1000, 2) AS GDP_per_capita,
    IF(Time_period < 2017, 1.90, 2.50) AS Poverty_line
FROM united_nations.Access_to_Basic_Services
WHERE Est_population_in_millions IS NOT NULL
  AND Est_gdp_in_billions IS NOT NULL
LIMIT 10;


 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Time_period,Est_population_in_millions,Est_gdp_in_billions,GDP_per_capita,Poverty_line
Kazakhstan,2015,17.542806,184.39,10510.86,1.9
Kazakhstan,2016,17.794055,137.28,7714.94,1.9
Kazakhstan,2017,18.037776,166.81,9247.81,2.5
Kazakhstan,2018,18.276452,179.34,9812.63,2.5
Kazakhstan,2019,18.513673,181.67,9812.75,2.5
Kazakhstan,2020,18.755666,171.08,9121.51,2.5
Tajikistan,2015,8.524063,8.27,970.19,1.9
Tajikistan,2016,8.725318,6.99,801.12,1.9
Tajikistan,2017,8.925525,7.54,844.77,2.5
Tajikistan,2018,9.128132,7.77,851.21,2.5


##  Task 3 — Categorize Countries Using CASE and IF

We’ll now combine a **CASE** expression with the **IF** function to assign each country-year row to an income group based on GDP per capita.

Let’s define the following classifications:

| Income Group | Condition |
|---------------|------------|
| Low Income | GDP_per_capita < 1026 |
| Lower-Middle Income | 1026 ≤ GDP_per_capita < 3996 |
| Upper-Middle Income | 3996 ≤ GDP_per_capita < 12376 |
| High Income | GDP_per_capita ≥ 12376 |

We’ll also include the Poverty Line value for reference.
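
The table uses half-open ranges, so every value belongs to exactly one group, including fractional values such as 3995.50 that sit between two integer thresholds. Below is a minimal Python sketch of that rule (the function name `income_group` is illustrative). It checks each upper bound in order, much as a CASE expression evaluates its WHEN branches from top to bottom and returns the first match.

```python
def income_group(gdp_per_capita):
    """Classify a GDP-per-capita value using the thresholds in the table."""
    if gdp_per_capita < 1026:
        return "Low Income"
    if gdp_per_capita < 3996:
        return "Lower-Middle Income"
    if gdp_per_capita < 12376:
        return "Upper-Middle Income"
    return "High Income"


# Boundary and in-between values
for value in (970.19, 1026, 3995.50, 3996, 12375.99, 12376):
    print(value, income_group(value))
```

Because the checks run in order, each branch only needs its upper bound; the lower bound is implied by the branches that came before it.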


In [5]:
%%sql

SELECT
    Country_name,
    Time_period,
    Est_population_in_millions,
    Est_gdp_in_billions,
    ROUND((Est_gdp_in_billions / Est_population_in_millions) * 1000, 2) AS GDP_per_capita,
    IF(Time_period < 2017, 1.90, 2.50) AS Poverty_line,
    CASE
        WHEN (Est_gdp_in_billions / Est_population_in_millions) * 1000 < 1026 THEN 'Low Income'
        WHEN (Est_gdp_in_billions / Est_population_in_millions) * 1000 < 3996 THEN 'Lower-Middle Income'
        WHEN (Est_gdp_in_billions / Est_population_in_millions) * 1000 < 12376 THEN 'Upper-Middle Income'
        ELSE 'High Income'
    END AS Income_Group
FROM united_nations.Access_to_Basic_Services
WHERE Est_population_in_millions IS NOT NULL
  AND Est_gdp_in_billions IS NOT NULL
ORDER BY GDP_per_capita DESC
LIMIT 10;


 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Time_period,Est_population_in_millions,Est_gdp_in_billions,GDP_per_capita,Poverty_line,Income_Group
Bermuda,2019,0.063911,7.42,116098.95,2.5,High Income
Bermuda,2018,0.063918,7.23,113113.68,2.5,High Income
Bermuda,2017,0.063873,7.14,111784.32,2.5,High Income
Bermuda,2020,0.063893,6.89,107836.54,2.5,High Income
Bermuda,2016,0.064554,6.9,106887.26,1.9,High Income
Bermuda,2015,0.065237,6.65,101936.02,1.9,High Income
Cayman Islands,2016,0.062255,4.91,78869.17,1.9,High Income
Cayman Islands,2015,0.060911,4.71,77325.93,1.9,High Income
Qatar,2015,2.414573,161.74,66984.93,1.9,High Income
Singapore,2018,5.638676,377.0,66859.67,2.5,High Income


##  Summary

In this notebook, we learned how to calculate and interpret **GDP per capita** — one of the most important economic indicators.

- We computed GDP per capita using SQL arithmetic expressions.
- We used the **IF function** to set the poverty line threshold according to the time period.
- We applied a **CASE expression** to classify countries by income group.

This exercise shows that SQL can be used not only for data retrieval but also for **economic and analytical reasoning**: calculated columns, `IF`, and `CASE` turn raw figures into indicators and categories directly in the query.
