# EDA
- This notebook conducts exploratory data analysis (EDA) of the World Development Indicators dataset
- The main variables of interest are GDP, consumption, investment, exports, imports, population, and the birth rate

## Techniques
1. GROUP BY - aggregate by region or income groups
2. JOIN - combine population data and main data
3. LAG - compute differences over time
4. OVER() - compute lags at the country-indicator level
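
As a rough illustration of techniques 3 and 4, the sketch below computes a year-over-year growth rate with `LAG` inside an `OVER (PARTITION BY ...)` window. It is an assumption that the `growth_rate` column in `main` was built this way. The example uses Python's built-in `sqlite3` (SQLite 3.25 or newer is needed for window functions) as a stand-in for PostgreSQL, and a toy table holding a few Aruba values copied from the outputs in this notebook.

```python
import sqlite3

# Toy stand-in for `main`; the Aruba values are copied from the notebook outputs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main (country_code TEXT, indicator TEXT, year INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO main VALUES (?, ?, ?, ?)",
    [
        ("ABW", "Birth_Rate", 1960, 35.679),
        ("ABW", "Birth_Rate", 1961, 34.529),
        ("ABW", "Birth_Rate", 1962, 33.32),
        ("ABW", "Population", 1960, 54208),
        ("ABW", "Population", 1961, 55434),
    ],
)

# LAG looks one row back within each country-indicator partition,
# so the first year of every series has no previous value (NULL).
query = """
SELECT country_code, indicator, year, value,
       ROUND(100.0 * (value - prev_value) / prev_value, 2) AS growth_rate
FROM (
    SELECT *,
           LAG(value) OVER (
               PARTITION BY country_code, indicator ORDER BY year
           ) AS prev_value
    FROM main
)
ORDER BY country_code, indicator, year
"""
growth_rows = conn.execute(query).fetchall()
for row in growth_rows:
    print(row)
```

Partitioning by both `country_code` and `indicator` keeps the lag from crossing from one series into another, which is why the first `Population` row also has an empty growth rate.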

## Findings
### Inequality
1. The income share of East Asia & Pacific has increased significantly and the income shares of Europe/Central Asia and North America have declined from 2000 to 2020
2. Sub-Saharan Africa had the largest increase in population share from 2000 to 2020

### Crisis Episodes 
1. High income countries were hit hardest by the global financial crisis
2. Small Latin American and Caribbean countries faced the largest downturns due to Covid 19

### Growth
1. Countries in Sub-Saharan Africa and the Middle East & North Africa faced the largest declines in GDP per capita during the 2010s
2. East Asian & Pacific countries have had the largest growth in GDP per capita during the 2010s


In [175]:
%load_ext sql
%sql postgresql:///wdi

In [176]:
%%sql
postgresql:///wdi

## EDA

- Let's first examine the main columns

In [177]:
%%sql
SELECT * FROM main
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country_code,country,indicator,year,value,growth_rate,region,income_group
ABW,Aruba,Birth_Rate,1960,35.679,,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1961,34.529,-3.22,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1962,33.32,-3.5,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1963,32.05,-3.81,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1964,30.737,-4.1,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1965,29.413,-4.31,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1966,28.121,-4.39,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1967,26.908,-4.31,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1968,25.817,-4.05,Latin America & Caribbean,High income
ABW,Aruba,Birth_Rate,1969,24.872,-3.66,Latin America & Caribbean,High income


In [178]:
%%sql
SELECT * FROM pop
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country_code,year,population,population_growth,region,income_group
ABW,1960,54208,,Latin America & Caribbean,High income
ABW,1961,55434,2.26,Latin America & Caribbean,High income
ABW,1962,56234,1.44,Latin America & Caribbean,High income
ABW,1963,56699,0.83,Latin America & Caribbean,High income
ABW,1964,57029,0.58,Latin America & Caribbean,High income
ABW,1965,57357,0.58,Latin America & Caribbean,High income
ABW,1966,57702,0.6,Latin America & Caribbean,High income
ABW,1967,58044,0.59,Latin America & Caribbean,High income
ABW,1968,58377,0.57,Latin America & Caribbean,High income
ABW,1969,58734,0.61,Latin America & Caribbean,High income


In [179]:
%%sql
SELECT DISTINCT(indicator) "Indicator"
FROM main
ORDER BY indicator

 * postgresql:///wdi
7 rows affected.


Indicator
Birth_Rate
Consumption
Exports
GDP
Imports
Investment
Population


In [180]:
%%sql
SELECT DISTINCT(income_group) "Income Group"
FROM main
ORDER BY income_group

 * postgresql:///wdi
4 rows affected.


Income Group
High income
Low income
Lower middle income
Upper middle income


In [181]:
%%sql
SELECT DISTINCT(region) "Region"
FROM main
ORDER BY region

 * postgresql:///wdi
7 rows affected.


Region
East Asia & Pacific
Europe & Central Asia
Latin America & Caribbean
Middle East & North Africa
North America
South Asia
Sub-Saharan Africa


## Groups: Region

In [200]:
%%sql
SELECT region, ROUND(AVG(value / 1000000000)::numeric, 0) "Average GDP (Billions)"
FROM main
WHERE indicator = 'GDP'
AND year = 2020
GROUP BY region
ORDER BY AVG(value)

 * postgresql:///wdi
7 rows affected.


region,Average GDP (Billions)
Sub-Saharan Africa,39
Latin America & Caribbean,138
Middle East & North Africa,168
South Asia,404
Europe & Central Asia,421
East Asia & Pacific,776
North America,6969


## Population

### Most Populated Countries

In [183]:
%%sql
SELECT country, ROUND((value / 1000000)::numeric, 0) "Population (Millions)"
FROM main
WHERE indicator = 'Population'
AND year = 2020
ORDER BY value DESC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,Population (Millions)
China,1411
India,1380
United States,329
Indonesia,274
Pakistan,221
Brazil,213
Nigeria,206
Bangladesh,165
Russia,144
Mexico,129


### Least Populated Countries

In [184]:
%%sql
SELECT country, ROUND(value::numeric, 0) "Population"
FROM main
WHERE indicator = 'Population'
AND year = 2020
ORDER BY value ASC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,Population
Nauru,10834
Tuvalu,11792
Palau,18092
British Virgin Islands,30237
Gibraltar,33691
San Marino,33938
Liechtenstein,38137
St. Martin (French part),38659
Turks and Caicos Islands,38718
Monaco,39244


### Fastest Growing
- Eight of the ten fastest growing populations are in Sub-Saharan Africa; the other two are in the Middle East & North Africa

In [185]:
%%sql
SELECT country, region, growth_rate population_growth
FROM main
WHERE indicator = 'Population'
AND year = 2020
ORDER BY growth_rate DESC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,population_growth
Malta,Middle East & North Africa,4.21
Niger,Sub-Saharan Africa,3.84
Bahrain,Middle East & North Africa,3.68
Equatorial Guinea,Sub-Saharan Africa,3.47
Uganda,Sub-Saharan Africa,3.32
Angola,Sub-Saharan Africa,3.27
Dem. Rep. Congo,Sub-Saharan Africa,3.19
Burundi,Sub-Saharan Africa,3.12
Mali,Sub-Saharan Africa,3.02
Chad,Sub-Saharan Africa,3.0


### Slowest Growing
- Nine of the ten slowest growing populations are in Europe & Central Asia

In [186]:
%%sql
SELECT country, region, growth_rate population_growth
FROM main
WHERE indicator = 'Population'
AND year = 2020
ORDER BY growth_rate ASC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,population_growth
Moldova,Europe & Central Asia,-1.67
Curaçao,Latin America & Caribbean,-1.54
Kosovo,Europe & Central Asia,-0.75
Latvia,Europe & Central Asia,-0.64
Bosnia and Herzegovina,Europe & Central Asia,-0.61
Bulgaria,Europe & Central Asia,-0.6
Albania,Europe & Central Asia,-0.58
Ukraine,Europe & Central Asia,-0.57
Serbia,Europe & Central Asia,-0.53
Romania,Europe & Central Asia,-0.44


## Birth Rate

### Highest Birth Rate
- The countries with the highest birth rates are Niger, Chad, and Somalia
- All ten of the highest birth rate countries are in Sub-Saharan Africa

In [187]:
%%sql
SELECT country, region, ROUND(value::Numeric, 0) "Births / 1000 People"
FROM main
WHERE year = 2019
AND indicator = 'Birth_Rate'
ORDER BY value DESC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,Births / 1000 People
Niger,Sub-Saharan Africa,46
Chad,Sub-Saharan Africa,42
Somalia,Sub-Saharan Africa,42
Mali,Sub-Saharan Africa,41
Dem. Rep. Congo,Sub-Saharan Africa,41
Angola,Sub-Saharan Africa,40
Burundi,Sub-Saharan Africa,38
The Gambia,Sub-Saharan Africa,38
Burkina Faso,Sub-Saharan Africa,37
Nigeria,Sub-Saharan Africa,37


### Lowest Birth Rate
- The countries with the lowest birth rates are Korea, Puerto Rico, San Marino, and Japan
- The majority of low birth rate countries are in Europe/Central Asia and East Asia/Pacific

In [201]:
%%sql
SELECT country, region, ROUND(value::numeric, 2) "Births / 1000 People"
FROM main
WHERE year = 2019
AND indicator = 'Birth_Rate'
ORDER BY value ASC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,Births / 1000 People
Korea,East Asia & Pacific,5.9
Puerto Rico,Latin America & Caribbean,6.4
San Marino,Europe & Central Asia,6.7
Japan,East Asia & Pacific,7.0
Andorra,Europe & Central Asia,7.0
Italy,Europe & Central Asia,7.0
"Hong Kong SAR, China",East Asia & Pacific,7.0
Spain,Europe & Central Asia,7.6
Greece,Europe & Central Asia,7.8
Bosnia and Herzegovina,Europe & Central Asia,7.94


## Groups: Income

In [189]:
%%sql
DROP TABLE IF EXISTS by_income

 * postgresql:///wdi
Done.


[]

In [190]:
%%sql
SELECT income_group, year, indicator, SUM(value) value
INTO by_income
FROM main
WHERE indicator IN ('GDP', 'Consumption', 'Population')
GROUP BY income_group, year, indicator
ORDER BY year ASC, indicator, income_group

 * postgresql:///wdi
732 rows affected.


[]

In [191]:
%%sql
SELECT * FROM by_income
LIMIT 10


 * postgresql:///wdi
10 rows affected.


income_group,year,indicator,value
High income,1960,Consumption,647939774374.7781
Low income,1960,Consumption,12550118794.625605
Lower middle income,1960,Consumption,324144722096.1312
Upper middle income,1960,Consumption,474077812064.57336
High income,1960,GDP,7090154643143.816
Low income,1960,GDP,50356352925.71689
Lower middle income,1960,GDP,479457172362.7047
Upper middle income,1960,GDP,895311627184.0739
High income,1960,Population,768612039.0
Low income,1960,Population,140853136.0


### Income Share
- What percent of world income does each income group take up?
- The high income group takes up more than half of world income, at 61%.
- The upper middle income group takes another 29%.
- The low income group only earns 0.5% of world income.
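
One way to sketch the share calculation is with a window total, `SUM(value) OVER ()`, instead of a scalar subquery. The example below is only an illustration: it runs on Python's built-in `sqlite3` rather than PostgreSQL, and the GDP totals are hypothetical numbers scaled so that the shares match the percentages quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE by_income (income_group TEXT, year INTEGER, indicator TEXT, value REAL)"
)
# Hypothetical totals (not real GDP figures), chosen to sum to 10000.
conn.executemany(
    "INSERT INTO by_income VALUES (?, 2020, 'GDP', ?)",
    [
        ("High income", 6121.0),
        ("Upper middle income", 2893.0),
        ("Lower middle income", 934.0),
        ("Low income", 52.0),
    ],
)

# The window total is computed after the WHERE filter,
# so it only covers the 2020 GDP rows.
query = """
SELECT income_group,
       ROUND(100.0 * value / SUM(value) OVER (), 2) AS income_share
FROM by_income
WHERE year = 2020 AND indicator = 'GDP'
ORDER BY income_share DESC
"""
shares = conn.execute(query).fetchall()
for income_group, income_share in shares:
    print(income_group, income_share)
```

Because the filter and the total live in the same query, there is no second `WHERE` clause to keep in sync with the first.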

In [192]:
%%sql
SELECT income_group, 
ROUND(100*(SUM(value) / (SELECT SUM(value) FROM main WHERE indicator = 'GDP' AND year = 2020))::numeric, 2) income_share
FROM main
WHERE indicator = 'GDP'
AND year = 2020
GROUP BY income_group
ORDER BY SUM(value) DESC

 * postgresql:///wdi
4 rows affected.


income_group,income_share
High income,61.21
Upper middle income,28.93
Lower middle income,9.34
Low income,0.52


In [193]:
%%sql
-- alternative: use the by_income summary table created above
SELECT income_group,
ROUND(100*(value / (SELECT SUM(value) FROM by_income WHERE year = 2020 AND indicator = 'GDP'))::numeric, 2) income_share
FROM by_income
WHERE year = 2020
AND  indicator = 'GDP'
ORDER BY income_share DESC

 * postgresql:///wdi
4 rows affected.


income_group,income_share
High income,61.21
Upper middle income,28.93
Lower middle income,9.34
Low income,0.52


### Population Share

In [194]:
%%sql
SELECT income_group, 
ROUND(100*(SUM(value) / (SELECT SUM(value) FROM main WHERE indicator = 'Population' AND year = 2020))::numeric, 2) "Population Share"
FROM main
WHERE indicator = 'Population'
AND year = 2020
GROUP BY income_group
ORDER BY SUM(value) DESC

 * postgresql:///wdi
4 rows affected.


income_group,Population Share
Lower middle income,43.07
Upper middle income,32.34
High income,15.77
Low income,8.82


In [197]:
%%sql
-- alternative: use the by_income summary table created above
SELECT income_group,
ROUND(100*(value / (SELECT SUM(value) FROM by_income WHERE year = 2020 AND indicator = 'Population'))::numeric, 2) population_share
FROM by_income
WHERE year = 2020
AND  indicator = 'Population'
ORDER BY population_share DESC

 * postgresql:///wdi
4 rows affected.


income_group,population_share
Lower middle income,43.07
Upper middle income,32.34
High income,15.77
Low income,8.82


## Long Run Changes in Income Share

In [198]:
%%sql
WITH inequality_2000 AS (
    SELECT income_group,
    ROUND(100*(value / (SELECT SUM(value) FROM by_income WHERE year = 2000 AND indicator = 'GDP'))::numeric, 2) income_share
    FROM by_income
    WHERE year = 2000
    AND  indicator = 'GDP'
), 
inequality_2020 AS (
    SELECT income_group,
    ROUND(100*(value / (SELECT SUM(value) FROM by_income WHERE year = 2020 AND indicator = 'GDP'))::numeric, 2) income_share
    FROM by_income
    WHERE year = 2020
    AND  indicator = 'GDP'
)
SELECT inequality_2000.income_group, 
inequality_2000.income_share income_share_2000,
inequality_2020.income_share income_share_2020
FROM inequality_2000 
JOIN inequality_2020 
ON inequality_2000.income_group = inequality_2020.income_group

 * postgresql:///wdi
4 rows affected.


income_group,income_share_2000,income_share_2020
High income,76.75,61.21
Low income,0.44,0.52
Lower middle income,6.19,9.34
Upper middle income,16.61,28.93


## GDP Per Capita

In [125]:
%%sql
DROP TABLE IF EXISTS gdp_table

 * postgresql:///wdi
Done.


[]

- GDP per Capita is the total GDP of the economy divided by the population
- It serves as a broad measure of the average 'income' in each country and hence the standard of living
- This query computes GDP per Capita for each country-year instance
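
The join behind this calculation can be sketched on toy data. The example below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL. The countries and values are hypothetical, and the `NULLIF` guard against a zero population is an added assumption rather than part of the notebook's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE main (country_code TEXT, country TEXT, indicator TEXT,
                       year INTEGER, value REAL);
    CREATE TABLE pop  (country_code TEXT, year INTEGER, population REAL);
    -- Hypothetical rows: Country B has GDP but no population record.
    INSERT INTO main VALUES ('AAA', 'Country A', 'GDP', 2020, 5000000.0);
    INSERT INTO main VALUES ('BBB', 'Country B', 'GDP', 2020, 9000000.0);
    INSERT INTO main VALUES ('AAA', 'Country A', 'Exports', 2020, 100.0);
    INSERT INTO pop  VALUES ('AAA', 2020, 1000.0);
    """
)

# LEFT JOIN keeps every GDP row; a missing population yields NULL
# instead of silently dropping the country.
query = """
SELECT main.country, main.year,
       ROUND(main.value / NULLIF(pop.population, 0), 0) AS gdp_per_capita
FROM main
LEFT JOIN pop
  ON main.country_code = pop.country_code
 AND main.year = pop.year
WHERE main.indicator = 'GDP'
ORDER BY main.country
"""
per_capita = conn.execute(query).fetchall()
for row in per_capita:
    print(row)
```

Joining on both `country_code` and `year` is what makes each result a country-year observation; joining on `country_code` alone would pair every GDP row with every population year for that country.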

In [126]:
%%sql
SELECT country,
main.region,
main.income_group,
main.year,
ROUND((value / population)::numeric, 0) "GDP per Capita"
INTO gdp_table
FROM main
LEFT JOIN pop
ON main.country_code = pop.country_code
AND main.year = pop.year
WHERE main.indicator = 'GDP'

 * postgresql:///wdi
9598 rows affected.


[]

In [127]:
%%sql
SELECT * 
FROM gdp_table
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,income_group,year,GDP per Capita
Aruba,Latin America & Caribbean,High income,1986,17231
Aruba,Latin America & Caribbean,High income,1987,20263
Aruba,Latin America & Caribbean,High income,1988,24343
Aruba,Latin America & Caribbean,High income,1989,27313
Aruba,Latin America & Caribbean,High income,1990,27884
Aruba,Latin America & Caribbean,High income,1991,28954
Aruba,Latin America & Caribbean,High income,1992,29032
Aruba,Latin America & Caribbean,High income,1993,29325
Aruba,Latin America & Caribbean,High income,1994,29989
Aruba,Latin America & Caribbean,High income,1995,29367


### Richest Countries
- Which countries have the highest GDP per capita?
- We can see Monaco, Luxembourg, and Bermuda have the highest GDP per capita

In [128]:
%%sql
SELECT * FROM gdp_table
WHERE year = 2020
ORDER BY "GDP per Capita" DESC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,income_group,year,GDP per Capita
Monaco,Europe & Central Asia,High income,2020,159222
Luxembourg,Europe & Central Asia,High income,2020,104529
Bermuda,North America,High income,2020,100412
Switzerland,Europe & Central Asia,High income,2020,85682
Ireland,Europe & Central Asia,High income,2020,78558
Cayman Islands,Latin America & Caribbean,High income,2020,76999
Norway,Europe & Central Asia,High income,2020,75017
United States,North America,High income,2020,58560
Singapore,East Asia & Pacific,High income,2020,58057
Australia,East Asia & Pacific,High income,2020,58044


### Poorest Countries
- Burundi, Malawi, and Central African Republic have the lowest GDP per capita.
- The majority of the lowest income countries are in Sub-Saharan Africa

In [129]:
%%sql
SELECT * FROM gdp_table
WHERE year = 2020
ORDER BY "GDP per Capita" ASC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,income_group,year,GDP per Capita
Burundi,Sub-Saharan Africa,Low income,2020,271
Malawi,Sub-Saharan Africa,Low income,2020,394
Central African Republic,Sub-Saharan Africa,Low income,2020,414
Madagascar,Sub-Saharan Africa,Low income,2020,442
Somalia,Sub-Saharan Africa,Low income,2020,445
Dem. Rep. Congo,Sub-Saharan Africa,Low income,2020,505
Niger,Sub-Saharan Africa,Low income,2020,523
Afghanistan,South Asia,Low income,2020,530
Mozambique,Sub-Saharan Africa,Low income,2020,575
Liberia,Sub-Saharan Africa,Low income,2020,616


### Average GDP Per Capita by Income Group

In [130]:
%%sql
SELECT income_group, 
ROUND(AVG("GDP per Capita")::numeric, 0) "Average GDP per Capita"
FROM gdp_table
WHERE year = 2020
GROUP BY income_group
ORDER BY AVG("GDP per Capita") DESC

 * postgresql:///wdi
4 rows affected.


income_group,Average GDP per Capita
High income,36272
Upper middle income,6812
Lower middle income,2449
Low income,668


### Average GDP Per Capita by Region
- Here we can see North America and Europe/Central Asia have the highest average GDP per capita.
- South Asia and Sub-Saharan Africa have the lowest.

In [131]:
%%sql
SELECT region, 
ROUND(AVG("GDP per Capita")::numeric , 0) "Average GDP per Capita"
FROM gdp_table
WHERE year = 2020
GROUP BY region
ORDER BY AVG("GDP per Capita") DESC

 * postgresql:///wdi
7 rows affected.


region,Average GDP per Capita
North America,67089
Europe & Central Asia,27439
Middle East & North Africa,14588
East Asia & Pacific,14449
Latin America & Caribbean,11664
South Asia,2507
Sub-Saharan Africa,2155


## Groups: Region

### Population Share
- East Asia/Pacific has the largest population share, whereas North America has the smallest

In [199]:
%%sql
SELECT region "Region",
 ROUND(100*(SUM(value) / (SELECT SUM(value) FROM main WHERE indicator = 'Population' AND year = 2020))::numeric, 0) "Population Share (%)"
FROM main
WHERE indicator = 'Population'
AND year = 2020
GROUP BY region
ORDER BY SUM(value) DESC

 * postgresql:///wdi
7 rows affected.


Region,Population Share (%)
East Asia & Pacific,30
South Asia,24
Sub-Saharan Africa,15
Europe & Central Asia,12
Latin America & Caribbean,8
Middle East & North Africa,6
North America,5


In [209]:
%%sql
DROP TABLE IF EXISTS by_region

 * postgresql:///wdi
Done.


[]

In [210]:
%%sql
SELECT region, year, indicator, SUM(value) value
INTO by_region
FROM main
WHERE indicator != 'Birth_Rate'
GROUP BY region, year, indicator
ORDER BY year ASC, indicator, region

 * postgresql:///wdi
2558 rows affected.


[]

In [211]:
%%sql
SELECT * FROM by_region
LIMIT 20

 * postgresql:///wdi
20 rows affected.


region,year,indicator,value
East Asia & Pacific,1960,Consumption,247403499494.73895
Europe & Central Asia,1960,Consumption,430524892131.4877
Latin America & Caribbean,1960,Consumption,473388893666.2933
Middle East & North Africa,1960,Consumption,76034145879.4618
South Asia,1960,Consumption,163970586521.15033
Sub-Saharan Africa,1960,Consumption,67390409636.97604
East Asia & Pacific,1960,Exports,36849747268.41247
Europe & Central Asia,1960,Exports,51117694194.75154
Latin America & Caribbean,1960,Exports,37535654357.97549
Middle East & North Africa,1960,Exports,51733738404.13526


## Long Run Changes in Population
- The population shares of East Asia & Pacific and Europe & Central Asia have each declined by more than 2 percentage points
- The population share of Sub-Saharan Africa has increased by nearly 4 percentage points
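
The `Difference` column in this section is a difference between two shares, so it is measured in percentage points rather than percent. A small worked example, using the Sub-Saharan Africa shares from the query output in this section, shows how far apart the two readings can be:

```python
# Population shares (%) for Sub-Saharan Africa, copied from this section's output.
share_2000 = 10.96
share_2020 = 14.70

# Change in percentage points: the simple difference between the two shares.
point_change = round(share_2020 - share_2000, 2)

# Relative change: the difference expressed as a percent of the 2000 share.
relative_change = round(100 * (share_2020 - share_2000) / share_2000, 2)

print(f"{point_change} percentage points, {relative_change}% relative to 2000")
```

The share rose by under 4 percentage points, but that corresponds to a relative increase of roughly a third.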

In [221]:
%%sql
WITH inequality_2000 AS (
    SELECT region,
    ROUND(100*(value / (SELECT SUM(value) FROM by_region WHERE year = 2000 AND indicator = 'Population'))::numeric, 2) population_share
    FROM by_region
    WHERE year = 2000
    AND  indicator = 'Population'
), 
inequality_2020 AS (
    SELECT region,
    ROUND(100*(value / (SELECT SUM(value) FROM by_region WHERE year = 2020 AND indicator = 'Population'))::numeric, 2) population_share
    FROM by_region
    WHERE year = 2020
    AND  indicator = 'Population'
)
SELECT inequality_2000.region, 
inequality_2000.population_share "2000 Population Share (%)",
inequality_2020.population_share "2020 Population Share (%)", 
inequality_2020.population_share - inequality_2000.population_share "Difference"
FROM inequality_2000 
JOIN inequality_2020 
ON inequality_2000.region = inequality_2020.region

 * postgresql:///wdi
7 rows affected.


region,2000 Population Share (%),2020 Population Share (%),Difference
East Asia & Pacific,33.38,30.33,-3.05
Europe & Central Asia,14.19,11.98,-2.21
Latin America & Caribbean,8.19,8.1,-0.09
Middle East & North Africa,5.2,6.03,0.83
North America,5.16,4.77,-0.39
South Asia,22.92,24.1,1.18
Sub-Saharan Africa,10.96,14.7,3.74


### Income Shares

In [217]:
%%sql
SELECT region "Region",
 ROUND(100*(SUM(value) / (SELECT SUM(value) FROM main WHERE indicator = 'GDP' AND year = 2020))::numeric, 0) "Income Share (%)"
FROM main
WHERE indicator = 'GDP'
AND year = 2020
GROUP BY region
ORDER BY SUM(value) DESC

 * postgresql:///wdi
7 rows affected.


Region,Income Share (%)
East Asia & Pacific,32
Europe & Central Asia,26
North America,26
Latin America & Caribbean,6
South Asia,4
Middle East & North Africa,4
Sub-Saharan Africa,2


### Changes in Income by Region
- The income share of East Asia & Pacific has increased by more than 10 percentage points
- The income shares of North America and Europe/Central Asia decreased by about 5 and 7 percentage points, respectively
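
The two-year comparison can also be sketched in a single pass, using a window total per year and conditional aggregation in place of two CTEs and a join. This is an alternative formulation, not the query used in this notebook. It runs on Python's built-in `sqlite3`, and the regions and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE by_region (region TEXT, year INTEGER, indicator TEXT, value REAL)"
)
# Hypothetical GDP totals for two made-up regions.
conn.executemany(
    "INSERT INTO by_region VALUES (?, ?, 'GDP', ?)",
    [
        ("Region A", 2000, 20.0), ("Region B", 2000, 80.0),
        ("Region A", 2020, 30.0), ("Region B", 2020, 70.0),
    ],
)

# Step 1: share of each region within its own year (window total per year).
# Step 2: pivot the two years into columns with CASE inside an aggregate.
query = """
WITH shares AS (
    SELECT region, year,
           100.0 * value / SUM(value) OVER (PARTITION BY year) AS gdp_share
    FROM by_region
    WHERE indicator = 'GDP' AND year IN (2000, 2020)
)
SELECT region,
       ROUND(MAX(CASE WHEN year = 2000 THEN gdp_share END), 2) AS share_2000,
       ROUND(MAX(CASE WHEN year = 2020 THEN gdp_share END), 2) AS share_2020,
       ROUND(MAX(CASE WHEN year = 2020 THEN gdp_share END)
           - MAX(CASE WHEN year = 2000 THEN gdp_share END), 2) AS difference
FROM shares
GROUP BY region
ORDER BY region
"""
changes = conn.execute(query).fetchall()
for row in changes:
    print(row)
```

Adding a third comparison year would only need one more `CASE` column here, whereas the CTE-and-join version needs another CTE and another join.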

In [220]:
%%sql
WITH inequality_2000 AS (
    SELECT region,
    ROUND(100*(value / (SELECT SUM(value) FROM by_region WHERE year = 2000 AND indicator = 'GDP'))::numeric, 2) gdp_share
    FROM by_region
    WHERE year = 2000
    AND  indicator = 'GDP'
), 
inequality_2020 AS (
    SELECT region,
    ROUND(100*(value / (SELECT SUM(value) FROM by_region WHERE year = 2020 AND indicator = 'GDP'))::numeric, 2) gdp_share
    FROM by_region
    WHERE year = 2020
    AND  indicator = 'GDP'
)
SELECT inequality_2000.region, 
inequality_2000.gdp_share "2000 Income Share (%)",
inequality_2020.gdp_share "2020 Income Share (%)", 
inequality_2020.gdp_share - inequality_2000.gdp_share "Difference"
FROM inequality_2000 
JOIN inequality_2020 
ON inequality_2000.region = inequality_2020.region

 * postgresql:///wdi
7 rows affected.


region,2000 Income Share (%),2020 Income Share (%),Difference
East Asia & Pacific,20.85,31.72,10.87
Europe & Central Asia,33.41,26.06,-7.35
Latin America & Caribbean,7.01,6.15,-0.86
Middle East & North Africa,3.7,3.96,0.26
North America,31.08,25.9,-5.18
South Asia,2.27,4.01,1.74
Sub-Saharan Africa,1.67,2.21,0.54


# Covid 19 Crisis

- This section examines the Covid 19 crisis
- Across countries, the crisis produced an average decline in GDP of 5%

In [134]:
%%sql
SELECT year, 
ROUND(AVG(growth_rate)::numeric, 2) "Average GDP Growth"
FROM main
WHERE indicator = 'GDP'
AND year BETWEEN 2015 AND 2020
GROUP BY year
ORDER BY year

 * postgresql:///wdi
6 rows affected.


year,Average GDP Growth
2015,2.66
2016,3.14
2017,3.17
2018,3.06
2019,2.81
2020,-5.0


- How did each component of national accounts change?
- We can assess this by grouping on the indicator and computing the average over all countries
- Exports faced the strongest downturn, at 11.5%
- GDP faced a downturn of 5%, whereas consumption displayed a smaller downturn of 2.68%
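
Note that `AVG(growth_rate)` gives every country the same weight, so the averages in this section describe the typical country rather than the world economy as a whole. The sketch below uses hypothetical GDP levels for one large and one small economy to show how the two readings can diverge.

```python
# Hypothetical GDP levels (same units) for a large and a small economy.
gdp_2019 = {"Large": 1000.0, "Small": 10.0}
gdp_2020 = {"Large": 980.0, "Small": 8.0}

# Country-level growth rates in percent.
growth = {c: 100 * (gdp_2020[c] - gdp_2019[c]) / gdp_2019[c] for c in gdp_2019}

# Unweighted mean across countries: the analogue of AVG(growth_rate).
unweighted = sum(growth.values()) / len(growth)

# Growth of the combined total, which implicitly weights each
# country by its 2019 GDP.
total_2019 = sum(gdp_2019.values())
total_2020 = sum(gdp_2020.values())
aggregate = 100 * (total_2020 - total_2019) / total_2019

print(f"unweighted mean: {unweighted:.2f}%, aggregate: {aggregate:.2f}%")
```

In this toy case the small economy's sharp fall drags the unweighted mean well below the aggregate figure. That is worth keeping in mind when many of the worst-hit countries are small.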

In [135]:
%%sql
SELECT indicator, 
ROUND(AVG(growth_rate)::numeric, 2) "Percentage Change"
FROM main
WHERE year = 2020
GROUP BY indicator
ORDER BY AVG(growth_rate)

 * postgresql:///wdi
6 rows affected.


indicator,Percentage Change
Exports,-11.48
Imports,-9.65
Investment,-7.82
GDP,-5.0
Consumption,-2.68
Population,1.16


## Covid 19: Which Countries fared the Worst?

- Which countries fared worst during the crisis?
- We can see that Macao SAR, the Maldives, and Libya fared the worst
- Six of the ten countries with the worst downturns were in Latin America and the Caribbean

In [136]:
%%sql
SELECT country, region, income_group,
growth_rate downturn
FROM main
WHERE indicator = 'GDP'
AND year = 2020
ORDER BY growth_rate
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,income_group,downturn
"Macao SAR, China",East Asia & Pacific,High income,-54.01
Maldives,South Asia,Upper middle income,-33.5
Libya,Middle East & North Africa,Upper middle income,-31.3
Turks and Caicos Islands,Latin America & Caribbean,High income,-26.78
Lebanon,Middle East & North Africa,Lower middle income,-21.46
St. Lucia,Latin America & Caribbean,Upper middle income,-20.37
Antigua and Barbuda,Latin America & Caribbean,High income,-20.19
Barbados,Latin America & Caribbean,High income,-18.98
Curaçao,Latin America & Caribbean,High income,-18.4
Panama,Latin America & Caribbean,High income,-17.94


### Covid 19: Larger Countries
- The countries with the largest downturns are all quite small
- What about larger countries?
- To study this, I restrict the sample to countries whose population is at least 0.5% of the world population
- Using this restriction, we can see that Iraq, Spain, and Argentina fared the worst

In [140]:
%%sql
WITH mod AS (
    SELECT main.country, main.growth_rate downturn,
    ROUND((100*pop.population / (SELECT SUM(population) FROM pop WHERE year = 2020))::numeric, 2) population_share
    FROM 
    main LEFT JOIN  pop
    ON main.country_code = pop.country_code
    AND main.year = pop.year
    WHERE main.year = 2020
    AND indicator = 'GDP'
)
SELECT * FROM mod
WHERE population_share >= 0.5
ORDER BY downturn ASC
LIMIT 20

 * postgresql:///wdi
20 rows affected.


country,downturn,population_share
Iraq,-15.67,0.52
Spain,-10.82,0.61
Argentina,-9.9,0.59
Philippines,-9.57,1.42
United Kingdom,-9.4,0.87
Italy,-8.94,0.77
Mexico,-8.31,1.67
France,-7.86,0.87
India,-7.25,17.91
Colombia,-6.8,0.66


# Global Financial Crisis

- How did national accounts change during the global financial crisis?
- We observe an average decline in GDP of only 0.02%.
- Imports, exports, and investment display larger declines.

In [81]:
%%sql
SELECT indicator, 
ROUND(AVG(growth_rate)::numeric, 2) "Percentage Change"
FROM main
WHERE year = 2009
GROUP BY indicator
ORDER BY indicator

 * postgresql:///wdi
7 rows affected.


indicator,Percentage Change
Birth_Rate,-0.82
Consumption,2.06
Exports,-5.7
GDP,-0.02
Imports,-6.95
Investment,-3.77
Population,1.49


### Global Financial Crisis: High Income Countries
- Let's focus on high income countries.
- Here we can see the decline is more dramatic relative to the average.
- Why? The crisis originated in the financial sector, so it disproportionately impacted high income countries.
- GDP declined by 3.6% on average.
- Investment displayed a sharper decline of 12.7%.

In [82]:
%%sql
SELECT indicator, 
ROUND(AVG(growth_rate)::numeric, 2) "Percentage Change"
FROM main
WHERE year = 2009
AND income_group = 'High income'
GROUP BY indicator
ORDER BY indicator

 * postgresql:///wdi
7 rows affected.


indicator,Percentage Change
Birth_Rate,-0.85
Consumption,-1.0
Exports,-8.72
GDP,-3.6
Imports,-12.63
Investment,-12.66
Population,1.27


### Global Financial Crisis: Low Income Countries
- Let's examine low income countries.
- We can see that low income countries maintained increases in consumption, GDP, and investment.

In [85]:
%%sql
SELECT indicator, 
ROUND(AVG(growth_rate)::numeric, 2) "Percentage Change"
FROM main
WHERE year = 2009
AND income_group = 'Low income'
GROUP BY indicator
ORDER BY indicator

 * postgresql:///wdi
7 rows affected.


indicator,Percentage Change
Birth_Rate,-1.23
Consumption,4.73
Exports,-0.66
GDP,5.05
Imports,4.97
Investment,9.63
Population,2.82


# Long Run Changes
- This section studies long run changes in GDP per capita from 2010 to 2019

## Least Improved
- Which countries faced the largest decline in GDP per capita from 2010 to 2019?
- The countries with the largest decline are Equatorial Guinea, Syria, and Libya
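
The endpoint comparison relies on filtering to two years before applying `LAG`, so that "one row back" from 2019 is the 2010 value. A minimal sketch of that behaviour, using Python's built-in `sqlite3` as a stand-in for PostgreSQL and hypothetical countries and values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE gdp_table (country TEXT, year INTEGER, "GDP per Capita" REAL)'
)
# Hypothetical values; Country C has no 2010 observation.
conn.executemany(
    "INSERT INTO gdp_table VALUES (?, ?, ?)",
    [
        ("Country A", 2010, 1000.0), ("Country A", 2015, 1500.0),
        ("Country A", 2019, 1200.0),
        ("Country B", 2010, 2000.0), ("Country B", 2019, 1500.0),
        ("Country C", 2019, 3000.0),
    ],
)

# WHERE runs before the window function, so 2015 is dropped and
# LAG(..., 1) on the 2019 row returns the 2010 value.
query = """
WITH focus AS (
    SELECT country, year, "GDP per Capita" AS gdp_pc,
           LAG("GDP per Capita", 1) OVER (
               PARTITION BY country ORDER BY year
           ) AS lgdp
    FROM gdp_table
    WHERE year IN (2010, 2019)
)
SELECT country, ROUND(100.0 * (gdp_pc - lgdp) / lgdp, 2) AS change
FROM focus
WHERE year = 2019
ORDER BY country
"""
endpoint_changes = conn.execute(query).fetchall()
for row in endpoint_changes:
    print(row)
```

A country without a 2010 observation ends up with an empty change, which is why a ranking in descending order needs a filter such as `LGDP IS NOT NULL`.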

In [225]:
%%sql
WITH focus AS (
    SELECT *, 
    LAG("GDP per Capita", 1) OVER(PARTITION BY country ORDER BY year ASC) LGDP
    FROM gdp_table
    WHERE year = 2010
    OR year = 2019
)
SELECT country, region,
ROUND((100*("GDP per Capita" - LGDP )/ LGDP )::numeric, 2) change
FROM focus
WHERE year = 2019
ORDER BY (100*("GDP per Capita" - LGDP )/ LGDP ) ASC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,change
Equatorial Guinea,Sub-Saharan Africa,-46.72
Syrian Arab Republic,Middle East & North Africa,-40.14
Libya,Middle East & North Africa,-32.68
Sudan,Sub-Saharan Africa,-30.05
Congo,Sub-Saharan Africa,-26.01
Lebanon,Middle East & North Africa,-25.32
Virgin Islands,Latin America & Caribbean,-23.75
Central African Republic,Sub-Saharan Africa,-21.43
Oman,Middle East & North Africa,-19.12
San Marino,Europe & Central Asia,-15.54


## Most Improved
- Which countries faced the largest increase in GDP per capita from 2010 to 2019?
- The countries with the largest increase are Nauru, Turkmenistan, and China
- Six of the ten countries are in East Asia/Pacific

In [226]:
%%sql
WITH focus AS (
    SELECT *, 
    LAG("GDP per Capita", 1) OVER(PARTITION BY country ORDER BY year ASC) LGDP
    FROM gdp_table
    WHERE year = 2010
    OR year = 2019
)
SELECT country, region,
ROUND((100*("GDP per Capita" - LGDP )/ LGDP )::numeric, 2) change
FROM focus
WHERE year = 2019
AND LGDP IS NOT NULL
ORDER BY (100*("GDP per Capita" - LGDP )/ LGDP ) DESC
LIMIT 10

 * postgresql:///wdi
10 rows affected.


country,region,change
Nauru,East Asia & Pacific,102.23
Turkmenistan,Europe & Central Asia,80.33
China,East Asia & Pacific,79.83
Ethiopia,Sub-Saharan Africa,76.6
Myanmar,East Asia & Pacific,71.43
Mongolia,East Asia & Pacific,67.11
Bangladesh,South Asia,64.85
Ireland,Europe & Central Asia,63.59
Lao PDR,East Asia & Pacific,62.61
Cambodia,East Asia & Pacific,61.55


# Moving Averages


In [202]:
%%sql
-- trailing moving average of GDP growth over the previous 5 rows (current year excluded)
SELECT country_code, year, 
ROUND(AVG(growth_rate) OVER(
    PARTITION BY country_code ORDER BY year ASC ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING
    )::numeric, 2) moving_average
FROM main
WHERE 
indicator = 'GDP'
AND country_code = 'USA'
ORDER BY year DESC
LIMIT 20

 * postgresql:///wdi
20 rows affected.


country_code,year,moving_average
USA,2020,2.46
USA,2019,2.53
USA,2018,2.3
USA,2017,2.28
USA,2016,2.25
USA,2015,2.15
USA,2014,1.13
USA,2013,0.74
USA,2012,0.66
USA,2011,0.92


In [203]:
%%sql
-- alternative: the same moving average computed with a self-join
SELECT m1.country_code, m1.indicator, m1.year,
 ROUND(AVG(m2.growth_rate)::numeric, 2) moving_average
FROM main m1
LEFT JOIN
(
    SELECT country_code, indicator, year, growth_rate
    FROM main 
    WHERE country_code = 'USA'
) m2
ON m1.country_code = m2.country_code
AND m1.indicator = m2.indicator
AND m2.year BETWEEN m1.year - 5 AND m1.year - 1
WHERE m1.country_code = 'USA'
AND m1.indicator = 'GDP'
GROUP BY m1.country_code, m1.indicator, m1.year
ORDER BY m1.country_code, m1.indicator, m1.year DESC
LIMIT 20

 * postgresql:///wdi
20 rows affected.


country_code,indicator,year,moving_average
USA,GDP,2020,2.46
USA,GDP,2019,2.53
USA,GDP,2018,2.3
USA,GDP,2017,2.28
USA,GDP,2016,2.25
USA,GDP,2015,2.15
USA,GDP,2014,1.13
USA,GDP,2013,0.74
USA,GDP,2012,0.66
USA,GDP,2011,0.92
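
Both queries above compute a trailing five-year average that leaves out the current year. The sketch below checks the window-frame version against a plain Python calculation. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and a hypothetical country code `XXX` with made-up growth rates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main (country_code TEXT, indicator TEXT, year INTEGER, growth_rate REAL)"
)
# Hypothetical growth rates for a made-up country code.
rates = {2014: 1.0, 2015: 2.0, 2016: 3.0, 2017: 4.0, 2018: 5.0, 2019: 6.0, 2020: -4.0}
conn.executemany("INSERT INTO main VALUES ('XXX', 'GDP', ?, ?)", list(rates.items()))

# The frame covers the five rows before the current one, not the current row.
query = """
SELECT year,
       AVG(growth_rate) OVER (
           PARTITION BY country_code ORDER BY year
           ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING
       ) AS moving_average
FROM main
WHERE indicator = 'GDP'
ORDER BY year
"""
window_avg = dict(conn.execute(query).fetchall())


def trailing_mean(year, width=5):
    """Mean of the growth rates in the `width` calendar years before `year`."""
    prior = [rates[y] for y in range(year - width, year) if y in rates]
    return sum(prior) / len(prior) if prior else None


manual_avg = {year: trailing_mean(year) for year in rates}
print(window_avg)
print(manual_avg)
```

The two results agree here because the toy series has no missing years. A `ROWS` frame counts rows, while the self-join condition `m2.year BETWEEN m1.year - 5 AND m1.year - 1` counts calendar years, so the two approaches can differ for a country with gaps in its series.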
