# GDP Analysis - Exercise 2, Problem Set 1

## The task

- Find monthly or yearly GDP data for UK, USA, Brazil, Japan, China, Germany, and Switzerland from 2000 to 2022.
- Write code to:
    - load the data
    - clean the data
    - plot the data
- Write your code as functions.
- Create an analysis Notebook that imports your own functions and executes them to showcase the results.
- Briefly describe the GDP development of these countries over the years.

### Inputs:

`data.py` is a module that fetches GDP data (indicator `NY.GDP.MKTP.CD`) from the World Bank: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD. Its functions take a "country_code" parameter, with one code for each country in the task:

- United Kingdom = GB
- United States of America = US
- Brazil = BR
- Japan = JP
- China = CN
- Germany = DE
- Switzerland = CH
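
For readers without access to `data.py`, the fetching step might look roughly like the sketch below. It is not the actual module: the `api.worldbank.org/v2` endpoint, the query parameters, and the helper names `build_url`, `extract_records`, and `fetch_records` are assumptions made for illustration, and only the standard library is used.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and indicator; the real data.py may be configured differently.
BASE_URL = "https://api.worldbank.org/v2"
INDICATOR = "NY.GDP.MKTP.CD"


def build_url(country_code, start=2000, end=2022):
    """Build the request URL for one country and a range of years."""
    query = urlencode(
        {"format": "json", "date": f"{start}:{end}", "per_page": 100},
        safe=":",  # keep the colon in the date range readable
    )
    return f"{BASE_URL}/country/{country_code}/indicator/{INDICATOR}?{query}"


def extract_records(payload):
    """Return the list of yearly records from a decoded JSON response.

    A successful response is assumed to be a two-item list:
    [paging metadata, list of records]. Anything else yields an empty list.
    """
    if not isinstance(payload, list) or len(payload) < 2 or payload[1] is None:
        return []
    return payload[1]


def fetch_records(country_code):
    """Download and decode the records for one country (needs network access)."""
    with urlopen(build_url(country_code), timeout=30) as response:
        return extract_records(json.load(response))
```

Separating URL building and response parsing from the network call keeps the first two easy to check without an internet connection.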

# Cleaning the data

Keeping it simple for this problem set:
- [x] Select the useful columns (reduce the noise)
- [x] Rename them but include the mapping
- [x] Make sure the column types are correct
- [x] Remove missing values
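
The checklist above could be implemented roughly as follows. This is a sketch, not the actual `data.clean_data`: the name `clean_data_sketch` and the `COLUMN_MAPPING` dictionary are hypothetical, and the raw column names are taken from the fetched output shown further down in this notebook.

```python
import pandas as pd

# Mapping from raw World Bank column names to the names used in the analysis.
COLUMN_MAPPING = {"date": "Year", "value": "GDP", "country.value": "Country"}


def clean_data_sketch(raw: pd.DataFrame) -> pd.DataFrame:
    """Select, rename, type-convert, and drop missing values."""
    return (
        raw[list(COLUMN_MAPPING)]            # keep only the useful columns
        .rename(columns=COLUMN_MAPPING)      # rename using the mapping above
        .assign(                             # coerce bad values to NaN
            Year=lambda d: pd.to_numeric(d["Year"], errors="coerce"),
            GDP=lambda d: pd.to_numeric(d["GDP"], errors="coerce"),
        )
        .dropna(subset=["Year", "GDP"])      # remove rows with missing values
        .astype({"Year": int, "GDP": float}) # enforce the final column types
    )
```

Keeping the mapping in one dictionary means the column selection and the renaming cannot drift apart.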

In [1]:
import data
# Fetch and clean the data for the countries stated in the task

df_GB = data.clean_data(data.fetch_data("GB"))
df_US = data.clean_data(data.fetch_data("US"))
df_BR = data.clean_data(data.fetch_data("BR"))
df_JP = data.clean_data(data.fetch_data("JP"))
df_CN = data.clean_data(data.fetch_data("CN"))
df_DE = data.clean_data(data.fetch_data("DE"))
df_CH = data.clean_data(data.fetch_data("CH"))

Fetched data:
  countryiso3code  date         value unit obs_status  decimal  \
0             GBR  2022  3.114042e+12                        0   
1             GBR  2021  3.143323e+12                        0   
2             GBR  2020  2.696778e+12                        0   
3             GBR  2019  2.851407e+12                        0   
4             GBR  2018  2.871340e+12                        0   

     indicator.id    indicator.value country.id   country.value  
0  NY.GDP.MKTP.CD  GDP (current US$)         GB  United Kingdom  
1  NY.GDP.MKTP.CD  GDP (current US$)         GB  United Kingdom  
2  NY.GDP.MKTP.CD  GDP (current US$)         GB  United Kingdom  
3  NY.GDP.MKTP.CD  GDP (current US$)         GB  United Kingdom  
4  NY.GDP.MKTP.CD  GDP (current US$)         GB  United Kingdom  
Cleaned data:
   Year           GDP         Country
0  2022  3.114042e+12  United Kingdom
1  2021  3.143323e+12  United Kingdom
2  2020  2.696778e+12  United Kingdom
3  2019  2.851407e+12  Unit

# Plot the data

I use the Python library plotly for its interactivity. I prefer it to the standard matplotlib because it makes a time series easier to explore, and I love the tooltip feature. I use Tableau a lot in my professional work, and plotly feels closer to it in look and interactivity than the other visualisation packages.

In [2]:
from plot_utils import plot_single
# Plot the data for the United Kingdom as an example

plot_single(df_GB, "United Kingdom")

The plot above uses the United Kingdom as an example. GDP almost doubled from 2000 to 2005. However, because the series is in current US dollars, part of this increase may reflect the GBP rising against the USD over the same period (see the chart below). Since 2005, GDP has ranged between $2.4 and $3.1 trillion. The financial crash hit between 2007 and 2009, and it took about five years to recover to near the previous highs.

![GBP/USD Chart](image_gbp_usd.jpg)
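
To make the exchange-rate effect concrete, here is a small worked example. The GDP figure and both exchange rates are hypothetical round numbers chosen for illustration, not actual GBP/USD rates.

```python
def gdp_in_usd(gdp_local, usd_per_local):
    """Convert GDP from local currency to USD at a given exchange rate."""
    return gdp_local * usd_per_local


# Hypothetical: output in pounds is identical in both years.
gdp_gbp = 1.0e12

usd_start = gdp_in_usd(gdp_gbp, 1.50)  # hypothetical rate: 1 GBP = 1.50 USD
usd_end = gdp_in_usd(gdp_gbp, 1.80)    # hypothetical rate: 1 GBP = 1.80 USD

change_pct = (usd_end - usd_start) / usd_start * 100
print(f"Change in USD terms: {change_pct:.1f}%")
```

Under these assumptions the USD figure rises by a fifth even though output in pounds did not change at all, which is why a current-USD series can overstate or understate domestic growth.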

In [3]:
from plot_utils import plot_multiple_countries
import pandas as pd

# Combine the per-country dataframes into a single dataframe with pd.concat,
# which stacks each dataframe underneath the previous one, then plot them together.

df_all = pd.concat([df_GB, df_US, df_BR, df_JP, df_CN, df_DE, df_CH])
plot_multiple_countries(df_all)
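
One side effect of `pd.concat` is worth knowing: by default each dataframe keeps its own row index, so index labels repeat in the combined frame. The toy dataframes below are made up for illustration.

```python
import pandas as pd

df_a = pd.DataFrame({"Year": [2001, 2000], "GDP": [2.0, 1.0], "Country": ["A", "A"]})
df_b = pd.DataFrame({"Year": [2001, 2000], "GDP": [4.0, 3.0], "Country": ["B", "B"]})

# Default behaviour: rows are stacked and the original index labels are kept.
stacked = pd.concat([df_a, df_b])
print(stacked.index.tolist())     # [0, 1, 0, 1]

# ignore_index=True renumbers the rows from 0 to n-1 instead.
renumbered = pd.concat([df_a, df_b], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

This is why the filtered tables in the following cells show the same index label for every country. Filtering on the `Year` column is unaffected, but label-based lookups such as `.loc[0]` would return several rows.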

In [5]:
print('Data for each country in the year 2000:')
df_all[df_all['Year'] == 2000]

Data for each country in the year 2000:


| | Year | GDP | Country |
|---|---|---|---|
| 22 | 2000 | 1665535000000.0 | United Kingdom |
| 22 | 2000 | 10250950000000.0 | United States |
| 22 | 2000 | 655448200000.0 | Brazil |
| 22 | 2000 | 4968359000000.0 | Japan |
| 22 | 2000 | 1223755000000.0 | China |
| 22 | 2000 | 1966981000000.0 | Germany |
| 22 | 2000 | 279216000000.0 | Switzerland |


In [6]:
print('Data for each country in the year 2022:')
df_all[df_all['Year'] == 2022]

Data for each country in the year 2022:


| | Year | GDP | Country |
|---|---|---|---|
| 0 | 2022 | 3114042000000.0 | United Kingdom |
| 0 | 2022 | 26006890000000.0 | United States |
| 0 | 2022 | 1951924000000.0 | Brazil |
| 0 | 2022 | 4262463000000.0 | Japan |
| 0 | 2022 | 18316770000000.0 | China |
| 0 | 2022 | 4163596000000.0 | Germany |
| 0 | 2022 | 828508900000.0 | Switzerland |


In [7]:
df2 = df_all[df_all["Year"].isin([2000, 2022])]

df_pivot = df2.pivot(index="Country", columns="Year", values="GDP")
df_pivot["pct_change_2000_2022"] = (df_pivot[2022] - df_pivot[2000]) / df_pivot[2000] * 100

result = df_pivot[["pct_change_2000_2022"]].sort_values("pct_change_2000_2022", ascending=False)
print(result)

Year            pct_change_2000_2022
Country                             
China                    1396.767426
Brazil                    197.799864
Switzerland               196.726845
United States             153.702222
Germany                   111.674490
United Kingdom             86.969514
Japan                     -14.207825
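
The percentage-change formula can be checked by hand for a few countries using the 2000 and 2022 values printed in the earlier cells. The helper name `pct_change` is just for this illustration.

```python
def pct_change(start, end):
    """Percentage change from start to end, relative to start."""
    return (end - start) / start * 100


# GDP values in current US$, copied from the year-2000 and year-2022 tables.
gdp_2000 = {"United Kingdom": 1665535000000.0, "Japan": 4968359000000.0, "China": 1223755000000.0}
gdp_2022 = {"United Kingdom": 3114042000000.0, "Japan": 4262463000000.0, "China": 18316770000000.0}

for country in gdp_2000:
    change = pct_change(gdp_2000[country], gdp_2022[country])
    print(f"{country}: {change:.2f}%")
```

Because the printed values are rounded for display, the results agree with the pivot table to about two decimal places rather than exactly.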


Analysis:

- The pivot table shows the percentage change in GDP for each country between 2000 and 2022. China had the largest change at about 1,400%, the steepest growth of the group. Japan was the only country whose GDP fell in USD terms: its economy shrank by about 14%. Every other country increased its GDP over the period.
- During the financial crisis of 2008, China and Japan were the only two markets that appeared unaffected in USD terms.
- During the Covid-19 outbreak, every country except Japan grew its economy in USD terms, which I attribute mostly to increases in the money supply.
- The United States has the highest GDP, at $26 trillion in 2022, and was the largest economy throughout the period.


## Reflection Question

**Question**: When you made a pull request adding the Jupyter Notebook, what did you realise?

**Answer**: Honestly, nothing much. Because I was working on this solo, the git pull had no conflicts after I pushed my commit to the remote repository, which makes sense. But it was cool to see the history tracked for every change and every file.