
# Individual Project - Part B
## What is it that you are investigating/exploring/analyzing (provide sufficient background information)?
### My project focuses on the financial sector for the years 2010-2019, mainly on the volume of stocks traded (% of GDP), the annual percent change of the S&P Global Equity Indices, and consumer price inflation (annual %).

## Why is it important to you and/or to others?
### Numerous factors affect the financial sector and the stock market, including presidential administrations, the global pandemic, and regional crises (e.g. the housing crash of '08). I will be comparing the annual percentage of inflation (consumer prices) for both the United States (North America) and South Korea (East Asia & Pacific).

## What questions do you have in mind and would like to answer?
### Does the volume of stocks traded or the stability of the S&P Global Equity Indices help predict the level of inflation?
### Can deflation or inflation be a predictor of stock market performance?
### Do inflation/deflation and the stock market correlate with the suicide rate?

## Where do you get the data to help answer your questions?
### All data and information come from the World Development Indicators (WDI - http://www.worlddev.xyz/).
### The WDI is a cross-country analytical tool that provides data on 1400+ socioeconomic indicators for 200+ countries over a span of 50+ years. Its primary data source is the World Bank.
### The World Development Explorer (WDX) allows users to interact with the WDI through a click-and-play interface. The unique feature of the WDX is the incorporation of Hofstede's Cultural Dimensions, which allows users to view how culture affects socioeconomic factors.

## What process/steps do you use to analyze the situation/issue?
### For this part of the project, I will be using the various graphs from the World Development Explorer website (www.worlddev.xyz).

## Suicide rate in 2010 for the 4 Countries based on Stocks Traded & the S&P Global Equity Indices

### The first step is always to import the necessary libraries

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import plotly.express as px

### Next, we import our datasets and check the first five rows of each

In [2]:
df1 = pd.read_csv(r'https://raw.githubusercontent.com/cjang1129/world_development_explorer/main/data_files/wdi_data.csv')
df1.head()

Unnamed: 0.1,Unnamed: 0,Year,SH.STA.SUIC.P5,CM.MKT.TRAD.GD.ZS,FP.CPI.TOTL.ZG,Country Code,Country Name,Region,Income Group,Lending Type
0,0,2010,13.0,87.078705,1.776872,CAN,Canada,North America,High income,Not classified
1,1,2015,12.5,70.448465,1.125241,CAN,Canada,North America,High income,Not classified
2,2,2016,12.5,75.470099,1.42876,CAN,Canada,North America,High income,Not classified
3,3,2010,34.1,142.452634,2.939181,KOR,"Korea, Rep.",East Asia & Pacific,High income,Not classified
4,4,2015,28.3,125.78556,0.706208,KOR,"Korea, Rep.",East Asia & Pacific,High income,Not classified
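
### Before cleaning, a quick structural check can confirm that the file loaded as expected. The sketch below is only an illustration: it rebuilds the five rows shown in the `df1.head()` output above instead of downloading the full file, and the names `sample`, `missing_per_column`, and `years_per_country` are assumptions introduced here.

```python
import pandas as pd

# Rows copied from the df1.head() output above (not the full dataset).
sample = pd.DataFrame({
    "Year": [2010, 2015, 2016, 2010, 2015],
    "SH.STA.SUIC.P5": [13.0, 12.5, 12.5, 34.1, 28.3],
    "CM.MKT.TRAD.GD.ZS": [87.078705, 70.448465, 75.470099, 142.452634, 125.78556],
    "FP.CPI.TOTL.ZG": [1.776872, 1.125241, 1.42876, 2.939181, 0.706208],
    "Country Name": ["Canada", "Canada", "Canada", "Korea, Rep.", "Korea, Rep."],
})

# Size of the table, missing values per column, and years covered per country.
n_rows, n_cols = sample.shape
missing_per_column = sample.isna().sum()
years_per_country = sample.groupby("Country Name")["Year"].nunique()

print(n_rows, n_cols)
print(missing_per_column)
print(years_per_country)
```

### On the real data, the same three checks would be applied to `df1`, `df2`, and `df3` directly.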


In [3]:
df2 = pd.read_csv(r"https://raw.githubusercontent.com/cjang1129/world_development_explorer/main/data_files/foreigninvestment_data.csv")
df2.head()

Unnamed: 0.1,Unnamed: 0,Year,SH.STA.SUIC.P5,FP.CPI.TOTL.ZG,BM.KLT.DINV.CD.WD,Country Code,Country Name,Region,Income Group,Lending Type
0,0,2010,13.0,1.776872,36341410000.0,CAN,Canada,North America,High income,Not classified
1,1,2015,12.5,1.125241,83913840000.0,CAN,Canada,North America,High income,Not classified
2,2,2016,12.5,1.42876,67605970000.0,CAN,Canada,North America,High income,Not classified
3,3,2010,34.1,2.939181,28221600000.0,KOR,"Korea, Rep.",East Asia & Pacific,High income,Not classified
4,4,2015,28.3,0.706208,23687100000.0,KOR,"Korea, Rep.",East Asia & Pacific,High income,Not classified


In [4]:
df3 = pd.read_csv(r"https://raw.githubusercontent.com/cjang1129/world_development_explorer/main/data_files/s%26p_data.csv")
df3.head()

Unnamed: 0.1,Unnamed: 0,Year,SH.STA.SUIC.P5,CM.MKT.TRAD.GD.ZS,CM.MKT.INDX.ZG,Country Code,Country Name,Region,Income Group,Lending Type
0,0,2010,13.0,87.078705,22.02772,CAN,Canada,North America,High income,Not classified
1,1,2015,12.5,70.448465,-26.166835,CAN,Canada,North America,High income,Not classified
2,2,2016,12.5,75.470099,22.562033,CAN,Canada,North America,High income,Not classified
3,3,2010,34.1,142.452634,25.258151,KOR,"Korea, Rep.",East Asia & Pacific,High income,Not classified
4,4,2015,28.3,125.78556,-5.012121,KOR,"Korea, Rep.",East Asia & Pacific,High income,Not classified


### Let's clean up the column names and remove unwanted columns

In [5]:
df1.rename(columns={'SH.STA.SUIC.P5':'Suicide Rate', 'CM.MKT.TRAD.GD.ZS':'Stocks Traded', 'FP.CPI.TOTL.ZG':'CPI'}, inplace=True)
df2.rename(columns={'SH.STA.SUIC.P5':'Suicide Rate', 'FP.CPI.TOTL.ZG':'CPI', 'BM.KLT.DINV.CD.WD':'Foreign Investment'}, inplace=True)
df3.rename(columns={'SH.STA.SUIC.P5':'Suicide Rate', 'CM.MKT.TRAD.GD.ZS':'Stocks Traded', 'CM.MKT.INDX.ZG':'S&P500'}, inplace=True)

In [6]:
df1.drop(columns=['Unnamed: 0.1', 'Unnamed: 0', 'Lending Type'], inplace=True)
df2.drop(columns=['Unnamed: 0.1', 'Unnamed: 0', 'Lending Type'], inplace=True)
df3.drop(columns=['Unnamed: 0.1', 'Unnamed: 0', 'Lending Type'], inplace=True)
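
### The renaming and dropping steps above repeat the same pattern for each DataFrame. As a minimal sketch, they could be folded into one helper that returns a cleaned copy instead of modifying in place. `tidy` and `LABELS` are hypothetical names, and `raw` is rebuilt from two rows of the `df2.head()` output purely for illustration.

```python
import pandas as pd

# Indicator codes and labels taken from the rename cells above.
LABELS = {
    "SH.STA.SUIC.P5": "Suicide Rate",
    "CM.MKT.TRAD.GD.ZS": "Stocks Traded",
    "FP.CPI.TOTL.ZG": "CPI",
    "BM.KLT.DINV.CD.WD": "Foreign Investment",
    "CM.MKT.INDX.ZG": "S&P500",
}

def tidy(df):
    """Return a copy without leftover index columns or 'Lending Type', with readable labels."""
    unwanted = [c for c in df.columns if c.startswith("Unnamed") or c == "Lending Type"]
    # rename() silently skips codes that are not present in this DataFrame.
    return df.drop(columns=unwanted).rename(columns=LABELS)

# Two rows copied from the df2.head() output above.
raw = pd.DataFrame({
    "Unnamed: 0.1": [0, 1],
    "Unnamed: 0": [0, 1],
    "Year": [2010, 2015],
    "SH.STA.SUIC.P5": [13.0, 12.5],
    "FP.CPI.TOTL.ZG": [1.776872, 1.125241],
    "BM.KLT.DINV.CD.WD": [36341410000.0, 83913840000.0],
    "Country Name": ["Canada", "Canada"],
    "Lending Type": ["Not classified", "Not classified"],
})

cleaned = tidy(raw)
print(list(cleaned.columns))
```

### Because `tidy` returns a new DataFrame, the raw data stays available for re-checking if a label turns out to be wrong.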

### This bubble chart plots inflation (CPI) against stocks traded and shows the suicide mortality rate through the size of each bubble: the larger the bubble, the higher the suicide mortality rate. We can see that the USA has the highest volume of stocks traded (% of GDP) in 2010 and a moderate suicide mortality rate. In comparison, the United Arab Emirates has the lowest volume of stocks traded and a very low suicide mortality rate. I would like to focus on South Korea, which seemingly has a moderate volume of stocks traded and a very high suicide mortality rate.


In [7]:
fig = px.scatter(df1.query("Year == 2010"), x='CPI', y='Stocks Traded', size='Suicide Rate', color='Country Name', log_x=True, size_max=60)
fig.show()
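
### One caveat with `log_x=True`: a logarithmic axis cannot place zero or negative values, so a country-year with deflation (CPI at or below 0) would not appear on the chart. Below is a minimal sketch of a pre-plot check, assuming the renamed `CPI` column. `non_positive_rows` is a hypothetical helper, and the "Example Land" row is invented purely to show what the check catches.

```python
import pandas as pd

def non_positive_rows(df, column):
    """Return the rows whose value in `column` is zero or negative."""
    return df[df[column] <= 0]

# Two rows copied from the 2010 head() output, plus one invented row
# ("Example Land") that exists only to demonstrate the check.
demo = pd.DataFrame({
    "Country Name": ["Canada", "Korea, Rep.", "Example Land"],
    "Year": [2010, 2010, 2010],
    "CPI": [1.776872, 2.939181, -0.5],
})

flagged = non_positive_rows(demo, "CPI")
print(flagged["Country Name"].tolist())
```

### If the check returns any rows for the real data, dropping `log_x=True` in favor of a linear axis would keep them visible.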

In [8]:
fig = px.scatter(df1.query("Year == 2015"), x='CPI', y='Stocks Traded', size='Suicide Rate', color='Country Name', log_x=True, size_max=60)
fig.show()

### Looking at the same variables for 2015, we can visualize the changes across a five-year span. The relative differences in suicide mortality rates between the countries appear largely unchanged. South Korea is once again the largest bubble, signifying the highest suicide rate. The USA and the United Arab Emirates have the highest and lowest (respectively) volume of stocks traded. Inflation does not appear to be related to suicide rates in these charts.
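
### The visual impression that inflation and suicide rates are unrelated can be checked numerically with a Pearson correlation. The sketch below is a rough illustration under stated assumptions: it uses only the five Canada and South Korea rows visible in the `df1.head()` output, with the renamed column labels, so the resulting coefficient is not a finding about the full dataset.

```python
import pandas as pd

# Rows copied from the df1.head() output (Canada and South Korea only).
sample = pd.DataFrame({
    "Country Name": ["Canada", "Canada", "Canada", "Korea, Rep.", "Korea, Rep."],
    "Year": [2010, 2015, 2016, 2010, 2015],
    "Suicide Rate": [13.0, 12.5, 12.5, 34.1, 28.3],
    "Stocks Traded": [87.078705, 70.448465, 75.470099, 142.452634, 125.78556],
    "CPI": [1.776872, 1.125241, 1.42876, 2.939181, 0.706208],
})

# DataFrame.corr() computes pairwise Pearson correlations by default.
corr = sample[["CPI", "Stocks Traded", "Suicide Rate"]].corr()
cpi_vs_suicide = corr.loc["CPI", "Suicide Rate"]

print(corr.round(2))
```

### On the real data, the same call on `df1` would cover all four countries and every available year. Even then, a correlation coefficient describes association only and does not show that one variable affects the other.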

## Inflation Across the Years 2010-2018
### A line graph provides a clear visualization of the changes in inflation over the years. Focusing on South Korea (which was shown to have the highest suicide mortality rate in both 2010 and 2015), we can see that the inflation rate decreases from about 2.9% in 2010 to about 0.7% in 2015. Notably, the United States also shows a large change in inflation, from 0.9% in 2010 to 2.6% in 2015.

In [9]:
fig = px.line(df1, x='Year', y='CPI', color='Country Name')
fig.show()
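
### The change in inflation between two years can also be read off as a number rather than estimated from the line chart. Below is a minimal sketch, assuming the renamed `CPI` column and using only the Canada and South Korea rows shown in the `df1.head()` output. The name `wide` is illustrative.

```python
import pandas as pd

# Rows copied from the df1.head() output (Canada and South Korea only).
sample = pd.DataFrame({
    "Country Name": ["Canada", "Canada", "Canada", "Korea, Rep.", "Korea, Rep."],
    "Year": [2010, 2015, 2016, 2010, 2015],
    "CPI": [1.776872, 1.125241, 1.42876, 2.939181, 0.706208],
})

# Keep the two years of interest, then put one year per column.
two_years = sample[sample["Year"].isin([2010, 2015])]
wide = two_years.pivot(index="Country Name", columns="Year", values="CPI")

# Difference in percentage points between 2015 and 2010.
wide["Change 2010-2015"] = wide[2015] - wide[2010]

print(wide.round(2))
```

### On these rows, South Korea's inflation falls by roughly 2.2 percentage points and Canada's by roughly 0.65.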

## Foreign Direct Investments (Net Outflows) Across the Years 2010-2019
### Apart from the United States, the other three countries show nearly flat net outflows over the selected years. The United States shows a steady decline in foreign investments throughout the years.

In [11]:
fig = px.line(df2, x='Year', y='Foreign Investment', color='Country Name')
fig.show()
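
### The flat appearance of the smaller countries may partly be a scale effect, since all lines share one axis in current US dollars. One hedged way to compare trends is to index each country to its own first available year. The sketch below assumes the renamed `Foreign Investment` column and uses only the rows shown in the `df2.head()` output. The new column names are illustrative.

```python
import pandas as pd

# Rows copied from the df2.head() output (Canada and South Korea only).
fdi = pd.DataFrame({
    "Country Name": ["Canada", "Canada", "Canada", "Korea, Rep.", "Korea, Rep."],
    "Year": [2010, 2015, 2016, 2010, 2015],
    "Foreign Investment": [36341410000.0, 83913840000.0, 67605970000.0,
                           28221600000.0, 23687100000.0],
})

# Sort so that "first" means the earliest year within each country.
fdi = fdi.sort_values(["Country Name", "Year"])

# Express outflows in billions of US dollars for readability.
fdi["FDI (USD bn)"] = fdi["Foreign Investment"] / 1e9

# Index each country's series to its own first year (= 100).
first_year = fdi.groupby("Country Name")["Foreign Investment"].transform("first")
fdi["Index (first year = 100)"] = fdi["Foreign Investment"] / first_year * 100

print(fdi[["Country Name", "Year", "FDI (USD bn)", "Index (first year = 100)"]].round(1))
```

### Plotting the indexed column with `px.line` instead of the raw values would put every country on a comparable scale.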

## S&P Global Equity Indices 2010-2015
### Analyzing the stock market directly with the S&P Global Equity Indices, we can see that South Korea and Canada both show large annual percent changes in 2010 (about 25% and 22%, respectively).


In [22]:
is2010 = df3['Year']==2010
print(is2010.head())

0     True
1    False
2    False
3     True
4    False
Name: Year, dtype: bool


In [23]:
df2010 = df3[is2010]

In [24]:
fig = px.bar(df2010, x='Country Name', y='S&P500', color='Country Name')
fig.show()

### When viewing the same metric for 2015, all four countries have negative percent changes, with Canada and the United Arab Emirates showing the largest declines.

In [26]:
is2015 = df3['Year']==2015
print(is2015.head())

0    False
1     True
2    False
3    False
4     True
Name: Year, dtype: bool


In [27]:
df2015 = df3[is2015]

In [28]:
fig = px.bar(df2015, x='Country Name', y='S&P500', color='Country Name')
fig.show()
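
### The two boolean masks above can be combined into a single filter with `isin`, which makes it easier to compare 2010 and 2015 side by side. Below is a minimal sketch, assuming the renamed `S&P500` column and using only the rows shown in the `df3.head()` output. The names `sp`, `both_years`, and `comparison` are illustrative.

```python
import pandas as pd

# Rows copied from the df3.head() output (Canada and South Korea only).
sp = pd.DataFrame({
    "Country Name": ["Canada", "Canada", "Canada", "Korea, Rep.", "Korea, Rep."],
    "Year": [2010, 2015, 2016, 2010, 2015],
    "S&P500": [22.02772, -26.166835, 22.562033, 25.258151, -5.012121],
})

# One mask that keeps both years of interest.
both_years = sp[sp["Year"].isin([2010, 2015])]

# One row per country, one column per year.
comparison = both_years.pivot(index="Country Name", columns="Year", values="S&P500")

print(comparison.round(1))
```

### The same `both_years` frame could also be passed to `px.bar` with `barmode='group'` and the year cast to a string for `color`, which would draw the 2010 and 2015 bars next to each other for each country.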