# Starting a Career in One of the Four Asian Tigers: Where Do You Go?

# 1. What are the "Four Asian Tigers"?

Quoting from the Wikipedia article https://en.wikipedia.org/wiki/Four_Asian_Tigers:

*"Four Asian Tigers, are the economies of Hong Kong, Singapore, South Korea and Taiwan, which underwent rapid industrialization and maintained exceptionally high growth rates (in excess of 7 percent a year) between the early 1960s (mid-1950s for Hong Kong) and 1990s. By the early 21st century, all four had developed into high-income economies, specializing in areas of competitive advantage. Hong Kong and Singapore have become world-leading international financial centres, whereas South Korea and Taiwan are world leaders in manufacturing electronic components and devices. Their economic success stories have served as role models for many developing countries, especially the Tiger Cub Economies of southeast Asia"*

# 2. Why do you care?

Let's say you are a recent university graduate. Having completed the Applied Data Science course from the University of Michigan, you feel empowered by your newly acquired knowledge and aspire to start a career as a data scientist.

However, you are unsure which sector to enter. Browsing Wikipedia, you become intrigued by the Four Asian Tigers, but you cannot decide in which of them to start your career.

After much consideration, you decide that you would like to **start your career in the country that potentially gives you the highest starting salary.**

# 3. Which of the Four Asian Tigers is the richest?

You know that plenty of data is available from the World Bank (https://data.worldbank.org). You start your research by looking at an economic indicator called Gross National Income (GNI) per capita, which reflects the average income of a country's citizens: https://en.wikipedia.org/wiki/List_of_countries_by_GNI_(nominal)_per_capita

From the World Bank, you find two relevant datasets:
* Each country's GNI: https://data.worldbank.org/indicator/NY.GNP.ATLS.CD
* Each country's population: https://data.worldbank.org/indicator/SP.POP.TOTL
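
These two datasets are enough to derive the per-capita figure yourself: divide the national total by the head count. Written out for an economy $c$ in year $t$ (the symbols are introduced here only for illustration):

$$
\text{GNI per capita}_{c,t} = \frac{\text{GNI}_{c,t}}{\text{Population}_{c,t}}
$$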

To ensure these datasets work well with pandas, you clean them manually, removing everything except the main tables, and push the cleaned files to your own repository.
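
The manual cleaning could probably also be done in pandas. The sketch below assumes that a raw World Bank CSV download starts with four lines of metadata before the header row and ends every line with a trailing comma; check this against the actual file before relying on it. The `raw_text` string is a made-up stand-in with that layout, not real data.

```python
import io

import pandas as pd

# Stand-in for a raw download: metadata lines first, then the table.
# The contents are invented for illustration; only the layout matters.
raw_text = (
    '"Data Source","World Development Indicators",\n'
    '\n'
    '"Last Updated Date","(placeholder)",\n'
    '\n'
    '"Country Name","Country Code","Indicator Name","Indicator Code","2016","2017",\n'
    '"Exampleland","EXL","GNI, Atlas method (current US$)","NY.GNP.ATLS.CD","100","110",\n'
)

# skiprows counts physical lines, blank ones included, so 4 lands on the header.
df = pd.read_csv(io.StringIO(raw_text), skiprows=4)

# The trailing comma creates an empty "Unnamed: N" column; drop any such column.
df = df.loc[:, ~df.columns.str.startswith("Unnamed")]

print(df.columns.tolist())
```

For a file on disk, the same call would take the file path in place of `io.StringIO(raw_text)`.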

You can then begin your data analysis and plot some charts.

## 3.1. Import necessary Python packages

In [1]:
import numpy as np
import pandas as pd

import requests

## 3.2. Read Datasets

In [2]:
df_GNI = pd.read_csv("gnp.csv")

In [3]:
df_GNI.head()

| Unnamed: 0 | Country Name | Country Code | Indicator Name | Indicator Code | 1960 | 1961 | 1962 | 1963 | 1964 | 1965 | ... | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Aruba | ABW | GNI, Atlas method (current US$) | NY.GNP.ATLS.CD | | | | | | | ... | 2411359000.0 | 2274836000.0 | 2290680000.0 | 2412460000.0 | 2473122000.0 | 2542891000.0 | 2481324000.0 | 2469662000.0 | 2490319000.0 | |
| 1 | Afghanistan | AFG | GNI, Atlas method (current US$) | NY.GNP.ATLS.CD | | | | | | | ... | 12652730000.0 | 14872480000.0 | 16077120000.0 | 19553140000.0 | 21216130000.0 | 21109370000.0 | 20583400000.0 | 19893950000.0 | 19763600000.0 | |
| 2 | Angola | AGO | GNI, Atlas method (current US$) | NY.GNP.ATLS.CD | | | | | | | ... | 71273450000.0 | 75763530000.0 | 82745470000.0 | 104804600000.0 | 124418700000.0 | 134935100000.0 | 125915200000.0 | 108726700000.0 | 106188900000.0 | |
| 3 | Albania | ALB | GNI, Atlas method (current US$) | NY.GNP.ATLS.CD | | | | | | | ... | 12529070000.0 | 12709490000.0 | 12804490000.0 | 12658730000.0 | 13156700000.0 | 13115220000.0 | 12636420000.0 | 12431260000.0 | 12415260000.0 | |
| 4 | Andorra | AND | GNI, Atlas method (current US$) | NY.GNP.ATLS.CD | | | | | | | ... | | | | | | | | | | |


## 3.3. Browse through Datasets

In [17]:
a = np.where(df_GNI['Country Name'].str.contains("Singapore"))

a

(array([206]),)

In [16]:
b = np.where(df_GNI['Country Name'].str.contains("Hong Kong"))

b

(array([94]),)

In [20]:
c = np.where(df_GNI['Country Name'].str.contains("Korea"))

c

(array([124, 191]),)

In [21]:
d = np.where(df_GNI['Country Name'].str.contains("Japan"))

d

(array([117]),)

In [22]:
e = np.where(df_GNI['Country Name'].str.contains("China"))

e

(array([ 38,  94, 144]),)

In [29]:
df_GNI.loc[144]["Country Name"]

'Macao SAR, China'

In [31]:
df_GNI["Country Name"][144]

'Macao SAR, China'
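
The searches above return row positions, which then need a second lookup to see the names. One possible alternative is to index with the boolean mask directly, so that positions and names come back together. The sketch below uses a tiny stand-in frame called `toy`; its country labels are assumptions chosen to resemble the dataset, not values read from `gnp.csv`.

```python
import pandas as pd

# Tiny stand-in for df_GNI; only the "Country Name" column matters here.
toy = pd.DataFrame(
    {
        "Country Name": [
            "Korea, Rep.",
            "Singapore",
            "Korea, Dem. People's Rep.",
            "Japan",
        ]
    }
)

# Boolean mask: True where the name contains the literal text "Korea".
# regex=False treats the pattern as plain text rather than a regular expression.
mask = toy["Country Name"].str.contains("Korea", regex=False)

# Indexing with the mask returns the row labels and the names in one step.
matches = toy.loc[mask, "Country Name"]
print(matches)
```

A search term that matches more than one row, as "Korea" and "China" do above, is then easy to spot because every matching name is printed.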

In [None]:
df_POP = pd.read_csv("pop.csv")

In [None]:
df_POP.head()

## 3.4. Extract Relevant Data

In [None]:
df_GNI.set_index("Country Name", inplace=True)
df_GNI.head()

In [None]:
df_POP.set_index("Country Name", inplace=True)
df_POP.head()
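
With both frames indexed by country name, GNI per capita can be sketched as a plain division, because pandas aligns rows and columns by label rather than by position. The frames below (`gni`, `pop`) hold invented numbers for two made-up economies. In the real files, the non-numeric columns such as the country and indicator codes would have to be dropped first, which this sketch does not show.

```python
import pandas as pd

# Hypothetical totals for two made-up economies; not World Bank figures.
gni = pd.DataFrame(
    {"2016": [1000.0, 900.0], "2017": [1200.0, 990.0]},
    index=pd.Index(["A-land", "B-land"], name="Country Name"),
)

# Same economies in a different row order, to show that order does not matter.
pop = pd.DataFrame(
    {"2016": [10.0, 20.0], "2017": [11.0, 24.0]},
    index=pd.Index(["B-land", "A-land"], name="Country Name"),
)

# Division aligns on the index (country) and on the columns (year).
gni_per_capita = gni / pop
print(gni_per_capita)
```

Any country or year present in only one of the two frames would come out as NaN, which makes gaps in either dataset visible in the result.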

In [None]:
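# Everyday names of the four economies. These are not necessarily the labels
# used in the World Bank files, so check them against the searches above.
fourtiger = ["Hong Kong", "Singapore", "South Korea", "Taiwan"]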

In [None]:
df_GNI.loc['Singapore']

In [None]:
# "Country Name" is now the index, so search the index rather than a column.
df_GNI[df_GNI.index.str.contains("Singapore")]

In [None]:
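# .loc needs the exact index label used in the file. "Korea, Rep." is assumed
# here; a label that is absent from the index raises KeyError.
df_GNI.loc["Korea, Rep."]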
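
Because `.loc` fails on labels that are not in the index, it may help to keep an explicit mapping from everyday names to dataset labels and to select with `reindex`, which tolerates missing labels. The sketch below is only an illustration: `label_for` is a hypothetical mapping whose right-hand values are assumptions, and `toy` stands in for `df_GNI` with invented values.

```python
import pandas as pd

# Hypothetical mapping from everyday names to dataset labels. The right-hand
# values are assumptions; confirm them by searching the "Country Name" column.
label_for = {
    "Hong Kong": "Hong Kong SAR, China",
    "Singapore": "Singapore",
    "South Korea": "Korea, Rep.",
    "Taiwan": "Taiwan",  # may have no row in the file at all
}

# Toy frame indexed by "Country Name", standing in for df_GNI.
toy = pd.DataFrame(
    {"2017": [1.0, 2.0, 3.0]},
    index=pd.Index(
        ["Hong Kong SAR, China", "Korea, Rep.", "Singapore"],
        name="Country Name",
    ),
)

# reindex keeps the requested order and yields NaN for absent labels,
# whereas .loc with a list containing a missing label raises KeyError.
tigers = toy.reindex(list(label_for.values()))

# Relabel the rows with the everyday names for readability.
tigers.index = list(label_for.keys())
print(tigers)
```

A row of NaN in the result signals that the corresponding economy still needs a correct label or a separate data source.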

In [None]:
a=pd.read_csv("gdp.csv")
a.head()

In [None]:
b=pd.read_csv("gnp.csv")
b.head()

In [None]:
c=pd.read_csv("pop.csv")
c.head()



## 1. Region and Domain



State the region and the domain category that your data sets are about.

https://hub.coursera-notebooks.org/user/snqwzxvcmyurudasirmlfj/files/readonly/Assignment4_example.pdf

In [None]:
import requests
import io

import numpy as np
import pandas as pd

In [None]:
c=pd.read_csv("gnp.csv")

In [None]:
# GDP

# http://api.worldbank.org/v2/en/indicator/NY.GDP.MKTP.CD?downloadformat=csv
# https://raw.githubusercontent.com/andriyantohalim/AppliedDataScienceMichigan/master/API_NY.GDP.MKTP.CD_DS2_en_csv_v2_10363296.csv

In [None]:
# GNP

# http://api.worldbank.org/v2/en/indicator/NY.GNP.ATLS.CD?downloadformat=csv
# https://raw.githubusercontent.com/andriyantohalim/AppliedDataScienceMichigan/master/API_NY.GNP.ATLS.CD_DS2_en_csv_v2_10364296.csv

In [None]:
# Pop

# http://api.worldbank.org/v2/en/indicator/SP.POP.TOTL?downloadformat=csv
# https://raw.githubusercontent.com/andriyantohalim/AppliedDataScienceMichigan/master/API_SP.POP.TOTL_DS2_en_csv_v2_10363240.csv

In [None]:
url="https://raw.githubusercontent.com/andriyantohalim/AppliedDataScienceMichigan/master/API_NY.GDP.MKTP.CD_DS2_en_csv_v2_10363296.csv"

In [None]:
c=pd.read_csv("gdp.csv")

In [None]:
c.head()

In [None]:
# http://api.worldbank.org/v2/en/indicator/NY.GDP.MKTP.CD?downloadformat=csv

## 2. Research Question

You must state a question about the domain category and region that you identified as being interesting.

## 3. Links

You must provide at least two links to publicly accessible datasets. These could be links to files such as CSV or Excel files, or links to websites which might have data in tabular form, such as Wikipedia pages.

## 4. Images

You must upload an image which addresses the research question you stated. In addition to addressing the question, this visual should follow Cairo’s principles of truthfulness, functionality, beauty, and insightfulness.

In [None]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
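
One way the required image might be produced is a line chart with one line per economy and years along the x-axis. This is a minimal sketch under stated assumptions: the `per_capita` frame holds placeholder values invented for layout purposes, and the economy names, labels, and title are illustrative choices rather than results.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values, invented for layout purposes only.
per_capita = pd.DataFrame(
    {"2015": [1.0, 2.0], "2016": [1.5, 2.5], "2017": [2.0, 3.0]},
    index=["Economy A", "Economy B"],
)

fig, ax = plt.subplots(figsize=(8, 4))

# Transpose so that years run along the x-axis and each economy is one line.
per_capita.T.plot(ax=ax)

ax.set_xlabel("Year")
ax.set_ylabel("GNI per capita (current US$)")
ax.set_title("GNI per capita by economy")
```

Calling `fig.savefig` with a file name afterwards would write the chart to an image file suitable for upload.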

## 5. Discussion

You must contribute a short (1-2 paragraph) written justification of how your visualization addresses your stated research question.