#  Does 👩‍🏫 + 📚 + 🕓 + 🎓 === 💵💵💵💵?
  
Something that is emblematic of the immigrant experience is the focus on hard work, resilience, and educational attainment. Although I was born in this country, I also come from a long line of immigrants, so I know what it’s like to live the immigrant experience. When I was a kid, it sometimes felt like education was the be-all and end-all of belonging to an immigrant family. There was such intense focus on it from everyone in my family: my parents, aunts, uncles, grandparents, and even friends of the family. They all waxed eloquent about the virtues of education. They were relentless (or, at least, it felt that way). Basically, it was the only thing I ever heard about, ever. Education represented a way for a person to change their lot in life, to have a chance at achieving financial security or maybe even gaining real wealth. In short, education was the key to success. And it wasn’t just education writ large that was the focus; it was post-secondary education that was lauded as the ultimate goal.
 
Growing up in this environment naturally meant I learned from an early age to value and appreciate the pursuit of higher education. However, ever since I was little, I have never taken anything I was told on faith alone. Because of this, I felt inclined to research whether there really was any correlation between higher education and wealth, or whether it was all part of my immigrant mythology. To answer this question, I pointed my web browser to http://www.worlddev.xyz/. This website is a treasure trove of information. It aggregates open-source data collected from the World Bank and presents the data visually in the form of charts and graphs. It is an excellent tool for probing for answers to my question.
 
I agonized for weeks over what specific questions I could ask. There is so much data on worlddev.xyz to comb through that I felt overwhelmed. While feverishly scouring the data, I stumbled upon three indicators that piqued my interest. They involved the different levels of post-secondary education: the attainment of bachelor’s, master’s, and doctoral degrees. I thought to myself, “This is it!” It was something that spoke to me and my experience. It was exactly what I was looking for.
 
I want to examine these data to see if there is a relationship between a country’s wealth and the proportion of its population that has obtained higher education. I am curious to know what these two factors can tell us about larger global trends. 
 
The World Bank describes bachelor’s degree attainment as “The percentage of population ages 25 and over that attained or completed Bachelor's or equivalent” (databank.worldbank.org). The wording is the same for the other two degrees. Since a lower degree is generally a prerequisite for a higher one, the bachelor’s indicator will include some number of people with master’s and doctoral degrees, and the master’s indicator will include some people who have also received doctoral degrees. That aside, I think it is still worthwhile to see what impressions we can glean from all three indicators.
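
Because each indicator counts everyone who reached at least that level, subtracting adjacent levels gives a rough estimate of the share whose highest degree is exactly a bachelor’s or exactly a master’s. Here is a minimal sketch using Australia's 2017 figures from the higher-education data loaded later in this notebook. The variable names are my own, and the subtraction assumes the three indicators are strictly nested, which the published figures may not satisfy perfectly.

```python
# Cumulative shares (% of population 25+) for Australia, 2017,
# copied from the higher-education table shown later in this notebook.
at_least = {"bachelor": 31.39119, "master": 7.36984, "doctoral": 1.2035}

# Assumption: the indicators are nested, so subtracting the next level up
# leaves the share whose highest degree is exactly this level.
highest_degree = {
    "bachelor": at_least["bachelor"] - at_least["master"],
    "master": at_least["master"] - at_least["doctoral"],
    "doctoral": at_least["doctoral"],
}

for level, share in highest_degree.items():
    print(f"{level:>8}: {share:.2f}% of people 25+")
```

The three "highest degree" shares add back up to the cumulative bachelor's figure, which is a quick sanity check on the arithmetic.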
 
They will be compared against two economic indicators: GDP per capita (Current US$) and Adjusted Net National Income (Current US$). The definition of each, as given on the World Bank’s website, is provided below.
 
**GDP per capita (Current US$)**
 
>“GDP per capita is gross domestic product divided by midyear population. GDP is the sum of gross value added by all resident producers in the economy plus any product taxes and minus any subsidies not included in the value of the products. It is calculated without making deductions for depreciation of fabricated assets or for depletion and degradation of natural resources. Data are in current U.S. dollars.”
        
**Adjusted Net National Income (Current US$)**
 
>“Adjusted net national income is GNI minus consumption of fixed capital and natural resources depletion.”
 
I chose GDP because it is the indicator I see used most often in the world of economics and politics. It is the “gold standard” of economic indicators. As the definition states, it reflects the total value produced within a given country, here divided by its population. However, numbers stated in terms of GDP are hard for me to wrap my head around, especially because I am a layperson. I like the net national income indicator because it makes a little more sense to me: on a per capita basis, it is something closer to the take-home pay of a person in a given country from year to year.
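
To make the "divided by midyear population" part of the definition concrete, the per capita figure can be multiplied back out to a rough national total. This is only a sketch: the two rows are copied from the world data table shown later in this notebook, the dictionary layout is my own, and it assumes the population column is close enough to the midyear population used by the World Bank.

```python
# GDP per capita (current US$) and total population for 2013,
# copied from the world data table shown later in this notebook.
rows = {
    "Australia": {"gdp_per_capita": 68150.107041, "population": 23128129},
    "United States": {"gdp_per_capita": 53117.667850, "population": 315993715},
}

# Rearranging the definition: total GDP = GDP per capita * population.
total_gdp = {
    country: r["gdp_per_capita"] * r["population"] for country, r in rows.items()
}

for country, gdp in total_gdp.items():
    print(f"{country}: roughly {gdp / 1e12:.1f} trillion US$")
```

In this sample Australia has the higher per capita figure but a far smaller total, which is why per capita measures are the fairer choice when comparing countries of very different sizes.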

## Analysis
 
I chose to analyze five countries (Australia, Bangladesh, Mexico, Serbia, and Sweden) that had data across five years (2013-2017). I could only pick these five due to the limitations of the dataset. Luckily, the World Bank has a similar indicator that is comparable to the ones I outlined above.
This indicator also looks at people over 25 and whether they have completed post-secondary education. It is not as specific as the ones I'm interested in, but it can give us a good sense of the overall trends in this space.
 
 
### Select countries: percent of population over age 25 with post-secondary education
The bubble graph below shows 20 countries compared against GDP per capita. There is a distinct upward trend in the data. In the next graph the disparity becomes starker because, instead of isolating each data point as an individual country, the data are grouped together by income group. The high income countries, for the most part, have the highest rates of post-secondary education completion.


In [43]:
import pandas as pd
import plotly.express as px
import plotly.io as pio

pio.templates.default = "plotly_dark" # adds dark theme to better see the data

#### Importing the data frame for global context

These data are helpful for understanding the global trends and whether they are viable to study on a smaller scale. The data frame needs only minimal cleaning and organizing.

In [108]:
df_world = pd.read_csv('data/world_data.csv')
df_world.drop(columns=["Unnamed: 0"], inplace=True)

df_world

| | Year | SP.POP.TOTL | NY.GDP.PCAP.CD | SE.SEC.CUAT.PO.ZS | Country Code | Country Name | Region | Income Group | Lending Type |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2012 | 22733465 | 68012.147901 | 46.461971 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 1 | 2013 | 23128129 | 68150.107041 | 41.640911 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 2 | 2014 | 23475686 | 62510.791171 | 39.034981 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 3 | 2015 | 23815995 | 56755.721712 | 45.345070 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 4 | 2016 | 24190907 | 49971.131456 | 46.473431 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 90 | 2012 | 313830990 | 51610.605278 | 40.663231 | USA | United States | North America | High income | Not classified |
| 91 | 2013 | 315993715 | 53117.667850 | 41.504669 | USA | United States | North America | High income | Not classified |
| 92 | 2014 | 318301008 | 55064.744548 | 41.889702 | USA | United States | North America | High income | Not classified |
| 93 | 2015 | 320635163 | 56839.381774 | 42.337509 | USA | United States | North America | High income | Not classified |
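
The `Unnamed: 0` column dropped in the cell above is usually the row index that pandas wrote out when the file was saved. One possible alternative to dropping it afterwards is to tell `read_csv` that the first column is the index. The sketch below uses two rows copied from the output above in place of the real file. The empty first header field is my assumption about how `data/world_data.csv` was written.

```python
import io

import pandas as pd

# Two rows copied from the output above. The empty first header field is an
# assumption about how the file was saved (DataFrame.to_csv writes the index
# this way by default).
csv_text = """,Year,SP.POP.TOTL,NY.GDP.PCAP.CD,SE.SEC.CUAT.PO.ZS,Country Code
0,2012,22733465,68012.147901,46.461971,AUS
1,2013,23128129,68150.107041,41.640911,AUS
"""

# Default read: the saved index comes back as a column named "Unnamed: 0".
with_extra_column = pd.read_csv(io.StringIO(csv_text))

# index_col=0 treats that first column as the index, so no drop is needed.
without_extra_column = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_extra_column.columns))
print(list(without_extra_column.columns))
```

Either approach ends up with the same columns. Using `index_col=0` just saves the separate `drop` step for each file.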


In [138]:
fig = px.scatter(data_frame = df_world,
                 height=675,
                 size_max= 60,
                 title = 'Bubble Size, Population, total',
                 labels= {
                   "NY.GDP.PCAP.CD": 'GDP per capita (Current US$)',
                    'SE.SEC.CUAT.PO.ZS': '% 25+ completed at least Post-Secondary',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x="NY.GDP.PCAP.CD",
                 y="SE.SEC.CUAT.PO.ZS",
                 color="Country Name",
                 size='SP.POP.TOTL', 
                 animation_frame="Year")

fig.show()

In [139]:
fig = px.scatter(data_frame = df_world,
                 height=600,
                 size_max= 60,
                 title = 'Bubble Size, Population, total',
                 labels= {
                   "NY.GDP.PCAP.CD": 'GDP per capita (Current US$)',
                    'SE.SEC.CUAT.PO.ZS': '% 25+ completed at least Post-Secondary',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x="NY.GDP.PCAP.CD",
                 y="SE.SEC.CUAT.PO.ZS",
                 color="Income Group",
                 size='SP.POP.TOTL',
                 animation_frame="Year")
fig.show()


 
 
### What about Bachelor's, Master's, and Doctoral Degree attainment?
 
I think the post-secondary data gives us a good sense of the validity of the trends even though we are limited to 5 countries' data. Below we can see that for each type of degree the trend is upheld. For each degree type, the wealthiest countries (Australia and Sweden) outstrip the other countries in terms of how many of their citizens have completed higher education.


#### Importing the rest of the data
There are three remaining CSV files that need to be read into a data frame. These three files differ only in a single column, so merging the data frames is required before proceeding.

In [115]:
df_higher_ed = pd.read_csv('data/bach_data.csv')
data_m = pd.read_csv('data/mast_data.csv')
data_d = pd.read_csv('data/doc_data.csv')

df_higher_ed.drop(columns=["Unnamed: 0"], inplace=True)

# Add the masters and doctoral columns to df_higher_ed

df_higher_ed.insert(4, 'SE.TER.CUAT.MS.ZS', data_m['SE.TER.CUAT.MS.ZS'])

df_higher_ed.insert(5,'SE.TER.CUAT.DO.ZS', data_d['SE.TER.CUAT.DO.ZS'])

df_higher_ed



| | Year | NY.ADJ.NNTY.PC.CD | SP.POP.TOTL | SE.TER.CUAT.BA.ZS | SE.TER.CUAT.MS.ZS | SE.TER.CUAT.DO.ZS | Country Code | Country Name | Region | Income Group | Lending Type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2013 | 53920.037029 | 23128129 | 26.743601 | 5.75698 | 0.98578 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 1 | 2014 | 48738.747973 | 23475686 | 24.946449 | 5.32485 | 0.93408 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 2 | 2015 | 44352.137846 | 23815995 | 29.78425 | 7.02229 | 1.317 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 3 | 2016 | 39137.300358 | 24190907 | 30.02981 | 6.73469 | 1.08242 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 4 | 2017 | 42036.79642 | 24601860 | 31.39119 | 7.36984 | 1.2035 | AUS | Australia | East Asia & Pacific | High income | Not classified |
| 5 | 2013 | 972.921721 | 152764676 | 5.50232 | 1.96671 | 0.09163 | BGD | Bangladesh | South Asia | Lower middle income | IDA |
| 6 | 2014 | 1091.481204 | 154520167 | 5.85214 | 2.05822 | 0.09679 | BGD | Bangladesh | South Asia | Lower middle income | IDA |
| 7 | 2015 | 1217.843548 | 156256276 | 8.23714 | 3.32594 | 0.15837 | BGD | Bangladesh | South Asia | Lower middle income | IDA |
| 8 | 2016 | 1360.206784 | 157970840 | 8.5096 | 3.48119 | 0.16596 | BGD | Bangladesh | South Asia | Lower middle income | IDA |
| 9 | 2017 | 1493.554796 | 159670593 | 8.97688 | 3.79244 | 0.17941 | BGD | Bangladesh | South Asia | Lower middle income | IDA |
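
The import cell adds the master's and doctoral columns by position, which works as long as all three files list their rows in the same order. If that is ever in doubt, a key-based merge is a more defensive option. The sketch below is not what the notebook does. It uses a few values copied from the table above, with hypothetical frame names `bach`, `mast`, and `doc`, and deliberately shuffles the row order to show that matching on country and year still lines things up.

```python
import pandas as pd

keys = ["Country Code", "Year"]

# Values copied from the table above; only two rows per frame for brevity.
bach = pd.DataFrame({
    "Country Code": ["AUS", "BGD"],
    "Year": [2017, 2017],
    "SE.TER.CUAT.BA.ZS": [31.39119, 8.97688],
})
# Deliberately listed in the opposite row order to show why keys matter.
mast = pd.DataFrame({
    "Country Code": ["BGD", "AUS"],
    "Year": [2017, 2017],
    "SE.TER.CUAT.MS.ZS": [3.79244, 7.36984],
})
doc = pd.DataFrame({
    "Country Code": ["BGD", "AUS"],
    "Year": [2017, 2017],
    "SE.TER.CUAT.DO.ZS": [0.17941, 1.2035],
})

# validate="one_to_one" raises if a country-year pair is duplicated.
merged = (
    bach
    .merge(mast, on=keys, how="left", validate="one_to_one")
    .merge(doc, on=keys, how="left", validate="one_to_one")
)
print(merged)
```

With a positional insert, the shuffled frames here would have silently attached Bangladesh's master's figure to Australia. The merge avoids that at the cost of a slightly longer cell.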


#### Now that everything is neat and tidy we can visualize the data
I'll begin with the Bachelor's data followed by the Master's and end with the Doctoral data.

In [137]:
fig = px.scatter(data_frame = df_higher_ed,
                 height=600,
                 size_max=80,
                 title = 'Bubble Size, Population, total',
                 labels= {
                   "NY.ADJ.NNTY.PC.CD": 'Adjusted net national income per capita (current US$)',
                    'SE.TER.CUAT.BA.ZS': '% 25+ completed at least Bachelor\'s Degree',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x='NY.ADJ.NNTY.PC.CD',
                 y='SE.TER.CUAT.BA.ZS',
                 color="Country Name",
                 size='SP.POP.TOTL', 
                 animation_frame="Year")
fig.update_layout(yaxis_range = (0,df_higher_ed['SE.TER.CUAT.BA.ZS'].max()+5))
fig.show()

In [136]:
fig = px.scatter(data_frame = df_higher_ed,
                 height=600,
                 size_max=80,
                 title = 'Bubble Size, Population, total',
                 labels= {
                   "NY.ADJ.NNTY.PC.CD": 'Adjusted net national income per capita (current US$)',
                    'SE.TER.CUAT.MS.ZS': '% 25+ completed at least Master\'s Degree',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x='NY.ADJ.NNTY.PC.CD',
                 y='SE.TER.CUAT.MS.ZS',
                 color="Country Name",
                 size='SP.POP.TOTL',                  
                 animation_frame="Year")
fig.update_layout(yaxis_range = (0,df_higher_ed['SE.TER.CUAT.MS.ZS'].max()+1))
fig.show()

In [135]:
fig = px.scatter(data_frame = df_higher_ed,
                 height=600,
                 size_max=80,
                 title = 'Bubble Size, Population, total',
                 labels= {
                   "NY.ADJ.NNTY.PC.CD": 'Adjusted net national income per capita (current US$)',
                    'SE.TER.CUAT.DO.ZS': '% 25+ completed at least Doctoral Degree',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x='NY.ADJ.NNTY.PC.CD',
                 y='SE.TER.CUAT.DO.ZS',
                 color="Country Name",
                 size='SP.POP.TOTL',              
                 animation_frame="Year")
fig.update_layout(yaxis_range = (0,df_higher_ed['SE.TER.CUAT.DO.ZS'].max()+.1))
fig.show()
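
The scatter plots suggest a positive relationship, and one way to put a rough number on it is a Pearson correlation coefficient. The sketch below uses only the ten rows visible in the higher-education table earlier (Australia and Bangladesh), so it illustrates the method and is not a result for all five countries. The short column names are my own.

```python
import pandas as pd

# The ten rows visible in the table earlier (Australia and Bangladesh, 2013-2017).
sample = pd.DataFrame({
    "country": ["AUS"] * 5 + ["BGD"] * 5,
    "income": [53920.037029, 48738.747973, 44352.137846, 39137.300358, 42036.79642,
               972.921721, 1091.481204, 1217.843548, 1360.206784, 1493.554796],
    "bachelor": [26.743601, 24.946449, 29.78425, 30.02981, 31.39119,
                 5.50232, 5.85214, 8.23714, 8.5096, 8.97688],
})

# Pearson correlation across all ten country-year points.
pooled_r = sample["income"].corr(sample["bachelor"])

# The same correlation within each country on its own.
within_r = {
    name: group["income"].corr(group["bachelor"])
    for name, group in sample.groupby("country")
}

print(f"pooled: {pooled_r:.2f}")
for name, r in within_r.items():
    print(f"{name}: {r:.2f}")
```

On these rows the pooled coefficient is strongly positive, but Australia's own five years point the other way, since its income in current US$ fell over the period while attainment rose. The pattern in this sample is therefore mostly a difference between countries, not something that holds year to year within one.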

### Two data walk into a bar....
 
Now that the charts suggest wealth and educational attainment are correlated, we can dive deeper. I want to take a closer look at Bachelor's degree attainment.
 
The bar graphs reiterate the findings above, but they make the contrast clearer than ever. The first shows a substantial difference in degree attainment between the leaders and the laggards. That distinction almost jumps off the screen in the second bar graph, which shows income. Together they show how closely factors like education track a nation's wealth and the wealth of its citizens.


In [147]:
fig = px.bar(data_frame = df_higher_ed,
                 height=600,
                 title= '% 25+ completed at least Bachelor\'s Degree by Country',
                 labels= {
                   "NY.ADJ.NNTY.PC.CD": 'Adjusted net national income per capita (current US$)',
                    'SE.TER.CUAT.BA.ZS': '% 25+ completed at least Bachelor\'s Degree',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x='Country Name',
                 y='SE.TER.CUAT.BA.ZS',
                 color="Country Name",
                 animation_frame="Year")
fig.update_xaxes(categoryorder='total descending')                 

fig.show()                

In [148]:
fig = px.bar(data_frame = df_higher_ed,
                 height=600,
                 title= 'Adjusted net national income per capita (current US$) by Country',
                 labels= {
                   "NY.ADJ.NNTY.PC.CD": 'Adjusted net national income per capita (current US$)',
                    'SE.TER.CUAT.BA.ZS': '% 25+ completed at least Bachelor\'s Degree',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x='Country Name',
                 y='NY.ADJ.NNTY.PC.CD',
                 color="Country Name",
                 animation_frame="Year")
fig.update_xaxes(categoryorder='total descending')                 

fig.show()                

 
### The end of the line
 
Last but not least is a graph of bachelor's attainment across these five countries over time. As suspected, there are some fluctuations here and there, but largely there is an upward trend as we march ever forward in time. Perhaps the most interesting thing to note is that the lines for the five countries (barring large anomalies) have similar slopes. I'm not sure what that means, but it could have to do with growing populations and university systems that have not yet reached their respective capacities.


In [153]:
fig = px.line(data_frame = df_higher_ed,
                 height=600,
                 title= '% 25+ completed at least Bachelor\'s Degree by Country over time',
                 labels= {
                   "NY.ADJ.NNTY.PC.CD": 'Adjusted net national income per capita (current US$)',
                    'SE.TER.CUAT.BA.ZS': '% 25+ completed at least Bachelor\'s Degree',
                    'SP.POP.TOTL' : 'Population, total'
                 },
                 x='Year',
                 y='SE.TER.CUAT.BA.ZS',
                 color="Country Name")
fig.update_layout(xaxis=dict(dtick=1))  # one tick per year; category ordering is not needed on a numeric Year axis

fig.show()                
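
To check the "similar slope" impression with numbers, a straight line can be fitted to each country's series. This is a minimal sketch using `numpy.polyfit` on the two countries whose values appear in the table earlier. The other three countries are left out because their rows are not shown, and a linear fit over five points is only a rough summary.

```python
import numpy as np

years = np.array([2013, 2014, 2015, 2016, 2017])
# Centering the years keeps the fit well conditioned and leaves the slope unchanged.
x = years - years.mean()

# Bachelor's attainment (% of population 25+), copied from the table earlier.
bachelor = {
    "Australia": [26.743601, 24.946449, 29.78425, 30.02981, 31.39119],
    "Bangladesh": [5.50232, 5.85214, 8.23714, 8.5096, 8.97688],
}

# np.polyfit with degree 1 returns [slope, intercept] of the least-squares line.
slopes = {
    country: np.polyfit(x, values, 1)[0]
    for country, values in bachelor.items()
}

for country, slope in slopes.items():
    print(f"{country}: {slope:+.2f} percentage points per year")
```

For these two countries the fitted slopes are both positive but not identical, so the lines are better described as roughly parallel than as having exactly the same slope.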

### This is the end my friend
 
In conclusion, I think we learned that there is a correlation between wealth and educational attainment. But I don’t think there is enough to go on to say that education is the factor that determines whether a country is wealthy or not.
 
What can poorer nations do to improve the education of their citizens? I think that access to food, running water, and reliable electricity are table stakes. They are the critical things that allow a country to become stable. Only then can children be reliably educated, and only after completing primary and secondary education can they go on to pursue post-secondary degrees. Being highly educated is ultimately a luxury, even though being educated can lead to even more wealth. The gap between the rich and the poor will only grow bigger if the ‘little guys’ can’t ever catch up.
