# Human Capital and Economic Growth

**Human capital** is widely recognized as one of the most critical drivers of sustained economic growth and development. This classical economic proposition posits that investments in the skills, knowledge, and health of the workforce directly enhance a nation's productive capacity.
To empirically test this thesis, this study aims to **quantitatively analyze the relationship between government expenditure on education and the subsequent change in total factor productivity (TFP)** across a panel of countries over time. Government spending on education serves as a primary, measurable proxy for public investment in human capital formation.

The theoretical mechanism is straightforward: increased public investment in schooling and training is expected to raise the level of labor quality, foster technological absorption capabilities, and stimulate domestic innovation. These factors collectively lead to an increase in **Total Factor Productivity (TFP)**, which is the key residual measure of economic efficiency and technological progress. Specifically, we will use government education expenditure (as a percentage of GDP) as the key independent variable, and TFP growth rate as the dependent variable.

# Human Capital

First, to measure investment in human capital, we will use data on government spending on education (as a percentage of GDP) from [the World Bank](https://data.worldbank.org/indicator/SE.XPD.TOTL.GD.ZS) as the core empirical proxy.

In [1]:
import pandas as pd
education = pd.read_csv("API_SE.XPD.TOTL.GD.ZS_DS2_en_csv_v2_2638.csv", header=2)
education

Unnamed: 0,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2016,2017,2018,2019,2020,2021,2022,2023,2024,Unnamed: 69
0,Aruba,ABW,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,5.491360,4.455820,4.548764,4.435037,,3.618558,,,,
1,Africa Eastern and Southern,AFE,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,4.692000,4.430510,4.739750,4.511410,4.090565,4.368379,3.697668,3.962293,,
2,Afghanistan,AFG,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,4.543970,4.343190,,,,,,,,
3,Africa Western and Central,AFW,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,2.615035,3.296630,3.051252,3.047399,3.398741,3.096926,2.891687,3.215620,,
4,Angola,AGO,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,2.754937,2.466879,2.183513,2.073064,2.667447,2.297197,2.385359,2.512737,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
261,Kosovo,XKX,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,,,,,,,,,,
262,"Yemen, Rep.",YEM,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,,,,,,,,,,
263,South Africa,ZAF,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,5.444240,5.598670,5.644010,5.911580,6.170670,6.555170,6.155990,6.123470,6.0211,
264,Zambia,ZMB,"Government expenditure on education, total (% ...",SE.XPD.TOTL.GD.ZS,,,,,,,...,3.747920,3.729640,4.739750,4.418240,3.943741,3.113635,3.658841,4.073749,,


In [2]:
# clean the data
education = education.dropna(axis=1, how="all")
education = education.drop(columns=["Indicator Name","Indicator Code"])
education

Unnamed: 0,Country Name,Country Code,1970,1971,1972,1973,1974,1975,1976,1977,...,2015,2016,2017,2018,2019,2020,2021,2022,2023,2024
0,Aruba,ABW,,,,,,,,,...,5.888270,5.491360,4.455820,4.548764,4.435037,,3.618558,,,
1,Africa Eastern and Southern,AFE,,,,,,,,,...,4.737919,4.692000,4.430510,4.739750,4.511410,4.090565,4.368379,3.697668,3.962293,
2,Afghanistan,AFG,,,,,,,,,...,3.255800,4.543970,4.343190,,,,,,,
3,Africa Western and Central,AFW,,,,,,,,,...,3.138830,2.615035,3.296630,3.051252,3.047399,3.398741,3.096926,2.891687,3.215620,
4,Angola,AGO,,,,,,,,,...,3.486896,2.754937,2.466879,2.183513,2.073064,2.667447,2.297197,2.385359,2.512737,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
261,Kosovo,XKX,,,,,,,,,...,,,,,,,,,,
262,"Yemen, Rep.",YEM,,,,,,,,,...,,,,,,,,,,
263,South Africa,ZAF,,,,,,,,,...,5.482850,5.444240,5.598670,5.644010,5.911580,6.170670,6.555170,6.155990,6.123470,6.0211
264,Zambia,ZMB,4.38781,6.09535,5.94579,5.48807,4.87801,6.23854,6.02694,5.69251,...,4.624330,3.747920,3.729640,4.739750,4.418240,3.943741,3.113635,3.658841,4.073749,


As a next step, we will identify the most recent year in the dataset with the highest data completeness (i.e., the year with the fewest NA values).

In [3]:
# find the year with the fewest NA
na_year = education.isna().sum().reset_index()
na_year

Unnamed: 0,index,0
0,Country Name,0
1,Country Code,0
2,1970,234
3,1971,210
4,1972,212
5,1973,212
6,1974,212
7,1975,205
8,1976,209
9,1977,208


Based on the NA counts above, we select the year 2020 for the analysis, as it balances recency against data availability.
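
The selection rule can also be applied programmatically. The following is a minimal sketch on a small made-up table (the values and the `tolerance` threshold are illustrative assumptions, not taken from the World Bank file): it counts missing values per year column, then picks the most recent year whose NA count is close to the minimum.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the cleaned `education` table (values are made up).
toy = pd.DataFrame({
    "Country Name": ["A", "B", "C", "D"],
    "Country Code": ["AAA", "BBB", "CCC", "DDD"],
    "2018": [4.1, np.nan, 3.0, 5.2],
    "2019": [4.0, 3.9, 3.1, 5.0],
    "2020": [4.2, 3.8, np.nan, 5.1],
    "2021": [np.nan, np.nan, np.nan, 5.3],
})

# Keep only the year columns, then count missing values per year.
year_cols = [c for c in toy.columns if c.isdigit()]
na_per_year = toy[year_cols].isna().sum()

# Year with the fewest missing values overall.
fewest_na_year = na_per_year.idxmin()

# Most recent year whose NA count is within `tolerance` of that minimum
# (a hypothetical rule for trading completeness against recency).
tolerance = 1
eligible = na_per_year[na_per_year <= na_per_year.min() + tolerance]
latest_eligible_year = max(eligible.index, key=int)

print(na_per_year.to_dict())
print(fewest_na_year, latest_eligible_year)
```

Restricting to the year columns matters because `Country Name` and `Country Code` always have zero NA values and would otherwise win a plain `idxmin()`.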

In [9]:
education_2020 = education[["Country Name","2020"]]
education_2020

Unnamed: 0,Country Name,2020
0,Aruba,
1,Africa Eastern and Southern,4.090565
2,Afghanistan,
3,Africa Western and Central,3.398741
4,Angola,2.667447
...,...,...
261,Kosovo,
262,"Yemen, Rep.",
263,South Africa,6.170670
264,Zambia,3.943741


# TFP

Second, we will analyze the growth of total factor productivity (TFP), using data from [Our World in Data](https://ourworldindata.org/grapher/tfp-at-constant-national-prices-20111).

Because the TFP figures in the dataset are indexed to each country's own 2021 level (2021 = 1), the levels themselves are not comparable across countries. We therefore compute the annual TFP growth rate by dividing the TFP value of a given year by the value from the preceding year and subtracting one. This isolates the year-to-year percentage change, which serves as the dependent variable.
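
As a minimal sketch of this calculation, the snippet below computes the growth rate within each country on a four-row sample. The TFP values are taken from the dataset used in this notebook; the short column name `TFP` is an assumption made for readability.

```python
import pandas as pd

# Small long-format sample: two countries, two consecutive years each.
sample = pd.DataFrame({
    "Entity": ["Albania", "Albania", "Zambia", "Zambia"],
    "Year": [2022, 2023, 2022, 2023],
    "TFP": [0.995818, 1.022242, 1.037554, 1.060886],
})

# Sort so that each row is preceded by the previous year of the same country.
sample = sample.sort_values(["Entity", "Year"])

# Growth = TFP_t / TFP_{t-1} - 1, computed separately for each country.
previous = sample.groupby("Entity")["TFP"].shift(1)
sample["Growth"] = sample["TFP"] / previous - 1

print(sample)
```

The first year of each country has no predecessor, so its growth rate is `NaN`. This long-format approach yields growth rates for every year at once, which is convenient if other lags are tested later.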

In [10]:
tfp = pd.read_csv("tfp-at-constant-national-prices-20111.csv")
tfp

Unnamed: 0,Entity,Code,Year,Total factor productivity level (using national accounts)
0,Albania,ALB,1974,0.728517
1,Albania,ALB,1975,0.733268
2,Albania,ALB,1976,0.727340
3,Albania,ALB,1977,0.719494
4,Albania,ALB,1978,0.712486
...,...,...,...,...
6887,Zimbabwe,ZWE,2019,1.057740
6888,Zimbabwe,ZWE,2020,0.973187
6889,Zimbabwe,ZWE,2021,1.000000
6890,Zimbabwe,ZWE,2022,0.997026


Based on the assumption that educational investment takes effect with a time lag, this study uses educational expenditure in 2020 as the independent variable and the TFP growth rate between 2022 and 2023 as the dependent variable.
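
In symbols, the specification estimated in the cells below can be written as follows. The notation is introduced here only for illustration: $E_i$ is the 2020 education expenditure of country $i$ as a share of GDP, and $g_i$ is its TFP growth rate.

$$
g_i = \frac{\mathrm{TFP}_{i,2023}}{\mathrm{TFP}_{i,2022}} - 1, \qquad g_i = \beta_0 + \beta_1 E_i + \varepsilon_i
$$

The years 2022 and 2023 are the ones used in the code cells that follow. The coefficient of interest is $\beta_1$: a positive and statistically significant estimate would be consistent with the hypothesis that education spending raises subsequent productivity growth.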

In [11]:
# keep the two most recent years (2022 and 2023)
tfp_2023 = tfp[tfp["Year"] >= 2022]
tfp_2023

Unnamed: 0,Entity,Code,Year,Total factor productivity level (using national accounts)
48,Albania,ALB,2022,0.995818
49,Albania,ALB,2023,1.022242
98,Angola,AGO,2022,1.004963
99,Angola,AGO,2023,0.996475
150,Argentina,ARG,2022,1.039326
...,...,...,...,...
6699,Uruguay,URY,2023,0.993805
6824,Zambia,ZMB,2022,1.037554
6825,Zambia,ZMB,2023,1.060886
6890,Zimbabwe,ZWE,2022,0.997026


In [12]:
tfp_growth = tfp_2023.pivot(index="Entity",
                            columns="Year",
                            values="Total factor productivity level (using national accounts)")
tfp_growth

Year,2022,2023
Entity,Unnamed: 1_level_1,Unnamed: 2_level_1
Albania,0.995818,1.022242
Angola,1.004963,0.996475
Argentina,1.039326,0.966590
Armenia,1.069053,1.107368
Australia,1.005513,1.001073
...,...,...
United Kingdom,1.010483,0.998529
United States,0.984038,0.992999
Uruguay,0.993747,0.993805
Zambia,1.037554,1.060886


In [13]:
# growth rate between 2022 and 2023: TFP_2023 / TFP_2022 - 1
tfp_growth["Growth"] = tfp_growth[2023] / tfp_growth[2022] - 1
# keep only the growth column
tfp_growth = tfp_growth.drop(columns=[2022, 2023])
tfp_growth

Year,Growth
Entity,Unnamed: 1_level_1
Albania,0.026534
Angola,-0.008447
Argentina,-0.069984
Armenia,0.035840
Australia,-0.004416
...,...
United Kingdom,-0.011830
United States,0.009106
Uruguay,0.000059
Zambia,0.022487


# Merge the two datasets

In [16]:
df_merge = pd.merge(left=education_2020, right=tfp_growth,
                    left_on="Country Name", right_on="Entity")
df_merge

Unnamed: 0,Country Name,2020,Growth
0,Angola,2.667447,-0.008447
1,Albania,3.325040,0.026534
2,Argentina,5.276900,-0.069984
3,Armenia,2.705560,0.035840
4,Australia,5.386250,-0.004416
...,...,...,...
103,Uruguay,4.542920,0.000059
104,United States,5.395320,0.009106
105,South Africa,6.170670,-0.041478
106,Zambia,3.943741,0.022487
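
Merging on country names drops any country whose name is spelled differently in the two sources. The sketch below shows one way to check for this and one possible alternative, merging on the three-letter codes that both files carry (`Country Code` and `Code`). The tables and numbers are illustrative toy data, and the name mismatch is a constructed example rather than a confirmed discrepancy.

```python
import pandas as pd

# Toy tables; names and numbers are illustrative, not the real data.
edu = pd.DataFrame({
    "Country Name": ["Zambia", "Yemen, Rep.", "Angola"],
    "Country Code": ["ZMB", "YEM", "AGO"],
    "2020": [3.9, 4.5, 2.7],
})
growth = pd.DataFrame({
    "Entity": ["Zambia", "Yemen", "Angola"],
    "Code": ["ZMB", "YEM", "AGO"],
    "Growth": [0.02, -0.01, -0.008],
})

# An outer merge on names with an indicator column reveals unmatched rows.
by_name = pd.merge(edu, growth, left_on="Country Name", right_on="Entity",
                   how="outer", indicator=True)
unmatched = by_name[by_name["_merge"] != "both"]

# Merging on the three-letter codes avoids spelling differences.
by_code = pd.merge(edu, growth, left_on="Country Code", right_on="Code",
                   how="inner")

print(unmatched[["Country Name", "Entity", "_merge"]])
print(len(by_code))
```

Applying this to the notebook would require keeping `Country Code` in `education_2020` and carrying `Code` through the pivot, which the cells above do not currently do.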


In [17]:
import plotly.express as px
fig = px.scatter(df_merge,
                 x = "2020",
                 y = "Growth",
                 title= "The Relationship between expenditure on education and productivity",
                 trendline="ols")
fig

In [20]:
import statsmodels.formula.api as smf
model = smf.ols("Growth ~ Q('2020')", data=df_merge)
results = model.fit()
results.summary()

0,1,2,3
Dep. Variable:,Growth,R-squared:,0.001
Model:,OLS,Adj. R-squared:,-0.01
Method:,Least Squares,F-statistic:,0.06353
Date:,"Sat, 06 Dec 2025",Prob (F-statistic):,0.802
Time:,22:31:23,Log-Likelihood:,221.71
No. Observations:,98,AIC:,-439.4
Df Residuals:,96,BIC:,-434.2
Df Model:,1,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,-0.0075,0.008,-0.927,0.356,-0.023,0.008
Q('2020'),0.0004,0.002,0.252,0.802,-0.003,0.004

0,1,2,3
Omnibus:,4.766,Durbin-Watson:,2.138
Prob(Omnibus):,0.092,Jarque-Bera (JB):,5.269
Skew:,-0.237,Prob(JB):,0.0718
Kurtosis:,4.032,Cond. No.,16.3


Unfortunately, based on the provided data and model specification, the OLS analysis found no evidence that government expenditure on education in 2020 has a statistically significant impact on the TFP growth rate observed between 2022 and 2023 (coefficient 0.0004, p = 0.802).

The R-squared of 0.001 means that the independent variable, educational expenditure, explains only 0.1% of the variation in TFP growth. This indicates an extremely poor fit for the model.

However, this outcome does not necessarily invalidate the theory that human capital drives growth; rather, it suggests that the analytical approach needs refinement. We can consider some ways to obtain a more robust result. For example, we can adjust the time lag: the effect of education may take longer than two to three years to materialize.
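
One way to explore the time lag is to repeat the regression for several candidate lags and compare the results. The sketch below does this on synthetic data that is constructed, purely for illustration, so that growth depends on spending five years earlier (`TRUE_LAG`). The helper `fit_for_lag`, the column names, and all values are hypothetical. It pools country-year observations and uses a simple NumPy line fit instead of the `statsmodels` call used above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
codes = [f"C{i:02d}" for i in range(30)]
years = list(range(2000, 2024))

# Synthetic education spending (% of GDP) per country and year.
edu_long = pd.DataFrame(
    [(c, y) for c in codes for y in years], columns=["Code", "Year"]
)
edu_long["Edu"] = rng.uniform(2, 7, size=len(edu_long))

# Synthetic growth that, by construction, depends on spending 5 years earlier.
TRUE_LAG = 5
growth_long = edu_long.assign(Year=edu_long["Year"] + TRUE_LAG)
growth_long = growth_long[growth_long["Year"] <= years[-1]].copy()
growth_long["Growth"] = (0.01 * growth_long["Edu"]
                         + rng.normal(0, 0.005, size=len(growth_long)))
growth_long = growth_long[["Code", "Year", "Growth"]]


def fit_for_lag(lag):
    """Pair spending in year t with growth in year t + lag; return slope and R^2."""
    shifted = edu_long.assign(Year=edu_long["Year"] + lag)
    paired = shifted.merge(growth_long, on=["Code", "Year"]).dropna()
    slope, intercept = np.polyfit(paired["Edu"], paired["Growth"], deg=1)
    r2 = np.corrcoef(paired["Edu"], paired["Growth"])[0, 1] ** 2
    return slope, r2


results = {lag: fit_for_lag(lag) for lag in (2, 3, 5, 8, 10)}
for lag, (slope, r2) in results.items():
    print(f"lag={lag:2d}  slope={slope: .4f}  R2={r2:.3f}")
```

With real data, the same loop could pair each year's expenditure with TFP growth several years later. Testing many lags and reporting only the best one inflates the chance of a spurious finding, so the set of lags should be chosen in advance.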