## PSTAT 100 Final Project Report
## Exploring the Human Freedom Index and World Happiness data

Chunting Zheng, Karen Zhao

#### Author contributions

Chunting Zheng worked on everything.

Karen Zhao worked on everything.

#### Abstract

The Human Freedom Index (HFI) is a global measurement of personal, civil, and economic freedom on a scale of 0 to 10. Because freedom is inherently valuable and plays an important role in human progress and well-being, it is important to observe its relationship with how people evaluate their quality of life, as well as the ways in which the various dimensions of freedom interact with one another. This project studies the correlation among the 12 areas of freedom used to calculate human freedom, and the relationship between freedom and life evaluation. Exploratory analyses showed that these areas of freedom and the life evaluation score are all positively correlated with each other. The fitted multiple regression model suggests that there is a significant difference in life evaluation by income level and personal freedom, but not by economic freedom.

---
## Introduction

### Background

In recent years, personal freedom has declined around the world. It can be challenging to determine which countries' citizens have the most freedom. However, we can determine which countries have the highest level of human freedom through individual indices. The [Human Freedom Index Report for 2020](https://www.cato.org/sites/cato.org/files/2021-03/human-freedom-index-2020.pdf) is co-published by the Cato Institute and the Fraser Institute. The Human Freedom Index helps us observe the relationship between freedom and other socioeconomic phenomena, as well as the ways in which the various dimensions of freedom interact with one another.

To determine each country's freedom rank, each country is given a score for `Personal Freedom` and `Economic Freedom`. These scores are averaged to find the `Human Freedom` score. Countries in the top quartile of freedom enjoy a significantly higher average per capita income ($\$50,340$) than those in other quartiles; the average per capita income in the least-free quartile is $\$7,720$.

The findings in the Human Freedom Index suggest that freedom plays an important role in human well-being, and they offer opportunities for further research into the complex ways in which freedom influences, and can be influenced by, political regimes, economic development, and the whole range of indicators of human well-being. The Human Freedom Index also finds a strong relationship between human freedom and democracy.

Moreover, freedom and happiness tend to be positively correlated. The [World Happiness Report for 2020](https://happiness-report.s3.amazonaws.com/2020/WHR20.pdf) brings together the available global data on national happiness and reviews evidence from the emerging science of happiness. The report reminds us that happiness is based on social capital, not just financial capital. In this project, we dig more deeply into the pattern of correlation between freedom and happiness and see how it differs across cultures and aspects of freedom in 2018.


### Aims
This project aims to dig more deeply into the pattern of correlation between freedom and happiness and see how it differs across cultures and aspects of freedom in 2018. We first attempted to detect the major determinants of human freedom using principal component analysis. We could not identify which freedom subindexes contribute significantly to human freedom, yet we found regional patterns in the data. The fall in personal freedom in the past decade has driven the decline in human freedom in the world, with some indicators and some regions seeing especially marked deterioration. We then sought to understand the relationship between world happiness and freedom scores, and to identify freedom variables that are predictive of world happiness in 2018. Using a linear regression model, about 65% of the variability in world happiness scores in 2018 can be explained by personal freedom, economic freedom, and income level. We also found that personal freedom contributes to world happiness more significantly than economic freedom.

---
## Materials and methods

### Datasets

The data for this project are the `Human Freedom Index`, collected and compiled by Ryan Murphy, and the `World Happiness Report`, based on the Gallup World Poll. These data are publicly available:

> Ian Vasquez and Fred McMahon, The Human Freedom Index 2020: A Global Measurement of Personal, Civil, and Economic Freedom (Washington: Cato Institute, Fraser Institute, and the Friedrich Naumann Foundation for Freedom, 2020).

> Helliwell, John F., Richard Layard, Jeffrey Sachs, and Jan-Emmanuel De Neve, eds. 2020. World Happiness Report 2020. New York: Sustainable Development Solutions Network.

The values in the `Human Freedom Index` are obtained or calculated from variables drawn from various datasets, such as the Global Terrorism Database, Global Database, United Nations, CI-RIGHTS Dataset, OECD, and so on. For example, rule of law is an average of procedural justice, civil justice, and criminal justice, and each subcomponent is calculated as an average of selected Rule of Law Index subfactors. The values in the `World Happiness Report` are derived from the Gallup World Poll surveys, which ask people in each country about their life satisfaction and happiness. Both datasets are administrative data; thus, the scope of inference is none. No information is available about the sampling design of either dataset.

The `Human Freedom Index` presents the state of human freedom in 162 countries around the world from 2008 to 2018, based on a broad measure built from 79 distinct indicators that encompass economic, civil, and personal freedom. The `World Happiness Report` contains happiness scores for 156 countries from 2008 to 2018, along with the factors used to explain the scores. For this study, the observational units are countries.

> **Table 1:** variable descriptions and units for each variable in the dataset.

Name | Variable description | Type | Units of measurement
---|---|---|---
hf_score | Human freedom score | Numeric |  Ranges from 0 to 10
pf_rol | Rule of law | Numeric | Ranges from 0 to 10
pf_ss | Security and safety | Numeric | Ranges from 0 to 10
pf_movement | Freedom of movement (travel) | Numeric | Ranges from 0 to 10
pf_religion | Religious freedom | Numeric | Ranges from 0 to 10
pf_association | Freedom to associate and assemble with peaceful individuals or organizations | Numeric | Ranges from 0 to 10
pf_expression | Freedom of expression | Numeric | Ranges from 0 to 10
pf_identity | Identity and relationships | Numeric | Ranges from 0 to 10
pf_score | Personal freedom score | Numeric | Ranges from 0 to 10
ef_government | Size of government | Numeric | Ranges from 0 to 10
ef_legal | Legal system and property rights | Numeric | Ranges from 0 to 10
ef_money | Sound money | Numeric | Ranges from 0 to 10
ef_trade | Freedom to trade internationally | Numeric | Ranges from 0 to 10
ef_regulation | Regulation of Credit, Labor, and Business | Numeric | Ranges from 0 to 10
ef_score | Economic freedom score | Numeric | Ranges from 0 to 10
life_ladder | Life evaluation score | Numeric | Ranges from 0 to 10
income | Income level | Categorical | Low, lower-middle, upper-middle, high

In our data preprocessing stage, we merged the human freedom and happiness datasets by year and country. We then filtered the merged data to the 2018 records to obtain the 2018 Human Freedom and Happiness data.

> **Table 2:** example rows and columns of 2018 Human Freedom and Happiness data.

|    | region                     |   year | income       | country   |   hf_score |   pf_score |   pf_rol |   pf_ss |   ef_score |   ef_government |   ef_legal |   ef_money |   life_ladder |
|---:|:---------------------------|-------:|:-------------|:----------|-----------:|-----------:|---------:|--------:|-----------:|----------------:|-----------:|-----------:|--------------:|
|  0 | Europe & Central Asia      |   2018 | Upper middle | Albania   |       7.81 |       7.81 |      5   |     9.3 |       7.8  |             8.1 |        5.2 |        9.8 |       5.0044  |
|  1 | Middle East & North Africa |   2018 | Upper middle | Algeria   |       5.2  |       5.42 |      5.1 |     7.8 |       4.97 |             4.2 |        4.5 |        7.9 |       5.04309 |
|  2 | Sub-Saharan Africa         |   2018 | Lower middle | Angola    |       5.48 |       6.21 |      3.6 |     8.4 |       4.75 |             7.3 |        3.4 |        4.7 |     nan       |
|  3 | Latin America & Caribbean  |   2018 | Upper middle | Argentina |       7.05 |       8.32 |      5.7 |     8.8 |       5.78 |             6   |        4.6 |        5.1 |       5.7928  |
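The merge-and-filter step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the frame names `hfi` and `happiness`, the choice of a left join, and the 2017 rows are assumptions made for the example, while the 2018 values are copied from Table 2.

```python
import pandas as pd

# Toy stand-ins for the two source tables. The 2018 values come from Table 2;
# the 2017 rows are hypothetical and exist only so the year filter has work to do.
hfi = pd.DataFrame({
    "country": ["Albania", "Algeria", "Angola", "Albania"],
    "year": [2018, 2018, 2018, 2017],
    "pf_score": [7.81, 5.42, 6.21, 7.00],   # last value is hypothetical
    "ef_score": [7.80, 4.97, 4.75, 7.00],   # last value is hypothetical
})
happiness = pd.DataFrame({
    "country": ["Albania", "Algeria", "Albania"],
    "year": [2018, 2018, 2017],
    "life_ladder": [5.0044, 5.04309, 5.0],  # last value is hypothetical
})

# Assumed left join: every HFI row is kept, and a country without a matching
# happiness record gets NaN for life_ladder (as Angola does in Table 2).
merged = hfi.merge(happiness, on=["country", "year"], how="left")

# Keep only the 2018 records.
hf_happiness = merged[merged["year"] == 2018].reset_index(drop=True)
print(hf_happiness)
```

Joining on both `country` and `year` keeps each country-year pair as a single row, so the later filter on `year` leaves one row per country.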

### Methods

Exploratory analysis aimed at illuminating the correlation among freedom and life evaluation scores, and the relationship between life satisfaction and human freedom in 2018. This stage of analysis checked the relationships visually to see what kind of model structure is sensible. Subsequently, principal components analysis was performed on the normalized human freedom index data to identify measures of freedom subindexes that capture a significant portion of total variation in the data; typical values of these measures were compared by region and income level. Lastly, multiple linear regression was performed to quantify the association between life satisfaction and personal freedom, economic freedom, and income level. Further, the model is used to predict life satisfaction from personal freedom, economic freedom, and income level.

---
## Results

#### Correlation between freedom and happiness

Exploratory analysis focused on visualizing relationships between the variables to see which kind of model structure is sensible, since the next step of the project is to predict life evaluation score using personal freedom, economic freedom, and income level. Figure 1 shows the correlation among quantitative variables and the relationship between personal / economic freedom and life satisfaction in 2018.

> **Figure 1**: Left: a heatmap of correlation among all quantitative variables in `hf_happiness` data. Color scale shows positive correlations in orange, negative ones in blue, strong correlations in dark tones, and weak correlations in light tones. Right: a scatterplot of personal or economic freedom scores (x axis) and life evaluation score (y axis), with points and trend lines colored by score type (`pf_score` or `ef_score`).

<center><img src = 'figures/fig1.svg' style = 'width:800px'></center>

As shown in the heatmap, size of government is only weakly correlated with the other variables: all entries in the `ef_government` row are in light tones. Human freedom score is strongly and positively correlated with the personal and economic freedom indicators: all entries in the `hf_score` row are orange. There also appears to be a moderately strong correlation between life evaluation score and personal / economic freedom scores. The scatterplot shows an approximately linear relationship between life evaluation score and personal / economic freedom scores, with the slope on economic freedom being steeper.
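A correlation matrix like the one behind the heatmap can be computed with pandas, as in the sketch below. It uses only the four example rows and a subset of columns from Table 2, so the printed values are purely illustrative and will not match Figure 1, which is based on all countries and all quantitative variables.

```python
import pandas as pd

# The four example rows of Table 2 (subset of columns), for illustration only.
rows = pd.DataFrame({
    "hf_score": [7.81, 5.20, 5.48, 7.05],
    "pf_score": [7.81, 5.42, 6.21, 8.32],
    "ef_score": [7.80, 4.97, 4.75, 5.78],
    "ef_government": [8.1, 4.2, 7.3, 6.0],
    "life_ladder": [5.0044, 5.04309, float("nan"), 5.7928],
})

# Pairwise Pearson correlations; missing values are dropped pair by pair,
# so the NaN in life_ladder only affects that variable's row and column.
corr = rows.corr()
print(corr.round(2))
```

Each row of `corr` corresponds to one row of the heatmap, which is how statements about "all entries in the `hf_score` row" can be read off.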

#### Regional patterns

Principal components analysis was performed on the normalized human freedom index data to derive two summary scores of the freedom subindexes. The first score predominantly describes the average of all economic and personal freedom scores. The second score reflects the difference between the economic and personal freedom scores. Notably, countries can be profiled by these two scores together with income level and region.

Further analysis of these scores reveals regional patterns, along with income level, in the data. Figure 2 shows a loading plot indicating how the freedom subindexes contribute to each score. Figure 3 shows how the two scores characterize countries or regions and set countries apart within each region.

> **Figure 2**: a loading plot of the first two principal components, showing the weight of each freedom subindex in each score.

<center><img src = 'figures/fig2.svg' style = 'width:250px'></center>

The first two principal components explained roughly 65% of the total variability in the human freedom index. We could not identify which freedom subindexes contribute significantly to human freedom. Nonetheless, PC1 reflects the average of all freedom subindexes, and PC2 reflects the difference between the personal freedom subindexes and the economic freedom subindexes. A large negative PC1 implies high average personal and economic freedom scores, and thus a high human freedom score. A large positive PC2 implies high economic freedom scores; conversely, a large negative PC2 implies high personal freedom scores.

> **Figure 3**: a scatterplot of PC2 against PC1, faceted by region, with points colored by income level.
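To illustrate how loadings, scores, and the proportion of variance explained are obtained, the sketch below runs PCA through a singular value decomposition in NumPy. The data are synthetic (four correlated columns driven by one shared factor) and are not the report's data; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the subindex matrix: 100 "countries" by 4 "subindexes",
# each column equal to a shared factor plus noise. NOT the report's data.
shared = rng.normal(size=(100, 1))
X = shared + 0.5 * rng.normal(size=(100, 4))

# Normalize each column to mean 0 and standard deviation 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via the singular value decomposition of the normalized data.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
loadings = Vt.T                      # column j holds the loadings of PC(j+1)
var_ratio = s**2 / np.sum(s**2)      # proportion of variance per component
scores = Z @ loadings                # principal component scores per row

print("variance explained by PC1, PC2:", var_ratio[:2].round(3))
print("PC1 loadings:", loadings[:, 0].round(2))
```

When all PC1 loadings share the same sign, PC1 behaves like an overall average of the columns. The sign of a principal component is arbitrary, which is why a large negative PC1 can correspond to high freedom scores.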

<center><img src = 'figures/fig3.svg' style = 'width:800px'></center>

Some countries have both high personal and high economic freedom, while others, such as Argentina, have high personal freedom but markedly lower economic freedom. The scatterplots suggest that there are regional patterns in the data. For example, North American countries have high human freedom and no large differences between personal and economic freedom. However, this conclusion could be biased, since the data include only a few North American countries, all with high incomes. In Asia and Europe, countries with a high income level tend to have high human freedom scores. Sub-Saharan Africa contains countries at mixed income levels, and the variation along the PC1 axis suggests that Sub-Saharan African countries have lower human freedom in general, regardless of income level.

#### Predicting life satisfaction via linear regression

Lastly, a multiple linear regression model was fitted to approximate the association between life evaluation score and personal freedom, economic freedom, and income.

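$$\mathrm{eval}_i = \beta_0 + \beta_1 \, \mathrm{pf}_i + \beta_2 \, \mathrm{ef}_i + \beta_3 \, \mathrm{low}_i + \beta_4 \, \mathrm{lowermid}_i + \beta_5 \, \mathrm{uppermid}_i + \epsilon_i.$$

In this model, personal freedom score ($\mathrm{pf}$) and economic freedom score ($\mathrm{ef}$) are quantitative; income is categorical and encoded using indicators for the low, lower-middle, and upper-middle levels, with high income as the reference level. The human freedom score, being the average of the two freedom scores, does not appear as a separate term in Table 3. Table 3 shows the results of the linear regression and Figure 4 provides a model visualization.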
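A model with the terms reported in Table 3 (an intercept, the two freedom scores, and income indicators with high income as the reference level) could be fitted by least squares as sketched below. The data are simulated from hypothetical coefficients (`beta_true`), so the output will not reproduce Table 3; the sketch only shows how the estimates, standard errors, error variance, and $R^2$ are computed.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic predictors (NOT the report's data): two scores on a 0-10 scale
# and an income level drawn from four categories.
pf = rng.uniform(3, 9.5, size=n)
ef = rng.uniform(4, 9, size=n)
levels = np.array(["low", "lower-middle", "upper-middle", "high"])
income = rng.choice(levels, size=n)

# Indicator encoding with "high" as the reference level (no column for it).
dummies = np.column_stack([(income == lv).astype(float) for lv in levels[:3]])
X = np.column_stack([np.ones(n), pf, ef, dummies])

# Simulate a response from hypothetical coefficients so the fit can be checked.
beta_true = np.array([4.0, 0.2, 0.1, -1.5, -1.0, -0.8])
y = X @ beta_true + rng.normal(scale=0.6, size=n)

# Least-squares estimates, error variance estimate, and standard errors.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

# R^2: share of the variation in y explained by the fitted model.
r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

names = ["intercept", "pf", "ef", "low", "lower-middle", "upper-middle"]
for name, b, s in zip(names, beta_hat, se):
    print(f"{name:>13}: {b:8.4f} (SE {s:.4f})")
print("error variance:", round(float(sigma2_hat), 4), " R^2:", round(float(r2), 4))
```

Because high income has no indicator column, each income coefficient is read as a shift in the intercept relative to high-income countries.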

> **Table 3**: a parameter estimate table of the fitted linear regression model. The coefficient estimates and standard errors for the intercept and each variable are shown in the first two columns, followed by the lower and upper bounds of the interval given by the estimate plus or minus two standard errors, with rows indexed by parameter name. The estimate for the error variance parameter is in the last row.

|                     |   estimate |   standard error |      lwr |     upr |
|:--------------------|-----------:|-----------------:|----------:|----------:|
| intercept           |    4.37998 |          0.62897 |   3.12203 |   5.63792 |
| pf                  |    0.18319 |          0.06504 |   0.05311 |   0.31327 |
| ef                  |    0.0923  |          0.08887 |  -0.08543 |   0.27003 |
| low income          |   -1.69904 |          0.22364 |  -2.14633 |  -1.25176 |
| lower-middle income |   -1.10653 |          0.20288 |  -1.51229 |  -0.70077 |
| upper-middle income |   -0.8427  |          0.17472 |  -1.19214 |  -0.49326 |
| error variance      |    0.40676 |        nan       | nan       | nan       |

> **Figure 4**: trend lines from the fitted model at three economic freedom levels (10th, 50th, and 90th percentiles) and each income level, shown without the data scatter.

<center><img src = 'figures/fig4.svg' style = 'width:800px'></center>

A 1-unit increase in personal freedom score is associated with a 0.1832 increase in life evaluation score, after accounting for economic freedom and income level. The standard error is 0.065, so zero is not within 2 SE of the estimate. Relative to high-income countries, the income indicators shift the intercept by about -1.70 to -0.84 units, depending on the income level. The estimates for all income indicators are also more than 2 SE away from zero, indicating a difference in life evaluation score between countries at different income levels. For economic freedom, however, zero lies within 2 SE of the estimate 0.0923, so the model suggests that the difference in life evaluation score by economic freedom is not significant. Figure 4 is a visualization of this model. As the economic freedom score decreases, the relationship between personal freedom and life evaluation remains similar. The intercept differences among the panels show that life evaluation scores in high-income countries are much higher than in countries at lower income levels. The $R^2$ value indicates that a moderate share (65.46%) of the variation in life evaluation score can be explained by these variables.
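The 2 SE reasoning above can be checked directly from the estimates and standard errors in Table 3. The helper name `two_se_interval` below is introduced only for this illustration.

```python
# Estimates and standard errors copied from Table 3.
table3 = {
    "intercept": (4.37998, 0.62897),
    "pf": (0.18319, 0.06504),
    "ef": (0.09230, 0.08887),
    "low income": (-1.69904, 0.22364),
    "lower-middle income": (-1.10653, 0.20288),
    "upper-middle income": (-0.84270, 0.17472),
}

def two_se_interval(estimate, se):
    """Return (lower, upper) bounds of estimate +/- 2 standard errors."""
    return estimate - 2 * se, estimate + 2 * se

for name, (est, se) in table3.items():
    lwr, upr = two_se_interval(est, se)
    covers_zero = lwr <= 0 <= upr
    print(f"{name:>20}: [{lwr:8.5f}, {upr:8.5f}]  zero inside: {covers_zero}")
```

Only the interval for `ef` contains zero, which matches the conclusion that economic freedom is the one term without a significant association in this model.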

---
## Discussion

This project analyzed the pattern of correlation between freedom and happiness and how it differs across cultures and aspects of freedom in 2018. The analysis focused on the correlation among freedom and life evaluation scores, as well as the relationship between life satisfaction and human freedom in 2018 (Figure 1), and identified variation and regional patterns in the human freedom index data (Figure 2, Figure 3). Further, a linear regression model quantified the relationship between world happiness and freedom together with income level (Table 3), and a model visualization helped identify the major determinants of happiness (Figure 4). <br>

The analysis suggests that economic development and social progress go hand in hand with human freedom. Across different regions, there is a strong relationship between the level of freedom and income. Within any particular region, countries at higher income levels are "freer". Countries in Europe and North America tend to have higher human freedom (the average of the economic and personal freedom scores), while the difference between their personal and economic freedom scores is small. In Sub-Saharan Africa, countries' human freedom scores vary, and there can be a big difference between a country's personal and economic freedom scores. These findings highlight that human freedom is the embodiment of social progress and economic growth. Social conflict and state instability are both a cause and a reflection of limited social and economic progress. Furthermore, freedom and happiness are highly correlated, especially personal freedom. Happiness is based on social capital, not just financial capital. Although there is no strong evidence of an association between economic freedom and happiness across the globe, there appears to be a strong correlation between personal freedom and income level. <br>

Since personal freedom is strongly associated with life satisfaction, we believe that `pf_rol` and `pf_ss` possibly contribute significantly to life evaluation, although this was not analyzed here. The rule of law and security are essential to provide reasonable assurance that life is protected. Without security or the rule of law, liberty is degraded or even meaningless. A promising extension of this analysis would be to look at the effects of specific indicators of personal freedom on life evaluation.
