# Explore the Index of Economic Freedom through time since 2013

The goal here is to explore the relationships between variables in order to explain some economic phenomena and our present situation (2020).

Most of the explanations have been taken from the Index of Economic Freedom by The Heritage Foundation.

In [1]:
from bokeh.plotting import figure
from bokeh.models.tools import HoverTool
from bokeh.models import BasicTickFormatter, ColumnDataSource, Legend, LegendItem
from bokeh.io import push_notebook, show, output_notebook
import numpy as np
import pandas as pd

output_notebook()

In [2]:
def plot_scatter_hover(df, x, y, x_title, y_title, title="", color="navy"):
    """
    Plots a scatter plot with a best-fit line and a hover widget
    
    Parameters
    ----------
    df: ColumnDataSource
        a pandas DataFrame converted to a ColumnDataSource so that
        Bokeh can interpret it
    x: str
        The name of the column used as the x axis
    y: str
        The name of the column used as the y axis
    x_title: str
        The x label shown on the plot
    y_title: str
        The y label shown on the plot
    title: str (optional)
        The title of the plot
    color: str (optional)
        The color of the points on the scatter plot
    """
    p = figure(title=title, plot_width=600, plot_height=600)

    hover = HoverTool()
    hover.tooltips=[
        ('Country name', '@country_name'),
        (x_title, f'@{x}'),
        (y_title, f'@{y}'),
    ]

    p.add_tools(hover)
    
    # determine best fit line
    x_array = np.asarray(df.data[x], dtype=float)
    y_array = np.asarray(df.data[y], dtype=float)
    par = np.polyfit(x_array, y_array, 1, full=True)
    slope=par[0][0]
    intercept=par[0][1]
    y_predicted = [slope * i + intercept  for i in x_array]

    p.xaxis.axis_label = x_title
    p.yaxis.axis_label = y_title
    p.circle(x=x, y=y, source=df, size=10, color=color, alpha=0.5)
    p.line(x_array, y_predicted, color="red")
    show(p)
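
The best-fit line above comes from a degree-1 `np.polyfit`. As a minimal sketch of that step in isolation (the points below are made up for illustration, not taken from the index data), the coefficients come back highest power first, so the slope precedes the intercept:

```python
import numpy as np

# Hypothetical points that lie exactly on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Degree-1 least-squares fit; coefficients are ordered highest power first.
slope, intercept = np.polyfit(x, y, 1)

# Vectorized prediction, equivalent to evaluating slope * i + intercept per point.
y_predicted = slope * x + intercept
```

With `full=True`, as in the function above, the same coefficients are the first element of the returned tuple.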

In [3]:
def build_comparable_data_source(df, x, y):
    """
    Returns an instance of ColumnDataSource, needed to plot a dataframe using Bokeh,
    after extracting only the columns needed to look for a correlation between the variables
    
    Parameters
    ----------
    df: Dataframe
        the original dataframe
    x: str
        the name of the column used as the x axis
    y: str
        the name of the column used as the y axis
    
    Returns
    -------
    ColumnDataSource instance
    """
    comparable_df = df[["country_name", x, y]]
    comparable_df = comparable_df.dropna()
    return ColumnDataSource(comparable_df)
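
Selecting the columns before calling `dropna()` matters. A small sketch with a made-up frame (the column values are hypothetical) shows that dropping missing values on the full frame can discard rows that are perfectly usable for the pair being compared:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the index data.
toy = pd.DataFrame({
    "country_name": ["A", "B", "C"],
    "monetary_freedom": [80.0, np.nan, 60.0],
    "inflation": [2.0, 5.0, np.nan],
    "unemployment": [np.nan, np.nan, np.nan],
})

# Restrict to the compared columns first, so gaps in unrelated
# columns (here "unemployment") do not remove otherwise valid rows.
pair = toy[["country_name", "monetary_freedom", "inflation"]].dropna()

# Dropping on the whole frame would leave nothing in this toy case.
everything = toy.dropna()
```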

## 2013 data

First of all, we need to explore the data from 2013. Before making any assumptions, we have to know the structure of the initial data set.

In [4]:
df_2013 = pd.read_csv("data/index2013_data.csv", index_col=0)

In [5]:
df_2013.head(5)

Unnamed: 0,CountryID,Country Name,WEBNAME,Region,World Rank,Region Rank,2013 Score,Change in Yearly Score from 2012,Property Rights,Change in Property Rights from 2012,...,Country,Population (Millions),"GDP (Billions, PPP)",GDP Growth Rate (%),5 Year GDP Growth Rate (%),GDP per Capita (PPP),Unemployment (%),Inflation (%),FDI Inflow (Millions),Public Debt (% of GDP)
0,1,Afghanistan,Afghanistan,Asia-Pacific,,,,,,,...,Afghanistan,31.084,29.731,5.737,10.335398,956.448,,11.247,83.411455,12.1
1,2,Albania,Albania,Europe,58.0,27.0,65.2,0.1,30.0,-5.0,...,Albania,3.218,24.91,2.0,4.431419,7741.428,13.5,3.427,1031.362818,58.923
2,3,Algeria,Algeria,Middle East / North Africa,145.0,14.0,49.6,-1.4,30.0,0.0,...,Algeria,35.954,263.661,2.47,2.718594,7333.226,10.0,4.5,2571.0,9.925
3,4,Angola,Angola,Sub-Saharan Africa,158.0,40.0,47.3,0.6,15.0,-5.0,...,Angola,19.625,115.679,3.404,8.848807,5894.617,,13.5,-5585.52927,30.897
4,5,Argentina,Argentina,South and Central America / Caribbean,160.0,27.0,46.7,-1.3,15.0,-5.0,...,Argentina,40.9,716.419,8.87,6.811779,17516.147,7.2,9.775,7243.148181,44.203


In [6]:
df_2013.tail(5)

Unnamed: 0,CountryID,Country Name,WEBNAME,Region,World Rank,Region Rank,2013 Score,Change in Yearly Score from 2012,Property Rights,Change in Property Rights from 2012,...,Country,Population (Millions),"GDP (Billions, PPP)",GDP Growth Rate (%),5 Year GDP Growth Rate (%),GDP per Capita (PPP),Unemployment (%),Inflation (%),FDI Inflow (Millions),Public Debt (% of GDP)
180,181,Yemen,Yemen,Middle East / North Africa,113.0,12.0,55.9,0.6,30.0,0.0,...,Yemen,25.13,57.966,-10.48,1.411758,2306.695,,17.61,-712.81,42.522
181,182,Zambia,Zambia,Sub-Saharan Africa,93.0,12.0,58.7,0.4,30.0,0.0,...,Zambia,13.585,21.882,6.565,6.490851,1610.722,14.0,8.659,1981.7,26.072
182,183,Zimbabwe,Zimbabwe,Sub-Saharan Africa,175.0,46.0,28.6,2.3,10.0,0.0,...,Zimbabwe,12.575,6.127,9.319,-0.145682,487.197,95.0,3.47,387.0,70.328
183,184,Somalia,Somalia,Sub-Saharan Africa,,,,,,,...,Somalia,9.1,6.1,2.6,,600.0,,,102.0,
184,185,Kosovo,Kosovo,Europe,,,,,30.0,,...,Kosovo,1.7,11.99,5.2,,7052.0,45.1,7.3,473.0,5.6


In [7]:
df_2013.dtypes

CountryID                                        int64
Country Name                                    object
WEBNAME                                         object
Region                                          object
World Rank                                     float64
Region Rank                                    float64
2013 Score                                     float64
Change in Yearly Score from 2012               float64
Property Rights                                float64
Change in Property Rights from 2012            float64
Freedom from Corruption                        float64
Change in Freedom from Corruption from 2012    float64
Fiscal Freedom                                 float64
Change in Fiscal Freedom from 2012             float64
Gov't Spending                                 float64
Change in Gov't Spending from 2012             float64
Business Freedom                               float64
Change in Business Freedom from 2012           float64
Labor Free

In [8]:
df_2013.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 185 entries, 0 to 184
Data columns (total 43 columns):
 #   Column                                       Non-Null Count  Dtype  
---  ------                                       --------------  -----  
 0   CountryID                                    185 non-null    int64  
 1   Country Name                                 185 non-null    object 
 2   WEBNAME                                      185 non-null    object 
 3   Region                                       185 non-null    object 
 4   World Rank                                   177 non-null    float64
 5   Region Rank                                  177 non-null    float64
 6   2013 Score                                   177 non-null    float64
 7   Change in Yearly Score from 2012             177 non-null    float64
 8   Property Rights                              180 non-null    float64
 9   Change in Property Rights from 2012          177 non-null    float64
 10  Fr

We drop the columns we don't need, namely the ones that record the change of each variable from 2012.

In [9]:
df_2013 = df_2013.drop(["Change in Yearly Score from 2012", "Change in Property Rights from 2012", "Change in Freedom from Corruption from 2012", "Change in Fiscal Freedom from 2012", "Change in Gov't Spending from 2012", "Change in Business Freedom from 2012", "Change in Labor Freedom from 2012", "Change in Monetary Freedom from 2012", "Change in Trade Freedom from 2012", "Change in Investment Freedom from 2012", "Change in Financial Freedom from 2012"], axis=1)

In [10]:
df_2013.head(5)

Unnamed: 0,CountryID,Country Name,WEBNAME,Region,World Rank,Region Rank,2013 Score,Property Rights,Freedom from Corruption,Fiscal Freedom,...,Country,Population (Millions),"GDP (Billions, PPP)",GDP Growth Rate (%),5 Year GDP Growth Rate (%),GDP per Capita (PPP),Unemployment (%),Inflation (%),FDI Inflow (Millions),Public Debt (% of GDP)
0,1,Afghanistan,Afghanistan,Asia-Pacific,,,,,15.0,,...,Afghanistan,31.084,29.731,5.737,10.335398,956.448,,11.247,83.411455,12.1
1,2,Albania,Albania,Europe,58.0,27.0,65.2,30.0,31.0,92.6,...,Albania,3.218,24.91,2.0,4.431419,7741.428,13.5,3.427,1031.362818,58.923
2,3,Algeria,Algeria,Middle East / North Africa,145.0,14.0,49.6,30.0,29.0,80.4,...,Algeria,35.954,263.661,2.47,2.718594,7333.226,10.0,4.5,2571.0,9.925
3,4,Angola,Angola,Sub-Saharan Africa,158.0,40.0,47.3,15.0,20.0,82.6,...,Angola,19.625,115.679,3.404,8.848807,5894.617,,13.5,-5585.52927,30.897
4,5,Argentina,Argentina,South and Central America / Caribbean,160.0,27.0,46.7,15.0,30.0,64.3,...,Argentina,40.9,716.419,8.87,6.811779,17516.147,7.2,9.775,7243.148181,44.203


In [11]:
df_2013.describe()

Unnamed: 0,CountryID,World Rank,Region Rank,2013 Score,Property Rights,Freedom from Corruption,Fiscal Freedom,Gov't Spending,Business Freedom,Labor Freedom,...,Gov't Expenditure % of GDP,Population (Millions),"GDP (Billions, PPP)",GDP Growth Rate (%),5 Year GDP Growth Rate (%),GDP per Capita (PPP),Unemployment (%),Inflation (%),FDI Inflow (Millions),Public Debt (% of GDP)
count,185.0,177.0,177.0,177.0,180.0,184.0,179.0,180.0,183.0,182.0,...,180.0,185.0,184.0,184.0,177.0,184.0,131.0,183.0,184.0,179.0
mean,93.0,89.0,19.485876,59.649718,42.972222,39.777174,77.384916,61.420556,64.34918,60.790659,...,35.048078,37.471627,430.095054,3.783196,3.631464,14813.843636,12.0,6.612339,7931.38596,47.366101
std,53.549043,51.239633,12.470013,11.62859,24.327245,21.156697,13.455964,24.160459,17.953765,16.980104,...,15.148214,137.108504,1507.2075,6.197814,3.246019,17842.352268,12.732551,5.749967,23206.072639,32.325779
min,1.0,1.0,1.0,1.5,5.0,0.0,0.0,0.0,0.0,0.0,...,10.358,0.036,0.24,-61.026,-14.668795,348.098,0.4,-0.283,-5585.52927,0.0
25%,47.0,45.0,9.0,52.3,28.75,25.0,71.3,46.65,53.6,49.175,...,25.4915,2.23,13.22925,1.797,1.445644,2654.5815,5.3,3.355,141.692614,25.9575
50%,93.0,89.0,18.0,59.6,37.5,31.5,79.3,66.75,65.6,61.95,...,33.2895,8.215,46.8095,3.979,3.55137,8322.8185,7.9,5.047,971.233393,40.901
75%,139.0,133.0,29.0,68.1,60.0,50.25,85.65,80.5,76.1,74.05,...,42.175,25.13,266.221,5.859,5.343677,20123.59325,13.6,8.4455,4203.766006,66.206
max,185.0,177.0,46.0,89.3,95.0,95.0,99.9,96.8,99.9,95.5,...,156.4,1348.121,15094.025,26.4,16.586669,124485.0,95.0,53.228,226937.0,229.773


In [12]:
df_2013.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 185 entries, 0 to 184
Data columns (total 32 columns):
 #   Column                       Non-Null Count  Dtype  
---  ------                       --------------  -----  
 0   CountryID                    185 non-null    int64  
 1   Country Name                 185 non-null    object 
 2   WEBNAME                      185 non-null    object 
 3   Region                       185 non-null    object 
 4   World Rank                   177 non-null    float64
 5   Region Rank                  177 non-null    float64
 6   2013 Score                   177 non-null    float64
 7   Property Rights              180 non-null    float64
 8   Freedom from Corruption      184 non-null    float64
 9   Fiscal Freedom               179 non-null    float64
 10  Gov't Spending               180 non-null    float64
 11  Business Freedom             183 non-null    float64
 12  Labor Freedom                182 non-null    float64
 13  Monetary Freedom    

It's important to rename all the columns so that we can work with the dataframe easily.

In [13]:
df_2013 = df_2013.rename(
    columns={
        "2013 Score": "score",
        "World Rank": "world_rank",
        "WEBNAME": "webname",
        "Region": "region",
        "Country Name": "country_name",
        "Property Rights": "property_rights",
        "Freedom from Corruption": "freedom_from_corruption",
        "Fiscal Freedom ": "fiscal_freedom",
        "Gov't Spending": "govt_spending",
        "Business Freedom": "business_freedom",
        "Labor Freedom": "labor_freedom",
        "Monetary Freedom": "monetary_freedom",
        "Trade Freedom": "trade_freedom",
        "Investment Freedom ": "investment_freedom",
        "Financial Freedom": "financial_freedom",
        "Tariff Rate (%)": "tariff_rate",
        "Income Tax Rate (%)": "income_tax_rate",
        "Corporate Tax Rate (%)": "corporate_tax_rate",
        "Tax Burden % of GDP": "tax_burden_gdp",
        "Country": "country",
        "Population (Millions)": "population_millions",
        "GDP (Billions, PPP)": "gdp",
        "GDP Growth Rate (%)": "gdp_growth_rate",
        "5 Year GDP Growth Rate (%)": "five_year_gdp_growth_rate",
        "GDP per Capita (PPP)": "gdp_per_capita",
        "Unemployment (%)": "unemployment",
        "Inflation (%)": "inflation",
        "FDI Inflow (Millions)": "fdi_inflow",
        "Public Debt (% of GDP)": "public_debt"
    }
)

It's important to know how many NaN/null values we have. Since the dataframe is small, dropping these values is the best approach for us.

In [14]:
df_2013.isnull().sum()

CountryID                       0
country_name                    0
webname                         0
region                          0
world_rank                      8
Region Rank                     8
score                           8
property_rights                 5
freedom_from_corruption         1
fiscal_freedom                  6
govt_spending                   5
business_freedom                2
labor_freedom                   3
monetary_freedom                4
trade_freedom                   5
investment_freedom              3
financial_freedom               5
tariff_rate                     5
income_tax_rate                 3
corporate_tax_rate              2
tax_burden_gdp                  4
Gov't Expenditure % of GDP      5
country                         0
population_millions             0
gdp                             1
gdp_growth_rate                 1
five_year_gdp_growth_rate       8
gdp_per_capita                  1
unemployment                   54
inflation     

## Monetary freedom and Inflation

Greater monetary freedom is expected to go together with lower inflation.

In [15]:
df_2013_monetary_freedom_v_inflation = build_comparable_data_source(df_2013, "monetary_freedom", "inflation")

In [16]:
plot_scatter_hover(df_2013_monetary_freedom_v_inflation, "monetary_freedom", "inflation", "Monetary Freedom", "Inflation %", title="Monetary Freedom and Inflation %", color="crimson")
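
The slope of the red line shows the direction of the relationship, but not how strong it is. One way to quantify it, sketched here on made-up numbers rather than the real 2013 columns, is the Pearson correlation coefficient from `Series.corr`:

```python
import pandas as pd

# Hypothetical values: inflation falls as monetary freedom rises.
toy = pd.DataFrame({
    "monetary_freedom": [60.0, 70.0, 80.0, 90.0],
    "inflation": [12.0, 9.0, 5.0, 2.0],
})

# Pearson correlation: -1 is a perfect negative linear relationship,
# 0 is no linear relationship, +1 is a perfect positive one.
r = toy["monetary_freedom"].corr(toy["inflation"])
```

A strongly negative `r` supports an association between the two variables; on its own it does not show that one causes the other.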

## Business freedom and GDP per Capita (PPP)

In [17]:
df_2013_business_freedom_v_gdp_per_capita = build_comparable_data_source(df_2013, "business_freedom", "gdp_per_capita")

In [18]:
plot_scatter_hover(df_2013_business_freedom_v_gdp_per_capita, "business_freedom", "gdp_per_capita", "Business Freedom", "GDP per Capita (PPP)", title="Business Freedom and GDP per Capita (PPP)", color="darkorange")

## Economic Freedom and Standard of Living

In [19]:
df_2013_economic_freedom_v_gdp_per_capita = build_comparable_data_source(df_2013, "score", "gdp_per_capita")

In [20]:
plot_scatter_hover(df_2013_economic_freedom_v_gdp_per_capita, "score", "gdp_per_capita", "Economic Freedom Score", "GDP per Capita (PPP)", title="Economic Freedom Score and GDP per Capita (PPP)", color="navy")

## 2019 data

In [21]:
df_2019 = pd.read_csv("data/index2019_data.csv", index_col=0)

In [22]:
df_2019.head(5)

Unnamed: 0,CountryID,Country Name,WEBNAME,Region,World Rank,Region Rank,2019 Score,Property Rights,Judical Effectiveness,Government Integrity,...,Country,Population (Millions),"GDP (Billions, PPP)",GDP Growth Rate (%),5 Year GDP Growth Rate (%),GDP per Capita (PPP),Unemployment (%),Inflation (%),FDI Inflow (Millions),Public Debt (% of GDP)
0,1,Afghanistan,Afghanistan,Asia-Pacific,152.0,39.0,51.5,19.6,29.6,25.2,...,Afghanistan,35.5,69.6,2.505,2.9,1957.58,8.8,5.0,53.9,7.3
1,2,Albania,Albania,Europe,52.0,27.0,66.5,54.8,30.6,40.4,...,Albania,2.9,36.0,3.9,2.5,12506.65,13.9,2.0,1119.1,71.2
2,3,Algeria,Algeria,Middle East and North Africa,171.0,14.0,46.2,31.6,36.2,28.9,...,Algeria,41.5,632.9,2.0,3.1,15237.2,10.0,5.6,1203.0,25.8
3,4,Angola,Angola,Sub-Saharan Africa,156.0,33.0,50.6,35.9,26.6,20.5,...,Angola,28.2,190.3,0.7,2.9,6752.58,8.2,31.7,-2254.5,65.3
4,5,Argentina,Argentina,Americas,148.0,26.0,52.2,47.8,44.5,33.5,...,Argentina,44.1,920.2,2.9,0.7,20875.76,8.7,25.7,11857.0,52.6


In [23]:
df_2019.tail(5)

Unnamed: 0,CountryID,Country Name,WEBNAME,Region,World Rank,Region Rank,2019 Score,Property Rights,Judical Effectiveness,Government Integrity,...,Country,Population (Millions),"GDP (Billions, PPP)",GDP Growth Rate (%),5 Year GDP Growth Rate (%),GDP per Capita (PPP),Unemployment (%),Inflation (%),FDI Inflow (Millions),Public Debt (% of GDP)
181,179,Venezuela,Venezuela,Americas,179.0,32.0,25.9,7.6,13.1,7.9,...,Venezuela,31.4,380.7,-14.0,-7.8,12113.54,7.7,1087.5,-68.0,34.9
182,180,Vietnam,Vietnam,Asia-Pacific,128.0,30.0,55.3,49.8,40.3,34.0,...,Vietnam,93.6,647.4,6.8,6.2,6913.13,2.1,3.5,14100.0,58.2
183,181,Yemen,Yemen,Middle East and North Africa,,,,19.6,22.2,20.3,...,Yemen,30.0,38.6,-13.8,-16.1,1287.48,14.0,4.9,-269.9,141.0
184,182,Zambia,Zambia,Sub-Saharan Africa,138.0,27.0,53.6,45.0,35.6,32.3,...,Zambia,17.2,68.9,3.6,4.0,3996.14,7.8,6.6,1091.2,62.2
185,183,Zimbabwe,Zimbabwe,Sub-Saharan Africa,175.0,45.0,40.4,29.7,24.8,15.8,...,Zimbabwe,14.9,34.0,3.0,2.6,2282.65,5.0,1.3,289.4,78.4


In [24]:
df_2019.dtypes

CountryID                        int64
Country Name                    object
WEBNAME                         object
Region                          object
World Rank                     float64
Region Rank                    float64
2019 Score                     float64
Property Rights                float64
Judical Effectiveness          float64
Government Integrity           float64
Tax Burden                     float64
Gov't Spending                 float64
Fiscal Health                  float64
Business Freedom               float64
Labor Freedom                  float64
Monetary Freedom               float64
Trade Freedom                  float64
Investment Freedom             float64
Financial Freedom              float64
Tariff Rate (%)                float64
Income Tax Rate (%)            float64
Corporate Tax Rate (%)         float64
Tax Burden % of GDP            float64
Gov't Expenditure % of GDP     float64
Country                         object
Population (Millions)    

In [25]:
df_2019.describe()

Unnamed: 0,CountryID,World Rank,Region Rank,2019 Score,Property Rights,Judical Effectiveness,Government Integrity,Tax Burden,Gov't Spending,Fiscal Health,...,Tariff Rate (%),Income Tax Rate (%),Corporate Tax Rate (%),Tax Burden % of GDP,Gov't Expenditure % of GDP,GDP Growth Rate (%),5 Year GDP Growth Rate (%),Inflation (%),FDI Inflow (Millions),Public Debt (% of GDP)
count,186.0,180.0,180.0,180.0,185.0,185.0,185.0,180.0,183.0,183.0,...,182.0,183.0,183.0,179.0,182.0,184.0,183.0,182.0,181.0,182.0
mean,93.5,90.5,20.538889,60.768333,52.327568,44.899459,41.47027,77.212778,64.203825,65.996721,...,5.987253,28.182787,23.891475,22.15581,33.863736,3.468913,2.984153,10.586264,7911.153039,56.469231
std,53.837719,52.105662,12.738611,11.255725,19.608526,18.104745,19.793193,13.208314,23.150984,31.76416,...,5.533767,13.374276,8.858419,10.153488,15.476484,5.835964,2.926503,80.507501,25984.794434,34.163855
min,1.0,1.0,1.0,5.9,7.6,5.0,7.9,0.0,0.0,0.0,...,0.0,0.0,0.0,1.6,10.6,-14.0,-16.1,-0.9,-8296.9,0.0
25%,47.25,45.75,9.75,53.95,37.0,31.0,27.2,70.975,51.7,39.9,...,2.0,20.0,20.0,14.23,24.675,1.8,1.9,1.3,213.8,34.95
50%,93.5,90.5,19.5,60.75,50.1,42.9,35.5,78.05,68.8,80.3,...,4.3,30.0,25.0,20.7,32.35,3.2,3.0,2.75,896.6,49.9
75%,139.75,135.25,31.0,67.8,65.9,54.7,50.3,85.425,82.6,91.45,...,8.775,35.0,30.0,29.85,40.225,4.62375,4.45,5.45,4046.0,70.125
max,186.0,180.0,47.0,90.2,97.4,92.4,96.7,99.8,96.6,100.0,...,50.0,60.0,50.0,47.0,139.2,70.8,9.9,1087.5,275381.0,236.4


In [26]:
df_2019.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 186 entries, 0 to 185
Data columns (total 34 columns):
 #   Column                       Non-Null Count  Dtype  
---  ------                       --------------  -----  
 0   CountryID                    186 non-null    int64  
 1   Country Name                 186 non-null    object 
 2   WEBNAME                      186 non-null    object 
 3   Region                       186 non-null    object 
 4   World Rank                   180 non-null    float64
 5   Region Rank                  180 non-null    float64
 6   2019 Score                   180 non-null    float64
 7   Property Rights              185 non-null    float64
 8   Judical Effectiveness        185 non-null    float64
 9   Government Integrity         185 non-null    float64
 10  Tax Burden                   180 non-null    float64
 11  Gov't Spending               183 non-null    float64
 12  Fiscal Health                183 non-null    float64
 13  Business Freedom    

We rename the columns so that we can work with the data easily.

In [27]:
df_2019 = df_2019.rename(
    columns={
        "2019 Score": "score",
        "World Rank": "world_rank",
        "WEBNAME": "webname",
        "Region": "region",
        "Country Name": "country_name",
        "Property Rights": "property_rights",
        "Judical Effectiveness": "judical_effectiveness",
        "Tax Burden": "tax_burden",
        "Government Integrity": "freedom_from_corruption",
        "Fiscal Health": "fiscal_freedom",
        "Gov't Spending": "govt_spending",
        "Business Freedom": "business_freedom",
        "Labor Freedom": "labor_freedom",
        "Monetary Freedom": "monetary_freedom",
        "Trade Freedom": "trade_freedom",
        "Investment Freedom ": "investment_freedom",
        "Financial Freedom": "financial_freedom",
        "Tariff Rate (%)": "tariff_rate",
        "Income Tax Rate (%)": "income_tax_rate",
        "Corporate Tax Rate (%)": "corporate_tax_rate",
        "Tax Burden % of GDP": "tax_burden_gdp",
        "Country": "country",
        "Population (Millions)": "population_millions",
        "GDP (Billions, PPP)": "gdp",
        "GDP Growth Rate (%)": "gdp_growth_rate",
        "5 Year GDP Growth Rate (%)": "five_year_gdp_growth_rate",
        "GDP per Capita (PPP)": "gdp_per_capita",
        "Unemployment (%)": "unemployment",
        "Inflation (%)": "inflation",
        "FDI Inflow (Millions)": "fdi_inflow",
        "Public Debt (% of GDP)": "public_debt"
    }
)

In [28]:
df_2019["gdp_per_capita"] = df_2019["gdp_per_capita"].replace(r'[\$,]', '', regex=True)

In [29]:
df_2019[["gdp_per_capita", "extra_info"]] = df_2019["gdp_per_capita"].str.split(' ', n=1, expand=True)

In [30]:
df_2019 = df_2019.astype({"gdp_per_capita": "float64"})
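
The three cells above turn a text column into numbers. Here is a sketch of the same steps on a tiny made-up series (the raw strings are hypothetical examples of what such a column might contain, not values copied from the CSV):

```python
import pandas as pd

# Hypothetical raw values: currency symbol, thousands separator, trailing note.
raw = pd.Series(["$12,506", "$1,700 (2015 est.)", "20875.76"])

# 1. Strip "$" and "," so only the number and any trailing note remain.
no_symbols = raw.replace(r"[\$,]", "", regex=True)

# 2. Split once on the first space: column 0 is the number, column 1 the note.
parts = no_symbols.str.split(" ", n=1, expand=True)

# 3. Convert the numeric part to floats.
gdp_per_capita = parts[0].astype("float64")
```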

How many NaN values do we have in this dataset? Let's find out!

In [31]:
df_2019.isnull().sum()

CountryID                        0
country_name                     0
webname                          0
region                           0
world_rank                       6
Region Rank                      6
score                            6
property_rights                  1
judical_effectiveness            1
freedom_from_corruption          1
tax_burden                       6
govt_spending                    3
fiscal_freedom                   3
business_freedom                 1
labor_freedom                    2
monetary_freedom                 2
trade_freedom                    4
investment_freedom               2
financial_freedom                5
tariff_rate                      4
income_tax_rate                  3
corporate_tax_rate               3
tax_burden_gdp                   7
Gov't Expenditure % of GDP       4
country                          0
population_millions              0
gdp                              1
gdp_growth_rate                  2
five_year_gdp_growth

In [32]:
df_2019.dtypes

CountryID                        int64
country_name                    object
webname                         object
region                          object
world_rank                     float64
Region Rank                    float64
score                          float64
property_rights                float64
judical_effectiveness          float64
freedom_from_corruption        float64
tax_burden                     float64
govt_spending                  float64
fiscal_freedom                 float64
business_freedom               float64
labor_freedom                  float64
monetary_freedom               float64
trade_freedom                  float64
investment_freedom             float64
financial_freedom              float64
tariff_rate                    float64
income_tax_rate                float64
corporate_tax_rate             float64
tax_burden_gdp                 float64
Gov't Expenditure % of GDP     float64
country                         object
population_millions      

## Monetary freedom and Inflation

Greater monetary freedom is expected to go together with lower inflation.

In [33]:
df_2019_monetary_freedom_v_inflation = build_comparable_data_source(df_2019, "monetary_freedom", "inflation")

In [34]:
plot_scatter_hover(df_2019_monetary_freedom_v_inflation, "monetary_freedom", "inflation", "Monetary Freedom", "Inflation %", title="Monetary Freedom and Inflation %", color="crimson")

## Business freedom vs GDP per Capita (Purchasing Power Parity)

In [35]:
df_2019_business_freedom_v_gdp_per_capita = build_comparable_data_source(df_2019, "business_freedom", "gdp_per_capita")

In [36]:
plot_scatter_hover(df_2019_business_freedom_v_gdp_per_capita, "business_freedom", "gdp_per_capita", "Business Freedom", "GDP per Capita (PPP)", title="Business Freedom vs. GDP per Capita (PPP)", color="darkorange")

## Economic Freedom and Standard of Living

In [37]:
df_2019_economic_freedom_v_gdp_per_capita = build_comparable_data_source(df_2019, "score", "gdp_per_capita")

In [38]:
plot_scatter_hover(df_2019_economic_freedom_v_gdp_per_capita, "score", "gdp_per_capita", "Economic Freedom Score", "GDP per Capita (PPP)", title="Economic Freedom Score and GDP per Capita (PPP)", color="navy")

## Combining data from 2013 to 2020

The Index of Economic Freedom strives to provide as comprehensive a view of economic freedom as possible with data that illuminate varying aspects of the rule of law, the size and scope of government, the efficiency of regulations, and the openness of the economy to global commerce.

The need to advance economic freedom is stronger than ever. Our world has experienced—is experiencing—astounding progress, yet many in countries both rich and poor are still clamoring for change. Indeed, a recurring theme of human history has been resilience and revival. History has demonstrated that free-market capitalism, built on the principles of economic freedom, can be relied upon to provide that change. It pushes out the old to make way for the new so that real and true progress can take place. It leads to innovation in all realms: better jobs, better goods and services, and better societies.

In [39]:
def drop_reference_columns(df, year):
    """
    Drops the unnecessary columns,
    those that record the change of each variable from the previous year
    
    Parameters
    ----------
    df: pandas.DataFrame
        the dataframe to clean
    year: int
        the year the dataset belongs to
    
    Returns
    -------
    df: pandas.DataFrame
        the modified dataframe
    """
    previous_year = year - 1
    try:
        df = df.drop([f"Change in Yearly Score from {previous_year}", f"Change in Property Rights from {previous_year}", f"Change in Freedom from Corruption from {previous_year}", f"Change in Fiscal Freedom from {previous_year}", f"Change in Gov't Spending from {previous_year}", f"Change in Business Freedom from {previous_year}", f"Change in Labor Freedom from {previous_year}", f"Change in Monetary Freedom from {previous_year}", f"Change in Trade Freedom from {previous_year}", f"Change in Investment Freedom from {previous_year}", f"Change in Financial Freedom from {previous_year}"], axis=1)
    except KeyError as e:
        print("No referenc columns found")
    return df
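
Note that `drop` with a list of labels raises `KeyError` if any single label is missing, in which case nothing is dropped. An alternative sketch (not what the function above does) filters on the `"Change in"` prefix instead, so whichever change columns exist are removed. The toy frame below is hypothetical:

```python
import pandas as pd

# Hypothetical frame with two "Change in ..." columns.
toy = pd.DataFrame({
    "2014 Score": [65.0],
    "Change in Yearly Score from 2013": [0.3],
    "Property Rights": [30.0],
    "Change in Property Rights from 2013": [0.0],
})

# Boolean mask over the column labels, True for the change columns.
is_change = toy.columns.str.startswith("Change in")

# Keep every column that is not a change column.
trimmed = toy.loc[:, ~is_change]
```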

In [40]:
def rename_columns(df, year):
    """
    The datasets have column names that are hard to handle,
    so they are renamed to names that are easier to reference.

    Parameters
    ----------
    df: pandas.DataFrame
        the dataframe that is going to be modified
    year: int
        the year the dataset belongs to
    
    Returns
    -------
    df: pandas.DataFrame
        the modified dataframe
    """
    df = df.rename(
        columns={
            f"{year} Score": "score",
            "World Rank": "world_rank",
            "WEBNAME": "webname",
            "Region": "region",
            "Country Name": "country_name",
            "Property Rights": "property_rights",
            "Freedom from Corruption": "freedom_from_corruption",
            "Fiscal Freedom ": "fiscal_freedom",
            "Gov't Spending": "govt_spending",
            "Business Freedom": "business_freedom",
            "Labor Freedom": "labor_freedom",
            "Monetary Freedom": "monetary_freedom",
            "Trade Freedom": "trade_freedom",
            "Investment Freedom ": "investment_freedom",
            "Financial Freedom": "financial_freedom",
            "Tariff Rate (%)": "tariff_rate",
            "Income Tax Rate (%)": "income_tax_rate",
            "Corporate Tax Rate (%)": "corporate_tax_rate",
            "Tax Burden % of GDP": "tax_burden_gdp",
            "Country": "country",
            "Population (Millions)": "population_millions",
            "GDP (Billions, PPP)": "gdp",
            "GDP Growth Rate (%)": "gdp_growth_rate",
            "5 Year GDP Growth Rate (%)": "five_year_gdp_growth_rate",
            "GDP per Capita (PPP)": "gdp_per_capita",
            "Unemployment (%)": "unemployment",
            "Inflation (%)": "inflation",
            "FDI Inflow (Millions)": "fdi_inflow_millions",
            "Public Debt (% of GDP)": "public_debt_gdp"
        }
    )
    return df

In [41]:
def extract_dataframes():
    """
    Extracts all the datasets stored as csv files
    
    Returns
    -------
    dataframes: list
        the list of dataframes extracted from the csv files
    """
    dataframes = []
    for i in range(2013, 2021):
        df = pd.read_csv(f"data/index{i}_data.csv", index_col=0)
        df = drop_reference_columns(df, i)
        df = rename_columns(df, i)
        dataframes.append(df)
    return dataframes

In [42]:
dataframes = extract_dataframes()

No referenc columns found
No referenc columns found
No referenc columns found
No referenc columns found


Combine all the datasets into a single dataframe, keyed by the year each dataset belongs to.

In [43]:
dataframes_by_year = pd.concat(dataframes, keys=list(range(2013, 2021)))
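
Passing `keys` to `pd.concat` adds an outer index level holding the year. A minimal sketch with two made-up frames shows how that level can then be used to pull out one year, or to follow one country across years:

```python
import pandas as pd

# Two hypothetical yearly datasets with the same columns.
df_a = pd.DataFrame({"webname": ["X", "Y"], "score": [70.0, 60.0]})
df_b = pd.DataFrame({"webname": ["X", "Y"], "score": [71.0, 59.0]})

# The keys become the outer level of a two-level index.
combined = pd.concat([df_a, df_b], keys=[2013, 2014])

# All rows of a single year.
year_2014 = combined.loc[2014]

# One country over time; the outer index level still carries the year.
x_scores = combined.loc[combined["webname"] == "X", "score"]
```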

## Plotting the outstanding countries

In [44]:
def outstanding_countries(df):
    """
    Returns the top ten countries in the Index of Economic Freedom by the Heritage Foundation
    
    Parameters
    ----------
    df: pandas.Dataframe
        the dataframe that combines all the historic data in a single place
    
    Returns
    -------
    list of 10 (country, mean world rank) tuples, best rank first
    """
    ranks = {}
    # Use the dataframe passed in rather than the global one
    for country in df["webname"].dropna().unique():
        mean = df.loc[df["webname"] == country]["world_rank"].mean(skipna=True)
        if not pd.isna(mean):
            ranks[country] = mean
    ranks = sorted(ranks.items(), key=lambda x: x[1])
    return ranks[:10]

In [45]:
top_countries = outstanding_countries(dataframes_by_year)
top_countries

[('HongKong', 1.125),
 ('Singapore', 1.875),
 ('NewZealand', 3.375),
 ('Australia', 4.25),
 ('Switzerland', 4.375),
 ('Canada', 7.125),
 ('Ireland', 8.0),
 ('Estonia', 9.875),
 ('UnitedKingdom', 10.625),
 ('Chile', 11.375)]
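
The same ranking can also be expressed with `groupby`, which avoids filtering the dataframe once per country. This is an alternative sketch on a made-up frame, not the function used above:

```python
import numpy as np
import pandas as pd

# Hypothetical ranks for three countries over two years; "C" is never ranked.
toy = pd.DataFrame({
    "webname": ["A", "B", "C", "A", "B", "C"],
    "world_rank": [1.0, 3.0, np.nan, 2.0, 2.0, np.nan],
})

# Mean rank per country; countries with no rank at all end up NaN and are dropped.
mean_rank = toy.groupby("webname")["world_rank"].mean().dropna()

# The n smallest means are the best-ranked countries.
top = list(mean_rank.nsmallest(2).items())
```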

In [46]:
def historic_data_by_column(df, country, column):
    """
    Returns the historic data of a column for a given country
    
    Parameters
    ----------
    df: pandas.DataFrame
        the dataframe that combines all the historic data since 2013
    country: str
        the name of the country
    column: str
        the name of the column of interest
        
    Returns
    -------
    pandas.core.series.Series
        the historic data to be plotted
    """
    # use the dataframe passed in rather than the global one
    return df.loc[df["webname"] == country, column]
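
If a country is missing from one of the yearly files, the series returned above would be shorter than the list of years it is plotted against. A possible way to guard against that, sketched here on toy data with a hypothetical helper `aligned_history`, is to keep only the year level of the index and reindex on the full list of years:

```python
import pandas as pd

# Toy combined frame: country "B" has no 2014 row (hypothetical data).
frames = [
    pd.DataFrame({"webname": ["A", "B"], "gdp_per_capita": [10.0, 20.0]}),
    pd.DataFrame({"webname": ["A"], "gdp_per_capita": [11.0]}),
]
years = [2013, 2014]
combined = pd.concat(frames, keys=years)

def aligned_history(df, country, column, years):
    series = df.loc[df["webname"] == country, column]
    # Keep only the year level of the index, then insert NaN for missing years.
    return series.droplevel(1).reindex(years)

b_history = aligned_history(combined, "B", "gdp_per_capita", years)
print(b_history)
```

This assumes each country appears at most once per year; duplicated rows would make the reindex fail.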

In [47]:
def year_x_axis():
    """
    Returns the list of years from 2013 to 2020
    
    Returns
    -------
    list
        2013-2020
    """
    return list(range(2013, 2021))

In [48]:
desirable_colors = ["darkcyan", "cyan", "crimson", "blueviolet", "orange", "sandybrown", "hotpink", "gold", "lime", "royalblue"]

## GDP per Capita of the top ten Countries

In [49]:
def plot_measure(dataframes, top_countries, measure, hover_scientific_notation=False):
    """
    Plots a multi-line chart for a given measure from the Index of Economic Freedom
    by the Heritage Foundation
    
    Parameters
    ----------
    dataframes: pandas.DataFrame
        the dataframe that concatenates all the datasets in one
    top_countries: list
        list of 10 tuples, each one with this pattern:
        (country name, average position in the rank of the index since 2013)
    measure: str
        the measure that is going to be plotted, e.g. "gdp_per_capita"
    hover_scientific_notation: bool (optional)
        if True, the hover tooltip shows the y value without the {int} format,
        otherwise the y value is displayed as an integer
    """
    p = figure(plot_width=800, plot_height=800)
    p.left[0].formatter.use_scientific = False
    
    y_axis = []
    for country in top_countries:
        y_axis.append(historic_data_by_column(dataframes, country[0], measure))

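    # one x series and one color per plotted country, instead of a hard-coded 10
    r = p.multi_line([year_x_axis()] * len(y_axis), y_axis, color=desirable_colors[:len(y_axis)], line_width=4)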
    hover = HoverTool()
    if hover_scientific_notation:
        hover.tooltips=[("(x,y)", "($x{int}, $y)")]
    else:    
        hover.tooltips=[("(x,y)", "($x{int}, $y{int})")]
    p.add_tools(hover)
    
    items = []
    for i, country in enumerate(top_countries):
        items.append(LegendItem(label=country[0], renderers=[r], index=i))

    legend = Legend(items=items, glyph_height=10, glyph_width=5,)
    p.add_layout(legend)
    
    show(p)

In [50]:
plot_measure(dataframes_by_year, top_countries, "gdp_per_capita")

## Property Rights in the top ten countries in the index

The property rights component is an assessment of the ability of individuals to accumulate private property, secured by clear laws that are fully enforced by the state. It measures the degree to which a country’s laws protect private property rights and the degree to which its government enforces those laws. It also assesses the likelihood that private property will be expropriated and analyzes the independence of the judiciary, the existence of corruption within the judiciary, and the ability of individuals and businesses to enforce contracts.

The more certain the legal protection of property, the higher a country’s score; similarly, the greater the chances of government expropriation of property, the lower a country’s score. Countries that fall between two categories may receive an intermediate score.

In [51]:
plot_measure(dataframes_by_year, top_countries, "property_rights", hover_scientific_notation=True)

## Unemployment percentage in Top Ten Countries

In [52]:
plot_measure(dataframes_by_year, top_countries, "unemployment", hover_scientific_notation=True)

## Business Freedom in Top Ten Countries

Business freedom is an overall indicator of the efficiency of government regulation of business. The quantitative score is derived from an array of measurements of the difficulty of starting, operating, and closing a business. The business freedom score for each country is a number between 0 and 100, with 100 equaling the freest business environment. The score is based on 10 factors, all weighted equally, using data from the World Bank’s Doing Business study:

   * Starting a business—procedures (number);
   * Starting a business—time (days);
   * Starting a business—cost (% of income per capita);
   * Starting a business—minimum capital (% of income per capita);
   * Obtaining a license—procedures (number);
   * Obtaining a license—time (days);
   * Obtaining a license—cost (% of income per capita);
   * Closing a business—time (years);
   * Closing a business—cost (% of estate); and
   * Closing a business—recovery rate (cents on the dollar).
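
Since the ten factors are weighted equally, the overall score amounts to an arithmetic mean of the factor scores. The sketch below assumes each factor has already been converted to a 0-100 score (that conversion is not described here); the factor names and values are hypothetical:

```python
# Hypothetical factor scores, each assumed to be already on a 0-100 scale.
factor_scores = {
    "starting_procedures": 90.0,
    "starting_time": 80.0,
    "starting_cost": 95.0,
    "starting_minimum_capital": 100.0,
    "license_procedures": 70.0,
    "license_time": 75.0,
    "license_cost": 85.0,
    "closing_time": 60.0,
    "closing_cost": 88.0,
    "closing_recovery_rate": 57.0,
}

def business_freedom_score(scores):
    # Equal weighting is the plain arithmetic mean of the factor scores.
    return sum(scores.values()) / len(scores)

print(business_freedom_score(factor_scores))
```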

In [53]:
plot_measure(dataframes_by_year, top_countries, "business_freedom", hover_scientific_notation=True)

## Monetary freedom in Top Ten Countries

Monetary freedom combines a measure of price stability with an assessment of price controls. Both inflation and price controls distort market activity. Price stability without microeconomic intervention is the ideal state for the free market.

The score for the monetary freedom component is based on two factors:

   * The weighted average inflation rate for the most recent three years and
   * Price controls.
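
A rough sketch of how the two factors could be combined follows. The weights, the square-root scaling, the coefficient and the 0-20 penalty range are assumptions that are not stated in this notebook and should be checked against the published Heritage methodology; taking the absolute value of inflation is an extra simplification added here so that the square root stays defined:

```python
import math

# Assumed constants, not stated in this notebook.
THETAS = (0.665, 0.245, 0.090)  # weights for the three most recent years, newest first
ALPHA = 6.333                   # scaling coefficient applied to the square root

def monetary_freedom(inflation_last_three, price_control_penalty):
    """inflation_last_three: percent rates, newest first; penalty: assumed 0-20 points."""
    # Weighted average inflation over the three most recent years.
    weighted = sum(t * abs(i) for t, i in zip(THETAS, inflation_last_three))
    score = 100 - ALPHA * math.sqrt(weighted) - price_control_penalty
    # Clamp to the 0-100 range used by the other components.
    return max(0.0, min(100.0, score))

print(monetary_freedom((2.0, 1.5, 1.0), price_control_penalty=5))
```

Under these assumptions, higher recent inflation and heavier price controls both lower the score, which matches the description above.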


In [54]:
plot_measure(dataframes_by_year, top_countries, "monetary_freedom", hover_scientific_notation=True)