## Table of contents <a id='TOC'></a>

### [I. Preprocessing of the data.](#section1)
### [II. Visualization of Human Freedom Index from 2008 to 2017.](#section2)
### [III. How has average worldwide human freedom changed from 2008 to 2017?](#section3)
### [IV. Is human freedom equally distributed?](#section4)
### [V. Ranking of countries on human freedom in 2017.](#section5)
### [VI. Summary: Personal Freedom Index, Economic Freedom Index, Human Freedom Index.](#section6)


In [155]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

/kaggle/input/world-population-2000-to-2019/7e38a8be-67ef-4292-ab63-be838414d5a4_Series - Metadata.csv
/kaggle/input/world-population-2000-to-2019/7e38a8be-67ef-4292-ab63-be838414d5a4_Data.csv
/kaggle/input/the-human-freedom-index/hfi_cc_2019.csv
/kaggle/input/the-human-freedom-index/hfi_cc_2018.csv


<a id='section1'></a>
### I. Preprocessing of the data.

### I am using the 2019 edition of the Human Freedom Index dataset (hfi_cc_2019.csv), in which 2017 is the most recent year.

In [156]:
df=pd.read_csv("/kaggle/input/the-human-freedom-index/hfi_cc_2019.csv",na_values=["-"])

In [157]:
df.head()

Unnamed: 0,year,ISO_code,countries,region,hf_score,hf_rank,hf_quartile,pf_rol_procedural,pf_rol_civil,pf_rol_criminal,...,ef_regulation_business_adm,ef_regulation_business_bureaucracy,ef_regulation_business_start,ef_regulation_business_bribes,ef_regulation_business_licensing,ef_regulation_business_compliance,ef_regulation_business,ef_regulation,ef_score,ef_rank
0,2017,ALB,Albania,Eastern Europe,7.84,38.0,1.0,6.7,4.5,4.7,...,6.3,6.7,9.7,4.1,6.0,7.2,6.7,7.8,7.67,30.0
1,2017,DZA,Algeria,Middle East & North Africa,4.99,155.0,4.0,,,,...,3.7,1.8,9.3,3.8,8.7,7.0,5.7,5.4,4.77,159.0
2,2017,AGO,Angola,Sub-Saharan Africa,5.4,151.0,4.0,,,,...,2.4,1.3,8.7,1.9,8.1,6.8,4.9,5.7,4.83,158.0
3,2017,ARG,Argentina,Latin America & the Caribbean,6.86,77.0,2.0,7.1,5.8,4.3,...,2.5,7.1,9.6,3.3,5.4,6.5,5.7,5.6,5.67,147.0
4,2017,ARM,Armenia,Caucasus & Central Asia,7.42,54.0,2.0,,,,...,4.6,6.2,9.9,4.6,9.3,7.1,6.9,7.5,7.7,27.0


### Check for missing values.

In [158]:
df.isna().any()

year                                 False
ISO_code                             False
countries                            False
region                               False
hf_score                              True
                                     ...  
ef_regulation_business_compliance     True
ef_regulation_business                True
ef_regulation                         True
ef_score                              True
ef_rank                               True
Length: 120, dtype: bool

### It appears that many columns have missing values.

In [160]:
df.isna().sum()/len(df)

year                                 0.000000
ISO_code                             0.000000
countries                            0.000000
region                               0.000000
hf_score                             0.049383
                                       ...   
ef_regulation_business_compliance    0.055556
ef_regulation_business               0.051852
ef_regulation                        0.049383
ef_score                             0.049383
ef_rank                              0.049383
Length: 120, dtype: float64

### We cannot just drop the rows with empty cells, since every row has empty cells. Our columns of interest are "hf_score", "pf_score" and "ef_score". We can see above that 4.9383% of the cells are empty in both the "hf_score" column and the "ef_score" column. Let's see what percentage of the cells in the "pf_score" column is empty.

In [161]:
df[["pf_score"]].isna().sum()/len(df)

pf_score    0.049383
dtype: float64
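
The identical 4.9383% share in all three score columns suggests that the same rows are missing all three scores. The sketch below shows one way to check this. It runs on a small made-up frame (the values are illustrative, not taken from the dataset), and `toy`, `score_cols`, `share_missing` and `share_all_missing` are names introduced here.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the HFI frame; values are illustrative only.
toy = pd.DataFrame({
    "countries": ["A", "B", "C", "D"],
    "hf_score": [7.8, np.nan, 5.4, 6.9],
    "pf_score": [8.0, np.nan, 6.0, 8.0],
    "ef_score": [7.7, np.nan, 4.8, 5.7],
})
score_cols = ["hf_score", "pf_score", "ef_score"]

# Share of missing cells per column (same as isna().sum()/len(toy)).
share_missing = toy[score_cols].isna().mean()

# Share of rows in which all three scores are missing at once.
share_all_missing = toy[score_cols].isna().all(axis=1).mean()

print(share_missing)
print(share_all_missing)
```

If `share_all_missing` matches the per-column shares on the real data, the gaps come from the same rows.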


### Make a dataframe with "year", "ISO_code", "countries", "region","hf_score","pf_score","ef_score".

In [162]:
df=df[["year", "ISO_code", "countries","region","hf_score","pf_score","ef_score"]]
df.head()

Unnamed: 0,year,ISO_code,countries,region,hf_score,pf_score,ef_score
0,2017,ALB,Albania,Eastern Europe,7.84,8.01,7.67
1,2017,DZA,Algeria,Middle East & North Africa,4.99,5.2,4.77
2,2017,AGO,Angola,Sub-Saharan Africa,5.4,5.98,4.83
3,2017,ARG,Argentina,Latin America & the Caribbean,6.86,8.04,5.67
4,2017,ARM,Armenia,Caucasus & Central Asia,7.42,7.15,7.7


In [164]:
df=df.fillna(df.median(numeric_only=True))  # fill gaps with each numeric column's median; text columns are skipped

In [166]:
df.isna().any()

year         False
ISO_code     False
countries    False
region       False
hf_score     False
pf_score     False
ef_score     False
dtype: bool
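
The fill step above replaces every missing score with the median of its column, pooled over all countries and years. Here is a minimal sketch of what that does, on a made-up frame (the values are illustrative; `toy`, `medians` and `filled` are names introduced here). Passing `numeric_only=True` keeps the median from being attempted on text columns such as "countries", which raises an error in recent pandas versions.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the reduced frame; values are illustrative only.
toy = pd.DataFrame({
    "countries": ["A", "B", "C", "D"],
    "hf_score": [7.8, np.nan, 5.4, 6.9],
})

# Median of the observed scores: the middle value of 5.4, 6.9 and 7.8.
medians = toy.median(numeric_only=True)

# fillna with a Series fills each column with the value at the matching label.
filled = toy.fillna(medians)
print(filled)
```

Country "B" ends up with the column median, so imputed rows sit exactly in the middle of the distribution and cannot shift the median itself, although they do pull the mean slightly toward it.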

In [167]:
import numpy as np
np.unique(df[["year"]])

array([2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017])

### The dataset covers the years from 2008 to 2017.

[Back to table of contents](#TOC)

<a id='section2'></a>
### II. Visualization of Human Freedom Index from 2008 to 2017.

In [168]:
import plotly.express as px
import plotly.graph_objects as go

In [170]:
df1=df.sort_values(by="year")
fig = px.choropleth(df1, 
                    locations ="ISO_code", 
                    color ="hf_score", 
                    hover_name ="countries",  
                    color_continuous_scale = px.colors.sequential.Plasma, 
                    scope ="world", 
                    animation_frame ="year",title="Human Freedom Index: 2008 to 2017") 
fig.show()

<a id='section3'></a>
### III. How has average worldwide human freedom changed from 2008 to 2017?

### What is the average Human Freedom Index for 2017, the most recent year in this dataset? Was there any improvement compared to the previous year?

In [172]:
y2008=df[df["year"]==2008]
y2009=df[df["year"]==2009]
y2010=df[df["year"]==2010]
y2011=df[df["year"]==2011]
y2012=df[df["year"]==2012]
y2013=df[df["year"]==2013]
y2014=df[df["year"]==2014]
y2015=df[df["year"]==2015]
y2016=df[df["year"]==2016]
y2017=df[df["year"]==2017]

In [173]:
y2017[["hf_score"]].mean()

hf_score    6.887963
dtype: float64

In [174]:
y2016[["hf_score"]].mean()

hf_score    6.892716
dtype: float64

### Not only was there no improvement in human freedom from 2016 to 2017, there was actually a slight decrease: the average HFI was 6.893 in 2016 and 6.888 in 2017.

### Let's see in numbers how worldwide freedom changed over the years and visualize how our personal, civil and economic freedoms have changed.

In [175]:
avg2008=y2008[["hf_score"]].mean()
avg2009=y2009[["hf_score"]].mean()
avg2010=y2010[["hf_score"]].mean()
avg2011=y2011[["hf_score"]].mean()
avg2012=y2012[["hf_score"]].mean()
avg2013=y2013[["hf_score"]].mean()
avg2014=y2014[["hf_score"]].mean()
avg2015=y2015[["hf_score"]].mean()
avg2016=y2016[["hf_score"]].mean()
avg2017=y2017[["hf_score"]].mean()

In [176]:
data={"year":[2008,2009,2010,2011,2012,2013,2014,2015,2016,2017],
      "Average Human Freedom Index":[7.054938, 7.059321,7.027778,7.00821,6.973704,6.968025,6.949012,
                                    6.921852,6.892716,6.887963]}
avghfi=pd.DataFrame(data=data)
avghfi

Unnamed: 0,year,Average Human Freedom Index
0,2008,7.054938
1,2009,7.059321
2,2010,7.027778
3,2011,7.00821
4,2012,6.973704
5,2013,6.968025
6,2014,6.949012
7,2015,6.921852
8,2016,6.892716
9,2017,6.887963
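
The per-year averages above were computed one year at a time and then typed into the table by hand. A `groupby` can produce the same kind of table in one step. The sketch below uses a made-up two-year frame (the scores are illustrative, not real HFI values); `toy` and `avg_by_year` are names introduced here.

```python
import pandas as pd

# Toy stand-in for df; the scores are illustrative, not real HFI values.
toy = pd.DataFrame({
    "year": [2016, 2016, 2017, 2017],
    "hf_score": [7.0, 6.0, 6.5, 6.9],
})

# One mean per year, returned as a two-column DataFrame.
avg_by_year = (
    toy.groupby("year", as_index=False)["hf_score"]
       .mean()
       .rename(columns={"hf_score": "Average Human Freedom Index"})
)
print(avg_by_year)
```

Applied to the real `df`, the same call would avoid copying rounded values by hand and would pick up any additional years automatically.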


In [177]:
fig = px.scatter(avghfi, x="year", y="Average Human Freedom Index", title="Average Human Freedom Index: 2008 to 2017",
                size="Average Human Freedom Index",color="Average Human Freedom Index")
fig.show()

### Worldwide average human freedom showed an alarming downward trend from 2008 to 2017. Our freedom is deteriorating and the world is not becoming a safer place. What are we sacrificing our personal, civil and economic freedom for?
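
To put a number on the trend, the year-over-year changes can be computed from the ten averages in the table above. This is a small sketch; `avg`, `yoy`, `n_declines` and `total_change` are names introduced here.

```python
import pandas as pd

# The ten yearly averages from the table above, indexed by year.
avg = pd.Series(
    [7.054938, 7.059321, 7.027778, 7.00821, 6.973704,
     6.968025, 6.949012, 6.921852, 6.892716, 6.887963],
    index=range(2008, 2018),
)

# Change relative to the previous year; the first year has no predecessor.
yoy = avg.diff().dropna()

# Number of years in which the average fell, and the net change over the period.
n_declines = int((yoy < 0).sum())
total_change = avg.iloc[-1] - avg.iloc[0]

print(yoy.round(6))
print(n_declines, round(total_change, 6))
```

The average fell in eight of the nine year-to-year steps, with 2009 the only year showing a small rise, for a net change of about -0.167 points between 2008 and 2017.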

[Back to table of contents](#TOC)

<a id='section4'></a>
### IV. Is human freedom equally distributed?

### In other words, is personal, civil and economic freedom proportional to the world population?

### The Human Freedom Index dataset does not contain population information. To assess whether human freedom is proportional to population, we would need to import another dataset with population information. I have added the dataset here.

### Load the population dataset.

In [180]:
df2=pd.read_csv("/kaggle/input/world-population-2000-to-2019/7e38a8be-67ef-4292-ab63-be838414d5a4_Data.csv",na_values=[".."])
df2.head()

Unnamed: 0,Series Name,Series Code,Country Name,Country Code,1990 [YR1990],2000 [YR2000],2011 [YR2011],2012 [YR2012],2013 [YR2013],2014 [YR2014],2015 [YR2015],2016 [YR2016],2017 [YR2017],2018 [YR2018],2019 [YR2019],2020 [YR2020]
0,"Population, total",SP.POP.TOTL,Afghanistan,AFG,12412308.0,20779953.0,30117413.0,31161376.0,32269589.0,33370794.0,34413603.0,35383128.0,36296400.0,37172386.0,38041754.0,
1,"Population, total",SP.POP.TOTL,Albania,ALB,3286542.0,3089027.0,2905195.0,2900401.0,2895092.0,2889104.0,2880703.0,2876101.0,2873457.0,2866376.0,2854191.0,
2,"Population, total",SP.POP.TOTL,Algeria,DZA,25758869.0,31042235.0,36661444.0,37383887.0,38140132.0,38923687.0,39728025.0,40551404.0,41389198.0,42228429.0,43053054.0,
3,"Population, total",SP.POP.TOTL,American Samoa,ASM,47347.0,57821.0,55759.0,55667.0,55713.0,55791.0,55812.0,55741.0,55620.0,55465.0,55312.0,
4,"Population, total",SP.POP.TOTL,Andorra,AND,54509.0,65390.0,83747.0,82427.0,80774.0,79213.0,78011.0,77297.0,77001.0,77006.0,77142.0,


In [181]:
populations=df2[["Country Name","2017 [YR2017]"]]
populations.head()

Unnamed: 0,Country Name,2017 [YR2017]
0,Afghanistan,36296400.0
1,Albania,2873457.0
2,Algeria,41389198.0
3,American Samoa,55620.0
4,Andorra,77001.0


In [182]:
populations.isna().sum()/len(populations)

Country Name     0.018587
2017 [YR2017]    0.026022
dtype: float64

### The population dataset does not have empty cells in every row. Hence, we are going to drop the rows with empty cells.

In [183]:
populations2=populations.dropna()

### Now, let's compare the world population distribution with the freedom distribution.

In [184]:
fig = go.Figure(data=go.Choropleth(
    locations=populations2['Country Name'], 
    z = populations2['2017 [YR2017]'].astype(float), 
    locationmode = 'country names', 
    colorscale = 'Hot',
    colorbar_title = "World population",
))

fig.update_layout(
    title_text = 'World population 2017',
    geo_scope='world'
)

fig.show()

In [185]:
fig = go.Figure(data=go.Choropleth(
    locations=y2017['countries'], 
    z = y2017['hf_score'].astype(float), 
    locationmode = 'country names', 
    colorscale = 'Hot',
    colorbar_title = "Human Freedom Index",
))

fig.update_layout(
    title_text = 'Human Freedom Index: 2017',
    geo_scope='world'
)

fig.show()

### It is a stark contrast: some of the world's most populous countries, such as China, India, Nigeria and Brazil, all had a below-average Human Freedom Index. On the other hand, far less populous countries or regions such as New Zealand, Switzerland, Australia, Canada, Denmark, Luxembourg, Finland and Germany had the highest Human Freedom Indices.
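
The comparison of the two maps can also be expressed as a number by joining the tables and weighting each country's index by its population. The sketch below is one possible approach, not part of the original analysis. It uses only the two countries that appear in both table previews above (Albania and Algeria) and assumes the "ISO_code" and "Country Code" columns use the same three-letter codes; `hfi`, `pop`, `merged`, `simple_hfi` and `weighted_hfi` are names introduced here.

```python
import numpy as np
import pandas as pd

# Two 2017 rows that appear in both table previews above.
hfi = pd.DataFrame({
    "ISO_code": ["ALB", "DZA"],
    "countries": ["Albania", "Algeria"],
    "hf_score": [7.84, 4.99],
})
pop = pd.DataFrame({
    "Country Code": ["ALB", "DZA"],
    "2017 [YR2017]": [2873457.0, 41389198.0],
})

# Join on codes rather than names, since country spellings can differ between sources.
merged = hfi.merge(pop, left_on="ISO_code", right_on="Country Code", how="inner")

# Unweighted mean treats every country equally; the weighted mean treats every person equally.
simple_hfi = merged["hf_score"].mean()
weighted_hfi = np.average(merged["hf_score"], weights=merged["2017 [YR2017]"])

print(round(simple_hfi, 3), round(weighted_hfi, 3))
```

For these two rows the unweighted mean is about 6.42 and the population-weighted mean about 5.18, because Algeria is far more populous. A weighted mean below the unweighted one on the full data would support the observation that populous countries tend to score lower.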

### The Human Freedom Index is the most comprehensive global measure of personal, civil and economic freedom. It is interesting that the countries with the highest Human Freedom Indices also had the highest Women Entrepreneurship Index.

[Back to table of contents](#TOC)

<a id='section5'></a>
### V. Ranking of countries on human freedom in 2017.

In [191]:
y20171=y2017.sort_values(by="hf_score")
fig = px.bar(y20171, x="hf_score", y="countries", orientation='h',height=1800)
fig.show()

### The Human Freedom Index (HFI) is the most comprehensive global measure of personal, civil and economic freedom. The 10 countries with the highest HFI in 2017, in descending order, were: New Zealand, Switzerland, Hong Kong, Canada, Australia, Denmark, Luxembourg, Finland, Germany and Ireland.
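
A top-N list like this can be pulled directly from the frame instead of being read off the bar chart. The sketch below uses the five 2017 rows shown in the table preview earlier, so it returns the top 3 of those five countries, not the worldwide ranking; `sample` and `top3` are names introduced here.

```python
import pandas as pd

# The five 2017 rows from the table preview earlier in the notebook.
sample = pd.DataFrame({
    "countries": ["Albania", "Algeria", "Angola", "Argentina", "Armenia"],
    "hf_score": [7.84, 4.99, 5.40, 6.86, 7.42],
})

# nlargest returns the rows with the highest scores, already in descending order.
top3 = sample.nlargest(3, "hf_score")
print(top3["countries"].tolist())
```

On the real data, `y2017.nlargest(10, "hf_score")` would give the ten highest-scoring countries.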

### Hong Kong ranked #3 in human freedom in 2017. I think we are likely to see a change in human freedom in Hong Kong, given the high-profile political events in 2020. It is very questionable whether Hong Kong will still have one of the highest Human Freedom Indices in the world. It is not unreasonable to worry about the region's future.


[Back to table of contents](#TOC)

<a id='section6'></a>
### VI. Summary: Personal Freedom Index, Economic Freedom Index, Human Freedom Index.

In [186]:
fig = px.choropleth(df1, 
                    locations ="ISO_code", 
                    color ="pf_score", 
                    hover_name ="countries",  
                    color_continuous_scale = px.colors.sequential.Plasma, 
                    scope ="world", 
                    animation_frame ="year",title="World Personal Freedom: 2008 to 2017") 
fig.show()

In [187]:
fig = px.choropleth(df1, 
                    locations ="ISO_code", 
                    color ="ef_score", 
                    hover_name ="countries",  
                    color_continuous_scale = px.colors.sequential.Plasma, 
                    scope ="world", 
                    animation_frame ="year",title="World Economic Freedom: 2008 to 2017") 
fig.show()

In [192]:
fig = px.choropleth(df1, 
                    locations ="ISO_code", 
                    color ="hf_score", 
                    hover_name ="countries",  
                    color_continuous_scale = px.colors.sequential.Plasma, 
                    scope ="world", 
                    animation_frame ="year",title="World Human Freedom: 2008 to 2017") 
fig.show()

### The countries with high human freedom index also had high personal freedom and economic freedom indices. (Anyone thinking about the perfect country to move to? 🤔)

### Please upvote if you like the analysis.