# Data Story Draft
* Hugo Krijgsman (14667851)
* Ingmar Hartman (15206149)
* Julius de Groot (14362104)


## Introduction
The relationship between freedom and happiness is a crucial area of investigation, especially for informing the development of effective public policies. In Western society, there is a prevalent belief that increased freedom directly correlates with higher levels of happiness. This belief drives significant efforts to enhance personal and political freedoms. However, the reality is more complex. There are countries where people face greater limitations in their freedoms but still report high levels of happiness.

This report utilizes two key datasets to explore the relationship between freedom and happiness. The first dataset, from the Human Freedom Index, measures various aspects of freedom including personal, civil, and economic dimensions. The second dataset, from the World Happiness Report, provides happiness scores based on indicators such as GDP per capita, social support, and life expectancy.

By comparing these datasets, we aim to uncover correlations and patterns that reveal how different dimensions of freedom influence overall happiness across countries.

### Perspectives
*The more freedom citizens of a country experience, the happier they are.*
* When people have more control, their overall well-being increases.
* Societies with more personal freedom provide more opportunities for personal growth.
* Countries with more economic freedom have higher standards of living.

*There is no correlation between the freedom and the happiness that citizens of a country experience.*
* ...
* ...
* ...

### Dependencies 


### Imports

In [1]:
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
import plotly.express as px
import pycountry 

from plotly.offline import init_notebook_mode
init_notebook_mode(connected=True)

## Datasets

### World Happiness Report
The World Happiness Report dataset on Kaggle contains data from the annual World Happiness Report, offering insights into the happiness levels of countries worldwide. The dataset includes information on several key indicators such as GDP per capita, social support, healthy life expectancy, freedom to make life choices, generosity, and perceptions of corruption. This comprehensive dataset allows for detailed analysis and comparison of the well-being and happiness of different nations, aiding researchers and analysts in understanding the factors that contribute to a country's overall happiness.

In [2]:
happiness_data = pd.read_csv("happiness.csv")
happiness_data["Country"] = happiness_data["Country or region"]
happiness_data = happiness_data.drop(columns=["Country or region", "Overall rank"])

happiness_data.head()

| | Score | GDP per capita | Social support | Healthy life expectancy | Freedom to make life choices | Generosity | Perceptions of corruption | Country |
|---|---|---|---|---|---|---|---|---|
| 0 | 7.769 | 1.34 | 1.587 | 0.986 | 0.596 | 0.153 | 0.393 | Finland |
| 1 | 7.6 | 1.383 | 1.573 | 0.996 | 0.592 | 0.252 | 0.41 | Denmark |
| 2 | 7.554 | 1.488 | 1.582 | 1.028 | 0.603 | 0.271 | 0.341 | Norway |
| 3 | 7.494 | 1.38 | 1.624 | 1.026 | 0.591 | 0.354 | 0.118 | Iceland |
| 4 | 7.488 | 1.396 | 1.522 | 0.999 | 0.557 | 0.322 | 0.298 | Netherlands |


### Human Freedom Index

The Human Freedom Index dataset on Kaggle includes comprehensive data on the state of human freedom globally. This dataset combines indicators of personal and economic freedom, covering a wide range of topics such as rule of law, security and safety, movement, religion, association, assembly, civil society, expression and information, identity and relationships, and economic freedoms like regulation and freedom to trade internationally. This extensive dataset facilitates in-depth analysis of the factors that influence human freedom in various countries.

In [3]:
freedom_data = pd.read_csv("freedom.csv", low_memory=False)
columns_to_drop = []

# Normalize country column
freedom_data = freedom_data.rename(columns={"countries": "Country"})
columns_to_drop.extend(["region"])

# Filter out years other than 2019
freedom_data["year"] = pd.to_numeric(freedom_data["year"], errors='coerce')
freedom_data = freedom_data[freedom_data["year"] == 2019]
columns_to_drop.append("year")

# Reset and Drop
freedom_data = freedom_data.reset_index()
columns_to_drop.append("index")
freedom_data = freedom_data.drop(columns=columns_to_drop)

freedom_data.head()

| | Country | hf_score | hf_rank | hf_quartile | pf_rol_procedural | pf_rol_civil | pf_rol_criminal | pf_rol_vdem | pf_rol | pf_ss_homicide | ... | ef_regulation_business_adm | ef_regulation_business_burden | ef_regulation_business_start | ef_regulation_business_impartial | ef_regulation_business_licensing | ef_regulation_business_compliance | ef_regulation_business | ef_regulation | ef_score | ef_rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Albania | 8.07 | 42.0 | 2.0 | 5.903741 | 4.725831 | 4.047825 | 7.375907 | 4.892466 | 9.343023 | ... | 5.651538 | 6.666667 | 9.742477 | 6.2425 | 5.62194 | 7.17525 | 6.850062 | 7.700885 | 7.79 | 31.0 |
| 1 | Algeria | 5.08 | 155.0 | 4.0 | 4.913311 | 5.503872 | 4.254187 | 5.345021 | 4.890457 | 9.613372 | ... | 4.215154 | 2.222222 | 9.305002 | 2.5775 | 8.771111 | 7.029528 | 5.686753 | 5.840164 | 4.86 | 159.0 |
| 2 | Angola | 5.96 | 127.0 | 4.0 | 2.773262 | 4.352009 | 3.47895 | 5.2643 | 3.53474 | 8.590305 | ... | 2.937894 | 2.444444 | 8.730805 | 4.7025 | 7.916416 | 6.782923 | 5.58583 | 5.974672 | 5.55 | 153.0 |
| 3 | Argentina | 7.33 | 75.0 | 2.0 | 6.824288 | 5.679943 | 4.218635 | 6.570627 | 5.574289 | 8.505814 | ... | 2.714233 | 5.777778 | 9.579288 | 6.53 | 5.726521 | 6.508295 | 6.139352 | 5.994265 | 5.44 | 154.0 |
| 4 | Armenia | 8.32 | 34.0 | 1.0 | NaN | NaN | NaN | 7.287006 | 7.287006 | 9.281977 | ... | 5.170406 | 5.555556 | 9.86353 | 6.9575 | 9.302574 | 7.040738 | 7.315051 | 7.819774 | 7.98 | 17.0 |


### Merging Both Datasets

To merge the datasets, we first filtered out irrelevant information, keeping only the data from 2019. We then assigned each country its ISO code, a unique three-letter code for every internationally recognised country, and joined the two datasets on that code so that each row holds all variables for one country. A few countries are named differently in the two sources and could not be matched automatically, so we standardized those names manually (see the name mapping in the code below).

In [4]:
# Convert a country name to its ISO 3166-1 alpha-3 code via fuzzy search.
# Returns "" when pycountry finds no match.
def getIsoCode(country_name):
    try:
        country_iso = pycountry.countries.search_fuzzy(country_name)[0]
        return country_iso.alpha_3
    except LookupError:
        return ""

# Map dataset-specific country names to a standardized name
country_name_mapping = {
    "Bahamas, The": "Bahamas",
    "Congo, Rep.": "Congo",
    "Cote d'Ivoire": "Ivory Coast",
    "Egypt, Arab Rep.": "Egypt",
    "Gambia, The": "Gambia",
    "Iran, Islamic Rep.": "Iran",
    "Korea, Rep.": "South Korea",
    "Kyrgyz Republic": "Kyrgyzstan",
    "Lao PDR": "Laos",
    "Russia": "Russian Federation",
    "Slovak Republic": "Slovakia",
    "Swaziland": "Eswatini",
    "Syria": "Syrian Arab Republic",
    "Trinidad & Tobago": "Trinidad and Tobago",
    "Venezuela, RB": "Venezuela",
    "Yemen, Rep.": "Yemen"
}

freedom_data['Country'] = freedom_data['Country'].replace(country_name_mapping)
happiness_data['Country'] = happiness_data['Country'].replace(country_name_mapping)

happiness_data["iso"] = happiness_data['Country'].apply(getIsoCode)
freedom_data["iso"] = freedom_data['Country'].apply(getIsoCode)

data = pd.merge(freedom_data, happiness_data, on='iso', how='left')
data.rename(columns={
    'hf_score': 'Human Freedom Score',
    'ef_score': 'Economic Freedom Score',
    'pf_score': 'Personal Freedom Score'
}, inplace=True)
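
The merge description mentions countries that end up without a match. One possible way to list them is the `indicator=True` option of `pd.merge`, sketched here on two small toy frames. The names `freedom_toy` and `happiness_toy`, the country "Atlantis", and all values are made up for illustration and are not part of either dataset.

```python
import pandas as pd

# Toy stand-ins for freedom_data and happiness_data (hypothetical values).
freedom_toy = pd.DataFrame({
    "Country": ["Albania", "Algeria", "Atlantis"],
    "iso": ["ALB", "DZA", ""],  # "" mimics a failed ISO lookup
})
happiness_toy = pd.DataFrame({
    "Country": ["Albania", "Algeria"],
    "iso": ["ALB", "DZA"],
    "Score": [1.0, 2.0],
})

# indicator=True adds a "_merge" column; for a left join it is
# "both" for matched rows and "left_only" for unmatched ones.
checked = pd.merge(freedom_toy, happiness_toy, on="iso", how="left", indicator=True)

# Countries present in the freedom data but without a happiness record.
unmatched = checked.loc[checked["_merge"] == "left_only", "Country_x"].tolist()
print(unmatched)
```

Note that a failed lookup yields the key `""` on both sides, so two unrelated countries whose lookups both fail would be joined to each other. Filtering out empty ISO codes before merging would avoid that.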

In [5]:
fig = px.choropleth(data, locations="iso",
                    title='Freedom around the world',
                    color="Human Freedom Score",  # hf_score was renamed after the merge
                    hover_name="Country_x",       # country name from the freedom dataset
                    color_continuous_scale=px.colors.sequential.Redor_r,
                    )

fig.show()
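
As a first step towards the stated aim of uncovering correlations, the relationship between two columns of the merged data could be summarized with a Pearson correlation coefficient. The sketch below uses the column names `Human Freedom Score` and `Score` from the merged frame, but the five rows in `toy` are hypothetical values chosen only for illustration, not results from either dataset.

```python
import pandas as pd

# Hypothetical scores for five imaginary countries; not taken from either dataset.
toy = pd.DataFrame({
    "Human Freedom Score": [5.0, 6.0, 7.0, 8.0, 9.0],
    "Score": [4.0, 5.5, 5.0, 6.5, 7.0],
})

# Series.corr computes the Pearson correlation coefficient by default.
r = toy["Human Freedom Score"].corr(toy["Score"])
print(round(r, 3))
```

On the real data the same call would be `data["Human Freedom Score"].corr(data["Score"])`; pandas excludes rows with missing values in either column from the calculation.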