In [2]:
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import folium
from geopy.geocoders import Nominatim
%matplotlib inline

What is the World Happiness Report?

The Sustainable Development Solutions Network publishes a World Happiness Report around March 20th of each year which is recognized by the U.N. as World Happiness Day. As the website states, the report itself reflects “a worldwide demand for happiness and well-being as criteria for government policy.” The findings from the report are meant to encourage governments to prioritize happiness as a national goal on par with economic growth or national security.

How do you measure Happiness?

Happiness is a subjective term and thus can be hard to pin down with measurements. The report attempts to measure happiness with one central question: the Cantril ladder. The question asks respondents to rate their lives on a scale. They are meant to think of a ladder where the top, a score of 10, is the best possible life for them. At the bottom of the ladder, a score of 0 is the worst possible life. The respondents then respond with what step of the ladder they are currently on. The rankings are tabulated from a nationally representative sample size over the course of three years.

Six variables are used to explain the variation in scores across countries. The variables are GDP per capita, social support, healthy life expectancy, freedom, generosity, and corruption. The variables are not directly related to the Cantril ladder ranking but instead are attempts to explain which factors are impacting the score. They are calculated by using observed data and estimates of their association with life evaluations.
Bhutan was the country responsible for the idea of using happiness as a criterion for government policy and first set out to measure happiness in 1972. Eventually this culminated in the 2012 World Happiness Report and a report has been produced yearly ever since.

Questions to Answer

What are the top ten countries for each year and specifically 2023? Why? Have there been any changes in top ten over the years?

Who is in the bottom ten countries for 2023? Why? Any drastic changes from the previous year?

Create a Map with pop-up markers containing Ladder Score.

What countries had the biggest gain or drop in happiness? 

One country on the decline and One country on the rise; Breakdown Finland vs Lebanon over the years

How did the US perform? Where was the US strongest/weakest?

Who were the top ten for each category?

Is there any correlation between variables?


In [3]:
#Constructing top_ten dataframes for each year, then concatenate at the end

_2015_base = pd.read_excel('2015_clean_with_nulls.xlsx')
top_ten_2015 = _2015_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2015.rename(columns={'Ladder score': '2015 Score'}, inplace=True)
top_ten_2015 = top_ten_2015.drop(columns=top_ten_2015.columns[columns_to_drop_index])

_2016_base = pd.read_excel('2016_clean_with_nulls.xlsx')
top_ten_2016 = _2016_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2016 = top_ten_2016.drop(columns=top_ten_2016.columns[columns_to_drop_index])
top_ten_2016.rename(columns={'Happiness score': '2016 Score'}, inplace=True)


_2017_base = pd.read_excel('2017_clean_with_nulls.xlsx')
top_ten_2017 = _2017_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2017 = top_ten_2017.drop(columns=top_ten_2017.columns[columns_to_drop_index])
top_ten_2017.rename(columns={'Happiness score': '2017 Score'}, inplace=True)


_2018_base = pd.read_excel('2018_clean_with_nulls.xlsx')
top_ten_2018 = _2018_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2018 = top_ten_2018.drop(columns=top_ten_2018.columns[columns_to_drop_index])
top_ten_2018.rename(columns={'Happiness score': '2018 Score'}, inplace=True)


_2019_base = pd.read_excel('2019_clean_with_nulls.xlsx')
top_ten_2019 = _2019_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2019 = top_ten_2019.drop(columns=top_ten_2019.columns[columns_to_drop_index])
top_ten_2019.rename(columns={'Happiness score': '2019 Score'}, inplace=True)

top_ten_15_19 = pd.concat([top_ten_2015, top_ten_2016, top_ten_2017, top_ten_2018, top_ten_2019], axis=1)
top_ten_15_19

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  top_ten_2015.rename(columns={'Ladder score': '2015 Score'}, inplace=True)


Unnamed: 0,country,2015 Score,Country,2016 Score,Country.1,2017 Score,Country.2,2018 Score,Country.3,2019 Score
0,Switzerland,7.587,Denmark,7.526,Norway,7.537,Finland,7.6321,Finland,7.7689
1,Iceland,7.561,Switzerland,7.509,Denmark,7.522,Norway,7.5937,Denmark,7.6001
2,Denmark,7.527,Iceland,7.501,Iceland,7.504,Denmark,7.5553,Norway,7.5539
3,Norway,7.522,Norway,7.498,Switzerland,7.494,Iceland,7.4952,Iceland,7.4936
4,Canada,7.427,Finland,7.413,Finland,7.469,Switzerland,7.4873,Netherlands,7.4876
5,Finland,7.406,Canada,7.404,Netherlands,7.377,Netherlands,7.4413,Switzerland,7.4802
6,Netherlands,7.378,Netherlands,7.339,Canada,7.316,Canada,7.3285,Sweden,7.3433
7,Sweden,7.364,New Zealand,7.334,New Zealand,7.314,New Zealand,7.3238,New Zealand,7.3075
8,New Zealand,7.286,Australia,7.313,Sweden,7.284,Sweden,7.3145,Canada,7.2781
9,Australia,7.284,Sweden,7.291,Australia,7.284,Australia,7.2721,Austria,7.246


The above cell output shows Top Ten scores for the years 2015 to 2019

In [4]:
#Same procedure as above except 2020 to 2023

_2020_base = pd.read_excel('2020_clean_with_nulls.xlsx')
top_ten_2020 = _2020_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2020 = top_ten_2020.drop(columns=top_ten_2020.columns[columns_to_drop_index])
top_ten_2020.rename(columns={'Ladder score': '2020 Score'}, inplace=True)

_2021_base = pd.read_excel('2021_clean_with_nulls.xlsx')
top_ten_2021 = _2021_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2021 = top_ten_2021.drop(columns=top_ten_2021.columns[columns_to_drop_index])
top_ten_2021.rename(columns={'Ladder score': '2021 Score'}, inplace=True)

_2022_base = pd.read_excel('2022_clean_with_nulls.xlsx')
top_ten_2022 = _2022_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2022 = top_ten_2022.drop(columns=top_ten_2022.columns[columns_to_drop_index])
top_ten_2022.rename(columns={'Happiness score': '2022 Score'}, inplace=True)

_2023_base = pd.read_excel('2023_clean_with_nulls.xlsx')
top_ten_2023 = _2023_base.head(10)
columns_to_drop_index = [2,3,4,5,6,7]
top_ten_2023 = top_ten_2023.drop(columns=top_ten_2023.columns[columns_to_drop_index])
top_ten_2023.rename(columns={'Ladder score': '2023 Score'}, inplace=True)

top_ten_20_23 = pd.concat([top_ten_2020, top_ten_2021, top_ten_2022, top_ten_2023], axis=1)
top_ten_20_23

Unnamed: 0,Country name,2020 Score,Country name.1,2021 Score,Country,2022 Score,Country name.2,2023 Score
0,Finland,7.8087,Finland,7.8421,Finland,7.821,Finland,7.8042
1,Denmark,7.6456,Denmark,7.6195,Denmark,7.6362,Denmark,7.5864
2,Switzerland,7.5599,Switzerland,7.5715,Iceland,7.5575,Iceland,7.5296
3,Iceland,7.5045,Iceland,7.5539,Switzerland,7.5116,Israel,7.4729
4,Norway,7.488,Netherlands,7.464,Netherlands,7.4149,Netherlands,7.403
5,Netherlands,7.4489,Norway,7.3925,Luxembourg*,7.404,Sweden,7.3952
6,Sweden,7.3535,Sweden,7.3627,Sweden,7.3843,Norway,7.3155
7,New Zealand,7.2996,Luxembourg,7.3244,Norway,7.3651,Switzerland,7.2401
8,Austria,7.2942,New Zealand,7.2766,Israel,7.3638,Luxembourg,7.2279
9,Luxembourg,7.2375,Austria,7.2678,New Zealand,7.1998,New Zealand,7.1229


The output for the previous cells shows the top ten scores for each year from 2015 to 2023.

One aspect that is striking is how many Scandanavian countries make the list. They regulary encompass half the list.

Finland has been ranked happiest country six years running. Israel and Luxembourg are recent risers into the top ten.



In [5]:
all_years = pd.read_excel('All_Years_per_Country.xlsx')
#all_years.head(5)

In [6]:
#For Loops that replace NaN values with the average of the previous three years

column_names = ['Social support', 'Healthy life expectancy at birth', 'Freedom to make life choices', 'Generosity', 'Perceptions of corruption']

for j in column_names:
    for i in range(2, len(all_years[j])):
        if pd.isna(all_years.at[i, j]):
            average_previous_3 = all_years.loc[i-2:i-1, j].mean()
            all_years.at[i, j] = average_previous_3

#all_years.head(5)

In [7]:
group_by_country = all_years.groupby(by='Country name').mean()

The following output cell shows the average scores by Country going back as far as they have been giving scores.

The top ten can be seen below. The top ten is largely the same except that Denmark has had the happiest Ladder score on average.

In [8]:
#Average of all years scores. Similar top ten but Denmark is happiest on average
group_by_country.sort_values(by='Life Ladder', ascending=False).head(10)

Unnamed: 0_level_0,year,Life Ladder,Log GDP per capita,Social support,Healthy life expectancy at birth,Freedom to make life choices,Generosity,Perceptions of corruption
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
Denmark,2013.941176,7.673428,10.890561,0.957257,70.110588,0.943526,0.165567,0.198935
Finland,2014.8,7.619146,10.758259,0.952087,70.385334,0.942529,0.003297,0.241205
Norway,2015.666667,7.48182,11.063554,0.948174,70.983334,0.951594,0.137959,0.33488
Switzerland,2015.75,7.474483,11.134667,0.937606,72.071666,0.917081,0.095792,0.298382
Iceland,2016.3,7.458607,10.88226,0.978529,71.8425,0.931095,0.250788,0.696371
Netherlands,2014.25,7.451992,10.892281,0.932308,71.13125,0.901849,0.261463,0.416522
Sweden,2013.941176,7.377259,10.817726,0.92934,71.497648,0.931547,0.138364,0.285677
Canada,2013.941176,7.323657,10.753692,0.9263,71.02,0.920971,0.212636,0.417187
New Zealand,2014.3125,7.278497,10.607145,0.952855,69.9825,0.916543,0.245216,0.266186
Israel,2014.0,7.265653,10.515885,0.907165,71.835294,0.732031,0.105356,0.820234


Top ten countries for 2023 based on percentile ranks. The top countries typically are scoring in the top ten percentile per category.

In [9]:
#Based on Percentile Rankings, Top ten ranked countries scored in Top ten on average for each category except Freedom to make life choices and Generosity

_2023_base_rank = _2023_base

def format_as_percentage(value):
    return '{:.2%}'.format(value)

_2023_base_rank.set_index('Country name', inplace=True)
#columns_to_rank = ['Ladder score', 'Explained by: Log GDP per capita', 'Explained by: Social support', 'Explained by: Healthy life expectancy', 
                    #'Explained by: Freedom to make life choices', 'Explained by: Generosity', 'Explained by: Perceptions of corruption']
unformatted_2023_df = _2023_base_rank.rank(pct= True, ascending= False)
formatted_2023_df = _2023_base_rank.rank(numeric_only=True, pct=True, ascending=False).applymap(format_as_percentage)

top_ten_avg_pctile = unformatted_2023_df.head(10)
top_ten_avg_pctile.head()

Unnamed: 0_level_0,Ladder score,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Finland,0.007299,0.131387,0.014599,0.169118,0.007299,0.583942,0.014599
Denmark,0.014599,0.065693,0.021898,0.154412,0.080292,0.218978,0.021898
Iceland,0.021898,0.087591,0.007299,0.080882,0.065693,0.094891,0.255474
Israel,0.029197,0.19708,0.072993,0.051471,0.452555,0.591241,0.321168
Netherlands,0.036496,0.072993,0.138686,0.110294,0.19708,0.087591,0.072993


Where each 2023 top ten country ranks in terms of percentiles formatted with %. Cursory glance suggests Generosity not a large factor in explaining Ladder score.

In [10]:
#Where each 2023 top ten country ranks in terms of percentiles formatted with %. Cursory glance suggests Generosity not a large factor in explaining Ladder score.

formatted_2023_df.head(10)

Unnamed: 0_level_0,Ladder score,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Finland,0.73%,13.14%,1.46%,16.91%,0.73%,58.39%,1.46%
Denmark,1.46%,6.57%,2.19%,15.44%,8.03%,21.90%,2.19%
Iceland,2.19%,8.76%,0.73%,8.09%,6.57%,9.49%,25.55%
Israel,2.92%,19.71%,7.30%,5.15%,45.26%,59.12%,32.12%
Netherlands,3.65%,7.30%,13.87%,11.03%,19.71%,8.76%,7.30%
Sweden,4.38%,9.49%,9.49%,7.35%,2.19%,16.06%,2.92%
Norway,5.11%,4.38%,6.57%,11.76%,2.92%,19.71%,5.11%
Switzerland,5.84%,2.92%,16.79%,3.68%,17.52%,41.61%,3.65%
Luxembourg,6.57%,0.73%,35.04%,9.56%,11.68%,43.07%,5.84%
New Zealand,7.30%,18.25%,4.38%,20.59%,20.44%,14.60%,4.38%


The following cell outputs show the Top ten Countries for each category in 2023. The first table shows the Top ten Ladder Scores for 2023. 

In [11]:
ranks_2023_df = _2023_base_rank.rank(ascending=False)
ranks_2023_df.head(10)

Unnamed: 0_level_0,Ladder score,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Finland,1.0,18.0,2.0,23.0,1.0,80.0,2.0
Denmark,2.0,9.0,3.0,21.0,11.0,30.0,3.0
Iceland,3.0,12.0,1.0,11.0,9.0,13.0,35.0
Israel,4.0,27.0,10.0,7.0,62.0,81.0,44.0
Netherlands,5.0,10.0,19.0,15.0,27.0,12.0,10.0
Sweden,6.0,13.0,13.0,10.0,3.0,22.0,4.0
Norway,7.0,6.0,9.0,16.0,4.0,27.0,7.0
Switzerland,8.0,4.0,23.0,5.0,24.0,57.0,5.0
Luxembourg,9.0,1.0,48.0,13.0,16.0,59.0,8.0
New Zealand,10.0,25.0,6.0,28.0,28.0,20.0,6.0


Top ten ranks by Log GDP per capita

In [12]:
#Top ten by Log GDP per capita
gdp_rank = ranks_2023_df.sort_values(by='Explained by: Log GDP per capita').head(10)
columns_to_drop1 = ['Explained by: Social support', 'Explained by: Healthy life expectancy', 'Explained by: Freedom to make life choices', 'Explained by: Generosity', 'Explained by: Perceptions of corruption']
gdp_rank.drop(columns_to_drop1, axis=1, inplace=True)
gdp_rank

Unnamed: 0_level_0,Ladder score,Explained by: Log GDP per capita
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1
Luxembourg,9.0,1.0
Singapore,25.0,2.0
Ireland,14.0,3.0
Switzerland,8.0,4.0
United Arab Emirates,26.0,5.0
Norway,7.0,6.0
United States,15.0,7.0
Hong Kong S.A.R. of China,82.0,8.0
Denmark,2.0,9.0
Netherlands,5.0,10.0


Top ten by Social Support

In [13]:
#Top ten by Social Support
social_rank = ranks_2023_df.sort_values(by='Explained by: Social support').head(10)
columns_to_drop2 = ['Explained by: Log GDP per capita', 'Explained by: Healthy life expectancy', 'Explained by: Freedom to make life choices', 'Explained by: Generosity', 'Explained by: Perceptions of corruption']
social_rank.drop(columns_to_drop2, axis=1, inplace=True)
social_rank

Unnamed: 0_level_0,Ladder score,Explained by: Social support
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1
Iceland,3.0,1.0
Finland,1.0,2.0
Denmark,2.0,3.0
Czechia,18.0,4.0
Slovakia,29.0,5.0
New Zealand,10.0,6.0
Slovenia,22.0,7.0
Estonia,31.0,8.0
Norway,7.0,9.0
Israel,4.0,10.0


Top ten by Healthy Life Expectancy

In [14]:
#Top ten by Healthy Lifee Expectancy
health_rank = ranks_2023_df.sort_values(by='Explained by: Healthy life expectancy').head(10)
columns_to_drop3 = ['Explained by: Log GDP per capita', 'Explained by: Social support', 'Explained by: Freedom to make life choices', 'Explained by: Generosity', 'Explained by: Perceptions of corruption']
health_rank.drop(columns_to_drop3, axis=1, inplace=True)
health_rank

Unnamed: 0_level_0,Ladder score,Explained by: Healthy life expectancy
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1
Hong Kong S.A.R. of China,82.0,1.0
Japan,47.0,2.0
Singapore,25.0,3.0
South Korea,57.0,4.0
Switzerland,8.0,5.0
Cyprus,46.0,6.0
Israel,4.0,7.0
Spain,32.0,8.0
France,21.0,9.0
Sweden,6.0,10.0


Top ten by Freedom to make life choices

In [15]:
#Top ten by Freedom to make life choices
choice_rank = ranks_2023_df.sort_values(by='Explained by: Freedom to make life choices').head(10)
columns_to_drop4 = ['Explained by: Log GDP per capita', 'Explained by: Social support', 'Explained by: Healthy life expectancy', 'Explained by: Generosity', 'Explained by: Perceptions of corruption']
choice_rank.drop(columns_to_drop4, axis=1, inplace=True)
choice_rank

Unnamed: 0_level_0,Ladder score,Explained by: Freedom to make life choices
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1
Finland,1.0,1.0
Cambodia,115.0,2.0
Sweden,6.0,3.0
Norway,7.0,4.0
Bahrain,42.0,5.0
United Arab Emirates,26.0,6.0
Vietnam,65.0,7.0
Uzbekistan,54.0,8.0
Iceland,3.0,9.0
Kyrgyzstan,62.0,10.0


Top ten by Generosity

In [16]:
#Top ten by Generosity
generous_rank = ranks_2023_df.sort_values(by='Explained by: Generosity').head(10)
columns_to_drop5 = ['Explained by: Log GDP per capita', 'Explained by: Social support', 'Explained by: Healthy life expectancy', 'Explained by: Freedom to make life choices', 'Explained by: Perceptions of corruption']
generous_rank.drop(columns_to_drop5, axis=1, inplace=True)
generous_rank

Unnamed: 0_level_0,Ladder score,Explained by: Generosity
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1
Indonesia,84.0,1.0
Myanmar,117.0,2.0
Gambia,119.0,3.0
Thailand,60.0,4.0
Kenya,111.0,5.0
Ethiopia,124.0,6.0
Kosovo,34.0,7.0
United Kingdom,19.0,8.0
Ukraine,92.0,9.0
Uzbekistan,54.0,10.0


Top ten by Perceptions of corruption

In [17]:
#Top ten by Perceptions of corruption
corruption_rank = ranks_2023_df.sort_values(by='Explained by: Perceptions of corruption').head(10)
columns_to_drop6 = ['Explained by: Log GDP per capita', 'Explained by: Social support', 'Explained by: Healthy life expectancy', 'Explained by: Freedom to make life choices', 'Explained by: Generosity']
corruption_rank.drop(columns_to_drop6, axis=1, inplace=True)
corruption_rank

Unnamed: 0_level_0,Ladder score,Explained by: Perceptions of corruption
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1
Singapore,25.0,1.0
Finland,1.0,2.0
Denmark,2.0,3.0
Sweden,6.0,4.0
Switzerland,8.0,5.0
New Zealand,10.0,6.0
Norway,7.0,7.0
Luxembourg,9.0,8.0
Ireland,14.0,9.0
Netherlands,5.0,10.0


In [18]:
def get_coordinates(country_name):
    geolocator = Nominatim(user_agent="my_geocoder", timeout=10)
    location = geolocator.geocode(country_name, addressdetails=True)
    if location:
        country_coordinates = location.raw.get('lat', None), location.raw.get('lon', None)
        
        return country_coordinates
    else:
        return None

In [19]:
def fill_coordinates(country_name):
    #country_code = extract_code(country_name)
    coordinates = get_coordinates(country_name)
    if coordinates is not None:
        return pd.Series({'Latitude': coordinates[0], 'Longitude': coordinates[1]})
    else:
        return pd.Series({'Latitude': np.nan, 'Longitude': np.nan})

In [20]:
_2023_base[['Latitude', 'Longitude']] = _2023_base.index.to_series().apply(fill_coordinates)

In [21]:
any_nan_values = _2023_base.loc[_2023_base['Latitude'].isna()]
any_nan_values

Unnamed: 0_level_0,Ladder score,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Latitude,Longitude
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
Taiwan Province of China,6.5354,1.890335,1.371842,0.492341,0.561938,0.067239,0.177711,,
Hong Kong S.A.R. of China,5.3085,1.95094,1.201336,0.70159,0.406797,0.122539,0.389993,,


In [22]:
_2023_base = _2023_base.drop('Taiwan Province of China', axis=0)
_2023_base = _2023_base.drop('Hong Kong S.A.R. of China', axis=0)

In [23]:
#Correcting Latitudes and Longitudes manually
new_lat = 30.5852
old_lat = _2023_base.at['Jordan', 'Latitude']
new_long = 36.2384
old_long = _2023_base.at['Jordan', 'Longitude']
_2023_base['Latitude'].replace(old_lat, new_lat, inplace=True)
_2023_base['Longitude'].replace(old_long, new_long, inplace=True)

mong_lat = 46.8625
old_mong_lat = _2023_base.at['Mongolia', 'Latitude']
mong_long = 103.8467
old_mong_long = _2023_base.at['Mongolia', 'Longitude']
_2023_base['Latitude'].replace(old_mong_lat, mong_lat, inplace=True)
_2023_base['Longitude'].replace(old_mong_long, mong_long, inplace=True)

greece_lat = 39.0742
old_greece_lat = _2023_base.at['Greece', 'Latitude']
greece_long = 21.8243
old_greece_long = _2023_base.at['Greece', 'Longitude']
_2023_base['Latitude'].replace(old_greece_lat, greece_lat, inplace=True)
_2023_base['Longitude'].replace(old_greece_long, greece_long, inplace=True)

hungary_lat = 47.1625
old_hungary_lat = _2023_base.at['Hungary', 'Latitude']
hungary_long = 19.5033
old_hungary_long = _2023_base.at['Hungary', 'Longitude']
_2023_base['Latitude'].replace(old_hungary_lat, hungary_lat, inplace=True)
_2023_base['Longitude'].replace(old_hungary_long, hungary_long, inplace=True)

germany_lat = 51.1657
old_germany_lat = _2023_base.at['Germany', 'Latitude']
germany_long = 10.4515
old_germany_long = _2023_base.at['Germany', 'Longitude']
_2023_base['Latitude'].replace(old_germany_lat, germany_lat, inplace=True)
_2023_base['Longitude'].replace(old_germany_long, germany_long, inplace=True)

lebanon_lat = 33.8547
old_lebanon_lat = _2023_base.at['Lebanon', 'Latitude']
lebanon_long = 35.8623
old_lebanon_long = _2023_base.at['Lebanon', 'Longitude']
_2023_base['Latitude'].replace(old_lebanon_lat, lebanon_lat, inplace=True)
_2023_base['Longitude'].replace(old_lebanon_long, lebanon_long, inplace=True)

georgia_lat = 42.3154
old_georgia_lat = _2023_base.at['Georgia', 'Latitude']
georgia_long = 43.3569
old_georgia_long = _2023_base.at['Georgia', 'Longitude']
_2023_base['Latitude'].replace(old_georgia_lat, georgia_lat, inplace=True)
_2023_base['Longitude'].replace(old_georgia_long, georgia_long, inplace=True)

In [24]:
mymap = folium.Map(location = [40, -74], zoom_start = 3)

for index, row in _2023_base.iterrows():
    popup_content = f"Country: {index}<br>Ladder Score: {row['Ladder score']}"
    folium.Marker(location=[row['Latitude'], row['Longitude']],
                  popup=popup_content).add_to(mymap)

In [25]:
mymap

In [27]:
#Write markdown cell explaining US scores breakdown
ranks_2023_df.loc['United States']

Ladder score                                  15.0
Explained by: Log GDP per capita               7.0
Explained by: Social support                  24.0
Explained by: Healthy life expectancy         68.0
Explained by: Freedom to make life choices    70.0
Explained by: Generosity                      29.0
Explained by: Perceptions of corruption       38.0
Name: United States, dtype: float64

In [28]:
formatted_2023_df.loc['United States']

Ladder score                                  10.95%
Explained by: Log GDP per capita               5.11%
Explained by: Social support                  17.52%
Explained by: Healthy life expectancy         50.00%
Explained by: Freedom to make life choices    51.09%
Explained by: Generosity                      21.17%
Explained by: Perceptions of corruption       27.74%
Name: United States, dtype: object

Based on 2023 scores, which rely on the scores from 2020-2022, the US was in the top 11% for Ladder score. The ladder rank for 2023 was 15th. The US excelled in the economic category and perfomed decently in social support, 
generosity, and perceptions of corruption. However, they did not perform as well in those categories as the top ten countries did. The US lagged in the Healthy life expectancy and freedom to make life choices categories. In terms of the freedom to make life choices category, perhaps there are too many economic or social barriers in place that prevent them from making the choices they want. 

These are two areas that the federal or state governments can investigate and try to implement policies to improve these rankings. For example, start initiatives that encourage or even reward Americans who engage in exercise and physical activity in order to raise the Healthy life expectancy ranking. Social support generally can help increase Freedom to make life choices rankings. For example, provide scholarships for someone to learn a technical trade and try and bolster the American manufacturing industry at the same time. 

The point is that Governments can use the Happiness report to target areas that will raise the happiness score for it's citizens generally. Happiness can be used as political policy just as much as economics and other areas.

In [29]:
_2015_base.head()

Unnamed: 0,country,Ladder score,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption
0,Switzerland,7.587,1.396505,1.349505,0.941432,0.665573,0.296775,0.419777
1,Iceland,7.561,1.302324,1.402231,0.947844,0.628772,0.436297,0.141451
2,Denmark,7.527,1.325478,1.360581,0.874641,0.649384,0.341386,0.483573
3,Norway,7.522,1.458997,1.330955,0.885209,0.669732,0.346989,0.365034
4,Canada,7.427,1.326292,1.322608,0.905631,0.632968,0.458109,0.329573


In [30]:
columns_to_use = ['country', 'Ladder score']
ladder_difference = _2015_base[columns_to_use]
ladder_difference['year'] = [2015]*len(_2015_base)
ladder_difference = ladder_difference.set_index('country')
ladder_difference.head()

ladder_diff = _2023_base[['Ladder score']]
ladder_diff['year'] = [2023]*len(_2023_base)

total_diff = pd.merge(ladder_difference, ladder_diff, left_index=True, right_index=True)
total_diff['_15_to_23_diff'] = total_diff['Ladder score_y'] - total_diff['Ladder score_x']
total_diff.head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  ladder_difference['year'] = [2015]*len(_2015_base)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  ladder_diff['year'] = [2023]*len(_2023_base)


Unnamed: 0,Ladder score_x,year_x,Ladder score_y,year_y,_15_to_23_diff
Switzerland,7.587,2015,7.2401,2023,-0.3469
Iceland,7.561,2015,7.5296,2023,-0.0314
Denmark,7.527,2015,7.5864,2023,0.0594
Norway,7.522,2015,7.3155,2023,-0.2065
Canada,7.427,2015,6.9607,2023,-0.4663


In [48]:
total_diff.head(10)

Unnamed: 0,Ladder score_x,year_x,Ladder score_y,year_y,_15_to_23_diff
Switzerland,7.587,2015,7.2401,2023,-0.3469
Iceland,7.561,2015,7.5296,2023,-0.0314
Denmark,7.527,2015,7.5864,2023,0.0594
Norway,7.522,2015,7.3155,2023,-0.2065
Canada,7.427,2015,6.9607,2023,-0.4663
Finland,7.406,2015,7.8042,2023,0.3982
Netherlands,7.378,2015,7.403,2023,0.025
Sweden,7.364,2015,7.3952,2023,0.0312
New Zealand,7.286,2015,7.1229,2023,-0.1631
Australia,7.284,2015,7.0946,2023,-0.1894


Based on the below code, there were 4 outlier countries in terms of their score dropping as can be seen in the output. There were no outliers for the upper bound of score gained

In [32]:

q1 = np.percentile(total_diff['_15_to_23_diff'], 25)
q3 = np.percentile(total_diff['_15_to_23_diff'], 75)
iqr = q3 - q1

lower_bound = q1 - 1.5*iqr
upper_bound = q3 + 1.5*iqr


outliers_df = total_diff[total_diff['_15_to_23_diff'] < lower_bound]
outliers_df

Unnamed: 0,Ladder score_x,year_x,Ladder score_y,year_y,_15_to_23_diff
Venezuela,6.81,2015,5.2106,2023,-1.5994
Lebanon,4.839,2015,2.3922,2023,-2.4468
Zimbabwe,4.61,2015,3.2035,2023,-1.4065
Afghanistan,3.575,2015,1.859,2023,-1.716


The following code cell output shows the countries who had the biggest drop in score from 2015 to 2023.

Lebanon had the biggest drop in score from 2015. Lebanon scored badly in the categories of Ladder score, Social Support, Freedom to make life choices, Generosity, and Perceptions of Corruption.

In [33]:
total_diff.sort_values(by='_15_to_23_diff').head(10)

Unnamed: 0,Ladder score_x,year_x,Ladder score_y,year_y,_15_to_23_diff
Lebanon,4.839,2015,2.3922,2023,-2.4468
Afghanistan,3.575,2015,1.859,2023,-1.716
Venezuela,6.81,2015,5.2106,2023,-1.5994
Zimbabwe,4.61,2015,3.2035,2023,-1.4065
Sierra Leone,4.507,2015,3.1376,2023,-1.3694
Congo (Kinshasa),4.517,2015,3.2072,2023,-1.3098
Zambia,5.129,2015,3.9822,2023,-1.1468
Jordan,5.192,2015,4.1198,2023,-1.0722
Botswana,4.332,2015,3.4353,2023,-0.8967
Brazil,6.983,2015,6.1246,2023,-0.8584


In [34]:
_2023_base.rank(pct=True, ascending=False).loc['Lebanon']

  _2023_base.rank(pct=True, ascending=False).loc['Lebanon']


Ladder score                                  0.992593
Explained by: Log GDP per capita              0.533333
Explained by: Social support                  0.977778
Explained by: Healthy life expectancy         0.477612
Explained by: Freedom to make life choices    0.985185
Explained by: Generosity                      0.903704
Explained by: Perceptions of corruption       0.896296
Name: Lebanon, dtype: float64

In [35]:
_2023_base_rank.rank(ascending=False).tail()

Unnamed: 0_level_0,Ladder score,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Latitude,Longitude
Country name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
Congo (Kinshasa),133.0,136.0,115.0,127.0,119.0,45.0,94.0,122.0,48.0
Zimbabwe,134.0,127.0,110.0,134.0,124.0,91.0,64.0,126.0,38.0
Sierra Leone,135.0,128.0,130.0,129.0,120.0,37.0,111.0,5.0,129.0
Lebanon,136.0,74.0,134.0,66.0,135.0,124.0,123.0,52.0,103.0
Afghanistan,137.0,129.0,137.0,132.0,137.0,107.0,106.0,70.0,13.0


The following output shows the biggest gain in happiness since 2015. The countries with the biggest gains typically were in Europe or Africa.

In [36]:
total_diff.sort_values(by='_15_to_23_diff', ascending=False).head(10)

Unnamed: 0,Ladder score_x,year_x,Ladder score_y,year_y,_15_to_23_diff
Romania,5.124,2015,6.5891,2023,1.4651
Guinea,3.656,2015,5.0717,2023,1.4157
Ivory Coast,3.655,2015,5.0527,2023,1.3977
Togo,2.839,2015,4.1374,2023,1.2984
Congo (Brazzaville),3.989,2015,5.2671,2023,1.2781
Bulgaria,4.218,2015,5.4661,2023,1.2481
Hungary,4.8,2015,6.0412,2023,1.2412
Honduras,4.788,2015,6.0225,2023,1.2345
Gabon,3.896,2015,5.0347,2023,1.1387
Latvia,5.098,2015,6.2127,2023,1.1147


In [37]:
_2023_base.rank(pct=True, ascending=False).loc['Romania']

  _2023_base.rank(pct=True, ascending=False).loc['Romania']


Ladder score                                  0.177778
Explained by: Log GDP per capita              0.296296
Explained by: Social support                  0.444444
Explained by: Healthy life expectancy         0.402985
Explained by: Freedom to make life choices    0.303704
Explained by: Generosity                      0.933333
Explained by: Perceptions of corruption       1.000000
Name: Romania, dtype: float64

A Tale of Two Countries.

Finland had the highest ladder score for six years running and Lebanon had the biggest drop in Ladder score since 2015.


In [38]:
finland23 = _2023_base.loc['Finland']
finland23 = pd.DataFrame(data=finland23)


finland15 = _2015_base[_2015_base['country'] == 'Finland']
finland15.set_index('country', inplace=True)
finland15 = finland15.T
finland15

country,Finland
Ladder score,7.406
Explained by: Log GDP per capita,1.290249
Explained by: Social support,1.318263
Explained by: Healthy life expectancy,0.889111
Explained by: Freedom to make life choices,0.641693
Explained by: Generosity,0.233508
Explained by: Perceptions of corruption,0.413718


Finland has been on the rise in the top ten countries in terms of Ladder score until it took over and held first place for six years running.

The high Ladder score can be explained by an increase in Log GDP per capita from 2015 to 2023 as well as an increase in social support. These increases
in score are the primary reasons Finland has been in first place so consistently. 

In [39]:
finland_15_to_23 = pd.merge(finland15, finland23, left_index=True, right_index=True)
finland_15_to_23['difference'] = finland_15_to_23['Finland_y'] - finland_15_to_23['Finland_x']
finland_15_to_23

Unnamed: 0,Finland_x,Finland_y,difference
Ladder score,7.406,7.8042,0.3982
Explained by: Log GDP per capita,1.290249,1.88838,0.598132
Explained by: Social support,1.318263,1.5849,0.266637
Explained by: Healthy life expectancy,0.889111,0.534574,-0.354537
Explained by: Freedom to make life choices,0.641693,0.77151,0.129817
Explained by: Generosity,0.233508,0.126331,-0.107176
Explained by: Perceptions of corruption,0.413718,0.535299,0.121581


In [40]:
columns = {'Year':[2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023], 'Ladder Score': [_2015_base.at[_2015_base.loc[_2015_base['country'] == 'Finland'].index[0], 'Ladder score'],
                                                                                           _2016_base.at[_2016_base.loc[_2016_base['Country'] == 'Finland'].index[0], 'Happiness score'],
                                                                                           _2017_base.at[_2017_base.loc[_2017_base['Country'] == 'Finland'].index[0], 'Happiness score'],
                                                                                           _2018_base.at[_2018_base.loc[_2018_base['Country'] == 'Finland'].index[0], 'Happiness score'],
                                                                                           _2019_base.at[_2019_base.loc[_2019_base['Country'] == 'Finland'].index[0], 'Happiness score'],
                                                                                           _2020_base.at[_2020_base.loc[_2020_base['Country name'] == 'Finland'].index[0], 'Ladder score'],
                                                                                           _2021_base.at[_2021_base.loc[_2021_base['Country name'] == 'Finland'].index[0], 'Ladder score'],
                                                                                           _2022_base.at[_2022_base.loc[_2022_base['Country'] == 'Finland'].index[0], 'Happiness score'],
                                                                                           _2023_base.at['Finland', 'Ladder score']]}
finland = pd.DataFrame(columns)

finland

Unnamed: 0,Year,Ladder Score
0,2015,7.406
1,2016,7.413
2,2017,7.469
3,2018,7.6321
4,2019,7.7689
5,2020,7.8087
6,2021,7.8421
7,2022,7.821
8,2023,7.8042


In [41]:
finland['Cumulative Difference'] = finland['Ladder Score'].diff()
finland['Cumulative Difference'].fillna(0, inplace=True)
finland['CumSum'] = np.cumsum(finland['Cumulative Difference'])
finland

Unnamed: 0,Year,Ladder Score,Cumulative Difference,CumSum
0,2015,7.406,0.0,0.0
1,2016,7.413,0.007,0.007
2,2017,7.469,0.056,0.063
3,2018,7.6321,0.1631,0.2261
4,2019,7.7689,0.1368,0.3629
5,2020,7.8087,0.0398,0.4027
6,2021,7.8421,0.0334,0.4361
7,2022,7.821,-0.0211,0.415
8,2023,7.8042,-0.0168,0.3982


In [42]:
fig = go.Figure(go.Waterfall(
    name="",
    orientation="v",
    measure=["relative", "relative", "relative", "relative", "relative", "relative", "relative", "relative", "total"],
    x=finland['Year'],
    text = finland['CumSum'],
    textposition="outside",
    y=finland['Cumulative Difference'],
    connector={"line": {"color": "rgb(63, 63, 63)"}},
))

fig.update_layout(
    title = '<b>Finland Ladder Score Waterfall Chart<b>',
    xaxis_title = 'Year',
    yaxis_title = 'Cumulative Change in Score'
)

fig.show()

The above Waterfall chart shows the gains in Ladder Score Finland has obtained since 2015. Except for small decreases in 2022 and 2023, Finland's Ladder Score has been growing. The majority of their gains came in 2018
and 2019.

In [43]:
lebanon23 = _2023_base.loc['Lebanon']
lebanon23 = pd.DataFrame(data=lebanon23)

lebanon15 = _2015_base[_2015_base['country'] == 'Lebanon']

lebanon15.set_index('country', inplace=True)
lebanon15 = lebanon15.T
lebanon15

country,Lebanon
Ladder score,4.839
Explained by: Log GDP per capita,1.025641
Explained by: Social support,0.800011
Explained by: Healthy life expectancy,0.839474
Explained by: Freedom to make life choices,0.339159
Explained by: Generosity,0.218542
Explained by: Perceptions of corruption,0.045822


Lebanon had the biggest drop in Ladder score from 2015 to 2023.

The biggest categorical drop was Healthy life expectancy and Social Support. Log GDP per capita increased, but it was not enough
to offset the drops in other categories.

In [44]:
lebanon_15_to_23 = pd.merge(lebanon15, lebanon23, left_index=True, right_index=True)
lebanon_15_to_23['difference'] = lebanon_15_to_23['Lebanon_y'] - lebanon_15_to_23['Lebanon_x']
lebanon_15_to_23

Unnamed: 0,Lebanon_x,Lebanon_y,difference
Ladder score,4.839,2.3922,-2.4468
Explained by: Log GDP per capita,1.025641,1.416999,0.391358
Explained by: Social support,0.800011,0.475936,-0.324075
Explained by: Healthy life expectancy,0.839474,0.398308,-0.441166
Explained by: Freedom to make life choices,0.339159,0.122771,-0.216388
Explained by: Generosity,0.218542,0.060824,-0.157718
Explained by: Perceptions of corruption,0.045822,0.027208,-0.018615


In [45]:
columns = {'Year':[2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023], 'Ladder Score': [_2015_base.at[_2015_base.loc[_2015_base['country'] == 'Lebanon'].index[0], 'Ladder score'],
                                                                                           _2016_base.at[_2016_base.loc[_2016_base['Country'] == 'Lebanon'].index[0], 'Happiness score'],
                                                                                           _2017_base.at[_2017_base.loc[_2017_base['Country'] == 'Lebanon'].index[0], 'Happiness score'],
                                                                                           _2018_base.at[_2018_base.loc[_2018_base['Country'] == 'Lebanon'].index[0], 'Happiness score'],
                                                                                           _2019_base.at[_2019_base.loc[_2019_base['Country'] == 'Lebanon'].index[0], 'Happiness score'],
                                                                                           _2020_base.at[_2020_base.loc[_2020_base['Country name'] == 'Lebanon'].index[0], 'Ladder score'],
                                                                                           _2021_base.at[_2021_base.loc[_2021_base['Country name'] == 'Lebanon'].index[0], 'Ladder score'],
                                                                                           _2022_base.at[_2022_base.loc[_2022_base['Country'] == 'Lebanon'].index[0], 'Happiness score'],
                                                                                           _2023_base.at['Lebanon', 'Ladder score']]}
lebanon = pd.DataFrame(columns)

lebanon

Unnamed: 0,Year,Ladder Score
0,2015,4.839
1,2016,5.129
2,2017,5.225
3,2018,5.1989
4,2019,5.1973
5,2020,4.7715
6,2021,4.5838
7,2022,2.9553
8,2023,2.3922


In [46]:
lebanon['Cumulative Difference'] = lebanon['Ladder Score'].diff()
lebanon['Cumulative Difference'].fillna(0, inplace=True)
lebanon['CumSum'] = np.cumsum(lebanon['Cumulative Difference'])
lebanon

Unnamed: 0,Year,Ladder Score,Cumulative Difference,CumSum
0,2015,4.839,0.0,0.0
1,2016,5.129,0.29,0.29
2,2017,5.225,0.096,0.386
3,2018,5.1989,-0.0261,0.3599
4,2019,5.1973,-0.0016,0.3583
5,2020,4.7715,-0.4258,-0.0675
6,2021,4.5838,-0.1877,-0.2552
7,2022,2.9553,-1.6285,-1.8837
8,2023,2.3922,-0.5631,-2.4468


In [47]:
fig = go.Figure(go.Waterfall(
    name="",
    orientation="v",
    measure=["relative", "relative", "relative", "relative", "relative", "relative", "relative", "relative", "total"],
    x=lebanon['Year'],
    text = lebanon['CumSum'],
    textposition="outside",
    y=lebanon['Cumulative Difference'],
    connector={"line": {"color": "rgb(63, 63, 63)"}},
))

fig.update_layout(
    title = '<b>Lebanon Ladder Score Waterfall Chart<b>',
    xaxis_title = 'Year',
    #yaxis_title = 'Cumulative Change in Score',
    yaxis = dict(title = 'Cumulative Change in Score',
                 range = [-2.5,0.7])
)

fig.show()

Lebanon started out gaining in Ladder Score but took big drops in 2020 and 2022. Since 2015, Lebanon has lost 2.45 points off their Ladder Score which is the biggest drop of any nation.