<img style='display: block;
  margin-left: auto;
  margin-right: auto;' src='https://worldhappiness.report/assets/images/icons/whr-cover-ico.png' alt='World Happiness Report 2021'>
<h1 style='text-align: center'>World Happiness Report 2021</h1>
<h2 style='text-align: center'>A Plotly Visualization</h2><br>
In this notebook I take an in-depth look at the 2021 World Happiness Report and visualize my findings using the Plotly library.

<h1> Imports & first Glance</h1>

In [None]:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
import plotly.figure_factory as ff
from pandas_profiling import ProfileReport
from plotly.subplots import make_subplots
!pip install pycountry_convert
import pycountry_convert as pc
import itertools
from sklearn.linear_model import LinearRegression

purple = ['rgba(38, 24, 74, 0.8)', 'rgba(71, 58, 131, 0.8)',
          'rgba(122, 120, 168, 0.8)', 'rgba(164, 163, 204, 0.85)',
          'rgba(190, 192, 213, 1)']
green = ['rgba(50, 171, 96, 1.0)', 'rgba(50, 171, 96, 0.6)']

In [None]:
df = pd.read_csv('../input/world-happiness-report-2021/world-happiness-report-2021.csv')
dfpast = pd.read_csv('../input/world-happiness-report-2021/world-happiness-report.csv')

In [None]:
df.head()

In [None]:
dfpast.head()

In [None]:
dfpast.year.value_counts()

We're dealing with two different CSV files here: one is the report from 2021, and the other contains data from earlier years, dating back to 2005 (although only 25 countries were reported in 2005). Note that the official World Happiness Report was first released in 2012.
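
To see how coverage changes from year to year, one can count the distinct countries per year instead of the raw rows. This is a minimal sketch on a toy frame: the rows are invented, and the column names `Country name` and `year` are assumed to match `dfpast`.

```python
import pandas as pd

# Toy stand-in for dfpast; the rows are invented for illustration only
toy_past = pd.DataFrame({
    'Country name': ['A', 'B', 'A', 'B', 'C'],
    'year': [2005, 2005, 2006, 2006, 2006],
})

# Number of distinct countries reported in each year, in chronological order
countries_per_year = toy_past.groupby('year')['Country name'].nunique().sort_index()
print(countries_per_year)
```

Unlike `value_counts()`, the result is ordered by year, which makes gaps in coverage easier to spot.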

# The Happiest Countries in 2021 are...

We're going to take a look at the top 10 happiest countries in 2021. Happiness is measured by the so-called ladder score, the national average of survey answers to a life-evaluation question on a scale from 0 to 10. The report then explains differences in this score mainly through six factors: GDP, life expectancy, generosity, social support, freedom and corruption.

In [None]:
t1 = df.nlargest(10, 'Ladder score')[::-1]
fig = make_subplots(rows=1, cols=2, 
                    column_widths=[0.65, 0.35],
                    subplot_titles=['Top 10 Countries by ladder score', 'Ladder score boxplot - all countries'])
# add_trace with row/col replaces the deprecated append_trace
fig.add_trace(go.Bar(x=t1['Ladder score'],
                     y=t1['Country name'],
                     orientation='h',
                     marker=dict(
                         color=green[1],
                         line=dict(color=green[0], width=1)
                     ),
                     name=''),
              row=1, col=1)

fig.add_trace(go.Box(y=df['Ladder score'],
                     marker_color=purple[1],
                     name=''),
              row=1, col=2)
fig.add_vline(x=9,
              col=1,
              )

fig.update_layout(
    xaxis_range=(6,8),
    yaxis2_range=(2,8),
    yaxis2_tickvals=[2,3,4,5,6,7, 8],
    paper_bgcolor='rgb(248, 248, 255)',
    plot_bgcolor='rgb(248, 248, 255)',
    showlegend=False,
    title_text='Top 10 Happiest Countries 2021',
    title_font_size=22)
fig.update_annotations(yshift=5)
fig.show()

* Finland is the happiest country according to the data
* Except for New Zealand, all top 10 countries are from Western Europe (and New Zealand wasn't hit that hard by corona)
* Among these Western European countries, all five Nordic countries (Finland, Denmark, Iceland, Norway and Sweden) are in the top 10
* All top countries lie within the upper whisker of the boxplot, but there is one outlier below the lower whisker: Afghanistan
* Finland leads runner-up Denmark by a noticeable margin
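
The outlier reading from the boxplot can be double-checked numerically. The sketch below applies the common 1.5 x IQR whisker rule to a handful of invented scores; `iqr_outliers` is a hypothetical helper, and the quartiles come from NumPy's default interpolation, which may differ slightly from the method Plotly uses.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]

# Toy scores, invented for illustration (not taken from the report)
scores = [2.5, 4.8, 5.0, 5.2, 5.5, 5.6, 5.9, 6.1, 6.4, 7.4]
print(iqr_outliers(scores))
```

On the real data one would pass `df['Ladder score']` and then look up the matching country names.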

# What sets Finland apart from other countries?

Since Finland has topped the ranking for several years in a row now, let's see what makes its people so happy...

### Logged GDP per capita
Gross domestic product (GDP) measures a country's economic output and is commonly used as a proxy for its wealth. The column holds the logarithm of GDP per capita.

In [None]:
finland = df[(df['Country name']=='Finland')].iloc[0] # iloc to make it unambiguous to pandas

def plot_vs_finland(col, bin_size):
    fig = ff.create_distplot([df[col]],
                             ['country'],  
                             show_rug=False,
                             bin_size=bin_size,
                             colors=[green[1]],
                            )
    fig.add_vline(x=finland[col],
                  line_dash='dash',
                  line_color=purple[1],
                  line_width=5,
                  annotation_text='Finland'
                 )
    fig.update_layout(
        title_text=col,
        showlegend=False,
        paper_bgcolor='rgb(248, 248, 255)',
        plot_bgcolor='rgb(248, 248, 255)',
        title_font_size=22
    )
    fig.show()
    
plot_vs_finland('Logged GDP per capita', .2)

* We can see that Finland is clearly one of the richer countries in the world
* Wealth very likely contributes to a happier population
* But the equation is not as simple as Happiness = Money
* The plot also does not support the saying 'Money buys Happiness', since there are still a lot of countries richer than Finland
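
Because the column is logged, a difference between two countries reads as a ratio rather than an amount. The sketch below assumes the natural logarithm was used (an assumption on my side, the CSV does not state the base); both helper functions are hypothetical and the input values are illustrative.

```python
import numpy as np

def logged_gdp_to_level(logged_gdp):
    """Invert a natural-log transform to recover GDP per capita."""
    return np.exp(logged_gdp)

def gdp_ratio(logged_a, logged_b):
    """How many times richer country A is than country B, given logged values."""
    return np.exp(logged_a - logged_b)

# Illustrative values, not taken from the dataset
print(round(gdp_ratio(11.0, 10.0), 2))
```

Under this assumption, a gap of one unit on the x-axis of the plot above corresponds to a factor of roughly 2.7 in GDP per capita.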

### Social Support
This variable is recorded by asking participants the binary question "If you were in trouble, do you have relatives or friends you can count on to help you whenever you need them, or not?" and averaging the answers per country.

In [None]:
plot_vs_finland('Social support', .025)

* Finland clearly has one of the highest levels of social support in the world
* This probably helps build a strong community and a feeling of being part of something

### Healthy life expectancy
This variable is constructed from World Health Organization (WHO) data.

In [None]:
plot_vs_finland('Healthy life expectancy', 1)

* Finland is once again at the upper end of the plot, although some countries have a higher healthy life expectancy

### Freedom to make life choices
This variable is recorded as the response to the binary question "Are you satisfied or dissatisfied with your freedom to choose what you do with your life?"

In [None]:
plot_vs_finland('Freedom to make life choices', .025)

* As one could have expected, Finland is a free country in which people also feel free (the value is close to 1)
* Everyone seems to benefit from a free society
* We can also see that in most countries people are in fact satisfied with the freedom they have in their life choices

### Generosity
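Values are based on the question "Have you donated money to a charity in the past month?". In the report, the score is the residual after regressing the national average answer on GDP per capita, which is why it can be negative.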

In [None]:
plot_vs_finland('Generosity', .025)

* Quite a surprise: Finland is not among the countries where people are considered to be very generous
* This might be because people in a richer country like Finland rarely see people in need first hand
* Finland is just average here, which might not be a bad thing considering its happiness

### Perceptions of corruption
The following two questions were asked to record this variable: "Is corruption widespread throughout the government or not?" and "Is corruption widespread within businesses or not?"

In [None]:
plot_vs_finland('Perceptions of corruption', .025)

* Interestingly, in most countries a large share of respondents believe (or should I say are very confident) that corruption is widespread
* Finland is NOT among those countries

## Conclusion: What makes Finland so happy?
* There is no single clear answer
* Finland is among the best countries in the world on these six variables, apart from "Generosity" (here it is not clear what counts as good or bad, although I would guess that being generous is a good character trait)
* Maybe you don't have to be "the best" in each of these six recorded variables, sometimes it is fine to just be "good enough"
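
The statement that a country is "among the best" on a variable can be made concrete with percentile ranks. A minimal sketch on invented data follows; the column names mirror two of the report's columns, and the country labels are placeholders.

```python
import pandas as pd

# Toy data, invented for illustration; only the column names follow the report
toy = pd.DataFrame({
    'Country name': ['A', 'B', 'C', 'D'],
    'Social support': [0.60, 0.95, 0.80, 0.70],
    'Generosity': [0.10, -0.05, 0.20, 0.00],
}).set_index('Country name')

# Fraction of countries whose value is less than or equal to each country's value
pct_rank = toy.rank(pct=True)
print(pct_rank.loc['B'])
```

On the real data one would inspect `pct_rank.loc['Finland']`, keeping in mind that for "Perceptions of corruption" a low rank is the desirable one.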

# How does the Happiness level vary across the globe?

In [None]:
df2 = df.set_index('Country name')
temp = pd.DataFrame(df2['Ladder score']).reset_index()

#ADAPTING TO THE ISO 3166 STANDARD
temp.loc[temp['Country name'] == 'Taiwan Province of China', 'Country name'] = 'Taiwan, Province of China' 
temp.loc[temp['Country name'] == 'Hong Kong S.A.R. of China', 'Country name'] = 'Hong Kong' 
temp.loc[temp['Country name'] == 'Congo (Brazzaville)','Country name'] = 'Congo' 
temp.loc[temp['Country name'] == 'Palestinian Territories','Country name'] = 'Palestine, State of' 

temp.drop(index=temp[temp['Country name'] == 'Kosovo'].index, inplace=True) # Kosovo has no officially assigned ISO 3166 code
temp.drop(index=temp[temp['Country name'] == 'North Cyprus'].index, inplace=True) # Not part of the ISO 3166 standard


# Convert each country name to its ISO 3166-1 alpha-3 code for the choropleth
temp['iso_alpha'] = temp['Country name'].apply(pc.country_name_to_country_alpha3)
fig = px.choropleth(temp, locations='iso_alpha',
                    color='Ladder score',
                    hover_name='Country name',
                    color_continuous_scale=px.colors.diverging.RdYlGn,
                   )
fig.update_layout(
    title_text='World map - Ladder score',
    showlegend=False,
    paper_bgcolor='rgb(248, 248, 255)',
    geo_bgcolor='rgb(248, 248, 255)',
    geo_showframe=False,
    title_font_size=22,
)
fig.show()

* As we had already seen from the top 10 countries, Western Europe seems to be the happiest area on the globe
* The African continent seems to be the least happy, and Asia, apart from Kazakhstan and Uzbekistan, is not too happy either
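
The name-to-code conversion for the map fails on the first country name the converter does not know. A more forgiving variant collects the failures instead, so they can be renamed or dropped in one go. This is only a sketch: `to_alpha3` is a hypothetical helper, the converter is assumed to raise `KeyError` for unknown names, and a tiny dictionary stands in for `pc.country_name_to_country_alpha3`.

```python
def to_alpha3(names, converter, renames=None):
    """Map country names to codes; return (codes, names that could not be mapped)."""
    renames = renames or {}
    codes, failures = {}, []
    for name in names:
        lookup = renames.get(name, name)  # apply the manual rename first, if any
        try:
            codes[name] = converter(lookup)
        except KeyError:
            failures.append(name)
    return codes, failures

# Hypothetical stand-in for the real converter, backed by a two-entry table
fake_table = {'Finland': 'FIN', 'Congo': 'COG'}
codes, failures = to_alpha3(
    ['Finland', 'Congo (Brazzaville)', 'Kosovo'],
    converter=fake_table.__getitem__,
    renames={'Congo (Brazzaville)': 'Congo'},
)
print(codes, failures)
```

Keeping the renames in a dictionary also documents in one place which names deviate from the standard.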

### Taking a step back and taking a look at regions

In [None]:
df4 = df.groupby('Regional indicator')['Ladder score'].mean().reset_index().sort_values('Ladder score')
fig = go.Figure()
fig.add_trace(go.Bar(x=df4['Ladder score'],
                     y=df4['Regional indicator'],
                     orientation='h',
                     marker=dict(
                     color=green[1],
                     line=dict(color=green[0], width=1)
                )))
fig.update_layout(title_text='Ladder score per Region',
                  title_font_size=22,
                  paper_bgcolor='rgb(248, 248, 255)',
                  plot_bgcolor='rgb(248, 248, 255)')
fig.show()

* "North America and ANZ" and "Western Europe" are by far the happiest regions in the world
* South Asia is less happy than Sub-Saharan Africa, but only by a small margin; together they form the regions with the lowest happiness level
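
A regional mean hides how many countries it is based on and how much they differ. The sketch below adds the standard deviation and the country count next to the mean; the data is invented and only the column names follow `df`.

```python
import pandas as pd

# Toy data, invented for illustration; 'X' and 'Y' are placeholder regions
toy = pd.DataFrame({
    'Regional indicator': ['X', 'X', 'X', 'Y', 'Y'],
    'Ladder score': [7.0, 6.0, 5.0, 4.0, 5.0],
})

# Mean, spread and number of countries per region, happiest region first
summary = (toy.groupby('Regional indicator')['Ladder score']
              .agg(['mean', 'std', 'count'])
              .sort_values('mean', ascending=False))
print(summary)
```

A region made up of only a few countries deserves a more cautious reading than one averaging dozens.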

### Other Stats for each region

In [None]:
# Select several columns with a list (double brackets); tuple-style selection is not supported by recent pandas
df3 = df.groupby('Regional indicator')[['Ladder score', 'Logged GDP per capita', 'Social support',
                                        'Healthy life expectancy', 'Freedom to make life choices',
                                        'Generosity', 'Perceptions of corruption']].mean().round(decimals=2).reset_index().sort_values('Ladder score', ascending=False)

fig = go.Figure()
fig.add_trace(go.Table(header=dict(values=list(df3.columns),
                                   fill_color=purple[3],
                                  ),
                       cells=dict(values=[df3['Regional indicator'], df3['Ladder score'], df3['Logged GDP per capita'], df3['Social support'],
                                          df3['Healthy life expectancy'], df3['Freedom to make life choices'],
                                          df3['Generosity'], df3['Perceptions of corruption']],
                                  fill_color=[green[1], 'rgb(240, 240, 255)']),
                      columnwidth=100,))
fig.update_layout(title_text='Average Stats per Region (mean)',
                  title_font_size=22,
                  paper_bgcolor='rgb(248, 248, 255)')
fig.show()

# How does Happiness correlate with our values?

### Logged GDP per capita

In [None]:
model = LinearRegression()
def plot_correlation(col):
    # Ordinary least squares with the chosen column as predictor (x-axis)
    # and the ladder score as target (y-axis), matching the plot's axes
    x = df[col].values.reshape(-1, 1)
    model.fit(x, df['Ladder score'])

    x_range = np.linspace(x.min(), x.max(), 100)
    y_range = model.predict(x_range.reshape(-1, 1))

    fig = go.Figure()
    fig.add_trace(go.Scatter(x=df[col],
                             y=df['Ladder score'],
                             mode='markers',
                             marker_color=green[1],
                             customdata=df['Country name'],
                             marker_size=10,
                             name='Countries',
                             hovertemplate='<br>'.join(['<b>%{customdata}</b>',
                                                        '',
                                                        col+': %{x}',
                                                        'Ladder score: %{y}'])
                             ))
    fig.add_trace(go.Scatter(x=x_range,
                             y=y_range,
                             name='OLS Line',
                             line_color=green[0],
                             line_width=4))

    fig.update_layout(title_text=f'Correlation Ladder score/{col}',
                      title_font_size=22,
                      paper_bgcolor='rgb(248, 248, 255)',
                      plot_bgcolor='rgb(248, 248, 255)',
                      xaxis_title_text=col,
                      yaxis_title_text='Ladder Score',)
    fig.show()
plot_correlation('Logged GDP per capita')

* There is a positive correlation between Logged GDP per capita and Ladder score, so in general richer countries tend to be happier (a correlation alone does not prove that money is the cause)
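
The strength of such a relationship can also be expressed as a number instead of being read off the scatter plot. Here is a minimal sketch using Pearson's correlation coefficient via `DataFrame.corrwith`; the values are invented (and deliberately perfectly linear), only the column names follow the report.

```python
import pandas as pd

# Toy data, invented for illustration; real data will not be perfectly linear
toy = pd.DataFrame({
    'Ladder score': [4.0, 5.0, 6.0, 7.0],
    'Logged GDP per capita': [8.0, 9.0, 10.0, 11.0],
    'Perceptions of corruption': [0.9, 0.7, 0.5, 0.3],
})

factors = ['Logged GDP per capita', 'Perceptions of corruption']

# Pearson correlation of each factor with the ladder score (between -1 and 1)
r = toy[factors].corrwith(toy['Ladder score'])
print(r.round(2))
```

Values near 1 or -1 indicate a strong linear relationship, values near 0 a weak one.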

### Social Support

In [None]:
plot_correlation('Social support')

* Social support is also positively correlated with the ladder score, so having friends who will help when needed goes along with being happier
* An interesting observation is that Turkmenistan reports the highest social support of all countries (together with Iceland), although Turkmenistan is only average when it comes to the ladder score

### Healthy life expectancy

In [None]:
plot_correlation('Healthy life expectancy')

* Once again, healthy life expectancy correlates positively with happiness
* Note that the 3 countries with the highest life expectancy are all from Asia

### Freedom to make life choices

In [None]:
plot_correlation('Freedom to make life choices')

* Do I really have to mention it :) There is a positive correlation between Freedom to make life choices and the Ladder score
* Afghanistan is a heavy outlier with very low freedom to make life choices, which may partly explain why it is also the least happy country

### Generosity

In [None]:
plot_correlation('Generosity')

* _finally_ something that is not really correlated with the happiness level

### Perceptions of corruption

In [None]:
plot_correlation('Perceptions of corruption')

* There is a negative correlation between Perceptions of corruption and Ladder score: when we feel like our country is less corrupt, we are happier in general
* Rwanda (bottom left marker) is perceived as having very little corruption, yet the country is not among the happy ones, possibly because of its low GDP

# Conclusions
* Happiness is not an easy equation to solve
* To be a happy country, a country has to score well on many different factors
* A good basis for happiness seems to be a higher GDP and freedom in decision making
* LET'S MOVE TO FINLAND :)

<h2 style='text-align: center'>Thank you for reading this notebook to the end!<br>Feel free to upvote and leave a comment.</h2><h4 style='text-align: center'>Also, please tell me what I could have done better...</h4>