![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fdata-viz-of-the-week&branch=main&subPath=black-history-month-immigration/black-history-month-nova-scotia-immigration.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# World Happiness and Income

# Question

Is the happiness score of a country's citizens directly correlated with income in that country, or do other factors also influence happiness scores?


# Gather

We will use data from the 2023 World Happiness Report.

Run the code in the following cell to import the code libraries needed for this project. Code libraries are collections of code that make it easier to accomplish a specific task; for instance, Plotly Express is a code library for making visualizations. The first two lines import the libraries into this notebook, and the remaining lines in the cell download the data from a website and display it.


In [1]:
import pandas as pd
import plotly.express as px

url = 'https://happiness-report.s3.amazonaws.com/2023/DataForFigure2.1WHR2023.xls'
data = pd.read_excel(url)
data

| | Country name | Ladder score | Standard error of ladder score | upperwhisker | lowerwhisker | Logged GDP per capita | Social support | Healthy life expectancy | Freedom to make life choices | Generosity | Perceptions of corruption | Ladder score in Dystopia | Explained by: Log GDP per capita | Explained by: Social support | Explained by: Healthy life expectancy | Explained by: Freedom to make life choices | Explained by: Generosity | Explained by: Perceptions of corruption | Dystopia + residual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Finland | 7.8042 | 0.036162 | 7.875078 | 7.733322 | 10.792010 | 0.968770 | 71.149994 | 0.961408 | -0.018824 | 0.181745 | 1.777825 | 1.888380 | 1.584900 | 0.534574 | 0.771510 | 0.126331 | 0.535299 | 2.363241 |
| 1 | Denmark | 7.5864 | 0.041028 | 7.666815 | 7.505985 | 10.962164 | 0.954112 | 71.250145 | 0.933533 | 0.134242 | 0.195814 | 1.777825 | 1.949406 | 1.547875 | 0.537302 | 0.734416 | 0.208459 | 0.525221 | 2.083766 |
| 2 | Iceland | 7.5296 | 0.048612 | 7.624879 | 7.434321 | 10.895531 | 0.982533 | 72.050018 | 0.936349 | 0.210987 | 0.667848 | 1.777825 | 1.925508 | 1.619666 | 0.559096 | 0.738164 | 0.249635 | 0.187119 | 2.250382 |
| 3 | Israel | 7.4729 | 0.031609 | 7.534853 | 7.410946 | 10.638705 | 0.943344 | 72.697205 | 0.808866 | -0.023080 | 0.708094 | 1.777825 | 1.833398 | 1.520674 | 0.576730 | 0.568518 | 0.124048 | 0.158292 | 2.691290 |
| 4 | Netherlands | 7.4030 | 0.029294 | 7.460416 | 7.345583 | 10.942279 | 0.930499 | 71.550018 | 0.886875 | 0.212686 | 0.378929 | 1.777825 | 1.942274 | 1.488228 | 0.545473 | 0.672327 | 0.250547 | 0.394062 | 2.110044 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 132 | Congo (Kinshasa) | 3.2072 | 0.095369 | 3.394124 | 3.020277 | 7.006671 | 0.651610 | 55.375000 | 0.663798 | 0.085998 | 0.833752 | 1.777825 | 0.530779 | 0.783747 | 0.104758 | 0.375472 | 0.182573 | 0.068287 | 1.161580 |
| 133 | Zimbabwe | 3.2035 | 0.060865 | 3.322795 | 3.084205 | 7.640998 | 0.689918 | 54.049889 | 0.654055 | -0.046230 | 0.765582 | 1.777825 | 0.758279 | 0.880513 | 0.068653 | 0.362508 | 0.111627 | 0.117115 | 0.904856 |
| 134 | Sierra Leone | 3.1376 | 0.082441 | 3.299184 | 2.976016 | 7.394014 | 0.555251 | 54.899853 | 0.660367 | 0.104929 | 0.857780 | 1.777825 | 0.669699 | 0.540343 | 0.091811 | 0.370907 | 0.192731 | 0.051077 | 1.221006 |
| 135 | Lebanon | 2.3922 | 0.044495 | 2.479410 | 2.304990 | 9.477677 | 0.529754 | 66.148819 | 0.473900 | -0.140915 | 0.891104 | 1.777825 | 1.416999 | 0.475936 | 0.398308 | 0.122771 | 0.060824 | 0.027208 | -0.109798 |
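
Before plotting, it can help to confirm the size of a table and check it for missing values. The sketch below does this on a small hand-typed sample (three rows and four columns copied from the output above) so that it runs without downloading anything. The name `sample` is only for illustration; in this notebook the same calls could be applied to `data`.

```python
import pandas as pd

# A hand-typed sample copied from the output above (not the full data set)
sample = pd.DataFrame({
    'Country name': ['Finland', 'Denmark', 'Lebanon'],
    'Ladder score': [7.8042, 7.5864, 2.3922],
    'Logged GDP per capita': [10.792010, 10.962164, 9.477677],
    'Social support': [0.968770, 0.954112, 0.529754],
})

n_rows, n_columns = sample.shape                    # number of rows and columns
missing_counts = sample.isna().sum()                # missing values in each column
score_summary = sample['Ladder score'].describe()   # count, mean, min, max and quartiles

print(n_rows, n_columns)
print(missing_counts)
print(score_summary)
```

If any column reports missing values, charts that use that column will leave those countries out.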


## Happiness Score by Country

Run the following code to generate a bar graph of the ladder (happiness) score for each country.

In [2]:
px.bar(data, x='Country name', y='Ladder score', title='World Happiness Report 2023', height=800)

The data set also includes `upperwhisker` and `lowerwhisker` columns that give a range around each score. Let's see if those ranges are significant for our purposes.

In [3]:
data['error y'] = data['upperwhisker'] - data['Ladder score']
data['error y minus'] = data['Ladder score'] - data['lowerwhisker']
px.scatter(data, x='Country name', y='Ladder score', error_y='error y', error_y_minus='error y minus')

Those whiskers don't seem large enough to worry about.
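
To see where the whisker values might come from, we can compare them with the `Standard error of ladder score` column. In the rows printed earlier, each whisker sits about 1.96 standard errors from the ladder score, which is the usual width of a 95% confidence interval. This is an observation from those rows rather than something stated in the data file, so treat the multiplier `z` as an assumption. The sketch uses a hand-typed sample, and `sample`, `z`, and `margin` are illustrative names.

```python
import numpy as np
import pandas as pd

# Four rows copied by hand from the output near the top of this notebook
sample = pd.DataFrame({
    'Country name': ['Finland', 'Denmark', 'Congo (Kinshasa)', 'Lebanon'],
    'Ladder score': [7.8042, 7.5864, 3.2072, 2.3922],
    'Standard error of ladder score': [0.036162, 0.041028, 0.095369, 0.044495],
    'upperwhisker': [7.875078, 7.666815, 3.394124, 2.479410],
    'lowerwhisker': [7.733322, 7.505985, 3.020277, 2.304990],
})

z = 1.96  # assumed multiplier for a 95% confidence interval
margin = z * sample['Standard error of ladder score']

# Compare the reconstructed bounds with the whisker columns, allowing for rounding
upper_matches = np.isclose(sample['Ladder score'] + margin, sample['upperwhisker'], atol=1e-4)
lower_matches = np.isclose(sample['Ladder score'] - margin, sample['lowerwhisker'], atol=1e-4)

print(upper_matches.all(), lower_matches.all())
print(margin.max())
```

The largest margin in this sample is under 0.2, which is small next to the differences between the highest and lowest scoring countries.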

We can also color-code the bars by continent using continent names from [Gapminder](https://gapminder.org/).

In [4]:
geonames = pd.read_csv('https://raw.githubusercontent.com/open-numbers/ddf--gapminder--geo_entity_domain/master/ddf--entities--geo--country.csv')
geonames = geonames.rename(columns={'name':'Country name', 'world_4region':'Continent'}) # we could instead use the 'world_6region' column
geonames = geonames[['Country name', 'Continent']]
data['Continent'] = data['Country name'].map(geonames.set_index('Country name')['Continent']).fillna('unknown').replace('', 'unknown')
px.bar(data, x='Country name', y='Ladder score', title='World Happiness Report 2023 by Continent', height=800, color='Continent')
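
The `map` call above looks up each country name in the Gapminder table and falls back to 'unknown' when there is no match. Here is a minimal sketch of the same pattern with a tiny hand-made lookup table. The two-row `lookup` table is an assumption for illustration and not the real Gapminder file.

```python
import pandas as pd

scores = pd.DataFrame({'Country name': ['Finland', 'Czechia', 'Zimbabwe']})

# Illustrative lookup table that has no row for 'Czechia'
lookup = pd.DataFrame({
    'Country name': ['Finland', 'Zimbabwe'],
    'Continent': ['europe', 'africa'],
})

# Turn the lookup table into a Series indexed by country name
continent_by_country = lookup.set_index('Country name')['Continent']

# Names missing from the lookup become NaN, which fillna replaces with 'unknown'
scores['Continent'] = scores['Country name'].map(continent_by_country).fillna('unknown')
print(scores)
```

Any country whose name is spelled differently in the two sources ends up labelled 'unknown', even though it does have a continent.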

Instead of using 'unknown' for the countries that are named differently in the Gapminder data set, we can set their continents manually. First we will print the countries that have an `unknown` continent.

In [5]:
data[data['Continent'] == 'unknown']['Country name']

17                       Czechia
26      Taiwan Province of China
28                      Slovakia
61                    Kyrgyzstan
81     Hong Kong S.A.R. of China
85           Congo (Brazzaville)
86               North Macedonia
88                          Laos
92                   Ivory Coast
98            State of Palestine
105                      Turkiye
132             Congo (Kinshasa)
Name: Country name, dtype: object

Then we can set each of their continents and recreate the visualization.

In [6]:
continent_corrections = {'Czechia':'europe',
                         'Taiwan Province of China':'asia',
                         'Slovakia':'europe',
                         'Kyrgyzstan':'asia',
                         'Hong Kong S.A.R. of China':'asia',
                         'Congo (Brazzaville)':'africa',
                         'North Macedonia':'europe',
                         'Laos':'asia',
                         'Ivory Coast':'africa',
                         'State of Palestine':'asia',
                         'Turkiye':'asia',
                         'Congo (Kinshasa)':'africa'}

# overwrite the 'unknown' continent for each country listed above
for country, continent in continent_corrections.items():
    data.loc[data['Country name'] == country, 'Continent'] = continent

px.bar(data, x='Country name', y='Ladder score', title='World Happiness Report 2023 by Continent', height=800, color='Continent')
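
The loop above updates one country at a time. As an alternative sketch (not the approach this notebook uses), the same kind of correction can be applied in one step by mapping a dictionary onto the country names and keeping the existing value wherever there is no correction. The sample rows and the names `sample` and `corrections` are illustrative.

```python
import pandas as pd

sample = pd.DataFrame({
    'Country name': ['Finland', 'Czechia', 'Laos'],
    'Continent': ['europe', 'unknown', 'unknown'],
})
corrections = {'Czechia': 'europe', 'Laos': 'asia'}

# map gives NaN for countries without a correction;
# fillna then keeps the continent those countries already had
sample['Continent'] = sample['Country name'].map(corrections).fillna(sample['Continent'])
print(sample)
```

Both approaches give the same result; the loop is easier to read step by step, while the one-line version avoids repeated filtering on larger tables.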

What do you notice about the happiness scores of specific countries, or about happiness across continents, in the visualization above?
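
One way to put numbers behind a comparison between continents is to summarize the scores for each one. The sketch below uses a hand-typed sample of six rows, with continent labels typed in by hand for illustration. With this notebook's `data`, the same `groupby` call could be used once the `Continent` column exists.

```python
import pandas as pd

# Six rows typed by hand; the continent labels are part of the illustration
sample = pd.DataFrame({
    'Country name': ['Finland', 'Denmark', 'Israel', 'Lebanon', 'Zimbabwe', 'Sierra Leone'],
    'Ladder score': [7.8042, 7.5864, 7.4729, 2.3922, 3.2035, 3.1376],
    'Continent': ['europe', 'europe', 'asia', 'asia', 'africa', 'africa'],
})

# Count, average, lowest, and highest score per continent, sorted by average
summary = (sample.groupby('Continent')['Ladder score']
                 .agg(['count', 'mean', 'min', 'max'])
                 .sort_values('mean', ascending=False))
print(summary)
```

A sample this small says nothing reliable about whole continents; it only shows the shape of the result. Comparing `min` and `max` within a continent is a reminder that countries on the same continent can have very different scores.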


# Mapping the Data

Run the code below to make a map of the countries colored by their happiness scores. 

In [7]:
px.choropleth(data, locations='Country name', locationmode='country names', color='Ladder score', title='World Happiness Report 2023', height=800)

What observations can you make about world happiness based on the map above?

The code below will generate a scatter plot of the data, with the country on the x-axis and the happiness score on the y-axis. The size of each dot represents the amount of social support and the colour represents the healthy life expectancy.


In [8]:
px.scatter(data, x='Country name', y='Ladder score', size='Social support', color='Healthy life expectancy', title='World Happiness Report 2023', height=800)

We can also generate individual scatter plots for each of the factors, with trendlines, to see how they correlate with the happiness score. In the visualizations below the happiness score is on the y-axis. The `trendline='ols'` option requires the `statsmodels` library to be installed.

In [9]:
from plotly.subplots import make_subplots

factors = ['Logged GDP per capita','Social support','Healthy life expectancy','Freedom to make life choices','Generosity','Perceptions of corruption']

# one subplot per factor, arranged in a grid of 3 rows and 2 columns
fig = make_subplots(rows=3, cols=2, subplot_titles=factors)
for i, factor in enumerate(factors):
    new_plot = px.scatter(data, x=factor, y='Ladder score', hover_data=['Country name'], trendline='ols')
    for t in new_plot.data: # add the scatterplot and the trendline
        fig.add_trace(t, row=i//2+1, col=i%2+1)
fig.update_layout(title='World Happiness Report 2023 <i>(Happiness Score on y-axis)</i>', showlegend=False, height=800)
fig.show()
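
The trendlines show the direction of each relationship; a correlation coefficient puts a number on its strength. The sketch below computes the Pearson correlation between the ladder score and three of the factors using the pandas `corrwith` method, on the nine rows printed near the top of this notebook. Because those rows are only the highest and lowest scoring countries, the values will likely be more extreme than for the full data set. With this notebook's `data` and `factors`, the equivalent call would be `data[factors].corrwith(data['Ladder score'])`. The names `sample` and `sample_factors` are illustrative.

```python
import pandas as pd

# Nine rows copied by hand from the output near the top of this notebook
sample = pd.DataFrame({
    'Ladder score': [7.8042, 7.5864, 7.5296, 7.4729, 7.4030,
                     3.2072, 3.2035, 3.1376, 2.3922],
    'Logged GDP per capita': [10.792010, 10.962164, 10.895531, 10.638705, 10.942279,
                              7.006671, 7.640998, 7.394014, 9.477677],
    'Social support': [0.968770, 0.954112, 0.982533, 0.943344, 0.930499,
                       0.651610, 0.689918, 0.555251, 0.529754],
    'Perceptions of corruption': [0.181745, 0.195814, 0.667848, 0.708094, 0.378929,
                                  0.833752, 0.765582, 0.857780, 0.891104],
})
sample_factors = ['Logged GDP per capita', 'Social support', 'Perceptions of corruption']

# Pearson correlation of each factor with the ladder score, strongest positive first
correlations = (sample[sample_factors]
                .corrwith(sample['Ladder score'])
                .sort_values(ascending=False))
print(correlations.round(2))
```

Values near 1 or -1 indicate a strong relationship and values near 0 a weak one. A strong correlation does not show that a factor causes happiness, only that the two tend to move together.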

# Reflection Questions

What factors contribute to a higher score of happiness? 

What factors contribute to a lower score of happiness?

Are there factors not explored in this data visualization that could contribute to a country's happiness score?

How do you think factors that influence quality of life affect the happiness score in a country? 

What advice would you give to a country leader who wanted to increase the happiness score in a country? 

[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)