# Visualizing the World Happiness Index

## Preparation

Load the required modules.

In [2]:
import folium
import json
import pandas as pd
from pathlib import Path

## Import data

Load the data from a CSV file, and get an overview of it.

In [3]:
data_dir = Path('Data')

In [4]:
happiness_filename = data_dir / 'world_happiness_2016.csv'
happiness_data = pd.read_csv(happiness_filename)

In [5]:
happiness_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 157 entries, 0 to 156
Data columns (total 13 columns):
Country                          157 non-null object
Region                           157 non-null object
Happiness Rank                   157 non-null int64
Happiness Score                  157 non-null float64
Lower Confidence Interval        157 non-null float64
Upper Confidence Interval        157 non-null float64
Economy (GDP per Capita)         157 non-null float64
Family                           157 non-null float64
Health (Life Expectancy)         157 non-null float64
Freedom                          157 non-null float64
Trust (Government Corruption)    157 non-null float64
Generosity                       157 non-null float64
Dystopia Residual                157 non-null float64
dtypes: float64(10), int64(1), object(2)
memory usage: 16.0+ KB


In [6]:
happiness_data.describe()

| | Happiness Rank | Happiness Score | Lower Confidence Interval | Upper Confidence Interval | Economy (GDP per Capita) | Family | Health (Life Expectancy) | Freedom | Trust (Government Corruption) | Generosity | Dystopia Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 157.0 | 157.0 | 157.0 | 157.0 | 157.0 | 157.0 | 157.0 | 157.0 | 157.0 | 157.0 | 157.0 |
| mean | 78.980892 | 5.382185 | 5.282395 | 5.481975 | 0.95388 | 0.793621 | 0.557619 | 0.370994 | 0.137624 | 0.242635 | 2.325807 |
| std | 45.46603 | 1.141674 | 1.148043 | 1.136493 | 0.412595 | 0.266706 | 0.229349 | 0.145507 | 0.111038 | 0.133756 | 0.54222 |
| min | 1.0 | 2.905 | 2.732 | 3.078 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.81789 |
| 25% | 40.0 | 4.404 | 4.327 | 4.465 | 0.67024 | 0.64184 | 0.38291 | 0.25748 | 0.06126 | 0.15457 | 2.03171 |
| 50% | 79.0 | 5.314 | 5.237 | 5.419 | 1.0278 | 0.84142 | 0.59659 | 0.39747 | 0.10547 | 0.22245 | 2.29074 |
| 75% | 118.0 | 6.269 | 6.154 | 6.434 | 1.27964 | 1.02152 | 0.72993 | 0.48453 | 0.17554 | 0.31185 | 2.66465 |
| max | 157.0 | 7.526 | 7.46 | 7.669 | 1.82427 | 1.18326 | 0.95277 | 0.60848 | 0.50521 | 0.81971 | 3.83772 |


## Visualizing data

Show a choropleth map of the world, colored by happiness score.

In [7]:
world = folium.Map(
    location=[0.0, 0.0],
    tiles='Mapbox Bright',
    zoom_start=2
)
country_geo_file = data_dir / 'countries.geo.json'
world.choropleth(
    geo_data=str(country_geo_file),
    name='world happiness',
    data=happiness_data,
    columns=['Country', 'Happiness Score'],
    key_on='properties.name',
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Happiness Score'
)
folium.LayerControl().add_to(world)
world

Oops, no values for the United States?

In [8]:
with open(str(country_geo_file)) as geo_file:  # close the file once it has been parsed
    country_geo_data = json.load(geo_file)

In [9]:
for country in country_geo_data['features']:
    if 'states' in country['properties']['name'].lower():
        print(country['properties']['name'])

United States of America


In [12]:
for country in happiness_data.Country:
    if 'states' in country.lower():
        # a boolean mask avoids building a query string, which breaks on names containing quotes
        print(happiness_data.index[happiness_data.Country == country])

Int64Index([12], dtype='int64')


The country names in the World Happiness data set are not consistent with those in the GeoJSON countries file. When a name has no exact match, the choropleth finds no value for that country, so its data appears to be missing. This is the case for most countries that are colored yellow in the visualization above. Here we fix it only for the United States, whose index in the pandas dataframe is 12.
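To see which names fail to match, rather than guessing from the map, the two sets of names can be compared directly. The following is a minimal sketch, not part of the original notebook: the helper `unmatched_names` and the sample lists are hypothetical and only illustrate the idea on a few made-up inputs.

```python
def unmatched_names(data_names, geo_names):
    """Return (only_in_data, only_in_geo) as two sorted lists of names."""
    data_set, geo_set = set(data_names), set(geo_names)
    return sorted(data_set - geo_set), sorted(geo_set - data_set)


# Tiny illustrative inputs, not the contents of the real files.
sample_data_names = ['Denmark', 'Norway', 'United States']
sample_geo_names = ['Denmark', 'Greenland', 'Norway', 'United States of America']

only_in_data, only_in_geo = unmatched_names(sample_data_names, sample_geo_names)
print(only_in_data)  # names the map cannot place
print(only_in_geo)   # map features that receive no value
```

In this notebook the function could presumably be called with `happiness_data.Country` and `[f['properties']['name'] for f in country_geo_data['features']]` as its two arguments.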

In [13]:
happiness_data.at[12, 'Country'] = 'United States of America'

In [14]:
happiness_data.loc[12]

Country                          United States of America
Region                                      North America
Happiness Rank                                         13
Happiness Score                                     7.104
Lower Confidence Interval                            7.02
Upper Confidence Interval                           7.188
Economy (GDP per Capita)                          1.50796
Family                                            1.04782
Health (Life Expectancy)                            0.779
Freedom                                           0.48163
Trust (Government Corruption)                     0.14868
Generosity                                        0.41077
Dystopia Residual                                 2.72782
Name: 12, dtype: object

In [15]:
world = folium.Map(
    location=[0.0, 0.0],
    tiles='Mapbox Bright',
    zoom_start=2
)
country_geo_file = data_dir / 'countries.geo.json'
world.choropleth(
    geo_data=str(country_geo_file),
    name='world happiness',
    data=happiness_data,
    columns=['Country', 'Happiness Score'],
    key_on='properties.name',
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Happiness Score'
)
folium.LayerControl().add_to(world)
world

This procedure should be repeated for every country whose data appears to be missing. Note that the World Happiness Index treats Greenland as part of Denmark, whereas the GeoJSON countries data lists it separately as Greenland.
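Repeating the fix by hand for each country is tedious, so one option is to collect the corrections in a single mapping. The sketch below is only an illustration under stated assumptions: `name_fixes`, `fix_names` and `add_alias` are hypothetical names, the original spelling `'United States'` is assumed rather than shown in the notebook, and the Denmark score is a placeholder value.

```python
# Hypothetical mapping from happiness-data names to GeoJSON names.
# Only the United States case comes from this notebook; extend it as needed.
name_fixes = {'United States': 'United States of America'}


def fix_names(names, fixes):
    """Return the names with every key of `fixes` replaced by its value."""
    return [fixes.get(name, name) for name in names]


def add_alias(scores, source, alias):
    """Return a copy of `scores` in which `alias` reuses the score of `source`."""
    result = dict(scores)
    if source in result:
        result[alias] = result[source]
    return result


# Illustrative values only; the Denmark score here is a placeholder.
sample_scores = {'Denmark': 7.5, 'United States': 7.104}
fixed_names = fix_names(sample_scores, name_fixes)
fixed_scores = dict(zip(fixed_names, sample_scores.values()))
fixed_scores = add_alias(fixed_scores, 'Denmark', 'Greenland')
print(fixed_scores)
```

With the notebook's dataframe, the same mapping could be applied in one step as `happiness_data['Country'].replace(name_fixes)`. Whether Greenland should inherit Denmark's score on the map is a judgment call and not something the data set itself states.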