# World Happiness Report - Forge Project

Background and motivation: <br> In this project, I look for insights into how the world's happiness levels changed during the COVID-19 pandemic. The World Happiness Report datasets quantify the happiness levels of 140+ countries using factors such as GDP per capita, life expectancy, and freedom to make life choices. Specifically, I want to find out whether there was a significant drop in happiness in 2020, when COVID hit, and whether people felt that their freedom to make life choices decreased substantially (due to quarantine, isolation, etc.).

In [83]:
import pandas as pd
import csv
from pymongo import MongoClient
import pymongo
import numpy as np
import seaborn as sns

#### Initial steps: 
1. Read in CSV files <br>
2. Take a look at the datasets, figure out what columns I want to use and their names <br>
3. Turn them into DataFrames for easy manipulation and clean them up wherever needed 
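
As a rough illustration of the three steps above, the sketch below reads one year's data, keeps three columns, and renames them. It is not the notebook's own code: `load_year` is a hypothetical helper, and the inline `SAMPLE_2020` text stands in for the real CSV file so that the snippet runs on its own. Only the column names come from the datasets.

```python
import io
import pandas as pd

# Two sample rows in the 2020/2021 layout, standing in for the real CSV file (assumption).
SAMPLE_2020 = io.StringIO(
    "Country name,Ladder score,Freedom to make life choices\n"
    "Finland,7.8087,0.949172\n"
    "Denmark,7.6456,0.951444\n"
)

def load_year(source, country_col, score_col):
    """Hypothetical helper: read one year's file and keep the three columns of interest."""
    wanted = [country_col, score_col, "Freedom to make life choices"]
    # usecols limits what is parsed; indexing with `wanted` fixes the column order.
    df = pd.read_csv(source, usecols=wanted)[wanted]
    return df.rename(columns={
        country_col: "Country_name",
        score_col: "Score",
        "Freedom to make life choices": "Freedom",
    })

sample20 = load_year(SAMPLE_2020, "Country name", "Ladder score")
print(sample20)
```

For the 2019 file, the same helper would be called with `"Country or region"` and `"Score"`, since that year uses different column names.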

In [54]:
yr19 = pd.read_csv('2019.csv')
yr20 = pd.read_csv('2020.csv')
yr21 = pd.read_csv('2021.csv')

In [162]:
yr19.head(5)
# Taking a quick look 

,Overall rank,Country or region,Score,GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption
0,1,Finland,7.769,1.34,1.587,0.986,0.596,0.153,0.393
1,2,Denmark,7.6,1.383,1.573,0.996,0.592,0.252,0.41
2,3,Norway,7.554,1.488,1.582,1.028,0.603,0.271,0.341
3,4,Iceland,7.494,1.38,1.624,1.026,0.591,0.354,0.118
4,5,Netherlands,7.488,1.396,1.522,0.999,0.557,0.322,0.298


In [74]:
yr20.head(5)
# Taking a quick look 

,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
0,Finland,Western Europe,7.8087,0.031156,7.869766,7.747634,10.639267,0.95433,71.900825,0.949172,-0.059482,0.195445,1.972317,1.28519,1.499526,0.961271,0.662317,0.15967,0.477857,2.762835
1,Denmark,Western Europe,7.6456,0.033492,7.711245,7.579955,10.774001,0.955991,72.402504,0.951444,0.066202,0.168489,1.972317,1.326949,1.503449,0.979333,0.66504,0.242793,0.49526,2.432741
2,Switzerland,Western Europe,7.5599,0.035014,7.628528,7.491272,10.979933,0.942847,74.102448,0.921337,0.105911,0.303728,1.972317,1.390774,1.472403,1.040533,0.628954,0.269056,0.407946,2.350267
3,Iceland,Western Europe,7.5045,0.059616,7.621347,7.387653,10.772559,0.97467,73.0,0.948892,0.246944,0.71171,1.972317,1.326502,1.547567,1.000843,0.661981,0.36233,0.144541,2.460688
4,Norway,Western Europe,7.488,0.034837,7.556281,7.419719,11.087804,0.952487,73.200783,0.95575,0.134533,0.263218,1.972317,1.424207,1.495173,1.008072,0.670201,0.287985,0.434101,2.168266


In [73]:
yr21.head(5)
# Taking a quick look 

,Country name,Regional indicator,Ladder score,Standard error of ladder score,upperwhisker,lowerwhisker,Logged GDP per capita,Social support,Healthy life expectancy,Freedom to make life choices,Generosity,Perceptions of corruption,Ladder score in Dystopia,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
0,Finland,Western Europe,7.842,0.032,7.904,7.78,10.775,0.954,72.0,0.949,-0.098,0.186,2.43,1.446,1.106,0.741,0.691,0.124,0.481,3.253
1,Denmark,Western Europe,7.62,0.035,7.687,7.552,10.933,0.954,72.7,0.946,0.03,0.179,2.43,1.502,1.108,0.763,0.686,0.208,0.485,2.868
2,Switzerland,Western Europe,7.571,0.036,7.643,7.5,11.117,0.942,74.4,0.919,0.025,0.292,2.43,1.566,1.079,0.816,0.653,0.204,0.413,2.839
3,Iceland,Western Europe,7.554,0.059,7.67,7.438,10.878,0.983,73.0,0.955,0.16,0.673,2.43,1.482,1.172,0.772,0.698,0.293,0.17,2.967
4,Netherlands,Western Europe,7.464,0.027,7.518,7.41,10.932,0.942,72.4,0.913,0.175,0.338,2.43,1.501,1.079,0.753,0.647,0.302,0.384,2.798


In [181]:
df19 = pd.DataFrame(yr19, columns=['Country or region', 'Score', 'Freedom to make life choices'])
df20 = pd.DataFrame(yr20, columns=['Country name', 'Ladder score', 'Freedom to make life choices'])
df21 = pd.DataFrame(yr21, columns=['Country name', 'Ladder score', 'Freedom to make life choices'])
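# Getting only the columns I want and putting them into DataFrames
# Caveat: the 2019 'Freedom to make life choices' values (e.g. 0.596 for Finland) look closer in scale to the
# 2020/2021 'Explained by: Freedom to make life choices' column (0.662) than to the raw column selected here (0.949)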

In [188]:
df19.rename(columns={"Country or region": "Country_name", "Freedom to make life choices": "Freedom"}, inplace = True)
df20.rename(columns={"Country name": "Country_name", "Ladder score": "Score", "Freedom to make life choices": "Freedom"}, inplace = True)
df21.rename(columns={"Country name": "Country_name", "Ladder score": "Score", "Freedom to make life choices": "Freedom"}, inplace = True)
# Making sure all three datasets have the same column names with underscores for ease

In [189]:
index19 = pd.Index(range(1, len(df19) + 1, 1))
df19 = df19.set_index(index19)
index20 = pd.Index(range(1, len(df20) + 1, 1))
df20 = df20.set_index(index20)
index21 = pd.Index(range(1, len(df21) + 1, 1))
df21 = df21.set_index(index21)
# Re-indexing from 1 so that each row's index equals its position in the ranking

In [190]:
df19.index.name = 'Happiness Rank'
df20.index.name = 'Happiness Rank'
df21.index.name = 'Happiness Rank'
# Naming the index so that it reads as the overall rank
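
The two re-indexing cells above can also be folded into a single assignment. The following is a minimal sketch on a small stand-in frame (`demo` is a made-up name, with values copied from the first rows of the 2019 data); it assumes the rows are already sorted from happiest to least happy, as they appear to be in the files.

```python
import pandas as pd

# Small stand-in frame; values are the first three rows of the 2019 data.
demo = pd.DataFrame({
    "Country_name": ["Finland", "Denmark", "Norway"],
    "Score": [7.769, 7.600, 7.554],
})

# A 1-based RangeIndex that is created and named in one step.
demo.index = pd.RangeIndex(start=1, stop=len(demo) + 1, name="Happiness Rank")
print(demo)
```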

In [191]:
df19.tail(3)
# Checking everything is good so far

Happiness Rank,Country_name,Score,Freedom
154,Afghanistan,3.203,0.0
155,Central African Republic,3.083,0.225
156,South Sudan,2.853,0.01


In [192]:
df20.tail(3)
# Checking everything is good so far

Happiness Rank,Country_name,Score,Freedom
151,Zimbabwe,3.2992,0.711458
152,South Sudan,2.8166,0.451314
153,Afghanistan,2.5669,0.396573


In [193]:
df21.tail(3)
# Checking everything is good so far

Happiness Rank,Country_name,Score,Freedom
147,Rwanda,3.415,0.897
148,Zimbabwe,3.145,0.677
149,Afghanistan,2.523,0.382


Note: the three datasets I picked are not the same length; some countries are not included in 2021, for example. For the sake of comparison, I will drop the countries that do not appear in all three datasets.
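
One possible way to keep only the countries present in every year is to intersect the sets of country names and filter each frame against the result. The sketch below uses shortened toy frames (`a19`, `a20`, `a21` are stand-ins for `df19`, `df20`, `df21`, with a few scores copied from the tables above); it illustrates the idea and is not necessarily the approach the notebook takes.

```python
from functools import reduce
import pandas as pd

# Toy frames standing in for df19, df20 and df21 (country lists shortened).
a19 = pd.DataFrame({"Country_name": ["Finland", "Afghanistan", "South Sudan"],
                    "Score": [7.769, 3.203, 2.853]})
a20 = pd.DataFrame({"Country_name": ["Finland", "South Sudan", "Afghanistan"],
                    "Score": [7.8087, 2.8166, 2.5669]})
a21 = pd.DataFrame({"Country_name": ["Finland", "Afghanistan"],
                    "Score": [7.842, 2.523]})
frames = {"2019": a19, "2020": a20, "2021": a21}

# Country names that appear in every year.
common = reduce(set.intersection, (set(f["Country_name"]) for f in frames.values()))

# Drop the other rows, then line the yearly scores up side by side.
trimmed = {yr: f[f["Country_name"].isin(common)].set_index("Country_name")
           for yr, f in frames.items()}
combined = pd.concat({yr: f["Score"] for yr, f in trimmed.items()}, axis=1)
print(combined)
```

Matching is done on the exact spelling of `Country_name`, so a country whose name is written differently from one year to the next would be dropped unless the names are harmonized first.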

In [203]:
merged = df19.merge(df20, on='Country_name')
merged.head(3)

,Country_name,Score_x,Freedom_x,Score_y,Freedom_y
0,Finland,7.769,0.596,7.8087,0.949172
1,Denmark,7.6,0.592,7.6456,0.951444
2,Norway,7.554,0.603,7.488,0.95575


In [201]:
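# Filtering rows of df19
# Caution: the first mask is built from df20, so pandas aligns it with df19 by Happiness Rank.
# A row passes only when the countries at that rank in both years are unmatched, so this list may be incomplete.
result = df19[(~df20.Country_name.isin(merged.Country_name))&(~df19.Country_name.isin(merged.Country_name))]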
result

Happiness Rank,Country_name,Score,Freedom
25,Taiwan,6.446,0.351
76,Hong Kong,5.43,0.44


In [202]:
# Filtering rows of df20: keep the countries that found no match in the merge
result2 = df20[~df20.Country_name.isin(merged.Country_name)]
result2

Happiness Rank,Country_name,Score,Freedom
25,Taiwan Province of China,6.4554,0.77153
42,Trinidad and Tobago,6.1919,0.857907
76,North Cyprus,5.5355,0.795294
78,Hong Kong S.A.R. of China,5.5104,0.779834
87,Maldives,5.1976,0.853963
90,Macedonia,5.1598,0.738841


In [200]:
# Display Result
print("Rows of DataFrame1 that are not present in DataFrame2 are:\n",result,"\n")

# Display Result2
print("Rows of DataFrame2 that are not present in DataFrame1 are:\n",result2)

Rows of DataFrame1 that are not present in DataFrame2 are:
                Country_name  Score  Freedom
Happiness Rank                             
25                   Taiwan  6.446    0.351
76                Hong Kong  5.430    0.440 

Rows of DataFrame2 that are not present in DataFrame1 are:
                              Country_name   Score   Freedom
Happiness Rank                                             
25               Taiwan Province of China  6.4554  0.771530
42                    Trinidad and Tobago  6.1919  0.857907
76                           North Cyprus  5.5355  0.795294
78              Hong Kong S.A.R. of China  5.5104  0.779834
87                               Maldives  5.1976  0.853963
90                              Macedonia  5.1598  0.738841
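
An anti-join is another way to list the countries that appear in only one year, and it does not depend on how the two frames are indexed. The sketch below uses an outer merge with `indicator=True` on shortened toy frames (`left` and `right` stand in for `df19` and `df20`). The `aliases` mapping is a hypothetical example of reconciling the spelling differences visible in the output above; which names should be treated as the same country is a judgment call.

```python
import pandas as pd

# Toy frames: names taken from the outputs above, lists shortened.
left = pd.DataFrame({"Country_name": ["Finland", "Taiwan", "Hong Kong"]})
right = pd.DataFrame({"Country_name": ["Finland", "Taiwan Province of China",
                                       "Hong Kong S.A.R. of China", "Maldives"]})

# The _merge column labels each row as 'both', 'left_only' or 'right_only'.
check = left.merge(right, on="Country_name", how="outer", indicator=True)
only_left = check.loc[check["_merge"] == "left_only", "Country_name"].tolist()
only_right = check.loc[check["_merge"] == "right_only", "Country_name"].tolist()
print("Only in left:", sorted(only_left))
print("Only in right:", sorted(only_right))

# Hypothetical mapping that maps the longer spellings onto the shorter ones.
aliases = {
    "Taiwan Province of China": "Taiwan",
    "Hong Kong S.A.R. of China": "Hong Kong",
}
right["Country_name"] = right["Country_name"].replace(aliases)

# After harmonizing, an inner merge keeps the renamed countries as well.
matched = left.merge(right, on="Country_name")
print(len(matched), "countries matched after renaming")
```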
