# Analysing World Happiness Data from 2015, 2017 and 2019

The **WORLD HAPPINESS REPORT** is a publication of the [Sustainable Development Solutions Network](https://www.unsdsn.org/), powered by data from the Gallup World Poll. It surveys the state of global happiness and ranks 156 countries by how happy their citizens perceive themselves to be. This notebook uses the World Happiness Reports for 2015, 2017 and 2019. These reports also try to identify the factors that contribute to the happiness of a country's citizens.

For more information about these datasets, see the [Kaggle dataset page](https://www.kaggle.com/unsdsn/world-happiness).

In [None]:
#installing the jovian library
!pip install jovian

In [None]:
#importing the libraries required for visualisation
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import jovian
%matplotlib inline

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
#loading the datasets y1,y2,y3 corresponding to 2017,2015 and 2019 (each is indexed by its rank column)
y1_df=pd.read_csv('/kaggle/input/world-happiness/2017.csv',index_col=1)
y2_df=pd.read_csv('/kaggle/input/world-happiness/2015.csv',index_col=2)
y3_df=pd.read_csv('/kaggle/input/world-happiness/2019.csv',index_col=0)

Each factor column in a row of the above datasets (GDP per capita, family, life expectancy and so on) contributes to the happiness score of the corresponding country.

In [None]:
y1_df.drop(['Whisker.low','Whisker.high'],axis=1,inplace=True)

In [None]:
#renaming columns in y1 for convenience
y1_col={'Country':'Country','Happiness.Rank':'rank','Happiness.Score':'Score','Economy..GDP.per.Capita.':'GDP_per_capita','Family':'Family','Health..Life.Expectancy.':'life_expectancy','Freedom':'Freedom','Generosity':'Generosity','Trust..Government.Corruption.':'corr_perception','Dystopia.Residual':'dystopia_residual'}
y1_df.rename(columns=y1_col,inplace=True)

In [None]:
#renaming in y2 (keys must match the 2015 column names exactly, including case and spaces)
#'Health (Life Expectancy)' keeps its original name here and is dropped by that name later
y2_col={'Country':'Country','Happiness Score':'Score','Economy (GDP per Capita)':'GDP_per_capita','Family':'Social support','Freedom':'Freedom','Trust (Government Corruption)':'corr_perception','Generosity':'Generosity','Dystopia Residual':'dystopia_residual'}
y2_df.rename(columns=y2_col,inplace=True)

In [None]:
#renaming in y3
y3_col={'Overall rank':'rank','Country or region':'Country','Score':'Score','GDP per capita':'GDP_per_capita','Social support':'Social support','Healthy life expectancy':'life_expectancy','Freedom to make life choices':'Freedom','Generosity':'Generosity','Perceptions of corruption':'corr_perception'}
y3_df.rename(columns=y3_col,inplace=True)
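
`DataFrame.rename` silently ignores mapping keys that do not match an existing column, so a small typo in a key leaves the column unrenamed without any error. A minimal sketch of a check for this follows; the helper name `unmatched_rename_keys` and the toy frame are hypothetical and not part of the original notebook.

```python
import pandas as pd

def unmatched_rename_keys(df, mapping):
    """Return the keys of `mapping` that are not column names of `df`."""
    return [old for old in mapping if old not in df.columns]

# Toy frame standing in for one of the yearly datasets (values are made up).
toy = pd.DataFrame({"Happiness Score": [7.5, 6.1], "Freedom": [0.6, 0.4]})

# The second key has a trailing space, so rename() would skip it silently.
toy_map = {"Happiness Score": "Score", "Freedom ": "Freedom"}

missing = unmatched_rename_keys(toy, toy_map)
print(missing)  # ['Freedom ']
```

Calling the same helper with `y1_df` and `y1_col` (and likewise for the other two years) before renaming would list any keys that need correcting.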

**EXPLAINING THE DYSTOPIA RESIDUAL METRIC**

**DYSTOPIA RESIDUAL** : Dystopia is a hypothetical country consisting of the least happy people. It serves as a benchmark against which the Happiness Scores of other countries are compared.
The Dystopia Residual is calculated as (Score of Dystopia + **Residual** for the corresponding country). Here the Residual is a value generated for each country, which indicates whether the 6 variables have under- or over-explained the life evaluations of that country for that particular year.
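
To make the definition concrete, the sketch below assumes (as the dataset description suggests for the 2015 and 2017 files) that the happiness score is approximately the sum of the six factor columns plus the Dystopia Residual. The numbers are made up for illustration and are not taken from any report; the column names follow the renamed 2017 columns.

```python
import pandas as pd

# Made-up values for illustration only; they are not taken from any report.
toy = pd.DataFrame(
    {
        "GDP_per_capita": [1.30, 0.40],
        "Family": [1.20, 0.60],
        "life_expectancy": [0.80, 0.30],
        "Freedom": [0.60, 0.30],
        "Generosity": [0.30, 0.20],
        "corr_perception": [0.40, 0.10],
        "dystopia_residual": [2.50, 1.90],
    },
    index=["Country A", "Country B"],
)

factor_cols = ["GDP_per_capita", "Family", "life_expectancy",
               "Freedom", "Generosity", "corr_perception"]

# Part of the score attributed to the six factors.
explained = toy[factor_cols].sum(axis=1)

# Assumed relation: score = explained part + dystopia residual.
reconstructed = explained + toy["dystopia_residual"]
print(reconstructed.round(2))
```

Comparing `reconstructed` with the `Score` column of a real dataset would show how closely this assumed relation holds for that year.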

In [None]:
#setting a darkgrid style for each visualisation
sns.set_style("darkgrid")
project_name='World Happiness Report'

**Distribution of scores across different countries**

Have people become happier in the later years?

In [None]:
#for 2015 and 2017
plt.figure(figsize=(10,6))
a=10
plt.hist(y2_df.Score,a,label='2015',alpha=0.3,color='red')
plt.hist(y1_df.Score,a,label='2017',alpha=0.5,color='skyblue')
plt.ylabel('No. of countries',size=13)
plt.legend(loc='upper right')
plt.title("Distribution of Happiness scores across 2015,2017",size=16)

As the overlaid histograms show, the extreme values shifted slightly to the left in 2017, which suggests that living standards at the extremes fell compared with 2015. Meanwhile, noticeably more countries had a score between 5 and 6 in 2017 than in 2015.

We can conclude that the moderately happy countries became happier in 2017, while living standards at the extremes fell to some extent.
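
One caveat when reading overlaid histograms: passing an integer bin count to `plt.hist` lets each call choose its own bin edges, so the bars of the two years may not cover the same score ranges. The sketch below shows one way to compute shared edges with NumPy. The score arrays are synthetic stand-ins, not the real data.

```python
import numpy as np

# Synthetic stand-ins for y2_df.Score and y1_df.Score (not real data).
rng = np.random.default_rng(0)
scores_a = rng.uniform(2.8, 7.6, size=150)
scores_b = rng.uniform(2.7, 7.5, size=150)

# Compute one set of 10 bins spanning both samples.
edges = np.histogram_bin_edges(np.concatenate([scores_a, scores_b]), bins=10)

# Count each sample against the same edges so the bars are comparable.
counts_a, _ = np.histogram(scores_a, bins=edges)
counts_b, _ = np.histogram(scores_b, bins=edges)
print(edges.round(2))
print(counts_a, counts_b)
```

Passing `bins=edges` to both `plt.hist` calls would then draw the two years on identical bins.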

In [None]:
#for 2017 and 2019
plt.figure(figsize=(10,6))
b=10
plt.hist(y1_df.Score,b,label='2017',alpha=0.3)
plt.hist(y3_df.Score,b,label='2019',alpha=0.3)
plt.ylabel("No. of Countries",size=13)
plt.legend(loc="upper right")
plt.title('Distribution of Happiness scores across 2017,2019',size=16)

The overlaid histograms above show an increase in the happiness score at the extremes, as the distribution appears to shift to the right in 2019. The number of countries with a score between 4 and 5 has also increased significantly. This may signify relatively better living standards and greater satisfaction of people with their lives and their government in 2019.

We can conclude that people have been happier in 2019 as compared to 2017.
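
A visual shift can be backed up with simple summary statistics. The sketch below compares the mean and median of two score series; the helper `score_shift` and the toy values are hypothetical, and with the real data one would pass `y1_df.Score` and `y3_df.Score` instead.

```python
import pandas as pd

def score_shift(earlier, later):
    """Return the change in mean and median score from `earlier` to `later`."""
    return pd.Series({
        "mean_change": later.mean() - earlier.mean(),
        "median_change": later.median() - earlier.median(),
    })

# Toy values for illustration only (not real scores).
earlier = pd.Series([3.0, 4.0, 5.0, 6.0, 7.0])
later = pd.Series([3.5, 4.5, 5.0, 6.5, 7.5])

shift = score_shift(earlier, later)
print(shift.round(2))
```

A positive `mean_change` would be consistent with the rightward shift read off the histogram, although it says nothing about which countries moved.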

**Correlating features with the happiness scores**

Is there a strong relation between the happiness score of a country and the state of that country?

To answer this we will find out the correlation values between Happiness Scores and:
1. corr_perception
2. GDP_per_capita
3. Freedom

across the three datasets.
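
The three correlations listed above can also be read off directly as one column of the correlation matrix. The following is a minimal sketch on toy data, assuming the columns carry the renamed labels `Score`, `corr_perception`, `GDP_per_capita` and `Freedom`; the helper `score_correlations` is hypothetical.

```python
import pandas as pd

features = ["corr_perception", "GDP_per_capita", "Freedom"]

def score_correlations(df, features, target="Score"):
    """Pearson correlation of each feature with the target column."""
    corr = df[features + [target]].corr(method="pearson")
    return corr[target].drop(target)

# Toy values for illustration only (not real data).
toy = pd.DataFrame({
    "Score": [7.0, 6.0, 5.0, 4.0],
    "GDP_per_capita": [1.4, 1.2, 1.0, 0.8],  # exactly linear in Score
    "Freedom": [0.6, 0.5, 0.55, 0.3],
    "corr_perception": [0.4, 0.1, 0.3, 0.05],
})

result = score_correlations(toy, features)
print(result.round(2))
```

Selecting the columns explicitly also keeps text columns such as the country name out of the correlation.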

In [None]:
#correlation values for 2015 dataset
#creating a copy of the dataset that keeps 4 numeric columns
y2=y2_df.copy()
y2.drop(['Standard Error','Social support','Health (Life Expectancy)','Generosity','dystopia_residual'],axis=1,inplace=True)

In [None]:
#creating a correlation matrix between numeric columns only (text columns such as Country are excluded)
c2=y2.select_dtypes('number').corr(method='pearson')
plt.figure(figsize=(10,6))
sns.heatmap(c2,annot=True)

In [None]:
#correlations for 2017 dataset
y1=y1_df.copy()
y1.drop(['Family','life_expectancy','Generosity','dystopia_residual'],axis=1,inplace=True)


In [None]:
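#correlation matrix between numeric columns only (the Country column is text)
c1=y1.select_dtypes('number').corr(method='pearson')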
plt.figure(figsize=(10,6))
sns.heatmap(c1,annot=True,cmap="YlOrRd")

In [None]:
#correlations for 2019 dataset
y3=y3_df.copy()
y3.drop(['Social support','life_expectancy','Generosity'],axis=1,inplace=True)

In [None]:
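#correlation matrix between numeric columns only (the Country column is text)
c3=y3.select_dtypes('number').corr(method='pearson')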
plt.figure(figsize=(10,6))
sns.heatmap(c3,annot=True,cmap='PuBuGn')

**WRITE CONCLUSIONS OF THE ABOVE CORRELATIONS**

In [None]:
jovian.commit(project=project_name)