## Introduction

_The World Happiness Report is a landmark survey of the state of global happiness. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness._

![Happy kids!](http://www.awarenessdays.com/wp-content/uploads/2018/07/kidshomepage1.jpg)

## Content

_The columns following the happiness score estimate the extent to which each of six factors – economic production, social support, life expectancy, freedom, absence of corruption, and generosity – contribute to making life evaluations higher in each country than they are in Dystopia, a hypothetical country that has values equal to the world’s lowest national averages for each of the six factors. They have no impact on the total score reported for each country, but they do explain why some countries rank higher than others._

In [None]:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [None]:
# matplotlib setting
mpl.rcParams['figure.dpi'] = 200
mpl.rcParams['axes.spines.top'] = False
mpl.rcParams['axes.spines.right'] = False

In [None]:
df = pd.read_csv("../input/world-happiness-report-2021/world-happiness-report-2021.csv")

_Let's see the first 5 records_

In [None]:
df.head()

_Get some information on the data_

In [None]:
df.info()

In [None]:
df.describe().T.style.bar(subset=['mean'], color='#205ff2')\
                            .background_gradient(subset=['std'], cmap='Reds')\
                             .background_gradient(subset=['50%'], cmap='coolwarm')

**The table above shows the summary statistics of the dataframe.**

**The darkest red cell in the `std` column is 6.7620, the highest of all the standard deviations. It belongs to Healthy life expectancy, which is measured in years rather than on a 0 to 1 scale.**

You can read the rest of the table the same way!
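
If you would rather not read the colors by eye, here is a minimal sketch of how to pull out the column with the largest standard deviation. The helper name `most_variable_column` and the toy frame are my own, and the toy numbers are made up just to show the call; on the real data you would pass `df`.

```python
import pandas as pd

def most_variable_column(frame):
    """Return (column name, std) for the numeric column with the largest std."""
    # numeric_only=True skips text columns such as 'Country name'
    stds = frame.std(numeric_only=True)  # sample std (ddof=1), same as describe()
    return stds.idxmax(), float(stds.max())

# Toy frame with made-up values (NOT taken from the report)
toy = pd.DataFrame({
    "Country name": ["A", "B", "C"],
    "Healthy life expectancy": [50.0, 60.0, 70.0],
    "Social support": [0.5, 0.6, 0.7],
})

name, value = most_variable_column(toy)
print(name, value)
```

On the notebook's data the call would be `most_variable_column(df)`.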

_There are no missing values (from df.info())_
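
`df.info()` only shows non-null counts, so a direct count of missing values per column can be easier to read. A small sketch, assuming a helper name (`missing_counts`) and a toy frame that are not part of the original notebook:

```python
import pandas as pd

def missing_counts(frame):
    """Count missing (NaN/None) entries in every column."""
    return frame.isna().sum()

# Toy frame with made-up values, one of them deliberately missing
toy = pd.DataFrame({
    "Country name": ["A", "B"],
    "Ladder score": [7.1, None],
})

counts = missing_counts(toy)
print(counts)
```

For the real data, `missing_counts(df).sum() == 0` would confirm that nothing is missing.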

In [None]:
plt.figure(figsize = (20,15))
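# numeric_only=True skips text columns such as 'Country name', which newer pandas versions refuse to correlate
sns.heatmap(df.corr(numeric_only=True), annot = True)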

In [None]:
high_corruption = df[['Country name','Perceptions of corruption','Healthy life expectancy' ]].sort_values('Perceptions of corruption', ascending = False)

In [None]:
import plotly.express as px

fig = px.bar(high_corruption[:30], x='Country name', y='Perceptions of corruption', color ='Healthy life expectancy',
            title = 'High corruption countries')
fig.show()

**Top 30 countries with the highest perceived corruption**

**Afghanistan, Lesotho, Nigeria and Sierra Leone** have the lowest healthy life expectancy.

In [None]:
# Let's see the countries with the least perceived corruption
fig = px.bar(high_corruption[120:], x='Country name', y='Perceptions of corruption', color ='Healthy life expectancy',
            title = 'Low corruption countries')
fig.show()

**Singapore, Rwanda, Denmark, Sweden and Finland** have the lowest perceived corruption.
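
Slicing a sorted frame with `[:30]` or `[120:]` depends on knowing how many rows there are. As an alternative sketch (not what the cells above do), pandas' `nlargest` and `nsmallest` pick both ends directly. The helper name `corruption_extremes` and the toy values are assumptions for illustration only.

```python
import pandas as pd

def corruption_extremes(frame, n=5, column="Perceptions of corruption"):
    """Return the n rows with the highest and the n rows with the lowest values."""
    highest = frame.nlargest(n, column)   # sorted from highest down
    lowest = frame.nsmallest(n, column)   # sorted from lowest up
    return highest, lowest

# Toy frame with made-up values (NOT taken from the report)
toy = pd.DataFrame({
    "Country name": ["A", "B", "C", "D"],
    "Perceptions of corruption": [0.9, 0.1, 0.5, 0.7],
})

highest, lowest = corruption_extremes(toy, n=2)
print(highest)
print(lowest)
```

Either result can be passed to `px.bar` in the same way as the sliced frames.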


In [None]:
life_exp = df[['Country name','Freedom to make life choices','Healthy life expectancy' ]].sort_values('Healthy life expectancy', ascending = False)

In [None]:
fig = px.bar(life_exp[:20], x='Country name', y='Healthy life expectancy', color ='Freedom to make life choices',
            title = 'High life expectancy and freedom for life choices')
fig.show()

**Singapore has the highest healthy life expectancy, 76.953, and its freedom to make life choices is also very high at 0.927.**

_Happy people_ 🤗

In [None]:
fig = px.bar(life_exp[140:], x='Country name', y='Healthy life expectancy', color ='Freedom to make life choices',
            title = 'Low life expectancy and freedom for life choices')
fig.show()

**Chad, Lesotho and Nigeria have the lowest healthy life expectancy.**

_Sad people_ ☹

In [None]:
sns.barplot(y = 'Regional indicator', x ='Freedom to make life choices', data = df)

**Low freedom to make life choices is observed in the Middle East and North Africa and in Sub-Saharan Africa.**

In [None]:
fig = px.scatter(df, x='Regional indicator', y="Freedom to make life choices", color="Healthy life expectancy",
                 size='Logged GDP per capita', hover_data=['Social support'])
fig.show()

*Move your cursor over the bubbles to get detailed information.*

**For example, if you put your cursor on the lowest bubble for South Asia, you will see Freedom to make life choices = 0.382, Logged GDP per capita = 7.695, Social support = 0.463 and Healthy life expectancy = 52.493.**

In [None]:
cols = ['Logged GDP per capita','Social support', 'Healthy life expectancy', 'Freedom to make life choices',
        'Generosity', 'Perceptions of corruption', 'Ladder score in Dystopia']

sns.pairplot(df[cols], height = 3)

**Logged GDP per capita has a strong correlation with Healthy life expectancy and Social support, and a moderate correlation with Freedom to make life choices.**

**Social support has a strong correlation with Healthy life expectancy and a moderate correlation with Freedom to make life choices.**

**Healthy life expectancy has a strong correlation with Logged GDP per capita and Social support, and a moderate correlation with Freedom to make life choices.**

**Freedom to make life choices has a moderate correlation with Logged GDP per capita, Healthy life expectancy, Social support and Generosity.**
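
Words like "strong" and "moderate" are read off the scatter plots by eye. To put numbers on them, here is a minimal sketch that ranks column pairs by the absolute value of their Pearson correlation. The helper name `ranked_correlations` and the toy values are made up for illustration; on the real data you would pass `df` and `cols`.

```python
from itertools import combinations

import pandas as pd

def ranked_correlations(frame, columns):
    """Return (col_a, col_b, r) tuples sorted by |r|, strongest first."""
    corr = frame[columns].corr()  # Pearson correlation by default
    pairs = [(a, b, float(corr.loc[a, b])) for a, b in combinations(columns, 2)]
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)

# Toy frame with made-up values (NOT taken from the report)
toy = pd.DataFrame({
    "Logged GDP per capita": [1.0, 2.0, 3.0, 4.0],
    "Social support": [2.0, 4.0, 6.0, 8.0],
    "Generosity": [4.0, 1.0, 3.0, 2.0],
})

ranked = ranked_correlations(toy, list(toy.columns))
for col_a, col_b, r in ranked:
    print(f"{col_a} vs {col_b}: r = {r:.2f}")
```

A constant column has no defined correlation (it shows up as NaN), so it is best left out of the column list.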

In [None]:
fig = px.pie(df, values='Logged GDP per capita', names='Regional indicator', title='% of Logged GDP of regions from data')
fig.show()

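Sub-Saharan Africa makes up 20.7% of the pie. Note that each slice is the sum of Logged GDP per capita over the countries in that region, so a region's share also reflects how many countries it contains.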
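
To separate "many countries" from "high GDP", it can help to look at each region's share of the rows on its own. A small sketch, assuming a helper name (`region_shares`) and toy region labels that are not in the original notebook:

```python
import pandas as pd

def region_shares(frame, column="Regional indicator"):
    """Fraction of rows (countries) that fall in each region."""
    return frame[column].value_counts(normalize=True)

# Toy frame with made-up region labels
toy = pd.DataFrame({"Regional indicator": ["X", "X", "Y", "Z"]})

shares = region_shares(toy)
print(shares)
```

Calling `region_shares(df)` on the real data gives the share of countries per region, which can be compared with the slices of the pie.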

_Log of real GDP per capita_

_Natural logs have a few great properties for our purposes. Using them means that every step up the y-axis is an identical percent change in real GDP per capita. Going from 7.0 to 7.5, for example, is a 65% increase in real GDP per capita. Going from 7.5 to 8.0 is also a 65% increase in real GDP per capita._
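
The 65% figure follows from the fact that a step of 0.5 in natural-log units multiplies the underlying value by e^0.5, whatever the starting point. A quick check (the function name is my own):

```python
import math

def percent_change_from_log_step(step):
    """Percent change in the original value for a given step in natural-log units."""
    return (math.exp(step) - 1.0) * 100.0

half_step = percent_change_from_log_step(0.5)

# The starting point does not matter: 7.0 -> 7.5 and 7.5 -> 8.0 give the same ratio
ratio_low = math.exp(7.5) / math.exp(7.0)
ratio_high = math.exp(8.0) / math.exp(7.5)

print(f"{half_step:.1f}% increase per 0.5 log step")
print(f"ratios: {ratio_low:.4f} and {ratio_high:.4f}")
```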

In [None]:
fig = px.density_heatmap(df, x="Freedom to make life choices", y='Perceptions of corruption', marginal_x="box", marginal_y="violin")
fig.show()

**Freedom to make life choices has a median value of 0.804.**

**Perceptions of corruption has a median value of 0.781.**

**The square cells show how many countries fall into each combination of freedom to make life choices and perceptions of corruption. Again, move your cursor around to read the values.**
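
The medians quoted above are read from the marginal plots on hover. They can also be computed directly; here is a minimal sketch with an assumed helper name (`column_medians`) and made-up toy values:

```python
import pandas as pd

def column_medians(frame, columns):
    """Median of each requested column."""
    return frame[columns].median()

# Toy frame with made-up values (NOT taken from the report)
toy = pd.DataFrame({
    "Freedom to make life choices": [0.6, 0.8, 0.9],
    "Perceptions of corruption": [0.2, 0.7, 0.9],
})

medians = column_medians(toy, ["Freedom to make life choices", "Perceptions of corruption"])
print(medians)
```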

In [None]:
fig = px.scatter(df, x="Healthy life expectancy", y="Logged GDP per capita", color="Regional indicator",
                  marginal_x="box")
fig.show()

**A simple scatter plot to understand the correlation between healthy life expectancy and logged GDP per capita, along with the regions they represent.**

**Notice that the pink dots at the bottom left represent Sub-Saharan Africa, with a healthy life expectancy of 53.4.**

In [None]:
fig = px.violin(df, y="Logged GDP per capita", x="Generosity", color='Regional indicator',
                box=True,              # draw a box plot inside the violin
                points='all',          # can also be 'outliers' or False
                hover_data=df.columns  # show every column on hover
               )
fig.show()

**This may be the most essential plot.**

**Move your cursor over the points and you will get the full details for each one (all the columns, such as country name, logged GDP and generosity).**

#### There you go: a detailed analysis of some of the most important columns. You can dive into more detail from here, just copy and edit.
**Be sure to upvote if you like it** 😀😀

![Stay Happy](http://ih1.redbubble.net/image.1249612855.2033/st,small,845x845-pad,1000x1000,f8f8f8.jpg)