***
**World Happiness Report 2021**  
-----
***  
  
This notebook explores happiness scores across countries and looks for the factors associated with lower happiness.

<h3>Description</h3>
___

**Context**

The World Happiness Report is a landmark survey of the state of global happiness. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.
___

**Content**

The happiness scores and rankings use data from the Gallup World Poll. The columns following the happiness score estimate the extent to which each of six factors – economic production, social support, life expectancy, freedom, absence of corruption, and generosity – contributes to making life evaluations higher in each country than they are in Dystopia, a hypothetical country that has values equal to the world’s lowest national averages for each of the six factors. These estimates have no impact on the total score reported for each country, but they do explain why some countries rank higher than others.
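
To make the role of the six factor columns concrete, here is a minimal sketch of how they relate to the total score. It assumes that the `Dystopia_residual` column already combines the Dystopia baseline with each country's unexplained residual, so that the six `Explained_by:` columns plus `Dystopia_residual` add up to the ladder score. The two rows are made-up values for illustration, not figures from the report.

```python
import pandas as pd

# Illustrative values only (made up for this sketch, not taken from the report).
toy = pd.DataFrame({
    "Country_name": ["A", "B"],
    "Explained_by:Log_GDP_per_capita": [1.4, 0.4],
    "Explained_by:Social_support": [1.1, 0.5],
    "Explained_by:Healthy_life_expectancy": [0.7, 0.3],
    "Explained_by:Freedom_to_make_life_choices": [0.6, 0.2],
    "Explained_by:Generosity": [0.1, 0.1],
    "Explained_by:Perceptions_of_corruption": [0.4, 0.1],
    "Dystopia_residual": [3.2, 1.9],
})

explained_cols = [c for c in toy.columns if c.startswith("Explained_by:")]

# Assumed identity: score = six "explained by" contributions + Dystopia residual.
toy["Reconstructed_score"] = toy[explained_cols].sum(axis=1) + toy["Dystopia_residual"]
print(toy[["Country_name", "Reconstructed_score"]])
```

On the real table, comparing `Reconstructed_score` with `Ladder_score` is a quick way to check whether this assumption holds.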

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
from wordcloud import WordCloud
import matplotlib.lines as lines
import matplotlib.gridspec as gs

In [None]:
df = pd.read_csv('../input/world-happiness-report-2021/world-happiness-report-2021.csv')
df.head(10)

Here `Ladder score` is the happiness score given to each country; countries are ranked by this score, and the remaining columns describe factors that contribute to it.

Rename the columns so they are easier to select.
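
As an alternative to typing every new name by hand, the renaming can be sketched with pandas string methods. The raw headers below are assumed to resemble the CSV's style (spaces, a colon, a plus sign); check `df.columns` on the real file before relying on the patterns.

```python
import pandas as pd

# Hypothetical raw headers in the style assumed for the CSV.
raw = pd.DataFrame(columns=["Country name", "Ladder score",
                            "Explained by: Social support", "Dystopia + residual"])

cleaned = (
    raw.columns
    .str.replace(r"\s*\+\s*", " ", regex=True)  # "Dystopia + residual" -> "Dystopia residual"
    .str.replace(r":\s*", ":", regex=True)      # "by: Social" -> "by:Social"
    .str.replace(r"\s+", "_", regex=True)       # remaining whitespace -> "_"
)
raw.columns = cleaned
print(list(raw.columns))
```

Generating the names this way avoids hand-typing slips such as a stray space inside a column name.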

In [None]:
df.shape

In [None]:
df.columns = ['Country_name', 'Regional_indicator', 'Ladder_score',
       'Standard_error_of_ladder_score', 'upperwhisker', 'lowerwhisker',
       'Logged_GDP_per_capita', 'Social_support', 'Healthy_life _expectancy',
       'Freedom_to_make_life_choices', 'Generosity',
       'Perceptions_of_corruption', 'Ladder_score_in_Dystopia',
       'Explained_by:Log_GDP_per_capita', 'Explained_by:Social_support',
       'Explained_by:Healthy_life_expectancy',
       'Explained_by:Freedom_to_make_life_choices',
       'Explained_by:Generosity', 'Explained_by:Perceptions_of_corruption',
       'Dystopia_residual']

In [None]:
df

Let's find out how our country, India, fares in the World Happiness Report.

In [None]:
df.loc[df.Country_name == 'India']
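
The row above shows the score but not the position in the ranking. A minimal sketch of deriving a rank from the score, using made-up values and a hypothetical `Rank` column:

```python
import pandas as pd

# Made-up scores for illustration only.
toy = pd.DataFrame({
    "Country_name": ["A", "B", "C", "D"],
    "Ladder_score": [7.8, 5.1, 3.8, 6.2],
})

# Rank 1 = highest ladder score; method="min" gives tied countries the same rank.
toy["Rank"] = toy["Ladder_score"].rank(ascending=False, method="min").astype(int)

row = toy.loc[toy["Country_name"] == "C"]
print(row)
```

Applied to `df`, the same two lines would give the rank for any country selected with `df.loc[...]`.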

In [None]:
df1 = pd.read_csv('../input/world-happiness-report-2021/world-happiness-report.csv')

In [None]:
df1.columns = ['Country_name', 'year', 'Life_Ladder', 'Log_GDP_per_capita',
       'Social_support', 'Healthy_life_expectancy_at_birth',
       'Freedom_to_make_life_choices', 'Generosity',
       'Perceptions_of_corruption', 'Positive_affect', 'Negative_affect']

In [None]:
df1.head()

In [None]:
fig = plt.figure(figsize = (5,2), facecolor = 'white')

ax0 = fig.add_subplot(1,1,1)
ax0.text(1.1,1, "Key figures", color = 'black', fontsize=28, fontweight='bold', 
         fontfamily = 'monospace', ha ='center')


ax0.text(0,0.4,"Finland", color = 'gold', fontsize = 25, fontweight = 'bold', fontfamily='monospace',
        ha ='center')
ax0.text(0,0.001, "Happiest Country", color= 'dimgrey', fontsize = 17, fontweight =  'light', 
         fontfamily = 'monospace',ha = 'center')


ax0.text(0.75,0.4,"Afghanistan", color = 'tomato', fontsize = 25, fontweight = 'bold', fontfamily='monospace',
        ha ='center')
ax0.text(0.75,0.001, "Saddest Country", color= 'dimgrey', fontsize = 17, fontweight =  'light', 
         fontfamily = 'monospace',ha = 'center')


ax0.text(1.50,0.4,"9 of top 10", color = 'darkgreen', fontsize = 25, fontweight = 'bold', fontfamily='monospace',
        ha ='center')
ax0.text(1.50,0.001, "in Europe", color= 'dimgrey', fontsize = 17, fontweight =  'light', 
         fontfamily = 'monospace',ha = 'center')



ax0.text(2.35,0.4,"7 of bottom 10", color = 'darkred', fontsize = 25, fontweight = 'bold', fontfamily='monospace',
        ha ='center')
ax0.text(2.35,0.001, "in Africa", color= 'dimgrey', fontsize = 17, fontweight =  'light', 
         fontfamily = 'monospace',ha = 'center')

ax0.set_yticklabels('')
ax0.set_xticklabels('')
ax0.tick_params(axis='both',length=0)

for s in ['top','right','left','bottom']:
    ax0.spines[s].set_visible(False)
    
    
l1 = lines.Line2D([0.15, 1.95], [0.67, 0.67], transform=fig.transFigure, figure=fig,color = 'gray', 
                  linestyle='-',linewidth = 1.1, alpha = .5)
fig.lines.extend([l1])
l2 = lines.Line2D([0.15, 1.95], [0.07, 0.07], transform=fig.transFigure, figure=fig,color = 'gray', 
                  linestyle='-',linewidth = 1.1, alpha = .5)
fig.lines.extend([l2])
    
plt.show()
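
The key figures in the panel above are typed in by hand. A minimal sketch of how they could be derived from the table instead, using made-up rows whose region labels follow the style of `Regional_indicator` (the exact labels are an assumption to verify with `df['Regional_indicator'].unique()`):

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Country_name": ["A", "B", "C", "D", "E", "F"],
    "Regional_indicator": ["Western Europe", "Western Europe", "North America and ANZ",
                           "Sub-Saharan Africa", "South Asia", "Sub-Saharan Africa"],
    "Ladder_score": [7.8, 7.6, 7.3, 3.5, 2.5, 3.1],
})

happiest = toy.loc[toy["Ladder_score"].idxmax(), "Country_name"]
least_happy = toy.loc[toy["Ladder_score"].idxmin(), "Country_name"]

n = 3  # use 10 on the full table
top_n = toy.nlargest(n, "Ladder_score")
bottom_n = toy.nsmallest(n, "Ladder_score")

# Any region label mentioning "Europe" counts as Europe.
europe_in_top = int(top_n["Regional_indicator"].str.contains("Europe").sum())
# Exact match here: a substring test on "Africa" would also count
# a label such as "Middle East and North Africa".
africa_in_bottom = int((bottom_n["Regional_indicator"] == "Sub-Saharan Africa").sum())

print(happiest, least_happy, europe_in_top, africa_in_bottom)
```

Using `nlargest` and `nsmallest` also removes the dependence on the file already being sorted by score.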

In [None]:
fig= plt.figure(figsize=(15,8))
g=gs.GridSpec(ncols=1, nrows=2, figure=fig)
plt.suptitle("Top 5 and Bottom 5 countries in Happiness index 2021", family='Serif', weight='bold', size=20)
ax1=plt.subplot(g[0,0])

top_5=df.head(5)
bot_5= df.tail(5)
ax1=sns.barplot(data=top_5, x=top_5['Ladder_score'],y=top_5['Country_name'], color='green')
#ax1.set_xlabel('')
ax1.xaxis.set_visible(False)
ax1.annotate("Top 5 countries in Happiness index",xy=(8,2), family='Serif', weight='bold', size=12)
ax2=plt.subplot(g[1,0], sharex=ax1)
ax2=sns.barplot(data=bot_5, x=bot_5['Ladder_score'],y=bot_5['Country_name'], color='red')
ax2.annotate("Bottom 5 countries in Happiness index",xy=(8,2), family='Serif', weight='bold', size=12)

for s in ['left','right','top','bottom']:
    ax1.spines[s].set_visible(False)
    ax2.spines[s].set_visible(False)

In [None]:
plt.figure(figsize = (10,50))
plt.title('Rate of Happiness in 149 countries')
sns.barplot(y=df.Country_name, x=df.Ladder_score)
plt.xlabel("Ladder Score")

In [None]:
plt.figure(figsize = (10,50))
plt.title('Dystopia residual by country')
sns.scatterplot(y=df.Country_name, x=df.Dystopia_residual)
plt.xlabel("Dystopia residual")

In [None]:
plt.figure(figsize = (10,8))
plt.title('Ladder score by region')
sns.barplot(y=df.Regional_indicator, x=df.Ladder_score)
plt.xlabel("Ladder Score")
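
By default `sns.barplot` draws the mean of the values in each category, so the bars in the region plot are per-region averages. The same numbers can be sketched with a `groupby`; the rows below are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Regional_indicator": ["Western Europe", "Western Europe", "South Asia", "South Asia"],
    "Ladder_score": [7.0, 6.0, 4.0, 5.0],
})

# Mean ladder score per region, highest first.
region_mean = (
    toy.groupby("Regional_indicator")["Ladder_score"]
    .mean()
    .sort_values(ascending=False)
)
print(region_mean)
```

Having the averages as a Series makes it easy to sort the regions or quote exact values alongside the chart.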

In [None]:
df.head()

Let's see whether the numerical variables are related to one another by excluding the categorical columns (`Country_name` and `Regional_indicator`) and drawing a heatmap.

In [None]:
dfh = df[['Ladder_score',
       'Standard_error_of_ladder_score', 'upperwhisker', 'lowerwhisker',
       'Logged_GDP_per_capita', 'Social_support', 'Healthy_life _expectancy',
       'Freedom_to_make_life_choices', 'Generosity',
       'Perceptions_of_corruption', 'Ladder_score_in_Dystopia',
       'Explained_by:Log_GDP_per_capita', 'Explained_by:Social_support',
       'Explained_by:Healthy_life_expectancy',
       'Explained_by:Freedom_to_make_life_choices', 'Explained_by:Generosity',
       'Explained_by:Perceptions_of_corruption', 'Dystopia_residual']]

In [None]:
dfh.head()

In [None]:
dfh.corr()

In [None]:
plt.figure(figsize=(10,10))
corr_matrix = dfh.corr()
sns.heatmap(corr_matrix, annot = True)
plt.show()

In the heatmap, `Explained_by:Generosity`, `Perceptions_of_corruption`, and `Standard_error_of_ladder_score` are negatively correlated with `Ladder_score`, while most other features are positively correlated.

This means each of these features is associated with either a higher or a lower `Ladder_score`.
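
Reading signs off a large heatmap is error-prone. A minimal sketch of listing each feature's correlation with the ladder score in sorted order, using made-up values and only two feature columns:

```python
import pandas as pd

# Made-up values for illustration only.
toy = pd.DataFrame({
    "Ladder_score": [7.5, 6.0, 5.0, 3.5],
    "Logged_GDP_per_capita": [11.0, 10.0, 9.5, 7.5],
    "Perceptions_of_corruption": [0.2, 0.6, 0.7, 0.9],
})

# Pearson correlation of every column with the ladder score, most negative first.
with_score = toy.corr()["Ladder_score"].drop("Ladder_score").sort_values()
print(with_score)
```

Applied to `dfh`, the negative entries at the top of the sorted Series are the features that move against the ladder score.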

In [None]:
fig=plt.figure(figsize=(15,10))
plt.title("Ladder score distribution by Regional indicator",family='Serif', weight='bold', size=20)
# `shade` is a deprecated alias of `fill`, so only `fill` is passed
sns.kdeplot(x=df['Ladder_score'], hue=df['Regional_indicator'], fill=True,
            linewidth=2, edgecolor='white', multiple='layer')
plt.axvline(df['Ladder_score'].mean(), c='black',ls='--')
plt.text(x=df['Ladder_score'].mean(),y=-0.01,s='Population_mean', size=15)
for s in ['left','right','top','bottom']:
    plt.gca().spines[s].set_visible(False)


Life expectancy and GDP in Latin America and the Caribbean

In [None]:
df['Regional_indicator'].unique()

In [None]:
GDP = df[['Country_name','Logged_GDP_per_capita','Healthy_life _expectancy','Regional_indicator']]
# GDP.sort_values(by=['Regional_indicator'== "South America"])
GDP_SA = GDP.loc[GDP.Regional_indicator == "Latin America and Caribbean"]
GDP_SA

In [None]:
plt.figure(figsize = (10,8))
plt.title('Healthy life expectancy in Latin America and the Caribbean')
sns.barplot(x=GDP_SA['Healthy_life _expectancy'], 
            y=GDP_SA['Country_name'])
plt.xlabel("Healthy life expectancy")
plt.ylabel("Latin American and Caribbean countries")
plt.show()

In [None]:
plt.plot(GDP_SA.Logged_GDP_per_capita)

In [None]:
GDP_SA

In [None]:
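# Correlate only the numeric columns; Country_name and Regional_indicator are text
corr = GDP_SA.select_dtypes(include='number').corr()
corr
# sns.heatmap(GDP_SA)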

In [None]:
sns.heatmap(corr, annot=True, cmap='Blues')
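
With only two numeric columns, the heatmap boils down to a single number. A minimal sketch of computing it directly for one region, using made-up rows; the column name `Healthy_life _expectancy` (including the space) matches the renamed column in this notebook.

```python
import pandas as pd

# Made-up rows for illustration only.
toy = pd.DataFrame({
    "Country_name": ["A", "B", "C", "D"],
    "Regional_indicator": ["Latin America and Caribbean"] * 3 + ["Western Europe"],
    "Logged_GDP_per_capita": [9.0, 9.5, 10.0, 11.0],
    "Healthy_life _expectancy": [64.0, 68.0, 69.0, 73.0],
})

# Keep one region, then take the Pearson correlation of the two columns.
lac = toy[toy["Regional_indicator"] == "Latin America and Caribbean"]
r = lac["Logged_GDP_per_capita"].corr(lac["Healthy_life _expectancy"])
print(f"Pearson r between logged GDP and healthy life expectancy: {r:.2f}")
```

A value close to 1 would indicate that countries with higher logged GDP in the region also tend to have higher healthy life expectancy.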

In [None]:
# Work in progress