# World Happiness Index: Exploratory Data Analysis on India

### Context
The World Happiness Report is a landmark survey of the state of global happiness. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields (economics, psychology, survey analysis, national statistics, health, public policy and more) describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.

### Key Features in 'world-happiness-report':
1. <u>Life Ladder</u>: The score for happiness and liveability. Respondents rate their life on a ladder from 0 to 10, where a high score represents the best possible life and a low score the worst possible life.

2. <u>Log GDP per capita</u>: The logarithm of GDP per capita, which shows how much economic production value can be attributed to each individual citizen. It also serves as a measure of national wealth and prosperity.

3. <u>Social support</u>: Defined in terms of social network characteristics such as assistance from family, friends, neighbours and other community members.

4. <u>Healthy life expectancy at birth</u>: The average number of years lived in good health (that is, without irreversible limitation of activity in daily life or incapacities) by a fictitious generation subject to the conditions of mortality and morbidity prevailing that year.

5. <u>Freedom to make life choices</u>: An individual's opportunity and autonomy to perform an action selected from at least two available options, unconstrained by external parties.

6. <u>Generosity</u>: The virtue of being liberal in giving, often as gifts.

7. <u>Perceptions of corruption</u>: Corruption is a form of dishonesty or criminal offense undertaken by a person or organization entrusted with a position of authority, to acquire illicit benefit. This feature records how widespread respondents perceive it to be.



# Import Libraries

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Import the data

In [None]:
df = pd.read_csv("/kaggle/input/world-happiness-report-2021/world-happiness-report.csv")
df.info()

# Cleaning the data

In [None]:
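# Every country that appears in the dataset (kept for reference)
p1 = df['Country name'].unique()

# Fill missing values in each numeric column (everything after 'Country name' and 'year')
# with that column's mean taken over all countries and years
for x in df.columns[2:]:
    df[x] = df[x].fillna(df[x].mean())

df.info()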
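
The cell above fills every gap with the column mean taken over all countries, so a missing value for one country is replaced by a worldwide average. A possible alternative, sketched here on a toy frame with made-up values, is to use each country's own mean first and fall back to the global mean only when a country has no data at all. The helper name `fill_with_country_mean` and the toy data are assumptions for illustration, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values, standing in for the real dataset.
toy = pd.DataFrame({
    "Country name": ["A", "A", "A", "B", "B"],
    "year": [2018, 2019, 2020, 2019, 2020],
    "Generosity": [0.10, np.nan, 0.30, -0.20, np.nan],
})

def fill_with_country_mean(frame, columns, group_col="Country name"):
    """Fill gaps with the country's own mean, then fall back to the global mean."""
    out = frame.copy()
    for col in columns:
        # Per-row series holding the mean of that row's country
        country_mean = out.groupby(group_col)[col].transform("mean")
        global_mean = out[col].mean()
        out[col] = out[col].fillna(country_mean).fillna(global_mean)
    return out

filled = fill_with_country_mean(toy, ["Generosity"])
print(filled)
```

With the notebook's data the call would look like `fill_with_country_mean(df, df.columns[2:])`. Whether this is preferable depends on how many values are missing per country.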

# Happiest countries in the world by year

In [None]:
df1=df[df['Life Ladder']==df.groupby("year")['Life Ladder'].transform('max').values]
df1=df1.sort_values(by='year')
plt.figure(figsize=(15,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlim(0,15)
plt.ylim(0,10)
sns.barplot(x='year', y='Life Ladder', data = df1, hue = 'Country name', dodge=False)

df1

# Saddest countries in the world by year

In [None]:
df2=df[df['Life Ladder']==df.groupby("year")['Life Ladder'].transform('min').values]
df2=df2.sort_values(by='year')
plt.figure(figsize=(15,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlim(0,15)
plt.ylim(0,10)
sns.barplot(x='year', y='Life Ladder', data = df2, hue = 'Country name', dodge=False)

df2
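
The two cells above select the top and bottom country per year by comparing each row against a per-year `transform`. The same selection can be sketched with `groupby(...).idxmax()` and `idxmin()`, which return the row label of the extreme value in each group. The helper `extreme_by_year` and the toy data are hypothetical. Unlike the mask-based version, this keeps only one row per year when two countries tie.

```python
import pandas as pd

# Toy frame with made-up values, standing in for the real dataset.
toy = pd.DataFrame({
    "Country name": ["A", "B", "C", "A", "B", "C"],
    "year": [2019, 2019, 2019, 2020, 2020, 2020],
    "Life Ladder": [7.1, 5.0, 3.2, 6.9, 7.4, 3.0],
})

def extreme_by_year(frame, column="Life Ladder", kind="max"):
    """Return one row per year: the row holding the max (or min) of `column`."""
    grouped = frame.groupby("year")[column]
    idx = grouped.idxmax() if kind == "max" else grouped.idxmin()
    return frame.loc[idx].sort_values("year").reset_index(drop=True)

happiest = extreme_by_year(toy, kind="max")
saddest = extreme_by_year(toy, kind="min")
print(happiest)
print(saddest)
```

The same helper could be reused for the Asian and South-Asian subsets by passing a different frame.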

# Asian Countries

In [None]:
asian_countries= ['Afghanistan', 'Bangladesh', 'Bhutan', 'China', 'Cambodia', 'Hong Kong S.A.R. of China', 'India', 'Indonesia', 'Japan', 'Kazakhstan', 'Kyrgyzstan','Laos', 'Malaysia', 'Maldives', 'Mongolia', 'Myanmar', 'Nepal', 'Pakistan', 'Philippines', 'Singapore', 'South Korea', 'Sri Lanka', 'Taiwan Province of China', 'Tajikistan', 'Thailand', 'Turkmenistan', 'Uzbekistan', 'Vietnam']
print("Number of Asian countries:",len(asian_countries))
df_asian= df[df['Country name'].isin(asian_countries)]
df_asian=df_asian.reset_index(drop=True)
df_asian

# Happiest Asian countries by year

In [None]:
df3=df_asian[df_asian['Life Ladder']==df_asian.groupby("year")['Life Ladder'].transform('max').values]
df3=df3.sort_values(by='year')
plt.figure(figsize=(15,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlim(0,15)
plt.ylim(0,10)
sns.barplot(x='year', y='Life Ladder', data = df3, hue = 'Country name', dodge=False)
df3

# Saddest Asian countries by year

In [None]:
df3=df_asian[df_asian['Life Ladder']==df_asian.groupby("year")['Life Ladder'].transform('min').values]
df3=df3.sort_values(by='year')
plt.figure(figsize=(15,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlim(0,15)
plt.ylim(0,10)
sns.barplot(x='year', y='Life Ladder', data = df3, hue = 'Country name', dodge=False)
df3

# How happy have the people living in each Asian country been over the years 2005-2020?

In [None]:
df4=df_asian.copy()
df4.pop('year')
df4=df4.pivot_table(values=df4.columns[1:], index='Country name', aggfunc='mean')
df4=df4.reset_index()
df4=df4.sort_values(by='Life Ladder')
pal=sns.color_palette("flare", len(df4['Life Ladder']))
plt.figure(figsize=(25,8))
plt.xticks(fontsize=15, rotation=45)
plt.yticks(fontsize=15)
plt.xlim(0,25)
plt.ylim(0,10)
sns.barplot(x='Country name', y='Life Ladder', data = df4, palette=pal, dodge=False)
plt.show()
df4

# Comparing South Asian countries: India and its neighbours

In [None]:
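# India and its neighbours; China and Myanmar border India but are not usually classed as South Asia
sa=['Afghanistan','Bangladesh','Bhutan','China','India','Nepal','Maldives','Myanmar','Pakistan','Sri Lanka']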
print("Number of South-Asian countries:",len(sa))
df_sa= df[df['Country name'].isin(sa)]
df_sa=df_sa.reset_index(drop=True)
df_sa

# Happiest countries in South-Asia by year

In [None]:
df_sa_h=df_sa[df_sa['Life Ladder']==df_sa.groupby("year")['Life Ladder'].transform('max').values]
df_sa_h=df_sa_h.sort_values(by='year')
plt.figure(figsize=(15,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlim(0,15)
plt.ylim(0,10)
sns.barplot(x='year', y='Life Ladder', data = df_sa_h, hue = 'Country name', dodge=False)
df_sa_h

# Saddest countries in South-Asia by year

In [None]:
df_sa_s=df_sa[df_sa['Life Ladder']==df_sa.groupby("year")['Life Ladder'].transform('min').values]
df_sa_s=df_sa_s.sort_values(by='year')
plt.figure(figsize=(15,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.xlim(0,15)
plt.ylim(0,10)
sns.barplot(x='year', y='Life Ladder', data = df_sa_s, hue = 'Country name', dodge=False)
df_sa_s

# How happy have the people living in each South-Asian country been over the years 2005-2020?

In [None]:
df5=df_sa.copy()
df5.pop('year')
df5=df5.pivot_table(values=df5.columns[1:], index='Country name', aggfunc='mean')
df5=df5.reset_index()
df5=df5.sort_values(by='Life Ladder')
pal=sns.color_palette("flare", len(df5['Life Ladder']))
plt.figure(figsize=(25,8))
plt.xticks(fontsize=15, rotation=45)
plt.yticks(fontsize=15)
plt.xlim(0,25)
plt.ylim(0,10)
sns.barplot(x='Country name', y='Life Ladder', data = df5, palette=pal, dodge=False)
plt.show()
df5

# Analysis: Why has India been among the saddest South-Asian countries for the past 15 years?

We will first compare the Life Ladder score of India with that of the other South-Asian countries over the years. Then we will look into the factors that could have played a role in India's unhappiness over this period.

## Life Ladder score comparison of south-asian countries over the years

In [None]:
plt.figure(figsize=(40, 20))

# One line per South-Asian country
for country in sa:
    life_ldr_sc = df_asian[(df_asian['Country name']==country)]
    life_ldr_sc = life_ldr_sc.loc[:,['year','Life Ladder']]
    plt.plot(life_ldr_sc['year'], life_ldr_sc['Life Ladder'].values, label=country, marker='o')

plt.xlabel('Year',fontsize=20)
plt.ylabel('Life Ladder Score',fontsize=20)
plt.title('Life Ladder score of South-Asian countries over the years',fontsize=20)
plt.legend(fontsize=20)
plt.xticks(fontsize = 15)
plt.yticks(fontsize = 15)
# Label India's 2020 point; select the row by condition instead of a hard-coded index label
india_2020 = df_asian[(df_asian['year']==2020) & (df_asian['Country name']=='India')]['Life Ladder'].iloc[0]
plt.text(2020, india_2020,'India',horizontalalignment='left',fontsize=25)
plt.show()

The graph above shows that India's happiness index (its Life Ladder score) has been gradually decreasing since 2006. It was highest in 2006, at around 5.4, fell to its lowest value of around 3.4 in 2019, and was around 4.2 in 2020.
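
Reading the peak and the low off a chart is approximate, so it can help to pull them from the data directly. The sketch below does this on a toy frame with made-up values for two placeholder countries. The helper name `ladder_extremes` is hypothetical.

```python
import pandas as pd

# Toy frame with made-up values for two placeholder countries.
toy = pd.DataFrame({
    "Country name": ["X"] * 4 + ["Y"] * 4,
    "year": [2017, 2018, 2019, 2020] * 2,
    "Life Ladder": [4.0, 3.8, 3.2, 4.2, 4.7, 4.9, 5.4, 5.9],
})

def ladder_extremes(frame, country, column="Life Ladder"):
    """Return the highest and lowest score of one country and the years they occurred."""
    scores = frame[frame["Country name"] == country].set_index("year")[column]
    return {
        "peak_year": int(scores.idxmax()),
        "peak": float(scores.max()),
        "low_year": int(scores.idxmin()),
        "low": float(scores.min()),
    }

summary = ladder_extremes(toy, "X")
print(summary)
```

On the notebook's data the equivalent call would be `ladder_extremes(df_sa, "India")`.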

## How have the different factors changed for India over these years?
We will now visualize how the following factors have changed for India over the period 2005 to 2020:
1. Generosity
2. Healthy life expectancy at birth
3. Freedom to make life choices
4. Log GDP per capita
5. Corruption
6. Social Support

In [None]:
plt.figure(figsize = (20,8))

#Generosity score of India over the years
best_generosity=max(df_sa['Generosity'])
plt.subplot(231)
plt.axhline(best_generosity, ls='--', color='grey')
plt.bar(df_sa[df_sa['Country name']=='India']['year'], df_sa[df_sa['Country name']=='India']['Generosity'], color='r') 
plt.ylabel('Generosity')
plt.title("Generosity in India (2005-2020)")


#Life Expectancy at birth score of India over the years
best_LEB=max(df_sa['Healthy life expectancy at birth'])
plt.subplot(232)
plt.axhline(best_LEB, ls='--', color='grey')
plt.bar(df_sa[df_sa['Country name']=='India']['year'], df_sa[df_sa['Country name']=='India']['Healthy life expectancy at birth'], color='g') 
plt.ylabel('Healthy life expectancy at birth')
plt.title("Healthy life expectancy at birth in India (2005-2020)")


#Freedom to make life choices score of India over the years
best_freedom=max(df_sa['Freedom to make life choices'])
plt.subplot(233)
plt.axhline(best_freedom, ls='--', color='grey')
plt.bar(df_sa[df_sa['Country name']=='India']['year'], df_sa[df_sa['Country name']=='India']['Freedom to make life choices'], color='b') 
plt.ylabel('Freedom to make life choices')
plt.title("Freedom to make life choices in India (2005-2020)")


#Log GDP per capita score of India over the years
best_gdp=max(df_sa['Log GDP per capita'])

plt.subplot(234)
plt.axhline(best_gdp, ls='--', color='grey')
plt.bar(df_sa[df_sa['Country name']=='India']['year'], df_sa[df_sa['Country name']=='India']['Log GDP per capita'], color='y') 
plt.xlabel('Year')
plt.ylabel('Log GDP per capita')
plt.title("Log GDP per capita in India (2005-2020)")

#Corruption score of India over the years
least_corruption=min(df_sa['Perceptions of corruption'])
plt.subplot(235)
plt.axhline(least_corruption, ls='--', color='grey')
plt.bar(df_sa[df_sa['Country name']=='India']['year'], df_sa[df_sa['Country name']=='India']['Perceptions of corruption'], color='c') 
plt.xlabel('Year')
plt.ylabel('Corruption')
plt.title("Corruption in India (2005-2020)")

#Social Support score of India over the years
best_socialsupport=max(df_sa['Social support'])
plt.subplot(236)
plt.axhline(best_socialsupport, ls='--', color='grey')
plt.bar(df_sa[df_sa['Country name']=='India']['year'], df_sa[df_sa['Country name']=='India']['Social support'], color='m') 
plt.xlabel('Year')
plt.ylabel('Social Support')
plt.title("Social Support in India (2005-2020)")


plt.show()

### Key Takeaways from above graphs:
1. <u>Factors where India is doing well</u>: Healthy life expectancy at birth, Freedom to make life choices and Log GDP per capita
2. <u>Factors where India needs a lot of improvement</u>: Generosity, Social support and Perceptions of corruption
3. Perceived corruption in India has been lower after 2014 than before it, with 2011 being the year in which perceived corruption was highest.
4. Freedom to make life choices has improved considerably in the last 2-3 years.
5. India still has a lot of work to do to reach the best score in each of the above factors. The best scores, and the country and year that recorded them, are listed below:

In [None]:
best_scores=[best_generosity, best_LEB, best_freedom, best_gdp, least_corruption, best_socialsupport]
params=['Generosity', 'Healthy life expectancy at birth', 'Freedom to make life choices','Log GDP per capita', 'Perceptions of corruption', 'Social support']
bsd=dict(zip(params, best_scores))
country=[]
year=[]
for key in bsd:
    country.append(df_sa[df_sa[key]==bsd.get(key)]['Country name'].iloc[0])
    year.append(df_sa[df_sa[key]==bsd.get(key)]['year'].iloc[0])
    
dict1={'Best Score': best_scores, 'Recorded By': country, 'In the year': year}
best_score_df=pd.DataFrame(dict1)
best_score_df.index=params
best_score_df
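
To make "a lot of work" more concrete, one option is to measure how far a country's most recent value sits from the best value in the group, treating corruption as lower-is-better. This is a sketch on a toy frame with made-up values. The helper `gap_to_best` and the `HIGHER_IS_BETTER` mapping are assumptions for illustration.

```python
import pandas as pd

# Toy frame with made-up values for two placeholder countries.
toy = pd.DataFrame({
    "Country name": ["X", "X", "Y", "Y"],
    "year": [2019, 2020, 2019, 2020],
    "Social support": [0.60, 0.62, 0.80, 0.85],
    "Perceptions of corruption": [0.75, 0.78, 0.60, 0.55],
})

# True where a larger value is better, False where a smaller value is better.
HIGHER_IS_BETTER = {"Social support": True, "Perceptions of corruption": False}

def gap_to_best(frame, country, directions):
    """Distance between a country's latest value and the best value in the frame."""
    latest = frame[frame["Country name"] == country].sort_values("year").iloc[-1]
    rows = []
    for col, higher in directions.items():
        best = frame[col].max() if higher else frame[col].min()
        rows.append({"factor": col, "latest": latest[col],
                     "best": best, "gap": abs(best - latest[col])})
    return pd.DataFrame(rows).set_index("factor")

gaps = gap_to_best(toy, "X", HIGHER_IS_BETTER)
print(gaps)
```

On the notebook's data the first two arguments would be `df_sa` and `'India'`, with the mapping extended to all six factors.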


Now we will compare India's scores on the above factors with those of the other South-Asian countries.

In [None]:
plt.figure(figsize=(40, 120))

# One panel per factor (2005-2020): (column, y-axis label, title)
panels = [
    ('Generosity', 'Generosity', 'Generosity of South-Asian countries over the years'),
    ('Healthy life expectancy at birth', 'Healthy life expectancy at birth', 'Healthy life expectancy at birth of South-Asian countries over the years'),
    ('Freedom to make life choices', 'Freedom to make life choices', 'Freedom to make life choices of South-Asian countries over the years'),
    ('Log GDP per capita', 'Log GDP per capita', 'Log GDP per capita of South-Asian countries over the years'),
    ('Perceptions of corruption', 'Corruption', 'Corruption in South-Asian countries over the years'),
    ('Social support', 'Social support', 'Social support in South-Asian countries over the years'),
]

for i, (col, ylabel, title) in enumerate(panels, start=1):
    plt.subplot(6, 1, i)
    # One line per South-Asian country
    for country in sa:
        rows = df_asian[df_asian['Country name'] == country]
        plt.plot(rows['year'], rows[col].values, label=country, marker='o')

    plt.xlabel('Year', fontsize=20)
    plt.ylabel(ylabel, fontsize=20)
    plt.title(title, fontsize=20)
    plt.legend(fontsize=20)
    plt.xticks(fontsize=15)
    plt.yticks(fontsize=15)
    # Label India's 2020 point; select the row by condition instead of a hard-coded index label
    india_2020 = df_asian[(df_asian['year'] == 2020) & (df_asian['Country name'] == 'India')][col].iloc[0]
    plt.text(2020, india_2020, 'India', horizontalalignment='left', fontsize=25)

plt.show()

### Key takeaways from above graphs
1. <u>Generosity</u>: Even though India's generosity score has been roughly the same as that of most other South-Asian countries, it still has a long way to go to match Myanmar's average score.
2. <u>Healthy Life Expectancy at Birth</u>: India and the other South-Asian countries have been improving steadily in this field. India still needs to work on its healthcare sector to improve significantly, as countries like China, Bangladesh and Sri Lanka are still ahead.
3. <u>Freedom to make life choices</u>: India has improved significantly in this field in the last 2-3 years compared to other countries. In fact, in 2020 India recorded the highest score on this factor. India should keep up this score.
4. <u>Log GDP per capita</u>: India's GDP per capita has been in good shape throughout these years. However, it dropped in 2020, which could be due to the COVID-19 pandemic.
5. <u>Corruption</u>: Perceived corruption in India has been going down since 2014. India should keep up the good work on decreasing corruption.
6. <u>Social Support</u>: India lags far behind in this field and needs a good amount of work here.

### Inferences based on above analysis
India's low Life Ladder score could be due to its lagging in Social support, Generosity and Perceptions of corruption. To improve significantly and become at least the happiest South-Asian country:
1. India should work a lot on Generosity and Social support.
2. It should also do a good amount of work on decreasing corruption and increasing Healthy life expectancy at birth.
3. It should keep up the work of maintaining the Freedom to make life choices and GDP per capita.
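
These inferences rest on reading the charts. One rough way to check them is to correlate each factor with the Life Ladder score for a single country, as sketched below on a toy frame with made-up, perfectly linear values. The helper `factor_correlations` is hypothetical. Correlation over about fifteen yearly points is weak evidence and does not establish cause, so the result should be treated as a prompt for further analysis rather than a conclusion.

```python
import pandas as pd

# Toy frame with made-up, perfectly linear values for one placeholder country.
toy = pd.DataFrame({
    "Country name": ["X"] * 5,
    "year": [2016, 2017, 2018, 2019, 2020],
    "Life Ladder": [4.0, 4.2, 4.4, 4.6, 4.8],
    "Social support": [0.50, 0.55, 0.60, 0.65, 0.70],
    "Perceptions of corruption": [0.90, 0.85, 0.80, 0.75, 0.70],
})

def factor_correlations(frame, country, factors, target="Life Ladder"):
    """Pearson correlation of each factor with the target for one country."""
    rows = frame[frame["Country name"] == country]
    return rows[factors].corrwith(rows[target]).sort_values(ascending=False)

corr = factor_correlations(toy, "X", ["Social support", "Perceptions of corruption"])
print(corr)
```

On the notebook's data the call would be `factor_correlations(df_sa, 'India', params)`, where `params` is the list of six factor names defined earlier.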