**Welcome to my first Kaggle notebook. I am going to perform an EDA of the Asian countries in the world-happiness-report dataset (2006-2020) using Seaborn.**

## Key Features in 'world-happiness-report':

**Life Ladder**: Imagine a ladder, with steps numbered from 0 at the bottom to 10 at the top. The top of the ladder represents the best possible life for you and the bottom of the ladder represents the worst possible life for you.

**Log GDP per capita**: At its most basic interpretation, per capita GDP shows how much economic production value can be attributed to each individual citizen. Alternatively, this translates to a measure of national wealth since GDP market value per person also readily serves as a prosperity measure.

**Social support**: Social support is defined in terms of social network characteristics such as assistance from family, friends, neighbours and other community members.

**Healthy life expectancy at birth**: Healthy life expectancy is the average number of years lived in good health, that is, without irreversible limitation of activity in daily life or incapacities, for a fictitious generation subject to the conditions of mortality and morbidity prevailing that year.

**Freedom to make life choices**: Freedom of choice describes an individual's opportunity and autonomy to perform an action selected from at least two available options, unconstrained by external parties.

**Generosity**: Generosity is the virtue of being liberal in giving, often as gifts.

**Perceptions of corruption**: Corruption is a form of dishonesty or criminal offense undertaken by a person or organization entrusted with a position of authority, to acquire illicit benefit.

## Importing Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Loading the dataset

In [None]:
data = pd.read_csv('/kaggle/input/world-happiness-report-2021/world-happiness-report.csv')

In [None]:
data.sample(10)

## Data Preprocessing

In [None]:
data.info()

In [None]:
data.shape

In [None]:
col = data.isnull().sum()
col

In [None]:
def handle_missing_values(col):
    # Fill the gaps of a numeric column with its mean over all countries and years
    val = data[col].mean()
    data[col] = data[col].fillna(val)

In [None]:
for i in col.index:
    if col[i] > 0:
        handle_missing_values(i)

In [None]:
data.isnull().sum()
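
The loop above fills every gap with the column mean computed over all countries and years. As a possible alternative (not what this notebook does), a gap could be filled with the mean of the same country instead. The sketch below shows the idea on a small made-up frame; `toy` and its values are invented for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the happiness data (illustrative values, not real ones).
toy = pd.DataFrame({
    'Country name': ['India', 'India', 'India', 'Japan', 'Japan', 'Japan'],
    'year': [2018, 2019, 2020, 2018, 2019, 2020],
    'Generosity': [0.10, np.nan, 0.30, -0.20, -0.40, np.nan],
})

# Mean of each country's own observed values, aligned row by row with `toy`.
country_mean = toy.groupby('Country name')['Generosity'].transform('mean')

# Fill each gap with the mean of the same country rather than the global mean.
toy['Generosity'] = toy['Generosity'].fillna(country_mean)
print(toy)
```

A country with no observed value at all for a feature would still be left with gaps, so a fallback to the global mean may still be needed.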

## Categorizing Asian Countries

In [None]:
asian_countries = ['Nepal','India','Sri Lanka','Laos','Malaysia','Japan','Indonesia','Thailand',
                   'China','Singapore','Philippines','South Korea','Mongolia','Myanmar','Vietnam','Pakistan',
                   'Maldives','Cambodia','Bangladesh','Taiwan Province of China',
                   'Hong Kong S.A.R. of China','Afghanistan']
# isin() replaces the long chain of == comparisons
asia = data[data['Country name'].isin(asian_countries)].reset_index(drop=True)
asia = asia.rename({'Country name':'Asian Countries'},axis=1)
asia

In [None]:
asia.shape

## Dividing the Asian countries into three regions: 'South Asia', 'East Asia' and 'Southeast Asia'

In [None]:
asia.insert(2,'Regional indicator',' ')
sa = ['India','Nepal','Maldives','Bangladesh','Pakistan','Sri Lanka','Afghanistan']
ea = ['Taiwan Province of China','Japan','South Korea','Mongolia','Hong Kong S.A.R. of China','China']
sea = ['Singapore','Thailand','Philippines','Vietnam','Malaysia','Indonesia','Laos','Cambodia','Myanmar']

# Label every row whose country belongs to the region
for region, countries in [('South Asia', sa), ('East Asia', ea), ('Southeast Asia', sea)]:
    asia.loc[asia['Asian Countries'].isin(countries), 'Regional indicator'] = region

### Countplot

In [None]:
sns.countplot(x='Regional indicator',data=asia)

In [None]:
asia['Regional indicator'].value_counts()

## Grouping the features with respect to the regions

In [None]:
# Original column name -> label used in the summary table
features = {'Life Ladder':'Ladder score','Log GDP per capita':'Logged GDP per capita','Social support':'Social support',
            'Healthy life expectancy at birth':'Healthy life expectancy','Perceptions of corruption':'Perceptions of corruption',
            'Generosity':'Generosity','Freedom to make life choices':'Freedom to make life choices',
            'Positive affect':'Positive affect','Negative affect':'Negative affect'}
# A single groupby computes the regional mean of every feature at once
Mean_data = asia.groupby('Regional indicator')[list(features)].mean().rename(columns=features)
Mean_data

### Overview:

1. East Asian countries score very well on Ladder score, GDP, Social support and Healthy life expectancy. However, their Perceptions of corruption and Generosity values are the worst of the three regions, and they also score low on Freedom to make life choices.
2. Perceptions of corruption are very high in South Asian countries, and the remaining features have the lowest values, which is accompanied by a higher Negative affect and a lower Positive affect.
3. South-East Asian countries have good values for all features, and their Positive affect is very high compared to the other regions.
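
Statements like the ones above can be read directly off a table of regional means. The sketch below shows one way to do that with `idxmax` and `idxmin`. The frame `means` holds hypothetical numbers chosen for illustration, not the output of this notebook.

```python
import pandas as pd

# Hypothetical regional means, for illustration only.
means = pd.DataFrame(
    {'Ladder score': [5.7, 4.5, 5.3],
     'Perceptions of corruption': [0.72, 0.79, 0.75],
     'Positive affect': [0.70, 0.65, 0.82]},
    index=['East Asia', 'South Asia', 'Southeast Asia'],
)

# For each feature, name the region with the highest and the lowest mean.
summary = pd.DataFrame({'highest': means.idxmax(), 'lowest': means.idxmin()})
print(summary)
```

Applied to a real table of means, the same two calls would show which region leads or trails on each feature.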

## Categorizing the Asian Countries with respect to the regions

In [None]:
south_asia = asia[asia['Regional indicator'] == 'South Asia'].reset_index(drop=True)
south_asia = south_asia.rename({'Asian Countries':'South Asian Countries'},axis=1)

east_asia = asia[asia['Regional indicator'] == 'East Asia'].reset_index(drop=True)
east_asia = east_asia.rename({'Asian Countries':'East Asian Countries'},axis=1)

south_east_asia = asia[asia['Regional indicator'] == 'Southeast Asia'].reset_index(drop=True)
south_east_asia = south_east_asia.rename({'Asian Countries':'South-East Asian Countries'},axis=1)

In [None]:
south_asia.sample(5)

In [None]:
east_asia.sample(5)

In [None]:
south_east_asia.sample(5)

## Visualization by pointplot with hue

## South Asian Countries

### 1. Life Ladder

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Life Ladder',hue='South Asian Countries',data=south_asia)

1. Sri Lanka maintained an almost constant Life Ladder.
2. The Life Ladder of Afghanistan and India shows a downward trend over the years.
3. The Life Ladder of Bangladesh and Nepal shows an upward trend over the years.
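
Trend statements like these are read off the plot by eye. One way to back them with a number is to fit a straight line per country and look at the sign of its slope. The sketch below assumes a made-up frame `toy` with invented countries `A` and `B`; `yearly_slope` is a hypothetical helper, not part of this notebook.

```python
import numpy as np
import pandas as pd

# Made-up data: country A rises by 0.2 per year, country B falls by 0.3 per year.
toy = pd.DataFrame({
    'country': ['A'] * 4 + ['B'] * 4,
    'year': [2015, 2016, 2017, 2018] * 2,
    'Life Ladder': [4.0, 4.2, 4.4, 4.6, 5.0, 4.7, 4.4, 4.1],
})

def yearly_slope(group):
    # Least-squares slope of Life Ladder against year (years centered for numerical stability).
    x = group['year'] - group['year'].mean()
    return np.polyfit(x, group['Life Ladder'], 1)[0]

# One slope per country: positive means an upward trend, negative a downward one.
slopes = pd.Series({name: yearly_slope(g) for name, g in toy.groupby('country')})
print(slopes)
```

A slope close to zero would correspond to an "almost constant" series. The threshold for "close" is a judgment call.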

### 2. Log GDP per capita

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Log GDP per capita',hue='South Asian Countries',data=south_asia)

Except for Afghanistan, all countries increased their GDP per capita every year.

### 3. Healthy life expectancy

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Healthy life expectancy at birth',hue='South Asian Countries',data=south_asia)

After 2015, Afghanistan lost its consistency, while all the remaining countries stayed consistent.

### 4. Freedom to make life choices

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Freedom to make life choices',hue='South Asian Countries',data=south_asia)

Except for Afghanistan, all countries gained more freedom to make life choices.

### 5. Social support

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Social support',hue='South Asian Countries',data=south_asia)

1. Sri Lanka and Nepal had almost constant social support.
2. India and Bangladesh increased their social support after 2012.
3. Afghanistan's social support was declining.
4. Maldives had very good social support.

### 6. Perceptions of corruption

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Perceptions of corruption',hue='South Asian Countries',data=south_asia)

1. Afghanistan and Sri Lanka had the highest perceptions of corruption.
2. Perceptions of corruption in Nepal, India and Bangladesh are decreasing.

### 7. Generosity

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Generosity',hue='South Asian Countries',data=south_asia)

None of these countries maintained a constant Generosity value.

## East Asian Countries

### 1. Life Ladder

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Life Ladder',hue='East Asian Countries',data=east_asia)

1. Taiwan, China, South Korea and Mongolia increased their Life Ladder.
2. The Life Ladder of Japan moved downward over the years.

### 2. Log GDP per capita

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Log GDP per capita',hue='East Asian Countries',data=east_asia)

1. Taiwan had the highest GDP, but it went down after 2017.
2. South Korea, China and Mongolia had an increasing curve.
3. Japan maintained a constant GDP.

### 3. Healthy life expectancy

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Healthy life expectancy at birth',hue='East Asian Countries',data=east_asia)

Except for Taiwan, all countries had an increasing healthy life expectancy.

### 4. Freedom to make life choices

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Freedom to make life choices',hue='East Asian Countries',data=east_asia)

1. All countries show ups and downs in freedom to make life choices.
2. Taiwan, however, did not see its freedom decrease from 2012 to 2017.

### 5. Social support

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Social support',hue='East Asian Countries',data=east_asia)

1. South Korea and China had the lowest social support among these countries.
2. Japan and Mongolia had very good social support.
3. Taiwan got more social support after 2013.

### 6. Perceptions of corruption

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Perceptions of corruption',hue='East Asian Countries',data=east_asia)

1. Between 2010 and 2017, South Korea's perceptions of corruption were high.
2. Between 2011 and 2015, Taiwan's perceptions of corruption were high.
3. Japan's perceptions of corruption were very low.
4. China's perceptions of corruption were constant.

### 7. Generosity

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Generosity',hue='East Asian Countries',data=east_asia)

1. Mongolia had a very good Generosity value.
2. The Generosity of Japan and China was decreasing.

## South-East Asian Countries

### 1. Life Ladder

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Life Ladder',hue='South-East Asian Countries',data=south_east_asia)

1. In general, the countries had a fairly constant Life Ladder.
2. The Life Ladder of Cambodia showed very good positive growth.

### 2. Log GDP per capita

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Log GDP per capita',hue='South-East Asian Countries',data=south_east_asia)

All countries increased their GDP every year.

### 3. Healthy life expectancy

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Healthy life expectancy at birth',hue='South-East Asian Countries',data=south_east_asia)

1. Except for the Philippines, all countries had a small increase in healthy life expectancy every year.
2. The Philippines maintained a steady healthy life expectancy.

### 4. Freedom to make life choices

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Freedom to make life choices',hue='South-East Asian Countries',data=south_east_asia)

All countries had more freedom to make life choices by the end of 2020 than in their previous years.

### 5. Social support

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Social support',hue='South-East Asian Countries',data=south_east_asia)

Most countries' social support was between 0.75 and 0.85 at the end.

### 6. Perceptions of corruption

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Perceptions of corruption',hue='South-East Asian Countries',data=south_east_asia)

1. Singapore had a very low corruption value.
2. The corruption values of all the remaining countries lie between 0.7 and 0.9.

### 7. Generosity

In [None]:
plt.figure(figsize=(20,8))
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
sns.pointplot(x='year',y='Generosity',hue='South-East Asian Countries',data=south_east_asia)

1. At the end, Myanmar had the highest Generosity value.
2. The Philippines and Thailand had very low Generosity values from the beginning.

### *I hope you learned something from this kernel. Please upvote it. Your support means a lot to me!*

# Thank you