# 2024 World Happiness Exploratory Data Analysis

## Introduction 
* The World Happiness Report presents important research on the state of global happiness.

## Analysis Content
1. [Python Libraries](#1)
2. [Data Content](#2) 
3. [Read and Analyse Data](#3)
4. [Data Distributions in 2024](#4)
5. [Happiest and Unhappiest Countries in 2024](#5)
6. [Ladder Score Distribution by Regional Indicator](#6)
7. [Ladder Score Distribution by Countries In Map View](#7)
8. [Most Generous Countries in 2024](#8)
9. [Generosity Distribution by Countries In Map View](#9)
10. [Generosity Distribution by Regional Indicator in 2024](#10)
11. [Relationship Between Features](#11)

<a id='1'></a>
## Python Libraries

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

import plotly.express as px
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
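# Matplotlib 3.6 renamed the seaborn styles to "seaborn-v0_8-*"; fall back to the old name on older versions
style_name = "seaborn-v0_8-notebook" if "seaborn-v0_8-notebook" in plt.style.available else "seaborn-notebook"
plt.style.use(style_name)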

import warnings
warnings.filterwarnings("ignore")

<a id='2'></a>
## Data Content
* The World Happiness Report is a significant study that examines the state of global happiness. It demonstrates how well-being metrics can be used to assess national progress and explains differences in happiness between individuals and between nations.
   * The dataset's columns cover country name, region, the happiness (ladder) score with its upper and lower bounds, and the factors related to happiness: GDP per capita, social support, healthy life expectancy, freedom to make life choices, generosity, perceptions of corruption, and the dystopia + residual value. National averages of positive and negative emotions are also included.
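
Before plotting, it can help to confirm that the columns described above are actually present. The sketch below is only illustrative: `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names, the column list is an assumption based on the columns this notebook uses, and the small frame is made-up placeholder data rather than the real CSV.

```python
import pandas as pd

# Hypothetical list of columns this notebook relies on; adjust it to the actual CSV header.
EXPECTED_COLUMNS = ["Country name", "Regional indicator", "Ladder score", "Generosity"]


def missing_columns(frame, expected):
    """Return the expected column names that are absent from the frame."""
    return [col for col in expected if col not in frame.columns]


# Tiny made-up frame standing in for the real data (values are placeholders).
toy = pd.DataFrame({
    "Country name": ["A", "B"],
    "Regional indicator": ["Region X", "Region Y"],
    "Ladder score": [7.0, 3.5],
})

absent = missing_columns(toy, EXPECTED_COLUMNS)
print(absent)  # ['Generosity']
```

With the real data, the same helper could be called as `missing_columns(df, EXPECTED_COLUMNS)` once `df` has been loaded.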



<a id='3'></a>
## Read and Analyse Data

In [None]:
# read the 2024 data (one row per country, with "Ladder score" and "Regional indicator")
df = pd.read_csv("/kaggle/input/world-happiness-report-2024-yearly-updated/World-happiness-report-2024.csv")

In [None]:
# read the multi-year data (has "year" and "Life Ladder" columns, used for the animated maps)
df2024 = pd.read_csv("/kaggle/input/world-happiness-report-2024-yearly-updated/World-happiness-report-updated_2024.csv", encoding='ISO-8859-1')


In [None]:
# show the first five rows of data
df.head()

In [None]:
# describe basic statistics of data
df.describe()

In [None]:
# information about data
df.info()

<a id='4'></a>
## Data Distributions in 2024
* Unique Countries
* Count Regional Indicator
* Distribution of Remaining Features

In [None]:
df.columns

In [None]:
# unique countries
df["Country name"].unique()

In [None]:
# count countries per regional indicator
df["Regional indicator"].value_counts()

In [None]:
# distribution of feature set  1
list_features=["Ladder score","Log GDP per capita","Generosity"]
sns.boxenplot(data=df.loc[:,list_features],orient="v",palette="Set1")
plt.show()

<a id='5'></a>
## Happiest and Unhappiest Countries in 2024

In [None]:
# keep countries with a ladder score above 7.5 or below 2
df_happiest_unhappiest = df[(df["Ladder score"] > 7.5) | (df["Ladder score"] < 2)]
sns.barplot(x="Ladder score", y="Country name", data=df_happiest_unhappiest, palette="coolwarm")
plt.title("Happiest and Unhappiest Countries in 2024")
plt.show()
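
The fixed cut-offs above (7.5 and 2) depend on the score range of a particular year. As a possible alternative, the sketch below picks a fixed number of countries from each end with `nlargest` and `nsmallest`. The value of `k` and the toy frame are assumptions for illustration only, not values from the report.

```python
import pandas as pd

# Made-up placeholder scores; the real notebook would use df instead.
toy = pd.DataFrame({
    "Country name": ["A", "B", "C", "D", "E"],
    "Ladder score": [7.6, 5.0, 1.9, 6.2, 3.1],
})

k = 2  # hypothetical number of countries to take from each end

# Top k first, then bottom k, both ordered from highest to lowest score.
extremes = pd.concat([
    toy.nlargest(k, "Ladder score"),
    toy.nsmallest(k, "Ladder score").sort_values("Ladder score", ascending=False),
])
print(extremes["Country name"].tolist())  # ['A', 'D', 'E', 'C']
```

The resulting frame can be passed to `sns.barplot` in the same way as `df_happiest_unhappiest`.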

<a id='6'></a>
## Ladder Score Distribution by Regional Indicator

In [None]:
df_long = df.melt(id_vars=["Regional indicator"], value_vars=["Ladder score"], var_name="Score Type", value_name="Score")

plt.figure(figsize=(15,8))
sns.kdeplot(data=df_long, x="Score", hue="Regional indicator", fill=True,linewidth=2)
plt.axvline(df["Ladder score"].mean(),c="black")
plt.title("Ladder Score Distribution by Regional Indicator")
plt.show()
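
A numeric summary can complement the density plot. The sketch below groups by region and compares each regional mean with the overall mean (the black vertical line in the plot). The toy frame holds made-up placeholder values, and `summary` and `overall_mean` are hypothetical names.

```python
import pandas as pd

# Made-up placeholder data; the real notebook would use df instead.
toy = pd.DataFrame({
    "Regional indicator": ["Region X", "Region X", "Region Y", "Region Y"],
    "Ladder score": [7.0, 6.0, 4.0, 5.0],
})

# Mean score and number of countries per region, highest mean first.
summary = (
    toy.groupby("Regional indicator")["Ladder score"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
overall_mean = toy["Ladder score"].mean()

# Difference between each regional mean and the overall mean.
summary["diff_from_overall"] = summary["mean"] - overall_mean
print(summary)
```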


<a id='7'></a>
## Ladder Score Distribution by Countries In Map View

In [None]:
df2024.head(1)

In [None]:
fig = px.choropleth(df2024.sort_values("year"),
             locations = "Country name",
             color = "Life Ladder",
             locationmode = "country names",
             animation_frame = "year")
fig.update_layout(title = "Life Ladder Comparison by Countries")
fig.show()

<a id='8'></a>
## Most Generous Countries in 2024

In [None]:
dfg=df[df.loc[:, "Generosity"]>0.25]
sns.barplot(x = "Generosity",y = "Country name", data = dfg,palette="coolwarm")
plt.title("Most Generous Countries in 2024")
plt.show()

<a id='9'></a>
## Generosity Distribution by Countries In Map View

In [None]:
fig=px.choropleth(df2024.sort_values("year"),
                  locations = "Country name",
             color = "Generosity",
             locationmode = "country names",
             animation_frame = "year")
fig.update_layout(title = "Generosity Comparison by Countries")
fig.show()

<a id='10'></a>
## Generosity Distribution by Regional Indicator in 2024

In [None]:
sns.swarmplot(x = "Regional indicator", y = "Generosity", data = df,palette="coolwarm")
plt.xticks(rotation=60)
plt.title("Generosity Distribution by Regional Indicator in 2024")
plt.show()

<a id='11'></a>
## Relationship Between Features

In [None]:
numeric_df = df2024.select_dtypes(include=['number'])
# pairwise Pearson correlations between the numeric columns
sns.heatmap(numeric_df.corr(), annot=True, fmt=".2f", linewidths=.7)
plt.title("Relationship Between Features")
plt.show()

In [None]:
numeric_df = df2024.select_dtypes(include=['number'])
sns.clustermap(numeric_df.corr(), center=0, cmap="vlag", dendrogram_ratio=(0.1, 0.2), annot=True, linewidths=.7, figsize=(12, 12))
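
To read the correlation matrix with one target in mind, a single column can be extracted and sorted. The sketch below does this for a column named "Life Ladder", as in `df2024`. The feature names and values in the toy frame are made up for illustration.

```python
import pandas as pd

# Made-up placeholder data; "feature_up" and "feature_down" are hypothetical columns.
toy = pd.DataFrame({
    "Life Ladder": [1.0, 2.0, 3.0, 4.0],
    "feature_up": [2.0, 4.0, 6.0, 8.0],
    "feature_down": [4.0, 3.0, 2.0, 1.0],
    "label": ["a", "b", "c", "d"],  # non-numeric column, dropped by select_dtypes
})

numeric = toy.select_dtypes(include=["number"])

# Correlation of every numeric feature with the target, strongest positive first.
corr_with_target = (
    numeric.corr()["Life Ladder"]
    .drop("Life Ladder")
    .sort_values(ascending=False)
)
print(corr_with_target)
```

Correlation only describes linear association, so a high value here does not show that a feature causes higher happiness scores.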