# README File


## Overview
This project analyzes the World Happiness Report 2024 dataset to derive insights about happiness across countries. The dataset provides indicators that help explain global happiness trends, such as GDP per capita, social support, and healthy life expectancy. The data comes from the World Happiness Report ([Kaggle dataset source](https://www.kaggle.com/datasets/abdullah0a/world-happiness-data-2024-explore-life)), an annual publication that ranks countries by happiness level, considering several socio-economic and environmental factors.


## The Questions

1. Which countries are the happiest and why?

2. Are there geographical trends in happiness?

3. How has happiness evolved globally over the years?

4. Which countries have seen the greatest changes in happiness over time?

5. Which factors have the most significant impact on happiness?

6. What is the relationship between GDP and happiness?

7. How do perceptions of corruption correlate with happiness?

8. How do freedom to make life choices and happiness relate?



## Tools I Used
- **Python**: Primary programming language used.
- **GeoPandas**: For handling and visualizing geographic data.
- **Pandas**: For data manipulation and analysis.
- **Matplotlib** and **Seaborn**: For data visualization.
- **Jupyter Notebook**: For creating an interactive document.


## Data Preparation and Cleanup
Below is the process used for data preparation and cleanup:

```python
# Importing Libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas as gpd

# Loading Data:
df = pd.read_csv('/kaggle/input/world-happiness-report-2024/World Happiness Report 2024.csv')

# Check for missing data
missing_data = df.isnull().sum()
print(missing_data[missing_data > 0])

# Data cleanup: drop every row that contains a missing value
df_cleaned = df.dropna()

# Check for duplicate rows
duplicates = df_cleaned[df_cleaned.duplicated()]
print(duplicates)

```
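
Dropping every row that has a missing value is simple, but it can discard a lot of data. As an illustration of one alternative, the sketch below counts how many rows `dropna()` would remove and then fills gaps with each country's own median instead. The toy data and the `Country name` column are assumptions, and this is not the approach used in the analysis that follows.

```python
import pandas as pd

# Made-up stand-in for the raw data: one row per country and year
sample = pd.DataFrame({
    'Country name': ['A', 'A', 'A', 'B', 'B'],
    'year': [2021, 2022, 2023, 2022, 2023],
    'Life Ladder': [6.0, 6.2, 6.4, 4.0, 4.2],
    'Social support': [0.90, None, 0.94, None, None],
})

# How many rows would a plain dropna() discard?
rows_lost = len(sample) - len(sample.dropna())

# Alternative: fill each gap with the median of the same country's other years
filled = sample.copy()
filled['Social support'] = (
    filled.groupby('Country name')['Social support']
          .transform(lambda s: s.fillna(s.median()))
)

# A country with no observations at all for a column has no median,
# so its values stay missing and still need a decision
still_missing = int(filled['Social support'].isna().sum())

print(rows_lost, still_missing)
print(filled)
```

Per-country medians keep the imputed values close to what that country usually reports, but they flatten any trend over time, so the choice is a trade-off rather than a strict improvement.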

# The Analysis:

The analysis is divided into sections based on the questions that guide the project. Each section consists of the plotting code, the resulting chart, and the insights drawn from it.

## 1. Which countries are the happiest and why?


### Visualize Data:
```python
# Top 10 happiest countries by median Life Ladder
fig, ax = plt.subplots(figsize=(7, 6))

sns.set_theme(style='ticks')
sns.barplot(
    data = top_10_happiest_countries,
    x = 'Life Ladder Median Value',
    y = top_10_happiest_countries.index,
    hue = 'Life Ladder Median Value',
    ax = ax,
    palette = 'dark:b_r',
    width = 0.7,
    dodge = False
)
ax.legend().remove()

ax.set_title('Top 10 Happiest Countries', pad=11)
ax.set_ylabel('')
ax.set_xlabel('Life Ladder (Median)', labelpad=15)
plt.tight_layout()
plt.show()

```
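
The plotting code above relies on a `top_10_happiest_countries` table that is not shown in this README. One plausible way to build it is sketched below on made-up data: take the median Life Ladder per country and keep the largest values. The `Country name` column and the helper name `top_n_by_median` are assumptions.

```python
import pandas as pd

# Made-up stand-in for df_cleaned: one row per country and year
sample = pd.DataFrame({
    'Country name': ['Finland', 'Finland', 'Denmark', 'Denmark', 'Chad', 'Chad'],
    'year': [2022, 2023, 2022, 2023, 2022, 2023],
    'Life Ladder': [7.7, 7.8, 7.5, 7.6, 4.4, 4.5],
})

def top_n_by_median(df, n=10):
    """Return the n countries with the highest median Life Ladder."""
    medians = df.groupby('Country name')['Life Ladder'].median()
    # nlargest sorts in descending order, so the happiest country comes first
    return medians.nlargest(n).to_frame(name='Life Ladder Median Value')

top_2 = top_n_by_median(sample, n=2)
print(top_2)
```

The result is indexed by country name and has a single `Life Ladder Median Value` column, which matches how the bar plot reads its `x` and `y` values.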

### Results

![image.png](attachment:image.png)

### Visualize Data:
```python
# Plot: Top 10 Happiest Countries with Life Ladder and Social Support
plt.figure(figsize=(12, 6))

# Bar plot for Life Ladder
sns.barplot(x=top_10_indexed.index, y=top_10_indexed['Life Ladder'], color='r', label='Life Ladder')
plt.ylabel('Life Ladder', labelpad=13)
plt.xlabel('')
plt.title('Top 10 Happiest Countries and Social Support', pad=12)
plt.xticks(rotation=45)
plt.ylim(0,9)

# Adding bars for Social Support
plt.twinx()  # Create a second y-axis
sns.barplot(x=top_10_indexed.index, y=top_10_indexed['Social support'], color='orange', label='Social Support', alpha=0.98, width=0.74)
plt.ylabel('Social Support (Average)', labelpad=15)
plt.ylim(0, 1.4)

# Legends
plt.legend(loc='upper left')
plt.show()

# Plot: Top 10 Happiest Countries with Life Ladder and Log GDP per capita
plt.figure(figsize=(12, 6))

# Bar plot for Life Ladder
sns.barplot(x=top_10_indexed.index, y=top_10_indexed['Life Ladder'], color='r', label='Life Ladder')
plt.ylabel('Life Ladder', labelpad=13)
plt.xlabel('')
plt.title('Top 10 Happiest Countries and Log GDP per capita (Median)', pad=12)
plt.xticks(rotation=45)
plt.ylim(0,9)

# Adding bars for Log GDP per capita
plt.twinx()  # Create a second y-axis
sns.barplot(x=top_10_indexed.index, y=top_10_indexed['Log GDP per capita'], color='green', label='Log GDP per capita', alpha=0.85, width=0.74)
plt.ylabel('Log GDP per capita', labelpad=15)
plt.ylim(0,16)

# Legends
plt.legend(loc='upper left')
plt.show()

# Plot: Top 10 Happiest Countries with Life Ladder and Healthy Life Expectancy at Birth
plt.figure(figsize=(12, 6))

# Bar plot for Life Ladder
sns.barplot(x=top_10_indexed.index, y=top_10_indexed['Life Ladder'], color='r', label='Life Ladder')
plt.ylabel('Life Ladder', labelpad=13)
plt.xlabel('')
plt.title('Top 10 Happiest Countries and Healthy Life Expectancy at Birth (Median)', pad=12)
plt.xticks(rotation=45)
plt.ylim(0,9)

# Adding bars for Healthy Life Expectancy at Birth
plt.twinx()  # Create a second y-axis
sns.barplot(x=top_10_indexed.index, y=top_10_indexed['Healthy life expectancy at birth'], color='purple', label='Healthy Life Expectancy', alpha=0.80, width=0.74)
plt.ylabel('Healthy Life Expectancy at Birth', labelpad=15)
plt.ylim(0,100)

# Legends
plt.legend(loc='upper left')
plt.show()

```

### Results:

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

### Insights:

 
**Top Happiest Countries**:
- The happiest countries are mainly located in Northern Europe, with Finland, Denmark, and Iceland leading the list. This suggests a geographical concentration of high happiness levels in these regions.

**Social Support**:
- Social support is a key factor contributing to happiness in these countries. The happiest countries, such as Finland and Denmark, show very high levels of social support. This means people in these regions can rely on help and support from family, friends, or the community, which positively impacts their well-being.

**Log GDP per Capita**:
- Economic prosperity is also an important factor. Countries like Switzerland and Norway have high GDP per capita values, indicating that financial stability and economic opportunities play a crucial role in supporting happiness.

**Healthy Life Expectancy**:
- All top 10 happiest countries have high healthy life expectancy rates, meaning that people in these regions live longer, healthier lives. This is likely due to good healthcare systems and healthy lifestyles, which contribute to higher satisfaction and quality of life.

**Balanced Factors**:
- The happiest countries show a balance between high social support, strong economic indicators, and good health. This combination of factors suggests that happiness is influenced by a mix of community support, financial security, and well-being.


## 2. Are there geographical trends in happiness?

### Visualize Data:

```python
# Plotting
plt.figure(figsize=(17, 17))

# Setting the axis background to blue
ax = plt.gca()
ax.set_facecolor('#ADD8E6')

# Drawing country borders
merged_data.boundary.plot(ax=ax, linewidth=0.6, color='black')

# Using the 'summer' colormap (green to yellow)
merged_data.plot(column='Life Ladder',
                 ax=ax,
                 legend=True,
                 cmap='summer',  # Color palette
                 legend_kwds={
                     'label': "Life Ladder Score",
                     'orientation': "vertical",  # Vertical legend on the right
                     'shrink': 0.28,  # Shrink the legend size
                     'pad': 0.02,
                     'aspect': 20  # Setting the size of the legend bar
                 },
                 missing_kwds={"color": "lightgrey", "edgecolor": "black", "hatch": "///"}
                )

plt.title('World Happiness - Life Ladder')
plt.tight_layout()
plt.show()

```

### Results:

![image.png](attachment:image.png)

### Insights:

- Northern Europe stands out as the happiest region, with countries like Finland and Denmark having the highest scores, over 7.

- North America and Australia show relatively high happiness, with scores between 6 and 7, indicating that developed regions tend to have higher happiness.

- Africa and parts of Asia generally have lower happiness levels, with scores around 4 to 5, suggesting that developing regions face more challenges in terms of well-being.

- South America shows moderate happiness levels, with most countries scoring between 5 and 6, indicating some regional variation in well-being.

- Some areas, such as Greenland, China, and parts of Africa, lack data. Overall, happiness levels are higher in developed regions and lower in developing ones.


## 3. How has happiness evolved globally over the years?

### Visualize Data:

```python
global_happiness_trend = df_cleaned.groupby('year')['Life Ladder'].mean()
plt.plot(global_happiness_trend.index, global_happiness_trend.values)
plt.xlabel('Year')
plt.ylabel('Average Happiness Score')
plt.title('Global Happiness Trend Over the Years')
plt.show()

```

### Results:

![image.png](attachment:image.png)


### Insights:

- **2005**: The average Life Ladder value was very high, at around 7.5.

- **2006**: The average dropped sharply to around 5. This could be due to global issues or crises, though it may also reflect a change in the set of countries with data for that year.

- **2007**-**2019**: Happiness slowly improved with some ups and downs, indicating gradual recovery but with occasional setbacks.

- **2020**-**2023**: A steady rise in happiness, possibly linked to positive changes and adaptation to challenges like the pandemic.

- **2023**: Happiness reached the highest level since 2006, showing a significant global improvement in well-being.
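
Because each yearly value is an average over whichever countries have data for that year, it can help to look at the country count alongside the mean. A minimal sketch on made-up numbers, assuming the country column is named `Country name`:

```python
import pandas as pd

# Made-up data: only one (high-scoring) country reports in 2005
sample = pd.DataFrame({
    'Country name': ['A', 'A', 'B', 'C', 'A', 'B', 'C'],
    'year': [2005, 2006, 2006, 2006, 2007, 2007, 2007],
    'Life Ladder': [7.5, 7.3, 5.0, 4.2, 7.4, 5.1, 4.3],
})

# Yearly mean next to the number of distinct countries behind it
yearly = sample.groupby('year').agg(
    mean_ladder=('Life Ladder', 'mean'),
    n_countries=('Country name', 'nunique'),
)
print(yearly)
```

In this toy example the yearly mean falls between 2005 and 2006 even though country A barely changes, simply because two lower-scoring countries enter the sample. Running the same aggregation on `df_cleaned` would show whether something similar is happening in the real data.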

## 4. Which countries have seen the greatest changes in happiness over time?

### Visualize Data:

```python
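# Sketch: change in Life Ladder between each country's first and last recorded year
# (assumes a 'Country name' column; the notebook's own cell may differ)
ordered = df_cleaned.sort_values('year')
by_country = ordered.groupby('Country name')['Life Ladder']
happiness_change = (by_country.last() - by_country.first()).sort_values()
# Plot the five largest decreases and the five largest increases
pd.concat([happiness_change.head(5), happiness_change.tail(5)]).plot(kind='barh')
plt.xlabel('Change in Life Ladder')
plt.title('Greatest Changes in Happiness Over Time')
plt.show()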

```

### Results:

![image.png](attachment:image.png)

### Insights:

**Positive Changes**:

- Nicaragua, Bulgaria, and Serbia have seen the largest increases in happiness, with improvements of over 1.5 on the Life Ladder scale. These countries have experienced significant gains in well-being over time.

- Other countries with notable increases include Georgia, Latvia, and Paraguay.

**Negative Changes**:
- On the other hand, Pakistan, Zambia, and Afghanistan have seen the greatest decreases in happiness, with declines of close to 2.0 points on the Life Ladder scale. These countries have faced significant challenges that have led to a drop in well-being.

- Other countries with significant decreases include Botswana, Venezuela, and Jordan.


## 5. Which factors have the most significant impact on happiness?
### Visualize Data:

```python
# Calculate correlation coefficients for relevant factors
correlation_matrix = df_cleaned[['Life Ladder', 'Social support', 'Log GDP per capita', 'Healthy life expectancy at birth']].corr()

correlation_matrix

# Plotting the heatmap of correlations
plt.figure(figsize=(6, 4))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', square=True)
plt.title('Correlation Heatmap of Factors Affecting Happiness', pad=13)
plt.show()

```

### Results:

![image.png](attachment:image.png)

### Insights:


- **Log GDP per capita** has the strongest correlation with happiness, at 0.79. This means that countries with higher income per person tend to be happier.

- **Healthy life expectancy at birth** is also important, with a correlation of 0.73. Longer, healthier lives are associated with higher happiness.

- **Social support** shows a strong relationship too, with a correlation of 0.72. People who have support from others are generally happier.
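
For reference, `DataFrame.corr()` computes the Pearson correlation coefficient by default. The sketch below spells out that calculation in plain Python on made-up numbers, to show what a value such as 0.79 summarizes. The function name `pearson_r` and the sample values are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up values, not taken from the dataset
log_gdp = [8.0, 9.0, 10.0, 11.0]
ladder = [4.0, 5.5, 6.0, 7.5]

r = pearson_r(log_gdp, ladder)
print(f'Correlation: {r:.2f}')
```

The coefficient ranges from -1 to 1 and measures only how closely the points follow a straight line. It does not show that one variable causes the other.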




## 6. What is the relationship between GDP and happiness?

### Visualize Data:

```python
#Scatter plot to show the relationship between Log GDP per capita and Life Ladder
plt.figure(figsize=(10, 6))
sns.scatterplot(data=df_cleaned, x='Log GDP per capita', y='Life Ladder', alpha=0.48)
sns.regplot(data=df_cleaned, x='Log GDP per capita', y='Life Ladder', scatter=False, color='red', line_kws={"linewidth": 2.5, "alpha": 0.3})

plt.title('Relationship Between GDP and Happiness')
plt.xlabel('Log GDP per Capita')
plt.ylabel('Life Ladder')
plt.grid()

# Calculate the correlation coefficient
correlation = df_cleaned['Log GDP per capita'].corr(df_cleaned['Life Ladder'])
plt.text(5.4,7.5, f'Correlation: {correlation:.2f}', fontsize=12, color='red')
plt.show()

```

### Results:

![image.png](attachment:image.png)

### Insights:

- The chart shows a positive relationship between GDP per capita and happiness.

- As GDP goes up, happiness also tends to increase.

- The correlation is 0.79, meaning there is a strong relationship. 

- Countries with more money usually have happier people, but GDP is not the only factor.

- In short, more wealth is associated with more happiness, but other things matter too.


## 7. How do perceptions of corruption correlate with happiness?

### Visualize Data:

```python
# Scatter plot to show the relationship between Perceptions of Corruption and Life Ladder
plt.figure(figsize=(10, 6))
sns.scatterplot(data=df_cleaned, x='Perceptions of corruption', y='Life Ladder', alpha=0.48)
sns.regplot(data=df_cleaned, x='Perceptions of corruption', y='Life Ladder', scatter=False, color='red', line_kws={"linewidth": 2.5, "alpha": 0.3})

plt.title('Relationship Between Perceptions of Corruption and Happiness')
plt.xlabel('Perceptions of Corruption')
plt.ylabel('Life Ladder')
plt.grid()

# Calculate the correlation coefficient
correlation = df_cleaned['Perceptions of corruption'].corr(df_cleaned['Life Ladder'])
plt.text(0.1, 5.4, f'Correlation: {correlation:.2f}', fontsize=12, color='red')
plt.show()

```

### Results:

![image.png](attachment:image.png)

### Insights:

- The chart shows a negative relationship between perceptions of corruption and happiness.

- As perceived corruption increases, happiness tends to go down.

- The correlation is -0.45, showing a moderate relationship. This means that in places where people feel there is more corruption, happiness is usually lower.

- However, corruption is not the only factor affecting happiness. In summary, less perceived corruption generally goes with more happiness, but it's not the only thing that matters.


## 8. How do freedom to make life choices and happiness relate?


### Visualize Data:
```python
# Scatter plot to show the relationship between Freedom to make life choices and Life Ladder
plt.figure(figsize=(10, 6))
sns.scatterplot(data=df_cleaned, x='Freedom to make life choices', y='Life Ladder', alpha=0.48)
sns.regplot(data=df_cleaned, x='Freedom to make life choices', y='Life Ladder', scatter=False, color='red', line_kws={"linewidth": 2.5, "alpha": 0.3})

plt.title('Relationship Between Freedom to Make Life Choices and Happiness', pad=12)
plt.xlabel('Freedom to Make Life Choices', labelpad=13)
plt.ylabel('Life Ladder', labelpad=13)
plt.grid()

# Calculate the correlation coefficient
correlation = df_cleaned['Freedom to make life choices'].corr(df_cleaned['Life Ladder'])
plt.text(0.3, 6.5, f'Correlation: {correlation:.2f}', fontsize=12, color='red')
plt.show()
```

### Results:

![image.png](attachment:image.png)

### Insights:

- The chart shows a positive relationship between freedom to make life choices and happiness.

- As freedom increases, happiness also tends to rise.

- The correlation is 0.53, indicating a moderate relationship. This means that people who feel they have more freedom to make their own choices in life generally report higher levels of happiness.

- However, other factors also play a role, as the correlation is not extremely strong. In summary, more freedom is associated with higher happiness, but it’s not the only factor.

## What I Learned

Working on this project helped me understand how different social and economic factors affect the happiness of a country.

I also gained practical experience in creating visualizations and analyzing data, which has helped me improve my data analysis skills.

## Insights

- GDP per capita and social support are strongly correlated with happiness levels across countries.

- Countries with higher healthy life expectancy are generally happier, which suggests that good healthcare may be an important component of happiness.

- Greater freedom to make life choices and lower perceptions of corruption are also associated with higher happiness.

- Geographic trends indicate that countries in Europe and North America tend to rank higher in happiness compared to those in Africa and parts of Asia.

- There are significant changes in happiness levels over time for some countries, which may reflect economic or political shifts.

## Challenges I Faced

- One major challenge was dealing with missing data, as it required careful consideration of how to handle missing values without introducing systematic errors or reducing the quality of the analysis. I decided to drop missing values for simplicity, but in future analyses, I would consider imputation techniques to make better use of the available data.

- Another challenge was using GeoPandas for the first time. I had to learn how to merge geographic data with my dataset, and in particular how to make sure country names matched correctly between the two.
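
The country-name matching problem can be sketched as follows. The example uses plain pandas frames as a stand-in for the map layer (a GeoDataFrame can be merged with the same `merge` call), and every value, the `name` column, and the `name_fixes` mapping are made up for illustration. Merging with `indicator=True` makes it easy to list map entries that found no partner in the happiness data.

```python
import pandas as pd

# Made-up happiness data
happiness = pd.DataFrame({
    'Country name': ['Finland', 'United States', 'Czechia'],
    'Life Ladder': [7.7, 6.7, 6.8],
})

# Stand-in for the attribute table of a world map layer
world = pd.DataFrame({
    'name': ['Finland', 'United States of America', 'Czechia', 'Greenland'],
})

# Hypothetical mapping from dataset spellings to map spellings
name_fixes = {'United States': 'United States of America'}
happiness['map_name'] = happiness['Country name'].replace(name_fixes)

# Left merge keeps every map entry; '_merge' records where each row came from
merged = world.merge(
    happiness, how='left', left_on='name', right_on='map_name', indicator=True
)

# Map entries without happiness data (these would be drawn as "missing")
unmatched = merged.loc[merged['_merge'] == 'left_only', 'name'].tolist()
print(unmatched)
```

Reviewing the unmatched list separates genuine gaps in the data from spelling mismatches, which can then be added to the mapping.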

## Conclusion

The analysis of the World Happiness Report 2024 dataset reveals that happiness is influenced by a combination of economic, social, and individual factors. Higher GDP per capita, better social support, longer life expectancy, and freedom of choice are all important contributors to a country's happiness. Moreover, corruption tends to negatively impact happiness levels. These insights suggest that policies promoting economic stability, healthcare, social support, and transparency can potentially enhance the overall happiness of a nation.