# World Happiness
**Team:** Data Optimists  
**Dataset:** Kaggle World Happiness Report 2021  
**Source:** https://www.kaggle.com/ajaypalsinghlo/world-happiness-report-2021



--------------
### Introduction and Data Loading

--------------

The **World Happiness Report** is a landmark survey of the state of global happiness, ranking countries by how happy their citizens perceive themselves to be.

Drawing on data from the Gallup World Poll and other sources, the report evaluates well-being using a variety of social, economic, and health-related factors, including GDP per capita, social support, healthy life expectancy, freedom, generosity, and perceptions of corruption.

We chose this dataset because it provides a unique, multidimensional view of human well-being across the globe. 
Unlike traditional economic or health datasets, the World Happiness Report integrates subjective measures of life satisfaction with objective indicators, offering a holistic perspective on what drives happiness at both the individual and societal levels. 

Its annual publication and global coverage make it a rich resource for exploring trends, disparities, and the impact of policy and culture on quality of life.

Analyzing this data is significant for several reasons:

- **Policy Insights:** Understanding the drivers of happiness can inform better policy decisions, targeting not just economic growth but also social cohesion, health, and governance.
- **Global Comparisons:** The dataset enables meaningful comparisons between countries and regions, highlighting best practices and areas for improvement.
- **Temporal Trends:** With data spanning multiple years, we can examine how happiness evolves over time and in response to global events.
- **Interdisciplinary Value:** The report bridges economics, psychology, sociology, and public health, making it valuable for a wide range of research and practical applications.

By visualizing and analyzing this dataset, we aim to uncover patterns and insights that can contribute to a deeper understanding of well-being worldwide, and inspire data-driven approaches to improving happiness at scale.

In [89]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from sklearn.cluster import DBSCAN
from sklearn.linear_model import LinearRegression
from math import pi
import zipfile
import warnings

warnings.filterwarnings('ignore')



# Define global font configuration
import plotly.io as pio
pio.templates["custom"] = pio.templates["plotly"].update(
    layout={
        "font": {
            "family": "Times New Roman",
            "size": 14
        }
    }
)

# Apply this template globally
pio.templates.default = "custom"

# ==================== DATA LOADING & PREPARATION ====================

# Load datasets
with zipfile.ZipFile('world-happiness-report.zip') as z:
    with z.open('world-happiness-report/world-happiness-report.csv') as f:
        df_all = pd.read_csv(f)
    with z.open('world-happiness-report/world-happiness-report-2021.csv') as g:
        df_2021 = pd.read_csv(g)

# Standardize column names
df_2021.columns = df_2021.columns.str.lower().str.replace(' ', '_')
df_all.columns = df_all.columns.str.lower().str.replace(' ', '_')

# Column mapping for consistency
column_mapping = {
    'country_name': 'country',
    'life_ladder': 'happiness_score',
    'ladder_score': 'happiness_score',
    'regional_indicator': 'region'
}

df_2021.rename(columns=column_mapping, inplace=True)
df_all.rename(columns=column_mapping, inplace=True)

# Find GDP column (handles different naming conventions)
gdp_col = None
for col in df_2021.columns:
    if 'gdp' in col.lower() and 'capita' in col.lower():
        gdp_col = col
        df_2021.rename(columns={col: 'gdp_per_capita'}, inplace=True)
        break

# Merge regional data
region_map = df_2021[["country", "region"]].drop_duplicates()
df_all = df_all.merge(region_map, on="country", how="left")

In [90]:
# Display a preview of sample columns from df_2021
sample_columns = ["country", "region", "happiness_score", "gdp_per_capita", "social_support", "healthy_life_expectancy"]
df_2021[sample_columns].head()

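|   | country | region | happiness_score | gdp_per_capita | social_support | healthy_life_expectancy |
|---|---------|--------|-----------------|----------------|----------------|-------------------------|
| 0 | Finland | Western Europe | 7.842 | 10.775 | 0.954 | 72.0 |
| 1 | Denmark | Western Europe | 7.62 | 10.933 | 0.954 | 72.7 |
| 2 | Switzerland | Western Europe | 7.571 | 11.117 | 0.942 | 74.4 |
| 3 | Iceland | Western Europe | 7.554 | 10.878 | 0.983 | 73.0 |
| 4 | Netherlands | Western Europe | 7.464 | 10.932 | 0.942 | 72.4 |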


#### Creating Income Tiers Based on GDP per Capita

- To enable richer analysis and visualization, we categorize each country into an "income tier" based on its GDP per capita. 

- This allows us to compare happiness and other factors across economic strata, not just by country or region. 

- We use quantile-based binning (`qcut`) to divide countries into four groups: Low, Medium, High, and Very High income. 

- This approach ensures each tier contains a roughly equal number of countries, making comparisons fair and meaningful.

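- If quantile binning raises an error (for example, because tied GDP values produce duplicate bin edges), we fall back to cutting at manually computed quantile edges. If fewer than four valid GDP values exist, no tiers are assigned and a warning is printed instead.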
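
The quantile-based binning described in the list above can be sketched on toy data. The country labels and GDP values below are made up for illustration and are not taken from the dataset; only `pd.qcut` and the four tier labels come from the notebook.

```python
import pandas as pd

# Hypothetical GDP-per-capita values for eight made-up countries (not from the dataset).
toy = pd.DataFrame({
    "country": list("ABCDEFGH"),
    "gdp_per_capita": [7.1, 7.9, 8.6, 9.2, 9.8, 10.3, 10.9, 11.4],
})

# qcut places the bin edges at the quartiles, so each tier receives about 25% of the rows.
toy["income_tier"] = pd.qcut(
    toy["gdp_per_capita"],
    q=4,
    labels=["Low", "Medium", "High", "Very High"],
)

# Count how many countries landed in each tier, in tier order.
tier_counts = toy["income_tier"].value_counts().sort_index()
print(tier_counts)
```

With eight distinct values, each tier holds two countries. Real data with many tied values may not split this evenly, which is the case the fallback is meant to cover.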


In [91]:
# ==================== INCOME TIER ====================

if 'gdp_per_capita' in df_2021.columns:
    tier_labels = ['Low', 'Medium', 'High', 'Very High']
    # Convert to numeric and drop missing values before binning
    gdp_clean = pd.to_numeric(df_2021['gdp_per_capita'], errors='coerce').dropna()

    if len(gdp_clean) >= 4:
        try:
            # Quantile-based binning; the result keeps gdp_clean's index,
            # so rows with missing GDP receive NaN when assigned back
            df_2021['income_tier'] = pd.qcut(gdp_clean, q=4, labels=tier_labels)
            print("Income tiers successfully created!")
        except ValueError:
            # qcut raises ValueError when quantile edges coincide (many tied values);
            # fall back to explicit quantile edges with duplicates removed,
            # which may leave fewer than four tiers
            print("Income tier fallback: Using manual quantiles")
            edges = gdp_clean.quantile([0, 0.25, 0.5, 0.75, 1.0]).unique()
            df_2021['income_tier'] = pd.cut(
                gdp_clean,
                bins=edges,
                labels=tier_labels[:len(edges) - 1],
                include_lowest=True
            )
    else:
        print(f"Warning: Only {len(gdp_clean)} valid GDP values (need ≥4)")

Income tiers successfully created!


---------
# Overview

---------

This section provides a high-level summary of the World Happiness Report dataset and the analytical approach used in this notebook. Here, we introduce the main variables, describe the structure of the data, and outline the key steps taken to prepare it for analysis.

- **Dataset:** The World Happiness Report combines survey-based happiness scores with a range of economic, social, and health indicators for countries worldwide.
- **Key Variables:** Includes happiness score, GDP per capita, social support, healthy life expectancy, freedom to make life choices, generosity, and perceptions of corruption.
- **Data Preparation:** The data was cleaned, standardized, and merged to ensure consistency across years and regions. Countries were also categorized into income tiers for deeper analysis.
- **Purpose:** This overview sets the stage for exploring global happiness patterns, regional differences, and the factors that drive well-being across the world.
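
The merge mentioned under Data Preparation is a left join on `country`, so any country in the multi-year table whose name has no match in the 2021 table ends up without a region. A minimal sketch of how such rows could be detected is shown below. The two small tables and their country and region names are hypothetical; `indicator=True` is a standard `DataFrame.merge` option.

```python
import pandas as pd

# Hypothetical miniature versions of the two tables (names and values are illustrative).
panel = pd.DataFrame({
    "country": ["Northland", "Northland", "Atlantis"],
    "year": [2019, 2020, 2020],
    "happiness_score": [7.0, 7.1, 5.0],
})
regions = pd.DataFrame({
    "country": ["Northland"],
    "region": ["Region A"],
})

# indicator=True adds a "_merge" column recording where each row found a match.
merged = panel.merge(regions, on="country", how="left", indicator=True)

# Rows tagged "left_only" have no region; a groupby on region skips them by default.
unmatched = sorted(merged.loc[merged["_merge"] == "left_only", "country"].unique())
print(unmatched)
```

Running a check like this before the regional charts would show whether any countries silently drop out of the per-region averages.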

In [92]:
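# Animated Happiness Over Time (larger size)
if all(col in df_all.columns for col in ["country", "year", "happiness_score"]):
    df_anim = df_all[(df_all["year"] >= 2008) & (df_all["year"] <= 2022)][["country", "year", "happiness_score"]].dropna()
    # Sort by year so the animation frames play in chronological order
    df_anim = df_anim.sort_values("year")
    if not df_anim.empty:
        # Build the title from the years actually present after filtering
        first_year, last_year = int(df_anim["year"].min()), int(df_anim["year"].max())
        fig2 = px.choropleth(
            df_anim,
            locations="country",
            locationmode="country names",
            color="happiness_score",
            animation_frame="year",
            title=f"Global Happiness Over Time ({first_year}-{last_year})",
            color_continuous_scale="Plasma",
            width=1200,
            height=700
        )
        fig2.show()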

------
# 1. Regional Contributors
------

## Regional Contributors: Understanding Regional Patterns in Happiness

The "Regional Contributors" section explores how happiness scores and their underlying factors vary across different regions of the world. This analysis helps us identify not only which regions are the happiest, but also the unique social, economic, and cultural drivers behind these patterns.

### Key Visualizations

- **Boxplot of Happiness by Region:**  
    This plot shows the distribution of happiness scores within each region. It highlights both the median happiness and the spread (inequality) within regions, revealing which areas have consistently high or low happiness and where disparities are greatest.

- **Treemap of Regional and Country Contributions:**  
    The treemap visualizes the hierarchical structure of happiness, starting from regions and drilling down to individual countries. The size of each box is proportional to the happiness score; the chart does not map color to the score. A region's box is the sum of its countries' scores, so it also grows with the number of countries in the region.
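
The median and spread that the boxplot displays can also be computed directly. The sketch below uses two made-up regions with hypothetical scores; the column names match the notebook, everything else is illustrative.

```python
import pandas as pd

# Hypothetical scores for two made-up regions (not taken from the report).
toy = pd.DataFrame({
    "region": ["North"] * 5 + ["South"] * 5,
    "happiness_score": [7.0, 7.2, 7.4, 7.6, 7.8, 3.0, 4.0, 5.0, 6.0, 7.0],
})

# Median = line inside the box; IQR (Q3 - Q1) = height of the box.
grouped = toy.groupby("region")["happiness_score"]
summary = pd.DataFrame({
    "median": grouped.median(),
    "q1": grouped.quantile(0.25),
    "q3": grouped.quantile(0.75),
})
summary["iqr"] = summary["q3"] - summary["q1"]
print(summary)
```

A larger `iqr` corresponds to a taller box, meaning more disparity between countries in that region. Plotly may compute quartiles with a slightly different interpolation rule than pandas, so box edges in the figure can differ a little from these numbers.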

### Insights from the Analysis

- **Regional Differences:**  
    Western Europe and North America & ANZ (Australia and New Zealand) consistently rank at the top, with high median happiness and relatively low internal disparities. Sub-Saharan Africa and South Asia, on the other hand, tend to have lower average scores and greater variability.

- **Drivers of Regional Happiness:**  
    The factors contributing to happiness—such as GDP per capita, social support, and healthy life expectancy—also vary by region. Wealthier regions benefit from higher economic security and stronger social networks, while regions with lower scores often face challenges related to governance, health, or economic instability.

- **Inequality Within Regions:**  
    The boxplot reveals that some regions, like Latin America and the Caribbean, have a wide range of happiness scores, indicating significant differences between countries within the same region.

- **Policy Implications:**  
    Understanding regional contributors allows policymakers to tailor interventions. For example, boosting social support may be more effective in some regions, while improving governance or health infrastructure could be key elsewhere.

### Why Regional Analysis Matters

- **Contextual Understanding:**  
    Happiness is not determined by a single factor but by a combination of influences that differ by region. Regional analysis uncovers these nuances.

- **Benchmarking and Best Practices:**  
    By comparing regions, we can identify successful policies and cultural practices that could be adapted elsewhere.

- **Targeted Solutions:**  
    Recognizing regional strengths and weaknesses enables more effective, context-sensitive strategies to improve well-being.

---

**In summary:**  
The regional contributors section provides a comprehensive look at how happiness is distributed around the world, what drives these differences, and how this knowledge can inform better policy and societal outcomes.

In [93]:
# Regional Distribution (larger size)
if all(col in df_2021.columns for col in ["region", "happiness_score"]):
    fig5 = px.box(
        df_2021,
        x="region",
        y="happiness_score",
        title="Happiness Score Distribution by Region",
        width=1200,
        height=700
    )
    fig5.show()

In [94]:
# Treemap (larger size)
if all(col in df_2021.columns for col in ["region", "country", "happiness_score"]):
    fig9 = px.treemap(
        df_2021,
        path=["region", "country"],
        values="happiness_score",
        title="Region and Country Happiness Contribution",
        width=1200,
        height=700
    )
    fig9.show()

-----
# 2. Happiness Factors in General
-----

### Understanding the Main Factors Behind Happiness

The World Happiness Report evaluates well-being using a set of key factors, each representing a different dimension of what contributes to life satisfaction. Here’s a detailed look at each:

##### 1. **GDP per Capita**
- **What it measures:** The economic output per person, adjusted for purchasing power.
- **Why it matters:** Higher income generally provides greater access to resources, security, and opportunities, which can improve quality of life. However, the effect of income on happiness tends to diminish at higher levels.
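
The `gdp_per_capita` values in the preview table (roughly 10.8 to 11.1) look like natural logarithms of GDP per capita rather than raw amounts; that reading is an assumption based on their magnitude. Under that assumption, a fixed step on this scale is a fixed percentage change in income, which is one way to express the diminishing effect noted above. A small worked example:

```python
import math

# Two values from the preview table, assumed to be natural logs of GDP per capita.
finland_log_gdp = 10.775
switzerland_log_gdp = 11.117

# Undo the log to get an approximate level in the original currency units.
finland_level = math.exp(finland_log_gdp)

# A difference of d on the log scale is a ratio of exp(d) on the raw scale.
ratio = math.exp(switzerland_log_gdp - finland_log_gdp)

print(f"Finland level: {finland_level:,.0f}")
print(f"Switzerland / Finland ratio: {ratio:.2f}")
```

If happiness rises roughly linearly with the logged value, each additional unit of raw income buys less happiness as income grows.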

##### 2. **Social Support**
- **What it measures:** The extent to which people feel they have someone to count on in times of trouble.
- **Why it matters:** Strong social networks and supportive relationships are consistently linked to higher well-being, resilience, and lower stress.

##### 3. **Healthy Life Expectancy**
- **What it measures:** The average number of years a person can expect to live in good health.
- **Why it matters:** Good health is foundational to happiness, enabling people to pursue goals, enjoy life, and participate in society.

##### 4. **Freedom to Make Life Choices**
- **What it measures:** The perceived freedom individuals have to make important life decisions.
- **Why it matters:** Autonomy and the ability to shape one’s own life are crucial for well-being and personal fulfillment.

##### 5. **Generosity**
- **What it measures:** The tendency of people to donate to charity or help others.
- **Why it matters:** Acts of kindness and generosity can foster social bonds and a sense of purpose, though the impact varies by culture and context.

##### 6. **Perceptions of Corruption**
- **What it measures:** The degree to which people believe corruption is widespread in government and business.
- **Why it matters:** Trust in institutions and fairness in society are important for collective well-being. High corruption undermines trust and can lead to dissatisfaction.

##### 7. **Happiness Score (Life Ladder)**
- **What it measures:** An individual’s self-reported assessment of their life, typically on a scale from 0 (worst) to 10 (best).
- **Why it matters:** This is the primary outcome variable, reflecting overall life satisfaction as perceived by individuals.

---

**Interplay of Factors:**  
No single factor fully explains happiness. Instead, these dimensions interact—economic security, social support, health, freedom, generosity, and trust together shape the overall well-being of societies. The relative importance of each factor can vary by country, culture, and over time.

In [95]:
# PARALLEL COORDINATES PLOT
if all(col in df_2021.columns for col in ["happiness_score", "gdp_per_capita"]):
    try:
        # Define and validate metrics
        possible_metrics = [
            "gdp_per_capita",
            "social_support",
            "healthy_life_expectancy",
            "freedom_to_make_life_choices",
            "generosity",
            "perceptions_of_corruption",
            "happiness_score"
        ]
        available_metrics = [m for m in possible_metrics if m in df_2021.columns]
        
        if len(available_metrics) >= 2:
            # Create clean numeric dataframe with explicit 1D columns
            plot_data = df_2021[available_metrics].copy()
            for col in available_metrics:
                plot_data[col] = pd.to_numeric(plot_data[col], errors='coerce')
            plot_data = plot_data.dropna()
            
            if len(plot_data) > 0:
                fig10 = px.parallel_coordinates(
                    plot_data,
                    dimensions=available_metrics,
                    color="happiness_score",
                    color_continuous_scale="Tealrose",
                    title="Multidimensional View of Happiness Factors",
                    width=1200,
                    height=700
                )
                fig10.show()
            else:
                print("Parallel coordinates: No valid data after cleaning")
        else:
            print(f"Parallel coordinates: Need at least 2 metrics, found {len(available_metrics)}")
    except Exception as e:
        print(f"Parallel coordinates visualization failed: {str(e)}")

In [96]:
# Top 10 Countries' Factor Contributions
if all(col in df_2021.columns for col in ["country", "happiness_score"]):
    factors = [c for c in df_2021.columns if 'explained_by:' in c]
    if factors:
        top_10 = df_2021.sort_values("happiness_score", ascending=False).head(10)
        fig3 = px.bar(
            top_10,
            x="country",
            y=factors,
            title="Top 10 Countries: Contribution of Factors",
            barmode="stack",
            width=1200,
            height=700
        )
        fig3.show()

In [97]:
# Factor Contribution Over Time
if all(col in df_all.columns for col in ['year', 'social_support', 'freedom_to_make_life_choices', 'generosity']):
    try:
        factors = ['social_support', 'freedom_to_make_life_choices', 'generosity']
        factors = [f for f in factors if f in df_all.columns]
        if factors:
            # Average each factor across all countries for every year
            yearly_factors = df_all.groupby('year')[factors].mean().reset_index()
            fig22 = px.area(
                yearly_factors,
                x='year',
                y=factors,
                title='Changing Drivers of Happiness',
                width=1200,
                height=700
            )
            fig22.show()
    except Exception as e:
        print(f"Visualization 3b failed: {str(e)}")

In [98]:
# Correlation Heatmap (larger size)
corr_cols = ["gdp_per_capita", "social_support", "healthy_life_expectancy",
            "freedom_to_make_life_choices", "generosity", "perceptions_of_corruption", 
            "happiness_score"]
if all(col in df_2021.columns for col in corr_cols):
    corr = df_2021[corr_cols].corr()
    fig6 = px.imshow(
        corr,
        text_auto=True,
        title="Correlation Between Happiness Factors",
        width=1200,
        height=700
    )
    fig6.show()

-----
# 3. Compare Regions

-----

This section explores inequality in happiness and its underlying factors across countries and regions. The visualizations here help us understand not just average happiness, but also how it is distributed and how it changes over time.

**Key Visualizations:**

- **Correlation Heatmap (fig6, shown in Section 2):**  
    This heatmap shows the relationships between key happiness factors such as GDP per capita, social support, healthy life expectancy, freedom, generosity, and perceptions of corruption. Strong positive or negative correlations highlight which factors tend to move together, revealing the interconnected nature of well-being and inequality.

- **Rank Change Slope Chart (fig7, shown after the Conclusion):**  
    This slope chart compares each country's happiness rank in the first and last years of the multi-year data (2005 and 2020), for every country that has a score in both years. It highlights countries that have risen or fallen in rank, illustrating how the gaps between nations have shifted.

- **Radar Chart Comparison (fig8):**  
    The radar chart compares the profiles of two countries (United States and South Africa) across multiple happiness factors. This visualization makes it easy to see where countries differ most, and which factors contribute to their relative happiness or inequality.

- **Treemap of Regional and Country Contributions (fig9, shown in Section 1):**  
    The treemap displays the hierarchical structure of happiness scores, starting from regions and drilling down to individual countries. The size of each box is proportional to the happiness score; the chart does not map color to the score. Because a region's box is the sum of its countries' scores, it also grows with the number of countries in the region.
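
The rank comparison behind the slope chart described above can be sketched on toy data. The three countries and their scores are hypothetical; the `groupby(...).rank(...)` and `pivot` calls mirror the approach used in the notebook.

```python
import pandas as pd

# Hypothetical scores for three made-up countries in two survey years.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "A", "B", "C"],
    "year": [2005, 2005, 2005, 2020, 2020, 2020],
    "happiness_score": [7.0, 6.0, 5.0, 6.5, 6.8, 4.0],
})

# Rank within each year; rank 1 is the happiest country.
toy["rank"] = toy.groupby("year")["happiness_score"].rank(ascending=False)

# One row per country, one column per year; countries missing either year drop out.
pivot = toy.pivot(index="country", columns="year", values="rank").dropna()

# Negative change = moved up the ranking.
pivot["change"] = pivot[2020] - pivot[2005]
print(pivot)
```

Because ranks are relative, a country can change rank without its own score moving much, simply because other countries moved around it.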

**Why These Visualizations Matter:**

- They reveal not just which countries are happiest, but also the spread and drivers of happiness within and between regions.
- By examining correlations and changes over time, we can identify persistent inequalities and emerging trends.
- Comparing countries side-by-side highlights the multidimensional nature of happiness and the importance of addressing multiple factors to reduce inequality.
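
A side-by-side radar comparison is easiest to read when every axis shares a scale, yet the factors here range from values below 1 (social support) to values around 70 (healthy life expectancy). One common option, shown as a sketch and not as the notebook's method, is min-max scaling each factor to the 0 to 1 range before plotting. The three countries and their values are made up.

```python
import pandas as pd

# Hypothetical factor values for three made-up countries.
toy = pd.DataFrame(
    {
        "gdp_per_capita": [8.0, 10.0, 11.0],
        "social_support": [0.60, 0.90, 0.95],
        "healthy_life_expectancy": [55.0, 68.0, 73.0],
    },
    index=["A", "B", "C"],
)

# Min-max scaling: (x - min) / (max - min), applied column by column.
scaled = (toy - toy.min()) / (toy.max() - toy.min())
print(scaled.round(2))
```

After scaling, 0 means "lowest among the countries considered" and 1 means "highest", so the scaling should be computed over the full set of countries, not only the two being compared. A column with a single constant value would divide by zero and needs separate handling.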

Together, these visualizations provide a comprehensive overview of global happiness inequality, its drivers, and its evolution over time.

In [99]:
# Happiness Inequality Radar
if all(col in df_all.columns for col in ['region', 'social_support', 'freedom_to_make_life_choices']):
    try:
        factors = ['social_support', 'freedom_to_make_life_choices', 'generosity']
        factors = [f for f in factors if f in df_all.columns]
        if factors:
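            # Standard deviation of each factor within every region
            regional_std = df_all.groupby('region')[factors].std().reset_index()
            # Reshape to long form so each factor is drawn as its own trace
            # (previously only the first factor was plotted)
            regional_std_long = regional_std.melt(
                id_vars='region', var_name='factor', value_name='std_dev'
            )
            fig18 = px.line_polar(
                regional_std_long,
                r='std_dev',
                theta='region',
                color='factor',
                line_close=True,
                title='Regional Inequality in Key Factors',
                template='plotly_dark',
                width=1200,
                height=700
            )
            fig18.show()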
    except Exception as e:
        print(f"Visualization [Regional Inequality in Key Factors] failed: {str(e)}")

In [100]:

# Radar Chart Comparison
if all(col in df_2021.columns for col in ["country", "social_support", "freedom_to_make_life_choices"]):
    metrics = ["gdp_per_capita", "social_support", "healthy_life_expectancy",
              "freedom_to_make_life_choices", "generosity", "perceptions_of_corruption"]
    metrics = [m for m in metrics if m in df_2021.columns]
    if metrics:
        compare = df_2021[df_2021["country"].isin(["United States", "South Africa"])]
        fig8 = go.Figure()
        for _, row in compare.iterrows():
            fig8.add_trace(go.Scatterpolar(
                r=row[metrics], 
                theta=metrics, 
                fill='toself', 
                name=row["country"]
            ))
        fig8.update_layout(
            polar=dict(radialaxis=dict(visible=True)),
            title="Happiness Factor Comparison",
            width=1200,
            height=700
        )
        fig8.show()

-----
# 4. A Closer Look at Specific Happiness Factors

-----

## Income Tier Visualizations

The income tier visualizations provide a detailed look at how economic status, as measured by GDP per capita, relates to happiness and its underlying factors across countries. Here’s what these visualizations show and how to interpret them:

#### 1. **Income Tier Assignment**

- Countries are grouped into four income tiers—**Low**, **Medium**, **High**, and **Very High**—using quantile-based binning (`qcut`) on GDP per capita.
- This ensures each tier contains a similar number of countries, allowing for fair comparisons.

#### 2. **Parallel Categories Plot**

- **What it shows:**  
    This interactive plot displays the relationships between a country's happiness level (Low, Medium, High), its region, and its income tier.
- **How to read it:**  
    - Each vertical axis represents a categorical variable (happiness level, region, income tier).
    - The colored bands show how countries flow between categories, e.g., which regions have more countries in the "Very High" income tier and "High" happiness level.
    - The color intensity reflects the happiness score, making it easy to spot patterns (e.g., most "Very High" income countries cluster in "High" happiness).
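
The happiness level used in this plot comes from `pd.cut` with `bins=3`, which splits the observed score range into three equal-width intervals, whereas the income tiers use `pd.qcut`, which splits at quantiles. The sketch below contrasts the two on made-up, deliberately skewed scores.

```python
import pandas as pd

# Hypothetical scores, deliberately skewed toward the low end.
scores = pd.Series([3.0, 3.2, 3.4, 3.6, 3.8, 4.0, 7.0, 8.0, 9.0])
labels = ["Low", "Medium", "High"]

# pd.cut with an integer splits the observed range into equal-width intervals.
equal_width = pd.cut(scores, bins=3, labels=labels)

# pd.qcut splits at quantiles, so each bin holds about the same number of rows.
equal_count = pd.qcut(scores, q=3, labels=labels)

# Rows per level, in Low / Medium / High order.
width_counts = equal_width.value_counts().sort_index().tolist()
count_counts = equal_count.value_counts().sort_index().tolist()
print(width_counts, count_counts)
```

Equal-width levels keep the same meaning across the score range but can be unevenly populated, so the bands for happiness level in the plot may differ a lot in thickness while the income-tier bands stay roughly equal.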

#### 3. **Sunburst Chart**

- **What it shows:**  
    The sunburst chart visualizes the hierarchical breakdown of happiness scores by region and income tier.
- **How to read it:**  
    - The inner ring shows regions, and the outer ring splits each region into its income tiers.
    - The size of each segment corresponds to the total happiness score for that group.
    - The color gradient indicates the average happiness score, highlighting which regions and income tiers are happiest.
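
Segment size and segment color answer different questions, which the aggregation below makes explicit. It is a sketch on made-up rows; the column names follow the notebook, while the regions and values are hypothetical.

```python
import pandas as pd

# Hypothetical rows for made-up regions; values are illustrative only.
toy = pd.DataFrame({
    "region": ["East", "East", "East", "West"],
    "income_tier": ["Low", "Low", "High", "High"],
    "happiness_score": [4.0, 5.0, 6.0, 7.5],
})

# Segment size in a sunburst with values="happiness_score" is the group total,
# so it depends on how many countries fall in the group as well as their scores.
segments = (
    toy.groupby(["region", "income_tier"])["happiness_score"]
    .agg(total="sum", mean="mean", countries="count")
    .reset_index()
)
print(segments)
```

Here the East / Low group has the largest total because it contains two countries, even though its mean score is the lowest of the three groups. A large segment therefore signals many countries, and the color is the better guide to how happy the group is.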

#### **Key Insights from the Visualizations**

- **Economic Status Matters:**  
    Higher income tiers generally correspond to higher happiness scores, but there are exceptions—some lower-income countries outperform their peers.
- **Regional Variation:**  
    The impact of income on happiness varies by region. For example, some regions with lower average GDP still achieve moderate happiness due to strong social support or other factors.
- **Intersectionality:**  
    By combining region and income tier, these visualizations reveal nuanced patterns—such as regions where income is less predictive of happiness, or where disparities are greatest.

#### **Why These Visualizations Are Important**

- They move beyond simple country rankings to show how economic and regional contexts interact.
- Policymakers can identify which income tiers or regions need targeted interventions.
- Researchers can explore why some countries "punch above their weight" in happiness despite lower income.

---

**In summary:**  
Income tier visualizations help us understand the complex relationship between wealth, geography, and well-being, highlighting both global trends and unique outliers.

In [101]:
# Interactive Happiness Diagnostic Tool
if all(col in df_2021.columns for col in ['happiness_score', 'region', 'income_tier']):
    try:
        df_2021['happiness_level'] = pd.cut(df_2021['happiness_score'], bins=3,
                                          labels=['Low', 'Medium', 'High'])
        fig25 = px.parallel_categories(
            df_2021,
            dimensions=['happiness_level', 'region', 'income_tier'],
            color='happiness_score',
            title='Happiness Profile Explorer',
            width=1200,
            height=700
        )
        fig25.show()
    except Exception as e:
        print(f"Visualization 10b failed: {str(e)}")

print("Dashboard generation complete")

Dashboard generation complete


In [102]:
# Happiness Demographic Breakdown
if all(col in df_2021.columns for col in ['region', 'income_tier', 'happiness_score']):
    try:
        fig21 = px.sunburst(
            df_2021,
            path=['region', 'income_tier'],
            values='happiness_score',
            color='happiness_score',
            title='Happiness by Region & Income Tier',
            width=1200,
            height=700
        )
        fig21.show()
    except Exception as e:
        print(f"Visualization 21 failed: {str(e)}")

---------
# 5. Conclusion

---------

The analysis of the World Happiness Report 2021 provides a comprehensive view of global well-being, revealing how happiness is shaped by a complex interplay of economic, social, and institutional factors. By leveraging advanced visualizations and statistical techniques, we have uncovered key patterns and insights that can inform both policy and personal understanding.

## Key Takeaways

- **Happiness is Multidimensional:**  
    No single factor determines happiness. Instead, it emerges from a combination of GDP per capita, social support, healthy life expectancy, freedom, generosity, and perceptions of corruption. The correlation heatmap (fig6) highlights strong positive relationships between economic/social factors and happiness, and negative correlations with corruption.

- **Regional Disparities Persist:**  
    Western Europe and North America & ANZ consistently top the happiness rankings, while regions like Sub-Saharan Africa and South Asia face greater challenges. However, there are outliers—some countries achieve higher happiness than their income or region might predict, often due to strong social support or governance.

- **Trends Over Time:**  
    The slope chart (fig7) shows that happiness rankings are not static. Some countries have improved significantly, while others have declined, reflecting the impact of policy, economic changes, and global events.

- **Income Matters, But Isn't Everything:**  
    Higher income tiers generally correspond to higher happiness, but the relationship is not absolute. Social and institutional factors can compensate for lower income in some contexts.

- **Inequality Within and Between Regions:**  
    Radar and box plots reveal that disparities in happiness and its drivers exist not only between regions but also within them. Addressing these inequalities requires targeted, context-sensitive interventions.

## What We Learned

- **Policy Implications:**  
    Effective policies to boost happiness must address multiple dimensions—improving economic conditions, strengthening social support networks, promoting health, and building trust in institutions.

- **Importance of Social Support and Trust:**  
    Countries with strong social support and low corruption often outperform their peers, even at similar income levels. Investing in social capital and good governance pays dividends in well-being.

- **Dynamic Nature of Happiness:**  
    Happiness is not fixed; it responds to changes in society, economy, and governance. Monitoring trends over time is crucial for understanding the impact of interventions and global events.

- **Value of Data-Driven Insights:**  
    Combining subjective well-being data with objective indicators enables a richer, more actionable understanding of what drives happiness. This approach can guide both national policy and international development efforts.

---

**In summary:**  
The World Happiness Report teaches us that well-being is a holistic, dynamic phenomenon shaped by both material and immaterial factors. By understanding and acting on these insights, societies can make meaningful progress toward greater happiness for all.

In [103]:
# Rank Change Slope Chart (larger size)
if all(col in df_all.columns for col in ["country", "year", "happiness_score"]):
    years_available = sorted(df_all['year'].unique())
    if len(years_available) >= 2:
        year1, year2 = years_available[0], years_available[-1]
        df_rank = df_all[df_all["year"].isin([year1, year2])].copy()
        df_rank["rank"] = df_rank.groupby("year")["happiness_score"].rank(ascending=False)
        pivot = df_rank.pivot(index="country", columns="year", values="rank").dropna()
        pivot = pivot.reset_index().sort_values(year2)
        fig7 = go.Figure()
        for _, row in pivot.iterrows():
            fig7.add_trace(go.Scatter(
                x=[year1, year2], 
                y=[row[year1], row[year2]], 
                mode='lines+markers+text',
                text=[None, row["country"]], 
                name=row["country"]
            ))
        fig7.update_layout(
            title=f"Rank Changes from {year1} to {year2}", 
            xaxis_title="Year", 
            yaxis_title="Rank", 
            yaxis_autorange="reversed",
            width=1200,
            height=700
        )
        fig7.show()

In [104]:
# Regional Happiness Trends (larger size)
if all(col in df_all.columns for col in ["region", "year", "happiness_score"]):
    regional_avg = df_all.groupby(["region", "year"])["happiness_score"].mean().reset_index()
    fig4 = px.line(
        regional_avg,
        x="year",
        y="happiness_score",
        color="region",
        title="Regional Happiness Trends",
        width=1200,
        height=700
    )
    fig4.show()

In [105]:
# Predictive Happiness Trajectories (larger size)
if all(col in df_all.columns for col in ['country', 'year', 'happiness_score']):
    try:
        # Countries to project ('North America and ANZ' is a region, not a country, so it is omitted)
        countries = [
            'Finland', 'Denmark', 'Afghanistan', 'Zimbabwe',
            'United States', 'Canada', 'South Africa'
        ]
        projections = []
        
        for country in countries:
            country_data = df_all[df_all['country'] == country][['year', 'happiness_score']].dropna()
            if len(country_data) >= 3:
                model = LinearRegression()
                X = country_data['year'].values.reshape(-1, 1)
                y = country_data['happiness_score'].values
                model.fit(X, y)
                
                for future_year in [2022, 2025, 2030]:
                    pred = model.predict([[future_year]])
                    projections.append({
                        'country': country,
                        'year': future_year,
                        'happiness_score': pred[0],
                        'type': 'projected'
                    })
        
        if projections:
            # Copy the slice so adding a column does not write to a view of df_all
            historical = df_all[df_all['country'].isin(countries)][['country', 'year', 'happiness_score']].copy()
            historical['type'] = 'historical'
            combined = pd.concat([historical, pd.DataFrame(projections)])
            
            fig19 = px.line(
                combined,
                x='year',
                y='happiness_score',
                color='country',
                line_dash='type',
                title='Happiness Projections for Selected Countries',
                width=1200,
                height=700
            )
            fig19.show()
    except Exception as e:
        print(f"Visualization 19 failed: {str(e)}")