## Project 2 – CO₂ Emissions and Happiness

Our goal is to visualize the relationship between CO₂ emissions per capita and happiness scores across countries in a given year, and to discuss the patterns that emerge.


### Part 1: Code

In [67]:
import pandas as pd
import plotly.express as px

In [68]:
# Read WDI data (CSV)
WDI = pd.read_csv("WDICSV.csv")
WDI

,Country Name,Country Code,Indicator Name,Indicator Code,1960,1961,1962,1963,1964,1965,...,2015,2016,2017,2018,2019,2020,2021,2022,2023,2024
0,Africa Eastern and Southern,AFE,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.ZS,,,,,,,...,18.001597,18.558234,19.043572,19.586457,20.192064,20.828814,21.372164,22.100884,,
1,Africa Eastern and Southern,AFE,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.RU.ZS,,,,,,,...,7.096003,7.406706,7.666648,8.020952,8.403358,8.718306,9.097176,9.473374,,
2,Africa Eastern and Southern,AFE,Access to clean fuels and technologies for coo...,EG.CFT.ACCS.UR.ZS,,,,,,,...,38.488233,38.779953,39.068462,39.445526,39.818645,40.276374,40.687817,41.211606,,
3,Africa Eastern and Southern,AFE,Access to electricity (% of population),EG.ELC.ACCS.ZS,,,,,,,...,33.922276,38.859598,40.223744,43.035073,44.390861,46.282371,48.127211,48.801258,50.668330,
4,Africa Eastern and Southern,AFE,"Access to electricity, rural (% of rural popul...",EG.ELC.ACCS.RU.ZS,,,,,,,...,16.527554,24.627753,25.432092,27.061929,29.154282,31.022083,32.809138,33.783960,35.375216,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403251,Zimbabwe,ZWE,Women who believe a husband is justified in be...,SG.VAW.REFU.ZS,,,,,,,...,14.500000,,,,,,,,,
403252,Zimbabwe,ZWE,Women who were first married by age 15 (% of w...,SP.M15.2024.FE.ZS,,,,,,,...,3.700000,,,,5.400000,,,,,
403253,Zimbabwe,ZWE,Women who were first married by age 18 (% of w...,SP.M18.2024.FE.ZS,,,,,,,...,32.400000,,,,33.700000,,,,,
403254,Zimbabwe,ZWE,Women's share of population ages 15+ living wi...,SH.DYN.AIDS.FE.ZS,,,,,,,...,58.687901,58.916636,59.131787,59.318579,59.495248,59.675019,59.832577,59.955283,60.053675,60.155308


In [69]:
# Extract CO₂ emissions per capita from WDI
CO2 = WDI[WDI["Indicator Code"] == "EN.GHG.CO2.PC.CE.AR5"] 

# Drop columns that are not years (keep Country Name + year columns)
CO2 = CO2.drop(columns=["Country Code","Indicator Name","Indicator Code"])
CO2

,Country Name,1960,1961,1962,1963,1964,1965,1966,1967,1968,...,2015,2016,2017,2018,2019,2020,2021,2022,2023,2024
176,Africa Eastern and Southern,,,,,,,,,,...,1.043373,1.026116,1.012873,0.997157,0.983492,0.845417,0.863533,0.816361,0.784641,
1692,Africa Western and Central,,,,,,,,,,...,0.499587,0.506323,0.495060,0.509713,0.518890,0.497159,0.512749,0.505462,0.482217,
3208,Arab World,,,,,,,,,,...,4.796053,4.743015,4.718876,4.600948,4.572472,4.272909,4.400698,4.403588,4.380929,
4724,Caribbean small states,,,,,,,,,,...,10.302975,9.744127,9.310695,9.244274,9.594039,8.731591,8.512252,8.317394,8.178227,
6240,Central Europe and the Baltics,,,,,,,,,,...,6.796332,6.919925,7.204385,7.189206,6.834593,6.438835,7.001686,6.884049,6.262062,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
395852,Virgin Islands (U.S.),,,,,,,,,,...,0.001857,0.001860,0.001864,0.001869,0.001875,0.001882,0.001889,0.001897,0.001906,
397368,West Bank and Gaza,,,,,,,,,,...,,,,,,,,,,
398884,"Yemen, Rep.",,,,,,,,,,...,0.457352,0.331022,0.314050,0.356586,0.355001,0.318360,0.310938,0.292772,0.276684,
400400,Zambia,,,,,,,,,,...,0.324408,0.343080,0.405772,0.448225,0.379538,0.392310,0.396080,0.387388,0.388873,


In [70]:
# Reshape from wide to long: each row becomes (Country Name, Year, CO2)
CO2_long = CO2.melt(
    id_vars="Country Name",
    var_name="Year",
    value_name="CO2"
)
# Remove rows with missing CO2 values and convert Year from string to integer
CO2_long = CO2_long.dropna(subset=["CO2"])
CO2_long["Year"] = CO2_long["Year"].astype(int)
CO2_long


,Country Name,Year,CO2
2660,Africa Eastern and Southern,1970,1.355303
2661,Africa Western and Central,1970,0.347427
2662,Arab World,1970,2.247111
2663,Caribbean small states,1970,3.540789
2664,Central Europe and the Baltics,1970,8.960268
...,...,...,...
17018,Viet Nam,2023,3.716396
17019,Virgin Islands (U.S.),2023,0.001906
17021,"Yemen, Rep.",2023,0.276684
17022,Zambia,2023,0.388873
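
The wide-to-long reshape can be checked on a small frame before trusting it on the full dataset. The sketch below uses three rows and two year columns copied from the CO2 output shown earlier (West Bank and Gaza, Yemen, Rep., and Zambia for 2022 and 2023). The names `wide` and `long_df` are illustrative and not part of the notebook.

```python
import pandas as pd

# Three rows in the WDI wide layout: one column per year (year labels are strings).
wide = pd.DataFrame({
    "Country Name": ["West Bank and Gaza", "Yemen, Rep.", "Zambia"],
    "2022": [float("nan"), 0.292772, 0.387388],
    "2023": [float("nan"), 0.276684, 0.388873],
})

# melt() stacks the year columns into (Country Name, Year, CO2) rows.
long_df = wide.melt(id_vars="Country Name", var_name="Year", value_name="CO2")

# Drop rows without a CO2 value, then turn the year labels into integers.
long_df = long_df.dropna(subset=["CO2"])
long_df["Year"] = long_df["Year"].astype(int)

print(long_df)
```

West Bank and Gaza has no values in either year, so it disappears from the long table. The same happens in the full dataset, which is why `CO2_long` starts at 1970 rather than 1960.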


In [71]:
# Read World Happiness Report data (Excel)
WHR25 = pd.read_excel("WHR25_Data_Figure_2.1v3.xlsx")
WHR25

,Year,Rank,Country name,Life evaluation (3-year average),Lower whisker,Upper whisker,Explained by: Log GDP per capita,Explained by: Social support,Explained by: Healthy life expectancy,Explained by: Freedom to make life choices,Explained by: Generosity,Explained by: Perceptions of corruption,Dystopia + residual
0,2024,147,Afghanistan,1.364,1.301,1.427,0.649,0.0,0.155,0.0,0.075,0.135,0.348
1,2023,143,Afghanistan,1.721,1.667,1.775,0.628,0.0,0.242,0.0,0.091,0.088,0.672
2,2022,137,Afghanistan,1.859,1.795,1.923,0.645,0.0,0.087,0.0,0.093,0.059,0.976
3,2021,146,Afghanistan,2.404,2.339,2.469,0.758,0.0,0.289,0.0,0.089,0.005,1.263
4,2020,150,Afghanistan,2.523,2.449,2.596,0.370,0.0,0.126,0.0,0.122,0.010,1.895
...,...,...,...,...,...,...,...,...,...,...,...,...,...
1964,2016,138,Zimbabwe,3.875,,,,,,,,,
1965,2015,131,Zimbabwe,4.193,,,,,,,,,
1966,2014,115,Zimbabwe,4.610,,,,,,,,,
1967,2012,103,Zimbabwe,4.827,,,,,,,,,


In [72]:
# Keep only the three relevant columns
WHR25_small = WHR25[["Year", "Country name", "Life evaluation (3-year average)"]]
WHR25_small = WHR25_small.sort_values(by="Year")
WHR25_small

,Year,Country name,Life evaluation (3-year average)
1968,2011,Zimbabwe,3.978
953,2011,Lao PDR,5.161
293,2011,Cameroon,4.376
1768,2011,Togo,3.007
306,2011,Canada,7.499
...,...,...,...
722,2024,Hungary,5.915
735,2024,Iceland,7.515
1464,2024,Romania,6.563
1500,2024,Saudi Arabia,6.600


In [73]:
# Choose the year for analysis
YEAR = 2020  # feel free to change to 2018, 2019, 2021, etc.

# Subset happiness data for that year
WHR_year = WHR25_small[WHR25_small["Year"] == YEAR]

# Subset CO2 data for that year
CO2_year = CO2_long[CO2_long["Year"] == YEAR]

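# Merge happiness and CO2 datasets on country name and year.
# The inner join keeps only countries whose names are spelled identically in both sources.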
merged = pd.merge(
    WHR_year,
    CO2_year,
    left_on=["Country name", "Year"],
    right_on=["Country Name", "Year"],
    how="inner"
)

# Drop duplicated country name column
merged = merged.drop(columns=["Country Name"])

merged.head()


,Year,Country name,Life evaluation (3-year average),CO2
0,2020,Hungary,5.992,5.067381
1,2020,Bangladesh,5.025,0.616108
2,2020,Israel,7.157,6.676835
3,2020,Kenya,4.607,0.367857
4,2020,United Kingdom,7.064,4.779676
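
Because the merge matches on country names, any country spelled differently in the two sources is silently dropped by the inner join. One way to see what is lost is an outer merge with `indicator=True`, sketched below on toy name lists. The WDI spelling "Yemen, Rep." and the aggregate "Arab World" come from the outputs above; the WHR-side spelling "Yemen" and the `NAME_FIXES` mapping are assumptions made for illustration only.

```python
import pandas as pd

# Toy name lists; "Yemen" on the WHR side is an assumed spelling for illustration.
whr_names = pd.DataFrame({"Country name": ["Hungary", "Kenya", "Yemen"]})
wdi_names = pd.DataFrame(
    {"Country Name": ["Arab World", "Hungary", "Kenya", "Yemen, Rep."]}
)

# An outer merge with indicator=True labels each row as left_only, right_only, or both.
check = whr_names.merge(
    wdi_names,
    left_on="Country name",
    right_on="Country Name",
    how="outer",
    indicator=True,
)

only_in_whr = sorted(check.loc[check["_merge"] == "left_only", "Country name"])
only_in_wdi = sorted(check.loc[check["_merge"] == "right_only", "Country Name"])
print("Only in WHR:", only_in_whr)
print("Only in WDI:", only_in_wdi)

# Hypothetical mapping from WDI spellings to WHR spellings, built by inspecting the lists.
NAME_FIXES = {"Yemen, Rep.": "Yemen"}
wdi_fixed = wdi_names.copy()
wdi_fixed["Country Name"] = wdi_fixed["Country Name"].replace(NAME_FIXES)

matched = whr_names.merge(
    wdi_fixed, left_on="Country name", right_on="Country Name", how="inner"
)
print(len(matched), "countries matched after harmonizing names")
```

Regional aggregates such as "Arab World" are expected to stay unmatched, since the happiness data covers individual countries only.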


In [74]:
# Scatter plot: CO₂ emissions per capita (x) vs. happiness score (y)
fig = px.scatter(
    merged,
    x="CO2",
    y="Life evaluation (3-year average)",
    hover_name="Country name",
    labels={
        "CO2": f"CO₂ emissions per capita ({YEAR})",
        "Life evaluation (3-year average)": f"Happiness score ({YEAR})"
    },
    title=f"CO₂ Emissions vs Happiness Across Countries in {YEAR}"
)

fig.show()
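
A rank correlation is one way to put a number on what the scatter plot shows, since it does not assume a linear relationship. The sketch below computes Spearman's rho as the Pearson correlation of the ranks, using only the five rows printed by `merged.head()`. The frame name `sample` is illustrative, and five rows are far too few to stand in for the full 2020 result.

```python
import pandas as pd

# The five rows shown by merged.head() for 2020 (not the full dataset).
sample = pd.DataFrame({
    "Country name": ["Hungary", "Bangladesh", "Israel", "Kenya", "United Kingdom"],
    "Life evaluation (3-year average)": [5.992, 5.025, 7.157, 4.607, 7.064],
    "CO2": [5.067381, 0.616108, 6.676835, 0.367857, 4.779676],
})

# Spearman's rho equals the Pearson correlation computed on the ranks.
co2_rank = sample["CO2"].rank()
life_rank = sample["Life evaluation (3-year average)"].rank()
rho = co2_rank.corr(life_rank)

print(f"Spearman rank correlation on the five-row sample: {rho:.2f}")
```

Replacing `sample` with `merged` applies the same calculation to every matched country for the chosen `YEAR`.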


### Part 2: Interpretation

Overall pattern:
  CO₂ emissions per capita alone do not determine happiness. Some high-emission countries do not have high happiness scores, while some countries with moderate emissions (for example, certain European or Nordic countries) have very high life evaluation scores.

High-emission outliers:  
  Resource-rich or highly industrialized countries may appear on the far right of the plot with very high CO₂ emissions, yet their happiness scores vary. This suggests that high emissions, a rough proxy for energy consumption, do not by themselves guarantee subjective well-being.

Low-emission countries:  
  Many low- and middle-income countries have low per-capita emissions but show a wide range of happiness scores. The spread suggests that factors not captured by the plot, such as institutional quality, social safety nets, and governance, likely play an important role.