In [1]:
import pandas as pd

# load data
temperature_data = pd.read_csv('./data/temperature.csv')
co2_data = pd.read_csv('./data/carbon_emission.csv')

temperature_data.head(), co2_data.head()

(   ObjectId                       Country ISO2 ISO3  F1961  F1962  F1963  \
 0         1  Afghanistan, Islamic Rep. of   AF  AFG -0.113 -0.164  0.847   
 1         2                       Albania   AL  ALB  0.627  0.326  0.075   
 2         3                       Algeria   DZ  DZA  0.164  0.114  0.077   
 3         4                American Samoa   AS  ASM  0.079 -0.042  0.169   
 4         5      Andorra, Principality of   AD  AND  0.736  0.112 -0.752   
 
    F1964  F1965  F1966  ...  F2013  F2014  F2015  F2016  F2017  F2018  F2019  \
 0 -0.764 -0.244  0.226  ...  1.281  0.456  1.093  1.555  1.540  1.544  0.910   
 1 -0.166 -0.388  0.559  ...  1.333  1.198  1.569  1.464  1.121  2.028  1.675   
 2  0.250 -0.100  0.433  ...  1.192  1.690  1.121  1.757  1.512  1.210  1.115   
 3 -0.140 -0.562  0.181  ...  1.257  1.170  1.009  1.539  1.435  1.189  1.539   
 4  0.308 -0.490  0.415  ...  0.831  1.946  1.690  1.990  1.925  1.919  1.964   
 
    F2020  F2021  F2022  
 0  0.498  1.327  2.01

We are using two datasets:

1. Temperature Data: Annual temperature anomalies in degrees Celsius, with one row per country and one column per year (F1961 through F2022).

2. CO2 Data: Monthly global atmospheric CO2 concentrations in parts per million (ppm).
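
The wide layout of the temperature table (one `F<year>` column per year) can be awkward for grouping and plotting. As a minimal sketch, assuming a tiny made-up table with the same column pattern, the year columns can be reshaped into rows with `melt`. The values and the `Anomaly` column name are illustrative and not part of the original data.

```python
import pandas as pd

# Hypothetical miniature of the temperature table: one row per country,
# one "F<year>" column per year (values are made up for illustration).
wide = pd.DataFrame({
    "Country": ["A", "B"],
    "ISO3": ["AAA", "BBB"],
    "F1961": [0.1, 0.6],
    "F1962": [-0.2, 0.3],
})

# Turn the year columns into rows, then convert "F1961" into the integer 1961.
long = wide.melt(id_vars=["Country", "ISO3"], var_name="Year", value_name="Anomaly")
long["Year"] = long["Year"].str[1:].astype(int)

print(long)
```

In the long form, each row is a single (country, year, anomaly) observation, which makes per-year or per-country aggregation a plain `groupby`.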


Getting key statistics for both datasets:

In [3]:
# stack all year columns (F1961, F1962, ...) into a single Series of anomalies
temperature_values = temperature_data.filter(regex='^F').stack()
temperature_stats = {
    "Mean": temperature_values.mean(),
    "Median": temperature_values.median(),
    "Variance": temperature_values.var()
}

# summary statistics over the raw CO2 "Value" column
co2_values = co2_data["Value"]
co2_stats = {
    "Mean": co2_values.mean(),
    "Median": co2_values.median(),
    "Variance": co2_values.var()
}

temperature_stats, co2_stats

({'Mean': np.float64(0.5377713483146068),
  'Median': np.float64(0.47),
  'Variance': np.float64(0.4294524831504413)},
 {'Mean': np.float64(180.71615286624203),
  'Median': np.float64(313.835),
  'Variance': np.float64(32600.002004693)})

Mean temperature change is approximately 0.54°C, with a median of 0.47°C and a variance of 0.43 (a standard deviation of about 0.66°C), so the anomalies vary noticeably across countries and years.

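For CO2 concentrations, the mean is 180.72 ppm, the median is higher at 313.84 ppm, and the variance is 32,600. A mean this far below the median indicates a strongly left-skewed `Value` column, so part of this variance may reflect how the column is composed rather than variation in CO2 levels over the dataset's timeframe.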
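
One possible reason for a mean that sits far below the median, which the output above cannot confirm on its own, is that a single column mixes rows measured on different scales. The sketch below uses made-up rows to show the effect and how per-group statistics avoid it. The `Unit` column name and its labels are assumptions for illustration, not something the notebook shows.

```python
import pandas as pd

# Made-up rows: the "Unit" column and its labels are hypothetical.
co2 = pd.DataFrame({
    "Unit": ["Parts Per Million"] * 4 + ["Percent"] * 2,
    "Value": [315.0, 316.0, 317.0, 318.0, 0.2, 0.3],
})

# Pooled statistics mix the two scales, pulling the mean far below the median.
pooled = co2["Value"].agg(["mean", "median"])

# Per-unit statistics keep each scale separate.
by_unit = co2.groupby("Unit")["Value"].agg(["mean", "median", "var"])

print(pooled)
print(by_unit)
```

If a check like `co2_data.columns` or a unit-like column's `value_counts()` turned up more than one scale, the statistics would be worth recomputing on the ppm rows only.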

# Time-Series Analysis

Examine how temperature changes and CO2 concentrations have evolved over time and the relationship between them:

In [5]:
import plotly.graph_objects as go
import plotly.express as px

In [6]:
# extract time-series data for plotting
# temperature: averaging across countries for each year

temperature_years = temperature_data.filter(regex='^F').mean(axis=0)
temperature_years.index = temperature_years.index.str.replace('F', '').astype(int)

# CO2: parsing year and averaging monthly data
co2_data['Year'] = co2_data['Date'].str[:4].astype(int)
co2_yearly = co2_data.groupby('Year')['Value'].mean()

# plot for levels
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=temperature_years.index, y=temperature_years.values,
    mode='lines+markers', name='Temperature Change (°C)'
))
fig.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_yearly.values,
    mode='lines+markers', name='CO2 concentration (ppm)'
))
fig.update_layout(
    title="Time-series of Temperature Change and CO2 Concentrations",
    xaxis_title="Year", 
    yaxis_title="Values",
    template="plotly_white",
    legend_title="Metrics"
)
fig.show()

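The time-series graph shows a consistent increase in CO2 concentrations over the years, which indicates accumulation of greenhouse gases in the atmosphere. Over the same period, global temperature change shows a slight upward trend, which suggests that rising CO2 levels are associated with global warming. Because both series share a single y-axis, the temperature curve (on the order of 1°C) is visually compressed next to CO2 values in the hundreds. The temporal alignment of the two trends is consistent with this hypothesis.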
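
When two series with very different scales are drawn on one axis, the smaller one looks nearly flat. One common workaround, sketched here with made-up yearly values standing in for `temperature_years` and `co2_yearly`, is to standardize each series before plotting. The `zscore` helper is a hypothetical name introduced only for this example.

```python
import pandas as pd

# Made-up yearly series standing in for temperature_years and co2_yearly.
years = [2000, 2001, 2002, 2003]
temperature = pd.Series([0.8, 0.9, 1.1, 1.2], index=years)
co2 = pd.Series([369.0, 371.0, 373.0, 376.0], index=years)

def zscore(series: pd.Series) -> pd.Series:
    """Rescale a series to mean 0 and sample standard deviation 1."""
    return (series - series.mean()) / series.std()

# Both columns are now unitless and directly comparable on one axis.
standardized = pd.DataFrame({
    "Temperature Change (z)": zscore(temperature),
    "CO2 concentration (z)": zscore(co2),
})
print(standardized.round(2))
```

Standardizing trades physical units for comparability; a secondary y-axis is an alternative when the original units need to stay readable.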

In [7]:
# correlation heatmap
merged_data = pd.DataFrame({
    "Temperature Change": temperature_years,
    "CO2 concentration": co2_yearly
}).dropna()

heatmap_fig = px.imshow(
    merged_data.corr(),
    text_auto=".2f",
    color_continuous_scale="RdBu",
    title="Correlation Heatmap"
)
heatmap_fig.update_layout(
    template="plotly_white"
)
heatmap_fig.show()

scatter_fig = px.scatter(
    merged_data,
    x="CO2 concentration", y="Temperature Change",
    labels={"CO2 concentration": "CO2 concentration (ppm)", "Temperature Change": "Temperature Change (°C)"},
    template="plotly_white"
)

scatter_fig.update_traces(marker=dict(size=10, opacity=0.7))
scatter_fig.show()

The heatmap reveals a strong positive correlation (0.96) between CO2 concentrations and temperature changes. This statistical relationship reinforces the observation that higher CO2 levels are closely linked with increasing global temperatures, which highlights the importance of addressing carbon emissions to mitigate climate change.

The scatter plot shows a clear linear trend, where higher CO2 concentrations correspond to greater temperature changes. This is consistent with a direct relationship between CO2 concentrations and global warming, and it provides further support for policies targeting reductions in carbon emissions to combat climate impacts.
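
To make the number in the heatmap concrete: `DataFrame.corr()` defaults to the Pearson coefficient, which is the covariance of the two columns divided by the product of their standard deviations. The sketch below checks this on made-up paired values (not the notebook's data).

```python
import numpy as np
import pandas as pd

# Made-up paired observations (illustrative only).
df = pd.DataFrame({
    "CO2 concentration": [330.0, 340.0, 355.0, 370.0, 390.0],
    "Temperature Change": [0.1, 0.3, 0.4, 0.7, 0.9],
})

x = df["CO2 concentration"].to_numpy()
y = df["Temperature Change"].to_numpy()

# Pearson r = cov(x, y) / (std(x) * std(y)), using sample (ddof=1) estimates.
cov_xy = np.cov(x, y, ddof=1)[0, 1]
r_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

# DataFrame.corr() uses Pearson by default, so the two values should agree.
r_pandas = df.corr().loc["CO2 concentration", "Temperature Change"]

print(round(r_manual, 4), round(r_pandas, 4))
```

Because the coefficient only measures linear co-movement, two series that both trend upward over time will tend to score high regardless of mechanism.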

# Trends and Seasonal Variation Analysis

In [10]:
from scipy.stats import linregress

# temperature trend
temp_trend = linregress(temperature_years.index, temperature_years.values)
temp_trend_line = temp_trend.slope * temperature_years.index + temp_trend.intercept

# CO2 trend
co2_trend = linregress(co2_yearly.index, co2_yearly.values)
co2_trend_line = co2_trend.slope * co2_yearly.index + co2_trend.intercept

fig_trends = go.Figure()

fig_trends.add_trace(go.Scatter(
    x=temperature_years.index, y=temperature_years.values,
    mode='lines+markers', name='Temperature Change (°C)'
))
fig_trends.add_trace(go.Scatter(
    x=temperature_years.index, y=temp_trend_line,
    mode='lines', name=f'Temperature Trend (Slope: {temp_trend.slope:.2f})', line=dict(dash='dash')
))

fig_trends.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_yearly.values,
    mode='lines+markers', name='CO2 concentration (ppm)'
))
fig_trends.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_trend_line,
    mode='lines', name=f'CO2 Trend (Slope: {co2_trend.slope:.2f})', line=dict(dash='dash')
))

fig_trends.update_layout(
    title="Trends in Temperature Change and CO₂ Concentrations",
    xaxis_title="Year",
    yaxis_title="Values",
    template="plotly_white",
    legend_title="Metrics"
)
fig_trends.show()

# seasonal variations in CO2 concentrations
co2_data['Month'] = co2_data['Date'].str[-2:].astype(int)
co2_monthly = co2_data.groupby('Month')['Value'].mean()

fig_seasonal = px.line(
    co2_monthly,
    x=co2_monthly.index,
    y=co2_monthly.values,
    labels={"x": "Month", "y": "CO₂ Concentration (ppm)"},
    title="Seasonal Variations in CO₂ Concentrations",
    markers=True
)
fig_seasonal.update_layout(
    xaxis=dict(tickmode="array", tickvals=list(range(1, 13))),
    template="plotly_white"
)
fig_seasonal.show()

The first graph shows the linear trends in both temperature change and CO₂ concentrations over time, with the fitted slopes shown in the legend. The CO₂ trend has a much steeper slope (0.32 ppm per year) than temperature (0.03°C per year). Because the two slopes are in different units, their magnitudes are not directly comparable, but both are positive: CO₂ levels are rising steadily, and the temperature impact, though numerically smaller, is accumulating and may have long-term consequences.
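
Slopes measured in ppm per year and °C per year can be put on a common footing by dividing each by the standard deviation of its own series. The sketch below uses made-up, perfectly linear series to show that very different raw slopes can describe equally strong trends. `standardized_slope` is a hypothetical helper, not part of SciPy.

```python
import numpy as np
from scipy.stats import linregress

# Made-up yearly series (illustrative only).
years = np.arange(2000, 2010)
temperature = 0.80 + 0.03 * (years - 2000)   # °C, rising 0.03 per year
co2 = 370.0 + 2.0 * (years - 2000)           # ppm, rising 2.0 per year

def standardized_slope(x, y):
    """Linear trend slope in standard deviations of y per unit of x."""
    return linregress(x, y).slope / np.std(y, ddof=1)

temp_slope = standardized_slope(years, temperature)
co2_slope = standardized_slope(years, co2)

# Raw slopes differ by a factor of about 67, yet the standardized slopes match.
print(round(temp_slope, 4), round(co2_slope, 4))
```

On real data the standardized slopes will generally differ, and that difference says something about trend strength rather than about choice of units.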

The second graph highlights the seasonal fluctuations in CO₂ concentrations, which peak during late spring and early summer (around May) and reach the lowest levels in fall (around September). These variations are likely due to natural processes such as plant photosynthesis, which absorbs CO₂ during the growing season, and respiration, which releases CO₂ in the off-season. This seasonal cycle underscores the role of natural carbon sinks in moderating atmospheric CO₂ levels.
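
Averaging raw values by calendar month mixes the long-term rise into the seasonal picture, since later months in each year sit higher on the trend. A hedged alternative, which is classical moving-average detrending and not the method used in the cell above, subtracts a centered 12-month rolling mean before grouping by month. The data below is synthetic, the trend is deliberately exaggerated, and the `YYYYMmm` date format is an assumption chosen to match the string slicing used earlier.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: an exaggerated upward trend plus a cycle peaking in May.
dates = [f"{y}M{m:02d}" for y in range(2000, 2005) for m in range(1, 13)]
co2 = pd.DataFrame({"Date": dates})
co2["Month"] = co2["Date"].str[-2:].astype(int)
t = np.arange(len(co2))
co2["Value"] = 370 + 0.5 * t + 3 * np.cos(2 * np.pi * (co2["Month"] - 5) / 12)

# Naive approach: average raw values by month (the trend leaks into the result).
naive_cycle = co2.groupby("Month")["Value"].mean()

# Detrended approach: subtract a centered 12-month rolling mean first.
trend = co2["Value"].rolling(window=12, center=True).mean()
co2["Detrended"] = co2["Value"] - trend          # NaN at the series edges
seasonal_cycle = co2.groupby("Month")["Detrended"].mean()

print("naive peak month:", naive_cycle.idxmax())
print("detrended peak month:", seasonal_cycle.idxmax())
```

With this synthetic input the naive average places the peak a month late, while the detrended cycle recovers the May peak and November trough that were built into the series.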


# Correlation and Causality Analysis

To quantify the relationship between CO2 and temperature anomalies, we will compute Pearson and Spearman correlation coefficients. To investigate whether past changes in CO2 help predict temperature anomalies, we will perform Granger causality tests: