Carbon Emissions Impact Analysis with Python

Climate change is one of the most critical challenges of our time, with rising carbon emissions playing a pivotal role in driving global temperature anomalies. Analyzing the relationship between CO₂ concentrations and temperature changes provides valuable insights into the underlying patterns and trends that shape our planet’s climate.

This notebook uses Python to analyse how atmospheric CO₂ concentrations relate to temperature change.

In [2]:
import pandas as pd

# load the temperature-change and CO₂ datasets from local CSV files
temperature_data = pd.read_csv(r"C:\Users\David Adebisi\Desktop\My_DataAnalyst_Tutorial\Project\Carbon_Emissions\temperature.csv")
co2_data = pd.read_csv(r"C:\Users\David Adebisi\Desktop\My_DataAnalyst_Tutorial\Project\Carbon_Emissions\carbon_emmission.csv")

# preview the first five rows of each dataset
temperature_data_preview = temperature_data.head()
co2_data_preview = co2_data.head()

temperature_data_preview, co2_data_preview

(   ObjectId                       Country ISO2 ISO3  F1961  F1962  F1963  \
 0         1  Afghanistan, Islamic Rep. of   AF  AFG -0.113 -0.164  0.847   
 1         2                       Albania   AL  ALB  0.627  0.326  0.075   
 2         3                       Algeria   DZ  DZA  0.164  0.114  0.077   
 3         4                American Samoa   AS  ASM  0.079 -0.042  0.169   
 4         5      Andorra, Principality of   AD  AND  0.736  0.112 -0.752   
 
    F1964  F1965  F1966  ...  F2013  F2014  F2015  F2016  F2017  F2018  F2019  \
 0 -0.764 -0.244  0.226  ...  1.281  0.456  1.093  1.555  1.540  1.544  0.910   
 1 -0.166 -0.388  0.559  ...  1.333  1.198  1.569  1.464  1.121  2.028  1.675   
 2  0.250 -0.100  0.433  ...  1.192  1.690  1.121  1.757  1.512  1.210  1.115   
 3 -0.140 -0.562  0.181  ...  1.257  1.170  1.009  1.539  1.435  1.189  1.539   
 4  0.308 -0.490  0.415  ...  0.831  1.946  1.690  1.990  1.925  1.919  1.964   
 
    F2020  F2021  F2022  
 0  0.498  1.327  2.01
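The absolute Windows paths in the loading cell only resolve on the author's machine. A minimal sketch of a more portable pattern, assuming the two CSV files sit in a `data` folder next to the notebook (the folder name and the `DATA_DIR` variable are hypothetical; the file names are taken from the original `read_csv` calls):

```python
from pathlib import Path

# Hypothetical layout: the CSV files live in ./data next to the notebook.
DATA_DIR = Path("data")

# File names copied from the original read_csv calls.
temperature_path = DATA_DIR / "temperature.csv"
co2_path = DATA_DIR / "carbon_emmission.csv"

# pd.read_csv accepts Path objects, e.g. pd.read_csv(temperature_path)
print(temperature_path.name, co2_path.name)
```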

Calculating key statistics for temperature changes and CO₂ concentrations, such as mean, median, and variance:

In [3]:
# flatten every yearly column (F1961 ... F2022) into one long series of values
temperature_values = temperature_data.filter(regex='^F').stack()
temperature_stats = {
    "Mean": temperature_values.mean(),
    "Median": temperature_values.median(),
    "Variance": temperature_values.var()
}

# summary statistics over every row of the CO₂ "Value" column
co2_values = co2_data["Value"]
co2_stats = {
    "Mean": co2_values.mean(),
    "Median": co2_values.median(),
    "Variance": co2_values.var()
}

temperature_stats, co2_stats

({'Mean': 0.5377713483146068, 'Median': 0.47, 'Variance': 0.4294524831504378},
 {'Mean': 180.71615286624203,
  'Median': 313.835,
  'Variance': 32600.00200469294})
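To make the flattening step concrete, here is a small sketch on a made-up frame (the values and the two-year span are illustrative, not taken from the dataset). `filter(regex='^F')` keeps only the columns whose names start with `F`, and `stack()` turns that wide block into one long series.

```python
import pandas as pd

# Toy frame mimicking the layout: one row per country, one F<year> column per year.
toy = pd.DataFrame({
    "Country": ["A", "B"],
    "F1961": [0.1, 0.3],
    "F1962": [0.2, 0.4],
})

# Keep only the year columns, then flatten them into a single long series.
values = toy.filter(regex="^F").stack()

print(len(values))    # 4 values: 2 countries x 2 years
print(values.mean())  # pooled mean over every country-year
```

Statistics computed this way pool every country-year with equal weight, so they describe the whole table rather than any single year or country.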

Time-Series Analysis

In [4]:
import plotly.graph_objects as go
import plotly.express as px

temperature_years = temperature_data.filter(regex='^F').mean(axis=0)
temperature_years.index = temperature_years.index.str.replace('F', '').astype(int)

co2_data['Year'] = co2_data['Date'].str[:4].astype(int)
co2_yearly = co2_data.groupby('Year')['Value'].mean()

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=temperature_years.index, y=temperature_years.values, mode='lines+markers', name="Temperature Change (°C)"
))

fig.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_yearly.values, mode='lines+markers', name="CO₂ Concentration (ppm)", line=dict(dash='dash')
))
fig.update_layout(
    title="Time-series of Temperature Change and CO₂ Concentrations",
    xaxis_title="Year",
    yaxis_title="Values",
    template="plotly_white",
    legend_title="Metrics"
)
fig.show()

merged_data = pd.DataFrame({
    "Temperature Change": temperature_years,
    "CO₂ Concentration": co2_yearly
}).dropna()

heatmap_fig = px.imshow(
    merged_data.corr(),
    text_auto=".2f",
    color_continuous_scale="RdBu", 
    title="Correlation Heatmap"
)
heatmap_fig.update_layout(
    template="plotly_white"
)
heatmap_fig.show()

scatter_fig = px.scatter(
    merged_data,
    x="CO₂ Concentration", y="Temperature Change",
    labels={"CO₂ Concentration": "CO₂ Concentration (ppm)", "Temperature Change": "Temperature Change (°C)"},
    title="Temperature Change vs CO₂ Concentration",
    template="plotly_white"
)

scatter_fig.update_traces(marker=dict(size=10, opacity=0.7))
scatter_fig.show()

The time-series graph reveals a steady rise in CO₂ concentrations (measured in ppm) over the years, highlighting the ongoing buildup of greenhouse gases in the atmosphere. Alongside this, there is a noticeable upward trend in global temperature change, suggesting a link between increasing CO₂ levels and global warming. The alignment over time reinforces the hypothesis that CO₂ plays a significant role in driving temperature increases.

The heatmap shows a strong positive correlation (0.96) between CO₂ concentrations and temperature changes. This strong statistical link further supports the observation that rising CO₂ levels are closely associated with increasing global temperatures, emphasizing the need to address carbon emissions to combat climate change.
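For reference, the Pearson coefficient shown in the heatmap is the covariance of the two series divided by the product of their standard deviations. A dependency-free sketch with made-up numbers (the two short lists are illustrative only, and `pearson` is a hypothetical helper):

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only: two series that both drift upward over time.
co2 = [320.0, 330.0, 340.0, 350.0, 360.0]
temp = [0.0, 0.1, 0.3, 0.4, 0.6]

r = pearson(co2, temp)
print(round(r, 3))
```

Any two series that share an upward trend will score close to 1 on this measure, which is why the causality test later in the notebook works on differenced data instead of raw levels.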

The scatter plot shows a clear linear trend, where higher CO₂ concentrations correspond to greater temperature changes. This visual evidence is consistent with a close relationship between CO₂ concentrations and warming, which provides further support for policies targeting reductions in carbon emissions to combat climate impacts.
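The heatmap and scatter plot both depend on `merged_data`, which pairs the two yearly series by their shared year index and drops years present in only one of them. A small sketch of that alignment with made-up values:

```python
import pandas as pd

# Illustrative yearly series with partially overlapping years.
temperature_years = pd.Series({1961: 0.1, 1962: 0.2, 1963: 0.3})
co2_yearly = pd.Series({1962: 318.0, 1963: 319.0, 1964: 320.0})

# Building a DataFrame from two Series aligns them on the union of their indexes;
# dropna() then keeps only the years where both series have a value.
merged = pd.DataFrame({
    "Temperature Change": temperature_years,
    "CO₂ Concentration": co2_yearly,
}).dropna()

print(list(merged.index))  # [1962, 1963]
```

Only the overlapping years feed the correlation, so the reported coefficient covers the period for which both datasets have observations.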

Trends and Seasonal Variations Analysis

In [5]:
from scipy.stats import linregress

# temperature trend
temp_trend = linregress(temperature_years.index, temperature_years.values)
temp_trend_line = temp_trend.slope * temperature_years.index + temp_trend.intercept

# CO2 trend
co2_trend = linregress(co2_yearly.index, co2_yearly.values)
co2_trend_line = co2_trend.slope * co2_yearly.index + co2_trend.intercept

fig_trends = go.Figure()

fig_trends.add_trace(go.Scatter(
    x=temperature_years.index, y=temperature_years.values,
    mode='lines+markers', name="Temperature Change (°C)"
))
fig_trends.add_trace(go.Scatter(
    x=temperature_years.index, y=temp_trend_line,
    mode='lines', name=f"Temperature Trend (Slope: {temp_trend.slope:.2f})", line=dict(dash='dash')
))
fig_trends.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_yearly.values,
    mode='lines+markers', name="CO₂ Concentration (ppm)"
))
fig_trends.add_trace(go.Scatter(
    x=co2_yearly.index, y=co2_trend_line,
    mode='lines', name=f"CO₂ Trend (Slope: {co2_trend.slope:.2f})", line=dict(dash='dash')
))

fig_trends.update_layout(
    title="Trends in Temperature Change and CO₂ Concentrations",
    xaxis_title="Year",
    yaxis_title="Values",
    template="plotly_white",
    legend_title="Metrics"
)
fig_trends.show()

# seasonal variations in CO2 concentrations
co2_data['Month'] = co2_data['Date'].str[-2:].astype(int)
co2_monthly = co2_data.groupby('Month')['Value'].mean()

fig_seasonal = px.line(
    co2_monthly,
    x=co2_monthly.index,
    y=co2_monthly.values,
    labels={"x": "Month", "y": "CO₂ Concentration (ppm)"},
    title="Seasonal Variations in CO₂ Concentrations",
    markers=True
)
fig_seasonal.update_layout(
    xaxis=dict(tickmode="array", tickvals=list(range(1, 13))),
    template="plotly_white"
)
fig_seasonal.show()

The first graph illustrates the linear trends of both temperature change and CO₂ concentrations over time, as reflected by their respective slopes. The fitted CO₂ slope is 0.32 ppm per year, compared with 0.03 °C per year for temperature change. Because the two series are measured in different units, each slope describes that series' own rate of increase rather than a like-for-like comparison. Both trends are positive, and the pattern suggests that although temperature changes are progressing gradually, the steady accumulation of CO₂ could lead to substantial long-term impacts.
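Since the two slopes carry different units (ppm per year versus °C per year), one way to put them on a common footing is to divide each slope by the standard deviation of its own series, giving a trend in standard deviations per year. The sketch below does this with an ordinary least-squares slope on made-up series; the helper names are hypothetical and this is not a step the notebook itself performs.

```python
import statistics

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def standardized_slope(xs, ys):
    """Slope expressed in standard deviations of ys per unit of xs."""
    return ols_slope(xs, ys) / statistics.stdev(ys)

# Illustrative values only.
years = [2000, 2001, 2002, 2003, 2004]
co2 = [370.0, 372.0, 374.0, 376.0, 378.0]  # rises 2 ppm per year
temp = [0.50, 0.52, 0.54, 0.56, 0.58]      # rises 0.02 °C per year

print(ols_slope(years, co2))
print(ols_slope(years, temp))
print(standardized_slope(years, co2), standardized_slope(years, temp))
```

In this toy case the raw slopes differ by a factor of 100, yet the standardized slopes match, which shows how much of a raw slope gap can come from units alone.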

The second graph highlights the seasonal variations in CO₂ concentrations, with peaks occurring in late spring and early summer (around May) and the lowest levels observed in the fall (around September). These fluctuations are largely driven by natural processes like plant photosynthesis, which absorbs CO₂ during the growing season, and respiration, which releases CO₂ during the off-season. This seasonal cycle emphasizes the critical role of natural carbon sinks in regulating atmospheric CO₂ levels.
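Averaging raw values by calendar month can blend the seasonal cycle with the long-term rise whenever months are unevenly covered across years. A sketch of one common alternative, which subtracts each year's mean before averaging by month, is shown below. It assumes date strings of the form `YYYYMmm` (for example `1958M03`), which is inferred from the `str[:4]` and `str[-2:]` slicing in the notebook; the sample records are made up.

```python
from collections import defaultdict

# Made-up records in the assumed "YYYYMmm" format: (date string, CO₂ value in ppm).
records = [
    ("2000M01", 369.0), ("2000M05", 372.0), ("2000M09", 367.0),
    ("2001M01", 371.0), ("2001M05", 374.0), ("2001M09", 369.0),
]

# Group values by year so each year's mean can be removed.
by_year = defaultdict(list)
for date, value in records:
    by_year[int(date[:4])].append(value)
year_mean = {year: sum(vals) / len(vals) for year, vals in by_year.items()}

# Collect each month's deviation from its own year's mean.
by_month = defaultdict(list)
for date, value in records:
    year, month = int(date[:4]), int(date[-2:])
    by_month[month].append(value - year_mean[year])

# Average deviation per calendar month: the seasonal cycle without the trend.
seasonal = {month: sum(devs) / len(devs) for month, devs in sorted(by_month.items())}
print(seasonal)
```

In the toy data the May value sits highest above its yearly mean and the September value lowest, mirroring the shape described for the real series.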

Correlation and Causality Analysis

In [6]:
from scipy.stats import pearsonr, spearmanr
from statsmodels.tsa.stattools import grangercausalitytests

# pearson and spearman correlation coefficients
pearson_corr, _ = pearsonr(merged_data["CO₂ Concentration"], merged_data["Temperature Change"])
spearman_corr, _ = spearmanr(merged_data["CO₂ Concentration"], merged_data["Temperature Change"])

# granger causality test: checks whether the second column (CO₂ Concentration)
# helps predict the first column (Temperature Change)
granger_data = merged_data[["Temperature Change", "CO₂ Concentration"]].diff().dropna()  # first differencing to reduce non-stationarity
granger_results = grangercausalitytests(granger_data, maxlag=3, verbose=False)

# extracting p-values for causality
granger_p_values = {f"Lag {lag}": round(results[0]['ssr_chi2test'][1], 4)
                    for lag, results in granger_results.items()}

pearson_corr, spearman_corr, granger_p_values



(0.9554282559257312,
 0.9379013371609882,
 {'Lag 1': 0.0617, 'Lag 2': 0.6754, 'Lag 3': 0.2994})
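`grangercausalitytests` checks whether the series in the second column helps predict the series in the first column, so with the column order of `merged_data` the null hypothesis at each lag is that CO₂ concentration does not Granger-cause temperature change. A short sketch of reading the p-values above against a conventional 5% threshold (the threshold is an assumption for illustration, not something the notebook sets):

```python
# p-values copied from the notebook output above
granger_p_values = {"Lag 1": 0.0617, "Lag 2": 0.6754, "Lag 3": 0.2994}

ALPHA = 0.05  # conventional significance level, chosen here for illustration

# Lags at which the null hypothesis of "no Granger causality" would be rejected.
significant_lags = [lag for lag, p in granger_p_values.items() if p < ALPHA]

print(significant_lags)  # [] : no lag falls below the threshold
```

None of the three lags rejects the null at 5% (lag 1 comes closest at 0.0617), so on year-to-year differences this test does not show CO₂ changes leading temperature changes, despite the high correlation in levels.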

Lagged Effects Analysis

In [7]:
import statsmodels.api as sm

# creating lagged CO2 data to investigate lagged effects
merged_data['CO₂ Lag 1'] = merged_data["CO₂ Concentration"].shift(1)
merged_data['CO₂ Lag 2'] = merged_data["CO₂ Concentration"].shift(2)
merged_data['CO₂ Lag 3'] = merged_data["CO₂ Concentration"].shift(3)

# dropping rows with NaN due to lags
lagged_data = merged_data.dropna()

X = lagged_data[['CO₂ Concentration', 'CO₂ Lag 1', 'CO₂ Lag 2', 'CO₂ Lag 3']]
y = lagged_data['Temperature Change']
X = sm.add_constant(X)  # adding a constant for intercept

model = sm.OLS(y, X).fit()

model_summary = model.summary()
model_summary

                            OLS Regression Results
==============================================================================
Dep. Variable:      Temperature Change   R-squared:                      0.949
Model:                             OLS   Adj. R-squared:                 0.945
Method:                  Least Squares   F-statistic:                    252.5
Date:                 Mon, 05 May 2025   Prob (F-statistic):          2.97e-34
Time:                         14:49:32   Log-Likelihood:                45.098
No. Observations:                   59   AIC:                            -80.2
Df Residuals:                       54   BIC:                           -69.81
Df Model:                            4
Covariance Type:             nonrobust
==============================================================================
                        coef   std err         t     P>|t|    [0.025    0.975]
------------------------------------------------------------------------------
const                -4.7980     0.317   -15.137     0.000    -5.434    -4.163
CO₂ Concentration     0.3245     0.055     5.942     0.000     0.215     0.434
CO₂ Lag 1            -0.2962     0.068    -4.361     0.000    -0.432    -0.160
CO₂ Lag 2             0.0104     0.068     0.153     0.879    -0.126     0.146
CO₂ Lag 3            -0.0107     0.056    -0.191     0.849    -0.123     0.101
==============================================================================
Omnibus:                         2.369   Durbin-Watson:                  1.554
Prob(Omnibus):                   0.306   Jarque-Bera (JB):               2.077
Skew:                           -0.457   Prob(JB):                       0.354
Kurtosis:                        2.902   Cond. No.                      7540.0
==============================================================================
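One way to read the lag coefficients together: in a distributed-lag regression, the sum of the coefficients on the current and lagged regressor approximates the total change in the outcome after a sustained one-unit increase in the regressor. A quick check using the coefficients reported in the summary:

```python
# Coefficients copied from the OLS summary above
coefficients = {
    "CO₂ Concentration": 0.3245,
    "CO₂ Lag 1": -0.2962,
    "CO₂ Lag 2": 0.0104,
    "CO₂ Lag 3": -0.0107,
}

# Cumulative effect of a sustained one-unit CO₂ increase across the current year and three lags.
cumulative_effect = sum(coefficients.values())
print(round(cumulative_effect, 4))  # 0.028
```

The large, opposite-signed coefficients on the current value and lag 1 nearly cancel, and the high condition number (7540) suggests the lagged CO₂ columns are strongly collinear, so the individual coefficients are best interpreted with caution.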


Clustering Climate Patterns

In [8]:
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import numpy as np

# preparing the data for clustering
clustering_data = merged_data[["Temperature Change", "CO₂ Concentration"]].dropna()

scaler = StandardScaler()
scaled_data = scaler.fit_transform(clustering_data)

# applying K-Means clustering
kmeans = KMeans(n_clusters=3, random_state=42)  # assuming 3 clusters for simplicity
clustering_data['Cluster'] = kmeans.fit_predict(scaled_data)

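# adding labels for periods with similar climate patterns;
# K-Means cluster ids are arbitrary, so rank clusters by mean CO₂ before naming them
cluster_order = clustering_data.groupby('Cluster')['CO₂ Concentration'].mean().sort_values().index
label_names = ['Low Temp & CO₂', 'Moderate Temp & CO₂', 'High Temp & CO₂']
clustering_data['Label'] = clustering_data['Cluster'].map(dict(zip(cluster_order, label_names)))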

fig_clusters = px.scatter(
    clustering_data,
    x="CO₂ Concentration",
    y="Temperature Change",
    color="Label",
    color_discrete_sequence=px.colors.qualitative.Set2,
    labels={
        "CO₂ Concentration": "CO₂ Concentration (ppm)",
        "Temperature Change": "Temperature Change (°C)",
        "Label": "Climate Pattern"
    },
    title="Clustering of Years Based on Climate Patterns"
)

fig_clusters.update_layout(
    template="plotly_white",
    legend_title="Climate Pattern"
)

fig_clusters.show()
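K-Means assigns points by Euclidean distance, so without scaling the feature with the larger numeric range would dominate the clustering. The sketch below reproduces what `StandardScaler` does to each column (subtract the mean, divide by the population standard deviation) on made-up values; `standardize` is a hypothetical helper.

```python
import statistics

def standardize(values):
    """Z-score a list: zero mean, unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # StandardScaler also uses the population std
    return [(v - mean) / std for v in values]

# Illustrative values only: CO₂ spans tens of ppm, temperature spans about one degree.
co2 = [320.0, 340.0, 360.0, 380.0, 400.0]
temp = [0.0, 0.2, 0.5, 0.8, 1.0]

co2_scaled = standardize(co2)
temp_scaled = standardize(temp)

# Raw ranges differ by orders of magnitude; scaled ranges are comparable.
print(max(co2) - min(co2), max(temp) - min(temp))
print(round(max(co2_scaled) - min(co2_scaled), 2), round(max(temp_scaled) - min(temp_scaled), 2))
```

After scaling, both features contribute on a similar footing to the distance calculation, which is the reason the notebook clusters `scaled_data` and not the raw columns.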



Predicting Temperature Changes with What-If Analysis

We will use a simple linear regression model to simulate how changes in CO₂ concentrations might influence global temperatures. By leveraging the historical relationship between CO₂ concentrations and temperature anomalies, this model allows us to predict the potential impact of different emission scenarios.

The scenarios we simulate include:
1. Increase CO₂ by 10%: Predict the rise in temperature anomalies.
2. Decrease CO₂ by 10%: Estimate the cooling effect.
3. Increase CO₂ by 20%: Analyze the impact of more aggressive emissions growth.
4. Decrease CO₂ by 20%: Evaluate the benefit of significant emission reductions.

In [9]:
from sklearn.linear_model import LinearRegression

# Preparing data
X = merged_data[["CO₂ Concentration"]].values  # CO₂ concentration as input
y = merged_data["Temperature Change"].values   # temperature change as target

model = LinearRegression()
model.fit(X, y)

# function to simulate "what-if" scenarios
def simulate_temperature_change(co2_percentage_change):
    # baseline: mean CO₂ concentration over the whole merged period
    # (a historical average, not the latest observed value)
    current_mean_co2 = merged_data["CO₂ Concentration"].mean()
    # scale the baseline by the requested percentage change
    new_co2 = current_mean_co2 * (1 + co2_percentage_change / 100)

    # predict temperature change at the new concentration
    predicted_temp = model.predict([[new_co2]])
    return predicted_temp[0]

# simulating scenarios
scenarios = {
    "Increase CO₂ by 10%": simulate_temperature_change(10),
    "Decrease CO₂ by 10%": simulate_temperature_change(-10),
    "Increase CO₂ by 20%": simulate_temperature_change(20),
    "Decrease CO₂ by 20%": simulate_temperature_change(-20),
}

scenarios

{'Increase CO₂ by 10%': 1.0866445037958163,
 'Decrease CO₂ by 10%': -0.059993041237237144,
 'Increase CO₂ by 20%': 1.6599632763123422,
 'Decrease CO₂ by 20%': -0.6333118137537621}

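Under this linear model, a 10% increase in CO₂ from the historical mean raises the predicted temperature anomaly to about 1.09 °C, and a 20% increase raises it to about 1.66 °C, which demonstrates the sensitivity of the fitted temperatures to CO₂ levels. Conversely, a 10% reduction brings the prediction to about -0.06 °C and a 20% reduction to about -0.63 °C. These figures are linear extrapolations of the historical relationship, not physical projections, but they suggest that sizeable CO₂ reductions would be associated with markedly lower anomalies and could potentially reverse some warming trends.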
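Because the model is a straight line and every scenario scales the same baseline (the mean CO₂ concentration over the merged period), each additional 10% step changes the prediction by the same amount. This can be checked directly from the scenario outputs above:

```python
# Predictions copied from the scenario output above (°C), keyed by % change in CO₂
scenarios = {
    -20: -0.6333118137537621,
    -10: -0.059993041237237144,
    10: 1.0866445037958163,
    20: 1.6599632763123422,
}

# Change in predicted anomaly per 10% step in CO₂, estimated from neighbouring scenarios.
step_up = scenarios[20] - scenarios[10]
step_down = scenarios[-10] - scenarios[-20]

# The unchanged (0%) baseline lies midway between the +10% and -10% predictions.
baseline = (scenarios[10] + scenarios[-10]) / 2

print(round(step_up, 4), round(step_down, 4), round(baseline, 4))
```

Both steps come out at roughly 0.57 °C per 10% change, around a baseline prediction of roughly 0.51 °C, so the scenario values are predicted anomalies in their own right and not amounts of warming or cooling relative to today.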