In [6]:
import pandas as pd

# load data
temperature_data = pd.read_csv('./data/temperature.csv')
co2_data = pd.read_csv('./data/carbon_emission.csv')

temperature_data.head(), co2_data.head()

(   ObjectId                       Country ISO2 ISO3  F1961  F1962  F1963  \
 0         1  Afghanistan, Islamic Rep. of   AF  AFG -0.113 -0.164  0.847   
 1         2                       Albania   AL  ALB  0.627  0.326  0.075   
 2         3                       Algeria   DZ  DZA  0.164  0.114  0.077   
 3         4                American Samoa   AS  ASM  0.079 -0.042  0.169   
 4         5      Andorra, Principality of   AD  AND  0.736  0.112 -0.752   
 
    F1964  F1965  F1966  ...  F2013  F2014  F2015  F2016  F2017  F2018  F2019  \
 0 -0.764 -0.244  0.226  ...  1.281  0.456  1.093  1.555  1.540  1.544  0.910   
 1 -0.166 -0.388  0.559  ...  1.333  1.198  1.569  1.464  1.121  2.028  1.675   
 2  0.250 -0.100  0.433  ...  1.192  1.690  1.121  1.757  1.512  1.210  1.115   
 3 -0.140 -0.562  0.181  ...  1.257  1.170  1.009  1.539  1.435  1.189  1.539   
 4  0.308 -0.490  0.415  ...  0.831  1.946  1.690  1.990  1.925  1.919  1.964   
 
    F2020  F2021  F2022  
 0  0.498  1.327  2.01

We are using two datasets:

1. Temperature Data: Annual temperature anomalies by country, measured in degrees Celsius, with one column per year from 1961 (F1961) to 2022 (F2022).

2. CO2 Data: Monthly global atmospheric CO2 concentrations in parts per million (ppm).
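The temperature table is in wide format (one column per year), which is why the year columns are stacked before summarizing. As a minimal sketch of that reshaping, the snippet below rebuilds a tiny frame from the first two rows and three years of the `head()` output and converts it to long format. The names `wide`, `long_format`, `Year`, and `Anomaly` are illustrative choices, not part of the original notebook.

```python
import pandas as pd

# Tiny illustrative frame mimicking the wide layout shown in the head() output;
# values are copied from the first two rows and first three year columns.
wide = pd.DataFrame({
    "Country": ["Afghanistan, Islamic Rep. of", "Albania"],
    "ISO3": ["AFG", "ALB"],
    "F1961": [-0.113, 0.627],
    "F1962": [-0.164, 0.326],
    "F1963": [0.847, 0.075],
})

# Turn each year column into rows: one (country, year, anomaly) record per row.
long_format = wide.melt(
    id_vars=["Country", "ISO3"],
    var_name="Year",
    value_name="Anomaly",
)

# Drop the leading "F" so the year becomes an integer.
long_format["Year"] = long_format["Year"].str[1:].astype(int)

print(long_format)
```

A long layout like this makes it straightforward to group by year or by country later, at the cost of a taller table.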


Getting key statistics for both datasets:

In [7]:
# Stack every year column (F1961 ... F2022) into a single Series of anomalies
temperature_values = temperature_data.filter(regex=r'^F\d{4}$').stack()
temperature_stats = {
    "Mean": temperature_values.mean(),
    "Median": temperature_values.median(),
    "Variance": temperature_values.var()
}

# Summarize the CO2 readings stored in the "Value" column
co2_values = co2_data["Value"]
co2_stats = {
    "Mean": co2_values.mean(),
    "Median": co2_values.median(),
    "Variance": co2_values.var()
}

temperature_stats, co2_stats

({'Mean': 0.5377713483146068, 'Median': 0.47, 'Variance': 0.4294524831504413},
 {'Mean': 180.71615286624203, 'Median': 313.835, 'Variance': 32600.002004693})

The mean temperature change is approximately 0.54°C, with a median of 0.47°C and a variance of 0.43 °C², which indicates moderate variability in temperature anomalies across countries and years.
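Because variance is expressed in squared units, the standard deviation is often easier to read next to the mean. The sketch below derives it from the values reported in the output above; the variable names are illustrative and the numbers are copied from that output rather than recomputed from the data.

```python
import math

# Values copied from the notebook output above.
temperature_mean = 0.5377713483146068
temperature_median = 0.47
temperature_variance = 0.4294524831504413

# Standard deviation is the square root of the variance, in the same unit as the data.
temperature_std = math.sqrt(temperature_variance)

# A positive gap (mean above median) can hint at a longer right tail.
mean_median_gap = temperature_mean - temperature_median

print(f"Standard deviation: {temperature_std:.2f} °C")
print(f"Mean minus median: {mean_median_gap:.3f} °C")
```

The standard deviation comes out at roughly 0.66°C, which is of the same order as the mean itself.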

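For CO2 concentrations, the mean is 180.72 ppm, the median is considerably higher at 313.84 ppm, and the variance is about 32,600. A mean this far below the median, combined with such a large variance, is unusual for a gradually changing concentration series; it may indicate that the Value column mixes rows on different scales, so these figures should be checked before being read as variability in CO2 levels over the dataset's timeframe.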
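The large gap between the CO2 mean and median may be worth investigating before interpreting it. One possible cause, offered here only as an assumption, is that the `Value` column contains rows recorded in more than one unit. The sketch below uses a synthetic frame to show how summarizing per unit would expose that; the `Unit` column name, its labels, and the sample values are hypothetical and not taken from the real file.

```python
import pandas as pd

# Synthetic stand-in for co2_data. The "Unit" column, its labels, and these
# values are assumptions for illustration only.
sample = pd.DataFrame({
    "Unit": ["Parts Per Million", "Percent", "Parts Per Million", "Percent"],
    "Value": [315.70, 0.31, 317.45, 0.55],
})

# Pooled statistics mix both scales and are hard to interpret.
overall = sample["Value"].agg(["mean", "median", "var"])

# Per-unit statistics keep each scale separate.
per_unit = sample.groupby("Unit")["Value"].agg(["mean", "median", "var"])

print(overall)
print(per_unit)
```

If the real dataset has such a column, filtering to the ppm rows before computing the statistics would be the natural next step; if it does not, the discrepancy needs another explanation.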