 # Project 2 - Programming for Data Analysis
 
 ---
 An analysis of paleo-present climate data


**Problem statement:**


• Analyse CO2 vs Temperature Anomaly from 800kyrs – present.

• Examine one other (paleo/modern) feature (e.g. CH4 or polar ice coverage)

• Examine the Irish context and its climate change signals (see the Maynooth study: The emergence of a climate change signal in long-term Irish meteorological observations, ScienceDirect)

• Fuse and analyse data from various data sources and format fused data set as a pandas dataframe and export to csv and json formats

• For all of the above variables, analyse the data, the trends and the relationships between them (temporal leads/lags/frequency analysis).

• Predict global temperature anomaly over next few decades (synthesise data) and compare to published climate models if atmospheric CO2 trends continue

• Comment on accelerated warming based on very latest features (e.g. temperature/polar ice coverage)

Use a Jupyter notebook for your analysis and track your progress using GitHub. Use an academic referencing style.

In [249]:
#imports

import pandas as pd

import matplotlib.pyplot as plt


## 1. Data loading

### 1.1. Data for CO2 analysis


The first dataset is the composite CO2 record published by Lüthi et al. (2008).

Ages are given in years before present (BP), where "present" is 1950.

Data source: https://www.nature.com/articles/nature06949

Official citation: Lüthi, D., Le Floch, M., Bereiter, B. et al. High-resolution carbon dioxide concentration record 650,000–800,000 years before present. Nature 453, 379–382 (2008). https://doi.org/10.1038/nature06949



In [250]:
df_composite_co2 = pd.read_excel('datasets/41586_2008_BFnature06949_MOESM31_ESM.xls', sheet_name='3.  Composite CO2',  skiprows=2, header=4)

In [251]:
df_composite_co2.head(15)


Unnamed: 0,EDC3_gas_a (yr),CO2 (ppmv)
0,137,280.4
1,268,274.9
2,279,277.9
3,395,279.1
4,404,281.9
5,485,277.7
6,559,281.1
7,672,282.2
8,754,280.1
9,877,278.4


`EDC3_gas_a (yr)` - gas age in years before present, where present is 1950.

For example, `EDC3_gas_a (yr)` = 137 corresponds to the calendar year 1950 - 137 = 1813 A.D.

More discussion of this convention can be found at https://www.sedgeochem.uni-bremen.de/kiloyears.html

`CO2 (ppmv)` - measured CO2 concentration (parts per million by volume)
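The conversion from years before present to calendar years can be written as a small helper. This is only a sketch: the function name `bp_to_calendar_year` and its `present` parameter are hypothetical and not part of the dataset or of pandas. Negative results stand for years before 1 CE on a simple linear scale.

```python
def bp_to_calendar_year(age_bp, present=1950):
    """Convert an age in years before present (BP) to a calendar year.

    Hypothetical helper for illustration; "present" defaults to 1950,
    the convention used by the ice-core records in this notebook.
    """
    return present - age_bp


print(bp_to_calendar_year(137))   # 1813
print(bp_to_calendar_year(-50))   # 2000 (negative ages lie after 1950)
```

The same arithmetic works element-wise on a pandas column, which is what the next cell does.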

In [252]:
df_composite_co2['Year']=1950 - df_composite_co2['EDC3_gas_a (yr)']

In [253]:
df_composite_co2.columns


Index(['EDC3_gas_a (yr)', 'CO2 (ppmv)', 'Year'], dtype='object')

In [254]:
df_composite_co2.head(15)


Unnamed: 0,EDC3_gas_a (yr),CO2 (ppmv),Year
0,137,280.4,1813
1,268,274.9,1682
2,279,277.9,1671
3,395,279.1,1555
4,404,281.9,1546
5,485,277.7,1465
6,559,281.1,1391
7,672,282.2,1278
8,754,280.1,1196
9,877,278.4,1073


In [255]:
df_composite_co2 = df_composite_co2.astype({'EDC3_gas_a (yr)': 'int', 'CO2 (ppmv)':'float64',  'Year':'int'})

The second CO2 dataset comes from the IPCC report (https://www.ipcc.ch/):

https://www.ipcc.ch/report/ar6/wg1/downloads/report/IPCC_AR6_WGI_Chapter01_SM.pdf

It revises the Lüthi et al. (2008) record and adds the Bereiter et al. (2015) data on top.


Composite of atmospheric CO2 records from Antarctic ice cores

Reference:
Bereiter et al. (2014), Revision of the EPICA Dome C CO2 record from 800 to 600 kyr before present, Geophysical Research Letters, doi: 10.1002/2014GL061957.

This new version of CO2 composite replaces the old version of Lüthi et al. (2008), which contains the analytical bias described in the article mentioned above and lower quality data and many other sections.
For details about the improvements relative to the previous version see supplementary information of the main article.
For detailed references of all records collected in this file, also refer to the supplementary information of the main article.
For the latest anthropogenic data, refer to the NOAA/Mauna Loa record.
The age unit is years before present (yr BP), where present refers to 1950.

Note, not all records shown in sheet "all records" are part of the composite. 
If millennial scale or smaller details of the composite are studied, we recommend to look into all records available for that period and not only in the composite.


In [256]:
ipcc_co2_composite = pd.read_excel('datasets/grl52461-sup-0003-supplementary.xls', sheet_name='CO2 Composite',skiprows=7, header=7)
ipcc_co2_composite

Unnamed: 0,Gasage (yr BP),CO2 (ppmv),sigma mean CO2 (ppmv)
0,-51.030000,368.022488,0.060442
1,-48.000000,361.780737,0.370000
2,-46.279272,359.647793,0.098000
3,-44.405642,357.106740,0.159923
4,-43.080000,353.946685,0.043007
...,...,...,...
1896,803925.284376,202.921723,2.064488
1897,804009.870607,207.498645,0.915083
1898,804522.674630,204.861938,1.642851
1899,805132.442334,202.226839,0.689587


In [257]:
ipcc_co2_composite.columns = ['gasage_yr_bp', 'CO2 (ppmv)', 'sigma mean CO2 (ppmv)']

In [258]:
# drop() returns a new DataFrame; without assignment this only previews the result
ipcc_co2_composite.drop(['sigma mean CO2 (ppmv)'], axis=1)

Unnamed: 0,gasage_yr_bp,CO2 (ppmv)
0,-51.030000,368.022488
1,-48.000000,361.780737
2,-46.279272,359.647793
3,-44.405642,357.106740
4,-43.080000,353.946685
...,...,...
1896,803925.284376,202.921723
1897,804009.870607,207.498645
1898,804522.674630,204.861938
1899,805132.442334,202.226839


In [259]:
ipcc_co2_composite.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1901 entries, 0 to 1900
Data columns (total 3 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   gasage_yr_bp           1901 non-null   float64
 1   CO2 (ppmv)             1901 non-null   float64
 2   sigma mean CO2 (ppmv)  1901 non-null   float64
dtypes: float64(3)
memory usage: 44.7 KB


Here again, `Gasage (yr BP)` is the age in years before present, where present is 1950.

In [260]:
ipcc_co2_composite['Year'] = 1950 - ipcc_co2_composite['gasage_yr_bp']


In [261]:
ipcc_co2_composite.columns

Index(['gasage_yr_bp', 'CO2 (ppmv)', 'sigma mean CO2 (ppmv)', 'Year'], dtype='object')

In [262]:
ipcc_co2_composite = ipcc_co2_composite.astype({'gasage_yr_bp': 'float64', 'CO2 (ppmv)':'float64',  'Year':'int'})

The IPCC data has more recent measurements, up to 2001, whereas `df_composite_co2` (Lüthi et al.) only reaches 1813.
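One detail of the integer conversion above is worth keeping in mind: casting a float column with `astype('int')` drops the fractional part, which rounds toward zero. For negative years this differs from rounding down. The sketch below uses made-up values to compare truncation with `numpy.floor`; it is an illustration, not a change to the analysis.

```python
import numpy as np
import pandas as pd

# Toy fractional calendar years (illustrative values only).
years = pd.Series([2001.03, 1998.0, -50.5, -792658.9])

truncated = years.astype(int)           # drops the fraction (rounds toward zero)
floored = np.floor(years).astype(int)   # always rounds down

comparison = pd.DataFrame(
    {"year": years, "truncated": truncated, "floored": floored}
)
print(comparison)
```

Either choice is defensible at this time resolution, but it should be the same for every dataset that is later merged on `Year`.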

In [263]:

co2_df_merged = ipcc_co2_composite.merge(df_composite_co2, on = 'Year', how = 'outer') 

In [264]:
co2_df_merged


Unnamed: 0,gasage_yr_bp,CO2 (ppmv)_x,sigma mean CO2 (ppmv),Year,EDC3_gas_a (yr),CO2 (ppmv)_y
0,-51.030000,368.022488,0.060442,2001,,
1,-48.000000,361.780737,0.370000,1998,,
2,-46.279272,359.647793,0.098000,1996,,
3,-44.405642,357.106740,0.159923,1994,,
4,-43.080000,353.946685,0.043007,1993,,
...,...,...,...,...,...,...
2988,,,,-792658,794608.0,199.4
2989,,,,-793252,795202.0,195.2
2990,,,,-794517,796467.0,189.3
2991,,,,-795149,797099.0,188.4


In [265]:
co2_df_merged['merged_CO2'] = co2_df_merged['CO2 (ppmv)_x'].fillna(co2_df_merged['CO2 (ppmv)_y'])
co2_df_merged

Unnamed: 0,gasage_yr_bp,CO2 (ppmv)_x,sigma mean CO2 (ppmv),Year,EDC3_gas_a (yr),CO2 (ppmv)_y,merged_CO2
0,-51.030000,368.022488,0.060442,2001,,,368.022488
1,-48.000000,361.780737,0.370000,1998,,,361.780737
2,-46.279272,359.647793,0.098000,1996,,,359.647793
3,-44.405642,357.106740,0.159923,1994,,,357.106740
4,-43.080000,353.946685,0.043007,1993,,,353.946685
...,...,...,...,...,...,...,...
2988,,,,-792658,794608.0,199.4,199.400000
2989,,,,-793252,795202.0,195.2,195.200000
2990,,,,-794517,796467.0,189.3,189.300000
2991,,,,-795149,797099.0,188.4,188.400000


In [266]:
co2_merged = co2_df_merged[['Year', 'merged_CO2']].copy()
co2_merged

Unnamed: 0,Year,merged_CO2
0,2001,368.022488
1,1998,361.780737
2,1996,359.647793
3,1994,357.106740
4,1993,353.946685
...,...,...
2988,-792658,199.400000
2989,-793252,195.200000
2990,-794517,189.300000
2991,-795149,188.400000
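The fusion pattern used in this section (outer merge on `Year`, then filling gaps in one column from the other) can be checked on a small toy example. The frames and column names below are made up for illustration. `indicator=True` is a standard `merge` option that records which source each row came from.

```python
import pandas as pd

# Toy stand-ins for the two CO2 records (values are illustrative only).
recent = pd.DataFrame({"Year": [2001, 1998, 1813], "CO2": [368.0, 361.8, 280.0]})
older = pd.DataFrame({"Year": [1813, 1682], "CO2": [280.4, 274.9]})

merged = recent.merge(
    older, on="Year", how="outer",
    suffixes=("_recent", "_older"), indicator=True,
)

# Prefer the first record and fall back to the second where it is missing.
merged["CO2_merged"] = merged["CO2_recent"].fillna(merged["CO2_older"])
merged = merged.sort_values("Year", ascending=False).reset_index(drop=True)
print(merged[["Year", "CO2_merged", "_merge"]])
```

Rows marked `both` are the years where the two records overlap; in those rows the value from the first frame wins.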


### 1.2. Data for temperature analysis

Temperature anomaly:

In climate change studies, temperature anomalies are more important than absolute temperature. A temperature anomaly is the difference from an average, or baseline, temperature. The baseline temperature is typically computed by averaging 30 or more years of temperature data. A positive anomaly indicates the observed temperature was warmer than the baseline, while a negative anomaly indicates the observed temperature was cooler than the baseline. (https://www.ncei.noaa.gov/access/monitoring/dyk/anomalies-vs-temperature)
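As a minimal sketch of this definition, the snippet below computes anomalies for a synthetic temperature series. All numbers are made up, and the 1951-1980 baseline is chosen only because it is the reference period of the NASA dataset used later in this notebook.

```python
import pandas as pd

# Synthetic annual mean temperatures in °C (made-up numbers for illustration).
temps = pd.Series(
    [14.0, 14.2, 13.8, 14.6, 15.0],
    index=[1951, 1965, 1980, 2000, 2020],
    name="temp_c",
)

# Baseline: mean over the reference period (label-based slice, inclusive).
baseline = temps.loc[1951:1980].mean()

# Anomaly: difference of each value from the baseline.
anomaly = temps - baseline
print(anomaly.round(2))
```

Positive values are warmer than the baseline and negative values are cooler.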



Summary/Abstract:
A high-resolution deuterium profile is now available along the entire European Project for Ice Coring in Antarctica Dome C ice core, extending this climate record back to marine isotope stage 20.2, ~800,000 years ago. Experiments performed with an atmospheric general circulation model including water isotopes support its temperature interpretation. We assessed the general correspondence between Dansgaard-Oeschger events and their smoothed Antarctic counterparts for this Dome C record, which reveals the presence of such features with similar amplitudes during previous glacial periods. We suggest that the interplay between obliquity and precession accounts for the variable intensity of interglacial periods in ice core records.

Reference (https://www.ncei.noaa.gov/access/paleo-search/study/6080)

Link to dataset (https://www.ncei.noaa.gov/pub/data/paleo/icecore/antarctica/epica_domec/edc3deuttemp2007.txt)

In [267]:
epicaDC_temp_anom = pd.read_csv('datasets/edc3deuttemp2007.csv', skiprows=range(91), delimiter=r"\s+")


Column 1: Bag number (55 cm sample)

Column 2: Top depth (m)

Column 3: EDC3 age scale (years before year 1950)

Column 4:  Deuterium dD data (per mille with respect to SMOW)

Column 5: Temperature estimate (temperature difference from the average of the last 1000 years)


In [268]:
epicaDC_temp_anom.head(15)

Unnamed: 0,Bag,ztop,Age,Deuterium,Temperature
0,1,0.0,-50.0,,
1,2,0.55,-43.54769,,
2,3,1.1,-37.41829,,
3,4,1.65,-31.61153,,
4,5,2.2,-24.51395,,
5,6,2.75,-17.73776,,
6,7,3.3,-10.95945,,
7,8,3.85,-3.20879,,
8,9,4.4,5.48176,,
9,10,4.95,13.52038,,


In [269]:
epicaDC_temp_anom['Year']=1950 - (epicaDC_temp_anom['Age'].astype(int))


In [270]:
epicaDC_temp_anom.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5800 entries, 0 to 5799
Data columns (total 6 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Bag          5800 non-null   int64  
 1   ztop         5800 non-null   float64
 2   Age          5800 non-null   float64
 3   Deuterium    5788 non-null   float64
 4   Temperature  5785 non-null   float64
 5   Year         5800 non-null   int32  
dtypes: float64(4), int32(1), int64(1)
memory usage: 249.3 KB


In [271]:
epicaDC_temp_anom

Unnamed: 0,Bag,ztop,Age,Deuterium,Temperature,Year
0,1,0.00,-50.00000,,,2000
1,2,0.55,-43.54769,,,1993
2,3,1.10,-37.41829,,,1987
3,4,1.65,-31.61153,,,1981
4,5,2.20,-24.51395,,,1974
...,...,...,...,...,...,...
5795,5796,3187.25,797408.00000,-440.20,-8.73,-795458
5796,5797,3187.80,798443.00000,-439.00,-8.54,-796493
5797,5798,3188.35,799501.00000,-441.10,-8.88,-797551
5798,5799,3188.90,800589.00000,-441.42,-8.92,-798639


In [272]:
epicaDC_temp_anom.describe()

Unnamed: 0,Bag,ztop,Age,Deuterium,Temperature,Year
count,5800.0,5800.0,5800.0,5788.0,5785.0,5800.0
mean,2900.5,1594.725,190016.390617,-417.57961,-4.580228,-188065.971207
std,1674.460112,920.953062,192546.207239,20.359332,3.446971,192546.358262
min,1.0,0.0,-50.0,-449.5,-10.58,-799712.0
25%,1450.75,797.3625,46330.56935,-432.5,-7.45,-270506.25
50%,2900.5,1594.725,121793.34,-421.3,-5.2,-119843.0
75%,4350.25,2392.0875,272456.74,-403.2,-1.82,-44380.0
max,5800.0,3189.45,801662.0,0.95,5.46,2000.0


In [273]:
epica_temp_anom = epicaDC_temp_anom[['Year', 'Temperature']].copy()
epica_temp_anom

Unnamed: 0,Year,Temperature
0,2000,
1,1993,
2,1987,
3,1981,
4,1974,
...,...,...
5795,-795458,-8.73
5796,-796493,-8.54
5797,-797551,-8.88
5798,-798639,-8.92


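The age scale of this dataset extends to the year 2000, although the most recent rows shown above have no temperature estimate (NaN).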
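Because the newest rows in the table above have no temperature estimate, it can help to check which year is the latest one that actually carries a value. The sketch below does this on a toy frame with the same layout; the numbers are invented.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the layout above: newest rows first, some without a value.
toy = pd.DataFrame({
    "Year": [2000, 1993, 1987, 1900, 1850],
    "Temperature": [np.nan, np.nan, np.nan, 0.4, -0.1],
})

# Keep only rows with a temperature, then take the most recent year.
valid = toy.dropna(subset=["Temperature"])
latest_year_with_data = valid["Year"].max()
print(latest_year_with_data)
```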


NASA temperature dataset: Combined Land-Surface Air and Sea-Surface Water Temperature Anomalies (Land-Ocean Temperature Index, L-OTI).

The values are temperature anomalies, i.e. deviations from the corresponding 1951-1980 means.

The file used here contains the global-mean monthly, seasonal, and annual means, 1880-present, updated through the most recent month.


https://data.giss.nasa.gov/gistemp/


In [274]:
nasa_month_anom_temp = pd.read_csv('datasets/GLB.Ts+dSST.csv', header=1)
nasa_month_anom_temp

Unnamed: 0,Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,J-D,D-N,DJF,MAM,JJA,SON
0,1880,-0.19,-0.24,-0.09,-0.16,-0.10,-0.21,-0.18,-0.10,-0.14,-0.23,-0.21,-0.18,-0.17,***,***,-0.12,-0.16,-0.19
1,1881,-0.20,-0.14,0.03,0.05,0.06,-0.19,0.00,-0.04,-0.15,-0.22,-0.19,-0.07,-0.09,-.10,-.17,0.05,-0.08,-0.19
2,1882,0.16,0.14,0.05,-0.17,-0.14,-0.23,-0.16,-0.07,-0.14,-0.23,-0.16,-0.35,-0.11,-.09,.08,-0.09,-0.15,-0.18
3,1883,-0.29,-0.37,-0.12,-0.18,-0.17,-0.08,-0.06,-0.14,-0.21,-0.11,-0.23,-0.11,-0.17,-.19,-.34,-0.16,-0.09,-0.18
4,1884,-0.13,-0.07,-0.36,-0.40,-0.34,-0.36,-0.30,-0.27,-0.27,-0.25,-0.33,-0.30,-0.28,-.27,-.10,-0.37,-0.31,-0.28
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
139,2019,0.93,0.95,1.17,1.01,0.85,0.90,0.94,0.95,0.92,1.00,0.99,1.09,0.98,.96,.93,1.01,0.93,0.97
140,2020,1.17,1.24,1.17,1.13,1.01,0.92,0.90,0.87,0.98,0.88,1.10,0.80,1.01,1.04,1.17,1.10,0.89,0.99
141,2021,0.81,0.64,0.89,0.75,0.78,0.84,0.92,0.82,0.92,1.00,0.94,0.86,0.85,.84,.75,0.81,0.86,0.95
142,2022,0.91,0.89,1.05,0.84,0.84,0.92,0.94,0.95,0.89,0.96,0.72,0.80,0.89,.90,.89,0.91,0.94,0.86


In [275]:
nasa_month_anom_temp.columns

Index(['Year', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep',
       'Oct', 'Nov', 'Dec', 'J-D', 'D-N', 'DJF', 'MAM', 'JJA', 'SON'],
      dtype='object')
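The table above contains `***` placeholders in some seasonal columns, which makes pandas read those columns as text. A possible way to handle this, sketched here on a small in-memory sample, is the `na_values` option of `read_csv`. The title line in the sample is a placeholder, not the real file header.

```python
import io

import pandas as pd

# Small in-memory sample; the first line stands in for the file's title line.
csv_text = """placeholder title line
Year,J-D,D-N,DJF
1880,-0.17,***,***
1881,-0.09,-.10,-.17
"""

# Treat '***' as missing so the affected columns are parsed as floats.
df = pd.read_csv(io.StringIO(csv_text), header=1, na_values="***")
print(df.dtypes)
```

Only the annual `J-D` column is used below, so this matters mainly if the seasonal columns are analysed later.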

In [276]:
nasa_anom_temp = nasa_month_anom_temp[['Year', 'J-D']].copy()
nasa_anom_temp

Unnamed: 0,Year,J-D
0,1880,-0.17
1,1881,-0.09
2,1882,-0.11
3,1883,-0.17
4,1884,-0.28
...,...,...
139,2019,0.98
140,2020,1.01
141,2021,0.85
142,2022,0.89


In [277]:
nasa_anom_temp.rename(columns={'J-D': 'Temperature'}, inplace=True)
nasa_anom_temp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 144 entries, 0 to 143
Data columns (total 2 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Year         144 non-null    int64  
 1   Temperature  144 non-null    float64
dtypes: float64(1), int64(1)
memory usage: 2.4 KB


In [278]:
temp_anom_merged = nasa_anom_temp.merge(epica_temp_anom, on = 'Year', how = 'outer') 
temp_anom_merged

Unnamed: 0,Year,Temperature_x,Temperature_y
0,1880,-0.17,
1,1881,-0.09,
2,1882,-0.11,
3,1883,-0.17,
4,1884,-0.28,
...,...,...,...
5923,-795458,,-8.73
5924,-796493,,-8.54
5925,-797551,,-8.88
5926,-798639,,-8.92


In [279]:
temp_anom_merged['merged_temp'] = temp_anom_merged['Temperature_x'].fillna(temp_anom_merged['Temperature_y'])
temp_anom_merged

Unnamed: 0,Year,Temperature_x,Temperature_y,merged_temp
0,1880,-0.17,,-0.17
1,1881,-0.09,,-0.09
2,1882,-0.11,,-0.11
3,1883,-0.17,,-0.17
4,1884,-0.28,,-0.28
...,...,...,...,...
5923,-795458,,-8.73,-8.73
5924,-796493,,-8.54,-8.54
5925,-797551,,-8.88,-8.88
5926,-798639,,-8.92,-8.92


In [280]:
temp_anom_merged.tail()

Unnamed: 0,Year,Temperature_x,Temperature_y,merged_temp
5923,-795458,,-8.73,-8.73
5924,-796493,,-8.54,-8.54
5925,-797551,,-8.88,-8.88
5926,-798639,,-8.92,-8.92
5927,-799712,,-8.82,-8.82


### 1.3. Data for CH4 analysis

https://www.methanelevels.org/


NAME OF DATA SET: EPICA Dome C Ice Core 800KYr Methane Data

https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=noaa-icecore-6093

DESCRIPTION:   
Methane record from the EPICA (European Project for Ice Coring in Antarctica) 
Dome C ice core covering 0 to 800 kyr BP.  The air from polar ice-core samples 
of about 40g (Bern) and 50g (LGGE) is extracted with a melt-refreezing method 
under vacuum, and the extracted gas is then analysed for CH4 by gas chromatography. 
Two standard gases (408 p.p.b.v. CH4, 1,050 p.p.b.v. CH4) were used at Bern 
and one (499 p.p.b.v. CH4) at LGGE, to calibrate the gas chromatographs. 
The mean CH4 analytical uncertainty (1s) is 10 p.p.b.v.

In [281]:
methan_epica=pd.read_csv('datasets/ch4_800kyr.csv',delimiter=r"\s+",  header=1)


Column 1: EDC1999 depth (m)
Column 2: Gas Age (EDC3 gas age, years before 1950 AD)
Column 3: CH4 mean (ppbv)
Column 4: 1-sigma uncertainty (ppbv)
Column 5: Laboratory (b=Bern, g=Grenoble)

In [282]:
methan_epica

Unnamed: 0,Depth,Gas_Age,CH4_mean,1s,Lab.
0,99.34,13,907,10.0,b
1,102.45,126,784,10.0,g
2,102.58,130,762,10.0,b
3,103.34,151,710,10.0,g
4,104.33,184,727,10.0,g
...,...,...,...,...,...
2098,3188.08,794938,428,10.0,g
2099,3188.95,796320,418,10.0,b
2100,3189.43,797277,396,10.0,g
2101,3190.03,798417,458,10.0,g


In [283]:
methan_epica['Year']=1950 - (methan_epica['Gas_Age'].astype(int))


In [284]:
methan_epica

Unnamed: 0,Depth,Gas_Age,CH4_mean,1s,Lab.,Year
0,99.34,13,907,10.0,b,1937
1,102.45,126,784,10.0,g,1824
2,102.58,130,762,10.0,b,1820
3,103.34,151,710,10.0,g,1799
4,104.33,184,727,10.0,g,1766
...,...,...,...,...,...,...
2098,3188.08,794938,428,10.0,g,-792988
2099,3188.95,796320,418,10.0,b,-794370
2100,3189.43,797277,396,10.0,g,-795327
2101,3190.03,798417,458,10.0,g,-796467


NAME OF DATA SET:  Law Dome Ice Core 2000-Year CO2, CH4, and N2O Data
The time period coverage is from 1949 to -56 in calendar years before present (BP). See metadata information for parameter and study location details. Please cite this study when using the data.
https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=noaa-icecore-9959

	
MacFarling Meure, C.; Etheridge, D.M.; Trudinger, C.; Steele, L.P.; Langenfelds, R.L.; van Ommen, T.D.; Smith, A.M.; Elkins, J. (2010-07-16): NOAA/WDS Paleoclimatology - Law Dome Ice Core 2000-Year CO2, CH4, and N2O Data. NOAA National Centers for Environmental Information. https://doi.org/10.25921/g6kd-k189. Accessed [17.01.2024].

In [285]:
methan_law = pd.read_excel('datasets/law2006.xls', sheet_name='CH4 by age')

In [286]:
methan_law=methan_law.drop(['Sample Type', 'Unnamed: 4', 'publication status'], axis=1)

In [287]:
methan_law.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 331 entries, 0 to 330
Data columns (total 3 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   CH4 gas age years AD      331 non-null    float64
 1   CH4 (ppb)                 331 non-null    float64
 2   CH4 (ppb) - NOAA04 Scale  331 non-null    float64
dtypes: float64(3)
memory usage: 7.9 KB


In [288]:
# Note: 'CO4' in these names is a typo for CH4; the names are kept because later cells use them
methan_law.columns = ['Year', 'CO4(ppb)', 'CO4(ppb)noaa04']

In [289]:
methan_law

Unnamed: 0,Year,CO4(ppb),CO4(ppb)noaa04
0,2004.461749,1708.842301,1729.690177
1,2004.054645,1707.299996,1728.129056
2,2003.386301,1709.650905,1730.508646
3,2003.200000,1707.224783,1728.052926
4,2003.161644,1706.182243,1726.997666
...,...,...,...
326,136.983072,638.815040,646.608583
327,105.481929,632.028000,639.738742
328,56.959144,628.443720,636.110733
329,30.524058,633.875360,641.608639


In [290]:
methan_law = methan_law.astype({'Year': 'int'})

In [291]:
methan_law

Unnamed: 0,Year,CO4(ppb),CO4(ppb)noaa04
0,2004,1708.842301,1729.690177
1,2004,1707.299996,1728.129056
2,2003,1709.650905,1730.508646
3,2003,1707.224783,1728.052926
4,2003,1706.182243,1726.997666
...,...,...,...
326,136,638.815040,646.608583
327,105,632.028000,639.738742
328,56,628.443720,636.110733
329,30,633.875360,641.608639


In [292]:
methan_law.value_counts('Year')

Year
1997    15
1995    13
1990    12
1994     7
1993     6
        ..
1667     1
1682     1
1690     1
1695     1
1856     1
Name: count, Length: 198, dtype: int64

The rows that fall in the same year need to be combined somehow, before converting the ages to years.
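One way to combine several samples from the same year is to average them per integer year. The sketch below shows the idea on made-up values; the column names `year_frac` and `ch4_ppb` are hypothetical.

```python
import pandas as pd

# Toy fractional sample dates (illustrative values only).
samples = pd.DataFrame({
    "year_frac": [2003.39, 2003.20, 2003.16, 1997.5, 136.98],
    "ch4_ppb": [1709.0, 1707.0, 1706.0, 1690.0, 638.0],
})

# Truncate to integer years, then average all samples within each year.
samples["Year"] = samples["year_frac"].astype(int)
per_year = samples.groupby("Year", as_index=False)["ch4_ppb"].mean()
print(per_year)
```

After grouping, each year appears exactly once, which avoids duplicated rows when merging on `Year`.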

In [293]:
methan_law_grouped = methan_law.groupby('Year').mean()
methan_law_grouped.reset_index(inplace=True)

methan_law_grouped

Unnamed: 0,Year,CO4(ppb),CO4(ppb)noaa04
0,14,647.640200,655.541410
1,30,633.875360,641.608639
2,56,628.443720,636.110733
3,105,632.028000,639.738742
4,136,638.815040,646.608583
...,...,...,...
193,2000,1707.355610,1728.185349
194,2001,1708.885638,1729.734043
195,2002,1709.131351,1729.982754
196,2003,1707.431474,1728.262138


In [294]:
methan_law_grouped.value_counts('Year')

Year
14      1
1912    1
1915    1
1917    1
1919    1
       ..
1682    1
1690    1
1695    1
1723    1
2004    1
Name: count, Length: 198, dtype: int64

In [295]:
methan_merged_inner = methan_epica.merge(methan_law_grouped, on = 'Year', how = 'inner') 
methan_merged_inner


The two datasets differ:
they do not agree in the years where both have data.
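To quantify how far two records disagree in the years they share, an inner merge followed by a difference column is one option. The sketch below uses toy frames with invented values and hypothetical column names.

```python
import pandas as pd

# Toy stand-ins for the two CH4 records (values are illustrative only).
law = pd.DataFrame({"Year": [1937, 1824, 1820], "ch4": [1050.0, 790.0, 770.0]})
epica = pd.DataFrame({"Year": [1937, 1824, 1799], "ch4": [907.0, 784.0, 710.0]})

# Keep only the years present in both records and compare the values.
overlap = law.merge(epica, on="Year", how="inner", suffixes=("_law", "_epica"))
overlap["difference"] = overlap["ch4_law"] - overlap["ch4_epica"]

print(overlap)
print("mean absolute difference:", overlap["difference"].abs().mean())
```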

In [296]:
methan_merged = methan_law_grouped.merge(methan_epica, on = 'Year', how = 'outer') 
methan_merged

Unnamed: 0,Year,CO4(ppb),CO4(ppb)noaa04,Depth,Gas_Age,CH4_mean,1s,Lab.
0,14,647.64020,655.541410,,,,,
1,30,633.87536,641.608639,,,,,
2,56,628.44372,636.110733,,,,,
3,105,632.02800,639.738742,,,,,
4,136,638.81504,646.608583,,,,,
...,...,...,...,...,...,...,...,...
2291,-792988,,,3188.08,794938.0,428.0,10.0,g
2292,-794370,,,3188.95,796320.0,418.0,10.0,b
2293,-795327,,,3189.43,797277.0,396.0,10.0,g
2294,-796467,,,3190.03,798417.0,458.0,10.0,g


In [297]:
methan_merged['methan_merged'] = methan_merged['CO4(ppb)'].fillna(methan_merged['CH4_mean'])
methan_merged

Unnamed: 0,Year,CO4(ppb),CO4(ppb)noaa04,Depth,Gas_Age,CH4_mean,1s,Lab.,methan_merged
0,14,647.64020,655.541410,,,,,,647.64020
1,30,633.87536,641.608639,,,,,,633.87536
2,56,628.44372,636.110733,,,,,,628.44372
3,105,632.02800,639.738742,,,,,,632.02800
4,136,638.81504,646.608583,,,,,,638.81504
...,...,...,...,...,...,...,...,...,...
2291,-792988,,,3188.08,794938.0,428.0,10.0,g,428.00000
2292,-794370,,,3188.95,796320.0,418.0,10.0,b,418.00000
2293,-795327,,,3189.43,797277.0,396.0,10.0,g,396.00000
2294,-796467,,,3190.03,798417.0,458.0,10.0,g,458.00000


In [298]:
methan_merged = methan_merged[['Year', 'methan_merged']].copy()


In [299]:
methan_merged = methan_merged.sort_values(by='Year', ascending=False)



In [301]:
methan_merged

Unnamed: 0,Year,methan_merged
197,2004,1708.071148
196,2003,1707.431474
195,2002,1709.131351
194,2001,1708.885638
193,2000,1707.355610
...,...,...
2291,-792988,428.000000
2292,-794370,418.000000
2293,-795327,396.000000
2294,-796467,458.000000


### 1.4. Data for Irish context

In [302]:

# Read Irish meteorological data from the CSV file
df_irish_met_data = pd.read_csv('datasets/LongTermTemperatures_1900-2022_annual.csv')



In [303]:
df_irish_met_data

Unnamed: 0,year,Annual
0,2022,10.9
1,2021,10.5
2,2020,10.4
3,2019,10.5
4,2018,10.3
...,...,...
118,1904,9.1
119,1903,9.1
120,1902,9.2
121,1901,9.1
