# Calculate GHG concentrations

IPCC AR6 methodology:

The following description comes from the Excel sheet of long-lived greenhouse gas concentrations, v9. See https://github.com/chrisroadmap/ar6/blob/main/data_input/observations/LLGHG_history_AR6_v9_for_archive.xlsx

All values are mid-year mean.

**While we wait for the updates from Xin Lan and Jens Muhle, we do 2023 estimates using an extrapolation. This is for the purposes of the forcing time series only.**

NOAA only:
- CO2
- CH3CCl3  (Chris override AR6)
- CCl4 (Chris override AR6)
- CFC-11 (Chris override AR6)
- CFC-12 (Chris override AR6)

AGAGE only:
- CFC-114
- CFC-115
- CFC-13
- CF4
- C2F6
- C3F8
- c-C4F8
- NF3
- SO2F2
- HFC-23
- HFC-236fa
- HFC-245fa
- HFC-43-10mee
- CHCl3
- c-C4F8    xxx  (Note that there is a ~7% calibration difference for c-C4F8 between AGAGE (Muhle et al., 2019) and Droste et al. (2020), but it cannot be resolved using a simple scaling factor, so only AGAGE results are included here.)

Merged NOAA and AGAGE:
- CH4          
- N2O
- HFC-134a
- HFC-32
- HFC-125
- HFC-143a
- HFC-152a
- HFC-227ea  
- HFC-365mfc
- SF6
- CFC-113
- HCFC-22
- HCFC-141b
- HCFC-142b
- CH2Cl2        xxx
- CH3Cl         xxx !!!
- CH3Br
- Halon-1211
- Halon-1301
- Halon-2402
- HCFC-133a     xxx !!! (AGAGE HCFC-133a values were adjusted down 7% to account for a ~14% calibration difference between AGAGE and Laube et al. (2014), in an attempt to express HCFC-133a as the average of the AGAGE and UEA estimates.)


xxx : data is not available from the aggregated 
!!! : data is not available from the disaggregated
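
The 7% adjustment quoted for HCFC-133a can be checked with a line of arithmetic. The sketch below assumes the UEA scale reads 14% lower than the AGAGE scale (the direction is inferred from "adjusted down"); the starting value is arbitrary.

```python
# Arbitrary value on the AGAGE scale (illustrative only)
agage = 1.0

# Assumption: the UEA calibration gives values ~14% lower than AGAGE
uea = agage * (1 - 0.14)

# Average of the two scales, and the implied downward adjustment to AGAGE
average = 0.5 * (agage + uea)
adjustment = 1 - average / agage

print(f"adjustment relative to AGAGE: {adjustment:.2%}")
```

Under that assumption, halving the 14% difference gives the 7% downward adjustment.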


Laube et al. (2014) and WMO (2018) Southern Hemisphere, extrapolation needed to present day (PD)
- CFC-112
- CFC-112a
- CFC-113a

Laube et al. (2014) and WMO (2018), extrapolation needed to PD
- CFC-114a

Schoenberger et al. (2015), extrapolation needed to PD
- HCFC-31

Simmonds et al. (2017), extrapolation needed to PD
- HCFC-124

Droste et al. (2020). CMIP6 scaled to Droste et al. (2020) to account for calibration change.
- n-C4F10
- n-C5F12
- n-C6F14
- i-C6F14
- C7F16	

Vollmer et al. (2018), extrapolation needed to PD
- C8F18

https://gml.noaa.gov/aftp/data/ is usually a good place to look

NOAA (accessed 2024-02-26):
- https://gml.noaa.gov/webdata/ccgg/trends/co2/co2_annmean_gl.txt  [NOT USED: WE USE XIN LAN'S DATA DIRECT]
- https://gml.noaa.gov/webdata/ccgg/trends/ch4/ch4_annmean_gl.txt  [NOT USED: WE USE XIN LAN'S DATA DIRECT]
- https://gml.noaa.gov/webdata/ccgg/trends/n2o/n2o_annmean_gl.txt  [NOT USED: WE USE XIN LAN'S DATA DIRECT]
- https://gml.noaa.gov/webdata/ccgg/trends/sf6/sf6_annmean_gl.txt
- https://gml.noaa.gov/aftp/data/hats/Total_Cl_Br/2023%20update%20total%20Cl%20Br%20&%20F.xls  (converted to CSV)

AGAGE (accessed 2023-03-09; has not been updated as of 2024-02-26):
- https://agage2.eas.gatech.edu/data_archive/global_mean/global_mean_ms.txt
- https://agage2.eas.gatech.edu/data_archive/global_mean/global_mean_md.txt

CO2, CH4 and N2O are updated using Brad Hall's extended IPCC methodology. **In this pre-release they are extrapolated.**

In [None]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as pl
from scipy.optimize import curve_fit

In [None]:
pd.set_option('display.max_columns', 500)

In [None]:
# df_co2 = pd.read_csv(
#     '../data/ghg_concentrations/noaa_gml/co2_annmean_gl.txt', 
#     delim_whitespace=True,
#     comment='#', 
#     names=['year', 'mean', 'unc'],
#     index_col=0
# )

In [None]:
# df_ch4_noaa = pd.read_csv(
#     '../data/ghg_concentrations/noaa_gml/ch4_annmean_gl.txt', 
#     delim_whitespace=True,
#     comment='#', 
#     names=['year', 'mean', 'unc'],
#     index_col=0
# )

In [None]:
# df_n2o_noaa = pd.read_csv(
#     '../data/ghg_concentrations/noaa_gml/n2o_annmean_gl.txt', 
#     delim_whitespace=True,
#     comment='#', 
#     names=['year', 'mean', 'unc'],
#     index_col=0
# )

In [None]:
df_sf6_noaa = pd.read_csv(
    '../data/ghg_concentrations/noaa_gml/sf6_annmean_gl.txt', 
    sep=r'\s+',
    comment='#', 
    names=['year', 'mean', 'unc'],
    index_col=0
)

In [None]:
df_noaa = pd.read_csv(
    '../data/ghg_concentrations/noaa_gml/noaa_2023_global_mean_mixing_ratios.csv'
)

In [None]:
df_noaa

In [None]:
df_noaa.loc[162:167,'CFC-12']

In [None]:
# Treat "ND" entries (presumably "no data") as missing values
df_noaa[df_noaa == "ND"] = np.nan
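
If a column contains the string "ND", pandas may read the whole column as text, and replacing "ND" with NaN does not by itself convert it to numbers. The following is a minimal sketch on made-up data (the frame `raw` and its values are hypothetical) showing one way to coerce such columns to floats with `pd.to_numeric`; it is not part of the original workflow.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row extract in which one entry is flagged "ND"
raw = pd.DataFrame({"date": [2000.0, 2000.5], "CFC-12": ["ND", "540.1"]})

# errors="coerce" turns anything non-numeric (here "ND") into NaN
clean = raw.apply(pd.to_numeric, errors="coerce")

print(clean.dtypes)
print(clean)
```

Whether this step is needed depends on how the real CSV is parsed, so treat it as an optional safeguard.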

In [None]:
df_noaa = df_noaa.rolling(6, center=True).mean()
df_noaa['YYYY'] = df_noaa.date-0.5
df_noaa.drop(df_noaa.tail(2).index,inplace=True)
df_noaa.drop(df_noaa.head(3).index,inplace=True)
df_noaa.set_index('YYYY', inplace=True)
df_noaa.drop(columns=['date'], inplace=True)
df_noaa.rename(columns={'H2402': 'H-2402'}, inplace=True)
df_noaa = df_noaa[df_noaa.index % 1 == 0]
df_noaa

In [None]:
# pd.read_csv(
#     '../data/ghg_concentrations/noaa_gml/ch2cl2_GCMS_flask.txt', 
#     delim_whitespace=True,
#     skiprows=1
# )

In [None]:
# pd.read_csv(
#     '../data/ghg_concentrations/noaa_gml/OCS__GCMS_flask.txt', 
#     delim_whitespace=True,
#     skiprows=1
# )

In [None]:
# pd.read_csv(
#     '../data/ghg_concentrations/noaa_gml/pce_GCMS_flask.txt', 
#     delim_whitespace=True,
#     skiprows=1
# )

In [None]:
df_agage_ms = pd.read_csv(
    '../data/ghg_concentrations/agage/global_mean_ms.txt', 
    sep=r'\s+',
    skiprows=14,
    index_col=0
)

In [None]:
df_agage_ms = df_agage_ms.rolling(12, center=True).mean().drop([col for col in df_agage_ms.columns if '---' in col],axis=1)
df_agage_ms.drop(columns='MM', inplace=True)
df_agage_ms.set_index('YYYY', inplace=True)
df_agage_ms = df_agage_ms[df_agage_ms.index % 1 == 0]
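
The cell above averages every column, including `YYYY`, over a centred 12-month window and then keeps only rows where the averaged year is a whole number. The sketch below reproduces that idea on synthetic data, assuming `YYYY` holds the integer calendar year on every monthly row (the frame `toy` and the column `gas` are made up for illustration).

```python
import numpy as np
import pandas as pd

# Synthetic monthly record for 2001-2003; the "gas" values are made up
toy = pd.DataFrame({
    "YYYY": np.repeat([2001, 2002, 2003], 12),
    "MM": np.tile(np.arange(1, 13), 3),
    "gas": np.arange(36, dtype=float),
})

# Centred 12-month rolling mean of every column, YYYY included
smoothed = toy.rolling(12, center=True).mean()

# The averaged YYYY is a whole number only when all 12 months lie in one
# calendar year, so this filter keeps exactly one row per complete year
annual = smoothed[smoothed["YYYY"] % 1 == 0].set_index("YYYY")["gas"]

print(annual)
```

Windows that straddle two years give a fractional mean year and are dropped, as are the incomplete windows at either end of the record.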

In [None]:
df_agage_ms[df_agage_ms.index % 1 == 0]

In [None]:
df_conc = pd.read_csv(
    '../data/ghg_concentrations/ar6_updated/ipcc_ar6_wg1.csv',
    index_col=0
)

In [None]:
df_conc.loc[2020, :] = np.nan
df_conc.loc[2021, :] = np.nan
df_conc.loc[2022, :] = np.nan
df_conc.loc[2023, :] = np.nan

In [None]:
df_co2_ch4_n2o_update = pd.read_csv(
    '../data/ghg_concentrations/ar6_updated/co2_ch4_n2o_2019-2022.csv',
    index_col=0
)
df_co2_ch4_n2o_update

In [None]:
df_conc

In [None]:
df_conc.loc[2020:2022, 'CO2':'N2O'] = df_co2_ch4_n2o_update.loc[2020:2022, 'CO2':'N2O']

In [None]:
df_conc

In [None]:
df_agage_md = pd.read_csv(
    '../data/ghg_concentrations/agage/global_mean_md.txt', 
    sep=r'\s+',
    skiprows=14,
    index_col=0
)

In [None]:
df_agage_md = df_agage_md.rolling(12, min_periods=12, center=True, step=12).mean().drop([col for col in df_agage_md.columns if '---' in col],axis=1)
df_agage_md.drop(columns='MM', inplace=True)
df_agage_md.set_index('YYYY', inplace=True)
df_agage_md.drop(index=np.nan, inplace=True)

In [None]:
df_agage_md

In [None]:
sf6_mean_offset = -(df_sf6_noaa.loc[2019, 'mean'] - df_agage_ms.loc[2019, 'SF6']).mean()
df_conc.loc[2004:2019, 'SF6'] = 0.5 * (df_sf6_noaa.loc[2004:2019, 'mean'] + df_agage_ms.loc[2004:2019, 'SF6'])
df_conc.loc[2020:2022, 'SF6'] = df_sf6_noaa.loc[2020:2022, 'mean'] + sf6_mean_offset
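
The merge used here can be illustrated on toy numbers: average the two networks where both have data, then continue with NOAA shifted by the NOAA-to-AGAGE offset in the last overlap year. The series and values below are invented for the sketch and are not real SF6 data.

```python
import pandas as pd

# Made-up annual means; AGAGE stops in 2019, NOAA continues to 2022
years = range(2017, 2023)
noaa = pd.Series([9.0, 9.3, 9.6, 9.9, 10.2, 10.5], index=years)
agage = pd.Series([9.1, 9.4, 9.7], index=range(2017, 2020))

last_overlap = 2019
# Offset between the networks in the last year both report
offset = agage.loc[last_overlap] - noaa.loc[last_overlap]

merged = pd.Series(index=years, dtype=float)
# Overlap period: simple average of the two networks
merged.loc[2017:last_overlap] = 0.5 * (
    noaa.loc[2017:last_overlap] + agage.loc[2017:last_overlap]
)
# Afterwards: NOAA plus the fixed offset, mirroring the cell above
merged.loc[last_overlap + 1:2022] = noaa.loc[last_overlap + 1:2022] + offset

print(merged)
```

Note that adding the full offset puts the later years on the AGAGE scale, while the overlap years sit midway between the two networks.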

In [None]:
df_conc

In [None]:
species = [
    'HFC-134a', 'HFC-32', 'HFC-125', 'HFC-143a', 'HFC-152a', 'HFC-227ea', 'HFC-365mfc', 'HCFC-22', 
    'HCFC-141b', 'HCFC-142b', 'CH3CCl3', 'CH3Br', 'H-1211', 'H-1301', 'H-2402'
]
names = {specie: specie for specie in species}
first = {specie: 2004 for specie in species}
names['H-1211'] = 'Halon-1211'
names['H-1301'] = 'Halon-1301'
names['H-2402'] = 'Halon-2402'
first['HFC-365mfc'] = 2008
first['HFC-227ea'] = 2008

In [None]:
for specie in species:
    mean_offset = -(df_noaa.loc[2020, specie] - df_agage_ms.loc[2020, specie]).mean()
    df_conc.loc[first[specie]:2020, names[specie]] = 0.5 * (df_noaa.loc[first[specie]:2020, specie] + df_agage_ms.loc[first[specie]:2020, specie])
    df_conc.loc[2021, names[specie]] = df_noaa.loc[2021, specie] + mean_offset

In [None]:
df_conc

In [None]:
species = [
    'CFC-11', 'CFC-12', 'CCl4'
]
names = {specie: specie for specie in species}
first = {specie: 1992 for specie in species}
last = {specie: 2019 for specie in species}

In [None]:
df_conc.loc[2020:2022, 'CH3CCl3'] = df_noaa.loc[2020:2022, 'CH3CCl3']

In [None]:
for specie in species:
    mean_offset = -(df_noaa.loc[last[specie], specie] - df_agage_md.loc[last[specie], specie]).mean()
    df_conc.loc[first[specie]:last[specie], names[specie]] = 0.5 * (df_noaa.loc[first[specie]:last[specie], specie] + df_agage_md.loc[first[specie]:last[specie], specie])
    df_conc.loc[last[specie]+1:2022, names[specie]] = df_noaa.loc[last[specie]+1:, specie] + mean_offset

In [None]:
df_conc.loc[2007:2020, 'CF4'] = df_agage_ms.loc[2007:2020, 'PFC-14']

In [None]:
df_conc.loc[2004:2020, 'C2F6'] = df_agage_ms.loc[2004:2020, 'PFC-116']
df_conc.loc[2004:2020, 'C3F8'] = df_agage_ms.loc[2004:2020, 'PFC-218']

In [None]:
df_conc.loc[2008:2020, 'HFC-23'] = df_agage_ms.loc[2008:2020, 'HFC-23']

In [None]:
df_conc.loc[2007:2020, 'HFC-236fa'] = df_agage_ms.loc[2007:2020, 'HFC-236fa']

In [None]:
df_conc.loc[2007:2020, 'HFC-245fa'] = df_agage_ms.loc[2007:2020, 'HFC-245fa']

In [None]:
df_agage_ms.loc[2008:2020, 'HFC4310mee']

In [None]:
df_conc.loc[2011:2020, 'HFC-43-10mee'] = df_agage_ms.loc[2011:2020, 'HFC4310mee']

In [None]:
df_agage_ms.loc[:, 'SO2F2']

In [None]:
df_conc.loc[2004:2020, 'CFC-13'] = df_agage_ms.loc[2004:2020, 'CFC-13']
df_conc.loc[2004:2020, 'CFC-114'] = df_agage_ms.loc[2004:2020, 'CFC-114']
df_conc.loc[2004:2020, 'CFC-115'] = df_agage_ms.loc[2004:2020, 'CFC-115']

In [None]:
df_conc.loc[2016:2020, 'NF3'] = df_agage_ms.loc[2016:2020, 'NF3']

In [None]:
df_conc.loc[2005:2020, 'SO2F2'] = df_agage_ms.loc[2005:2020, 'SO2F2']

In [None]:
df_conc.loc[2004:2020, 'CH3Cl'] = df_agage_ms.loc[2004:2020, 'CH3Cl']

In [None]:
df_conc.loc[2004:2020, 'CH2Cl2'] = df_agage_ms.loc[2004:2020, 'CH2Cl2']

In [None]:
df_conc.loc[2004:2020, 'CHCl3'] = df_agage_ms.loc[2004:2020, 'CHCl3']

In [None]:
df_conc.loc[1850:1989, 'i-C6F14'] = 0
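# Assign the result back: interpolating a .loc selection in place may act on a copy
df_conc.loc[1990:2015, 'i-C6F14'] = df_conc.loc[1990:2015, 'i-C6F14'].interpolate()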

In [None]:
df_conc.loc[1850:1977, 'CFC-112'] = 0
df_conc.loc[1850:1977, 'CFC-112a'] = 0
df_conc.loc[1850:1977, 'CFC-113a'] = 0
df_conc.loc[1850:1977, 'CFC-114a'] = 0
df_conc.loc[1850:1979, 'HCFC-133a'] = 0
df_conc.loc[1850:1999, 'HCFC-31'] = 0
df_conc.loc[1850:2003, 'HCFC-124'] = 0

In [None]:
# Function to curve fit to the data
def linear(x, c, d):
    return c * x + d

# Initial parameter guess, just to kick off the optimization
guess = (1, 0)

# Place to store function parameters for each column
col_params = {}

# Curve fit each column
for col in df_conc.columns:
    # Create copy of data to remove NaNs for curve fitting
    fit_df = df_conc[col].dropna()

    # Get x & y
    x = fit_df.index.astype(float).values[-5:]
    y = fit_df.values[-5:]
    print(col, x, y)
    # Curve fit column and get curve parameters
    params = curve_fit(linear, x, y, guess)
    # Store optimized parameters
    col_params[col] = params[0]

# Extrapolate each column
for col in df_conc.columns:
    # Get the index labels (years) where the column is still NaN
    missing = df_conc.index[df_conc[col].isna()]
    print(col, missing.values)
    # Fill those years from the fitted line; .loc avoids chained assignment
    df_conc.loc[missing, col] = linear(missing.astype(float).values, *col_params[col])
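
For a straight line, the same least-squares fit can be obtained with `np.polyfit` instead of `curve_fit`. The sketch below applies it to a made-up series (the values in `toy` are illustrative, not observations), fitting the last five known points and filling the missing year.

```python
import numpy as np
import pandas as pd

# Made-up annual series with the final year missing
toy = pd.Series(
    [400.0, 402.0, 404.0, 406.0, 408.0, 410.0, np.nan],
    index=[2017, 2018, 2019, 2020, 2021, 2022, 2023],
)

# Fit a straight line through the last five non-missing points
known = toy.dropna()
x = known.index.to_numpy(dtype=float)[-5:]
y = known.to_numpy()[-5:]
slope, intercept = np.polyfit(x, y, 1)

# Fill every missing year from the fitted line
missing = toy.index[toy.isna()]
toy.loc[missing] = slope * missing.to_numpy(dtype=float) + intercept

print(toy)
```

This is a swapped-in alternative for illustration; the notebook itself uses `scipy.optimize.curve_fit` with a linear function.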

In [None]:
df_conc

In [None]:
os.makedirs('../output', exist_ok = True)
df_conc.to_csv('../output/ghg_concentrations_1750-2023.csv')

## Aggregated categories

In [None]:
gases_hfcs = [
    'HFC-134a',
    'HFC-23', 
    'HFC-32', 
    'HFC-125',
    'HFC-143a', 
    'HFC-152a', 
    'HFC-227ea', 
    'HFC-236fa', 
    'HFC-245fa', 
    'HFC-365mfc',
    'HFC-43-10mee',
]
gases_montreal = [
    'CFC-12',
    'CFC-11',
    'CFC-113',
    'CFC-114',
    'CFC-115',
    'CFC-13',
    'HCFC-22',
    'HCFC-141b',
    'HCFC-142b',
    'CH3CCl3',
    'CCl4',  # yes
    'CH3Cl',  # no
    'CH3Br',  # yes
    'CH2Cl2',  # no!
    'CHCl3',  # no
    'Halon-1211',
    'Halon-1301',
    'Halon-2402',
    'CFC-112',
    'CFC-112a',
    'CFC-113a',
    'CFC-114a',
    'HCFC-133a',
    'HCFC-31',
    'HCFC-124'
]
gases_pfc = [
    'CF4',
    'C2F6',
    'C3F8',
    'c-C4F8',
    'n-C4F10',
    'n-C5F12',
    'n-C6F14',
    'i-C6F14',
    'C7F16',
    'C8F18',
]

In [None]:
# source: Hodnebrog et al 2020 https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019RG000691
radeff = {
    'HFC-125':      0.23378,
    'HFC-134a':     0.16714,
    'HFC-143a':     0.168,
    'HFC-152a':     0.10174,
    'HFC-227ea':    0.27325,
    'HFC-23':       0.19111,
    'HFC-236fa':    0.25069,
    'HFC-245fa':    0.24498,
    'HFC-32':       0.11144,
    'HFC-365mfc':   0.22813,
    'HFC-43-10mee': 0.35731,
    'NF3':          0.20448,
    'C2F6':         0.26105,
    'C3F8':         0.26999,
    'n-C4F10':      0.36874,
    'n-C5F12':      0.4076,
    'n-C6F14':      0.44888,
    'i-C6F14':      0.44888,
    'C7F16':        0.50312,
    'C8F18':        0.55787,
    'CF4':          0.09859,
    'c-C4F8':       0.31392,
    'SF6':          0.56657,
    'SO2F2':        0.21074,
    'CCl4':         0.16616,
    'CFC-11':       0.25941,
    'CFC-112':      0.28192,
    'CFC-112a':     0.24564,
    'CFC-113':      0.30142,
    'CFC-113a':     0.24094, 
    'CFC-114':      0.31433,
    'CFC-114a':     0.29747,
    'CFC-115':      0.24625,
    'CFC-12':       0.31998,
    'CFC-13':       0.27752,
    'CH2Cl2':       0.02882,
    'CH3Br':        0.00432,
    'CH3CCl3':      0.06454,
    'CH3Cl':        0.00466,
    'CHCl3':        0.07357,
    'HCFC-124':     0.20721,
    'HCFC-133a':    0.14995,
    'HCFC-141b':    0.16065,
    'HCFC-142b':    0.19329,
    'HCFC-22':      0.21385,
    'HCFC-31':      0.068,
    'Halon-1202':   0,       # not in dataset
    'Halon-1211':   0.30014,
    'Halon-1301':   0.29943,
    'Halon-2402':   0.31169,
    'CO2':          0,       # different relationship
    'CH4':          0,       # different relationship
    'N2O':          0        # different relationship
}

In [None]:
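# Equivalent concentrations in 1750: each gas is weighted by its radiative
# efficiency relative to a reference gas. NOTE: despite its name, the PFC total
# is referenced to CF4 (a CF4-equivalent); the name is kept so later cells run.
pfc_hfc134a_eq_1750 = 0
for gas in gases_pfc:
    pfc_hfc134a_eq_1750 = pfc_hfc134a_eq_1750 + (df_conc.loc[1750, gas] * radeff[gas] / radeff['CF4'])
hfc_hfc134a_eq_1750 = 0
for gas in gases_hfcs:
    hfc_hfc134a_eq_1750 = hfc_hfc134a_eq_1750 + (df_conc.loc[1750, gas] * radeff[gas] / radeff['HFC-134a'])
montreal_cfc12_eq_1750 = 0
for gas in gases_montreal:
    montreal_cfc12_eq_1750 = montreal_cfc12_eq_1750 + (df_conc.loc[1750, gas] * radeff[gas] / radeff['CFC-12'])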

In [None]:
pfc_hfc134a_eq_1750, hfc_hfc134a_eq_1750, montreal_cfc12_eq_1750

In [None]:
pfc_hfc134a_eq_1850 = 0
for gas in gases_pfc:
    pfc_hfc134a_eq_1850 = pfc_hfc134a_eq_1850 + (df_conc.loc[1850, gas] * radeff[gas] / radeff['CF4'])
hfc_hfc134a_eq_1850 = 0
for gas in gases_hfcs:
    hfc_hfc134a_eq_1850 = hfc_hfc134a_eq_1850 + (df_conc.loc[1850, gas] * radeff[gas] / radeff['HFC-134a'])
montreal_cfc12_eq_1850 = 0
for gas in gases_montreal:
    montreal_cfc12_eq_1850 = montreal_cfc12_eq_1850 + (df_conc.loc[1850, gas] * radeff[gas] / radeff['CFC-12'])

In [None]:
pfc_hfc134a_eq_1850, hfc_hfc134a_eq_1850, montreal_cfc12_eq_1850

In [None]:
pfc_hfc134a_eq_2019 = 0
for gas in gases_pfc:
    pfc_hfc134a_eq_2019 = pfc_hfc134a_eq_2019 + (df_conc.loc[2019, gas] * radeff[gas] / radeff['CF4'])
hfc_hfc134a_eq_2019 = 0
for gas in gases_hfcs:
    hfc_hfc134a_eq_2019 = hfc_hfc134a_eq_2019 + (df_conc.loc[2019, gas] * radeff[gas] / radeff['HFC-134a'])
montreal_cfc12_eq_2019 = 0
for gas in gases_montreal:
    montreal_cfc12_eq_2019 = montreal_cfc12_eq_2019 + (df_conc.loc[2019, gas] * radeff[gas] / radeff['CFC-12'])

In [None]:
pfc_hfc134a_eq_2019, hfc_hfc134a_eq_2019, montreal_cfc12_eq_2019

In [None]:
pfc_hfc134a_eq_2022 = 0
for gas in gases_pfc:
    pfc_hfc134a_eq_2022 = pfc_hfc134a_eq_2022 + (df_conc.loc[2022, gas] * radeff[gas] / radeff['CF4'])
hfc_hfc134a_eq_2022 = 0
for gas in gases_hfcs:
    hfc_hfc134a_eq_2022 = hfc_hfc134a_eq_2022 + (df_conc.loc[2022, gas] * radeff[gas] / radeff['HFC-134a'])
montreal_cfc12_eq_2022 = 0
for gas in gases_montreal:
    montreal_cfc12_eq_2022 = montreal_cfc12_eq_2022 + (df_conc.loc[2022, gas] * radeff[gas] / radeff['CFC-12'])

In [None]:
pfc_hfc134a_eq_2022, hfc_hfc134a_eq_2022, montreal_cfc12_eq_2022
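
The cells above repeat the same weighted sum for each year and category. As a sketch, it could be wrapped in a small helper; the function name `equivalent_concentration` and the toy concentrations are hypothetical, while the two radiative efficiencies are copied from the `radeff` table above.

```python
def equivalent_concentration(conc, gases, radeff, reference):
    """Sum concentrations weighted by radiative efficiency relative to `reference`."""
    return sum(conc[gas] * radeff[gas] / radeff[reference] for gas in gases)


# Radiative efficiencies taken from the table above
toy_radeff = {"CF4": 0.09859, "C2F6": 0.26105}

# Made-up concentrations for illustration only
toy_conc = {"CF4": 80.0, "C2F6": 4.0}

cf4_eq = equivalent_concentration(toy_conc, ["CF4", "C2F6"], toy_radeff, "CF4")
print(cf4_eq)
```

With the notebook's data, a call such as `equivalent_concentration(df_conc.loc[2022], gases_pfc, radeff, 'CF4')` would be expected to give the same number as the explicit loop, since a row of `df_conc` can be indexed by gas name.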