## Generating a Table of Global Fluxes

This notebook reads in the CESM1 historical run (for CMIP5),
the ensemble of 11 CESM2 historical runs (for CMIP6),
and also the four SSP CESM2 ensembles (for CMIP6).
A table is generated containing the values listed in [issue #6](https://github.com/marbl-ecosys/cesm2-marbl/issues/6):


> * Net primary production (PgC/yr) (`photoC_TOT_zint`)
> * Diatom primary production (%)   (`photoC_diat_zint`)
> * Sinking POC at 100 m (PgC/yr)   (`POC_FLUX_100m`)
> * Sinking CaCO3 at 100 m (PgC/yr) (`CaCO3_FLUX_100m`)
> * Rain ratio (CaCO3/POC) 100 m    (ratio of two above)
> * Nitrogen fixation (TgN/yr)      (`diaz_Nfix`)
> * Nitrogen deposition (TgN/yr)    (`NOx_FLUX` + `NHy_FLUX`)
> * Denitrification (TgN/yr)        (`DENITRIF`)
> * N cycle imbalance = deposition + fixation - denitrification (TgN/yr) # deposition = N* [see Kristen's notebook -- Biological Diagnostics?]
> * Air–sea CO2 flux (PgC/yr)       (`FG_CO2`)
> * Mean ocean oxygen (uM = umol/L = mmol/m^3)    (`O2`)
> * Volume where O2 <80 mmol/m^3 (10^15 m^3) # based on others
> * Volume where O2 <60 mmol/m^3 (10^15 m^3) # based on others
> * Volume where O2 <5 mmol/m^3 (10^15 m^3)  # based on others

Values will be computed one at a time, due to an issue with `xr.merge` and trying to read multiple variables at once.
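
As a worked example of one derived row in the list above, the rain ratio is the CaCO3 flux divided by the POC flux at 100 m. The sketch below uses rounded fluxes copied from the final table in this notebook; the dictionary layout and variable names are illustrative and not part of the notebook's code.

```python
# Rounded POC and CaCO3 fluxes at 100 m (PgC/yr), copied from the final table
# in this notebook. The dictionary and variable names are illustrative only.
fluxes = {
    "preindustrial (CESM1)": {"poc": 8.08, "caco3": 0.758},
    "preindustrial (CESM2)": {"poc": 7.00, "caco3": 0.769},
    "1990-2014 (CESM2)": {"poc": 7.07, "caco3": 0.769},
}

# Rain ratio = CaCO3 flux / POC flux, written with three decimal places
rain_ratio = {
    name: f"{vals['caco3'] / vals['poc']:0.3f}" for name, vals in fluxes.items()
}

for name, ratio in rain_ratio.items():
    print(f"{name}: rain ratio = {ratio}")
```

Because the inputs here are already rounded, ratios computed this way can differ in the last digit from values computed at full precision.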

### This notebook uses several Python packages

The watermark package shows the version number used to help others recreate this environment.

In [1]:
import os

import xarray as xr
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import matplotlib.colors as colors

from pint import UnitRegistry

# Add new units to UnitRegistry
units = UnitRegistry()
units.define('gram N = mol / 14 = gN')
units.define('gram C = mol / 12 = gC')
units.define('year = 365 day = yr')

%load_ext watermark
%watermark -d -iv -m -g -h

numpy      1.17.3
pandas     0.25.3
xarray     0.14.0
matplotlib 3.1.2
2020-01-29 

compiler   : GCC 7.3.0
system     : Linux
release    : 3.10.0-693.21.1.el7.x86_64
machine    : x86_64
processor  : x86_64
CPU cores  : 72
interpreter: 64bit
host name  : casper10
Git hash   : b458b3c9308d8ca665c306e8a4812d5d940c6263


### Define our experiments

In [2]:
# Should the annual averages include the marginal seas or just be open ocean?
# (USE TRUE FOR PAPER)
include_marg_seas = True
# include_marg_seas = False
xp_dir = 'with_marginal_seas' if include_marg_seas else 'no_marginal_seas'

# Process for updating intake-esm catalog
#       1. download all data from HPSS via get_ocn_cmip5_files.sh
#       2. rm /glade/u/home/mlevy/.intake_esm/collections/CESM1-CMIP5.nc
#       3. regenerate it via Anderson's legacy intake-esm
#       4. re-run build intake collections notebook
#       5. commit change to .csv.gz in /glade/work/mlevy/intake-esm-collection/csv.gz/
# NOTE: steps 2-5 can be done with notebooks/intake-esm-collection-defs/rebuild.sh
vars = [
        'photoC_TOT_zint_100m', 'photoC_diat_zint_100m',
        'photoC_TOT_zint', 'photoC_diat_zint',
        'POC_FLUX_100m', 'CaCO3_FLUX_100m',
        'diaz_Nfix', 'NOx_FLUX', 'NHy_FLUX', 'DENITRIF',
        'SedDenitrif', 'DON_RIV_FLUX', 'DONr_RIV_FLUX',
        'FG_CO2', 'O2' ,
        'O2_under_thres' # add a thres dimension corresponding to limits
       ]
# experiments is a list of experiments to compute values for
experiments = dict()
experiments['cesm1'] = ['cesm1_PI',
                        'cesm1_PI_esm',
                        'cesm1_hist',
                        'cesm1_hist_esm',
                        'cesm1_RCP85',
                       ]
               # CESM 2
experiments['cesm2'] = ['cesm2_PI',
                        'cesm2_hist',
                        'cesm2_SSP1-2.6',
                        'cesm2_SSP2-4.5',
                        'cesm2_SSP3-7.0',
                        'cesm2_SSP5-8.5',
                       ]

# experiment_longnames defines the table headers
experiment_longnames={'cesm1_PI' : 'preindustrial (CESM1)',
                      'cesm1_PI_esm' : 'preindustrial (CESM1, BPRP)',
                      'cesm1_hist' : '1981-2005 (CESM1)',
                      'cesm1_hist_esm' : '1990s (CESM1)',
                      'cesm1_RCP45' : 'RCP 4.5 2090s (CESM1)', # not available yet
                      'cesm1_RCP85' : 'RCP 8.5 2090s (CESM1)',
                      'cesm1_RCP85_esm' : 'RCP 8.5 2090s (CESM1)',
                      'cesm2_PI' : 'preindustrial (CESM2)',
                      'cesm2_hist' : '1990-2014 (CESM2)',
                      'cesm2_SSP1-2.6' : 'RCP26 2090s (CESM2)',
                      'cesm2_SSP2-4.5' : 'RCP45 2090s (CESM2)',
                      'cesm2_SSP3-7.0' : 'RCP70 2090s (CESM2)',
                      'cesm2_SSP5-8.5' : 'RCP85 2090s (CESM2)'}

# experiment_dict determines which module version & intake data each experiment uses
experiment_dict = {'cesm1_PI' : ('cesm1', 'piControl'),
                   'cesm1_PI_esm' : ('cesm1', 'esm-piControl'),
                   'cesm1_hist' : ('cesm1', 'historical'),
                   'cesm1_hist_esm' : ('cesm1', 'esm-hist'),
                   'cesm1_RCP85' : ('cesm1', 'RCP-8.5'),
                   'cesm1_RCP85_esm' : ('cesm1', 'esm-RCP-8.5'),
                   'cesm2_PI' : ('cesm2', 'piControl'),
                   'cesm2_hist' : ('cesm2', 'historical'),
                   'cesm2_SSP1-2.6' : ('cesm2', 'SSP1-2.6'),
                   'cesm2_SSP2-4.5' : ('cesm2', 'SSP2-4.5'),
                   'cesm2_SSP3-7.0' : ('cesm2', 'SSP3-7.0'),
                   'cesm2_SSP5-8.5' : ('cesm2', 'SSP5-8.5')
                  }

#### Read output from Make Timeseries.ipynb

Files were written by `xpersist` and are read in using `xr.open_dataset`

In [3]:
%%time

cache_dir = os.path.join(os.path.sep, 'glade', 'p', 'cgd', 'oce', 'projects', 'cesm2-marbl', 'xpersist_cache', xp_dir)

ann_avg = dict()
new_units = dict()
for variable in vars:
    ann_avg[variable] = dict()
    new_units[variable] = dict()
    for model_version in experiments:
        # Skip files that are not written out for CESM1
        if model_version == 'cesm1' and variable in ['SedDenitrif', 'DON_RIV_FLUX', 'DONr_RIV_FLUX']:
            continue
        for exp in experiments[model_version]:
            filename = f'{exp}_{variable}.nc'
            ann_avg[variable][exp] = xr.open_dataset(os.path.join(cache_dir, filename))
            new_units[variable][exp] = units[ann_avg[variable][exp][variable].attrs['units']]

CPU times: user 1.38 s, sys: 105 ms, total: 1.48 s
Wall time: 1.74 s


In [4]:
# Build time series that combines cesm1_hist to 2004 and cesm1_RCP85 after
if 'cesm1_hist' in experiments['cesm1'] and 'cesm1_RCP85' in experiments['cesm1']:
    experiment_longnames['cesm1_hist_RCP85'] = '1990 - 2014 (CESM1)'
    for var in ann_avg:
        if 'cesm1_hist' in ann_avg[var]:
            ann_avg[var]['cesm1_hist_RCP85'] = xr.concat([ann_avg[var]['cesm1_hist'].isel(time=slice(0,-1)),
                                                          ann_avg[var]['cesm1_RCP85']],
                                                         dim='time'
                                                        )
            new_units[var]['cesm1_hist_RCP85'] = new_units[var]['cesm1_hist'].copy()
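
The splice in the cell above can be sketched with plain lists. This is a minimal illustration, assuming the last historical year overlaps the first scenario year; the years and values are hypothetical, and the real data are xarray objects with a `time` dimension.

```python
# Hypothetical annual means as (year, value) pairs; real data would be
# xarray objects with a `time` dimension.
hist = [(2003, 1.70), (2004, 1.75), (2005, 1.80)]
rcp85 = [(2005, 1.82), (2006, 1.90), (2007, 1.95)]

# Drop the last historical entry (the analogue of isel(time=slice(0, -1)))
# and append the scenario run (the analogue of xr.concat(..., dim='time')).
combined = hist[:-1] + rcp85

years = [year for year, _ in combined]
# After the splice each year should appear exactly once, in increasing order
print(years)
```

Dropping the overlapping entry before concatenating keeps the combined time axis free of duplicates, which matters for the later time-slice selection.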

## Reduce Data Sets

The data have been reduced to annual means, but the netCDF files contain every year in the dataset.
For generating tables, we want to look at specific time periods.

####  Define the time periods we will average over

This could be done earlier in the notebook, but I think it makes sense to wait until we have annual / global means.

In [5]:
# NOTE: 2090-01-01 0:00:00 is the time stamp on the Dec 2089 monthly average
#       So slice("2090", "2100") would actually start with Dec 2089 rather than Jan 2090
#       Specifying a day mid-month gets us to Jan 2090 - Dec 2099 (the 2090s)
#       (this can be verified by looking at time bounds)
time_slices_SSP = slice("2090-01-15", "2100-01-15")

time_slices = dict()

# 200 year averages for CESM1 PI runs, per Lindsay et al 2014
# (He starts 30 years prior to branch point, so I will too)
time_slices['cesm1_PI'] = slice(120, 320) # cfunits can't handle years too far in the past; this is 121-07-01 - 320-07-01
time_slices['cesm1_PI_esm'] = slice(320, 520) # cfunits can't handle years too far in the past; this is 321-07-01 - 520-07-01
# For CESM2, going from 50 years prior to first historical branch point
#                  to 50 years after end of last historical member
# TODO: These dates should be computed automatically based on intake metadata!
time_slices['cesm2_PI'] = slice(550, 1070) # cfunits can't handle years too far in the past; this is 551-07-01 - 1070-07-01

# Historical runs all use slightly different time periods
# Note that the annual mean data actually runs from July 1st to June 30th;
#       these slices were defined to work with monthly data, but pick up the correct years as well
time_slices['cesm1_hist'] = slice("1981-01-15", "2006-01-15") # per Lindsay et al 2014
time_slices['cesm1_hist_esm'] = slice("1990-01-15", "2000-01-15") # per Moore et al 2013
time_slices['cesm1_hist_RCP85'] = slice("1990-01-15", "2015-01-15") # For our paper
time_slices['cesm2_hist'] = slice("1990-01-15", "2015-01-15") # For our paper

# RCP runs use 2090s
time_slices['cesm1_RCP45'] = time_slices_SSP
time_slices['cesm1_RCP85'] = time_slices_SSP
time_slices['cesm1_RCP85_esm'] = time_slices_SSP
time_slices['cesm2_SSP1-2.6'] = time_slices_SSP
time_slices['cesm2_SSP2-4.5'] = time_slices_SSP
time_slices['cesm2_SSP3-7.0'] = time_slices_SSP
time_slices['cesm2_SSP5-8.5'] = time_slices_SSP
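
The time-stamp convention described in the comments above can be checked with the standard library alone. This is a sketch under the stated assumption that each monthly mean is stamped with the first day of the following month; the `stamps` dictionary and `select` helper are hypothetical stand-ins for the `time` coordinate and `.sel(time=slice(...))`.

```python
from datetime import date

# Monthly means from Dec 2089 through Dec 2100, each stamped with the first
# day of the FOLLOWING month (the end of its averaging interval).
stamps = {}
year, month = 2089, 12
while (year, month) <= (2100, 12):
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    stamps[date(next_year, next_month, 1)] = (year, month)
    year, month = next_year, next_month


def select(start, stop):
    """Return the (year, month) pairs whose time stamp lies in [start, stop]."""
    return [ym for stamp, ym in sorted(stamps.items()) if start <= stamp <= stop]


# Bounds on calendar-year boundaries pick up Dec 2089 and miss Dec 2099
naive = select(date(2090, 1, 1), date(2099, 12, 31))
# Mid-month bounds select exactly Jan 2090 through Dec 2099
mid_month = select(date(2090, 1, 15), date(2100, 1, 15))

print(naive[0], naive[-1], len(naive))
print(mid_month[0], mid_month[-1], len(mid_month))
```

Both selections contain 120 months, but only the mid-month bounds line up with the calendar decade.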

In [6]:
# Verify time bounds for each experiment
for exp in ann_avg[vars[0]]:
    try:
        bounds = list(ann_avg[vars[0]][exp].sel(time=time_slices[exp]).time_bound.values[ind] for ind in [(0,0), (-1,-1)])
    except Exception:
        # PI slices are integer positions rather than dates, so fall back to isel
        bounds = list(ann_avg[vars[0]][exp].isel(time=time_slices[exp]).time_bound.values[ind] for ind in [(0,0), (-1,-1)])
    print(f'Experiment: {exp}\nRequested time bounds\n----\n{bounds}\n\n')

Experiment: cesm1_PI
Requested time bounds
----
[cftime.DatetimeNoLeap(121, 1, 1, 0, 0, 0, 0, 2, 1), cftime.DatetimeNoLeap(321, 1, 1, 0, 0, 0, 0, 6, 1)]


Experiment: cesm1_PI_esm
Requested time bounds
----
[cftime.DatetimeNoLeap(321, 1, 1, 0, 0, 0, 0, 6, 1), cftime.DatetimeNoLeap(521, 1, 1, 0, 0, 0, 0, 3, 1)]


Experiment: cesm1_hist
Requested time bounds
----
[cftime.DatetimeNoLeap(1981, 1, 1, 0, 0, 0, 0, 0, 1), cftime.DatetimeNoLeap(2006, 1, 1, 0, 0, 0, 0, 4, 1)]


Experiment: cesm1_hist_esm
Requested time bounds
----
[cftime.DatetimeNoLeap(1990, 1, 1, 0, 0, 0, 0, 2, 1), cftime.DatetimeNoLeap(2000, 1, 1, 0, 0, 0, 0, 5, 1)]


Experiment: cesm1_RCP85
Requested time bounds
----
[cftime.DatetimeNoLeap(2090, 1, 1, 0, 0, 0, 0, 4, 1), cftime.DatetimeNoLeap(2100, 1, 1, 0, 0, 0, 0, 0, 1)]


Experiment: cesm2_PI
Requested time bounds
----
[cftime.DatetimeNoLeap(551, 1, 1, 0, 0, 0, 0, 5, 1), cftime.DatetimeNoLeap(1071, 1, 1, 0, 0, 0, 0, 0, 1)]


Experiment: cesm2_hist
Requested time bounds
---

#### Define the units to use in final table

Note that the first cell of the notebook defines a year as 365 days and adds `gC` and `gN` units; the `PgC` and `TgN` used below are those units with SI prefixes.

In [7]:
# Define final units
PgC_per_year = 'PgC/yr'
TgN_per_year = 'TgN/yr'
uM = 'uM'

final_units = dict()
final_units['photoC_TOT_zint'] = PgC_per_year
final_units['photoC_diat_zint'] = PgC_per_year
final_units['photoC_TOT_zint_100m'] = PgC_per_year
final_units['photoC_diat_zint_100m'] = PgC_per_year
final_units['POC_FLUX_100m'] = PgC_per_year
final_units['CaCO3_FLUX_100m'] = PgC_per_year
final_units['diaz_Nfix'] = TgN_per_year
final_units['NOx_FLUX'] = TgN_per_year
final_units['NHy_FLUX'] = TgN_per_year
final_units['DENITRIF'] = TgN_per_year
final_units['SedDenitrif'] = TgN_per_year
final_units['DON_RIV_FLUX'] = TgN_per_year
final_units['DONr_RIV_FLUX'] = TgN_per_year
final_units['FG_CO2'] = PgC_per_year
final_units['O2'] = uM
final_units['O2_under_thres'] = 'Pm * m^2'
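
The conversions that `pint` performs for these units can be written out by hand. The sketch below assumes a globally integrated flux in mol/s as the starting point (the actual source units come from each file's `units` attribute), and the function names are illustrative.

```python
SECONDS_PER_YEAR = 86400 * 365  # the notebook defines a year as 365 days

# Molar masses matching the 'gram C' and 'gram N' unit definitions (g/mol)
GRAMS_PER_MOL_C = 12
GRAMS_PER_MOL_N = 14


def mol_per_s_to_PgC_per_yr(flux):
    """Convert a carbon flux from mol/s to PgC/yr (1 Pg = 1e15 g)."""
    return flux * GRAMS_PER_MOL_C * SECONDS_PER_YEAR / 1e15


def mol_per_s_to_TgN_per_yr(flux):
    """Convert a nitrogen flux from mol/s to TgN/yr (1 Tg = 1e12 g)."""
    return flux * GRAMS_PER_MOL_N * SECONDS_PER_YEAR / 1e12


# 'Pm * m^2' is one petameter times one square meter, i.e. 1e15 m^3
PM_M2_IN_M3 = 1e15

print(mol_per_s_to_PgC_per_yr(1.0))
print(mol_per_s_to_TgN_per_yr(1.0))
```

For a flux in nmol/s, the carbon conversion reduces to the factor `1E-9 * 12 * 1E-15 * 86400 * 365` that appears in the Ferret comments later in this notebook.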

#### Define labels for rows in each table

Also set the number of decimal places used when writing out each value.

In [8]:
# Define the row labels (keys) that will go into each table

def O2_vol_keys(o2_thres):
    if o2_thres == 20:
        return r'OMZ volume (10$^1$$^5$ m$^3$; <20 $\mu$M)'
    return rf'Volume (10$^1$$^5$ m$^3$) where O$_2$ <{o2_thres} $\mu$M'

# SETTING UP NAMES FOR ALL TABLE KEYS
POC_key = f'Sinking POC at 100 m ({PgC_per_year})'
CaCO3_key = f'Sinking CaCO$_3$ at 100 m ({PgC_per_year})'
rain_key = f'Rain ratio (CaCO$_3$/POC) at 100 m'
NPP_key = f'Net primary production, full depth ({PgC_per_year})'
NPP_diat_key = f'Diatom primary production, full depth (%)'
NPP_100m_key = f'Net primary production, top 100m ({PgC_per_year})'
NPP_diat_100m_key = f'Diatom primary production, top 100m (%)'
Nfix_key = f'Nitrogen fixation ({TgN_per_year})'
Ndep_key = f'Nitrogen deposition ({TgN_per_year})'
denitrif_key = f'Water Column Denitrification ({TgN_per_year})'
denitrif2_key = f'Sediment Denitrification ({TgN_per_year})'
rivflux_key = f'Nitrogen River Flux ({TgN_per_year})'
Ncycle_key = f'N cycle imbalance* ({TgN_per_year})'
CO2_key = f'Air–sea CO2 flux ({PgC_per_year})'
O2_key = r'Mean ocean oxygen ($\mu$M)'

# Define rounding digit count here
rounding = dict()
rounding[POC_key] = 2
rounding[CaCO3_key] = 3
rounding[rain_key] = 3
rounding[NPP_key] = 1
rounding[NPP_diat_key] = 0
rounding[NPP_100m_key] = 1
rounding[NPP_diat_100m_key] = 0
rounding[Nfix_key] = 0
rounding[Ndep_key] = 1
rounding[denitrif_key] = 0
rounding[denitrif2_key] = 0
rounding[rivflux_key] = 0
rounding[Ncycle_key] = 0
rounding[CO2_key] = 2
rounding[O2_key] = 0
for o2_thres in [5, 20, 60, 80]:
    rounding[O2_vol_keys(o2_thres)] = 0
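
The digit counts in `rounding` are later used as the precision of a fixed-point format string. A minimal sketch of that step, with shortened stand-in keys and hypothetical values:

```python
# Illustrative digit counts and values; the keys are shortened stand-ins for
# the table keys defined above.
example_rounding = {"POC": 2, "rain ratio": 3, "N fixation": 0}
example_values = {"POC": 8.0649, "rain ratio": 0.09381, "N fixation": 176.3}

formatted = {}
for key, value in example_values.items():
    digits = example_rounding[key]
    # A nested replacement field sets the precision of the fixed-point format
    formatted[key] = f"{value:0.{digits}f}"

print(formatted)
```

A precision of zero drops the decimal point entirely, which is why integer-rounded rows appear without a trailing `.0`.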

#### Average over all ensemble members and time (for proper time period)

In [9]:
def get_time_and_ensemble_mean(variable, ann_avg, exp, new_units, final_units):
    try:
        if exp in ['cesm1_PI', 'cesm1_PI_esm', 'cesm2_PI']:
            # Need isel instead of sel since PI slices are in index space rather than years
            ens_time_mean = (ann_avg[variable][exp][variable].isel(time=time_slices[exp]).mean('member_id')).mean('time').values
        else:
            ens_time_mean = (ann_avg[variable][exp][variable].sel(time=time_slices[exp]).mean('member_id')).mean('time').values
    except:
        print(f'   * Can not compute {variable} for {exp}')
        return('-')
    return((ens_time_mean * new_units[variable][exp]).to(final_units[variable]))
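
The two-step reduction in `get_time_and_ensemble_mean` (mean over `member_id`, then mean over `time`) can be illustrated with plain lists. The member names and values below are hypothetical.

```python
from statistics import fmean

# Hypothetical annual means: one list per ensemble member, one entry per year
annual_means = {
    "member_1": [7.0, 7.2, 7.1],
    "member_2": [6.8, 7.0, 7.3],
}
n_years = 3

# Step 1: average across members for each year (like .mean('member_id'))
ensemble_mean = [
    fmean(series[year] for series in annual_means.values())
    for year in range(n_years)
]
# Step 2: average the resulting time series (like .mean('time'))
ens_time_mean = fmean(ensemble_mean)

# With no missing values, the order of the two means does not matter
overall_mean = fmean(v for series in annual_means.values() for v in series)
print(ens_time_mean, overall_mean)
```

The equivalence with a single overall mean holds only when every member covers every year; with gaps, the two-step mean weights years rather than individual samples.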

In [10]:
%%time

diagnostic_values = dict()
for model_version in experiments:
    exp_loop = experiments[model_version]
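    if (model_version == 'cesm1' and 'cesm1_hist' in exp_loop and 'cesm1_RCP85' in exp_loop
            and 'cesm1_hist_RCP85' not in exp_loop):
        # Guard against appending a duplicate entry if this cell is re-run
        exp_loop.append('cesm1_hist_RCP85')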
    for exp in exp_loop:
        diagnostic_values[exp] = dict()
        # Compute each value by hand
        print(f'Computing 100m POC flux for {exp}')
        diagnostic_values[exp][POC_key] = get_time_and_ensemble_mean('POC_FLUX_100m', ann_avg, exp, new_units, final_units)

        print(f'Computing 100m CaCO3 flux for {exp}')
        diagnostic_values[exp][CaCO3_key] = get_time_and_ensemble_mean('CaCO3_FLUX_100m', ann_avg, exp, new_units, final_units)

        print(f'Computing 100m rain rate for {exp}')
        try:
            diagnostic_values[exp][rain_key] = (diagnostic_values[exp][CaCO3_key] /
                                                 diagnostic_values[exp][POC_key])
        except:
            print(f'   * Can not compute rain rate for {exp}')

        print(f'Computing full depth net primary production for {exp}')
        diagnostic_values[exp][NPP_key] = get_time_and_ensemble_mean('photoC_TOT_zint', ann_avg, exp, new_units, final_units)

        print(f'Computing full depth primary production from diatoms for {exp}')
        try:
            diagnostic_values[exp][NPP_diat_key] = 100*(get_time_and_ensemble_mean('photoC_diat_zint', ann_avg, exp, new_units, final_units) /
                                                        diagnostic_values[exp][NPP_key])
        except:
            print(f'   * Can not compute primary production from diatoms for {exp}')

        print(f'Computing top 100m net primary production for {exp}')
        diagnostic_values[exp][NPP_100m_key] = get_time_and_ensemble_mean('photoC_TOT_zint_100m', ann_avg, exp, new_units, final_units)

        print(f'Computing top 100m primary production from diatoms for {exp}')
        try:
            diagnostic_values[exp][NPP_diat_100m_key] = 100*(get_time_and_ensemble_mean('photoC_diat_zint_100m', ann_avg, exp, new_units, final_units) /
                                                        diagnostic_values[exp][NPP_100m_key])
        except:
            print(f'   * Can not compute primary production from diatoms for {exp}')

        print(f'Computing Nfixation for {exp}')
        diagnostic_values[exp][Nfix_key] = get_time_and_ensemble_mean('diaz_Nfix', ann_avg, exp, new_units, final_units)

        print(f'Computing Ndep for {exp}')
        diagnostic_values[exp][Ndep_key] = (get_time_and_ensemble_mean('NOx_FLUX', ann_avg, exp, new_units, final_units) +
                                            get_time_and_ensemble_mean('NHy_FLUX', ann_avg, exp, new_units, final_units))

        print(f'Computing Water Column Denitrif for {exp}')
        diagnostic_values[exp][denitrif_key] = get_time_and_ensemble_mean('DENITRIF', ann_avg, exp, new_units, final_units)

        print(f'Computing Sediment Denitrif for {exp}')
        diagnostic_values[exp][denitrif2_key] = get_time_and_ensemble_mean('SedDenitrif', ann_avg, exp, new_units, final_units)

        print(f'Computing Nitrogen River Flux for {exp}')
        diagnostic_values[exp][rivflux_key] = (get_time_and_ensemble_mean('DON_RIV_FLUX', ann_avg, exp, new_units, final_units) +
                                               get_time_and_ensemble_mean('DONr_RIV_FLUX', ann_avg, exp, new_units, final_units))

        print(f'Computing Nitrogen Cycle imbalance for {exp}')
        try:
            diagnostic_values[exp][Ncycle_key] = (diagnostic_values[exp][Ndep_key] +
                                                  diagnostic_values[exp][Nfix_key] -
                                                  diagnostic_values[exp][denitrif_key])
            try:
                diagnostic_values[exp][Ncycle_key] = (diagnostic_values[exp][Ncycle_key] -
                                                      diagnostic_values[exp][denitrif2_key] - 
                                                      diagnostic_values[exp][rivflux_key])
            except:
                print(f'   * No additional denitrification terms for {exp}')
                pass
        except:
            print(f'   * Can not compute Ncycle imbalance for {exp}')

        print(f'Computing air-sea CO2 Flux for {exp}')
        diagnostic_values[exp][CO2_key] = get_time_and_ensemble_mean('FG_CO2', ann_avg, exp, new_units, final_units)

        # Update O2 units to account for the fact that we are dividing by total volume
        print(f'Computing O2 concentration for {exp}')
        try:
            diagnostic_values[exp][O2_key] = get_time_and_ensemble_mean('O2', ann_avg, exp, new_units, final_units)
        except:
            print(f'   * Can not compute O2 concentration for {exp}')

        try:
            if exp in ann_avg['O2_under_thres']:
                for n, o2_thres in enumerate(ann_avg['O2_under_thres'][exp]['o2_thres'].data):
                    print(f'Computing volume where O2 < {o2_thres} uM for {exp}')
                    diagnostic_values[exp][O2_vol_keys(o2_thres)] = get_time_and_ensemble_mean('O2_under_thres', ann_avg, exp, new_units, final_units)[n]
        except:
            print(f'   * Can not compute O2 volumes under thresholds for {exp}')

        if exp != experiments[model_version][-1]:
            print('\n----\n')


Computing 100m POC flux for cesm1_PI
Computing 100m CaCO3 flux for cesm1_PI
Computing 100m rain rate for cesm1_PI
Computing full depth net primary production for cesm1_PI
Computing full depth primary production from diatoms for cesm1_PI
Computing top 100m net primary production for cesm1_PI
Computing top 100m primary production from diatoms for cesm1_PI
Computing Nfixation for cesm1_PI
Computing Ndep for cesm1_PI
Computing Water Column Denitrif for cesm1_PI
Computing Sediment Denitrif for cesm1_PI
   * Can not compute SedDenitrif for cesm1_PI
Computing Nitrogen River Flux for cesm1_PI
   * Can not compute DON_RIV_FLUX for cesm1_PI
   * Can not compute DONr_RIV_FLUX for cesm1_PI
Computing Nitrogen Cycle imbalance for cesm1_PI
   * No additional denitrification terms for cesm1_PI
Computing air-sea CO2 Flux for cesm1_PI
Computing O2 concentration for cesm1_PI
Computing volume where O2 < 5 uM for cesm1_PI
Computing volume where O2 < 20 uM for cesm1_PI
Computing volume where O2 < 60 uM for 

#### Actually make the tables

In [11]:
def make_table(diag_columns, test_exps):
    table_dict = dict()
    table_dict['Flux or Concentration'] = []
    for table_key in diag_columns:
        table_dict['Flux or Concentration'].append(table_key)
        for exp in test_exps:
            if experiment_longnames[exp] not in table_dict:
                table_dict[experiment_longnames[exp]] = []
            try:
                # Workaround to drop decimal place when rounding to nearest integer
                if exp != 'diff':
                    round_to = rounding[table_key]
                else:
                    round_to = 3
                fmt = f'0.{round_to}f'
                rounded_val = f'{diagnostic_values[exp][table_key].magnitude:{fmt}}'
                table_dict[experiment_longnames[exp]].append(rounded_val)
                # Add asterisk denoting CESM1 integrals are 150m, not full depth
                if ('cesm1' in exp) and (table_key in [NPP_key, NPP_diat_key]):
                    table_dict[experiment_longnames[exp]][-1] = table_dict[experiment_longnames[exp]][-1] + '*'
            except:
                table_dict[experiment_longnames[exp]].append('-')
    return(table_dict)
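
`make_table` builds a dictionary with one list per column and fills any missing entry with a dash. A stripped-down sketch of that layout, using hypothetical experiment names and pre-formatted values:

```python
# Hypothetical inputs: a formatted string where a value exists, no entry otherwise
values = {
    "exp_a": {"Sinking POC": "8.08"},
    "exp_b": {"Sinking POC": "7.00", "Sediment denitrification": "68"},
}
rows = ["Sinking POC", "Sediment denitrification"]

# One list per column, which is the layout pd.DataFrame accepts
table = {"Flux or Concentration": list(rows)}
for exp, exp_values in values.items():
    # Missing entries become '-' so every column has the same length
    table[exp] = [exp_values.get(row, "-") for row in rows]

for i, row in enumerate(rows):
    print(row, *(table[exp][i] for exp in values), sep=" | ")
```

Keeping every column the same length is what allows the dictionary to be passed directly to `pd.DataFrame`.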

In [12]:
if 'cesm1_PI_esm' in diagnostic_values:
    print('Comparison of cesm1_PI_esm')
#     let var = fg_co2[d=2]
#     show var var
#      VAR = FG_CO2[D=2]
#     list var_integral_PgC_year
#                  VARIABLE : (1E-9 * 12 * 1E-15 * 86400 * 365) * VAR_MUL_AREA[I=@SUM,J=@SUM]
#                  X        : 0.5 to 320.5
#                  Y        : 0.5 to 384.5
#                  ENSEMBLE : 0421
#              -0.02491
    if CO2_key in diagnostic_values["cesm1_PI_esm"]:
        print(f'FG_CO2: {diagnostic_values["cesm1_PI_esm"][CO2_key].magnitude:0.5f} (should be -0.02491)')

#     let var = POC_FLUX_IN_100m[d=2]
#     show var var
#      VAR = POC_FLUX_IN_100M[D=2]
#     list/prec=6 var_integral_PgC_year
#                  VARIABLE : (1E-9 * 12 * 1E-15 * 86400 * 365) * VAR_MUL_AREA[I=@SUM,J=@SUM]
#                  X        : 0.5 to 320.5
#                  Y        : 0.5 to 384.5
#                  ENSEMBLE : 0421
#               8.06490
    if POC_key in diagnostic_values["cesm1_PI_esm"]:
        print(f'POC_FLUX_IN_100m: {diagnostic_values["cesm1_PI_esm"][POC_key].magnitude:0.5f} (should be 8.06490)')

# let var = photoC_diat_zint_100m[d=2]+photoC_sp_zint_100m[d=2]+photoC_diaz_zint_100m[d=2]
# show var var
#  VAR = PHOTOC_DIAT_ZINT_100M[D=2]+PHOTOC_SP_ZINT_100M[D=2]+PHOTOC_DIAZ_ZINT_100M[D=2]
# list/prec=6 var_integral_PgC_year
#              VARIABLE : (1E-9 * 12 * 1E-15 * 86400 * 365) * VAR_MUL_AREA[I=@SUM,J=@SUM]
#              X        : 0.5 to 320.5
#              Y        : 0.5 to 384.5
#              ENSEMBLE : 0421
#           55.4878
    if NPP_100m_key in diagnostic_values["cesm1_PI_esm"]:
        print(f'photoC_diat_zint_100m: {diagnostic_values["cesm1_PI_esm"][NPP_100m_key].magnitude:0.4f} (should be 55.4878)')
else:
    print('No comparisons done, since cesm1_PI_esm experiment not included')

Comparison of cesm1_PI_esm
FG_CO2: -0.02491 (should be -0.02491)
POC_FLUX_IN_100m: 8.06490 (should be 8.06490)
photoC_diat_zint_100m: 55.4878 (should be 55.4878)
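
The visual comparison above could also be automated. The helper below is a hypothetical sketch, not part of the notebook: it compares a computed value with a reference string at the precision of that string. The computed values are made up for illustration; the reference strings are the Ferret values quoted in the cell.

```python
def matches_reference(value, reference):
    """Check that `value` equals `reference` at the reference's precision.

    `reference` is a string such as '8.06490'; its number of decimal places
    sets the precision of the comparison.
    """
    decimals = len(reference.split(".")[1]) if "." in reference else 0
    return f"{value:0.{decimals}f}" == reference


# Hypothetical computed values paired with the reference strings quoted above
checks = {
    "FG_CO2": (-0.024912, "-0.02491"),
    "POC_FLUX_IN_100m": (8.064899, "8.06490"),
    "NPP (top 100 m)": (55.48781, "55.4878"),
}
results = {name: matches_reference(v, ref) for name, (v, ref) in checks.items()}
print(results)
```

Comparing formatted strings mirrors what the printed output shows, so a pass here means the two numbers agree to every digit the reference reports.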


In [13]:
# Keith L's table
rows = [CO2_key, NPP_100m_key, POC_key]
# Match number of digits in original paper
rounding[CO2_key] = 3
rounding[NPP_100m_key] = 2

# Add difference column
new_exp = 'diff'
diagnostic_values[new_exp] = dict()
experiment_longnames[new_exp] = 'Difference'
for key in rows:
    if ('cesm1_hist' in diagnostic_values) and ('cesm1_PI' in diagnostic_values):
        try:
            # If key has not been populated, we want a dash here
            diagnostic_values[new_exp][key] = diagnostic_values['cesm1_hist'][key] - diagnostic_values['cesm1_PI'][key]
        except:
            diagnostic_values[new_exp][key] = '-'
    else:
        diagnostic_values[new_exp][key] = '-'

pd.DataFrame(make_table(rows, ['cesm1_PI', 'cesm1_hist', new_exp]))

| | Flux or Concentration | preindustrial (CESM1) | 1981-2005 (CESM1) | Difference |
|---|---|---|---|---|
| 0 | Air–sea CO2 flux (PgC/yr) | -0.024 | 1.774 | 1.798 |
| 1 | Net primary production, top 100m (PgC/yr) | 55.55 | 55.73 | 0.173 |
| 2 | Sinking POC at 100 m (PgC/yr) | 8.08 | 8.01 | -0.075 |


In [14]:
# Keith M's original table
# We use a different set of preindustrial years
# Also, maybe he uses equal weighting for month -> year instead of number of days per month?

rounding[CO2_key] = 2
rounding[NPP_100m_key] = 1

test_exps = ['cesm1_PI_esm', 'cesm1_hist_esm', 'cesm1_RCP45', 'cesm1_RCP85']
diagnostic_columns = [NPP_key,
                      NPP_100m_key,
                      POC_key,
                      CaCO3_key,
                      rain_key,
                      Nfix_key,
                      Ndep_key,
                      denitrif_key,
                      Ncycle_key,
                      CO2_key,
                      NPP_diat_key,
                      NPP_diat_100m_key,
                      O2_key,
                      O2_vol_keys(20)
                     ]
pd.DataFrame(make_table(diagnostic_columns, test_exps))

| | Flux or Concentration | preindustrial (CESM1, BPRP) | 1990s (CESM1) | RCP 4.5 2090s (CESM1) | RCP 8.5 2090s (CESM1) |
|---|---|---|---|---|---|
| 0 | Net primary production, full depth (PgC/yr) | 56.0* | 56.5* | - | 54.1* |
| 1 | Net primary production, top 100m (PgC/yr) | 55.5 | 56.0 | - | 53.7 |
| 2 | Sinking POC at 100 m (PgC/yr) | 8.06 | 8.06 | - | 7.21 |
| 3 | Sinking CaCO$_3$ at 100 m (PgC/yr) | 0.758 | 0.751 | - | 0.724 |
| 4 | Rain ratio (CaCO$_3$/POC) at 100 m | 0.094 | 0.093 | - | 0.100 |
| 5 | Nitrogen fixation (TgN/yr) | 177 | 174 | - | 144 |
| 6 | Nitrogen deposition (TgN/yr) | 6.7 | 30.0 | - | 30.9 |
| 7 | Water Column Denitrification (TgN/yr) | 190 | 193 | - | 188 |
| 8 | N cycle imbalance* (TgN/yr) | -6 | 10 | - | -13 |
| 9 | Air–sea CO2 flux (PgC/yr) | -0.02 | 2.19 | - | 4.72 |


In [15]:
# Updated table for our paper
rounding[CO2_key] = 2
rounding[NPP_100m_key] = 1

# test_exps = ['cesm2_PI', 'cesm2_hist', 'cesm2_SSP1-2.6', 'cesm2_SSP2-4.5', 'cesm2_SSP3-7.0', 'cesm2_SSP5-8.5']
test_exps = ['cesm1_PI', 'cesm1_hist_RCP85', 'cesm1_RCP85', 'cesm2_PI', 'cesm2_hist', 'cesm2_SSP5-8.5']
diagnostic_columns = [NPP_key,
#                       NPP_100m_key,
                      POC_key,
                      CaCO3_key,
                      rain_key,
                      Nfix_key,
                      Ndep_key,
                      denitrif_key,
                      denitrif2_key,
                      rivflux_key,
                      Ncycle_key,
                      CO2_key,
                      NPP_diat_key,
#                       NPP_diat_100m_key,
                      O2_key,
                      O2_vol_keys(20),
                      O2_vol_keys(5),
                      O2_vol_keys(60),
                      O2_vol_keys(80)
                     ]
our_table = pd.DataFrame(make_table(diagnostic_columns, test_exps))
our_table

| | Flux or Concentration | preindustrial (CESM1) | 1990 - 2014 (CESM1) | RCP 8.5 2090s (CESM1) | preindustrial (CESM2) | 1990-2014 (CESM2) | RCP85 2090s (CESM2) |
|---|---|---|---|---|---|---|---|
| 0 | Net primary production, full depth (PgC/yr) | 56.1* | 56.3* | 54.1* | 48.4 | 48.9 | 50.2 |
| 1 | Sinking POC at 100 m (PgC/yr) | 8.08 | 7.99 | 7.21 | 7.00 | 7.07 | 6.73 |
| 2 | Sinking CaCO$_3$ at 100 m (PgC/yr) | 0.758 | 0.749 | 0.724 | 0.769 | 0.769 | 0.813 |
| 3 | Rain ratio (CaCO$_3$/POC) at 100 m | 0.094 | 0.094 | 0.100 | 0.110 | 0.109 | 0.121 |
| 4 | Nitrogen fixation (TgN/yr) | 176 | 169 | 144 | 242 | 244 | 287 |
| 5 | Nitrogen deposition (TgN/yr) | 6.7 | 30.4 | 30.9 | 13.4 | 37.8 | 38.8 |
| 6 | Water Column Denitrification (TgN/yr) | 190 | 194 | 188 | 185 | 192 | 265 |
| 7 | Sediment Denitrification (TgN/yr) | - | - | - | 68 | 72 | 70 |
| 8 | Nitrogen River Flux (TgN/yr) | - | - | - | 5 | 9 | 9 |
| 9 | N cycle imbalance* (TgN/yr) | -8 | 6 | -13 | -3 | 9 | -19 |


In [16]:
print(our_table.to_latex(index=False))

\begin{tabular}{lllllll}
\toprule
                             Flux or Concentration & preindustrial (CESM1) & 1990 - 2014 (CESM1) & RCP 8.5 2090s (CESM1) & preindustrial (CESM2) & 1990-2014 (CESM2) & RCP85 2090s (CESM2) \\
\midrule
       Net primary production, full depth (PgC/yr) &                 56.1* &               56.3* &                 54.1* &                  48.4 &              48.9 &                50.2 \\
                     Sinking POC at 100 m (PgC/yr) &                  8.08 &                7.99 &                  7.21 &                  7.00 &              7.07 &                6.73 \\
                Sinking CaCO\$\_3\$ at 100 m (PgC/yr) &                 0.758 &               0.749 &                 0.724 &                 0.769 &             0.769 &               0.813 \\
                Rain ratio (CaCO\$\_3\$/POC) at 100 m &                 0.094 &               0.094 &                 0.100 &                 0.110 &             0.109 &               0.121 \\
 