## Generating a Table of Global Fluxes

This notebook reads time series data from `Make Timeseries.ipynb` and
generates a table containing the values listed in [issue #6](https://github.com/marbl-ecosys/cesm2-marbl/issues/6).

### This notebook uses several Python packages

The `watermark` package reports the package versions and system details used, to help others recreate this environment.

In [1]:
import pandas as pd
import ann_avg_utils as aau

# Set pandas option and get pint units
pd.set_option('display.max_colwidth', 75)
units, final_units = aau.get_pint_units()

%load_ext watermark
%watermark -d -iv -m -g -h

pandas 1.0.3
2020-05-27 

compiler   : GCC 7.3.0
system     : Linux
release    : 3.10.0-693.21.1.el7.x86_64
machine    : x86_64
processor  : x86_64
CPU cores  : 72
interpreter: 64bit
host name  : casper21
Git hash   : 9dc441caa085bb3518587c2d38737477a2f4e641


### Copy variables shared with the time series plotting from `aau`

In [2]:
global_vars = aau.global_vars()

xp_dir = global_vars['xp_dir']
vars = global_vars['vars']
experiments = global_vars['experiments']
experiment_longnames = global_vars['experiment_longnames']
experiment_dict = global_vars['experiment_dict']
time_slices = global_vars['time_slices']

#### Read output from `Make Timeseries.ipynb`

Files were written by `xpersist` and are read in using `xr.open_dataset`.

In [3]:
%%time

ann_avg, cesm_units = aau.get_ann_means_and_units(xp_dir, vars, experiments, experiment_longnames, units)


CPU times: user 1.27 s, sys: 117 ms, total: 1.39 s
Wall time: 1.72 s


## Reduce Data Sets

The data has already been reduced to annual means, but the netCDF files contain every year in the dataset.
To generate the tables, we average over specific time periods.

In [4]:
# aau.print_exp_time_bounds(ann_avg[vars[0]], time_slices)
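
To illustrate the reduction described above, here is a minimal sketch of averaging annual means over one time period. The structure of `time_slices` is not shown in this notebook, so the dictionary of years, the helper `mean_over_years`, and the numbers are all hypothetical stand-ins rather than what `aau` does.

```python
# Hypothetical annual-mean series: year -> value (made-up numbers)
annual_means = {1988: 1.0, 1989: 2.0, 1990: 3.0, 1991: 4.0, 1992: 5.0}


def mean_over_years(series, first_year, last_year):
    """Average the values whose year falls in [first_year, last_year]."""
    selected = [
        value for year, value in series.items()
        if first_year <= year <= last_year
    ]
    if not selected:
        raise ValueError(f"no data between {first_year} and {last_year}")
    return sum(selected) / len(selected)


# Keep only 1990-1992 and average those three years
period_mean = mean_over_years(annual_means, 1990, 1992)
print(period_mean)
```

Years outside the requested window are ignored, so the same file of annual means can serve several table columns that cover different periods.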

#### Define labels for rows in each table

Also determine the number of digits to which each value is written out.

In [5]:
table_specs = aau.get_table_specs(final_units, o2_levs=[5, 20, 60, 80])
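
As a rough sketch of what a per-row digit count means in practice, the snippet below formats values with a fixed number of decimal places using Python's format specification. The helper name `format_value` is hypothetical; the `0.{digits}f` pattern mirrors the one used later in `make_table`.

```python
def format_value(value, digits):
    """Render value as a fixed-point string with `digits` decimal places."""
    return f"{value:0.{digits}f}"


print(format_value(55.4878, 1))  # one decimal place
print(format_value(0.0937, 3))   # three decimal places
print(format_value(176.2, 0))    # zero digits: no decimal point is printed
```

With zero digits the `f` format omits the decimal point entirely, which is why integer-rounded rows can share the same code path as the others.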

#### Average over all ensemble members and time (for proper time period)

In [6]:
diagnostic_values = aau.compute_diagnostic_values(experiments, table_specs, ann_avg, time_slices, cesm_units, final_units, verbose=False)

   * Can not compute NHx_SURFACE_EMIS for cesm1_PI
   * Can not compute SedDenitrif for cesm1_PI
   * Can not compute DON_RIV_FLUX for cesm1_PI
   * Can not compute DONr_RIV_FLUX for cesm1_PI
   * No additional denitrification terms for cesm1_PI
   * Can not compute NHx_SURFACE_EMIS for cesm1_PI_esm
   * Can not compute SedDenitrif for cesm1_PI_esm
   * Can not compute DON_RIV_FLUX for cesm1_PI_esm
   * Can not compute DONr_RIV_FLUX for cesm1_PI_esm
   * No additional denitrification terms for cesm1_PI_esm
   * Can not compute NHx_SURFACE_EMIS for cesm1_hist
   * Can not compute SedDenitrif for cesm1_hist
   * Can not compute DON_RIV_FLUX for cesm1_hist
   * Can not compute DONr_RIV_FLUX for cesm1_hist
   * No additional denitrification terms for cesm1_hist
   * Can not compute NHx_SURFACE_EMIS for cesm1_hist_esm
   * Can not compute SedDenitrif for cesm1_hist_esm
   * Can not compute DON_RIV_FLUX for cesm1_hist_esm
   * Can not compute DONr_RIV_FLUX for cesm1_hist_esm
   * No addition
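
The averaging named in the heading above can be sketched as follows. This is only an illustration under stated assumptions: the member names, the values, and the order of operations (ensemble mean per year, then mean over years) are hypothetical, and `aau.compute_diagnostic_values` may be implemented differently.

```python
# Hypothetical annual means for three ensemble members (made-up numbers);
# every list covers the same years of the chosen time period
members = {
    "member_001": [1.0, 2.0, 3.0],
    "member_002": [2.0, 3.0, 4.0],
    "member_003": [3.0, 4.0, 5.0],
}


def ensemble_time_mean(member_series):
    """Average over ensemble members for each year, then over the years."""
    series_list = list(member_series.values())
    n_years = len(series_list[0])
    yearly_ensemble_mean = [
        sum(series[i] for series in series_list) / len(series_list)
        for i in range(n_years)
    ]
    return sum(yearly_ensemble_mean) / n_years


diagnostic = ensemble_time_mean(members)
print(diagnostic)
```

When every member covers the same years, averaging over members first or over time first gives the same result, so the order only matters if members have different lengths.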

#### Actually make the tables

In [7]:
def make_table(table_specs, table_vars, test_exps):
    table_dict = dict()
    table_dict['Flux or Concentration'] = []
    for varname in table_vars:
        # Unpack the table spec for this variable
        table_key = table_specs[varname]['key']
        units = table_specs[varname]['units']
        rounding = table_specs[varname]['rounding']
        # Get row header
        if units:
            row_header = f'{table_key} ({units})'
        else:
            row_header = table_key
        table_dict['Flux or Concentration'].append(row_header)

        # Get values rounded to correct number of digits
        for exp in test_exps:
            if experiment_longnames[exp] not in table_dict:
                table_dict[experiment_longnames[exp]] = []
            # The difference column is always written with 3 decimal places
            if exp != 'diff':
                round_to = rounding
            else:
                round_to = 3
            # Fixed-point format drops the decimal point when rounding to nearest integer
            fmt = f'0.{round_to}f'
            try:
                rounded_val = f'{diagnostic_values[exp][table_key].magnitude:{fmt}}'
                # Add asterisk denoting CESM1 integrals are 150m, not full depth
                if ('cesm1' in exp) and (varname in ['NPP', 'NPP_diat']):
                    rounded_val += '*'
                table_dict[experiment_longnames[exp]].append(rounded_val)
            except Exception:
                # Value is missing or is a '-' placeholder without a magnitude
                table_dict[experiment_longnames[exp]].append('-')
    return table_dict

In [8]:
if 'cesm1_PI_esm' in diagnostic_values:
    print('Comparison of cesm1_PI_esm')
#     let var = fg_co2[d=2]
#     show var var
#      VAR = FG_CO2[D=2]
#     list var_integral_PgC_year
#                  VARIABLE : (1E-9 * 12 * 1E-15 * 86400 * 365) * VAR_MUL_AREA[I=@SUM,J=@SUM]
#                  X        : 0.5 to 320.5
#                  Y        : 0.5 to 384.5
#                  ENSEMBLE : 0421
#              -0.02491
    if table_specs['CO2']['key'] in diagnostic_values["cesm1_PI_esm"]:
        print(f'FG_CO2: {diagnostic_values["cesm1_PI_esm"][table_specs["CO2"]["key"]].magnitude:0.5f} (should be -0.02491)')

#     let var = POC_FLUX_IN_100m[d=2]
#     show var var
#      VAR = POC_FLUX_IN_100M[D=2]
#     list/prec=6 var_integral_PgC_year
#                  VARIABLE : (1E-9 * 12 * 1E-15 * 86400 * 365) * VAR_MUL_AREA[I=@SUM,J=@SUM]
#                  X        : 0.5 to 320.5
#                  Y        : 0.5 to 384.5
#                  ENSEMBLE : 0421
#               8.06490
    if table_specs['POC']['key'] in diagnostic_values["cesm1_PI_esm"]:
        print(f'POC_FLUX_IN_100m: {diagnostic_values["cesm1_PI_esm"][table_specs["POC"]["key"]].magnitude:0.5f} (should be 8.06490)')

#     let var = photoC_diat_zint_100m[d=2]+photoC_sp_zint_100m[d=2]+photoC_diaz_zint_100m[d=2]
#     show var var
#      VAR = PHOTOC_DIAT_ZINT_100M[D=2]+PHOTOC_SP_ZINT_100M[D=2]+PHOTOC_DIAZ_ZINT_100M[D=2]
#     list/prec=6 var_integral_PgC_year
#                  VARIABLE : (1E-9 * 12 * 1E-15 * 86400 * 365) * VAR_MUL_AREA[I=@SUM,J=@SUM]
#                  X        : 0.5 to 320.5
#                  Y        : 0.5 to 384.5
#                  ENSEMBLE : 0421
#               55.4878
    if table_specs['NPP_100m']['key'] in diagnostic_values["cesm1_PI_esm"]:
        print(f'photoC_diat_zint_100m: {diagnostic_values["cesm1_PI_esm"][table_specs["NPP_100m"]["key"]].magnitude:0.4f} (should be 55.4878)')
else:
    print('No comparisons done, since cesm1_PI_esm experiment not included')

Comparison of cesm1_PI_esm
FG_CO2: -0.02491 (should be -0.02491)
POC_FLUX_IN_100m: 8.06490 (should be 8.06490)
photoC_diat_zint_100m: 55.4878 (should be 55.4878)
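
The reference values quoted in the comments of the cell above all use the same conversion factor, `(1E-9 * 12 * 1E-15 * 86400 * 365)`. The sketch below spells out one reading of that factor. Interpreting the area-integrated quantity as nmol/s is an assumption on our part, as are the constant names; only the numeric factors come from the quoted expression.

```python
# Factors as written in the quoted expression; the unit labels are our reading
NMOL_TO_MOL = 1e-9       # assumed: nmol -> mol
GRAMS_C_PER_MOL = 12     # approximate molar mass of carbon (g/mol)
GRAMS_TO_PG = 1e-15      # g -> Pg
SECONDS_PER_DAY = 86400
DAYS_PER_YEAR = 365      # no leap years

# Multiplying an area-integrated flux in nmol/s by this factor gives PgC/yr
nmol_per_s_to_PgC_per_yr = (
    NMOL_TO_MOL * GRAMS_C_PER_MOL * GRAMS_TO_PG * SECONDS_PER_DAY * DAYS_PER_YEAR
)
print(f"{nmol_per_s_to_PgC_per_yr:.5e}")
```

Writing the factor as named constants makes it easier to check against the `pint` units used elsewhere in this notebook.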


In [9]:
# Keith L's table
table_vars = ['CO2', 'NPP_100m', 'POC']

# Match number of digits in original paper
table_specs['CO2']['rounding'] = 3
table_specs['NPP_100m']['rounding'] = 2

# Add difference column
new_exp = 'diff'
diagnostic_values[new_exp] = dict()
experiment_longnames[new_exp] = 'Difference'
for table_key in table_specs:
    table_spec = table_specs[table_key]
    if ('cesm1_hist' in diagnostic_values) and ('cesm1_PI' in diagnostic_values):
        try:
            diagnostic_values[new_exp][table_spec['key']] = diagnostic_values['cesm1_hist'][table_spec['key']] - diagnostic_values['cesm1_PI'][table_spec['key']]
        except Exception:
            # If key has not been populated, we want a dash here
            diagnostic_values[new_exp][table_spec['key']] = '-'
    else:
        diagnostic_values[new_exp][table_spec['key']] = '-'

pd.DataFrame(make_table(table_specs, table_vars, ['cesm1_PI', 'cesm1_hist', new_exp]))

|   | Flux or Concentration | preindustrial (CESM1) | 1981-2005 (CESM1) | Difference |
|---|---|---|---|---|
| 0 | Air–sea CO2 flux (PgC/yr) | -0.024 | 1.774 | 1.798 |
| 1 | Net primary production, top 100m (PgC/yr) | 55.55 | 55.73 | 0.173 |
| 2 | Sinking POC at 100 m (PgC/yr) | 8.08 | 8.01 | -0.075 |


In [10]:
# Keith M's original table
# We use a different set of preindustrial years
# Also, maybe he uses equal weighting for month -> year instead of number of days per month?

table_specs['CO2']['rounding'] = 2
table_specs['NPP_100m']['rounding'] = 1

test_exps = ['cesm1_PI_esm', 'cesm1_hist_esm', 'cesm1_RCP45', 'cesm1_RCP85']
table_vars = ['NPP', 'NPP_100m', 'POC', 'CaCO3', 'rain', 'Nfix', 'Ndep', 'denitrif',
              'Ncycle', 'CO2', 'NPP_diat', 'NPP_diat_100m', 'O2', 'o2_under_20']

pd.DataFrame(make_table(table_specs, table_vars, test_exps))

|   | Flux or Concentration | preindustrial (CESM1, BPRP) | 1990s (CESM1) | RCP 4.5 2090s (CESM1) | RCP 8.5 2090s (CESM1) |
|---|---|---|---|---|---|
| 0 | Net primary production, full depth (PgC/yr) | 56.0* | 56.5* | - | 54.1* |
| 1 | Net primary production, top 100m (PgC/yr) | 55.5 | 56.0 | - | 53.7 |
| 2 | Sinking POC at 100 m (PgC/yr) | 8.06 | 8.06 | - | 7.21 |
| 3 | Sinking CaCO$_3$ at 100 m (PgC/yr) | 0.758 | 0.751 | - | 0.724 |
| 4 | Rain ratio (CaCO$_3$/POC) at 100 m | 0.094 | 0.093 | - | 0.100 |
| 5 | Nitrogen fixation (TgN/yr) | 177 | 174 | - | 144 |
| 6 | Nitrogen deposition (TgN/yr) | 6.7 | 30.0 | - | 30.9 |
| 7 | Water Column Denitrification (TgN/yr) | 190 | 193 | - | 188 |
| 8 | N cycle imbalance (TgN/yr) | -6 | 10 | - | -13 |
| 9 | Air–sea CO2 flux (PgC/yr) | -0.02 | 2.19 | - | 4.72 |
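
The comment in cell [10] raises the possibility that the original table weighted every month equally when forming annual means, rather than weighting by the number of days per month. The sketch below shows how the two choices differ. The monthly values are made up, the function names are hypothetical, and a no-leap calendar is assumed.

```python
# Days per month in a no-leap year (assumption)
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def equal_weight_mean(monthly):
    """Annual mean in which every month counts the same."""
    return sum(monthly) / len(monthly)


def day_weight_mean(monthly):
    """Annual mean in which each month is weighted by its length in days."""
    weighted_sum = sum(v * d for v, d in zip(monthly, DAYS_PER_MONTH))
    return weighted_sum / sum(DAYS_PER_MONTH)


# Made-up monthly means with a steady trend through the year
monthly_values = [float(month) for month in range(1, 13)]

equal_mean = equal_weight_mean(monthly_values)
day_mean = day_weight_mean(monthly_values)
print(equal_mean, day_mean)
```

The two means differ only slightly, which is consistent with small discrepancies in the last printed digit of a table rather than large ones.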


In [11]:
# Updated table for our paper
table_specs['CO2']['rounding'] = 2
table_specs['NPP_100m']['rounding'] = 1

# test_exps = ['cesm2_PI', 'cesm2_hist', 'cesm2_SSP1-2.6', 'cesm2_SSP2-4.5', 'cesm2_SSP3-7.0', 'cesm2_SSP5-8.5']
test_exps = ['cesm1_PI', 'cesm1_hist_RCP85', 'cesm1_RCP85', 'cesm2_PI', 'cesm2_hist', 'cesm2_SSP5-8.5']
table_vars = ['NPP',
#               'NPP_100m',
              'POC',
              'CaCO3',
              'rain',
              'Nfix',
              'Ndep',
              'denitrif',
              'denitrif2',
              'Nemis',
              'rivflux',
              'Ncycle',
              'CO2',
              'NPP_diat_100m',
              'NPP_diat',
#               'NPP_diat_100m',
#               'O2',
#               'o2_under_20',
#               'o2_under_5',
#               'o2_under_60',
#               'o2_under_80'
             ]
our_table = pd.DataFrame(make_table(table_specs, table_vars, test_exps))
our_table

|   | Flux or Concentration | preindustrial (CESM1) | 1990 - 2014 (CESM1) | RCP 8.5 2090s (CESM1) | preindustrial (CESM2) | 1990-2014 (CESM2) | SSP5-8.5 2090s (CESM2) |
|---|---|---|---|---|---|---|---|
| 0 | Net primary production, full depth (PgC/yr) | 56.1* | 56.3* | 54.1* | 48.4 | 48.9 | 50.0 |
| 1 | Sinking POC at 100 m (PgC/yr) | 8.08 | 7.99 | 7.21 | 7.0 | 7.07 | 6.71 |
| 2 | Sinking CaCO$_3$ at 100 m (PgC/yr) | 0.758 | 0.749 | 0.724 | 0.769 | 0.769 | 0.81 |
| 3 | Rain ratio (CaCO$_3$/POC) at 100 m | 0.094 | 0.094 | 0.100 | 0.11 | 0.109 | 0.121 |
| 4 | Nitrogen fixation (TgN/yr) | 176 | 169 | 144 | 242.0 | 244.0 | 286.0 |
| 5 | Nitrogen deposition (TgN/yr) | 6.7 | 30.4 | 30.9 | 13.4 | 37.8 | 38.8 |
| 6 | Water Column Denitrification (TgN/yr) | 190 | 194 | 188 | 185.0 | 192.0 | 258.0 |
| 7 | Sediment Denitrification (TgN/yr) | - | - | - | 68.0 | 72.0 | 70.0 |
| 8 | Nitrogen surface emissions (TgN/yr) | - | - | - | 6.0 | 5.0 | 3.0 |
| 9 | Nitrogen River Flux (TgN/yr) | - | - | - | 5.0 | 9.0 | 9.0 |


In [12]:
print(our_table.to_latex(index=False, escape=False))

\begin{tabular}{lllllll}
\toprule
                       Flux or Concentration & preindustrial (CESM1) & 1990 - 2014 (CESM1) & RCP 8.5 2090s (CESM1) & preindustrial (CESM2) & 1990-2014 (CESM2) & SSP5-8.5 2090s (CESM2) \\
\midrule
 Net primary production, full depth (PgC/yr) &                 56.1* &               56.3* &                 54.1* &                  48.4 &              48.9 &                   50.0 \\
               Sinking POC at 100 m (PgC/yr) &                  8.08 &                7.99 &                  7.21 &                  7.00 &              7.07 &                   6.71 \\
          Sinking CaCO$_3$ at 100 m (PgC/yr) &                 0.758 &               0.749 &                 0.724 &                 0.769 &             0.769 &                  0.810 \\
          Rain ratio (CaCO$_3$/POC) at 100 m &                 0.094 &               0.094 &                 0.100 &                 0.110 &             0.109 &                  0.121 \\
                  Nitr