## Determine the fidelity of offline rebinning compared to online regridding
MOM6 can interpolate tracer fields online onto "diagnostic" vertical coordinates, e.g. rho2 or z. However, truncation errors in the interpolation make it difficult to close the budgets in the new vertical coordinate (see calc_budget_tracer_regridded, and work by Andrew Shao). Furthermore, online regridding is not always available (e.g. for CMIP6 data). We therefore wish to determine whether an offline rebinning approach (using xhistogram) can close the budget with accuracy at least comparable to that of online regridding.  
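
As a rough illustration of what offline rebinning means here, the sketch below sums an area-weighted budget term into density bins with a weighted histogram. It uses `np.histogram` as a stand-in for xhistogram, and all arrays (`rho`, `tendency`, `area`, `bin_edges`) are synthetic assumptions rather than model output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions, not ESM4 data), flattened over grid cells
ncell = 1000
rho = rng.uniform(1029.0, 1037.0, size=ncell)   # density of each grid cell
tendency = rng.normal(0.0, 1.0, size=ncell)     # budget term per unit area
area = rng.uniform(0.5, 1.5, size=ncell)        # horizontal cell area

# Bin edges that span the full density range (playing the role of rho2_i)
bin_edges = np.linspace(1028.0, 1038.0, 21)

# Weighted histogram: sum the area-weighted tendency falling in each density bin
binned, _ = np.histogram(rho, bins=bin_edges, weights=tendency * area)

# If every cell falls inside the bin range, the bin sum recovers the native total
total_native = np.sum(tendency * area)
total_binned = binned.sum()
print(np.isclose(total_native, total_binned))
```

The binned total matches the native total only because the edges enclose every density value; cells outside the outermost edges would be silently dropped from the histogram.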

There are two main difficulties with the offline binning approach. First, depending on how the bin widths are defined, some bins can be left empty (if no grid cell has a tracer value that falls within that bin), which produces data gaps. Second, the binning procedure necessarily assigns the budget terms to a bin according to the _time-mean_ tracer value for that grid cell, rather than the value the tracer had at each instant.  
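
Both difficulties can be reproduced on toy data. The following is a minimal sketch with made-up numbers (none of the values or bin edges come from the model), again using `np.histogram` in place of xhistogram.

```python
import numpy as np

# (1) Data gaps: narrow bins that no grid cell populates
rho = np.array([1030.1, 1030.2, 1033.7, 1033.9, 1036.6])
edges = np.linspace(1030.0, 1037.0, 15)          # 14 bins of width 0.5
counts, _ = np.histogram(rho, bins=edges)
n_empty = int(np.sum(counts == 0))               # bins with no data at all

# (2) Time-mean binning: one grid cell sampled at two times
rho_t = np.array([1031.2, 1034.8])               # density at each time
tend_t = np.array([2.0, -1.0])                   # budget term at each time
edges2 = np.array([1030.0, 1032.0, 1034.0, 1036.0])

# Bin each snapshot by its instantaneous density, then average in time
snapshots = np.array([
    np.histogram([r], bins=edges2, weights=[w])[0]
    for r, w in zip(rho_t, tend_t)
])
binned_then_averaged = snapshots.mean(axis=0)

# Average in time first, then bin by the time-mean density
averaged_then_binned = np.histogram(
    [rho_t.mean()], bins=edges2, weights=[tend_t.mean()]
)[0]

print(n_empty, binned_then_averaged, averaged_then_binned)
```

In this toy case the two orderings give the same total across bins but distribute it differently: the time-mean density lands in a bin that the cell never occupied at either sampled time.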

We explore the sensitivity of the binning approach to both these issues, using daily output from the ESM4 model.

In [3]:
%load_ext autoreload
%autoreload 2

In [4]:
import xarray as xr
from matplotlib import pyplot as plt
import wmt_bgc.budgetcalcs as bc
import wmt_bgc.rebin_functions as rb
from xhistogram.xarray import histogram
import numpy as np
from dask.diagnostics import ProgressBar

In [3]:
rootdir = '/archive/gam/ESM4/DECK/ESM4_piControl_D/gfdl.ncrc4-intel16-prod-openmp/history/'
config = '08990101.ocean_'
filename_grid = '08990101.ocean_static_no_mask_table.nc'
ds_daily_native = xr.open_dataset(rootdir+config+'daily.nc',decode_times=False)
ds_monthly_native = xr.open_dataset(rootdir+config+'month.nc',decode_times=False)
ds_daily_rho2 = xr.open_dataset(rootdir+config+'daily_rho2.nc',decode_times=False)
ds_monthly_rho2 = xr.open_dataset(rootdir+config+'month_rho2.nc',decode_times=False)
ds_daily_rho2h = xr.open_dataset(rootdir+config+'daily_rho2h.nc',decode_times=False)
ds_monthly_rho2h = xr.open_dataset(rootdir+config+'month_rho2h.nc',decode_times=False)
grid = xr.open_dataset(rootdir+filename_grid)

In [None]:
# Rebin whole dataset in vertical dimension (retain x, y, and time)
# Takes a long time, save to netcdf after binning
ds_daily_native_rebinned = rb.vertical_rebin_wrapper(ds_daily_native,"rhopot2",ds_daily_rho2.rho2_i.values,dz_name="thkcello",vert_dim="zl")
ds_daily_native_rebinned.to_netcdf('data/processed/ESM4_daily_08990101_verticalrebin_rho2')
#ds_monthly_native_rebinned = rb.vertical_rebin_wrapper(ds_monthly_native,"rhopot2",ds_monthly_rho2.rho2_i.values,dz_name="thkcello",vert_dim="zl")
#ds_monthly_native_rebinned.to_netcdf('data/processed/ESM4_monthly_08990101_verticalrebin_rho2')

In [4]:
ds_monthly_native_rebinned_rho2 = rb.total_rebin_layerintegral(ds_monthly_native,
                             ds_monthly_native['rhopot2'],
                             bins=ds_monthly_rho2['rho2_i'].values,
                             dim=['xh','yh','zl'],
                             area=grid['areacello'])

ZeroDivisionError: integer division or modulo by zero

In [36]:
# Perform binning on individual terms
term = 'opottempdiff'
var = ds_daily_native[term].isel(time=slice(0,10)).squeeze()
binvar = ds_daily_native['rhopot2'].isel(time=slice(0,10)).squeeze()
nanmask = np.isnan(var)
term_daily_native_rebinned = histogram(binvar.where(~nanmask),
                                       bins=[ds_daily_rho2.rho2_i.values],
                                       dim=['xh','yh','zl'],
                                       weights=(var*grid['areacello']).where(~nanmask),
                                       block_size=10)

In [None]:
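# Horizontally integrate one budget term to obtain profiles in rho2 space
term = 'opottempdiff'
# Offline: rebinned from native-grid output
daily_rebinned_timemean = (ds_daily_native_rebinned[term]*grid.areacello).sum(dim=['xh','yh']).mean(dim='time').squeeze()
# NOTE: ds_monthly_native_rebinned exists only if the commented-out monthly rebinning above has been run
monthly_rebinned = (ds_monthly_native_rebinned[term]*grid.areacello).sum(dim=['xh','yh']).squeeze()
# Online: rho2 diagnostic output from the model
daily_rho2_timemean = (ds_daily_rho2[term]*grid.areacello).sum(dim=['xh','yh']).mean(dim='time').squeeze()
monthly_rho2 = (ds_monthly_rho2[term]*grid.areacello).sum(dim=['xh','yh']).squeeze()
# Online: rho2h diagnostic output from the model
daily_rho2h_timemean = (ds_daily_rho2h[term]*grid.areacello).sum(dim=['xh','yh']).mean(dim='time').squeeze()
monthly_rho2h = (ds_monthly_rho2h[term]*grid.areacello).sum(dim=['xh','yh']).squeeze()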
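
One possible way to put a number on how closely an offline profile tracks an online one is a normalised RMS difference. The helper below is a hypothetical metric, not part of `wmt_bgc`, and the two profiles are made-up values; it assumes both profiles are defined on the same density bins and treats NaNs as empty bins to be skipped.

```python
import numpy as np

def relative_rms_difference(offline, online):
    """RMS of (offline - online) divided by the RMS of online.

    Bins where either profile is non-finite (e.g. empty bins) are ignored.
    """
    offline = np.asarray(offline, dtype=float)
    online = np.asarray(online, dtype=float)
    valid = np.isfinite(offline) & np.isfinite(online)
    diff_rms = np.sqrt(np.mean((offline[valid] - online[valid]) ** 2))
    ref_rms = np.sqrt(np.mean(online[valid] ** 2))
    return diff_rms / ref_rms

# Made-up profiles on five shared density bins; the last online bin is empty
online_profile = np.array([0.0, 1.0, 2.0, 1.0, np.nan])
offline_profile = np.array([0.0, 1.1, 1.9, 1.0, 0.3])

score = relative_rms_difference(offline_profile, online_profile)
print(score)
```

With the notebook's variables this could be called as `relative_rms_difference(daily_rebinned_timemean.values, daily_rho2_timemean.values)`, provided the two arrays share the same bins and ordering.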

In [None]:
daily_rebinned_timemean.plot(label='daily_rebinned_timemean')
monthly_rebinned.plot(label='monthly_rebinned',linestyle='--')
# Online (model-remapped) rho2 profiles, using the names defined in the previous cell
daily_rho2_timemean.plot(label='daily_rho2_timemean',linestyle='-.')
monthly_rho2.plot(label='monthly_rho2',linestyle=':')
plt.legend()
plt.gca().set_xlim([1028,1038])