### xESMF - problems occurring when creating an `xarray.Dataset` by using `xesmf.Regridder.__call__`

* `xesmf.Regridder.__call__` does not forward some coordinate variables to the regridded `xarray.Dataset`, namely those registered under `xarray.Dataset.data_vars` rather than `xarray.Dataset.coords`, while it forwards others (such as the input horizontal bounds) that should instead be replaced.
* Loading with `xarray.open_dataset(path_to_ds, decode_coords='all')` registers all coordinate variables properly, BUT it removes significant metadata, so that `cf_xarray.accessor._get_item.drop_bounds` can no longer distinguish between a coordinate variable and its bounds (fixed in >0.6.1, not yet released).
* xarray's `decode_coords='all'` also registers, for example, `ps` (surface pressure), which is required for a hybrid sigma vertical axis, under `xarray.Dataset.coords`. This prevents `xESMF` from remapping it (it gets dropped).
* I am not aware of a `cf_xarray.CFAccessor` method or `cf_xarray` function that redefines / resets `xarray.Dataset.data_vars` and `xarray.Dataset.coords` of an `xarray.Dataset` in the desired manner, so I set up custom methods within `clisops.core.Grid`.

-> Which of these problems are considered problems of xESMF and should therefore be addressed in xESMF (rather than by the user or by other libraries)?
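The points above hinge on whether a variable is registered under `data_vars` or under `coords`. As a minimal sketch of the kind of reset described in the last bullet, the helper below promotes variables referenced by CF `bounds` / `coordinates` attributes to coordinates and demotes selected variables to data variables. The function name `reset_coords_by_cf_attrs`, its `keep_as_data` parameter and the toy dataset are assumptions made for illustration; this is not the `clisops.core.Grid` implementation.

```python
import numpy as np
import xarray as xr


def reset_coords_by_cf_attrs(ds, keep_as_data=()):
    """Hypothetical helper: register CF bounds / auxiliary coordinates as coords
    and force the variables named in `keep_as_data` to be data variables."""
    promote = set()
    for var in ds.variables.values():
        # Names referenced through the CF "bounds" and "coordinates" attributes
        # (looked up in attrs first, then in encoding).
        for key in ("bounds", "coordinates"):
            names = var.attrs.get(key, var.encoding.get(key, ""))
            promote.update(n for n in names.split() if n in ds.variables)
    promote -= set(keep_as_data)
    ds = ds.set_coords(sorted(promote))
    # Demote non-index coordinates that should be remapped like ordinary data.
    demote = [n for n in keep_as_data if n in ds.coords and n not in ds.dims]
    return ds.reset_coords(demote) if demote else ds


# Toy dataset (made-up values): the bounds start out as a data variable,
# the surface pressure "ps" starts out as a coordinate.
toy = xr.Dataset(
    data_vars={
        "o3": (("lev", "lat"), np.zeros((2, 3))),
        "lat_bnds": (("lat", "bnds"), np.array([[-90.0, -30.0], [-30.0, 30.0], [30.0, 90.0]])),
    },
    coords={
        "lev": [0.9, 0.5],
        "lat": ("lat", [-60.0, 0.0, 60.0], {"bounds": "lat_bnds"}),
        "ps": ("lat", np.full(3, 101325.0)),
    },
)

fixed = reset_coords_by_cf_attrs(toy, keep_as_data=("ps",))
print(sorted(fixed.data_vars), sorted(fixed.coords))
```

The sketch relies only on `xarray.Dataset.set_coords` and `xarray.Dataset.reset_coords`; which variables have to stay data variables (here `ps`) still has to be decided by the caller.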

In [1]:
import numpy as np
import xarray as xr
import cf_xarray as cfxr
import xesmf as xe
import clisops as cl
from clisops.core import Grid
print("Using cf-xarray in version %s" % cfxr.__version__)
print("Using xESMF in version %s" % xe.__version__)
print("Using clisops in version %s" % cl.__version__)

from pathlib import Path
from git import Repo
import os

import warnings
warnings.simplefilter("ignore") 
#with warnings.catch_warnings():
#        warnings.simplefilter("ignore")

xr.set_options(display_style='html');

Using cf-xarray in version 0.6.2.dev2+g1fde526
Using xESMF in version 0.6.1
Using clisops in version 0.6.5


##### Initialize test data

In [2]:
# Initialize mini-esgf-data
MINIESGF_URL = "https://github.com/roocs/mini-esgf-data"
branch = "master"
MINIESGF = Path(Path.home(), ".mini-esgf-data", branch)

# Retrieve mini-esgf test data: clone on first use, otherwise update the existing checkout
if not os.path.isdir(MINIESGF):
    repo = Repo.clone_from(MINIESGF_URL, MINIESGF)
    repo.git.checkout(branch)
else:
    repo = Repo(MINIESGF)
    repo.git.checkout(branch)
    repo.remotes[0].pull()

MINIESGF = Path(MINIESGF, "test_data")

## 1 Default approach

#### Load the datasets

In [3]:
# Load the dataset
ds_path_o3 = Path(MINIESGF, "badc/cmip6/data/CMIP6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/AERmon/"
                            "o3/gn/v20190710/o3_AERmon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_185001.nc")
ds_o3 = xr.open_dataset(ds_path_o3)
ds_o3

In [4]:
ds_path_tos = Path(MINIESGF, "badc/cmip6/data/CMIP6/CMIP/MPI-M/MPI-ESM1-2-LR/historical/r1i1p1f1/Omon/"
                             "tos/gn/v20190710/tos_Omon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_185001-186912.nc")
ds_tos = xr.open_dataset(ds_path_tos)
ds_tos

#### Calculate the regridding weights

In [5]:
# Specify a global 1 deg grid as target grid
ds_out = xe.util.grid_global(1, 1)

In [6]:
# Create regridding weights
def regrid(ds_in, ds_out, method='nearest_s2d', locstream_in=False):
    """Convenience function that builds an xesmf.Regridder (i.e. calculates the regridding weights)"""
    return xe.Regridder(ds_in, ds_out, method, locstream_in=locstream_in)

In [7]:
regridder_o3 = regrid(ds_o3, ds_out)
regridder_tos = regrid(ds_tos, ds_out)

#### Perform remapping

Important vertical coordinate variables are lost in the regridded dataset `ds_o3_g1`,
while the old bounds are kept in the regridded dataset `ds_tos_g1`!

In [8]:
ds_o3_g1 = regridder_o3(ds_o3, keep_attrs=True)
ds_tos_g1 = regridder_tos(ds_tos, keep_attrs=True)

In [9]:
ds_o3_g1

In [10]:
ds_tos_g1
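To check which variables were lost or carried over without reading the HTML representations, the variable names of the input and the regridded dataset can be compared. The following is a small sketch on toy stand-in datasets; the helper name `compare_variables` and the toy values are assumptions and not part of xESMF.

```python
import numpy as np
import xarray as xr


def compare_variables(ds_before, ds_after):
    """Return (lost, kept): sorted variable names missing from / still present in ds_after."""
    before, after = set(ds_before.variables), set(ds_after.variables)
    return sorted(before - after), sorted(before & after)


# Toy stand-ins (made-up values): the "regridded" dataset lacks the hybrid
# coefficient and the surface pressure but still carries the old bounds.
before = xr.Dataset(
    {
        "o3": ("lev", np.zeros(2)),
        "ap": ("lev", np.ones(2)),
        "ps": ("lat", np.full(3, 101325.0)),
        "lat_bnds": (("lat", "bnds"), np.zeros((3, 2))),
    }
)
after = before.drop_vars(["ap", "ps"])

lost, kept = compare_variables(before, after)
print("lost:", lost)
print("kept:", kept)
```

With the objects of this notebook, the call would be `compare_variables(ds_o3, ds_o3_g1)` or `compare_variables(ds_tos, ds_tos_g1)`.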

### What can be done?

* Get all coordinate variables and auxiliary coordinate variables to be recognized as coordinates by xarray,
  * by using the option `decode_coords` when loading the dataset, or
  * by making use of the clisops `Grid` methods (which could be implemented in xESMF itself).
* Store the remapped variables in `ds_out` and transfer all necessary (e.g. non-horizontal) coordinate variables to `ds_out`.
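The second point, transferring the non-horizontal coordinate variables to `ds_out`, could look roughly like the sketch below. The helper name `transfer_non_horizontal_coords`, the `horizontal_dims` parameter and the toy grids are assumptions for illustration; this is not how clisops' `Grid._transfer_coords` is implemented.

```python
import numpy as np
import xarray as xr


def transfer_non_horizontal_coords(ds_in, ds_out, horizontal_dims):
    """Hypothetical helper: copy every coordinate of ds_in that does not
    depend on a horizontal dimension into a copy of ds_out."""
    horizontal = set(horizontal_dims)
    to_copy = {
        name: coord.variable
        for name, coord in ds_in.coords.items()
        if not horizontal & set(coord.dims)
    }
    return ds_out.assign_coords(to_copy)


# Toy source grid with a vertical axis, a hybrid coefficient and horizontal bounds
toy_in = xr.Dataset(
    coords={
        "lev": [0.95, 0.15],
        "ap": ("lev", [100.0, 5000.0]),
        "lat": [-60.0, 0.0, 60.0],
        "lon": [0.0, 180.0],
        "lat_bnds": (("lat", "bnds"), np.zeros((3, 2))),
    }
)
# Toy target grid with its own horizontal coordinates
toy_out = xr.Dataset(coords={"lat": [-45.0, 45.0], "lon": [90.0, 270.0]})

toy_out_full = transfer_non_horizontal_coords(toy_in, toy_out, ("lat", "lon"))
print(sorted(toy_out_full.coords))
```

The horizontal coordinates and bounds of the source grid are deliberately left behind, since the target grid brings its own. For the datasets used here, `horizontal_dims` would have to name the horizontal dimensions of the respective input grid.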

## 2 Approach using `xarray.open_dataset(path_to_ds, decode_coords='all')`

This option causes the variable `ps` in `ds_o3` to be registered under `xarray.Dataset.coords`, so xESMF drops it when `xesmf.Regridder.__call__` is called.
For `ds_tos`, `cf_xarray.CFAccessor` cannot uniquely identify the horizontal coordinates in versions up to and including v0.6.1 (https://github.com/xarray-contrib/cf-xarray/pull/264). Here, the old vertices are dropped, but the new bounds are not assigned to the dataset.
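The effect on `ps` can be reproduced without the test files. The sketch below builds a toy dataset with a `formula_terms` attribute and runs it through `xarray.decode_cf` with `decode_coords="all"`, which, as far as I know, is the same CF decoding step that `xarray.open_dataset` applies. Variable names and values are made up, and the exact behaviour may depend on the xarray version.

```python
import numpy as np
import xarray as xr

# Toy stand-in for a file on a hybrid sigma axis (made-up values); everything
# except the dimension coordinates starts out as a data variable.
raw = xr.Dataset(
    data_vars={
        "o3": (("lev", "lat"), np.zeros((2, 3))),
        "ap": ("lev", np.array([100.0, 5000.0])),
        "b": ("lev", np.array([0.9, 0.1])),
        "ps": ("lat", np.full(3, 101325.0)),
    },
    coords={
        "lev": ("lev", [0.95, 0.15], {"formula_terms": "ap: ap b: b ps: ps"}),
        "lat": ("lat", [-60.0, 0.0, 60.0]),
    },
)

# Decode in memory with the same option as used for open_dataset
decoded = xr.decode_cf(raw, decode_coords="all")
print("promoted to coords:", sorted(set(decoded.coords) - set(raw.coords)))

# Possible workaround: demote ps again so that it is treated as data to be remapped
ready = decoded.reset_coords("ps")
print("data variables:", sorted(ready.data_vars))
```

Demoting `ps` with `reset_coords` before regridding is only one possible workaround; the hybrid coefficients `ap` and `b` stay coordinates, since they do not depend on the horizontal grid.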

In [11]:
# Load datasets with decode_coords="all" parameter
# for ds_o3 it will cause ps to be identified as coordinate (and therefore dropped during remapping)
ds_o3 = xr.open_dataset(ds_path_o3, decode_coords='all')
ds_o3

In [12]:
ds_tos = xr.open_dataset(ds_path_tos, decode_coords='all')
ds_tos

In [13]:
regridder_o3 = regrid(ds_o3, ds_out)
regridder_tos = regrid(ds_tos, ds_out)

#### All coordinates are kept, but the variable ps is dropped. However, it should have been remapped as well.

In [14]:
ds_o3_g1 = regridder_o3(ds_o3, keep_attrs=True)
ds_o3_g1

#### The old vertices are dropped, but the new bounds are not assigned to the dataset

In [15]:
ds_tos_g1 = regridder_tos(ds_tos, keep_attrs=True)
ds_tos_g1

## 3 Approach using clisops Grid methods to (re)set `xarray.Dataset.data_vars` and `xarray.Dataset.coords` and storing remapped data in ds_out

Variable `ps` in `ds_o3` is now remapped as well, and all necessary coordinate variables are kept.
For `ds_tos`, the horizontal bounds are dropped.

#### Build Grid objects

In [16]:
grid_o3 = Grid(ds=ds_o3)
grid_o3.ds

In [17]:
grid_tos = Grid(ds=ds_tos)
grid_tos.ds

In [18]:
grid_out_o3 = Grid(ds=ds_out)
grid_out_tos = Grid(ds=ds_out)
grid_out_o3.ds

#### Manually transfer coordinates and attributes

`xarray.Dataset.attrs` and essential `xarray.Dataset.coords` have to be moved manually.

In [19]:
grid_out_o3._transfer_coords(grid_o3)
grid_out_tos._transfer_coords(grid_tos)

#### Build regridder and regrid

In [20]:
regridder_o3 = regrid(grid_o3.ds, grid_out_o3.ds)
regridder_tos = regrid(grid_tos.ds, grid_out_tos.ds)

In [21]:
grid_out_o3.ds["o3"] = regridder_o3(grid_o3.ds["o3"])
grid_out_o3.ds["ps"] = regridder_o3(grid_o3.ds["ps"])
grid_out_o3.ds

In [22]:
grid_out_tos.ds["tos"] = regridder_tos(grid_tos.ds["tos"])
grid_out_tos.ds
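Instead of listing the variables by hand, the per-variable remapping can be written as a loop over all data variables that span the horizontal dimensions. The sketch below is an assumption-laden illustration: `remap_into` and `horizontal_dims` are hypothetical names, and a simple nearest-neighbour label lookup via `DataArray.sel` stands in for the `xesmf.Regridder` call so that the example runs without ESMF.

```python
import numpy as np
import xarray as xr


def remap_into(ds_in, ds_out, regridder, horizontal_dims):
    """Hypothetical helper: apply `regridder` to every data variable of ds_in that
    spans all horizontal dimensions and store the result in a copy of ds_out."""
    result = ds_out.copy()
    for name, da in ds_in.data_vars.items():
        if set(horizontal_dims) <= set(da.dims):
            result[name] = regridder(da)
    return result


# Toy source dataset (made-up values); lat_bnds lacks "lon" and is therefore skipped
src = xr.Dataset(
    data_vars={
        "o3": (("lev", "lat", "lon"), np.arange(18.0).reshape(2, 3, 3)),
        "ps": (("lat", "lon"), np.arange(9.0).reshape(3, 3)),
        "lat_bnds": (("lat", "bnds"), np.zeros((3, 2))),
    },
    coords={"lev": [0.95, 0.15], "lat": [-60.0, 0.0, 60.0], "lon": [0.0, 120.0, 240.0]},
)
target = xr.Dataset(coords={"lat": [-45.0, 45.0], "lon": [90.0, 270.0]})


def nearest(da):
    # Stand-in for an xesmf.Regridder call: nearest-neighbour label lookup
    picked = da.sel(lat=target["lat"].values, lon=target["lon"].values, method="nearest")
    return picked.assign_coords(lat=target["lat"].values, lon=target["lon"].values)


remapped = remap_into(src, target, nearest, ("lat", "lon"))
print(sorted(remapped.data_vars))
```

In the notebook, `regridder_o3` would take the place of `nearest`, and `horizontal_dims` would have to name the horizontal dimensions of `grid_o3.ds`. Whether such a loop belongs in xESMF or in user code is the open question raised at the top.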