## Reproject Datasets

To reproject (regrid) the ET datasets, we will follow the reprojection method from a HyTEST notebook: [conus404_regrid.ipynb](https://github.com/hytest-org/hytest/blob/main/dataset_access/conus404_regrid.ipynb)

This method uses xESMF to reproject rectilinear grids, which is what we have. See this [notebook](https://xesmf.readthedocs.io/en/latest/notebooks/Rectilinear_grid.html) for details. The regridding uses one of the six algorithms listed [here](https://xesmf.readthedocs.io/en/latest/notebooks/Compare_algorithms.html). They recommend `conservative` for upscaling (increasing pixel size). An important note is that extra dimensions must be on the left, i.e., `(time, lev, lat, lon)` is correct but `(lat, lon, time, lev)` would not work. Our ET datasets already have this format, so we are ready to implement.
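
The dimension-order requirement can be checked before regridding. The following is a minimal sketch in plain Python, not part of xESMF: the helper names `spatial_dims_are_last` and `move_spatial_dims_last` are hypothetical, and the spatial dimension names `lat` and `lon` are assumed to match the datasets.

```python
def spatial_dims_are_last(dims, spatial=("lat", "lon")):
    """Return True if the trailing dimensions are exactly the spatial ones, in order."""
    return tuple(dims[-len(spatial):]) == tuple(spatial)


def move_spatial_dims_last(dims, spatial=("lat", "lon")):
    """Return the dimension names reordered so the spatial ones come last."""
    extra = [d for d in dims if d not in spatial]
    return tuple(extra) + tuple(spatial)


# The two orderings mentioned in the text
good = ("time", "lev", "lat", "lon")
bad = ("lat", "lon", "time", "lev")

print(spatial_dims_are_last(good))  # True
print(spatial_dims_are_last(bad))   # False
print(move_spatial_dims_last(bad))  # ('time', 'lev', 'lat', 'lon')
```

With xarray, the dimension names would come from the `.dims` attribute of a data variable, and the reordered tuple could be passed to `DataArray.transpose(*new_order)`.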

> Note: This requires the dev version of xESMF (v0.7.2), which allows for datasets to have different chunks.

In [None]:
import hvplot.xarray
import cartopy.crs as ccrs
import numpy as np
import xarray as xr
import os
from pathlib import Path

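# xESMF relies on ESMF, which looks up its install through the ESMFMKFILE
# environment variable; set it here if the environment has not already done so
if 'ESMFMKFILE' not in os.environ:
    os.environ['ESMFMKFILE'] = str(Path(os.__file__).parent.parent / 'esmf.mk')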

import xesmf as xe

Let's read in our datasets that we compiled in the `compile_datasets.ipynb` notebook.

In [None]:
terra = xr.open_dataset('terraclimate/terraclimate_aet.nc', engine='netcdf4', chunks={'lat': -1, 'lon': -1, 'time': 30})
era5 = xr.open_dataset('era5/era5_aet.nc', engine='netcdf4', chunks={'lon':-1, 'lat':-1, 'time': 219})
nldas = xr.open_dataset('nldas/nldas_aet.nc', engine='netcdf4', chunks={'lat': -1, 'lon': -1, 'time': -1})
gleam = xr.open_dataset('gleam/gleam_aet.nc', engine='netcdf4', chunks={'lat': -1, 'lon': -1, 'time': -1})
wbet = xr.open_dataset('wbet/wbet_aet.nc', engine='netcdf4', chunks={'lat': -1, 'lon': -1, 'time': 2})
ssebop = xr.open_dataset('ssebop/ssebop_aet.nc', engine='netcdf4', chunks={'lon': -1, 'lat': -1, 'time': 2})

Either by inspecting the datasets or from general knowledge of the data, we know that GLEAM has the lowest resolution of the six datasets, so we will upscale the other datasets to match it. To do this, we just need to make a `Regridder` object from each source dataset and the target dataset, and then apply it to the source dataset. Since we have loaded our datasets as dask arrays, this should be a lazy computation, so nothing is computed until we need it.
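
One way to confirm which dataset is coarsest "by looking at the datasets" is to compare the spacing of their coordinate axes. The sketch below uses plain Python and made-up latitude axes; the `grid_spacing` helper and the grid values are hypothetical illustrations, not the actual dataset resolutions.

```python
from statistics import median


def grid_spacing(coords):
    """Median absolute spacing between consecutive coordinate values."""
    steps = [abs(b - a) for a, b in zip(coords[:-1], coords[1:])]
    return median(steps)


# Hypothetical latitude axes (degrees); real values would come from each dataset
grids = {
    "coarse": [25.0 + 0.25 * i for i in range(8)],
    "fine": [25.0 + 0.125 * i for i in range(16)],
}

spacings = {name: grid_spacing(lat) for name, lat in grids.items()}
# The dataset with the largest spacing has the lowest resolution
coarsest = max(spacings, key=spacings.get)
print(spacings)
print(coarsest)
```

For the real datasets, the coordinate values could be taken from something like `ds['lat'].values`, assuming the latitude coordinate is named `lat`.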

> Note: Since there are `NaN`s in the maps where there is open water, we need to account for this in our regridding. Setting `skipna=True` ignores `NaN`s in the computations. Without it, any regridded pixel that had a `NaN` in the calculation from the original image would result in a `NaN` (i.e., the `NaN` area would increase).

In [None]:
ds_to_regrid = [terra, era5, nldas, wbet, ssebop]
ds_names = ['terraclimate', 'era5', 'nldas', 'wbet', 'ssebop']
for ds, name in zip(ds_to_regrid, ds_names):
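    # Build the conservative regridder from this dataset's grid to the GLEAM grid
    regrid = xe.Regridder(ds, gleam, "conservative")
    # Apply it lazily, ignoring NaNs, and rechunk the result into a single chunk
    ds_regridded = regrid(ds, keep_attrs=True, skipna=True).chunk({'lat': -1, 'lon': -1, 'time': -1})
    # Writing to disk is what triggers the actual computation
    ds_regridded.to_netcdf(path=f'{name}/{name}_aet_regridded.nc', format='NETCDF4', engine='netcdf4')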
    # Delete variables to reduce memory pile-up
    del regrid, ds_regridded
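
To see why `skipna=True` matters in the cell above, here is a deliberately simplified sketch of upscaling in plain Python. It assumes equal-weight averaging of consecutive fine cells in one dimension, whereas a conservative regridder works with area-overlap weights on a 2D grid. The `block_mean` helper is hypothetical and is not how xESMF is implemented.

```python
import math


def block_mean(values, factor, skipna):
    """Average consecutive groups of `factor` fine cells into one coarse cell."""
    coarse = []
    for start in range(0, len(values), factor):
        block = values[start:start + factor]
        if skipna:
            # Drop missing cells so the mean uses only valid data
            block = [v for v in block if not math.isnan(v)]
        if not block or any(math.isnan(v) for v in block):
            # No valid data left, or a NaN was kept: the coarse cell is missing
            coarse.append(math.nan)
        else:
            coarse.append(sum(block) / len(block))
    return coarse


# Toy fine-resolution row with open-water gaps marked as NaN
fine = [1.0, 3.0, math.nan, 5.0, math.nan, math.nan]

strict = block_mean(fine, 2, skipna=False)
lenient = block_mean(fine, 2, skipna=True)
print(strict)   # [2.0, nan, nan]
print(lenient)  # [2.0, 5.0, nan]
```

Without skipping, the partly missing middle block becomes `NaN`, so the missing area grows. With skipping, only the block with no valid data at all stays `NaN`.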

Now that we have regridded, let's see how the regridded datasets compare with the GLEAM dataset.

In [None]:
terra_regrid = xr.open_dataset('terraclimate/terraclimate_aet_regridded.nc', engine='netcdf4')
era5_regrid = xr.open_dataset('era5/era5_aet_regridded.nc', engine='netcdf4')
nldas_regrid = xr.open_dataset('nldas/nldas_aet_regridded.nc', engine='netcdf4')
wbet_regrid = xr.open_dataset('wbet/wbet_aet_regridded.nc', engine='netcdf4')
ssebop_regrid = xr.open_dataset('ssebop/ssebop_aet_regridded.nc', engine='netcdf4')

In [None]:
import panel as pn

# Build one map per dataset; `groupby='time'` adds a time slider shared by the layout
plots = (
    terra_regrid.hvplot(groupby='time', geo=True, coastline=True, title='TerraClimate Regridded').opts(width=400)
    + era5_regrid.hvplot(groupby='time', geo=True, coastline=True, title='ERA5 Regridded').opts(width=400)
    + nldas_regrid.hvplot(groupby='time', geo=True, coastline=True, title='NLDAS Regridded').opts(width=400)
    + wbet_regrid.hvplot(groupby='time', geo=True, coastline=True, title='WBET Regridded').opts(width=400)
    + ssebop_regrid.hvplot(groupby='time', geo=True, coastline=True, title='SSEBop Regridded').opts(width=400)
    + gleam.hvplot(groupby='time', geo=True, coastline=True, title='GLEAM Original').opts(width=400)
)

# Arrange the maps in two columns with the time widget on top
pn.panel(plots.cols(2), widget_location='top')