# Relative Bias Estimates between Data Sets

Now that we have computed the error variances in the [TC notebook](2_TC_application.ipynb) and covariances in the [EC notebook](3_EC_application.ipynb), let's compare the differences (i.e., relative bias) between data sets, both over all times and by season. We can then compare this relative bias to the error variances. If the bias is smaller in magnitude than the errors, this would indicate that the relative bias matters less than the error variance in ET data sets.

In [1]:
import hvplot.xarray
import holoviews as hv
import panel as pn
import cartopy.crs as ccrs
import numpy as np
import xarray as xr
import itertools
import warnings

## Combine Data Sets in Xarray
First, we need to load our ET data sets and limit them to a common date range. Since each bias is computed between two data sets, we restrict all data sets to the range that begins at the second-oldest start date and ends at the second-most-recent end date. Outside this range at most one data set has data, so no pairwise difference could be computed there. This choice therefore saves some memory while still using the largest amount of data. For pairs that include a data set with a more restricted date range, the missing data should propagate and not give us a bias on those dates.

In [2]:
files = ['../Data/ssebop/ssebop_aet_regridded.nc',
         '../Data/gleam/gleam_aet.nc',
         '../Data/era5/era5_aet_regridded.nc',
         '../Data/nldas/nldas_aet_regridded.nc',
         '../Data/terraclimate/terraclimate_aet_regridded.nc',        
         '../Data/wbet/wbet_aet_regridded.nc',
         ]
dataset_name = ['SSEBop', 'GLEAM', 'ERA5', 'NLDAS', 'TerraClimate', 'WBET']

date_ranges = {}
for file, name in zip(files, dataset_name):
    ds_temp = xr.open_dataset(file, engine='netcdf4', chunks={'lon': -1, 'lat': -1, 'time': -1})
    date_ranges[name] = [ds_temp.time.min().values, ds_temp.time.max().values]

# Take the second oldest start and second most recent end dates
date_range = [np.sort(np.array(list(date_ranges.values()))[:, 0])[1],
              np.sort(np.array(list(date_ranges.values()))[:, 1])[-2]]
date_range

[numpy.datetime64('1950-01-01T00:00:00.000000000'),
 numpy.datetime64('2022-12-01T00:00:00.000000000')]
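
To see why the second-oldest start and second-most-recent end bound the useful range, here is a minimal sketch using made-up dates. The dictionary `toy_ranges` and its four data sets are illustrative assumptions, not the real ET products. Within the resulting range at least two data sets overlap on every date, so at least one pairwise difference can exist.

```python
import numpy as np

# Hypothetical (start, end) dates for four toy data sets (not the real products)
toy_ranges = {
    'A': (np.datetime64('1980-01-01'), np.datetime64('2020-12-01')),
    'B': (np.datetime64('1950-01-01'), np.datetime64('2022-12-01')),
    'C': (np.datetime64('2000-01-01'), np.datetime64('2023-06-01')),
    'D': (np.datetime64('1940-01-01'), np.datetime64('2019-12-01')),
}

# Sort the start and end dates separately
starts = np.sort(np.array([r[0] for r in toy_ranges.values()]))
ends = np.sort(np.array([r[1] for r in toy_ranges.values()]))

# Second-oldest start and second-most-recent end
toy_date_range = (starts[1], ends[-2])
print(toy_date_range)
```

Before `starts[1]` only data set `D` has data, and after `ends[-2]` only data set `C` does, so no pair could be differenced outside this range.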

Using the date range, we can now combine all of the data sets into a single `xarray` `Dataset` for easy computations.

In [3]:
def preprocess(ds):
    """
    Keep only the specified time range for each file.
    """
    return ds.sel(time=slice(date_range[0], date_range[1]))

ds = xr.open_mfdataset(files, engine='netcdf4', preprocess=preprocess, combine='nested', concat_dim='dataset_name')
ds = ds.assign_coords({'dataset_name': dataset_name})
ds.dataset_name.attrs['description'] = 'Dataset name'

# Need time as first index for TC computation
ds = ds.transpose('time', ...)
# The data set is less than 1GiB, so let's read it into memory vs keeping as a dask array
ds = ds.compute()
ds

## Relative Bias Estimation

Next, we will want to compute the relative bias for all 15 possible pairs of our six data sets. So, let's generate those pairs or combinations.

In [4]:
# Generate a list of the combinations
combos = list(itertools.combinations(dataset_name, 2))
combos = [list(combo) for combo in combos]
combos

[['SSEBop', 'GLEAM'],
 ['SSEBop', 'ERA5'],
 ['SSEBop', 'NLDAS'],
 ['SSEBop', 'TerraClimate'],
 ['SSEBop', 'WBET'],
 ['GLEAM', 'ERA5'],
 ['GLEAM', 'NLDAS'],
 ['GLEAM', 'TerraClimate'],
 ['GLEAM', 'WBET'],
 ['ERA5', 'NLDAS'],
 ['ERA5', 'TerraClimate'],
 ['ERA5', 'WBET'],
 ['NLDAS', 'TerraClimate'],
 ['NLDAS', 'WBET'],
 ['TerraClimate', 'WBET']]
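
As a quick sanity check on the count of 15, the number of unordered pairs drawn from six data sets is the binomial coefficient "6 choose 2". The sketch below uses only the standard library. The names `pairs` and `pair_labels` are introduced here for illustration, with the labels built the same way as the space-joined `dataset_pairs` coordinate used later.

```python
import itertools
import math

names = ['SSEBop', 'GLEAM', 'ERA5', 'NLDAS', 'TerraClimate', 'WBET']

# All unordered pairs, in the order itertools generates them
pairs = list(itertools.combinations(names, 2))
n_pairs = len(pairs)

# Closed-form count: n! / (2! * (n - 2)!)
expected = math.comb(len(names), 2)

# Space-joined labels; split() recovers the two names because none contain spaces
pair_labels = [' '.join(p) for p in pairs]
first, second = pair_labels[0].split()
print(n_pairs, expected, first, second)
```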

Now that we have our data set combinations, let's compute the relative biases!

In [5]:
ds_diff = []
for combo in combos:
    ds_combo = ds.sel(dataset_name=combo)

    # diff() subtracts along dataset_name as later minus earlier, i.e., combo[1] - combo[0]
    da_combo_diff = ds_combo.aet.diff('dataset_name')
    da_combo_diff = da_combo_diff.squeeze('dataset_name').drop_vars('dataset_name')

    ds_combo_diff = xr.Dataset(data_vars={'difference': da_combo_diff},
                               coords={'dataset_pairs': [' '.join(combo)],
                                       'time': ds.time, 'lat': ds.lat, 'lon': ds.lon})
    ds_diff.append(ds_combo_diff)

ds_diff = xr.concat(ds_diff, dim='dataset_pairs')

ds_diff.difference.attrs['description'] = 'Difference between two data sets listed in dataset_pairs'
ds_diff.dataset_pairs.attrs['description'] = 'Dataset pairs used in difference.'
ds_diff.difference.attrs['units'] = 'mm.month-1'

ds_diff
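
Two properties of this computation are worth checking on a small example: the direction of the subtraction and the propagation of missing values. The sketch below uses a made-up two-member `DataArray` named `toy`, which is an assumption for illustration only. It shows that `diff` returns the second entry minus the first and that a NaN in either member yields a NaN difference.

```python
import numpy as np
import xarray as xr

# Toy data: two "data sets" (A, B) over three time steps; A is missing the last step
toy = xr.DataArray(
    [[10.0, 12.0, np.nan],
     [7.0, 15.0, 9.0]],
    dims=('dataset_name', 'time'),
    coords={'dataset_name': ['A', 'B']},
)

# diff along dataset_name gives B - A; squeeze drops the length-1 dimension
toy_diff = toy.diff('dataset_name').squeeze('dataset_name', drop=True)
print(toy_diff.values)
```

The first two values are `B - A`, and the last is NaN because `A` has no data at that time step.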

Now, let's see how the resulting biases look.

In [6]:
plt = ds_diff.difference.hvplot(groupby=['dataset_pairs', 'time'], geo=True, coastline=True,
                                clim=(-75, 75), cmap='PuOr').opts(frame_width=500)

pn.panel(plt, widget_location='top')

## Relative Bias Discussion

Looking at the biases, we can see a large temporal variation in each estimate. However, while being able to check this relative bias temporally is in itself interesting, our goal is to compare the bias with the error variances estimated from TC. Therefore, we need a single bias product that does not vary with time. To that end, we will temporally average the bias estimates and use these averages to compare with the errors.
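
As a minimal sketch of the per-pixel statistics involved, consider a made-up difference series with two missing months. The array `toy_diff` is a hypothetical example, and plain NumPy NaN-aware functions stand in here for the `skipna=True` reductions that xarray applies along `time`.

```python
import numpy as np

# Hypothetical monthly differences at one pixel (mm/month); NaN marks missing months
toy_diff = np.array([4.0, -2.0, np.nan, 6.0, 0.0, np.nan])

# NaN-aware statistics over the finite values only
mean_bias = np.nanmean(toy_diff)
median_bias = np.nanmedian(toy_diff)
std_bias = np.nanstd(toy_diff, ddof=1)  # sample standard deviation

# Number of finite time steps that went into the statistics
count = int(np.isfinite(toy_diff).sum())
print(mean_bias, median_bias, std_bias, count)
```

Keeping the count alongside the averages makes it possible to tell apart pixels whose statistics rest on many months from those that rest on only a few.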

In [7]:
# Ignore RuntimeWarnings, such as those raised when reducing pixels with no finite values
warnings.filterwarnings("ignore", category=RuntimeWarning)

# Create list of seasons
seasons = ['All'] + list(np.unique(ds.time.dt.season))

ds_mean_bias = []
ds_median_bias = []
ds_std_bias = []
ds_count_bias = []
for season in seasons:
    if season == 'All':
        ds_season = ds_diff
    else:
        ds_season = ds_diff.isel(time=(ds.time.dt.season == season))

    mean_bias = ds_season.difference.mean(dim='time', skipna=True, keep_attrs=True).expand_dims(season=[season])
    mean_bias.name = 'mean_bias'
    mean_bias.attrs['description'] = 'Mean bias estimate for all common time steps between data sets.'
    mean_bias.attrs['units'] = 'mm.month-1'
    ds_mean_bias.append(mean_bias)

    median_bias = ds_season.difference.median(dim='time', skipna=True, keep_attrs=True).expand_dims(season=[season])
    median_bias.name = 'median_bias'
    median_bias.attrs['description'] = 'Median bias estimate for all common time steps between data sets.'
    median_bias.attrs['units'] = 'mm.month-1'
    ds_median_bias.append(median_bias)

    std_bias = ds_season.difference.std(dim='time', ddof=1, skipna=True, keep_attrs=True).expand_dims(season=[season])
    std_bias.name = 'std_bias'
    std_bias.attrs['description'] = 'Standard deviation of the bias estimates for all common time steps between data sets.'
    std_bias.attrs['units'] = 'mm.month-1'
    ds_std_bias.append(std_bias)

    count_bias = np.isfinite(ds_season.difference).sum(dim='time').expand_dims(season=[season])
    count_bias.name = 'Counts'
    count_bias.attrs['description'] = ('Number of time steps used in the average bias '
                                       'estimates (i.e., number of finite time values in a given pixel).')
    count_bias.attrs['units'] = 'counts'
    ds_count_bias.append(count_bias)

ds_mean_bias = xr.concat(ds_mean_bias, dim='season')
ds_median_bias = xr.concat(ds_median_bias, dim='season')
ds_std_bias = xr.concat(ds_std_bias, dim='season')
ds_count_bias = xr.concat(ds_count_bias, dim='season')

# Compile these DataSets into one and save
bias_averages = xr.merge([ds_mean_bias, ds_median_bias, ds_std_bias, ds_count_bias], join='exact')
_ = bias_averages.to_netcdf(path='../Data/compiled_avg_bias.nc', format='NETCDF4', engine='netcdf4')

Now that we have our average biases, let's generate some plots that show how the biases compare to the error standard deviations. If we see that the bias is commonly much lower than the errors, then it indicates that the uncertainty in ET data sets is more important than the relative biases between them. Conversely, if the bias is larger, then the choice of ET data set could have implications for, and propagate biases into, products modeled from the ET data.

To show this comparison clearly, we will use a scaled fractional difference between the error and the absolute value of the bias. This follows the simple formula:

$$\textrm{frac\_diff} = \frac{\textrm{error} - |\textrm{bias}|}{\textrm{error} + |\textrm{bias}|}.$$

Because both terms are non-negative, this scales the difference to lie between -1 and 1, where negative values indicate that the bias is larger than the error and positive values indicate that the error is larger.
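
The formula can be sketched as a small function. The helper name `scaled_frac_diff` and the input values are assumptions made for illustration. The bias is taken in absolute value, as in the plotting code that follows, so the result depends only on the magnitude of the bias.

```python
import numpy as np

def scaled_frac_diff(error, bias):
    """Scaled fractional difference between an error and the absolute bias."""
    error = np.asarray(error, dtype=float)
    abs_bias = np.abs(np.asarray(bias, dtype=float))
    return (error - abs_bias) / (error + abs_bias)

# Three hypothetical pixels: no bias, bias equal to the error, bias three times the error
vals = scaled_frac_diff([10.0, 5.0, 5.0], [0.0, -5.0, 15.0])
print(vals)
```

A value of 1 means the bias is zero, 0 means the error and the absolute bias are equal, and values approaching -1 mean the bias dominates. If both the error and the bias are zero, the ratio is undefined and evaluates to NaN.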

In [9]:
tc_est_averages = xr.open_dataset('../Data/compiled_TC_avg_errs.nc', engine='netcdf4')

def median_diff_plots(dataset_pairs='SSEBop GLEAM', season='All'):
    # Names of the two data sets in the selected pair
    name1, name2 = dataset_pairs.split()

    tc_avg_season = tc_est_averages.sel(season=season)
    bias_avg_season = bias_averages.sel(season=season)

    # The difference was computed with diff(), which gives name2 - name1
    ds_median = bias_avg_season.median_bias.sel(dataset_pairs=dataset_pairs)
    ds_median_abs = abs(ds_median)
    ds_median_abs.name = 'absolute difference'
    ds1_error = tc_avg_season.median_error.sel(dataset_name=name1)
    ds2_error = tc_avg_season.median_error.sel(dataset_name=name2)
    # Scaled fractional difference between each median error and the absolute bias
    ds1_bias_var_diff = (ds1_error - ds_median_abs)/(ds1_error + ds_median_abs)
    ds2_bias_var_diff = (ds2_error - ds_median_abs)/(ds2_error + ds_median_abs)

    plt = (ds_median.hvplot(geo=True, coastline=True, clim=(-50, 50), cmap='PuOr',
                            title=f'Median Difference of {name2} - {name1} (Bias)').opts(frame_width=500)
           + ds_median_abs.hvplot(geo=True, coastline=True, clim=(0, 50), cmap='Purples',
                                  title=f'Median Absolute Difference of {name2} - {name1} (Absolute Bias)').opts(frame_width=500)
           + ds1_bias_var_diff.hvplot(geo=True, coastline=True, clim=(-1, 1), cmap='PuOr',
                                      title=f'Scaled Fractional Difference of Median Error of {name1} and Absolute Bias').opts(frame_width=500)
           + ds2_bias_var_diff.hvplot(geo=True, coastline=True, clim=(-1, 1), cmap='PuOr',
                                      title=f'Scaled Fractional Difference of Median Error of {name2} and Absolute Bias').opts(frame_width=500))

    return plt.cols(2)

# Widgets for selecting the data set pair and the season
dataset_pairs_widget = pn.widgets.Select(name="dataset_pairs", value="SSEBop GLEAM", options=list(ds_diff.dataset_pairs.data))
season_widget = pn.widgets.Select(name="season", value="All", options=['All', 'DJF', 'MAM', 'JJA', 'SON'])

bound_plot = pn.bind(median_diff_plots, dataset_pairs=dataset_pairs_widget, season=season_widget)

pn.Column(dataset_pairs_widget, season_widget, bound_plot)

From these results, we can notice several things. One is the general trends in average bias. When looking at all the pairs, we can see that SSEBop typically estimates larger ET values compared to all other data sets except in the Southeast, whereas ERA5 typically has lower ET estimates overall. This general comparison can help us understand the overall spatial differences between the six data sets. While we could continue to check this out in more detail, we want to focus on the comparison between the bias and the error standard deviations. One thing to reiterate when comparing the biases to their error estimates is that the errors are likely lower limits on the uncertainty and the bias is a median estimate. Therefore, if we see that the errors are larger than the bias, this is likely true for the majority of the monthly ET data (i.e., 50% of the monthly relative biases are below the median). So, looking at the scaled fractional differences between the errors and the bias for each data set, we can see that:

1. SSEBop - The errors are typically larger than the bias when compared to all data sets in the central US, whereas the bias seems to be larger in the Rocky Mountain region. As for the eastern US, depending on the data set pair, the bias or the error could be larger. Looking at seasonal data, the bias is almost always larger throughout CONUS, except in the fall, when the errors can be larger in the central US.
2. GLEAM - The errors are typically larger than the bias when compared to all data sets in the western US (excluding the Pacific Northwest), whereas the bias is generally larger in the eastern half of CONUS. Looking at seasonal data, there are no general trends of the bias or uncertainty being larger in any season besides winter, as the visual trends are quite noisy. For winter, the bias is clearly larger for most of CONUS.
3. ERA5 - The errors are typically larger than the bias for all data sets throughout CONUS. Some data set pairs show the bias being larger near coastal regions, but not along the entire CONUS coast. As for seasonal trends, the bias is larger throughout for both winter and spring, but summer and fall show the errors are larger in the southern US for most data set pairs and sometimes larger in the northern US.
4. NLDAS - The bias is larger for the eastern US, especially the southeastern US. Otherwise, the errors are typically larger in the central and western US. In terms of seasonal variation, the bias dominates in the winter and spring. During the summer, the southwest and central US can have larger errors compared to the bias. The same goes for fall, although the bias typically dominates more than in the summer.
5. TerraClimate - The errors are larger than the bias for all of CONUS except for the southeast and the coastal Pacific Northwest. When looking at the seasons, the errors dominate throughout all seasons in the central US and Great Plains. In summer and fall, the errors also dominate the bias in the southwest. Besides these, the bias dominates, especially in the eastern US.
6. WBET - On average, the errors dominate the bias, but there is some variation between data sets. For the seasons, the bias is larger for the winter and spring months, excluding the north central US. For the summer and fall, the errors can be larger depending on the data set pair, with all but the SSEBop and TerraClimate pairs showing larger errors throughout most of CONUS.

From these summaries, we can draw a few conclusions about the comparison between the bias and uncertainty. In general, most data sets do well when looking at all the data. Each has certain regions where the bias relative to all other data sets is larger than the given data set's uncertainty. These regions indicate one of two things about the data set: either 1) the lack of consensus with the other data sets shows the ET data set is likely truly biased in these regions, or 2) the uncertainty is underestimated in these regions. Either way, having a commonly larger bias compared to the uncertainty indicates that the ET data set is not performing optimally in these regions. As for seasonality, winter normally has a larger bias compared to the uncertainty. However, this is to be expected, as the winter uncertainties should be smaller than in other seasons because winter typically has low to no ET. For the other seasons, whether the bias or the uncertainty is larger varies with the data set and the corresponding bias pair. Therefore, broad conclusions for these months are harder to draw. Overall though, these bias and error comparisons can help us better understand the strengths and commonalities of each data set.