A demonstration of the MaskMaker class to build and use regional masking

MaskMaker is a class of methods to assist with making regional masks within COAsT.
Presently, the generated masks are held externally to the MaskMaker object.
Masks are constructed as gridded boolean numpy arrays, one per region, which are stacked along a `dim_mask` dimension.
The mask arrays are generated on a supplied horizontal grid and are then stored in xarray objects along with the region names.

Examples are given working with Gridded and Profile data.

### Relevant imports and filepath configuration

In [None]:
import coast
import numpy as np
from os import path
import matplotlib.pyplot as plt
import matplotlib.colors as colors  # colormap fiddling
import xarray as xr

In [None]:
# Set some paths (relative to the directory the notebook is run from)
root = "./"
dn_files = root + "example_files/"
fn_nemo_grid_t_dat = dn_files + "nemo_data_T_grid_Aug2015.nc"
fn_nemo_dom = dn_files + "coast_example_nemo_domain.nc"
config_t = root + "config/example_nemo_grid_t.json"

### Loading data

In [None]:
# Create a Gridded object and load in the data:
nemo = coast.Gridded(fn_nemo_grid_t_dat, fn_nemo_dom, config=config_t)

# Initialise MaskMaker and define target grid


In [None]:
mm = coast.MaskMaker()

# Define Regional Masks
regional_masks = []

# Define convenient aliases based on nemo data
lon = nemo.dataset.longitude.values
lat = nemo.dataset.latitude.values
bathy = nemo.dataset.bathymetry.values


# Use MaskMaker to define new regions

MaskMaker can build a stack of boolean masks in an xarray dataset for regional analysis. Regions can be defined by passing vertex coordinates to the `make_region_from_vertices` method. The grid longitude and latitude can be passed as xarray objects or as numpy arrays, as the example below shows.
The method returns a numpy array of booleans.

In [None]:
# Draw and fill a square
vertices_lon = [-5, -5, 5, 5]
vertices_lat = [40, 60, 60, 40]

# input lat/lon as xr.DataArray
filled1 = mm.make_region_from_vertices(nemo.dataset.longitude, nemo.dataset.latitude, vertices_lon, vertices_lat)
# input lat/lon as np.ndarray
filled2 = mm.make_region_from_vertices(
    nemo.dataset.longitude.values, nemo.dataset.latitude.values, vertices_lon, vertices_lat
)

check = (filled1 == filled2).all()
print(f"numpy array outputs are the same? {check}")
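
To make the idea of filling a polygon on a grid concrete, here is a minimal sketch of a point-in-polygon test using even-odd ray casting in plain numpy. It is an illustration only and is not claimed to be how `make_region_from_vertices` is implemented; the helper name `polygon_mask` and the synthetic regular grid are assumptions made for this example.

```python
import numpy as np


def polygon_mask(lon, lat, vertices_lon, vertices_lat):
    """Hypothetical helper: even-odd ray casting on a 2D lon/lat grid."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    vx = np.asarray(vertices_lon, dtype=float)
    vy = np.asarray(vertices_lat, dtype=float)
    inside = np.zeros(lon.shape, dtype=bool)
    n = len(vx)
    for i in range(n):
        x0, y0 = vx[i], vy[i]
        x1, y1 = vx[(i + 1) % n], vy[(i + 1) % n]
        if y0 == y1:
            continue  # a horizontal edge cannot cross a horizontal ray
        # Grid points whose latitude lies between the two edge end points
        straddles = (y0 > lat) != (y1 > lat)
        # Longitude at which the edge crosses each grid point's latitude
        x_cross = x0 + (lat - y0) * (x1 - x0) / (y1 - y0)
        # Toggle the flag for every crossing to the east of the grid point
        inside ^= straddles & (lon < x_cross)
    return inside


# Synthetic 1-degree grid (an assumption; the notebook uses the NEMO grid)
lon, lat = np.meshgrid(np.arange(-10.0, 11.0, 1.0), np.arange(35.0, 66.0, 1.0))
filled = polygon_mask(lon, lat, [-5, -5, 5, 5], [40, 60, 60, 40])
print(filled.shape, filled.dtype)
```

The result has the same shape as the input grid, which is what allows it to be stacked with other regional masks later on.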

The boolean numpy array can then be converted to an xarray object using `make_mask_dataset()` for improved interactions with other xarray objects. 

In [None]:
mask_xr = mm.make_mask_dataset(nemo.dataset.longitude.values, nemo.dataset.latitude.values, filled1)

# Use MaskMaker for predefined regions

A number of predefined regions are available for the NWS. Each is returned as a numpy boolean array computed from the supplied longitude, latitude and bathymetry. The arrays can be appended to a list, which can then be converted into an xarray object in the same way as before.
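
As a rough illustration of what a region defined by longitude, latitude and bathymetry might look like, the sketch below builds a boolean mask from simple threshold tests. The function name `region_def_shallow_box` and all of the threshold values are hypothetical and are not taken from COAsT.

```python
import numpy as np


def region_def_shallow_box(lon, lat, bathy):
    """Hypothetical region: water shallower than 200 m inside a lon/lat box."""
    return (
        (bathy > 0) & (bathy < 200)   # wet points on the shelf (invented limits)
        & (lon > -4) & (lon < 8)      # invented longitude bounds
        & (lat > 51) & (lat < 60)     # invented latitude bounds
    )


# Tiny synthetic grid, 2 rows by 3 columns
lon = np.array([[-6.0, 0.0, 6.0], [-6.0, 0.0, 6.0]])
lat = np.array([[50.0, 50.0, 50.0], [55.0, 55.0, 55.0]])
bathy = np.array([[100.0, 100.0, 100.0], [100.0, 150.0, 500.0]])

mask = region_def_shallow_box(lon, lat, bathy)
print(mask)
```

Only the central point of the second row passes all three tests, so it is the only `True` entry.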

In [None]:
masks_list = []

# Add regional mask for whole domain
masks_list.append(np.ones(lon.shape))

# Add predefined regional masks
masks_list.append(mm.region_def_nws_north_sea(lon, lat, bathy))
masks_list.append(mm.region_def_nws_outer_shelf(lon, lat, bathy))
masks_list.append(mm.region_def_nws_norwegian_trench(lon, lat, bathy))
masks_list.append(mm.region_def_nws_english_channel(lon, lat, bathy))
masks_list.append(mm.region_def_south_north_sea(lon, lat, bathy))
masks_list.append(mm.region_def_off_shelf(lon, lat, bathy))
masks_list.append(mm.region_def_irish_sea(lon, lat, bathy))
masks_list.append(mm.region_def_kattegat(lon, lat, bathy))

masks_names = ["whole domain", "north sea", "outer shelf", "norwegian trench",
                "english_channel", "southern north sea", "off shelf",
                "irish sea", "kattegat",]

As before, the numpy arrays (here as a list) can be converted into an xarray dataset in which each mask is separated along the `dim_mask` dimension.

In [None]:
mask_xr = mm.make_mask_dataset(lon, lat, masks_list, masks_names)

In [None]:
# Inspect mask xarray object structure
mask_xr
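
For readers without the example files, the following sketch builds a dataset with a similar layout by hand using xarray. The variable and coordinate names (`mask`, `region_names`, `dim_mask`, `y_dim`, `x_dim`) follow the usage elsewhere in this notebook, but the exact structure produced by `make_mask_dataset()` is an assumption here.

```python
import numpy as np
import xarray as xr

# Small synthetic grid and two toy regions
lon, lat = np.meshgrid(np.linspace(-10, 10, 5), np.linspace(40, 60, 4))
masks = [np.ones(lon.shape, dtype=bool), lon < 0]
names = ["whole domain", "west"]

# Stack the 2D masks along a leading dim_mask dimension
ds = xr.Dataset(
    data_vars={"mask": (("dim_mask", "y_dim", "x_dim"), np.stack(masks))},
    coords={
        "longitude": (("y_dim", "x_dim"), lon),
        "latitude": (("y_dim", "x_dim"), lat),
        "region_names": ("dim_mask", names),
    },
)
print(ds)
```

Keeping the region names as a coordinate on `dim_mask` means they travel with any variable derived from the mask.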

## Plot masks

Inspect the mask with the `quick_plot()` method.

In [None]:
mm.quick_plot(mask_xr)


NB: overlapping regions are not given special treatment; the layers are simply superimposed on each other. This is demonstrated by "Norwegian Trench" and "off shelf", or by "whole domain" and any other region.

In [None]:
plt.subplot(2,2,1)
mm.quick_plot(mask_xr.sel(dim_mask=[0,3]))

plt.subplot(2,2,2)
mm.quick_plot(mask_xr.sel(dim_mask=[1,2,4,5,6,7,8]))

plt.tight_layout()

In [None]:
# Show overlap
mask_xr.mask.sum(dim='dim_mask').plot(levels=(1,2,3,4))

# Save if required
#plt.savefig('tmp.png')
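
The overlap count plotted above is simply the number of regions that claim each grid cell. A minimal numpy sketch of the same idea, using three invented toy regions on a synthetic grid:

```python
import numpy as np

# Synthetic grid: 3 rows (latitude) by 5 columns (longitude)
lon, lat = np.meshgrid(np.arange(-4.0, 5.0, 2.0), np.arange(50.0, 55.0, 2.0))

# Three toy regions stacked along a leading "dim_mask" axis
masks = np.stack([
    np.ones(lon.shape, dtype=bool),  # whole domain
    lon < 0,                         # west
    lat > 51,                        # north
])

# Summing booleans counts how many regions cover each cell
count = masks.sum(axis=0)
overlap = count > 1
print(count)
```

Cells with a count above 1 belong to more than one region, so any statistic summed across regions would count them more than once.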

# Regional analysis with Profile data

Apply the regional masks to compute regional mean profiles.

In [None]:
# Read EN4 data into profile object
fn_prof = path.join(dn_files, "coast_example_en4_201008.nc")
fn_cfg_prof = path.join("config","example_en4_profiles.json")
profile = coast.Profile(config=fn_cfg_prof)
profile.read_en4( fn_prof )


Then we use `ProfileAnalysis.determine_mask_indices()` to figure out which profiles in a Profile object lie within each regional mask:

In [None]:
analysis = coast.ProfileAnalysis()
mask_indices = analysis.determine_mask_indices(profile, mask_xr)
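
One way to picture this step is a nearest-grid-point lookup: each profile position is matched to a grid cell, and the mask value at that cell decides membership. The sketch below does this on a regular synthetic grid with invented profile positions; it is not claimed to be the algorithm used by `determine_mask_indices()`.

```python
import numpy as np

# Regular synthetic grid (an assumption; model grids may be curvilinear)
lon_1d = np.arange(-10.0, 11.0, 1.0)
lat_1d = np.arange(40.0, 61.0, 1.0)
lon, lat = np.meshgrid(lon_1d, lat_1d)

# Two toy regions stacked as (dim_mask, y, x)
masks = np.stack([np.ones(lon.shape, dtype=bool), lon < 0])

# Invented profile positions
prof_lon = np.array([-7.2, 3.4, 0.4])
prof_lat = np.array([50.1, 55.6, 44.9])

# Nearest grid index for each profile along each axis
ix = np.abs(lon_1d[None, :] - prof_lon[:, None]).argmin(axis=1)
iy = np.abs(lat_1d[None, :] - prof_lat[:, None]).argmin(axis=1)

# Boolean membership table with shape (dim_mask, n_profiles)
in_region = masks[:, iy, ix]
print(in_region)
```

Each row of `in_region` selects the profiles that fall inside one region, which is the information a regional average needs.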

This returns an object, here called `mask_indices`, which must be passed to `ProfileAnalysis.mask_means()`. This routine returns a new xarray dataset containing averaged data for each region:

In [None]:
profile_mask_means = analysis.mask_means(profile, mask_indices)

This routine operates over all variables in the `profile` object. It calculates means by region preserving depth information (`profile_mean_*`) and also averaging over depth information (`all_mean_*`). The variables are returned with these prefixes accordingly. 

In [None]:
profile_mask_means
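
The difference between the two kinds of mean can be shown with a few made-up numbers. In this sketch the depth-preserving mean averages over profiles at each depth level, while the overall mean averages over profiles and depth together; the values and the region membership are invented, and the exact conventions used by `mask_means()` are assumed rather than confirmed.

```python
import numpy as np

# Toy temperatures: rows are profiles, columns are depth levels
temperature = np.array([
    [10.0, 8.0, np.nan],   # shallow profile with no value at the deepest level
    [12.0, 9.0, 6.0],
    [20.0, 18.0, 15.0],
])

# Invented membership: only the first two profiles lie inside the region
in_region = np.array([True, True, False])
region_profiles = temperature[in_region]

# Mean over profiles at each depth level (depth information preserved)
profile_mean = np.nanmean(region_profiles, axis=0)

# Mean over profiles and depth together (a single number for the region)
all_mean = np.nanmean(region_profiles)

print(profile_mean, all_mean)
```

A region with no profiles at all has nothing to average, which is one reason the number of regions in the output can be smaller than in the mask.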

Notice that the size of the `dim_mask` dimension is not necessarily preserved between the mask and the mask-averaged variables. This happens if, for example, there are no profiles in one of the mask regions.

In [None]:
check1 = mask_indices.sizes["dim_mask"] == profile_mask_means.sizes["dim_mask"]
print(check1)

The mean profiles can be visualised or further processed. Notice that the Irish Sea region is missing because there were no profiles there in the example dataset.

In [None]:
for count_region in range(profile_mask_means.sizes["dim_mask"]):
    plt.plot(
        profile_mask_means.profile_mean_temperature.isel(dim_mask=count_region),
        profile_mask_means.profile_mean_depth.isel(dim_mask=count_region),
        label=profile_mask_means.region_names[count_region].values,
        marker=".",
        linestyle="none",
    )

plt.ylim([10, 1000])
plt.yscale("log")
plt.gca().invert_yaxis()
plt.xlabel("temperature")
plt.ylabel("depth")
plt.legend()


# Regional analysis with Gridded data

Apply the regional masks to average SST. This is done manually as there are not yet COAsT methods to broadcast the operations across all variables.

In [None]:
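# Syntax: xr.where(<condition>, <value where True>, <value where False>)
# np.nan is used because the np.NaN alias was removed in NumPy 2.0
mask_SST = xr.where(mask_xr.mask, nemo.dataset.temperature.isel(z_dim=0), np.nan)

# Take the mean over space for each region.
# Averaging over both dimensions in one call weights every valid cell equally.
mask_mean_SST = mask_SST.mean(dim=["x_dim", "y_dim"])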
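
The masked average can be reproduced with plain numpy on a toy field. The sketch below also shows why the order of averaging matters once cells have been masked out: a single mean over both axes treats every valid cell equally, whereas averaging one axis and then the other gives each row equal weight regardless of how many valid cells it holds. All numbers are invented.

```python
import numpy as np

# Toy SST field and a region mask on a 2 x 3 grid
sst = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
mask = np.array([[True, True, True], [True, False, False]])

# Keep values inside the region, set everything else to NaN
masked = np.where(mask, sst, np.nan)

# Mean over both axes at once: every valid cell has equal weight
joint_mean = np.nanmean(masked)

# Mean along x and then along y: every row has equal weight
chained_mean = np.nanmean(np.nanmean(masked, axis=1))

print(joint_mean, chained_mean)
```

Neither version accounts for varying grid cell area; an area-weighted mean would need the cell sizes as well.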

In [None]:
# Inspect the processed data
mask_mean_SST.plot()

In [None]:
# Plot timeseries per region

for count_region in range(mask_mean_SST.sizes["dim_mask"]):
    plt.plot(
        mask_mean_SST.isel(dim_mask=count_region),
        label=mask_mean_SST.region_names[count_region].values,
        marker=".",
        linestyle="none",
    )

plt.xlabel("time")
plt.ylabel("SST")
plt.legend()