## Adding Spatial Metadata to AORC Forcing

**Authors**: Tony Castronova <acastronova@cuahsi.org>, Irene Garousi-Nejad <igarousi@cuahsi.org>  
**Last Updated**: 03.31.2023

**Description**:  

This notebook demonstrates how to add spatial metadata to the AORC v1.0 forcing data stored on HydroShare's THREDDS server. The original AORC v1.0 data contains the `west_east` and `south_north` dimensions, which only allow us to slice the gridded data by `x` and `y` indices. Additional spatial metadata (e.g. coordinate values and a coordinate reference system) is needed to enable spatial querying and visualization of these data. This notebook demonstrates one method for adding it.

**Software Requirements**

This notebook was developed using the following software and operating system versions.

OS: macOS Ventura 13.0.1  
Python: 3.10.0  
Zarr: 2.13.2  
NetCDF4: 1.6.1  
xarray: 0.17.0  
fsspec: 0.8.7  
dask: 2021.3.0  
numpy: 1.24.1  
rioxarray: 0.13.3

---

In [None]:
import re
import numpy
import xarray
import rioxarray  # importing rioxarray registers the `.rio` accessor used below
import matplotlib.pyplot as plt

Load the AORC v1.0 data from HydroShare's THREDDS server.

In [None]:
# load a single month of data (file 201001.nc), split into dask chunks
ds_aorc = xarray.open_dataset('http://thredds.hydroshare.org/thredds/dodsC/aorc/data/16/201001.nc',
                              chunks={'Time': 10, 'west_east': 285, 'south_north': 275},
                              decode_coords="all")

Notice that the `south_north` and `west_east` dimensions hold only integer indices; the dataset has no coordinate variables that supply spatial values for them.

In [None]:
ds_aorc

In [None]:
ds_aorc.south_north

Load the geospatial metadata for NWM v2.0 that is stored in HydroShare. The `WRF_Hydro_NWM_geospatial_data_template_land_GIS.nc` file is part of the NWM v2.0 domain files and contains spatial metadata that we can add to the AORC dataset. It is also accessible through HydroShare's THREDDS server.

https://www.hydroshare.org/resource/2a8a3566e1c84b8eb3871f30841a3855/

In [None]:
ds_meta = xarray.open_dataset('http://thredds.hydroshare.org/thredds/dodsC/hydroshare/resources/2a8a3566e1c84b8eb3871f30841a3855/data/contents/WRF_Hydro_NWM_geospatial_data_template_land_GIS.nc')
ds_meta

The AORC v1.0 data that we're using only covers the Great Basin, whereas `ds_meta` covers the entire CONUS. We'll use the offsets recorded in the `history` attribute of the AORC v1.0 dataset to subset the `ds_meta` coordinates.

In [None]:
def pattern_lookup(pattern, s):
    """Return the substring of `s` matched by `pattern`."""

    # search the string for the first occurrence of the pattern
    match = re.search(pattern, s)

    # fail loudly if nothing matched, since later cells call .split() on the result
    if match is None:
        raise ValueError(f'No match found for pattern: {pattern}')

    # return the full matched text, e.g. 'west_east,<start>,<stop>'
    return match.group(0)

In [None]:
# define regular expression patterns that match '<dimension>,<start>,<stop>'
pattern_we = r'west_east,(\d+),(\d+)'
pattern_sn = r'south_north,(\d+),(\d+)'

# find the subset offsets recorded in the AORC history attribute
GSL_westeast = pattern_lookup(pattern_we, ds_aorc.attrs['history'])
GSL_southnorth = pattern_lookup(pattern_sn, ds_aorc.attrs['history'])

# drop the dimension name, keeping the [start, stop] offsets as strings
y_index = GSL_southnorth.split(',')[1:]
x_index = GSL_westeast.split(',')[1:]
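To illustrate what the parsing step in the cell above extracts, here is a minimal sketch that applies the same kind of pattern to a made-up history string. The string, the offsets, and the `parse_offsets` helper are all hypothetical; the real `history` attribute on `ds_aorc` may be formatted differently.

```python
import re

# Hypothetical history string (an assumption, not the real AORC attribute).
history = "ncks -d west_east,100,384 -d south_north,200,474 in.nc out.nc"

def parse_offsets(dim, text):
    """Return the (start, stop) integer offsets recorded for `dim` in `text`."""
    match = re.search(rf"{dim},(\d+),(\d+)", text)
    if match is None:
        raise ValueError(f"no offsets found for {dim!r}")
    # group(1) and group(2) are the two captured digit runs
    return int(match.group(1)), int(match.group(2))

x_start, x_stop = parse_offsets("west_east", history)
y_start, y_stop = parse_offsets("south_north", history)
print(x_start, x_stop, y_start, y_stop)
```

Using the capture groups directly returns integers, which avoids the later `split(',')` and `int()` conversions; whether that is preferable here is a matter of taste.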

In [None]:
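# select the x, y values from ds_meta that correspond to the subset indices in ds_aorc
leny = len(ds_meta.y)

# x offsets index ds_meta.x directly; the stop offset is inclusive, hence the + 1
x = ds_meta.x[int(x_index[0]) : int(x_index[1]) + 1].values

# y offsets are counted from the opposite end of ds_meta.y, so they are subtracted from its length
y = ds_meta.y[leny - int(y_index[1]) - 1 : leny - int(y_index[0])].values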
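The `y` slice above counts offsets from the end of the array. The sketch below reproduces that arithmetic on a small toy array so the result can be checked by hand. It assumes the full `y` array is stored in descending order (north to south) while the offsets are counted from the south; that ordering is an assumption made for illustration, and all values are made up.

```python
import numpy as np

# Toy stand-in for ds_meta.y: 10 rows stored in descending order (assumption).
y_full = np.arange(9000.0, -1000.0, -1000.0)  # 9000, 8000, ..., 0
leny = len(y_full)

# Hypothetical inclusive offsets, counted from the southern edge (row 0 = 0 m).
y_start, y_stop = 2, 4

# Same arithmetic as the notebook cell: flip the offsets against the array length.
y_subset = y_full[leny - y_stop - 1 : leny - y_start]
print(y_subset)
```

The slice keeps `y_stop - y_start + 1` rows, here the rows at 4000, 3000, and 2000 m, in the same order as they are stored in the full array.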

Add these values to the AORC v1.0 dataset as `x` and `y` coordinates.

In [None]:
# rename the existing dimensions so they are CF compliant
ds_aorc = ds_aorc.rename_dims(south_north='y', west_east='x', Time='time')

In [None]:
# add these x, y values to the AORC dataset
ds_aorc = ds_aorc.assign_coords(y=y)
ds_aorc = ds_aorc.assign_coords(x=x)
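Once coordinate values are attached, data can be selected by coordinate value rather than by index. The sketch below shows the same rename-then-assign sequence on a small toy dataset. The variable name `temperature`, the grid size, and the 1 km spacing are placeholders chosen for illustration, not values taken from the AORC data.

```python
import numpy as np
import xarray as xr

# Toy dataset with index-only dimensions, mimicking the AORC layout (made-up values).
ds = xr.Dataset(
    {"temperature": (("Time", "south_north", "west_east"), np.zeros((2, 3, 4)))}
)

# Rename the dimensions, then attach coordinate values (hypothetical 1 km spacing).
ds = ds.rename_dims(south_north="y", west_east="x", Time="time")
ds = ds.assign_coords(
    x=np.arange(4) * 1000.0,
    y=np.arange(3) * 1000.0,
)

# Label-based selection now works; method="nearest" snaps to the closest cell.
cell = ds.temperature.sel(x=1400.0, y=2100.0, method="nearest")
print(float(cell.x), float(cell.y))
```

Before the coordinates are assigned, the same `sel` call could only match integer positions, because dimensions without coordinate variables fall back to positional indexing.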

Add the WRF-Hydro coordinate reference system to AORC v1.0. This `WKT` string can be found within the WRF-Hydro `geo_em.d01_1km.nc` file. Here we read it from the `spatial_ref` attribute of the `crs` variable in `ds_meta`.

In [None]:
# write the CRS to the in-memory dataset (no file on disk is modified)
ds_aorc.rio.write_crs(ds_meta.crs.attrs['spatial_ref'], inplace=True);

Add spatial metadata to the `x` and `y` coordinates.

In [None]:
# describe the x and y coordinates with descriptive attributes
ds_aorc.x.attrs['standard_name'] = "projection_x_coordinate"
ds_aorc.x.attrs['long_name'] = "x coordinate of projection"
ds_aorc.x.attrs['units'] = "m"
ds_aorc.x.attrs['_CoordinateAxisType'] = "GeoX"
ds_aorc.x.attrs['resolution'] = 1000.

ds_aorc.y.attrs['standard_name'] = "projection_y_coordinate"
ds_aorc.y.attrs['long_name'] = "y coordinate of projection"
ds_aorc.y.attrs['units'] = "m"
ds_aorc.y.attrs['_CoordinateAxisType'] = "GeoY"
ds_aorc.y.attrs['resolution'] = 1000.

In [None]:
ds_aorc