# Harmony Regridding Service (HRS) [regridder] regression tests

This Jupyter notebook runs a suite of regression tests against the Harmony Regridding Service.

These tests use a MERRA-2 hourly collection of time-averaged data ([M2T1NXSLV](https://cmr.earthdata.nasa.gov/search/concepts/C1276812863-GES_DISC.html)) and the ICESat-2/ATL16 Weekly Gridded Atmosphere collection ([ATL16](https://cmr.earthdata.nasa.gov/search/concepts/C2153575232-NSIDC_CPRD.html)), available from GES DISC and NSIDC, respectively.


## Set the Harmony environment:

The cell below sets the `harmony_host_url` to one of the following valid values:

* Production: <https://harmony.earthdata.nasa.gov>
* UAT: <https://harmony.uat.earthdata.nasa.gov>
* SIT: <https://harmony.sit.earthdata.nasa.gov>
* Local: <http://localhost:3000>

The default value is for the UAT environment. There are two ways to run this notebook against a non-default environment:

* Run this notebook in a local Jupyter notebook server and change the value of `harmony_host_url` in the cell below to the value for the environment you require from the above list.
* Use the `run_notebooks.sh` script, which requires you to declare an environment variable `HARMONY_HOST_URL`. Set that environment variable to the value above that corresponds to the environment you want to test. That environment variable will take precedence over the default value in the cell below.

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'
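
The precedence described above can be sketched as follows. This is only an illustration, assuming the override is read from the process environment; the `run_notebooks.sh` script may inject the value into the notebook differently, and `resolve_host_url` is a hypothetical helper that is not part of this repository.

```python
import os

DEFAULT_HOST_URL = 'https://harmony.uat.earthdata.nasa.gov'


def resolve_host_url(default=DEFAULT_HOST_URL, environ=None):
    """Return HARMONY_HOST_URL if it is set and non-empty, else the default."""
    # Allow a mapping to be passed in so the behaviour can be checked
    # without modifying the real process environment.
    environ = os.environ if environ is None else environ
    return environ.get('HARMONY_HOST_URL') or default


print(resolve_host_url(environ={}))
print(resolve_host_url(environ={'HARMONY_HOST_URL': 'http://localhost:3000'}))
```

With an empty environment the UAT default is returned; when `HARMONY_HOST_URL` is set, its value takes precedence.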

## Prerequisites

The dependencies for this notebook are listed in [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill-regridder`

A `.netrc` file must also be located in the `test` directory of this repository.
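
Before submitting requests, it can be useful to confirm that the `.netrc` file exists and holds credentials for the expected login host. The sketch below uses Python's standard `netrc` module; `has_netrc_entry` is a hypothetical helper, and the host name `uat.urs.earthdata.nasa.gov` is an assumption about the Earthdata Login host used by non-production environments.

```python
from netrc import netrc
from pathlib import Path


def has_netrc_entry(netrc_path, host):
    """Return True if the .netrc file at netrc_path has credentials for host."""
    path = Path(netrc_path)
    if not path.is_file():
        return False
    # authenticators() returns None when there is no entry for the host
    # (and no 'default' entry) in the file.
    return netrc(str(path)).authenticators(host) is not None


# Assumed host name; adjust to the login host for the environment under test.
print(has_netrc_entry('.netrc', 'uat.urs.earthdata.nasa.gov'))
```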

### Import required packages:

In [None]:
from pathlib import Path
from tempfile import TemporaryDirectory

import xarray as xr
from harmony import Collection, Environment, Client, Request

from utility import print_success

### Set up environment dependent variables:

This includes the Harmony `Client` object and `Collection` objects for each of the collections for which there are regression tests. The local, SIT and UAT Harmony instances all utilise resources from CMR UAT, meaning any non-production environment will use the same resources.

When adding a production entry to the dictionary below, the collection instances can be included directly in the production dictionary entry, as they do not need to be shared.

In [None]:
non_production_collection = {
    'merra_collection': Collection(id='C1245662776-EEDTEST'),
    'atl16_collection': Collection(id='C1238589498-EEDTEST')
}

non_prod_granule_data = {
    'merra_granules': ['G1245662793-EEDTEST', 'G1245662791-EEDTEST'],
    'atl16_granules': ['G1245614996-EEDTEST', 'G1245614968-EEDTEST']
}


collection_data = {
    'https://harmony.uat.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.UAT
    },
    'https://harmony.sit.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.SIT
    },
    'http://localhost:3000': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.LOCAL
    },
}

environment_information = collection_data.get(harmony_host_url)

if environment_information is not None:
    harmony_client = Client(env=environment_information['env'])
    endpoint_url = environment_information.get('endpoint_url', None)

## Test for full earth downsample of MERRA-2 data  (M2T1NXSLV)

Make a request for a single granule, with parameters chosen to create a 2.5 degree output grid (72 rows by 144 columns).

This test explicitly states the `scaleExtent`, `scaleSize`, `height`, `width` and `crs` for the Harmony request, instead of using a UMM-Grid object.
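
The `height` and `width` in the request follow from the extent and the resolution: the extent spans 360 degrees of longitude and 180 degrees of latitude, so a 2.5 degree cell size gives 144 columns and 72 rows. A minimal sketch of that arithmetic is shown here; the `grid_dimensions` helper is hypothetical and is not part of `harmony-py` or the regridding service.

```python
def grid_dimensions(scale_extent, scale_size):
    """Return (height, width) for an extent (x_min, y_min, x_max, y_max)
    and a cell size (x_resolution, y_resolution), both in degrees."""
    x_min, y_min, x_max, y_max = scale_extent
    x_res, y_res = scale_size
    width = round((x_max - x_min) / x_res)    # number of columns
    height = round((y_max - y_min) / y_res)   # number of rows
    return height, width


print(grid_dimensions((-180, -90, 180, 90), (2.5, 2.5)))  # (72, 144)
print(grid_dimensions((-180, -90, 180, 90), (5.0, 5.0)))  # (36, 72)
```

The second call gives the 36 by 72 grid used for the 5 degree ATL16 request later in this notebook.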

In [None]:
if environment_information is not None:
    merra_request = Request(collection=environment_information['merra_collection'],
                            granule_id=environment_information['merra_granules'][0],
                            interpolation='Elliptical Weighted Averaging',
                            scale_size=(2.5, 2.5), scale_extent=(-180, -90, 180, 90),
                            crs='+proj=latlong +datum=WGS84 +no_defs', height=72, width=144)
    merra_job_id = harmony_client.submit(merra_request)
    harmony_client.wait_for_processing(merra_job_id, show_progress=True)
    with TemporaryDirectory() as temp_dir:
        downloaded_grid_outputs = [
            file_future.result()
            for file_future
            in harmony_client.download_all(merra_job_id, overwrite=True, directory=temp_dir)
        ]
        reference_ds = xr.open_dataset('reference_data/MERRA2_400.tavg1_2d_slv_Nx.20210605_regridded.nc4', engine='netcdf4')
        output_ds = xr.open_dataset(Path(temp_dir, downloaded_grid_outputs[0]), engine='netcdf4')
        assert output_ds.equals(reference_ds), 'Generated data did not match reference dataset'

    print_success('Regrid MERRA2 Data Success')
else:
    print('Skipping test: regridder regression tests not configured for this environment.')
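
`Dataset.equals` only reports whether the two datasets match. When a regression test fails, it can help to see which variables differ. The following is a sketch of one way to do that with xarray, using small synthetic datasets in place of the real files; `differing_variables` is a hypothetical helper and is not part of the notebook's `utility` module.

```python
import numpy as np
import xarray as xr


def differing_variables(output_ds, reference_ds):
    """Return sorted names of data variables that are missing from one
    dataset or whose dimensions, coordinates or values differ."""
    names = set(output_ds.data_vars) | set(reference_ds.data_vars)
    return sorted(
        name for name in names
        if name not in output_ds.data_vars
        or name not in reference_ds.data_vars
        or not output_ds[name].equals(reference_ds[name])
    )


# Synthetic stand-ins for the reference file and the regridded output.
reference = xr.Dataset({'T2M': ('lat', np.array([1.0, 2.0])),
                        'PS': ('lat', np.array([3.0, 4.0]))})
output = reference.assign(PS=('lat', np.array([99.0, 4.0])))

print(differing_variables(output, reference))  # ['PS']
```

Note that `equals` ignores attributes; `Dataset.identical` also compares names and attributes if a stricter check is wanted.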

## Test for regridding of ICESat-2 ATL16 data

The ATL16 data has multiple grids stored in the file.

This test will regrid all three grids (`global`, `npolar` and `spolar`) to the same grid, explicitly stating the `scaleExtent`, `scaleSize`, `height`, `width` and `crs` for the Harmony request, instead of using a UMM-Grid object.

A small 5 degree grid is chosen to save space.

In [None]:
if environment_information is not None:
    atl16_request = Request(collection=environment_information['atl16_collection'],
                            granule_id=environment_information['atl16_granules'][0],
                            interpolation='Elliptical Weighted Averaging',
                            scale_size=(5.0, 5.0),
                            scale_extent=(-180, -90, 180, 90),
                            crs='EPSG:4326', height=36, width=72)
    atl16_job_id = harmony_client.submit(atl16_request)
    harmony_client.wait_for_processing(atl16_job_id, show_progress=True)

    with TemporaryDirectory() as temp_dir:
        downloaded_grid_outputs = [
            file_future.result()
            for file_future
            in harmony_client.download_all(atl16_job_id, overwrite=True, directory=temp_dir)
        ]
        reference_ds = xr.open_dataset('reference_data/ATL16_20200308000030_11040601_004_01_regridded.h5', engine='netcdf4')
        output_ds = xr.open_dataset(Path(temp_dir, downloaded_grid_outputs[0]), engine='netcdf4')

        assert output_ds.equals(reference_ds), 'Regridded ATL16 data did not match reference dataset'

    print_success('Regrid ATL16 Data Success')
else:
    print('Skipping test: regridder regression tests not configured for this environment.')