# Harmony Regridding Service (HRS) [regridder] regression tests

This Jupyter notebook runs a suite of regression tests against the Harmony Regridding Service.

These tests use a variety of input data to exercise the service:
 - MERRA-2 hourly collection of time averaged data ([M2T1NXSLV](https://cmr.earthdata.nasa.gov/search/concepts/C1276812863-GES_DISC.html)) 
 - ICESat-2/ATL16 Weekly Gridded Atmosphere collection ([ATL16](https://cmr.earthdata.nasa.gov/search/concepts/C2153575232-NSIDC_CPRD.html))
 - SMAP L4 Global 3-hourly 9 km EASE-Grid Surface and Root Zone Soil Moisture Analysis Update V007 ([SPL4SMAU](https://search.uat.earthdata.nasa.gov/search/granules?p=C1268612113-EEDTEST))
 - SMAP L3 Radiometer Global and Northern Hemisphere Daily 36 km EASE-Grid Freeze/Thaw State V004 ([SPL3FTP](https://search.uat.earthdata.nasa.gov/search/granules?p=C1268617120-EEDTEST))



## Set the Harmony environment:

The cell below sets the `harmony_host_url` to one of the following valid values:

* Production: <https://harmony.earthdata.nasa.gov>
* UAT: <https://harmony.uat.earthdata.nasa.gov>
* SIT: <https://harmony.sit.earthdata.nasa.gov>
* Local: <http://localhost:3000>

The default value is for the UAT environment. There are two ways to run this notebook against a non-default environment:

* Run this notebook in a local Jupyter notebook server and change the value of `harmony_host_url` in the cell below to the value for the environment you require from the above list.
* Use the `run_notebooks.sh` script, which requires you to declare an environment variable `HARMONY_HOST_URL`. Set that environment variable to the value above that corresponds to the environment you want to test. That environment variable will take precedence over the default value in the cell below.

In [None]:
harmony_host_url = "https://harmony.uat.earthdata.nasa.gov"
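The precedence described above can be sketched in plain Python. This is only an illustration, assuming the override arrives as the `HARMONY_HOST_URL` environment variable; how `run_notebooks.sh` actually hands the value to the notebook is not shown here, and `resolve_host_url` is a hypothetical helper, not part of this repository.

```python
import os

DEFAULT_HOST_URL = "https://harmony.uat.earthdata.nasa.gov"


def resolve_host_url(environ=os.environ, default=DEFAULT_HOST_URL):
    """Return the HARMONY_HOST_URL override if set, else the default."""
    # Hypothetical helper: a missing or empty variable falls back to the default.
    return environ.get("HARMONY_HOST_URL") or default


# With no override the UAT default is used; with an override, it wins.
print(resolve_host_url({}))
print(resolve_host_url({"HARMONY_HOST_URL": "http://localhost:3000"}))
```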

## Prerequisites

The dependencies for this notebook are listed in the [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill-regridder`

A `.netrc` file must also be located in the `test` directory of this repository.
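Before running the tests it can be useful to confirm that the `.netrc` file is present and has an entry for the login host. The sketch below uses the standard-library `netrc` module; `has_credentials` is a hypothetical helper, and the host name `uat.urs.earthdata.nasa.gov` (Earthdata Login for UAT) and the relative path are assumptions that may need adjusting for your environment.

```python
import netrc
from pathlib import Path


def has_credentials(netrc_path, host):
    """Return True if the .netrc file at netrc_path has an entry for host."""
    path = Path(netrc_path)
    if not path.is_file():
        return False
    # authenticators() returns None when there is no matching (or default) entry.
    return netrc.netrc(str(path)).authenticators(host) is not None


# Assumed location and host; prints False if the file or entry is missing.
print(has_credentials(".netrc", "uat.urs.earthdata.nasa.gov"))
```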

### Import required packages:

In [None]:
from harmony import Collection, Environment, Client, Request, BBox
from tempfile import TemporaryDirectory

from pathlib import Path

### Set up shared utilities:

In [None]:
import sys

sys.path.append("../shared_utils")
from utilities import print_success, submit_and_download
from compare import compare_results_to_reference_file

### Set up environment dependent variables:

This includes the Harmony `Client` object and `Collection` objects for each of the collections for which there are regression tests. The local, SIT and UAT Harmony instances all utilise resources from CMR UAT, meaning any non-production environment will use the same resources.

When adding a production entry to the dictionary below, the collection instances can be included directly in the production dictionary entry, as they do not need to be shared.

In [None]:
non_production_collection = {
    "merra_collection": Collection(id="C1245662776-EEDTEST"),
    "atl16_collection": Collection(id="C1238589498-EEDTEST"),
    "spl4smau_collection": Collection(id="C1268612113-EEDTEST"),
    "spl3ftp_collection": Collection(id="C1268617120-EEDTEST"),
}

non_prod_granule_data = {
    "merra_granules": ["G1245662793-EEDTEST", "G1245662791-EEDTEST"],
    "atl16_granules": ["G1245614996-EEDTEST", "G1245614968-EEDTEST"],
    "spl4smau_granules": ["G1268612119-EEDTEST"],
    "spl3ftp_granules": ["G1268617163-EEDTEST"],
}


collection_data = {
    "https://harmony.uat.earthdata.nasa.gov": {
        **non_production_collection,
        **non_prod_granule_data,
        "env": Environment.UAT,
    },
    "https://harmony.sit.earthdata.nasa.gov": {
        **non_production_collection,
        **non_prod_granule_data,
        "env": Environment.SIT,
    },
    "http://localhost:3000": {
        **non_production_collection,
        **non_prod_granule_data,
        "env": Environment.LOCAL,
    },
}

environment_information = collection_data.get(harmony_host_url)

if environment_information is not None:
    harmony_client = Client(env=environment_information["env"])
    endpoint_url = environment_information.get("endpoint_url", None)
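The lookup pattern used above can be illustrated without harmony-py installed. In this sketch plain strings stand in for the `Collection` and `Environment` objects, the production collection ID is a made-up placeholder rather than a real CMR concept ID, and `lookup` is a hypothetical helper. It shows the shared non-production values being merged with `**`, a production entry declared inline, and an unconfigured URL returning `None` so that the tests are skipped.

```python
# Plain strings stand in for harmony-py Collection and Environment objects.
shared_non_production = {"merra_collection": "C1245662776-EEDTEST"}

example_collection_data = {
    "https://harmony.uat.earthdata.nasa.gov": {**shared_non_production, "env": "UAT"},
    "https://harmony.sit.earthdata.nasa.gov": {**shared_non_production, "env": "SIT"},
    # Production values are listed inline because no other entry reuses them.
    # The ID below is a placeholder for illustration only.
    "https://harmony.earthdata.nasa.gov": {
        "merra_collection": "C0000000000-PLACEHOLDER",
        "env": "PROD",
    },
}


def lookup(host_url):
    """Return the configuration for host_url, or None if it is not configured."""
    return example_collection_data.get(host_url)


for url in ("https://harmony.uat.earthdata.nasa.gov", "https://example.invalid"):
    info = lookup(url)
    print(url, "->", "run tests" if info is not None else "skip tests")
```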

## Test for full earth downsample of MERRA-2 data (M2T1NXSLV)

Make a request for a single granule, with parameters selected to create a 2.5 degree output grid (72x144).

This test explicitly states the `scaleExtent`, `scaleSize`, `height`, `width` and `crs` for the Harmony request, instead of using a UMM-Grid object.
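The 72x144 figure follows from dividing the extent by the resolution: 180 / 2.5 = 72 rows and 360 / 2.5 = 144 columns. A minimal sketch of that arithmetic is given below. `grid_dimensions` is a hypothetical helper written for illustration, not part of harmony-py or the regridding service, and it assumes the extent is ordered (xmin, ymin, xmax, ymax) and the resolution (x, y).

```python
def grid_dimensions(scale_extent, scale_size):
    """Return (height, width) for an extent (xmin, ymin, xmax, ymax) and
    a resolution (x_resolution, y_resolution), all in the same units."""
    xmin, ymin, xmax, ymax = scale_extent
    x_res, y_res = scale_size
    width = round((xmax - xmin) / x_res)
    height = round((ymax - ymin) / y_res)
    return height, width


# Whole-earth extent at 2.5 degrees, as in the MERRA-2 request.
print(grid_dimensions((-180, -90, 180, 90), (2.5, 2.5)))  # (72, 144)
```

The same calculation with a 5 degree resolution gives 36 rows and 72 columns.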

In [None]:
if environment_information is not None:
    with TemporaryDirectory() as temp_dir:
        merra_request = Request(
            collection=environment_information["merra_collection"],
            granule_id=environment_information["merra_granules"][0],
            interpolation="Elliptical Weighted Averaging",
            scale_size=(2.5, 2.5),
            scale_extent=(-180, -90, 180, 90),
            crs="+proj=latlong +datum=WGS84 +no_defs",
            height=72,
            width=144,
        )

        test_output = temp_dir / Path("MERRA2_test.h5")
        test_reference = Path(
            "reference_data/MERRA2_400.tavg1_2d_slv_Nx.20210605_regridded.nc4"
        )

        submit_and_download(harmony_client, merra_request, test_output)

        assert test_output.exists(), "Unsuccessful Harmony Request: MERRA2"

        compare_results_to_reference_file(test_output, test_reference, identical=False)

    print_success("Regrid MERRA2 Data Success")
else:
    print(
        "Skipping test: regridder regression tests not configured for this environment."
    )
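Each test writes its output inside a `TemporaryDirectory` block, so the downloaded file only exists until the block exits; this is why the comparison with the reference file runs inside the `with` statement. A small standard-library sketch of that behaviour follows (the file name and contents are made up for illustration):

```python
from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as temp_dir:
    # temp_dir is a str; wrapping it in Path allows the / operator.
    output_path = Path(temp_dir) / "example_output.h5"
    output_path.write_bytes(b"placeholder")
    existed_inside = output_path.exists()

# The directory and everything in it are deleted when the block exits.
exists_after = output_path.exists()
print(existed_inside, exists_after)
```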

## Test ATL16

The ATL16 data has multiple grids stored in the file.

This test regrids all three grids (`global`, `npolar` and `spolar`) to the same output grid, explicitly stating the `scaleExtent`, `scaleSize`, `height`, `width` and `crs` for the Harmony request, instead of using a UMM-Grid object.

A small 5 degree grid is chosen to save space.

In [None]:
if environment_information is not None:
    with TemporaryDirectory() as temp_dir:
        atl16_request = Request(
            collection=environment_information['atl16_collection'],
            granule_id=environment_information['atl16_granules'][0],
            interpolation='Elliptical Weighted Averaging',
            scale_size=(5.0, 5.0),
            scale_extent=(-180, -90, 180, 90),
            crs='EPSG:4326',
            height=36,
            width=72,
        )

        test_output = temp_dir / Path("ATL16_test.h5")
        test_reference = Path(
            'reference_data/ATL16_20200308000030_11040601_004_01_regridded.h5'
        )

        submit_and_download(harmony_client, atl16_request, test_output)

        assert test_output.exists(), 'Unsuccessful Harmony Request: ATL16'

        compare_results_to_reference_file(test_output, test_reference, identical=False)

    print_success('Regrid ATL16 Data Success')
else:
    print(
        'Skipping test: regridder regression tests not configured for this environment.'
    )

## Test SPL4SMAU

The SPL4SMAU test is designed to show that data with projected coordinates can be correctly resampled to geographic output.

A small 7 degree grid is chosen to save space.

In [None]:
if environment_information is not None:
    with TemporaryDirectory() as temp_dir:
        spl4smau_request = Request(
            collection=environment_information['spl4smau_collection'],
            granule_id=environment_information['spl4smau_granules'][0],
            interpolation='Elliptical Weighted Averaging',
            scale_size=(7.0, 7.0),
            scale_extent=(-180, -90, 180, 90),
            crs='EPSG:4326',
        )

        test_output = temp_dir / Path("SPL4SMAU_test.h5")
        test_reference = Path('reference_data/SPL4SMAU_test_reference.h5')

        submit_and_download(harmony_client, spl4smau_request, test_output)

        assert (
            test_output.exists()
        ), 'Unsuccessful Harmony Request: SPL4SMAU (projected dimensions)'

        compare_results_to_reference_file(test_output, test_reference, identical=False)

    print_success('Regrid SPL4SMAU Data Success')
else:
    print(
        'Skipping test: regridder regression tests not configured for this environment.'
    )

## Test SPL3FTP

The SPL3FTP test is designed to demonstrate implicit grid determination for projected data with multiple source grids.

No target grid is specified in the request. A subset of four variables and a bounding box are requested to save space.

In [None]:
if environment_information is not None:
    with TemporaryDirectory() as temp_dir:
        spl3ftp_request = Request(
            collection=environment_information['spl3ftp_collection'],
            granule_id=environment_information['spl3ftp_granules'][0],
            interpolation='Elliptical Weighted Averaging',
            crs='EPSG:4326',
            variables=[
                "Freeze_Thaw_Retrieval_Data_Global/altitude_dem",
                "Freeze_Thaw_Retrieval_Data_Global/freeze_reference",
                "Freeze_Thaw_Retrieval_Data_Polar/altitude_dem",
                "Freeze_Thaw_Retrieval_Data_Polar/freeze_reference",
            ],
            spatial=BBox(w=-70, s=60, e=-10, n=85),
        )

        test_output = temp_dir / Path("SPL3FTP_test.h5")
        test_reference = Path('reference_data/SPL3FTP_test_reference.h5')

        submit_and_download(harmony_client, spl3ftp_request, test_output)

        assert (
            test_output.exists()
        ), 'Unsuccessful Harmony Request: SPL3FTP (Implicit Grids)'

        compare_results_to_reference_file(test_output, test_reference, identical=False)

    print_success('Regrid SPL3FTP Data Success')
else:
    print(
        'Skipping test: regridder regression tests not configured for this environment.'
    )