# NSIDC ICESat2 Regression tests

### This Jupyter notebook runs and verifies a series of test requests against NSIDC's ICESat2 data.

Requests are submitted and the retrieved data compared to a set of verified results.

Sample requests include:

- Subset by bounding box
- Subset by temporal range
- Subset by shapefile 


We test against ICESat2 v6 collections:
[ATL03](https://nsidc.org/data/atl03/versions/6), [ATL04](https://nsidc.org/data/atl04/versions/6), [ATL06](https://nsidc.org/data/atl06/versions/6), [ATL07](https://nsidc.org/data/atl07/versions/6), [ATL08](https://nsidc.org/data/atl08/versions/6), [ATL09](https://nsidc.org/data/atl09/versions/6), [ATL10](https://nsidc.org/data/atl10/versions/6), [ATL12](https://nsidc.org/data/atl12/versions/6) and [ATL13](https://nsidc.org/data/atl13/versions/6)


## Prerequisites

The dependencies for running this notebook are listed in the
[environment.yaml](https://github.com/nasa/harmony-regression-tests/blob/main/test/nsidc-icesat2/environment.yaml).

In order to test locally, run the following commands from the `test/nsidc-icesat2/` directory to create and activate the conda environment necessary to run the regression testing notebook.

```sh
conda env create -f ./environment.yaml && conda activate papermill-nsidc-icesat2
```

To use this environment within a shared Jupyter Hub, see [instructions](https://nasa-openscapes.github.io/earthdata-cloud-cookbook/contributing/workflow.html#create-a-jupyter-kernel-to-run-notebooks) in the NASA Earthdata Cloud Cookbook for how to create a new kernel based on this environment. 

## Authentication

To provide your credentials to Harmony, place a `.netrc` file in the `test` directory of this repository.
Ensure the credentials in this `.netrc` belong to a user who can access the NSIDC data, which is protected by ACLs in UAT and SIT.
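
Before running the tests, it can help to confirm that the `.netrc` file actually contains an Earthdata Login entry. The following is a minimal sketch using Python's standard `netrc` module. The helper name `find_edl_logins` is hypothetical, and it assumes the credentials are stored under the Earthdata Login hostnames `urs.earthdata.nasa.gov` (production) or `uat.urs.earthdata.nasa.gov` (UAT). It does not check that the user has the required ACL permissions.

```python
from netrc import netrc

# Assumption: credentials are keyed by these Earthdata Login hostnames.
EDL_HOSTS = ('urs.earthdata.nasa.gov', 'uat.urs.earthdata.nasa.gov')


def find_edl_logins(netrc_path):
    """Return {host: login} for each Earthdata Login host found in the file."""
    entries = netrc(str(netrc_path))
    found = {}
    for host in EDL_HOSTS:
        # authenticators() returns (login, account, password) or None.
        auth = entries.authenticators(host)
        if auth is not None:
            found[host] = auth[0]
    return found
```

When the notebook is run from `test/nsidc-icesat2/`, the file described above would be at `../.netrc`, so `find_edl_logins('../.netrc')` should return a non-empty dictionary.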


## Set the Harmony environment:

The cell below sets `harmony_host_url` to one of the following valid values:

* Production: <https://harmony.earthdata.nasa.gov>
* UAT: <https://harmony.uat.earthdata.nasa.gov>
* SIT: <https://harmony.sit.earthdata.nasa.gov>
* Local: <http://localhost:3000>

By default, the value is set to use Harmony's UAT environment. You can modify the target environment in two ways when using this notebook.

* Run this notebook in a local Jupyter notebook server and simply edit the value of `harmony_host_url` in the cell below to be the desired value for your environment.

* Run the `run_notebooks.sh` script, which uses the papermill library to parameterize and run notebooks. Before running, set the environment variable `HARMONY_HOST_URL` to the desired environment's URL from the list above. This variable will override the default value in the cell below, allowing papermill to inject the correct URL into the notebook at runtime.
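
The precedence described above (an environment variable, if set, wins over the UAT default) can be illustrated with a small sketch. This is not how `run_notebooks.sh` or papermill is implemented; `resolve_harmony_host` is a hypothetical helper that only mirrors the override logic and rejects URLs outside the list of valid values.

```python
import os

VALID_HARMONY_HOSTS = (
    'https://harmony.earthdata.nasa.gov',
    'https://harmony.uat.earthdata.nasa.gov',
    'https://harmony.sit.earthdata.nasa.gov',
    'http://localhost:3000',
)
DEFAULT_HARMONY_HOST = 'https://harmony.uat.earthdata.nasa.gov'


def resolve_harmony_host(environ=os.environ):
    """Return HARMONY_HOST_URL if set, else the UAT default."""
    # Tolerate a trailing slash so it still matches the list of valid hosts.
    host = environ.get('HARMONY_HOST_URL', DEFAULT_HARMONY_HOST).rstrip('/')
    if host not in VALID_HARMONY_HOSTS:
        raise ValueError(f'Unsupported Harmony host: {host}')
    return host
```

Failing fast on an unrecognised URL avoids the quieter outcome where the configuration lookup later in the notebook returns `None` and every test suite is skipped.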

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'

### Import required packages

In [None]:
from harmony import BBox, Client, Collection, Environment, Request
from os.path import exists
from datetime import datetime
from pathlib import Path
from tempfile import TemporaryDirectory

#### Import shared utility functions:

In [None]:
import sys

sys.path.append('../shared_utils')
from utilities import (
    print_success,
    submit_and_download,
    compare_results_to_reference_file_new,
)

### Set up test information

In [None]:
non_production_configuration = {
    'subset_bounding_box': {
        'ATL03': {
            'collection_concept_id': Collection(id='C1256407609-NSIDC_CUAT'),
            'granule_id': 'G1262402516-NSIDC_CUAT',
            'spatial': BBox(-105.5, 40.0, -105.0, 40.005),
        },
        'ATL07': {
            'collection_concept_id': Collection(id='C1256535488-NSIDC_CUAT'),
            'granule_id': 'G1261684946-NSIDC_CUAT',
            'spatial': BBox(-112.0, 80.0, -93.0, 80.3),
        },
        'ATL08': {
            'collection_concept_id': Collection(id='C1256432189-NSIDC_CUAT'),
            'granule_id': 'G1260745539-NSIDC_CUAT',
            'spatial': BBox(-105.5, 40.0, -105.0, 40.25),
        },
        'ATL10': {
            'collection_concept_id': Collection(id='C1256535487-NSIDC_CUAT'),
            'granule_id': 'G1261681735-NSIDC_CUAT',
            'spatial': BBox(161.0, -75.0, 171.0, -74.0),
        },
        'ATL12': {
            'collection_concept_id': Collection(id='C1256476536-NSIDC_CUAT'),
            'granule_id': 'G1263992202-NSIDC_CUAT',
            'spatial': BBox(-79.0, 27.0, -77.0, 34.0),
        },
        'ATL13': {
            'collection_concept_id': Collection(id='C1257810199-NSIDC_CUAT'),
            'granule_id': 'G1261681705-NSIDC_CUAT',
            'spatial': BBox(-89.0, 43.0, -75.0, 45.0),
        },
    },
    'subset_by_temporal_range': {
        'ATL04': {
            'collection_concept_id': Collection(id='C1256535558-NSIDC_CUAT'),
            'granule_id': 'G1256952662-NSIDC_CUAT',
            'temporal': {
                'start': datetime.fromisoformat("2020-04-08T08:00:00.000Z"),
                'stop': datetime.fromisoformat("2020-04-08T08:05:00.000Z"),
            },
            'coords_to_rename': ['delta_time'],
        },
        # BLOCKED by https://bugs.earthdata.nasa.gov/browse/DAS-2233
        # 'ATL08': {
        #     'collection_concept_id': Collection(id='C1256432189-NSIDC_CUAT'),
        #     'granule_id': 'G1261385533-NSIDC_CUAT',
        #     'temporal': {
        #         'start': datetime.fromisoformat("2022-07-31T23:01:00.000Z"),
        #         'stop': datetime.fromisoformat("2022-07-31T23:01:10.000Z"),
        #     },
        #     'coords_to_rename': [],
        # },
    },
    'subset_by_shapefile': {
        'ATL06': {
            'collection_concept_id': Collection(id='C1256358217-NSIDC_CUAT'),
            'granule_id': 'G1260779121-NSIDC_CUAT',
            'shape': 'ancillary/Iceland_sliver.zip',
        },
        'ATL08': {
            'collection_concept_id': Collection(id='C1256432189-NSIDC_CUAT'),
            'granule_id': 'G1260498664-NSIDC_CUAT',
            'shape': 'ancillary/SriLanka_simple.kml',
        },
        'ATL09': {
            'collection_concept_id': Collection(id='C1256563776-NSIDC_CUAT'),
            'granule_id': 'G1262106425-NSIDC_CUAT',
            'shape': 'ancillary/Tasmania_sliver.geojson',
        },
        'ATL10': {
            'collection_concept_id': Collection(id='C1256535487-NSIDC_CUAT'),
            'granule_id': 'G1261681735-NSIDC_CUAT',
            'shape': 'ancillary/Ross_Sea_positive_lon_only.geojson',
        },
    },
}

In [None]:
production_configuration = {
    'subset_bounding_box': {
        'ATL03': {
            'collection_concept_id': Collection(id='C2596864127-NSIDC_CPRD'),
            'granule_id': 'G2632805836-NSIDC_CPRD',
            'spatial': BBox(-105.5, 40.0, -105.0, 40.005),
        },
        'ATL07': {
            'collection_concept_id': Collection(id='C2713030505-NSIDC_CPRD'),
            'granule_id': 'G2738665484-NSIDC_CPRD',
            'spatial': BBox(-112.0, 80.0, -93.0, 80.3),
        },
        'ATL08': {
            'collection_concept_id': Collection(id='C2613553260-NSIDC_CPRD'),
            'granule_id': 'G2645102344-NSIDC_CPRD',
            'spatial': BBox(-105.5, 40.0, -105.0, 40.25),
        },
        # Blocked by DAS-2244: and PROD need UMM-S configuration
        # 'ATL10': {
        #     'collection_concept_id': Collection(id='C2613553243-NSIDC_CPRD'),
        #     'granule_id': 'G2738637140-NSIDC_CPRD',
        #     'spatial': BBox(161.0, -75.0, 171.0, -74.0),
        # },
        'ATL12': {
            'collection_concept_id': Collection(id='C2613553216-NSIDC_CPRD'),
            'granule_id': 'G2952685768-NSIDC_CPRD',
            'spatial': BBox(-79.0, 27.0, -77.0, 34.0),
        },
        'ATL13': {
            'collection_concept_id': Collection(id='C2684928243-NSIDC_CPRD'),
            'granule_id': 'G2720556827-NSIDC_CPRD',
            'spatial': BBox(-89.0, 43.0, -75.0, 45.0),
        },
    },
    'subset_by_temporal_range': {
        'ATL04': {
            'collection_concept_id': Collection(id='C2613553327-NSIDC_CPRD'),
            'granule_id': 'G2634053936-NSIDC_CPRD',
            'temporal': {
                'start': datetime.fromisoformat("2020-04-08T08:00:00.000Z"),
                'stop': datetime.fromisoformat("2020-04-08T08:05:00.000Z"),
            },
            'coords_to_rename': ['delta_time'],
        },
        # BLOCKED by https://bugs.earthdata.nasa.gov/browse/DAS-2233
        # 'ATL08': {
        #     'collection_concept_id': Collection(id='C2613553260-NSIDC_CPRD'),
        #     'granule_id': 'G1261385533-NSIDC_CUAT-TBD',
        #     'temporal': {
        #         'start': datetime.fromisoformat("2022-07-31T23:01:00.000Z"),
        #         'stop': datetime.fromisoformat("2022-07-31T23:01:10.000Z"),
        #     },
        #     'coords_to_rename': [],
        # },
    },
    'subset_by_shapefile': {
        'ATL06': {
            'collection_concept_id': Collection(id='C2670138092-NSIDC_CPRD'),
            'granule_id': 'G2674250298-NSIDC_CPRD',
            'shape': 'ancillary/Iceland_sliver.zip',
        },
        'ATL08': {
            'collection_concept_id': Collection(id='C2613553260-NSIDC_CPRD'),
            'granule_id': 'G2640057431-NSIDC_CPRD',
            'shape': 'ancillary/SriLanka_simple.kml',
        },
        'ATL09': {
            'collection_concept_id': Collection(id='C2649212495-NSIDC_CPRD'),
            'granule_id': 'G2666419430-NSIDC_CPRD',
            'shape': 'ancillary/Tasmania_sliver.geojson',
        },
        # Blocked by DAS-2244: and PROD need UMM-S configuration
        # 'ATL10': {
        #     'collection_concept_id': Collection(id='C2613553243-NSIDC_CPRD'),
        #     'granule_id': 'G2738637140-NSIDC_CPRD',
        #     'shape': 'ancillary/Ross_Sea_positive_lon_only.geojson',
        # },
    },
}

In [None]:
environment_configuration = {
    'https://harmony.earthdata.nasa.gov': {
        **production_configuration,
        'env': Environment.PROD,
    },
    'https://harmony.uat.earthdata.nasa.gov': {
        **non_production_configuration,
        'env': Environment.UAT,
    },
    'https://harmony.sit.earthdata.nasa.gov': {
        **non_production_configuration,
        'env': Environment.SIT,
    },
    'http://localhost:3000': {
        **non_production_configuration,
        'env': Environment.LOCAL,
    },
}

configuration = environment_configuration.get(harmony_host_url)

if configuration is not None:
    harmony_client = Client(env=configuration['env'])

## Run Bounding Box Tests

The next cell runs each of the subset by bounding box tests. For each test, it builds a request, submits it to Harmony, and compares the downloaded result against a verified reference data file. This ensures that Harmony continues to return the expected binary files for known requests.
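
Each test output is paired with a reference file by naming convention. The sketch below isolates that convention from the test cells so it can be checked on its own. The function name `reference_path_for` is hypothetical; the `reference_files` directory and the `_reference` suffix follow the paths built in the test cells of this notebook.

```python
from pathlib import Path


def reference_path_for(test_output, reference_dir='reference_files'):
    """Map a downloaded output path to its verified reference file path."""
    test_output = Path(test_output)
    # Only the file name matters; the temporary directory part is discarded.
    name = f'{test_output.stem}_reference{test_output.suffix}'
    return Path(reference_dir) / name


# An ATL03 bounding box output maps to:
# reference_files/ATL03_subset_bounding_box_reference.h5
print(reference_path_for('/tmp/work/ATL03_subset_bounding_box.h5').as_posix())
```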

In [None]:
test_name = 'subset_bounding_box'
with TemporaryDirectory() as tmp_dir:
    if configuration is not None:
        for shortname, test_config in configuration[test_name].items():
            test_request = Request(
                collection=test_config['collection_concept_id'],
                granule_id=[test_config['granule_id']],
                spatial=test_config['spatial'],
            )
            test_output = tmp_dir / Path(f'{shortname}_{test_name}.h5')
            test_reference = Path(
                f'reference_files/{test_output.stem}_reference{test_output.suffix}'
            )

            submit_and_download(harmony_client, test_request, test_output)

            assert exists(
                test_output
            ), f'Unsuccessful Harmony Request: {shortname}: {test_name}'
            compare_results_to_reference_file_new(
                test_output, test_reference, identical=False
            )
            print_success(f'{shortname} {test_name} test request.')

        print_success(f'{test_name} test suite.')
    else:
        print(
            f'Bounding box tests not configured for environment: {harmony_host_url} - skipping tests'
        )

## Run Temporal Range Tests

As with the previous cell, the next cell runs each of the temporal range tests. For each test, it builds a request, submits it to Harmony, and compares the downloaded result against a verified reference data file. This ensures that Harmony continues to return the expected binary files for known requests.
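
The temporal ranges in the configuration are built by passing strings ending in `Z` to `datetime.fromisoformat`, which only accepts that suffix from Python 3.11 onward. If the notebook must run on an older interpreter, a small wrapper such as the hypothetical `parse_utc_timestamp` below is one way to keep the same timestamps working. This is a sketch, not part of the shared utilities.

```python
from datetime import datetime


def parse_utc_timestamp(value):
    """Parse an ISO 8601 string, accepting a trailing 'Z' on older Pythons."""
    if value.endswith('Z'):
        # Rewrite the UTC designator as an explicit offset, which
        # datetime.fromisoformat has accepted since Python 3.7.
        value = value[:-1] + '+00:00'
    return datetime.fromisoformat(value)


start = parse_utc_timestamp('2020-04-08T08:00:00.000Z')
stop = parse_utc_timestamp('2020-04-08T08:05:00.000Z')
print(stop - start)  # 0:05:00
```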

In [None]:
test_name = 'subset_by_temporal_range'
with TemporaryDirectory() as tmp_dir:
    if configuration is not None:
        for shortname, test_config in configuration[test_name].items():
            test_request = Request(
                collection=test_config['collection_concept_id'],
                granule_id=[test_config['granule_id']],
                temporal=test_config['temporal'],
            )
            test_output = tmp_dir / Path(f'{shortname}_{test_name}.h5')
            test_reference = Path(
                f'reference_files/{test_output.stem}_reference{test_output.suffix}'
            )

            submit_and_download(harmony_client, test_request, test_output)

            assert exists(
                test_output
            ), f'Unsuccessful Harmony Request: {shortname}: {test_name}'
            compare_results_to_reference_file_new(
                test_output,
                test_reference,
                identical=False,
                coordinates_to_fix=test_config['coords_to_rename'],
            )
            print_success(f'{shortname} {test_name} test request.')

        print_success(f'{test_name} test suite.')
    else:
        print(
            f'Temporal range tests not configured for environment: {harmony_host_url} - skipping tests'
        )

## Run Subset by Shapefile Tests

The next cell runs each of the subset by shapefile tests. For each test, it builds a request, submits it to Harmony, and compares the downloaded result against a verified reference data file. This ensures that Harmony continues to return the expected binary files for known requests.
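
The shapefile tests depend on local files under `ancillary/`, so a missing or misnamed file only surfaces when the request is built. The sketch below is a hypothetical pre-flight check (`find_shape_problems` is not part of the shared utilities). It assumes the only formats of interest are the three used in this notebook's configuration: zipped ESRI Shapefile, KML, and GeoJSON.

```python
from pathlib import Path

# Assumption: the suffixes used by the shape files in this notebook.
SUPPORTED_SHAPE_SUFFIXES = {'.zip', '.kml', '.geojson'}


def find_shape_problems(test_configs, base_dir='.'):
    """Return (shortname, reason) pairs for shape files that cannot be used."""
    problems = []
    for shortname, test_config in test_configs.items():
        shape_path = Path(base_dir) / test_config['shape']
        if shape_path.suffix.lower() not in SUPPORTED_SHAPE_SUFFIXES:
            problems.append((shortname, 'unsupported suffix'))
        elif not shape_path.is_file():
            problems.append((shortname, 'file not found'))
    return problems
```

Calling `find_shape_problems(configuration['subset_by_shapefile'])` from the notebook directory should return an empty list when every configured shape file is present.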

In [None]:
test_name = 'subset_by_shapefile'
with TemporaryDirectory() as tmp_dir:
    if configuration is not None:
        for shortname, test_config in configuration[test_name].items():
            test_request = Request(
                collection=test_config['collection_concept_id'],
                granule_id=[test_config['granule_id']],
                shape=test_config['shape'],
            )
            test_output = tmp_dir / Path(f'{shortname}_{test_name}.h5')
            test_reference = Path(
                f'reference_files/{test_output.stem}_reference{test_output.suffix}'
            )

            submit_and_download(harmony_client, test_request, test_output)

            assert exists(
                test_output
            ), f'Unsuccessful Harmony Request: {shortname}: {test_name}'
            compare_results_to_reference_file_new(
                test_output, test_reference, identical=False, coordinates_to_fix=[]
            )
            print_success(f'{shortname} {test_name} test request.')

        print_success(f'{test_name} test suite.')
    else:
        print(
            f'Shapefile tests not configured for environment: {harmony_host_url} - skipping tests'
        )