# Regression test suite for the Harmony OPeNDAP SubSetter (HOSS):

This notebook provides condensed examples of using Harmony to make requests against the HOSS service developed and managed by the Data Services team on the Transformation Train. HOSS can process L3 and L4 data hosted in OPeNDAP, and is offered via two service chains - one for geographically gridded data and another for projection-gridded data. The features of HOSS include:

* Variable subsetting, including required variables.
* Temporal subsetting.
* Bounding box spatial subsetting.
* Shape file spatial subsetting.
* Named dimension subsetting.

Note, several configuration tips were gained from [this blog post](https://towardsdatascience.com/introduction-to-papermill-2c61f66bea30).

## Prerequisites

The dependencies for this notebook are listed in [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill-hoss`

A `.netrc` file must also be located in the `test` directory of this repository.

## Import requirements:

In [None]:
from datetime import datetime
from os.path import exists

from earthdata_hashdiff import nc4_matches_reference_hash_file
from harmony import BBox, Client, Collection, Dimension, Environment, Request
from hoss_utilities import test_is_configured

### Import shared utilities:

In [None]:
import sys

sys.path.append('../shared_utils')
from utilities import print_success, submit_and_download

## Set default parameters:

`papermill` requires default values for the parameters used in the workflow. In this case, the only parameter is `harmony_host_url`.

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'
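
As a rough sketch of how the default above might be overridden, the snippet below builds a `papermill` command line that uses the `-p` option to set `harmony_host_url`. The notebook and output file names, and the `build_papermill_command` helper, are assumptions for illustration only.

```python
def build_papermill_command(notebook, output, parameters):
    """Build an argv list for the papermill CLI, overriding default parameters."""
    command = ['papermill', notebook, output]
    for name, value in parameters.items():
        # `-p NAME VALUE` overrides one parameter in the executed notebook.
        command.extend(['-p', name, str(value)])
    return command


command = build_papermill_command(
    'HOSS_Regression.ipynb',  # assumed notebook file name
    'HOSS_Regression_output.ipynb',  # assumed output file name
    {'harmony_host_url': 'https://harmony.sit.earthdata.nasa.gov'},
)
print(' '.join(command))
```

The resulting list could be passed to `subprocess.run` in an environment where `papermill` is installed.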

### Identify Harmony environment (for easier reference):

In [None]:
host_environment = {
    'http://localhost:3000': Environment.LOCAL,
    'https://harmony.sit.earthdata.nasa.gov': Environment.SIT,
    'https://harmony.uat.earthdata.nasa.gov': Environment.UAT,
    'https://harmony.earthdata.nasa.gov': Environment.PROD,
}


harmony_environment = host_environment.get(harmony_host_url)

if harmony_environment is not None:
    harmony_client = Client(env=harmony_environment)

# Begin regression tests:

## Harmony OPeNDAP SubSetter (HOSS), Geographic:

HOSS is currently deployed to Sandbox, SIT, UAT and production. However, it is only associated with collections in UAT. Requests will be made against the RSSMIF16D collection, as mirrored in the EEDTEST CMR provider in UAT.

In [None]:
hoss_non_prod_information = {
    'collection': Collection(id='C1238392622-EEDTEST'),
    'granule_id': 'G1245840464-EEDTEST',
    'temporal_collection': Collection(id='C1245662776-EEDTEST'),
    'temporal_granule_id': 'G1245662797-EEDTEST',
    'bounds_collection': Collection(id='C1245618475-EEDTEST'),
    'bounds_granule_id': 'G1255863984-EEDTEST',
}

hoss_prod_information = {
    'collection': Collection(id='C1996546500-GHRC_DAAC'),
    'granule_id': 'G2040739631-GHRC_DAAC',
}

hoss_env = {
    Environment.LOCAL: hoss_non_prod_information,
    Environment.SIT: hoss_non_prod_information,
    Environment.UAT: hoss_non_prod_information,
    Environment.PROD: hoss_prod_information,
}

if harmony_environment in hoss_env:
    hoss_info = hoss_env[harmony_environment]
else:
    hoss_info = None

### Metadata attributes that can vary between prod and non-prod environments.
These are ignored during hash generation and during testing.

In [None]:
skipped_metadata_attributes = {
    'references',
    'build_dmrpp_metadata.created',
    'build_dmrpp_metadata.build_dmrpp',
    'build_dmrpp_metadata.bes',
    'build_dmrpp_metadata.libdap',
    'build_dmrpp_metadata.invocation',
}
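
To illustrate why skipping these attributes matters, here is a minimal sketch of hashing a metadata dictionary while leaving out named attributes. It is not the `earthdata_hashdiff` implementation; the `hash_attributes` helper and the attribute values are invented for the example.

```python
import hashlib
import json


def hash_attributes(attributes, skipped=frozenset()):
    """Hash a metadata dictionary, leaving out any skipped attribute names."""
    retained = {
        name: value for name, value in attributes.items() if name not in skipped
    }
    # Sorting keys makes the serialisation, and so the hash, order-independent.
    serialised = json.dumps(retained, sort_keys=True).encode('utf-8')
    return hashlib.sha256(serialised).hexdigest()


# Invented attribute values: only `references` differs between environments.
uat_attributes = {'title': 'Example granule', 'references': 'uat-link'}
prod_attributes = {'title': 'Example granule', 'references': 'prod-link'}

# The hashes differ unless the environment-specific attribute is skipped.
print(hash_attributes(uat_attributes) == hash_attributes(prod_attributes))
print(
    hash_attributes(uat_attributes, {'references'})
    == hash_attributes(prod_attributes, {'references'})
)
```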

### HOSS bounding box and variable subsetter request

This request combines two HOSS options: bounding box and variable subsetting.

Requested parameter:

* `/atmosphere_cloud_liquid_water_content`

Additional required parameters (grid dimensions):

* `/latitude`
* `/longitude`
* `/time`

In [None]:
if test_is_configured(hoss_info, 'collection'):
    hoss_var_bbox_file_name = 'hoss_var_bbox.nc4'
    hoss_var_bbox_bbox = BBox(w=-150, s=0, e=-105, n=15)
    hoss_var_bbox_request = Request(
        collection=hoss_info['collection'],
        granule_id=[hoss_info['granule_id']],
        variables=['atmosphere_cloud_liquid_water_content'],
        spatial=hoss_var_bbox_bbox,
        labels=['hoss-rtests', 'hoss-rtest-1'],
    )

    submit_and_download(harmony_client, hoss_var_bbox_request, hoss_var_bbox_file_name)
    assert exists(
        hoss_var_bbox_file_name
    ), 'Unsuccessful HOSS variable, bounding box request.'

    assert nc4_matches_reference_hash_file(
        hoss_var_bbox_file_name,
        'reference_files/hoss_var_bbox_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, bounding box request.'

    print_success('HOSS variable and bounding box request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### HOSS asynchronous request

This test is removed as `harmony-py` requests are all asynchronous.

### HOSS bounding box crosses grid edge:

For collections where the grid edge is the Prime Meridian (0 degrees east) rather than the Antimeridian (180 degrees east), HOSS needs to be able to function when a user requests a region crossing the Prime Meridian (for example, a box containing the UK). It currently retrieves the specified latitude range but the full longitude range, and then fills the region outside the bounding box.

The expected output will look like two vertical stripes of data, one each at the lefthand and righthand edge of the plot.
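
The two stripes can be understood with a small sketch that maps a bounding box onto a grid whose longitudes are assumed to run from 0 to 360 degrees east. This is illustrative arithmetic only, not the HOSS implementation, and `longitude_ranges` is a hypothetical helper.

```python
def longitude_ranges(west, east):
    """Return the 0-360 degree longitude range(s) covered by a bounding box."""
    west_360 = west % 360
    east_360 = east % 360
    if west_360 <= east_360:
        return [(west_360, east_360)]
    # The box crosses the grid edge at 0 degrees east: two separate ranges.
    return [(0, east_360), (west_360, 360)]


print(longitude_ranges(-15, 15))  # [(0, 15), (345, 360)]
print(longitude_ranges(-150, -105))  # [(210, 255)]
```

A box from 15 degrees west to 15 degrees east therefore lands at both ends of the longitude axis, while a box that does not cross the edge maps to a single range.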

In [None]:
if test_is_configured(hoss_info, 'collection'):
    grid_edge_file_name = 'hoss_grid_edge.nc4'
    grid_edge_bbox = BBox(w=-15, s=-60, e=15, n=-30)
    grid_edge_request = Request(
        collection=hoss_info['collection'],
        granule_id=[hoss_info['granule_id']],
        variables=['atmosphere_cloud_liquid_water_content'],
        spatial=grid_edge_bbox,
        labels=['hoss-rtests', 'hoss-rtest-2'],
    )

    submit_and_download(harmony_client, grid_edge_request, grid_edge_file_name)
    assert exists(
        grid_edge_file_name
    ), 'Unsuccessful HOSS request crossing longitudinal edge.'

    assert nc4_matches_reference_hash_file(
        grid_edge_file_name,
        'reference_files/hoss_grid_edge_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, crossing longitudinal edge.'

    print_success('HOSS request crossing longitudinal edge.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### HOSS request no bounding box

If a bounding box is not specified for a HOSS-activated collection, a variable subset will still be performed. The requested variables will be returned, with their full original data.

In [None]:
if test_is_configured(hoss_info, 'collection'):
    no_bbox_file_name = 'hoss_no_bbox.nc4'
    no_bbox_request = Request(
        collection=hoss_info['collection'],
        granule_id=[hoss_info['granule_id']],
        variables=['sst_dtime', 'wind_speed'],
        labels=['hoss-rtests', 'hoss-rtest-3'],
    )

    submit_and_download(harmony_client, no_bbox_request, no_bbox_file_name)
    assert exists(no_bbox_file_name), 'Unsuccessful HOSS request without bounding box.'

    assert nc4_matches_reference_hash_file(
        no_bbox_file_name,
        'reference_files/hoss_no_bbox_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, no bounding box.'

    print_success('HOSS request without bounding box.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### HOSS request all variables

If there are no variables specified, HOSS should retrieve all variables. If the bounding box is specified, all gridded variables should still be constrained to the requested spatial region.

In [None]:
if test_is_configured(hoss_info, 'collection'):
    hoss_all_vars_file_name = 'hoss_all_vars.nc4'
    hoss_all_vars_bbox = BBox(w=-150, s=0, e=-105, n=15)
    hoss_all_vars_request = Request(
        collection=hoss_info['collection'],
        granule_id=[hoss_info['granule_id']],
        spatial=hoss_all_vars_bbox,
        labels=['hoss-rtests', 'hoss-rtest-4'],
    )

    submit_and_download(harmony_client, hoss_all_vars_request, hoss_all_vars_file_name)
    assert exists(hoss_all_vars_file_name), 'Unsuccessful HOSS all-variable request.'

    assert nc4_matches_reference_hash_file(
        hoss_all_vars_file_name,
        'reference_files/hoss_all_vars_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS all-variable request.'

    print_success('HOSS all-variable request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### HOSS request all variables, no bounding box

**This request has been migrated to the harmony-regression suite.**

[HARMONY-1649](https://bugs.earthdata.nasa.gov/browse/HARMONY-1649) added a service that returns the original download links for requests that do not explicitly ask for a transformation in the request parameters. The all-variable, no-bounding-box request will now be routed to that service instead of placing load on HOSS and OPeNDAP.

### HOSS temporal subset request:

This request will combine a variable and temporal subset for a granule in the M2T1NXSLV collection (MERRA-2). The result will include the requested variable and the three associated dimension variables:

* `/H1000`
* `/lat`
* `/lon`
* `/time`

Furthermore, the temporal dimension and the science variable (`/H1000`) will be limited to the specified temporal range. MERRA-2 is gridded at hourly intervals, so the 4-hour time range (12pm - 4pm on 9th June 2021) will return 4 time values.

For the granule being tested, all time values are expressed as minutes since 2021-06-09T00:30:00Z, and each grid-cell spans 30 minutes in either direction from the stated value (cells are centre-aligned). As such:

* 12pm is the leading edge of the cell with a centre value of 12:30pm, which is 720 minutes after 00:30.
* 4pm is the trailing edge of the cell with a centre value of 3:30pm (15:30), which is 900 minutes after 00:30.
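
The arithmetic above can be checked with a short sketch. It assumes a cell is selected when it lies entirely within the requested range, which matches the description here but may not be exactly how HOSS decides; the variable names are illustrative.

```python
from datetime import datetime, timedelta

epoch = datetime(2021, 6, 9, 0, 30)
half_cell = timedelta(minutes=30)
start = datetime(2021, 6, 9, 12, 0)
stop = datetime(2021, 6, 9, 16, 0)

# 24 hourly cell centres, in minutes since the epoch: 0, 60, ..., 1380.
centres = [60 * hour for hour in range(24)]

# Keep cells whose leading and trailing edges both fall inside the range.
selected = [
    minutes
    for minutes in centres
    if epoch + timedelta(minutes=minutes) - half_cell >= start
    and epoch + timedelta(minutes=minutes) + half_cell <= stop
]
print(selected)  # [720, 780, 840, 900]
```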

In [None]:
if test_is_configured(hoss_info, 'temporal_collection'):
    hoss_temporal_file_name = 'hoss_temporal.nc4'
    hoss_temporal_request = Request(
        collection=hoss_info['temporal_collection'],
        granule_id=hoss_info['temporal_granule_id'],
        variables=['H1000'],
        temporal={
            'start': datetime(2021, 6, 9, 12, 0, 0),
            'stop': datetime(2021, 6, 9, 16, 0, 0),
        },
        labels=['hoss-rtests', 'hoss-rtest-5'],
    )

    submit_and_download(harmony_client, hoss_temporal_request, hoss_temporal_file_name)
    assert exists(hoss_temporal_file_name), 'Unsuccessful HOSS temporal request.'

    assert nc4_matches_reference_hash_file(
        hoss_temporal_file_name,
        'reference_files/hoss_temporal_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS temporal request.'

    print_success('HOSS temporal request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### Named dimension subsetting:

The following test will recreate a bounding box subset. However, it will explicitly name the dimension variables, rather than relying on the generic bounding box request parameters, `subset=lat(a:b)&subset=lon(c:d)`. For the RSSMIF16D collection, the longitude and latitude dimensions are named:

* `/latitude` and `/longitude`.
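
As a sketch of what the named-dimension form of those parameters might look like, the snippet below formats dimension ranges using the same `subset=name(min:max)` pattern. `harmony-py` builds the real request itself; `dimension_subset_parameters` is a hypothetical helper for illustration.

```python
from urllib.parse import urlencode


def dimension_subset_parameters(dimensions):
    """Format (name, minimum, maximum) tuples as repeated `subset` parameters."""
    return [
        ('subset', f'{name}({minimum}:{maximum})')
        for name, minimum, maximum in dimensions
    ]


parameters = dimension_subset_parameters(
    [('latitude', -20, -5), ('longitude', 70, 85)]
)
# Keep parentheses and colons unescaped so the result is easy to read.
print(urlencode(parameters, safe='():'))
# subset=latitude(-20:-5)&subset=longitude(70:85)
```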

In [None]:
if test_is_configured(hoss_info, 'collection'):
    hoss_named_dims_file_name = 'hoss_named_dimensions.nc4'
    hoss_named_dims_request = Request(
        collection=hoss_info['collection'],
        granule_id=[hoss_info['granule_id']],
        variables=['atmosphere_cloud_liquid_water_content'],
        dimensions=[Dimension('latitude', -20, -5), Dimension('longitude', 70, 85)],
        labels=['hoss-rtests', 'hoss-rtest-6'],
    )

    submit_and_download(
        harmony_client, hoss_named_dims_request, hoss_named_dims_file_name
    )
    assert exists(
        hoss_named_dims_file_name
    ), 'Unsuccessful HOSS named dimensions request.'

    assert nc4_matches_reference_hash_file(
        hoss_named_dims_file_name,
        'reference_files/hoss_named_dimensions_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS named dimensions request.'

    print_success('HOSS named dimensions request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### Shape file spatial subsetting:

The following request will include a GeoJSON shape file, which HOSS will use to limit the extent of the output grid retrieved from OPeNDAP to a bounding box that minimally encompasses the specified shape. The output will then be passed to MaskFill, which will apply fill values to all pixels inside the bounding box, but outside the GeoJSON shape.
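
The "minimally encompassing" bounding box can be sketched as the extremes of the polygon vertices. This assumes `Polygon` geometries only and a shape that does not cross the Antimeridian; it is not the HOSS implementation, and the triangle below is made-up test data rather than the contents of `amazon_basin.geo.json`.

```python
def minimal_bounding_box(geojson):
    """Return (west, south, east, north) enclosing every polygon vertex."""
    longitudes, latitudes = [], []
    for feature in geojson['features']:
        # A GeoJSON Polygon is a list of rings; each ring is a list of [lon, lat].
        for ring in feature['geometry']['coordinates']:
            for longitude, latitude in ring:
                longitudes.append(longitude)
                latitudes.append(latitude)
    return min(longitudes), min(latitudes), max(longitudes), max(latitudes)


# Made-up triangle for illustration.
triangle = {
    'type': 'FeatureCollection',
    'features': [
        {
            'type': 'Feature',
            'properties': {},
            'geometry': {
                'type': 'Polygon',
                'coordinates': [[[-70, -10], [-50, -15], [-60, 2], [-70, -10]]],
            },
        }
    ],
}
print(minimal_bounding_box(triangle))  # (-70, -15, -50, 2)
```

Pixels inside this box but outside the triangle are the ones MaskFill would be expected to fill.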

In [None]:
hoss_shape_non_prod_information = {
    'collection': Collection(id='C1245618475-EEDTEST'),
    'shape_file_path': 'amazon_basin.geo.json',
    'granule_id': 'G1245649517-EEDTEST',
}

hoss_shape_env = {
    Environment.LOCAL: hoss_shape_non_prod_information,
    Environment.SIT: hoss_shape_non_prod_information,
    Environment.UAT: hoss_shape_non_prod_information,
}

if harmony_environment in hoss_shape_env:
    hoss_shape_info = hoss_shape_env[harmony_environment]
else:
    hoss_shape_info = None

In [None]:
if test_is_configured(hoss_shape_info, 'collection'):
    hoss_shape_file_name = 'hoss_shape_file.nc4'
    hoss_shape_request = Request(
        collection=hoss_shape_info['collection'],
        granule_id=hoss_shape_info['granule_id'],
        shape=hoss_shape_info['shape_file_path'],
        variables=['Grid/precipitationCal'],
        labels=['hoss-rtests', 'hoss-rtest-7'],
    )
    submit_and_download(harmony_client, hoss_shape_request, hoss_shape_file_name)
    assert exists(
        hoss_shape_file_name
    ), 'Unsuccessful HOSS-Geographic polygon spatial subset request.'

    assert nc4_matches_reference_hash_file(
        hoss_shape_file_name,
        'reference_files/hoss_shape_file_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS-Geographic polygon spatial subset request.'

    print_success('HOSS-Geographic polygon spatial subset request.')
else:
    print(
        f'Skipping - HOSS-Geographic is not configured for this test in {harmony_environment}.'
    )

### Request for granule that has bounds variables:

This regression test will use the GPM/IMERGHH collection, which is geographically gridded, to ensure that the expected temporal and spatial range is extracted from a granule with bounds variables:

In [None]:
if test_is_configured(hoss_info, 'bounds_collection'):
    hoss_bounds_file_name = 'hoss_bounds.nc4'
    hoss_bounds_bbox = BBox(w=60, s=-15, e=75, n=0)
    hoss_bounds_request = Request(
        collection=hoss_info['bounds_collection'],
        granule_id=[hoss_info['bounds_granule_id']],
        variables=['Grid/precipitationCal'],
        spatial=hoss_bounds_bbox,
        temporal={
            'start': datetime(2020, 1, 18, 18, 45, 0),
            'stop': datetime(2020, 1, 18, 19, 45, 0),
        },
        labels=['hoss-rtests', 'hoss-rtest-8'],
    )

    submit_and_download(harmony_client, hoss_bounds_request, hoss_bounds_file_name)
    assert exists(hoss_bounds_file_name), 'Unsuccessful HOSS bounds request.'

    assert nc4_matches_reference_hash_file(
        hoss_bounds_file_name,
        'reference_files/hoss_bounds_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS bounds request.'

    print_success('HOSS bounds request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

## HOSS projection-gridded collections:

### Spatial subsetting of projected grid using a bounding box:

This request uses the chained service that combines HOSS with MaskFill to offer bounding box spatial subsetting of projection-gridded data hosted in OPeNDAP.

The request will use the [ABoVE Tundra Vegetation Photosynthesis and Respiration Model (TVPRM) Simulated Net Ecosystem Exchange collection](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1245804308-EEDTEST.html). This collection uses an Albers Conical Equal Area projection.

The request also uses a temporal subset to limit the size of the result.


In [None]:
hoss_projected_non_prod_information = {
    'collection': Collection(id='C1245804308-EEDTEST'),
    'bbox': BBox(w=-160, s=68, e=-150, n=70),
    'shape_file_path': 'north_slope.geo.json',
    'granule_id': 'G1245804356-EEDTEST',
    'temporal_range': {
        'start': datetime(2008, 7, 2, 0, 0, 0),
        'stop': datetime(2008, 7, 2, 1, 0, 0),
    },
}

hoss_projected_env = {
    Environment.LOCAL: hoss_projected_non_prod_information,
    Environment.SIT: hoss_projected_non_prod_information,
    Environment.UAT: hoss_projected_non_prod_information,
}

if harmony_environment in hoss_projected_env:
    hoss_projected_info = hoss_projected_env[harmony_environment]
else:
    hoss_projected_info = None

In [None]:
if test_is_configured(hoss_projected_info, 'collection'):
    hoss_projected_file_name = 'hoss_projected_north_slope.nc4'
    hoss_projected_request = Request(
        collection=hoss_projected_info['collection'],
        granule_id=hoss_projected_info['granule_id'],
        spatial=hoss_projected_info['bbox'],
        variables=['NEE', 'lat', 'lon'],
        temporal=hoss_projected_info['temporal_range'],
        labels=['hoss-rtests', 'hoss-rtest-9'],
    )
    submit_and_download(
        harmony_client, hoss_projected_request, hoss_projected_file_name
    )
    assert exists(
        hoss_projected_file_name
    ), 'Unsuccessful HOSS-Projection-gridded bounding box request.'

    assert nc4_matches_reference_hash_file(
        hoss_projected_file_name,
        'reference_files/hoss_projected_north_slope_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS-Projection-gridded bounding box request.'

    print_success('HOSS-projection-gridded with bounding box.')
else:
    print(
        f'Skipping - HOSS Projection-Gridded is not configured for this test in {harmony_environment}.'
    )

### Spatial subsetting of projected grid using GeoJSON polygon:

This request uses the chained service that combines HOSS with MaskFill to offer shape file subsetting of projection-gridded data hosted in OPeNDAP.

The request will use the [ABoVE Tundra Vegetation Photosynthesis and Respiration Model (TVPRM) Simulated Net Ecosystem Exchange collection](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1245804308-EEDTEST.html). This collection uses an Albers Conical Equal Area projection.

The request also uses a temporal subset to limit the size of the result.

In [None]:
if test_is_configured(hoss_projected_info, 'collection'):
    hoss_projected_shape_file_name = 'hoss_projected_north_slope_shape.nc4'
    hoss_projected_request = Request(
        collection=hoss_projected_info['collection'],
        granule_id=hoss_projected_info['granule_id'],
        shape=hoss_projected_info['shape_file_path'],
        variables=['NEE', 'lat', 'lon'],
        temporal=hoss_projected_info['temporal_range'],
        labels=['hoss-rtests', 'hoss-rtest-10'],
    )
    submit_and_download(
        harmony_client, hoss_projected_request, hoss_projected_shape_file_name
    )
    assert exists(
        hoss_projected_shape_file_name
    ), 'Unsuccessful HOSS-Projection-Gridded shape file spatial subset request.'

    assert nc4_matches_reference_hash_file(
        hoss_projected_shape_file_name,
        'reference_files/hoss_projected_north_slope_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS-Projection-gridded shape file request.'

    print_success('Subsetting projected grid with shapefile.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### HOSS bounding box spatial subset:  <a id="spatial-subset"></a>

This request will limit the spatial extent of the returned output. This request will be fulfilled differently depending on the UMM-S record associated with the data.

SPL3FTP_E is associated with the sds/HOSS-HRS-GeoTIFF service, which will call:
  * `query-cmr` to filter granules to those with matching spatial coverage.
  * `harmony-opendap-subsetter` to perform HOSS operations and extract a rectangular portion of the longitude latitude grid. This will match the bounding box.
  * `maskfill-harmony` to fill any pixels in the rectangular array segment.
  * `harmony-metadata-annotator` to add missing metadata.

TODO: Add Environment.PROD once NSIDC updates SPL3FTP_E in the production environment.


In [None]:
hoss_bbox_spatial_non_prod_information = {
    'collection': Collection(id='C1268616149-EEDTEST'),
    'granule_id': 'G1268616175-EEDTEST',
}

hoss_bbox_spatial_prod_information = {
    'collection': Collection(id='C2776463920-NSIDC_ECS'),
    'granule_id': 'G2903366136-NSIDC_ECS',
}

hoss_bbox_spatial_env = {
    Environment.LOCAL: hoss_bbox_spatial_non_prod_information,
    Environment.SIT: hoss_bbox_spatial_non_prod_information,
    Environment.UAT: hoss_bbox_spatial_non_prod_information,
}

if harmony_environment in hoss_bbox_spatial_env:
    hoss_bbox_spatial_info = hoss_bbox_spatial_env[harmony_environment]
else:
    hoss_bbox_spatial_info = None

In [None]:
if test_is_configured(hoss_bbox_spatial_info, 'collection'):
    hoss_bbox_spatial_file_name = 'hoss_bbox_3d_no_dimensions_zyx_order.nc4'
    hoss_bbox_spatial_bbox = BBox(w=-70, s=60, e=-10, n=85)
    hoss_bbox_spatial_request = Request(
        collection=hoss_bbox_spatial_info['collection'],
        granule_id=[hoss_bbox_spatial_info['granule_id']],
        variables=[
            'Freeze_Thaw_Retrieval_Data_Global/altitude_dem',
            'Freeze_Thaw_Retrieval_Data_Global/transition_state_flag',
            'Freeze_Thaw_Retrieval_Data_Polar/altitude_dem',
            'Freeze_Thaw_Retrieval_Data_Polar/transition_state_flag',
        ],
        spatial=hoss_bbox_spatial_bbox,
        labels=['hoss-rtests', 'hoss-rtest-11'],
    )

    submit_and_download(
        harmony_client, hoss_bbox_spatial_request, hoss_bbox_spatial_file_name
    )
    assert exists(
        hoss_bbox_spatial_file_name
    ), 'Unsuccessful HOSS bounding box spatial, 3-D (well ordered) request.'

    assert nc4_matches_reference_hash_file(
        hoss_bbox_spatial_file_name,
        'reference_files/hoss_bbox_3d_no_dimensions_zyx_order_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS bounding box spatial, 3-D (well ordered) request.'

    print_success('HOSS bounding box 3d no dimensions nominal order request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### HOSS polygon spatial subset for 3D variables - not nominal order ('yxz' order):  <a id="spatial-subset"></a>

This request for a polygon subset of 3D variables in a SPL3SMP granule will be fulfilled using four steps. These can be observed via the https://harmony.uat.earthdata.nasa.gov/workflow-ui endpoint of Harmony.

* `query-cmr` to filter granules to those with matching spatial coverage.
* `harmony-opendap-subsetter` to perform HOSS operations and extract a rectangular portion of the longitude latitude grid. This will match the bounding box.
* `sds/maskfill-harmony` to fill any pixels in the rectangular array segment, but outside the GeoJSON shape.
* `harmony-metadata-annotator` to add missing metadata.

TODO: Add Environment.PROD once NSIDC updates SPL3SMP in the production environment.


In [None]:
hoss_shape_file_polygon_spatial_3d_non_prod_information = {
    'collection': Collection(id='C1268452378-EEDTEST'),
    'granule_id': 'G1268452388-EEDTEST',
    'shape_file_path': 'usa-mainland.geojson',
}

hoss_shape_file_polygon_spatial_3d_prod_information = {
    'collection': Collection(id='C2776463935-NSIDC_ECS'),
    'granule_id': 'G2903715759-NSIDC_ECS',
}

hoss_shape_file_polygon_spatial_3d_env = {
    Environment.LOCAL: hoss_shape_file_polygon_spatial_3d_non_prod_information,
    Environment.SIT: hoss_shape_file_polygon_spatial_3d_non_prod_information,
    Environment.UAT: hoss_shape_file_polygon_spatial_3d_non_prod_information,
}

if harmony_environment in hoss_shape_file_polygon_spatial_3d_env:
    hoss_shape_file_polygon_spatial_3d_info = hoss_shape_file_polygon_spatial_3d_env[
        harmony_environment
    ]
else:
    hoss_shape_file_polygon_spatial_3d_info = None

In [None]:
if test_is_configured(hoss_shape_file_polygon_spatial_3d_info, 'collection'):
    hoss_shape_file_polygon_spatial_3d_file_name = (
        'hoss_shape_file_3d_no_dimensions_yxz_order.nc4'
    )
    hoss_shape_file_polygon_spatial_3d_request = Request(
        collection=hoss_shape_file_polygon_spatial_3d_info['collection'],
        granule_id=hoss_shape_file_polygon_spatial_3d_info['granule_id'],
        shape=hoss_shape_file_polygon_spatial_3d_info['shape_file_path'],
        variables=[
            'Soil_Moisture_Retrieval_Data_AM/landcover_class',
            'Soil_Moisture_Retrieval_Data_PM/landcover_class_pm',
        ],
        labels=['hoss-rtests', 'hoss-rtest-12'],
    )

    submit_and_download(
        harmony_client,
        hoss_shape_file_polygon_spatial_3d_request,
        hoss_shape_file_polygon_spatial_3d_file_name,
    )

    assert exists(
        hoss_shape_file_polygon_spatial_3d_file_name
    ), 'Unsuccessful HOSS polygon 3D variables no dimensions not nominal order request.'

    assert nc4_matches_reference_hash_file(
        hoss_shape_file_polygon_spatial_3d_file_name,
        'reference_files/hoss_shape_file_3d_no_dimensions_yxz_order_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS polygon spatial, 3-D (poorly ordered) request.'

    print_success('HOSS polygon spatial subset for 3D variables request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')

### HOSS polygon spatial subset:  <a id="spatial-subset"></a>

This request will limit the spatial extent of the returned output. This request for a polygon subset in a SPL3SMP_E granule will be fulfilled using four steps. The SPL3SMP_E granule is an example of a 2D granule with global and polar variables that lack dimensions and CF attributes. The test performs a spatial polygon subset of 2D global and polar variables. The steps can be observed via the https://harmony.uat.earthdata.nasa.gov/workflow-ui endpoint of Harmony.

* `query-cmr` to filter granules to those with matching spatial coverage.
* `harmony-opendap-subsetter` to perform HOSS operations and extract a rectangular portion of the longitude latitude grid. This will match the bounding box.
* `sds/maskfill-harmony` to fill any pixels in the rectangular array segment, but outside the GeoJSON shape.
* `harmony-metadata-annotator` to add missing metadata.

TODO: Add Environment.PROD once NSIDC updates SPL3SMP_E in the production environment.


In [None]:
hoss_shape_file_polygon_spatial_non_prod_information = {
    'collection': Collection(id='C1268399578-EEDTEST'),
    'granule_id': 'G1268399648-EEDTEST',
    'shape_file_path': 'usa-mainland.geojson',
}

hoss_shape_file_polygon_spatial_prod_information = {
    'collection': Collection(id='C2776463943-NSIDC_ECS'),
    'granule_id': 'G2906217596-NSIDC_ECS',
}

hoss_shape_file_polygon_spatial_env = {
    Environment.LOCAL: hoss_shape_file_polygon_spatial_non_prod_information,
    Environment.SIT: hoss_shape_file_polygon_spatial_non_prod_information,
    Environment.UAT: hoss_shape_file_polygon_spatial_non_prod_information,
}

if harmony_environment in hoss_shape_file_polygon_spatial_env:
    hoss_shape_file_polygon_spatial_info = hoss_shape_file_polygon_spatial_env[
        harmony_environment
    ]
else:
    hoss_shape_file_polygon_spatial_info = None

In [None]:
if test_is_configured(hoss_shape_file_polygon_spatial_info, 'collection'):
    hoss_shape_file_polygon_spatial_file_name = 'hoss_shape_file_2d_no_dimensions.nc4'
    hoss_shape_file_polygon_spatial_request = Request(
        collection=hoss_shape_file_polygon_spatial_info['collection'],
        granule_id=hoss_shape_file_polygon_spatial_info['granule_id'],
        shape=hoss_shape_file_polygon_spatial_info['shape_file_path'],
        variables=[
            'Soil_Moisture_Retrieval_Data_AM/surface_flag',
            'Soil_Moisture_Retrieval_Data_PM/surface_flag_pm',
            'Soil_Moisture_Retrieval_Data_Polar_AM/surface_flag',
            'Soil_Moisture_Retrieval_Data_Polar_PM/surface_flag_pm',
        ],
        labels=['hoss-rtests', 'hoss-rtest-13'],
    )

    submit_and_download(
        harmony_client,
        hoss_shape_file_polygon_spatial_request,
        hoss_shape_file_polygon_spatial_file_name,
    )

    assert exists(
        hoss_shape_file_polygon_spatial_file_name
    ), 'Unsuccessful HOSS polygon spatial subset, no dimensions request.'

    assert nc4_matches_reference_hash_file(
        hoss_shape_file_polygon_spatial_file_name,
        'reference_files/hoss_shape_file_2d_no_dimensions_reference.json',
        skipped_metadata_attributes=skipped_metadata_attributes,
    ), 'Test output does not match reference file, HOSS polygon spatial subset no dimensions request.'

    print_success('HOSS polygon 2d no dimensions spatial subset request.')
else:
    print(f'Skipping - HOSS is not configured for this test in {harmony_environment}.')