# Regression test suite for services managed by the Data Services team:

This notebook provides condensed examples of using Harmony to make requests against the services developed and managed by the Data Services team on the Transformation Train. These services currently include:

* Swath Projector (a.k.a. SWOT Reprojector): `sds/swot-reproject`. A service that projects L2 swath data to a grid.
* Variable Subsetter: `sds/variable-subsetter`. A service that extracts a subset of granule variables from OPeNDAP to provide a smaller, specific output product.
* Harmony OPeNDAP SubSetter (HOSS): `sds/HOSS`. A service for geographic and projected gridded collections, allowing variable and bounding-box spatial subsetting.
* MaskFill: `sds/maskfill`. A service that sets values outside of a user-defined GeoJSON shape to a fill value.
* Trajectory Subsetter: `sds/trajectory-subsetter`. A service that performs variable, bounding box spatial, shape file spatial and temporal subsetting on segmented trajectory data.

Several configuration tips came from [this blog post](https://towardsdatascience.com/introduction-to-papermill-2c61f66bea30).

## Prerequisites

The dependencies for this notebook are listed in the [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill`

A `.netrc` file must also be located in the `test` directory of this repository.
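
As a quick pre-flight check, the following sketch uses the standard-library `netrc` module to confirm that a `.netrc` file contains credentials for a given host. The helper name `has_netrc_entry` is hypothetical, and the host name in the usage note is an assumption (Earthdata Login for UAT); adjust it for the environment being tested.

```python
from netrc import netrc
from pathlib import Path


def has_netrc_entry(netrc_path: str, host: str) -> bool:
    """ Return True if the .netrc file exists and has credentials for `host`. """
    if not Path(netrc_path).is_file():
        return False

    # authenticators() returns None when the file has no entry for the host
    # (and no `default` entry).
    return netrc(netrc_path).authenticators(host) is not None
```

For example, `has_netrc_entry('test/.netrc', 'uat.urs.earthdata.nasa.gov')` should return `True` before the notebook is run against UAT, assuming that is the host the credentials are stored under.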

## Import requirements:

In [None]:
from datetime import datetime
from os import listdir, remove, replace
from os.path import exists
from typing import List

from h5py import File as H5File
from harmony import BBox, Client, Collection, Dimension, Environment, Request
from harmony.harmony import ProcessingFailedException
from netCDF4 import Dataset
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

## Set default parameters:

`papermill` requires default values for the parameters used in the workflow. In this case, the only parameter is `harmony_host_url`.

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'
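
To run the notebook against a different environment, the default above can be overridden when `papermill` is invoked. The sketch below only builds the command line, using the `-p name value` option of the `papermill` CLI. The helper name and both notebook file names are hypothetical, and it assumes the default-parameter cell is tagged `parameters` so that `papermill` can find it.

```python
from typing import Dict, List


def build_papermill_command(input_notebook: str, output_notebook: str,
                            parameters: Dict[str, str]) -> List[str]:
    """ Build a `papermill` command line that overrides notebook parameters. """
    command = ['papermill', input_notebook, output_notebook]

    for name, value in parameters.items():
        # Each `-p name value` pair overrides one default parameter.
        command.extend(['-p', name, value])

    return command


command = build_papermill_command(
    'regression_tests.ipynb',   # hypothetical input notebook name
    'regression_output.ipynb',  # hypothetical output notebook name
    {'harmony_host_url': 'https://harmony.sit.earthdata.nasa.gov'},
)
print(' '.join(command))
```

The resulting list could be passed to `subprocess.run` in an environment where `papermill` is installed.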

### Identify Harmony environment (for easier reference):

In [None]:
host_environment = {'http://localhost:3000': Environment.LOCAL,
                    'https://harmony.sit.earthdata.nasa.gov': Environment.SIT,
                    'https://harmony.uat.earthdata.nasa.gov': Environment.UAT,
                    'https://harmony.earthdata.nasa.gov': Environment.PROD}


harmony_environment = host_environment.get(harmony_host_url)

if harmony_environment is not None:
    harmony_client = Client(env=harmony_environment)

## Helper functions:

In [None]:
def print_error(error_string: str) -> None:
    """ Print an error, with formatting for red text. """
    print(f'\033[91m{error_string}\033[0m')


def print_success(success_string: str) -> None:
    """ Print a success message, with formatting for green text. """
    print(f'\033[92mSuccess: {success_string}\033[0m')

### Download helper function:

In [None]:
def submit_and_download(harmony_client: Client, request: Request, output_file_name: str):
    """ Submit a Harmony request via a `harmony-py` client. Wait for the Harmony job to
        finish, then download the results to the specified file path.

    """
    downloaded_filename = None
    
    try:
        job_id = harmony_client.submit(request)


        for filename in [file_future.result()
                         for file_future
                         in harmony_client.download_all(job_id, overwrite=True)]:

            print(f'Downloaded: {filename}')
            downloaded_filename = filename

        if downloaded_filename is not None:
            replace(downloaded_filename, output_file_name)
            print(f'Saved output to: {output_file_name}')

    except ProcessingFailedException:
        print_error('Harmony request failed to complete successfully.')

### Helper functions to check variables in output file:

In [None]:
def variable_in_dataset(dataset: Dataset, variable_name: str) -> bool:
    """ Check if a variable is present in a dataset. The variable name must
        be the full path, including groups it is nested in.

    """
    variable_bits = variable_name.lstrip('/').split('/')
    working_group = dataset
    
    while len(variable_bits) > 1:
        group = variable_bits.pop(0)
        if group in working_group.groups:
            working_group = working_group[group]
        else:
            return False

    variable_base_name = variable_bits.pop(0)
    return variable_base_name in working_group.variables
    

def all_variables_present(file_name: str, variable_list: List[str]) -> bool:
    """ Take a list of variables and ensure that all of them are present in the
        downloaded NetCDF-4 file.

    """
    with Dataset(file_name, 'r') as dataset:
        return all(variable_in_dataset(dataset, variable) for variable in variable_list)


def variable_values_all_in_range(file_name: str, variable_name: str,
                                  minimum_value: float, maximum_value: float) -> bool:
    """ Ensure that all values in a specified variable are within a specified range. """
    with Dataset(file_name, 'r') as dataset:
        variable_values = dataset[variable_name][:]

    return variable_values.max() <= maximum_value and variable_values.min() >= minimum_value
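
The group traversal used by `variable_in_dataset` can be illustrated without a NetCDF-4 file. The sketch below is a simplified re-implementation, not the helper itself: `FakeGroup` is a hypothetical stand-in that only mimics the `groups` and `variables` attributes of a `netCDF4` `Dataset` or `Group`.

```python
from typing import Dict, Optional, Set


class FakeGroup:
    """ Hypothetical stand-in for a netCDF4 Dataset or Group. """

    def __init__(self, variables: Set[str],
                 groups: Optional[Dict[str, 'FakeGroup']] = None):
        self.variables = variables
        self.groups = groups or {}


def variable_in_group_tree(root: FakeGroup, variable_name: str) -> bool:
    """ Walk the group hierarchy, then look for the variable's base name. """
    *group_names, base_name = variable_name.lstrip('/').split('/')
    working_group = root

    for group_name in group_names:
        if group_name not in working_group.groups:
            return False
        working_group = working_group.groups[group_name]

    return base_name in working_group.variables


land_segments = FakeGroup({'dem_h', 'latitude', 'longitude'})
root = FakeGroup(set(), {'gt1l': FakeGroup(set(), {'land_segments': land_segments})})

print(variable_in_group_tree(root, '/gt1l/land_segments/dem_h'))   # True
print(variable_in_group_tree(root, '/gt1l/signal_photons/dem_h'))  # False
```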

### Plotting helper functions

In [None]:
def create_plot(variable_data, x_values, y_values, title=None, colourbar_units=None,
                x_label=None, y_label=None, levels=20, fill_value=None):
    """ This helper function will display a contour plot of the requested data. This
        function assumes the variable data are gridded in two or three dimensions, with
        dimensions: (time, latitude, longitude) or (latitude, longitude).
        
        For 3-dimensional data, the first slice in time is extracted for plotting.

    """
    masked_variable = np.ma.masked_where(variable_data[:] == fill_value, variable_data)

    fig = plt.figure(figsize=(10, 10))

    if title is not None:
        fig.suptitle(title, fontsize=20)

    ax = plt.axes(xlabel=x_label, ylabel=y_label)

    if len(variable_data.shape) == 3:
        variable_slice = masked_variable[0][:]
    else:
        variable_slice = masked_variable

    # Some data are lon, lat rather than lat, lon (Python is row, column)
    if variable_slice.shape == (len(x_values[:]), len(y_values[:])):
        variable_slice = variable_slice.T

    # Plot masked data:
    colour_scale = ax.contourf(x_values[:], y_values[:], variable_slice, levels=levels)
    
    # Add colour bar for scaling
    colour_bar = plt.colorbar(colour_scale, ax=ax, orientation='horizontal', pad=0.05)

    if colourbar_units is not None:
        colour_bar.set_label(colourbar_units, fontsize=14)

    plt.tight_layout()
    plt.show()


def plot_variable(file_name, variable, x_variable, y_variable, title, colourbar_units,
                  x_label, y_label, levels=20, fill_value=None):
    """ Open the requested NetCDF-4 file and pass the variables through to the `create_plot`
        function.

    """
    if file_name.endswith('.nc4'):
        # Swath Projector, Variable Subsetter and HOSS
        with Dataset(file_name, 'r') as dataset:
            create_plot(dataset[variable], dataset[x_variable], dataset[y_variable],
                        title=title, colourbar_units=colourbar_units, x_label=x_label,
                        y_label=y_label, levels=levels, fill_value=fill_value)
    elif file_name.endswith('.h5'):
        # MaskFill
        with H5File(file_name, 'r') as h5_file:
            create_plot(h5_file[variable], h5_file[x_variable], h5_file[y_variable],
                        title=title, colourbar_units=colourbar_units, x_label=x_label,
                        y_label=y_label, levels=levels, fill_value=fill_value)
    else:
        print_error('Problem with request, not able to plot output.')

# Begin regression tests:

## Swath Projector:

The Swath Projector is currently only configured for collections in UAT.

In [None]:
swath_projector_non_prod_information = {'collection': Collection(id='C1233860183-EEDTEST'),
                                        'granule_id': 'G1233860549-EEDTEST'}

swath_projector_env = {Environment.LOCAL: swath_projector_non_prod_information,
                       Environment.SIT: swath_projector_non_prod_information,
                       Environment.UAT: swath_projector_non_prod_information}

if harmony_environment in swath_projector_env:
    swath_projector_info = swath_projector_env[harmony_environment]
else:
    swath_projector_info = None

### Swath Projector request with defaults:

Make a request that only specifies the collection and an appropriate granule. This should rely on the default target Coordinate Reference System (CRS) and interpolation method.

In [None]:
if swath_projector_info is not None:
    defaults_file_name = 'swath_projector_defaults.nc4'
    defaults_request = Request(collection=swath_projector_info['collection'],
                               granule_id=[swath_projector_info['granule_id']])
    
    submit_and_download(harmony_client, defaults_request, defaults_file_name)
    assert exists(defaults_file_name), 'Unsuccessful Swath Projector defaults request.'

    expected_variables = ['/lat', '/lon', '/latitude_longitude', '/time', '/alpha_var', '/blue_var', '/green_var', '/red_var']
    assert all_variables_present(defaults_file_name, expected_variables), 'Missing variables in downloaded output'

    plot_variable(defaults_file_name, 'alpha_var', 'lon', 'lat', title='Default parameters Africa granule',
                  colourbar_units='Land mask', x_label='Longitude (degrees east)', y_label='Latitude (degrees north)')

    print_success('Swath projector with default parameters.')
else:
    print(f'The Swath Projector is not configured for environment: "{harmony_environment}" - skipping test.')

### Swath Projector request for Madagascar:

Make a request to the Swath Projector specifying a target CRS using an EPSG code, and requesting that the target grid covers only the area surrounding Madagascar, using the `scaleExtent` parameter.

In [None]:
if swath_projector_info is not None:
    epsg_file_name = 'swath_projector_epsg.nc4'
    epsg_request = Request(collection=swath_projector_info['collection'],
                           granule_id=[swath_projector_info['granule_id']],
                           crs='EPSG:4326', scale_extent=[42, -27, 52, -10],
                           temporal={'start': datetime(2020, 1, 15), 'stop': datetime(2020, 1, 16)})
    
    submit_and_download(harmony_client, epsg_request, epsg_file_name)
    assert exists(epsg_file_name), 'Unsuccessful Swath Projector EPSG code request.'

    expected_variables = ['/lat', '/lon', '/latitude_longitude', '/time', '/alpha_var', '/blue_var', '/green_var', '/red_var']
    assert all_variables_present(epsg_file_name, expected_variables), 'Missing variables in downloaded output'

    plot_variable(epsg_file_name, 'alpha_var', 'lon', 'lat', title='EPSG:4326 output Africa granule',
                  colourbar_units='Land mask', x_label='Longitude (degrees east)', y_label='Latitude (degrees north)')

    print_success('Swath Projector EPSG code request.')
else:
    print(f'The Swath Projector is not configured for environment: "{harmony_environment}" - skipping test.')

### Swath Projector, interpolation type and Proj4:

Use the `interpolation` and `outputCrs` parameters to ensure a raw Proj4 string is valid input and that the user can select a non-default interpolation type.

In [None]:
if swath_projector_info is not None:
    proj4_string_file_name = 'swath_projector_proj4.nc4'
    proj4_lcc = '+proj=lcc +lat_1=43 +lat_2=62 +lat_0=30 +lon_0=10 +x_0=0 +y_0=0 +ellps=intl +units=m +no_defs'
    proj4_string_request = Request(collection=swath_projector_info['collection'],
                                   granule_id=[swath_projector_info['granule_id']],
                                   crs=proj4_lcc, interpolation='near',
                                   temporal={'start': datetime(2020, 1, 15), 'stop': datetime(2020, 1, 16)})
    
    submit_and_download(harmony_client, proj4_string_request, proj4_string_file_name)
    assert exists(proj4_string_file_name), 'Unsuccessful Swath Projector interpolation and Proj4 request.'

    expected_variables = ['/x', '/y', '/lambert_conformal_conic', '/time', '/alpha_var', '/blue_var', '/green_var', '/red_var']
    assert all_variables_present(proj4_string_file_name, expected_variables), 'Missing variables in downloaded output'

    plot_variable(proj4_string_file_name, 'alpha_var', 'x', 'y', title='Lambert Conformal Conic CRS, Africa granule',
                  colourbar_units='Land mask', x_label='x (m)', y_label='y (m)')

    print_success('Swath Projector interpolation and Proj4 request')
else:
    print(f'The Swath Projector is not configured for environment: "{harmony_environment}" - skipping test.')

### Swath Projector asynchronous request:

This test has been removed: `harmony-py` requests are asynchronous by default.

## Variable Subsetter

The Variable Subsetter is currently only configured for collections in UAT.

The granule selected is the smallest in the ATL08 collection, to improve performance of tests.

In [None]:
var_subsetter_non_prod_information = {'collection': Collection(id='C1234714698-EEDTEST'),
                                      'granule_id': 'G1238479209-EEDTEST'}

var_subsetter_env = {Environment.LOCAL: var_subsetter_non_prod_information,
                     Environment.SIT: var_subsetter_non_prod_information,
                     Environment.UAT: var_subsetter_non_prod_information}

if harmony_environment in var_subsetter_env:
    var_subsetter_info = var_subsetter_env[harmony_environment]
else:
    var_subsetter_info = None

### Variable Subsetter request, no Int64 variables

This request should retrieve the requested `/gt1l/land_segments/dem_h` variable alongside the following supporting variables:

* `/gt1l/land_segments/delta_time`
* `/gt1l/land_segments/latitude`
* `/gt1l/land_segments/longitude`

In [None]:
if var_subsetter_info is not None:
    no_int64_file_name = 'var_subsetter_no_int64.nc4'
    no_int64_request = Request(collection=var_subsetter_info['collection'],
                               granule_id=[var_subsetter_info['granule_id']],
                               variables=['/gt1l/land_segments/dem_h'])

    submit_and_download(harmony_client, no_int64_request, no_int64_file_name)
    assert exists(no_int64_file_name), 'Unsuccessful non-Int64 Variable Subsetter request.'

    expected_variables = ['/gt1l/land_segments/dem_h', '/gt1l/land_segments/delta_time',
                          '/gt1l/land_segments/latitude', '/gt1l/land_segments/longitude']
    
    assert all_variables_present(no_int64_file_name, expected_variables), 'Missing variables in downloaded output'

    print_success('Variable Subsetter non-Int64 request.')
else:
    print(f'The Variable Subsetter is not configured for environment: "{harmony_environment}" - skipping test.')

### Variable Subsetter request and Int64 variables

This request should retrieve the `/gt1l/signal_photons/classed_pc_flag` variable and 6 supporting variables, some of which are Int64, which is not supported by the DAP2 protocol:

* `/gt1l/signal_photons/delta_time`
* `/gt1l/land_segments/ph_ndx_beg`
* `/gt1l/land_segments/n_seg_ph`
* `/gt1l/land_segments/delta_time`
* `/gt1l/land_segments/latitude`
* `/gt1l/land_segments/longitude`

In [None]:
if var_subsetter_info is not None:
    int64_file_name = 'var_subsetter_int64.nc4'
    int64_request = Request(collection=var_subsetter_info['collection'],
                            granule_id=[var_subsetter_info['granule_id']],
                            variables=['/gt1l/signal_photons/classed_pc_flag'])

    submit_and_download(harmony_client, int64_request, int64_file_name)
    assert exists(int64_file_name), 'Unsuccessful Int64 Variable Subsetter request.'

    expected_variables = ['/gt1l/signal_photons/classed_pc_flag', '/gt1l/signal_photons/delta_time',
                          '/gt1l/land_segments/ph_ndx_beg', '/gt1l/land_segments/n_seg_ph',
                          '/gt1l/land_segments/delta_time', '/gt1l/land_segments/latitude',
                          '/gt1l/land_segments/longitude']
    
    assert all_variables_present(int64_file_name, expected_variables), 'Missing variables in downloaded output'

    print_success('Variable Subsetter Int64 request.')
else:
    print(f'The Variable Subsetter is not configured for environment: "{harmony_environment}" - skipping test.')

### Variable Subsetter, all variables

Make a request for "all" variables. This should retrieve the entire file, with all the variables from the original source granule.

This test is now live, as improvements to OPeNDAP performance have mitigated a timeout. Progress!!!

In [None]:
if var_subsetter_info is not None:
    all_variables_file_name = 'var_subsetter_all_vars.nc4'
    all_variables_request = Request(collection=var_subsetter_info['collection'],
                                    granule_id=[var_subsetter_info['granule_id']])

    submit_and_download(harmony_client, all_variables_request, all_variables_file_name)
    assert exists(all_variables_file_name), 'Unsuccessful Variable Subsetter all variable request.'

    # TODO: Recursive check of all variables and groups from input granule.

    print_success('Variable Subsetter all variable request.')
else:
    print(f'The Variable Subsetter is not configured for environment: "{harmony_environment}" - skipping test.')
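
One possible approach to the recursive check noted in the `TODO` is sketched below. The function name `list_variable_paths` is hypothetical; it accepts any object exposing `variables` and `groups` mappings, which is the interface `netCDF4` `Dataset` and `Group` objects provide. The `SimpleNamespace` tree is a made-up stand-in so the sketch runs without a granule.

```python
from types import SimpleNamespace
from typing import List


def list_variable_paths(group, prefix: str = '') -> List[str]:
    """ Recursively collect the full paths of all variables in a group tree. """
    paths = [f'{prefix}/{name}' for name in group.variables]

    for group_name, child_group in group.groups.items():
        paths.extend(list_variable_paths(child_group, f'{prefix}/{group_name}'))

    return paths


# Hypothetical stand-in for an open granule (not real ATL08 contents).
fake_granule = SimpleNamespace(
    variables={'ds_geosegments': None},
    groups={'gt1l': SimpleNamespace(
        variables={},
        groups={'land_segments': SimpleNamespace(
            variables={'dem_h': None, 'latitude': None}, groups={})},
    )},
)

print(sorted(list_variable_paths(fake_granule)))
```

Given a local copy of the input granule, the list produced for it could be passed to `all_variables_present` along with the output file name.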

## Harmony OPeNDAP SubSetter (HOSS):

HOSS is currently deployed to Sandbox, SIT, UAT and production. However, it is only associated with collections in UAT. Requests will be made against the RSSMIF16D collection, as mirrored in the EEDTEST CMR provider in UAT.

In [None]:
hoss_non_prod_information = {'collection': Collection(id='C1238392622-EEDTEST'),
                             'granule_id': 'G1245840464-EEDTEST',
                             'temporal_collection': Collection(id='C1245662776-EEDTEST'),
                             'temporal_granule_id': 'G1245662797-EEDTEST',
                             'bounds_collection': Collection(id='C1245618475-EEDTEST'),
                             'bounds_granule_id': 'G1255863984-EEDTEST'}

hoss_env = {Environment.LOCAL: hoss_non_prod_information,
            Environment.SIT: hoss_non_prod_information,
            Environment.UAT: hoss_non_prod_information}

if harmony_environment in hoss_env:
    hoss_info = hoss_env[harmony_environment]
else:
    hoss_info = None

### HOSS bounding box and variable subsetter request

This is a request that exercises the full range of HOSS options: bounding box and variable subsetting.

Requested parameter:

* `/atmosphere_cloud_liquid_water_content`

Additional required parameters (grid dimensions):

* `/latitude`
* `/longitude`
* `/time`

In [None]:
if hoss_info is not None:
    hoss_var_bbox_file_name = 'hoss_var_bbox.nc4'
    hoss_var_bbox_bbox = BBox(w=-150, s=0, e=-105, n=15)
    hoss_var_bbox_request = Request(collection=hoss_info['collection'],
                                    granule_id=[hoss_info['granule_id']],
                                    variables=['atmosphere_cloud_liquid_water_content'],
                                    spatial=hoss_var_bbox_bbox)

    submit_and_download(harmony_client, hoss_var_bbox_request, hoss_var_bbox_file_name)
    assert exists(hoss_var_bbox_file_name), 'Unsuccessful HOSS variable, bounding box request.'

    expected_variables = ['/atmosphere_cloud_liquid_water_content', '/latitude', '/longitude', '/time']
    assert all_variables_present(hoss_var_bbox_file_name, expected_variables), 'Missing variables in HOSS output'

    plot_variable(hoss_var_bbox_file_name, '/atmosphere_cloud_liquid_water_content', '/longitude', '/latitude',
                  title='HOSS variable and bounding box results.', colourbar_units='Columnar cloud liquid water (kg.m-2)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(-0.05, 2.45, 51))

    print_success('HOSS variable and bounding box request.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### HOSS asynchronous request

This test is removed as `harmony-py` requests are all asynchronous.

### HOSS bounding box crosses grid edge:

For collections where the grid edge is the Prime Meridian (0 degrees east) rather than the Antimeridian (180 degrees east), HOSS must still work when a user requests a region that crosses the Prime Meridian (for example, a box containing the UK). In this case, HOSS currently retrieves the specified latitude range across the full longitude range, then fills all values outside the bounding box.

The expected output will look like two vertical stripes of data, one at the left-hand edge of the plot and one at the right-hand edge.
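
The two stripes follow from mapping a bounding box given in the -180 to 180 degree convention onto a grid that runs from 0 to 360 degrees east. The sketch below illustrates that mapping only; it is not the HOSS implementation, and both the function name and the coarse 5-degree grid are hypothetical.

```python
from typing import List


def longitudes_in_bbox(grid_longitudes: List[float], west: float,
                       east: float) -> List[float]:
    """ Select longitudes on a 0 to 360 degree grid that fall inside a bounding
        box whose west and east edges use the -180 to 180 degree convention.

    """
    if east - west >= 360:
        # The box spans the whole globe.
        return list(grid_longitudes)

    wrapped_west = west % 360
    wrapped_east = east % 360

    if wrapped_west <= wrapped_east:
        # The box does not cross the grid edge: one contiguous range.
        return [lon for lon in grid_longitudes
                if wrapped_west <= lon <= wrapped_east]

    # The box crosses the grid edge: two ranges, one at each end of the grid.
    return [lon for lon in grid_longitudes
            if lon >= wrapped_west or lon <= wrapped_east]


# Hypothetical coarse grid: cell centres every 5 degrees from 2.5 to 357.5.
grid_longitudes = [2.5 + 5 * index for index in range(72)]

print(longitudes_in_bbox(grid_longitudes, west=-15, east=15))
# [2.5, 7.5, 12.5, 347.5, 352.5, 357.5]
```

The selected longitudes sit at both ends of the grid, which is why the plotted data appear as a stripe at each edge.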

In [None]:
if hoss_info is not None:
    grid_edge_file_name = 'hoss_grid_edge.nc4'
    grid_edge_bbox = BBox(w=-15, s=-60, e=15, n=-30)
    grid_edge_request = Request(collection=hoss_info['collection'],
                                granule_id=[hoss_info['granule_id']],
                                variables=['atmosphere_cloud_liquid_water_content'],
                                spatial=grid_edge_bbox)

    submit_and_download(harmony_client, grid_edge_request, grid_edge_file_name)
    assert exists(grid_edge_file_name), 'Unsuccessful HOSS request crossing longitudinal edge.'

    expected_variables = ['/atmosphere_cloud_liquid_water_content', '/latitude', '/longitude', '/time']
    assert all_variables_present(grid_edge_file_name, expected_variables), 'Missing variables in grid-edge-crossing output'

    plot_variable(grid_edge_file_name, '/atmosphere_cloud_liquid_water_content', '/longitude', '/latitude',
                  title='HOSS request crossing grid edge.', colourbar_units='Columnar cloud liquid water (kg.m-2)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(-0.05, 2.45, 51))

    print_success('HOSS request crossing longitudinal edge.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### HOSS request no bounding box

If a bounding box is not specified for a HOSS-activated collection, a variable subset will still be performed. The requested variables will be returned, with their full original data.

In [None]:
if hoss_info is not None:
    no_bbox_file_name = 'hoss_no_bbox.nc4'
    no_bbox_request = Request(collection=hoss_info['collection'],
                              granule_id=[hoss_info['granule_id']],
                              variables=['sst_dtime', 'wind_speed'])

    submit_and_download(harmony_client, no_bbox_request, no_bbox_file_name)
    assert exists(no_bbox_file_name), 'Unsuccessful HOSS request without bounding box.'

    expected_variables = ['/sst_dtime', '/wind_speed', '/latitude', '/longitude', '/time']
    assert all_variables_present(no_bbox_file_name, expected_variables), 'Missing variables in no bounding box output'

    plot_variable(no_bbox_file_name, '/wind_speed', '/longitude', '/latitude',
                  title='HOSS request no bounding box.', colourbar_units='Wind speed (m/s)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(0, 50, 51))

    print_success('HOSS request without bounding box.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### HOSS request all variables

If there are no variables specified, HOSS should retrieve all variables. If the bounding box is specified, all gridded variables should still be constrained to the requested spatial region.

In [None]:
if hoss_info is not None:
    hoss_all_vars_file_name = 'hoss_all_vars.nc4'
    hoss_all_vars_bbox = BBox(w=-150, s=0, e=-105, n=15)
    hoss_all_vars_request = Request(collection=hoss_info['collection'],
                                    granule_id=[hoss_info['granule_id']],
                                    spatial=hoss_all_vars_bbox)

    submit_and_download(harmony_client, hoss_all_vars_request, hoss_all_vars_file_name)
    assert exists(hoss_all_vars_file_name), 'Unsuccessful HOSS all-variable request.'

    expected_variables = ['/atmosphere_cloud_liquid_water_content', '/atmosphere_water_vapor_content',
                          '/latitude', '/longitude', '/rainfall_rate', '/sst_dtime', '/time', '/wind_speed']
    assert all_variables_present(hoss_all_vars_file_name, expected_variables), 'Missing variables in HOSS all-variable output'

    plot_variable(hoss_all_vars_file_name, '/atmosphere_cloud_liquid_water_content', '/longitude', '/latitude',
                  title='HOSS all variable results.', colourbar_units='Columnar cloud liquid water (kg.m-2)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(-0.05, 2.45, 51))

    print_success('HOSS all-variable request.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### HOSS request all variables, no bounding box

If no variables and no bounding box are specified, the entire original granule should be retrieved. (This will run the Variable Subsetter branch of the `sds/variable-subsetter` Docker image, skipping any spatial subsetting portion of the service.)

The plotted image should cover the entire Earth (landmasses will be masked).

In [None]:
if hoss_info is not None:
    all_no_bbox_file_name = 'hoss_all_no_bbox.nc4'
    all_no_bbox_request = Request(collection=hoss_info['collection'],
                                  granule_id=[hoss_info['granule_id']])


    submit_and_download(harmony_client, all_no_bbox_request, all_no_bbox_file_name)
    assert exists(all_no_bbox_file_name), 'Unsuccessful HOSS all-variable, no bounding box request.'

    expected_variables = ['/atmosphere_cloud_liquid_water_content', '/atmosphere_water_vapor_content',
                          '/latitude', '/longitude', '/rainfall_rate', '/sst_dtime', '/time', '/wind_speed']
    assert all_variables_present(all_no_bbox_file_name, expected_variables), 'Missing variables in HOSS all-variable, no bbox output'

    plot_variable(all_no_bbox_file_name, '/atmosphere_cloud_liquid_water_content', '/longitude', '/latitude',
                  title='HOSS all variables, no bounding box results.',
                  colourbar_units='Columnar cloud liquid water (kg.m-2)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(-0.05, 2.45, 51))

    print_success('HOSS all-variable, no bounding box request.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### HOSS temporal subset request:

This request will combine a variable and temporal subset for a granule in the M2T1NXSLV collection (MERRA-2). The result will include the requested variable and the three associated dimension variables:

* `/H1000`
* `/lat`
* `/lon`
* `/time`

Furthermore, the temporal dimension and the science variable (`/H1000`) will be limited to the specified temporal range. MERRA-2 is gridded at hourly intervals, so the 4-hour time range (12pm - 4pm on 9th June 2021) will return 4 time values.

For the granule being tested, all time values are expressed as minutes since 2021-06-09T00:30:00Z, and each grid-cell spans 30 minutes in either direction from the stated value (cells are centre-aligned). As such:

* 12pm is the leading edge of the cell with a centre value of 12:30pm, which is 720 minutes after 00:30.
* 4pm is the trailing edge of the cell with a centre value of 3:30pm, which is 900 minutes after 00:30.
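
The arithmetic above can be reproduced with a short sketch. It assumes hourly, centre-aligned cells whose first centre coincides with the time epoch, and it counts a cell only when it lies entirely within the requested range; the function name is hypothetical and this is not necessarily how HOSS selects cells.

```python
from datetime import datetime, timedelta
from typing import List


def cell_centres_in_range(epoch: datetime, start: datetime, stop: datetime,
                          cell_width: timedelta) -> List[float]:
    """ Return the centres, in minutes since `epoch`, of all centre-aligned
        grid cells lying entirely between the start and stop times.

    """
    half_width = cell_width / 2
    centres = []
    centre = epoch  # Assumption: the first cell is centred on the epoch.

    while centre + half_width <= stop:
        if centre - half_width >= start:
            centres.append((centre - epoch).total_seconds() / 60)
        centre += cell_width

    return centres


epoch = datetime(2021, 6, 9, 0, 30)
minutes = cell_centres_in_range(epoch, datetime(2021, 6, 9, 12), datetime(2021, 6, 9, 16),
                                timedelta(hours=1))
print(minutes)  # [720.0, 780.0, 840.0, 900.0]
```

The four values match the range of 720 to 900 minutes checked by the test below.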

In [None]:
if hoss_info is not None:
    hoss_temporal_file_name = 'hoss_temporal.nc4'
    hoss_temporal_request = Request(collection=hoss_info['temporal_collection'],
                                    granule_id=[hoss_info['temporal_granule_id']],
                                    variables=['H1000'],
                                    temporal={'start': datetime(2021, 6, 9, 12, 0, 0),
                                              'stop': datetime(2021, 6, 9, 16, 0, 0)})


    submit_and_download(harmony_client, hoss_temporal_request, hoss_temporal_file_name)
    assert exists(hoss_temporal_file_name), 'Unsuccessful HOSS temporal request.'

    expected_variables = ['/H1000', '/lat', '/lon', '/time']
    assert all_variables_present(hoss_temporal_file_name, expected_variables), 'Missing variables in HOSS temporal output'
    
    assert variable_values_all_in_range(hoss_temporal_file_name, '/time', 720.0, 900.0), 'Temporal dimension not correctly subsetted'

    print_success('HOSS temporal request.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### Named dimension subsetting:

The following test recreates a bounding box subset, but explicitly names the dimension variables rather than relying on the generic bounding box request parameters, `subset=lat(a:b)&subset=lon(c:d)`. For the RSSMIF16D collection, the latitude and longitude dimensions are named:

* `/latitude` and `/longitude`.
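
For illustration, the sketch below builds query parameters in the `subset=name(min:max)` form quoted above, substituting the named dimensions. `harmony-py` constructs the real request internally, so the helper name here is hypothetical and the output is only an assumption about how the named-dimension form would look.

```python
from typing import List, Tuple
from urllib.parse import urlencode


def build_subset_parameters(dimension_ranges: List[Tuple[str, float, float]]) -> str:
    """ Build `subset=name(min:max)` query parameters for named dimensions. """
    parameters = [('subset', f'{name}({minimum}:{maximum})')
                  for name, minimum, maximum in dimension_ranges]

    # Keep parentheses and colons unescaped so the output is easy to read.
    return urlencode(parameters, safe='():')


print(build_subset_parameters([('latitude', -20, -5), ('longitude', 70, 85)]))
# subset=latitude(-20:-5)&subset=longitude(70:85)
```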

In [None]:
if hoss_info is not None:
    hoss_named_dims_file_name = 'hoss_named_dimensions.nc4'
    hoss_named_dims_request = Request(collection=hoss_info['collection'],
                                      granule_id=[hoss_info['granule_id']],
                                      variables=['atmosphere_cloud_liquid_water_content'],
                                      dimensions=[Dimension('latitude', -20, -5),
                                                  Dimension('longitude', 70, 85)])

    submit_and_download(harmony_client, hoss_named_dims_request, hoss_named_dims_file_name)
    assert exists(hoss_named_dims_file_name), 'Unsuccessful HOSS named dimensions request.'

    expected_variables = ['/atmosphere_cloud_liquid_water_content', '/latitude', '/longitude', '/time']
    assert all_variables_present(hoss_named_dims_file_name, expected_variables), 'Missing variables in HOSS output'

    plot_variable(hoss_named_dims_file_name, '/atmosphere_cloud_liquid_water_content', '/longitude', '/latitude',
                  title='HOSS named dimensions results.', colourbar_units='Columnar cloud liquid water (kg.m-2)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(-0.05, 2.45, 51))

    print_success('HOSS named dimensions request.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

## HOSS/MaskFill chained service request:

This request uses the chained service that combines HOSS with MaskFill to offer polygon spatial subsetting for L3/L4 gridded data hosted in OPeNDAP. This request will use a GeoJSON shape file of the Amazon River basin and the GPM/IMERG test collection.
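
Before submitting a shape file, it can help to confirm that its polygons are well formed. The sketch below checks that each polygon ring in a GeoJSON `FeatureCollection` is closed and has at least four positions, as GeoJSON requires of linear rings. The function name is hypothetical, and the example rectangle is made up; it is not the outline in `amazon_basin.geo.json`.

```python
import json


def polygon_rings_are_closed(geojson_text: str) -> bool:
    """ Check that every Polygon ring in a GeoJSON FeatureCollection is closed
        (first and last positions identical) and has at least four positions.

    """
    feature_collection = json.loads(geojson_text)

    for feature in feature_collection['features']:
        geometry = feature['geometry']
        if geometry['type'] != 'Polygon':
            continue
        for ring in geometry['coordinates']:
            if len(ring) < 4 or ring[0] != ring[-1]:
                return False

    return True


# Hypothetical rectangle, NOT the real Amazon River basin outline.
example_shape = json.dumps({
    'type': 'FeatureCollection',
    'features': [{
        'type': 'Feature',
        'properties': {},
        'geometry': {
            'type': 'Polygon',
            'coordinates': [[[-70, -15], [-50, -15], [-50, 0], [-70, 0], [-70, -15]]],
        },
    }],
})

print(polygon_rings_are_closed(example_shape))
```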

In [None]:
hoss_maskfill_non_prod_information = {'collection': Collection(id='C1245618475-EEDTEST'),
                                      'shape_file_path': 'amazon_basin.geo.json',
                                      'granule_id': 'G1245649517-EEDTEST'}

hoss_maskfill_env = {Environment.LOCAL: hoss_maskfill_non_prod_information,
                     Environment.SIT: hoss_maskfill_non_prod_information,
                     Environment.UAT: hoss_maskfill_non_prod_information}

if harmony_environment in hoss_maskfill_env:
    hoss_maskfill_info = hoss_maskfill_env[harmony_environment]
else:
    hoss_maskfill_info = None

In [None]:
if hoss_maskfill_info is not None:
    hoss_maskfill_file_name = 'hoss_maskfill_amazon.nc4'
    hoss_maskfill_request = Request(collection=hoss_maskfill_info['collection'],
                                    granule_id=hoss_maskfill_info['granule_id'],
                                    shape=hoss_maskfill_info['shape_file_path'],
                                    variables=['/Grid/precipitationCal'])
    submit_and_download(harmony_client, hoss_maskfill_request, hoss_maskfill_file_name)
    assert exists(hoss_maskfill_file_name), 'Unsuccessful HOSS/MaskFill polygon spatial subset request.'

    expected_variables = ['/Grid/lat', '/Grid/lat_bnds', '/Grid/lon', '/Grid/lon_bnds', '/Grid/time',
                          '/Grid/time_bnds', '/Grid/precipitationCal']
    assert all_variables_present(hoss_maskfill_file_name, expected_variables), 'Missing variables in HOSS/MaskFill output'


    plot_variable(hoss_maskfill_file_name, '/Grid/precipitationCal', '/Grid/lon', '/Grid/lat',
                  title='HOSS/MaskFill polygon spatial subset.',
                  colourbar_units=r'Calibrated precipitation ($\mathrm{mm\,hr}^{-1}$)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(0, 30, 31))

    print_success('HOSS/MaskFill polygon spatial subset request.')
else:
    print(f'HOSS/MaskFill chained service is not configured for environment: "{harmony_environment}" - skipping test.')


## Spatial subsetting of projected grid using a bounding box.

This request uses the chained service that combines HOSS with MaskFill to offer bounding box spatial subsetting of gridded data hosted in OPeNDAP that use a projected coordinate system.

The request will use the [ABoVE Tundra Vegetation Photosynthesis and Respiration Model (TVPRM) Simulated Net Ecosystem Exchange collection](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1245804308-EEDTEST.html). This collection uses an Albers Conical Equal Area projection.

The request also uses a temporal subset to limit the size of the result.


In [None]:
hoss_projected_non_prod_information = {
    'collection': Collection(id='C1245804308-EEDTEST'),
    'bbox': BBox(w=-160, s=68, e=-150, n=70),
    'shape_file_path': 'north_slope.geo.json',
    'granule_id': 'G1245804356-EEDTEST',
    'temporal_range': {
        'start': datetime(2008, 7, 2, 0, 0, 0),
        'stop': datetime(2008, 7, 2, 1, 0, 0)
    }
}

hoss_projected_env = {
    Environment.LOCAL: hoss_projected_non_prod_information,
    Environment.SIT: hoss_projected_non_prod_information,
    Environment.UAT: hoss_projected_non_prod_information
}

if harmony_environment in hoss_projected_env:
    hoss_projected_info = hoss_projected_env[harmony_environment]
else:
    hoss_projected_info = None

In [None]:
if hoss_projected_info is not None:
    hoss_projected_reference_filename = 'reference_images/hoss_projected_north_slope_reference_one.nc4'
    hoss_projected_filename = 'hoss_projected_north_slope.nc4'
    hoss_projected_request = Request(collection=hoss_projected_info['collection'],
                                     granule_id=hoss_projected_info['granule_id'],
                                     spatial=hoss_projected_info['bbox'],
                                     variables=['NEE', 'lat', 'lon'],
                                     temporal=hoss_projected_info['temporal_range'])
    submit_and_download(harmony_client, hoss_projected_request, hoss_projected_filename)
    assert exists(hoss_projected_filename), 'Unsuccessful HOSS spatial subset request.'
    
    # Load data and make comparison:
    reference_data = xr.open_dataset(hoss_projected_reference_filename)
    output_data = xr.open_dataset(hoss_projected_filename)
    assert output_data.equals(reference_data), 'reference and output datasets did not match'
    print_success('Subsetting projected grid with bounding box.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### Set Cartopy features and plot images of data and control variables. 

In [None]:
if hoss_projected_info is not None:
    import cartopy.crs as ccrs
    import cartopy.feature as cfeature

    land = cfeature.NaturalEarthFeature(category='physical', name='land', scale='50m',
                                        facecolor=cfeature.COLORS['land'])
    ocean = cfeature.NaturalEarthFeature(category='physical', name='ocean', scale='50m',
                                         facecolor=cfeature.COLORS['water'])

    albers_projection = ccrs.AlbersEqualArea(central_longitude=-96.0, central_latitude=40.0,
                                             false_easting=0.0, false_northing=0.0,
                                             standard_parallels=(50.0, 70.0), globe=None)

In [None]:
if hoss_projected_info is not None:
    plt.figure(figsize=(16, 16))

    ax = plt.subplot(1, 2, 1, projection=albers_projection)
    ax.add_feature(ocean)
    ax.add_feature(land)
    ax.gridlines(color='gray', linestyle='dashed', draw_labels=True)
    ax.set_extent([-2330000, -1800000, 3900000, 4500000], albers_projection)
    output_data.lon.plot.pcolormesh(ax=ax, transform=albers_projection,
                                    x='x', y='y', cmap=plt.cm.turbo,
                                    add_colorbar=False, zorder=2)
    plt.title('Longitude (degrees east)')

    ax = plt.subplot(1, 2, 2, projection=albers_projection)
    ax.add_feature(ocean)
    ax.add_feature(land)
    ax.gridlines(color='grey', linestyle='dashed', draw_labels=True)
    ax.set_extent([-2330000, -1800000, 3900000, 4500000], albers_projection)
    output_data.lat.plot.pcolormesh(ax=ax, transform=albers_projection,
                                    x='x', y='y', cmap=plt.cm.turbo,
                                    add_colorbar=False, zorder=2)
    plt.title('Latitude (degrees north)')

In [None]:
if hoss_projected_info is not None:
    plt.figure(figsize=(18, 10))

    ax = plt.axes(projection=albers_projection)
    ax.add_feature(ocean)
    ax.add_feature(land)
    ax.gridlines(color='grey', linestyle='dashed')
    ax.set_extent([-2330000, -1800000, 3900000, 4500000], albers_projection)
    output_data.NEE[0].plot.pcolormesh(ax=ax, transform=albers_projection,
                                       x='x', y='y', cmap=plt.cm.turbo, zorder=2)
    plt.title('Output data for ABoVE TVPRM simulated Net Ecosystem Exchange\n'
              '2008-07-02T00:00:00Z to 2008-07-02T01:00:00Z')
    plt.show()

### Clean up `xarray` objects:

Using a context manager for `xarray.Dataset` closes the dataset, but the `.__exit__` method does not remove the variable assignment, so the datasets are deleted explicitly.
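
The point about context managers can be seen without `xarray` at all. The sketch below uses a hypothetical `TrackedResource` class as a stand-in for a dataset: leaving the `with` block runs `.__exit__`, but the name bound by `as` remains until it is deleted.

```python
class TrackedResource:
    """Hypothetical stand-in for a dataset object that supports `with`."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Closing releases the resource, but cannot unbind the caller's name.
        self.closed = True
        return False


with TrackedResource() as resource:
    pass

# The name is still bound after the block, and the object reports closed.
was_closed = resource.closed

# An explicit `del` removes the name itself.
del resource
try:
    resource
    bound_after_del = True
except NameError:
    bound_after_del = False
```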

In [None]:
if hoss_projected_info is not None:
    del reference_data
    del output_data

## Spatial subsetting of projected grid using a GeoJSON polygon.

This request uses the chained service that combines HOSS with MaskFill to offer shape file spatial subsetting of gridded data hosted in OPeNDAP that use a projected coordinate system.

The request will use the [ABoVE Tundra Vegetation Photosynthesis and Respiration Model (TVPRM) Simulated Net Ecosystem Exchange collection](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1245804308-EEDTEST.html). This collection uses an Albers Conical Equal Area projection.

The request also uses a temporal subset to limit the size of the result.

In [None]:
if hoss_projected_info is not None:
    hoss_projected_reference_filename = 'reference_images/hoss_projected_north_slope_reference_one.nc4'
    hoss_projected_shape_filename = 'hoss_projected_north_slope_shape.nc4'
    hoss_projected_request = Request(collection=hoss_projected_info['collection'],
                                     granule_id=hoss_projected_info['granule_id'],
                                     shape=hoss_projected_info['shape_file_path'],
                                     variables=['NEE', 'lat', 'lon'],
                                     temporal=hoss_projected_info['temporal_range'])
    submit_and_download(harmony_client, hoss_projected_request, hoss_projected_shape_filename)
    assert exists(hoss_projected_shape_filename), 'Unsuccessful HOSS shapefile spatial subset request.'

    # Load output data and compare to a reference image:
    reference_data = xr.open_dataset(hoss_projected_reference_filename)
    output_data = xr.open_dataset(hoss_projected_shape_filename)
    assert output_data.equals(reference_data), 'reference and output datasets did not match'
    print_success('Subsetting projected grid with shapefile.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

### Plot data and control variables 

In [None]:
if hoss_projected_info is not None:
    plt.figure(figsize=(16, 16))

    ax = plt.subplot(1, 2, 1, projection=albers_projection)
    ax.add_feature(ocean)
    ax.add_feature(land)
    ax.gridlines(color='gray', linestyle='dashed', draw_labels=True)
    ax.set_extent([-2330000, -1800000, 3900000, 4500000], albers_projection)
    output_data.lon.plot.pcolormesh(ax=ax, transform=albers_projection,
                                    x='x', y='y', cmap=plt.cm.turbo,
                                    add_colorbar=False, zorder=2)
    plt.title('Longitude (degrees east)')

    ax = plt.subplot(1, 2, 2, projection=albers_projection)
    ax.add_feature(ocean)
    ax.add_feature(land)
    ax.gridlines(color='grey', linestyle='dashed', draw_labels=True)
    ax.set_extent([-2330000, -1800000, 3900000, 4500000], albers_projection)
    output_data.lat.plot.pcolormesh(ax=ax, transform=albers_projection,
                                    x='x', y='y', cmap=plt.cm.turbo,
                                    add_colorbar=False, zorder=2)
    plt.title('Latitude (degrees north)')

In [None]:
if hoss_projected_info is not None:
    plt.figure(figsize=(18, 10))

    ax = plt.axes(projection=albers_projection)
    ax.add_feature(ocean)
    ax.add_feature(land)
    ax.gridlines(color='grey', linestyle='dashed')
    ax.set_extent([-2330000, -1800000, 3900000, 4500000], albers_projection)
    output_data.NEE[0].plot.pcolormesh(ax=ax, transform=albers_projection,
                                       x='x', y='y', cmap=plt.cm.turbo, zorder=2)
    plt.title('Output data for ABoVE TVPRM simulated Net Ecosystem Exchange\n'
              '2008-07-02T00:00:00Z to 2008-07-02T01:00:00Z')
    plt.show()

### Clean up `xarray` objects:

In [None]:
if hoss_projected_info is not None:
    del reference_data
    del output_data

### Request for granule that has bounds variables:

This regression test uses the GPM/IMERGHH collection, which is geographically gridded, to ensure that the expected temporal and spatial ranges are extracted from a granule that has bounds variables.
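
For readers unfamiliar with bounds variables, the sketch below shows the kind of relationship they are expected to hold: each cell centre sits between its lower and upper bound, and the overall extent of a dimension follows from the outermost bounds. It assumes CF-style bounds stored as (lower, upper) pairs. The function names and the longitude values are hypothetical, and this is not the check performed by the regression test itself.

```python
def bounds_enclose_centres(centres, bounds):
    """True if each centre lies within its own (lower, upper) bounds pair."""
    return len(centres) == len(bounds) and all(
        lower <= centre <= upper
        for centre, (lower, upper) in zip(centres, bounds)
    )


def extent_from_bounds(bounds):
    """Return the (minimum, maximum) edge covered by all cells."""
    return (min(lower for lower, _ in bounds),
            max(upper for _, upper in bounds))


# Hypothetical 0.5 degree longitude cells, not values from an IMERG granule.
lon = [60.25, 60.75, 61.25]
lon_bnds = [(60.0, 60.5), (60.5, 61.0), (61.0, 61.5)]

centres_enclosed = bounds_enclose_centres(lon, lon_bnds)
lon_extent = extent_from_bounds(lon_bnds)
```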

In [None]:
if hoss_info is not None:
    hoss_bounds_file_name = 'hoss_bounds.nc4'
    hoss_bounds_reference_file_name = 'reference_images/hoss_bounds_reference_one.nc4'
    hoss_bounds_bbox = BBox(w=60, s=-15, e=75, n=0)
    hoss_bounds_request = Request(collection=hoss_info['bounds_collection'],
                                  granule_id=[hoss_info['bounds_granule_id']],
                                  variables=['/Grid/precipitationCal'],
                                  spatial=hoss_bounds_bbox,
                                  temporal={'start': datetime(2020, 1, 18, 18, 45, 0),
                                            'stop': datetime(2020, 1, 18, 19, 45, 0)})

    submit_and_download(harmony_client, hoss_bounds_request, hoss_bounds_file_name)
    assert exists(hoss_bounds_file_name), 'Unsuccessful HOSS bounds request.'

    expected_variables = ['/Grid/precipitationCal', '/Grid/lat', '/Grid/lat_bnds',
                          '/Grid/lon', '/Grid/lon_bnds', '/Grid/time', '/Grid/time_bnds']
    assert all_variables_present(hoss_bounds_file_name, expected_variables), 'Missing variables in HOSS output'

    plot_variable(hoss_bounds_file_name, '/Grid/precipitationCal', '/Grid/lon', '/Grid/lat',
                  title='HOSS bounds results.', colourbar_units='Calibrated precipitation (mm.hr-1)',
                  x_label='Longitude (degrees east)', y_label='Latitude (degrees north)',
                  levels=np.linspace(0, 60, 51))

    # Ensure that all variables returned are expected and are equal to those in a reference file:
    reference_data = xr.open_dataset(hoss_bounds_reference_file_name, group='/Grid')
    output_data = xr.open_dataset(hoss_bounds_file_name, group='/Grid')

    assert output_data.equals(reference_data), f'Output file does not match reference {hoss_bounds_reference_file_name}'

    reference_data.close()
    output_data.close()
    reference_data = None
    output_data = None
    
    print_success('HOSS bounds request.')
else:
    print(f'HOSS is not configured for environment: "{harmony_environment}" - skipping test.')

## MaskFill:

MaskFill is currently only activated for collections in the UAT environment. Requests will be made against granules in the SPL4CMDL collection, as this is the only currently active collection. The download of these granules may be slow, as they are over 100 MB in size.

In [None]:
maskfill_non_prod_information = {'collection': Collection(id='C1240150677-EEDTEST'),
                                 'shape_file_path': 'amazon_basin.geo.json',
                                 'granule_id': 'G1245558666-EEDTEST'}

maskfill_env = {Environment.LOCAL: maskfill_non_prod_information,
                Environment.SIT: maskfill_non_prod_information,
                Environment.UAT: maskfill_non_prod_information}

if harmony_environment in maskfill_env:
    maskfill_info = maskfill_env[harmony_environment]
else:
    maskfill_info = None

### MaskFill request:

This request uses a GeoJSON shape file of the Amazon River basin and the SPL4CMDL collection.

In [None]:
if maskfill_info is not None:
    amazon_file_name = 'maskfill_amazon.h5'
    amazon_request = Request(collection=maskfill_info['collection'],
                             granule_id=maskfill_info['granule_id'],
                             shape=maskfill_info['shape_file_path'])

    submit_and_download(harmony_client, amazon_request, amazon_file_name)
    assert exists(amazon_file_name), 'Unsuccessful MaskFill Amazon river basin request.'

    plot_variable(amazon_file_name, '/GPP/gpp_mean', '/x', '/y',
                  title='MaskFill synchronous results.',
                  colourbar_units=r'Gross Primary Productivity ($\mathrm{g.cm}^{-2}.\mathrm{day}^{-1}$)',
                  x_label='EASE-2 grid x coordinate (m)', y_label='EASE-2 grid y coordinate (m)',
                  levels=np.linspace(0, 30, 31))

    print_success('MaskFill synchronous request.')
else:
    print(f'MaskFill is not configured for environment: "{harmony_environment}" - skipping test.')

### MaskFill asynchronous request:

This test has been removed, as `harmony-py` requests are asynchronous by default.

## Segmented Trajectory Subsetter:

The Segmented Trajectory Subsetter is currently only activated for collections in the UAT environment. Requests will be made against granules in the GEDI L4A collection, as this is the only currently active collection. To minimize the size of the output, all requests will use a variable subset - the original granules are > 1 GB in size!

The specific granule used in the requests below was selected to have a trajectory that crosses the Amazon river basin GeoJSON shape used in the MaskFill regression tests above.

In [None]:
traj_sub_non_prod_information = {'collection': Collection(id='C1242267295-EEDTEST'),
                                 'granule_id': 'G1242274836-EEDTEST',
                                 'shape_file_path': 'amazon_basin.geo.json',
                                 'requested_variables': ['/BEAM0000/agbd'],
                                 'retrieved_variables': ['/BEAM0000/agbd', '/BEAM0000/delta_time',
                                                         '/BEAM0000/lat_lowestmode',
                                                         '/BEAM0000/lon_lowestmode',
                                                         '/BEAM0000/shot_number']}

trajectory_subsetter_env = {Environment.LOCAL: traj_sub_non_prod_information,
                            Environment.SIT: traj_sub_non_prod_information,
                            Environment.UAT: traj_sub_non_prod_information}

if harmony_environment in trajectory_subsetter_env:
    trajectory_subsetter_info = trajectory_subsetter_env[harmony_environment]
else:
    trajectory_subsetter_info = None

### Trajectory Subsetter variable subset request:

This is a request to retrieve a variable subset of a GEDI L4A granule. The request will ask for a single variable `/BEAM0000/agbd`, but will retrieve an additional four variables that are required to make the output viable for downstream processing. The five expected output variables are:

* `/BEAM0000/agbd` (above ground biomass density)
* `/BEAM0000/delta_time` (from the `coordinates` metadata attribute of `/BEAM0000/agbd`)
* `/BEAM0000/lat_lowestmode` (from the `coordinates` metadata attribute of `/BEAM0000/agbd`)
* `/BEAM0000/lon_lowestmode` (from the `coordinates` metadata attribute of `/BEAM0000/agbd`)
* `/BEAM0000/shot_number` (from the `ancillary_variables` metadata attribute of `/BEAM0000/agbd`, as configured by `sds-varinfo`)
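
The way the extra variables follow from metadata attributes can be sketched as below. This is a minimal illustration, assuming that references in `coordinates` and `ancillary_variables` are space-separated names relative to the same group as the requested variable. The `required_variables` function and the metadata dictionary are hypothetical, and this is not how `sds-varinfo` is implemented.

```python
def required_variables(requested, metadata):
    """Collect requested variables plus those referenced by their attributes."""
    required = set(requested)
    for variable in requested:
        attributes = metadata.get(variable, {})
        # Assumption: references are relative to the variable's own group.
        group = variable.rsplit('/', 1)[0]
        for attribute_name in ('coordinates', 'ancillary_variables'):
            for reference in attributes.get(attribute_name, '').split():
                required.add(f'{group}/{reference}')
    return sorted(required)


# Hypothetical metadata mirroring the attributes described above.
hypothetical_metadata = {
    '/BEAM0000/agbd': {
        'coordinates': 'delta_time lat_lowestmode lon_lowestmode',
        'ancillary_variables': 'shot_number',
    }
}

retrieved = required_variables(['/BEAM0000/agbd'], hypothetical_metadata)
```

With this toy metadata, one requested variable expands to the five variables listed above.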

In [None]:
if trajectory_subsetter_info is not None:
    ts_variable_file_name = 'trajectory_subsetter_variable.h5'
    ts_variable_request = Request(collection=trajectory_subsetter_info['collection'],
                                  granule_id=[trajectory_subsetter_info['granule_id']],
                                  variables=trajectory_subsetter_info['requested_variables'])

    submit_and_download(harmony_client, ts_variable_request, ts_variable_file_name)
    assert exists(ts_variable_file_name), 'Unsuccessful Trajectory Subsetter variable subset request.'

    assert all_variables_present(
        ts_variable_file_name, trajectory_subsetter_info['retrieved_variables']
    ), 'Missing variables in Trajectory Subsetter output'

    print_success('Trajectory Subsetter variable subset request.')
else:
    print(f'Trajectory Subsetter is not configured for environment: "{harmony_environment}" - skipping test.')

### Trajectory Subsetter temporal subset request:

This request will combine a variable subset with a temporal range - as defined via the `subset` request parameter. The requested data should fall between 1am and 2am on the 8th of July 2020.

In [None]:
if trajectory_subsetter_info is not None:
    ts_temporal_file_name = 'trajectory_subsetter_temporal.h5'
    ts_temporal_request = Request(collection=trajectory_subsetter_info['collection'],
                                  granule_id=[trajectory_subsetter_info['granule_id']],
                                  variables=trajectory_subsetter_info['requested_variables'],
                                  temporal={'start': datetime(2020, 7, 8, 1, 0, 0),
                                            'stop': datetime(2020, 7, 8, 2, 0, 0)})

    submit_and_download(harmony_client, ts_temporal_request, ts_temporal_file_name)
    assert exists(ts_temporal_file_name), 'Unsuccessful Trajectory Subsetter temporal subset request.'

    assert all_variables_present(
        ts_temporal_file_name, trajectory_subsetter_info['retrieved_variables']
    ), 'Missing variables in temporal subset Trajectory Subsetter output'

    print_success('Trajectory Subsetter temporal subset request.')
else:
    print(f'Trajectory Subsetter is not configured for environment: "{harmony_environment}" - skipping test.')

### Trajectory Subsetter bounding box spatial subset request:

This request combines the variable subset (for output size purposes) with a bounding box spatial subset. The bounding box has been selected to approximately encompass Brazil:

* -74 ≤ longitude (degrees east) ≤ -35
* -34 ≤ latitude (degrees north) ≤ 5
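
The check implied by these limits can be sketched as a test that every output point lies inside the box. The `points_in_bounding_box` function and the sample coordinates are hypothetical (this is not the notebook's own `variable_values_all_in_range` helper), and the sketch assumes a box that does not cross the antimeridian.

```python
def points_in_bounding_box(lons, lats, west, south, east, north):
    """True if every (lon, lat) pair lies inside the box, edges included."""
    return all(
        west <= lon <= east and south <= lat <= north
        for lon, lat in zip(lons, lats)
    )


# Hypothetical shot locations, not values from a GEDI granule.
inside_lons, inside_lats = [-60.2, -55.0, -47.9], [-3.1, -10.4, 2.6]

all_inside = points_in_bounding_box(inside_lons, inside_lats, -74, -34, -35, 5)
one_outside = points_in_bounding_box([-60.2, -30.0], [-3.1, -3.0], -74, -34, -35, 5)
```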

In [None]:
if trajectory_subsetter_info is not None:
    ts_bbox_file_name = 'trajectory_subsetter_bbox.h5'
    ts_bbox_bbox = BBox(w=-74, s=-34, e=-35, n=5)
    ts_bbox_request = Request(collection=trajectory_subsetter_info['collection'],
                              granule_id=[trajectory_subsetter_info['granule_id']],
                              variables=trajectory_subsetter_info['requested_variables'],
                              spatial=ts_bbox_bbox)

    submit_and_download(harmony_client, ts_bbox_request, ts_bbox_file_name)
    assert exists(ts_bbox_file_name), 'Unsuccessful Trajectory Subsetter bounding box subset request.'

    assert all_variables_present(
        ts_bbox_file_name, trajectory_subsetter_info['retrieved_variables']
    ), 'Missing variables in bounding box spatial subset Trajectory Subsetter output'
    assert variable_values_all_in_range(
        ts_bbox_file_name, '/BEAM0000/lon_lowestmode', ts_bbox_bbox.w, ts_bbox_bbox.e
    ), 'Longitude values not all in expected range, Trajectory Subsetter bounding box request'
    assert variable_values_all_in_range(
        ts_bbox_file_name, '/BEAM0000/lat_lowestmode', ts_bbox_bbox.s, ts_bbox_bbox.n
    ), 'Latitude values not all in expected range, Trajectory Subsetter bounding box request'

    print_success('Trajectory Subsetter bounding box spatial subset request.')
else:
    print(f'Trajectory Subsetter is not configured for environment: "{harmony_environment}" - skipping test.')

### Trajectory Subsetter polygon spatial subset request:

The request below combines a variable subset with the Amazon river basin polygon. The output should be constrained to the extent of this polygon.
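
Because the output is checked against a bounding box around the polygon, it can help to see how such a box might be derived from a GeoJSON shape. The sketch below assumes a `FeatureCollection` whose first feature is a single `Polygon`. The structure of `amazon_basin.geo.json` is not shown in this notebook, so the `polygon_bounding_box` function and the sample shape are hypothetical.

```python
import json


def polygon_bounding_box(geojson_text):
    """Return (west, south, east, north) for the first Polygon feature."""
    feature_collection = json.loads(geojson_text)
    geometry = feature_collection['features'][0]['geometry']
    # The first ring of a GeoJSON Polygon is its exterior boundary.
    exterior_ring = geometry['coordinates'][0]
    lons = [position[0] for position in exterior_ring]
    lats = [position[1] for position in exterior_ring]
    return min(lons), min(lats), max(lons), max(lats)


# Hypothetical triangular shape, not the Amazon basin polygon.
hypothetical_geojson = json.dumps({
    'type': 'FeatureCollection',
    'features': [{
        'type': 'Feature',
        'properties': {},
        'geometry': {
            'type': 'Polygon',
            'coordinates': [[[-70.0, -15.0], [-50.0, -10.0],
                             [-55.0, 5.0], [-70.0, -15.0]]],
        },
    }],
})

bounding_box = polygon_bounding_box(hypothetical_geojson)
```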

In [None]:
if trajectory_subsetter_info is not None:
    ts_polygon_file_name = 'trajectory_subsetter_polygon.h5'
    ts_polygon_request = Request(collection=trajectory_subsetter_info['collection'],
                                 granule_id=[trajectory_subsetter_info['granule_id']],
                                 variables=trajectory_subsetter_info['requested_variables'],
                                 shape=trajectory_subsetter_info['shape_file_path'])

    submit_and_download(harmony_client, ts_polygon_request, ts_polygon_file_name)
    assert exists(ts_polygon_file_name), 'Unsuccessful Trajectory Subsetter polygon spatial subset request.'

    assert all_variables_present(
        ts_polygon_file_name, trajectory_subsetter_info['retrieved_variables']
    ), 'Missing variables in polygon spatial subset Trajectory Subsetter output'

    # Define the smallest bounding box that encompasses the polygon area. All points should be
    # inside this bounding box:
    ts_polygon_bbox = BBox(w=-80, s=-19, e=-44, n=9)
    assert variable_values_all_in_range(
        ts_polygon_file_name, '/BEAM0000/lon_lowestmode', ts_polygon_bbox.w, ts_polygon_bbox.e
    ), 'Longitude values not all in expected range, Trajectory Subsetter polygon request'
    assert variable_values_all_in_range(
        ts_polygon_file_name, '/BEAM0000/lat_lowestmode', ts_polygon_bbox.s, ts_polygon_bbox.n
    ), 'Latitude values not all in expected range, Trajectory Subsetter polygon request'

    print_success('Trajectory Subsetter polygon spatial subset request.')
else:
    print(f'Trajectory Subsetter is not configured for environment: "{harmony_environment}" - skipping test.')

### Segmented Trajectory Subsetter additional tests:

Ideally, we should test that photon segment indices are correctly handled (e.g., they are all consecutive integers, even if a middle segment is excluded by a subset: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, ...]). Currently (2021-12-02), there are no Cloud-hosted collections with photon segment indices associated with the Segmented Trajectory Subsetter.
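
A check of this kind could look like the sketch below. It assumes, following the example above, that indices in a subsetted output should start at zero and increase by one each time the value changes. The `segment_indices_are_consecutive` function is hypothetical and is not part of the current test suite.

```python
def segment_indices_are_consecutive(indices):
    """True if the distinct runs of indices are 0, 1, 2, ... in order."""
    distinct = []
    for index in indices:
        # Record a value only when it differs from the previous run.
        if not distinct or index != distinct[-1]:
            distinct.append(index)
    return distinct == list(range(len(distinct)))


consecutive = segment_indices_are_consecutive([0, 0, 0, 0, 1, 1, 1, 1, 2, 2])
gap_detected = not segment_indices_are_consecutive([0, 0, 2, 2, 3])
```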

# Clean up test outputs:

In [None]:
directory_files = listdir()

for directory_file in directory_files:
    if directory_file.endswith(('.nc4', '.h5', '.sha256')):
        remove(directory_file)