# Regression test suite for the Variable Subsetter backend Harmony service:

This notebook provides condensed examples of using Harmony to make requests against the Variable Subsetter service developed and managed by the Data Services team on the Transformation Train. The service uses the CF Conventions to retrieve all requested variables from OPeNDAP, along with any other variables required to make the output product usable in downstream processing (e.g., coordinate and dimension variables). It can be used with any OPeNDAP-enabled collection that adheres to the Climate and Forecast (CF) metadata conventions.

The data retrieved from OPeNDAP will be in a NetCDF-4 format.

Several configuration tips came from [this blog post](https://towardsdatascience.com/introduction-to-papermill-2c61f66bea30).

## Prerequisites

The dependencies for this notebook are listed in the [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill-variable-subsetter`

A `.netrc` file must also be located in the `test` directory of this repository.
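
Before running the notebook, it can help to confirm that the `.netrc` file actually holds Earthdata Login credentials. The sketch below is one way to do that with the standard library only. The helper name `has_edl_credentials` and the `EDL_HOSTS` mapping are illustrative and not part of this repository, and the path passed in is an assumption about where the file sits relative to the working directory.

```python
from netrc import netrc
from pathlib import Path

# Earthdata Login hosts. UAT matches the default harmony_host_url in this notebook.
EDL_HOSTS = {
    'uat': 'uat.urs.earthdata.nasa.gov',
    'prod': 'urs.earthdata.nasa.gov',
}


def has_edl_credentials(netrc_path, edl_host):
    """Return True if the .netrc file has a login and password for edl_host."""
    path = Path(netrc_path)
    if not path.is_file():
        return False

    # authenticators() returns (login, account, password), or None when there
    # is neither a matching `machine` entry nor a `default` entry.
    entry = netrc(str(path)).authenticators(edl_host)
    return entry is not None and bool(entry[0]) and bool(entry[2])
```

For example, `has_edl_credentials('.netrc', EDL_HOSTS['uat'])` could be called from the `test` directory before a UAT run.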

## Import requirements:

In [None]:
from os.path import exists

from harmony import Client, Collection, Environment, Request
import numpy as np

from utilities import (
    compare_results_to_reference_file,
    print_success,
    remove_results_files,
    submit_and_download,
)

## Set default parameters:

`papermill` requires default values for the parameters used in the workflow; in this case, `harmony_host_url`.
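
As a minimal sketch of how the default set in the next cell could be overridden at execution time, the snippet below builds a `papermill` command line that uses the `-p` option to pass a different `harmony_host_url`. The notebook file names and the helper names (`build_papermill_command`, `run_notebook`) are hypothetical placeholders, and the automated suite may invoke `papermill` differently.

```python
import subprocess


def build_papermill_command(input_notebook, output_notebook, harmony_host_url):
    """Build a papermill CLI call that overrides the harmony_host_url parameter."""
    return [
        'papermill',
        input_notebook,
        output_notebook,
        '-p', 'harmony_host_url', harmony_host_url,
    ]


def run_notebook(input_notebook, output_notebook, harmony_host_url):
    """Execute the notebook; requires papermill in the active environment."""
    command = build_papermill_command(
        input_notebook, output_notebook, harmony_host_url
    )
    subprocess.run(command, check=True)


# Hypothetical file names, for illustration only.
command = build_papermill_command(
    'VariableSubsetter_Regression.ipynb',
    'Results.ipynb',
    'https://harmony.sit.earthdata.nasa.gov',
)
print(' '.join(command))
```

Building the command separately from running it keeps the override easy to inspect before anything is submitted to Harmony.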

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'

### Identify Harmony environment (for easier reference):

In [None]:
host_environment = {
    'http://localhost:3000': Environment.LOCAL,
    'https://harmony.sit.earthdata.nasa.gov': Environment.SIT,
    'https://harmony.uat.earthdata.nasa.gov': Environment.UAT,
    'https://harmony.earthdata.nasa.gov': Environment.PROD,
}

harmony_environment = host_environment.get(harmony_host_url)

if harmony_environment is not None:
    harmony_client = Client(env=harmony_environment)

## Variable Subsetter

The variable subsetter is currently only configured for collections in UAT.

**2023-05-11 - The tests in this notebook have been disabled as there are currently no collections associated with the Variable Subsetter. ATL03 and ATL08 have recently been associated with the Trajectory Subsetter instead.**

To re-enable these tests:

* Associate a suitable UAT collection with the [UMM-S record for the variable-subsetter](https://mmt.uat.earthdata.nasa.gov/services/S1237976118-EEDTEST). This collection will likely be L2, and must be enabled for OPeNDAP (including related URLs in granule UMM-G records).
* Revert the change in the next cell to `var_subsetter_env`.
* Add the collection concept ID and granule concept ID for testing to the `var_subsetter_non_prod_information` cell.
* Enter a variable name in the single-variable `harmony.Request` instance.
* Create reference files for both requests and save them in the `reference_files` subdirectory (with the expected file names).
* Run the notebook locally before committing any changes.

In [None]:
var_subsetter_non_prod_information = {
    'collection': Collection(id='<enter collection concept ID>'),
    'granule_id': '<enter granule concept ID>',
}

var_subsetter_env = {}
# var_subsetter_env = {Environment.LOCAL: var_subsetter_non_prod_information,
#                      Environment.SIT: var_subsetter_non_prod_information,
#                      Environment.UAT: var_subsetter_non_prod_information}

# None means the Variable Subsetter is not configured for this environment.
var_subsetter_info = var_subsetter_env.get(harmony_environment)

### Variable Subsetter, single-variable request:

**2023-05-11 - When this test is active, a sample reference file will need to be created and added as `reference_files/var_subsetter_single_var_reference.nc4`.**

This request should retrieve the requested variable and any variables it requires (e.g., coordinate or dimension variables). When this test is reactivated for a new collection, the expected output variables should be listed here.
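
To illustrate what "required variables" can mean under the CF Conventions, the sketch below walks a hypothetical in-memory description of a granule and collects everything a requested variable refers to. It is not the service's implementation: the function name, the dictionary layout, and the sample variable names are all assumptions made for this example. Only the attribute names (`coordinates`, `bounds`, `grid_mapping`, `ancillary_variables`) come from the CF Conventions.

```python
def find_required_variables(requested, variables):
    """Return the requested variables plus those they reference, transitively.

    `variables` maps a variable name to a dict with:
      - 'dimensions': tuple of dimension names
      - 'attributes': dict of metadata attributes
    """
    # CF attributes whose values are space-separated variable names
    # (the simple form of grid_mapping is assumed here).
    reference_attributes = (
        'coordinates', 'bounds', 'grid_mapping', 'ancillary_variables'
    )
    required = set()
    to_visit = list(requested)

    while to_visit:
        name = to_visit.pop()
        if name in required or name not in variables:
            continue
        required.add(name)
        info = variables[name]
        # A dimension with a same-named variable is a dimension variable.
        to_visit.extend(info.get('dimensions', ()))
        for attribute in reference_attributes:
            to_visit.extend(info.get('attributes', {}).get(attribute, '').split())

    return required


# Hypothetical granule contents, for illustration only.
granule = {
    'sea_surface_temperature': {
        'dimensions': ('time', 'lat', 'lon'),
        'attributes': {'coordinates': 'lat lon', 'grid_mapping': 'crs'},
    },
    'wind_speed': {'dimensions': ('time', 'lat', 'lon'), 'attributes': {}},
    'time': {'dimensions': ('time',), 'attributes': {}},
    'lat': {'dimensions': ('lat',), 'attributes': {'bounds': 'lat_bnds'}},
    'lon': {'dimensions': ('lon',), 'attributes': {}},
    'lat_bnds': {'dimensions': ('lat', 'nv'), 'attributes': {}},
    'crs': {'dimensions': (), 'attributes': {}},
}

print(sorted(find_required_variables(['sea_surface_temperature'], granule)))
# ['crs', 'lat', 'lat_bnds', 'lon', 'sea_surface_temperature', 'time']
```

Note that `wind_speed` is left out because nothing in the request refers to it, while `lat_bnds` is pulled in indirectly through the `bounds` attribute of `lat`.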

In [None]:
if var_subsetter_info is not None:
    single_var_file_name = 'var_subsetter.nc4'
    single_var_request = Request(
        collection=var_subsetter_info['collection'],
        granule_id=[var_subsetter_info['granule_id']],
        variables=['<enter variable full name>'],
    )

    submit_and_download(harmony_client, single_var_request, single_var_file_name)
    assert exists(
        single_var_file_name
    ), 'Unsuccessful single-variable Variable Subsetter request.'

    compare_results_to_reference_file(
        single_var_file_name, 'reference_files/var_subsetter_single_var_reference.nc4'
    )

    print_success('Variable Subsetter single-variable request.')
else:
    print(
        f'The Variable Subsetter is not configured for environment: "{harmony_environment}" - skipping test.'
    )

### Variable Subsetter, all variables

Make a request for "all" variables. This should retrieve the entire file, with all the variables from the original source granule.

**2023-05-11 - When this test is active, a sample reference file will need to be created and added as `reference_files/var_subsetter_all_vars_reference.nc4`.**

In [None]:
if var_subsetter_info is not None:
    all_variables_file_name = 'var_subsetter_all_vars.nc4'
    all_variables_request = Request(
        collection=var_subsetter_info['collection'],
        granule_id=[var_subsetter_info['granule_id']],
    )

    submit_and_download(harmony_client, all_variables_request, all_variables_file_name)
    assert exists(
        all_variables_file_name
    ), 'Unsuccessful Variable Subsetter all-variable request.'

    compare_results_to_reference_file(
        all_variables_file_name, 'reference_files/var_subsetter_all_vars_reference.nc4'
    )

    print_success('Variable Subsetter all variable request.')
else:
    print(
        f'The Variable Subsetter is not configured for environment: "{harmony_environment}" - skipping test.'
    )

## Remove results files:

In [None]:
remove_results_files()