# Regression test suite for the Harmony Subsetter and Multi-dimensional Concatenator service chain

<!-- This notebook provides condensed examples of using Harmony to make requests against the Variable Subsetter services developed and managed by the Data Services team on the Transformation Train. This service makes use of CF-Conventions to retrieve all requested variables from OPeNDAP, along with all other variables required to make the output product usable in downstream processing (e.g., coordinate and dimension variables). This service can be used with any OPeNDAP-enabled collection that adheres to the Climate and Forecast metadata conventions. -->

The data retrieved from the service chain will be in NetCDF-4 format.

## Prerequisites

<!-- The dependencies for this notebook are listed in the environment.yaml. To test or install locally, create the papermill environment used in the automated regression testing suite:

conda env create -f ./environment.yaml && conda activate papermill-variable-subsetter -->

A .netrc file must also be located in the test directory of this repository.
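
As a quick pre-flight check, the sketch below uses Python's standard `netrc` module to confirm that a `.netrc` file holds credentials for a given host. The helper name `has_netrc_entry` is hypothetical, and the host name in the usage line is an assumption (the Earthdata Login host commonly paired with the UAT environment); adjust both to the environment under test.

```python
from netrc import NetrcParseError, netrc
from pathlib import Path


def has_netrc_entry(netrc_path, host):
    """Return True if the .netrc file at netrc_path has credentials for host."""
    path = Path(netrc_path)
    if not path.is_file():
        # No file at all: the notebook would fail at authentication time.
        return False
    try:
        credentials = netrc(str(path)).authenticators(host)
    except NetrcParseError:
        # A malformed file is treated the same as a missing entry.
        return False
    return credentials is not None


# Assumed host name, shown for illustration only.
print(has_netrc_entry('.netrc', 'uat.urs.earthdata.nasa.gov'))
```

Running this before the request cells can separate a credentials problem from a genuine service failure.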

## Import requirements:

In [2]:
from os.path import exists

from harmony import Client, Collection, Environment, Request
import numpy as np

from utilities import (compare_results_to_reference_file, print_success,
                       remove_results_files, submit_and_download)

In [3]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'

## Identify Harmony environment (for easier reference):

In [20]:
host_environment = {'http://localhost:3000': Environment.LOCAL,
                    'https://harmony.sit.earthdata.nasa.gov': Environment.SIT,
                    'https://harmony.uat.earthdata.nasa.gov': Environment.UAT,
                    'https://harmony.earthdata.nasa.gov': Environment.PROD}

harmony_environment = host_environment.get(harmony_host_url)

if harmony_environment is not None:
    harmony_client = Client(env=harmony_environment)

## Concatenation Service Chain

In [27]:
concatenator_chain_non_prod_information = {'collection': 'C1254854453-LARC_CLOUD'}

concatenator_chain_env = {Environment.UAT: concatenator_chain_non_prod_information}

if harmony_environment in concatenator_chain_env:
    concatenator_chain_info = concatenator_chain_env[harmony_environment]
else:
    concatenator_chain_info = None

## Concatenation-alone request:

In [35]:
if concatenator_chain_info is not None:
    concatenator_chain_output_file_name = 'Concatenation_Result.nc4'
    concatenator_chain_request = Request(
        collection=Collection(id=concatenator_chain_info['collection']),
        concatenate=True,
        extend="mirror_step",
        max_results=12
    )

    assert concatenator_chain_request.is_valid()

    submit_and_download(harmony_client, concatenator_chain_request, concatenator_chain_output_file_name)
    assert exists(concatenator_chain_output_file_name), 'Unsuccessful Subsetter-Concatenation request.'

    # compare_results_to_reference_file(
    #     single_var_file_name,
    #     'reference_files/var_subsetter_single_var_reference.nc4'
    # )

    print_success('Subsetter-Concatenation request.')
else:
    print(f'The Subsetter-Concatenation service chain is not configured for environment: "{harmony_environment}" - skipping test.')

Downloaded: C1254854453-LARC_CLOUD_merged.nc4
Saved output to: Concatenation_Result.nc4
Success: Subsetter-Concatenation request.


In [30]:
from netCDF4 import Dataset, Group, Variable

In [31]:
with Dataset(concatenator_chain_output_file_name) as results_ds:  #, Dataset(ref_file) as ref_ds:
    # compare_group_to_reference(results_ds, ref_ds)
    # Printing the dataset itself shows its global attributes;
    # `results_ds.ncattrs` without parentheses only prints the bound method.
    print(results_ds)

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    tio_commit: abba4bbcf910f6b8213ce2dfcabad202a0152ea9
    product_type: NO2
    processing_level: 2
    processing_version: 1
    time_reference: 1980-01-06T00:00:00Z
    apriori_source: GEOSCF:forecast
    geospatial_bounds_crs: EPSG:4326
    version_id: 1
    project: TEMPO
    platform: Intelsat 40e
    source: UV-VIS hyperspectral imaging
    institution: Smithsonian Astrophysical Observatory
    creator_url: http://tempo.si.edu
    Conventions: CF-1.6, ACDD-1.3
    title: TEMPO Level 2 nitrogen dioxide product
    collection_shortname: TEMPO_NO2_L2
    collection_version: 1
    keywords: EARTH SCIENCE>ATMOSPHERE>AIR QUALITY>NITROGEN OXIDES, EARTH SCIENCE>ATMOSPHERE>ATMOSPHERIC CHEMISTRY>NITROGEN COMPOUNDS>NITROGEN DIOXIDE
    summary: Nitrogen dioxide Level 2 files provide trace gas information at TEMPO’s native spatial resolution, ~10 km^2 at the center of the
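
The global attributes listed above could also be checked programmatically rather than by eye. The following is a minimal sketch, not part of the existing `utilities` module: the helper names `global_attributes` and `missing_attributes` are hypothetical, and it assumes the attributes being compared are strings or scalars. It relies only on the `ncattrs()` and `getncattr()` methods of an open netCDF4 `Dataset`.

```python
def global_attributes(dataset):
    """Collect the global attributes of an open dataset into a plain dict."""
    return {name: dataset.getncattr(name) for name in dataset.ncattrs()}


def missing_attributes(dataset, expected):
    """Return the sorted names in `expected` that are absent or hold a different value."""
    actual = global_attributes(dataset)
    return sorted(name for name, value in expected.items()
                  if actual.get(name) != value)
```

With the output file open as `results_ds`, a check such as `assert not missing_attributes(results_ds, {'project': 'TEMPO', 'collection_shortname': 'TEMPO_NO2_L2'})` would confirm that the concatenated file kept those two attributes.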

## Subsetting -> Concatenation request:

In [36]:
if concatenator_chain_info is not None:
    concatenator_chain_output_file_name = 'Subset-Concatenation_Result.nc4'
    concatenator_chain_request = Request(
        collection=Collection(id=concatenator_chain_info['collection']),
        variables=["/product/vertical_column_total", "/product/vertical_column_troposphere"],
        concatenate=True,
        extend="mirror_step",
        max_results=12
    )

    assert concatenator_chain_request.is_valid()

    submit_and_download(harmony_client, concatenator_chain_request, concatenator_chain_output_file_name)
    assert exists(concatenator_chain_output_file_name), 'Unsuccessful Subsetter-Concatenation request.'

    # compare_results_to_reference_file(
    #     single_var_file_name,
    #     'reference_files/var_subsetter_single_var_reference.nc4'
    # )

    print_success('Subsetter-Concatenation request.')
else:
    print(f'The Subsetter-Concatenation service chain is not configured for environment: "{harmony_environment}" - skipping test.')

Downloaded: C1254854453-LARC_CLOUD_merged.nc4
Saved output to: Subset-Concatenation_Result.nc4
Success: Subsetter-Concatenation request.
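
Since the reference-file comparison is commented out, one lightweight alternative is to confirm that the requested variables are present in the output. The sketch below is an assumption about how such a check could look, not the test suite's own method: `has_variable` is a hypothetical helper that walks the `groups` and `variables` mappings of an open netCDF4 `Dataset` along a full variable path.

```python
def has_variable(dataset, variable_path):
    """Return True if variable_path (e.g. '/group/name') exists in dataset."""
    *group_names, variable_name = variable_path.strip('/').split('/')
    group = dataset
    for group_name in group_names:
        # Descend one group at a time; stop as soon as a level is missing.
        if group_name not in group.groups:
            return False
        group = group.groups[group_name]
    return variable_name in group.variables
```

For the request above, this could be used as `assert has_variable(results_ds, '/product/vertical_column_total')` inside a `with Dataset(...) as results_ds:` block.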


In [39]:
with Dataset(concatenator_chain_output_file_name) as results_ds:  #, Dataset(ref_file) as ref_ds:
    # compare_group_to_reference(results_ds, ref_ds)
    # print(results_ds.ncattrs)
    print(results_ds.dimensions)

{'subset_index': <class 'netCDF4._netCDF4.Dimension'>: name = 'subset_index', size = 2, 'mirror_step': <class 'netCDF4._netCDF4.Dimension'>: name = 'mirror_step', size = 787, 'xtrack': <class 'netCDF4._netCDF4.Dimension'>: name = 'xtrack', size = 2048}
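
The dimensions printed above could be turned into an assertion so that the notebook fails when the output shape changes. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the expected sizes in the usage note are simply copied from the output above, so they may need updating if the matched granules change.

```python
def dimension_sizes(dataset):
    """Map each dimension name of an open dataset to its length."""
    return {name: len(dimension)
            for name, dimension in dataset.dimensions.items()}


def mismatched_dimensions(dataset, expected_sizes):
    """Return {name: (expected, actual)} for every dimension that differs.

    A dimension missing from the dataset is reported with an actual size of None.
    """
    actual_sizes = dimension_sizes(dataset)
    return {name: (expected, actual_sizes.get(name))
            for name, expected in expected_sizes.items()
            if actual_sizes.get(name) != expected}
```

For example, `assert not mismatched_dimensions(results_ds, {'subset_index': 2, 'mirror_step': 787, 'xtrack': 2048})` would pass for the file shown above and report the differing dimensions otherwise.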
