# net2cog regression tests

This Jupyter notebook runs a suite of regression tests against the net2cog Harmony Service.

These tests use [SMAP_RSS_L3_SSS_SMI_8DAY-RUNNINGMEAN_V4](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1234410736-POCLOUD) as NetCDF input data to test the net2cog service for the `sss_smap` variable.

## Set the Harmony environment:

The cell below sets the `harmony_host_url` to one of the following valid values:

* Production: <https://harmony.earthdata.nasa.gov>
* UAT: <https://harmony.uat.earthdata.nasa.gov>
* SIT: <https://harmony.sit.earthdata.nasa.gov>
* Local: <http://localhost:3000>

The default value is for the UAT environment. There are two ways to run this notebook against a non-default environment:

* Run this notebook in a local Jupyter notebook server and change the value of `harmony_host_url` in the cell below to the value for the environment you require from the above list.

* Use the `run_notebooks.sh` script, which requires you to declare an environment variable `HARMONY_HOST_URL`. Set that environment variable to the value above that corresponds to the environment you want to test. That environment variable will take precedence over the default value in the cell below.
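
As a rough illustration of that precedence, the sketch below reads `HARMONY_HOST_URL` from the environment and falls back to the UAT default. It is only an assumption about the behaviour, written in plain Python: the helper name `resolve_harmony_host_url` is hypothetical, and `run_notebooks.sh` may hand the value to the notebook by a different mechanism.

```python
import os

# Default taken from the notebook: the UAT Harmony instance.
DEFAULT_HARMONY_HOST_URL = 'https://harmony.uat.earthdata.nasa.gov'


def resolve_harmony_host_url(environ=os.environ):
    """Return HARMONY_HOST_URL when it is set and non-empty, else the default.

    Hypothetical helper: it only illustrates "environment variable wins".
    """
    return environ.get('HARMONY_HOST_URL') or DEFAULT_HARMONY_HOST_URL


harmony_host_url = resolve_harmony_host_url()
```

Passing a plain dictionary as `environ` makes the fallback easy to check without touching the real process environment.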

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'

## Prerequisites

The dependencies for this notebook are listed in [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill-net2cog`

A `.netrc` file must also be located in the `test` directory of this repository.
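
A quick pre-flight check for that file could look like the sketch below, which uses Python's standard `netrc` module. The helper name `has_netrc_entry` is hypothetical, and the host to look for is an assumption: for UAT the Earthdata Login host is taken to be `uat.urs.earthdata.nasa.gov`.

```python
from netrc import netrc
from pathlib import Path


def has_netrc_entry(netrc_path, host):
    """Return True if the .netrc file exists and holds credentials for host.

    Hypothetical helper, not part of the regression test suite.
    """
    path = Path(netrc_path)
    if not path.is_file():
        return False
    # authenticators() returns None when neither a matching `machine`
    # entry nor a `default` entry is present.
    return netrc(str(path)).authenticators(host) is not None
```

For example, `has_netrc_entry('.netrc', 'uat.urs.earthdata.nasa.gov')`, run from the `test` directory, would report whether credentials for the assumed UAT login host are present.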

### Import required packages:

In [None]:
from pathlib import Path
from tempfile import TemporaryDirectory

from harmony import BBox, Collection, Environment, Client, Request
from harmony.harmony import ProcessingFailedException
from numpy.testing import assert_array_almost_equal
import rasterio
from rasterio.transform import Affine
from rasterio.crs import CRS

import utility


reference_dir = Path('./reference_data')

### Set up environment dependent variables:

This includes the Harmony `Client` object and `Collection` objects for each of the collections for which there are regression tests. The local, SIT and UAT Harmony instances all utilise resources from CMR UAT, meaning any non-production environment will use the same resources.

When adding a production entry to the dictionary below, the collection instances can be included directly in the production dictionary entry, as they do not need to be shared.
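
The layout described above might be sketched as follows. To keep the example runnable without harmony-py, plain strings stand in for the `Collection` and `Environment` objects, and the production IDs are placeholders, not real concept IDs. The helper `get_environment_information` is hypothetical.

```python
# Plain strings stand in for harmony-py `Collection` and `Environment`
# objects so that this sketch runs without the package installed.
non_production_collection = {'smap_collection': 'C1234410736-POCLOUD'}
non_prod_granule_data = {'smap_granules': ['G1234601650-POCLOUD']}

collection_data = {
    # Non-production entries share the same CMR UAT resources via unpacking.
    'https://harmony.uat.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': 'UAT',
    },
    # Hypothetical production entry: values are written inline because
    # nothing else shares them. The IDs below are placeholders only.
    'https://harmony.earthdata.nasa.gov': {
        'smap_collection': 'C0000000000-PLACEHOLDER',
        'smap_granules': ['G0000000000-PLACEHOLDER'],
        'env': 'PROD',
    },
}


def get_environment_information(harmony_host_url):
    """Return the entry for a host, or None if the host is not configured."""
    return collection_data.get(harmony_host_url)
```

Because the lookup uses `dict.get`, an unconfigured host yields `None`, which is what lets the test cells skip environments that have no entry.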

In [None]:
non_production_collection = {
    'smap_collection': Collection(id='C1234410736-POCLOUD'),
}

non_prod_granule_data = {
    'smap_granules': ['G1234601650-POCLOUD'],
}

collection_data = {
    'https://harmony.uat.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.UAT,
    },
    'https://harmony.sit.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.SIT,
    },
    'http://localhost:3000': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.LOCAL,
    },
}

environment_information = collection_data.get(harmony_host_url)

if environment_information is not None:
    harmony_client = Client(env=environment_information['env'])
    endpoint_url = environment_information.get('endpoint_url', None)

## Test conversion of sss_smap variable

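Request the `sss_smap` variable from a single SMAP granule as a GeoTIFF (`image/tiff`), then validate the downloaded output as a COG and compare it against the expected metadata and the reference file.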

In [None]:
if environment_information is not None:

    smap_request = Request(
        collection=environment_information['smap_collection'],
        granule_id=environment_information['smap_granules'][0],
        variables=['sss_smap'],
        max_results=1,
        format='image/tiff',
    )
    print(harmony_client.request_as_curl(smap_request))

    smap_job_id = harmony_client.submit(smap_request)
    harmony_client.wait_for_processing(smap_job_id, show_progress=True)

In [None]:
# Only run when the request cell above was executed for this environment;
# otherwise `harmony_client` and `smap_job_id` are undefined.
if environment_information is not None:
    # Download into a temporary directory so outputs are cleaned up afterwards.
    with TemporaryDirectory() as temp_dir:
        downloaded_cog_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                smap_job_id, overwrite=True, directory=temp_dir
            )
        ]

        for cog_file in downloaded_cog_outputs:
            utility.validate_cog(cog_file)

            # Raster profile the converted output is expected to have.
            expected_metadata = {
                'driver': 'GTiff',
                'dtype': 'float32',
                'nodata': -9999.0,
                'width': 1440,
                'height': 720,
                'count': 1,
                'crs': CRS.from_epsg(4326),
                'transform': Affine(0.25, 0.0, 0.0, 0.0, 0.25, -90.0),
            }
            reference_file = (
                reference_dir
                / 'RSS_smap_SSS_L3_8day_running_2020_005_FNL_v04.0_converted_sss_smap.tiff'
            )

            utility.assert_dataset_produced_correct_results(
                cog_file, expected_metadata, reference_file
            )

## Test that conversion of all variables fails

net2cog only supports conversion of a single variable within a NetCDF file. This test requests `all` variables and checks that the job fails with an appropriate error message.
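
The shape of that check, run an action, require it to fail, and inspect the message, can be sketched without Harmony. Everything here is illustrative: `StandInProcessingError` is a hypothetical stand-in for harmony-py's `ProcessingFailedException`, and `assert_fails_with_message` is a hypothetical helper.

```python
class StandInProcessingError(Exception):
    """Hypothetical stand-in for harmony-py's ProcessingFailedException."""


def assert_fails_with_message(action, expected_fragment, error_type=Exception):
    """Call action() and require it to raise error_type containing the fragment.

    Returns the error message so the caller can print or log it.
    """
    try:
        action()
    except error_type as error:
        assert expected_fragment in str(error), f'Unexpected error message: {error}'
        return str(error)
    # Reached only if action() returned normally.
    raise AssertionError('Expected request to raise an exception but it did not.')


def failing_job():
    """Simulate a job that is rejected for requesting several variables."""
    raise StandInProcessingError('only supports processing one variable at a time')


message = assert_fails_with_message(
    failing_job, 'one variable at a time', StandInProcessingError
)
```

Raising `AssertionError` after the `try` block plays the same role as the `raised_expected_error` flag: a job that unexpectedly succeeds still fails the test.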

In [None]:
if environment_information is not None:

    smap_request = Request(
        collection=environment_information['smap_collection'],
        granule_id=environment_information['smap_granules'][0],
        variables=['all'],
        max_results=1,
        format='image/tiff',
    )
    print(harmony_client.request_as_curl(smap_request))

    smap_job_id = harmony_client.submit(smap_request)
    raised_expected_error = False
    try:
        harmony_client.wait_for_processing(smap_job_id, show_progress=True)
    except ProcessingFailedException as error:
        assert (
            "net2cog harmony adapter currently only supports processing one variable at a time"
            in str(error)
        )
        raised_expected_error = True
        print(error)

    assert (
        raised_expected_error
    ), 'Expected request to raise an exception but it did not.'
    utility.print_success('All variables raised expected error')