# Harmony Browse Image Service (HyBIG) regression tests

This Jupyter notebook runs a suite of regression tests that submit requests to the Harmony Browse Image Service (HyBIG) and check the results.

These tests use ASTER Global Digital Elevation Model (GDEM) Version 3 ([ASTGTM](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1256584478-EEDTEST)) as GeoTIFF input data to test the HyBIG service for an image with no color information.

These tests use MEaSUREs Vegetation Continuous Fields (VCF) Yearly Global 0.05 Deg V001 ([VCF5KYR](https://cmr.uat.earthdata.nasa.gov/search/concepts/C1258119317-EEDTEST)) as GeoTIFF input data to test the HyBIG service against RGB color banded input.

Tests that compare generated output against reference files check both the array values of the generated PNGs or JPEGs and the metadata that `rasterio` retrieves for each file. Some of this metadata is retrieved from the PNG or JPEG itself, such as array dimensions, whilst information such as the CRS and geotransform is retrieved from the accompanying `aux.xml` file that HyBIG also creates for each browse image.
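
The metadata half of that comparison can be pictured as a key-by-key check of two dictionaries. The sketch below is illustrative only: `metadata_differences` is a hypothetical helper, not the `assert_dataset_produced_correct_results` function imported from `utility`, and it works on plain dictionaries with made-up values rather than on files opened with `rasterio`.

```python
def metadata_differences(test_meta: dict, reference_meta: dict) -> dict:
    """Return {key: (test_value, reference_value)} for every key that differs."""
    missing = object()  # sentinel for keys present on only one side
    differences = {}
    for key in sorted(set(test_meta) | set(reference_meta)):
        test_value = test_meta.get(key, missing)
        reference_value = reference_meta.get(key, missing)
        if test_value != reference_value:
            differences[key] = (
                None if test_value is missing else test_value,
                None if reference_value is missing else reference_value,
            )
    return differences


# Hypothetical metadata for a generated image and its reference.
reference = {'width': 3641, 'height': 3641, 'count': 1, 'dtype': 'uint8'}
generated = {'width': 3641, 'height': 3640, 'count': 1, 'dtype': 'uint8'}

print(metadata_differences(generated, reference))
```

An empty dictionary means the metadata matched; any entry names the field that drifted and shows both values.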

## Set the Harmony environment:

The cell below sets the `harmony_host_url` to one of the following valid values:

* Production: <https://harmony.earthdata.nasa.gov>
* UAT: <https://harmony.uat.earthdata.nasa.gov>
* SIT: <https://harmony.sit.earthdata.nasa.gov>
* Local: <http://localhost:3000>

The default value is for the UAT environment. There are two ways to run this notebook against a non-default environment:

* Run this notebook in a local Jupyter notebook server and change the value of `harmony_host_url` in the cell below to the value for the environment you require from the above list.

* Use the `run_notebooks.sh` script, which requires you to declare an environment variable `HARMONY_HOST_URL`. Set that environment variable to the value above that corresponds to the environment you want to test. That environment variable will take precedence over the default value in the cell below.
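
The precedence rule described above can be sketched as follows. This is an assumption-laden illustration, not the mechanism `run_notebooks.sh` actually uses to pass the value into the notebook (that is not shown here): `resolve_harmony_host`, `VALID_HOSTS` and `DEFAULT_HOST` are hypothetical names.

```python
import os

# The four hosts listed above.
VALID_HOSTS = (
    'https://harmony.earthdata.nasa.gov',
    'https://harmony.uat.earthdata.nasa.gov',
    'https://harmony.sit.earthdata.nasa.gov',
    'http://localhost:3000',
)
DEFAULT_HOST = 'https://harmony.uat.earthdata.nasa.gov'


def resolve_harmony_host(environ=None) -> str:
    """Prefer HARMONY_HOST_URL when it is set, otherwise use the UAT default."""
    environ = os.environ if environ is None else environ
    host = environ.get('HARMONY_HOST_URL', DEFAULT_HOST)
    if host not in VALID_HOSTS:
        raise ValueError(f'Unrecognised Harmony host: {host}')
    return host


# Pass explicit mappings so the example does not depend on the real environment.
print(resolve_harmony_host({}))
print(resolve_harmony_host({'HARMONY_HOST_URL': 'http://localhost:3000'}))
```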

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'

## Prerequisites

The dependencies for this notebook are listed in the [environment.yaml](./environment.yaml). To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill-hybig`

A `.netrc` file must also be located in the `test` directory of this repository.

### Import required packages:

In [None]:
from pathlib import Path
from tempfile import TemporaryDirectory

from harmony import Collection, Environment, Client, Request
from rasterio.transform import Affine
from rasterio.crs import CRS

from utility import (
    print_success,
    print_error,
    assert_dataset_produced_correct_results,
    build_file_list,
)


reference_dir = Path('./reference_data')

### Set up environment dependent variables:

This includes the Harmony `Client` object and `Collection` objects for each of the collections for which there are regression tests. The local, SIT and UAT Harmony instances all utilise resources from CMR UAT, meaning any non-production environment will use the same resources.

When adding a production entry to the dictionary below, the collection instances can be included directly in the production dictionary entry, as they do not need to be shared.

In [None]:
non_production_collection = {
    'aster_collection': Collection(id='C1256584478-EEDTEST'),
    'measures_collection': Collection(id='C1258119317-EEDTEST'),
    'prefire_collection': Collection(id='C1263096190-EEDTEST'),
}

non_prod_granule_data = {
    'aster_granules': ['G1256584570-EEDTEST'],
    'measures_granules': ['G1258119387-EEDTEST'],
    'prefire_granules': ['G1263096192-EEDTEST'],
}

non_prod_variable_data = {
    'prefire_variable': 'Flx/olr',
}

collection_data = {
    'https://harmony.uat.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        **non_prod_variable_data,
        'env': Environment.UAT,
    },
    'https://harmony.sit.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        **non_prod_variable_data,
        'env': Environment.SIT,
    },
    'http://localhost:3000': {
        **non_production_collection,
        **non_prod_granule_data,
        **non_prod_variable_data,
        'env': Environment.LOCAL,
    },
}

environment_information = collection_data.get(harmony_host_url)

if environment_information is not None:
    harmony_client = Client(env=environment_information['env'])
    endpoint_url = environment_information.get('endpoint_url', None)

## Test input GeoTIFF with no color information

Use ASTER data.

In [None]:
common_aster_metadata = {
    'dtype': 'uint8',
    'nodata': None,
    'width': 3641,
    'height': 3641,
    'count': 1,
    'crs': CRS.from_epsg(4326),
    'transform': Affine(
        0.00027464982147761604, 0.0, 22.0, 0.0, -0.00027464982147761604, 1.0
    ),
}
aster_basename = 'ASTGTMV003_N00E022_dem'

### Test that specifying spatial extents overrides GIBS-compatible defaults:

This test specifies a scale extent in the request, which tells Harmony the spatial area the browse imagery should cover. Unlike a bounding box, this spatial extent should be in the coordinates of the granule (e.g., geographic or projected as appropriate).

The output from this test should be constrained to the specified area, in this case that of the ASTER tile, which has approximately the following extent:

* 22 ≤ longitude (degrees east) ≤ 23
* 0 ≤ latitude (degrees north) ≤ 1
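
The approximate extent above follows from the tile's geotransform and pixel dimensions. The sketch below reproduces that arithmetic using the resolution, origin and size from the ASTER reference metadata in this notebook; `bounds_from_geotransform` is a hypothetical helper written for illustration and is not part of `utility` or `rasterio`.

```python
def bounds_from_geotransform(x_res, x_min, y_res, y_max, width, height):
    """Return [xmin, ymin, xmax, ymax] for a north-up grid.

    y_res is negative for north-up images, matching the affine coefficients.
    """
    x_max = x_min + width * x_res
    y_min = y_max + height * y_res
    return [x_min, y_min, x_max, y_max]


resolution = 0.00027464982147761604  # degrees per pixel
aster_bounds = bounds_from_geotransform(
    resolution, 22.0, -resolution, 1.0, width=3641, height=3641
)
print(aster_bounds)
```

The result agrees with `scale_extent = [22, 0, 23, 1]` to within floating point rounding.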

In [None]:
if environment_information is not None:

    scale_extent = [22, 0, 23, 1]

    aster_request = Request(
        collection=environment_information['aster_collection'],
        granule_id=environment_information['aster_granules'][0],
        scale_extent=scale_extent,
        crs='EPSG:4326',
        format='image/png',
    )

    aster_job_id = harmony_client.submit(aster_request)
    harmony_client.wait_for_processing(aster_job_id, show_progress=True)

    reference_files = build_file_list(aster_basename, reference_dir, 'PNG')

    with TemporaryDirectory() as temp_dir:
        downloaded_grid_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                aster_job_id, overwrite=True, directory=temp_dir
            )
        ]

        test_result_files = build_file_list(aster_basename, Path(temp_dir), 'PNG')

        for file_name in test_result_files:
            assert file_name.exists(), f'File does not exist {file_name.resolve()}'
        print_success('all test files generated')

        assert_dataset_produced_correct_results(
            test_result_files[0], reference_files[0], 'PNG'
        )

    print_success('Conversion of ASTER Geotiff to PNG Success')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

### Repeat previous test but request JPG output.

Forms a request for ASTER data as 'image/jpeg', then checks that all expected files are created, that the image has the correct metadata, and that the data in the JPEG file matches the reference data.



In [None]:
if environment_information is not None:

    scale_extent = [22, 0, 23, 1]
    aster_request = Request(
        collection=environment_information['aster_collection'],
        granule_id=environment_information['aster_granules'][0],
        scale_extent=scale_extent,
        crs='EPSG:4326',
        format='image/jpeg',
    )

    aster_job_id = harmony_client.submit(aster_request)
    harmony_client.wait_for_processing(aster_job_id, show_progress=True)

    reference_files = build_file_list(aster_basename, reference_dir, 'JPEG')

    with TemporaryDirectory() as temp_dir:
        downloaded_grid_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                aster_job_id, overwrite=True, directory=temp_dir
            )
        ]

        test_result_files = build_file_list(aster_basename, Path(temp_dir), 'JPEG')
        for file_name in test_result_files:
            assert file_name.exists(), f'File does not exist {file_name.resolve()}'
        print_success('all test files generated')

        assert_dataset_produced_correct_results(
            test_result_files[0], reference_files[0], 'JPEG'
        )

    print_success('Conversion of ASTER Geotiff to JPEG Success')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

## Failure cases raised by bad input parameters.

### Test: Scale Extent has parameters in wrong order.

The request sets xmin > xmax, which should cause the request to fail.
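
The ordering rule being exercised can be sketched as a simple check. This is a hypothetical client-side illustration, not the validation code that runs inside Harmony or HyBIG; only the error message text is taken from the assertion in the test below.

```python
def validate_scale_extent(scale_extent):
    """Raise ValueError unless the extent is ordered [xmin, ymin, xmax, ymax]."""
    x_min, y_min, x_max, y_max = scale_extent
    if x_min >= x_max or y_min >= y_max:
        raise ValueError(
            'Harmony ScaleExtents must be in order [xmin,ymin,xmax,ymax]'
        )


validate_scale_extent([22, 0, 23, 1])  # valid ordering: no exception

try:
    validate_scale_extent([23, 0, 22, 1])  # xmin > xmax
except ValueError as error:
    print(f'Rejected: {error}')
```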

In [None]:
if environment_information is not None:

    reversed_xscale_extent = [23, 0, 22, 1]
    aster_request = Request(
        collection=environment_information['aster_collection'],
        granule_id=environment_information['aster_granules'][0],
        scale_extent=reversed_xscale_extent,
        crs='EPSG:4326',
        format='image/jpeg',
    )

    aster_job_id = harmony_client.submit(aster_request)
    try:
        harmony_client.wait_for_processing(aster_job_id, show_progress=True)
        print_error('Exception not raised for bad Scale Extents')
        assert False, 'Fail Scale Extent test'
    except Exception as exception:
        assert 'Harmony ScaleExtents must be in order [xmin,ymin,xmax,ymax]' in str(
            exception
        ), 'Exception not raised correctly'
        print_success('Exception Raised Correctly for Scale Extents')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

### Test: Scale Sizes are Positive

In [None]:
if environment_information is not None:

    scale_size = [0.002, -0.002]
    aster_request = Request(
        collection=environment_information['aster_collection'],
        granule_id=environment_information['aster_granules'][0],
        scale_size=scale_size,
        crs='EPSG:4326',
        format='image/jpeg',
    )

    try:
        aster_job_id = harmony_client.submit(aster_request)
        harmony_client.wait_for_processing(aster_job_id, show_progress=True)
        print_error('Exception not raised for negative Scale Size.')
        assert False, 'Fail Scale sizes test'
    except Exception as exception:
        assert '"scaleSize[1]" should be >= 0' in str(
            exception
        ), 'Exception not raised correctly'
        print_success('Exception raised correctly for negative Scale Size.')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

## Test requests from 3-band RGB input GeoTIFF

Use MEaSUREs VCF5KYR data (3-band RGB GeoTIFF).

In [None]:
common_measures_metadata = {
    'dtype': 'uint8',
    'nodata': 255.0,
    'width': 7200,
    'height': 3600,
    'crs': CRS.from_epsg(4326),
    'transform': Affine(0.05, 0.0, -180.0, 0.0, -0.05, 90.0),
}
measures_basename = 'VCF5KYR_1991001_001_2018224205008'

### Test a request for PNG output from 3-Band RGB input data

Forms a request for MEaSUREs VCF5KYR data (3-band RGB GeoTIFF) as 'image/png', then checks that all expected files are created, that the image has the correct metadata, and that the data in the PNG file matches the reference data.

In [None]:
if environment_information is not None:
    measures_request = Request(
        collection=environment_information['measures_collection'],
        granule_id=environment_information['measures_granules'][0],
        format='image/png',
    )

    measures_job_id = harmony_client.submit(measures_request)
    harmony_client.wait_for_processing(measures_job_id, show_progress=True)

    reference_files = build_file_list(measures_basename, reference_dir, 'PNG')

    with TemporaryDirectory() as temp_dir:
        downloaded_grid_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                measures_job_id, overwrite=True, directory=temp_dir
            )
        ]

        test_result_files = build_file_list(measures_basename, Path(temp_dir), 'PNG')
        for file_name in test_result_files:
            assert file_name.exists(), f'File does not exist {file_name.resolve()}'

        print_success('all test files generated')

        assert_dataset_produced_correct_results(
            test_result_files[0], reference_files[0], 'PNG'
        )

    print_success('Conversion of MEaSUREs GeoTIFF to PNG Success')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

### Test a request for JPEG output from 3-Band RGB input data

Forms a request for MEaSUREs VCF5KYR data (3-band RGB GeoTIFF) as 'image/jpeg', then checks that all expected files are created, that the image has the correct metadata, and that the data in the JPEG file matches the reference data.

In [None]:
if environment_information is not None:
    measures_request = Request(
        collection=environment_information['measures_collection'],
        granule_id=environment_information['measures_granules'][0],
        format='image/jpeg',
    )

    measures_job_id = harmony_client.submit(measures_request)
    harmony_client.wait_for_processing(measures_job_id, show_progress=True)

    reference_files = build_file_list(measures_basename, reference_dir, 'JPG')

    with TemporaryDirectory() as temp_dir:
        downloaded_grid_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                measures_job_id, overwrite=True, directory=temp_dir
            )
        ]

        test_result_files = build_file_list(measures_basename, Path(temp_dir), 'JPG')

        for file_name in test_result_files:
            assert file_name.exists(), f'File does not exist {file_name.resolve()}'

        print_success('all test files generated')

        assert_dataset_produced_correct_results(
            test_result_files[0], reference_files[0], 'JPEG'
        )

    print_success('Conversion of MEaSUREs Geotiff to JPEG Success')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

### Test that specifying scale sizes (resolutions) overrides GIBS-compatible defaults:

This test specifies a scale size in the request, which tells Harmony the resolution of the produced browse imagery. It uses two custom resolutions (one in each dimension) that are not GIBS-compatible defaults.

The expected output should be a single image that has the expected resolutions, best detected via both the affine transformation matrix and the dimensions of the output. The test specifies a 1 degree resolution in longitude and a 2 degree resolution in latitude, meaning the expected browse image dimensions should be (360, 90).

The resolutions picked have a y-dimension scale size that is twice as large as the x-dimension scale size, so the outputs will look squashed in the vertical direction.
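
The expected dimensions can be checked with a little arithmetic. The sketch below assumes the output covers the full global extent of the input (-180 to 180 degrees east, -90 to 90 degrees north); `expected_dimensions` is a hypothetical helper, not a HyBIG function.

```python
def expected_dimensions(scale_extent, scale_size):
    """Return (width, height) in pixels for an extent and per-axis resolution."""
    x_min, y_min, x_max, y_max = scale_extent
    x_res, y_res = scale_size
    return round((x_max - x_min) / x_res), round((y_max - y_min) / y_res)


global_extent = [-180.0, -90.0, 180.0, 90.0]  # assumed output extent
print(expected_dimensions(global_extent, [1.0, 2.0]))  # (360, 90)
```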

In [None]:
if environment_information is not None:
    scale_sizes = [1.0, 2.0]
    scale_size_reference_dir = reference_dir / 'scale_size'

    scale_size_request = Request(
        collection=environment_information['measures_collection'],
        granule_id=environment_information['measures_granules'][0],
        scale_size=scale_sizes,
        crs='EPSG:4326',
        format='image/png',
    )

    scale_size_job_id = harmony_client.submit(scale_size_request)
    harmony_client.wait_for_processing(scale_size_job_id, show_progress=True)

    reference_files = build_file_list(
        measures_basename, scale_size_reference_dir, 'PNG'
    )

    with TemporaryDirectory() as temp_dir:
        downloaded_scale_size_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                scale_size_job_id, overwrite=True, directory=temp_dir
            )
        ]

        test_result_files = build_file_list(measures_basename, Path(temp_dir), 'PNG')

        for file_name in test_result_files:
            assert file_name.exists(), f'File does not exist {file_name.resolve()}'

        print_success('all test files generated')

        assert_dataset_produced_correct_results(
            test_result_files[0], reference_files[0], 'PNG'
        )

    print_success('Conversion of MEaSUREs GeoTIFF to PNG specifying scaleSize. Success')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

### Test that specifying dimensions overrides GIBS-compatible defaults:

This test specifies the dimensions of the output browse image in the request, asking for a square output. Because the input has twice as many pixels in the x-direction as in the y-direction, the output browse image will look squashed in the horizontal direction.
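
The squashing can be quantified by working out the resolution each output pixel must span. The sketch below assumes the 180 by 180 output covers the same global extent as the input; `implied_resolution` is a hypothetical helper used only for this illustration.

```python
def implied_resolution(scale_extent, width, height):
    """Return (x_res, y_res) in degrees per pixel for the given image size."""
    x_min, y_min, x_max, y_max = scale_extent
    return (x_max - x_min) / width, (y_max - y_min) / height


global_extent = [-180.0, -90.0, 180.0, 90.0]  # assumed output extent
x_res, y_res = implied_resolution(global_extent, width=180, height=180)
print(x_res, y_res)  # 2.0 1.0
```

Under that assumption each pixel spans twice as many degrees of longitude as of latitude, which is why the image appears compressed horizontally.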

In [None]:
if environment_information is not None:
    dimensions_reference_dir = reference_dir / 'dimensions'

    dimensions_request = Request(
        collection=environment_information['measures_collection'],
        granule_id=environment_information['measures_granules'][0],
        height=180,
        width=180,
        format='image/png',
    )

    dimensions_job_id = harmony_client.submit(dimensions_request)
    harmony_client.wait_for_processing(dimensions_job_id, show_progress=True)

    reference_files = build_file_list(
        measures_basename, dimensions_reference_dir, 'PNG'
    )

    with TemporaryDirectory() as temp_dir:
        downloaded_dimensions_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                dimensions_job_id, overwrite=True, directory=temp_dir
            )
        ]

        test_result_files = build_file_list(measures_basename, Path(temp_dir), 'PNG')

        for file_name in test_result_files:
            assert file_name.exists(), f'File does not exist {file_name.resolve()}'

        print_success('all test files generated')

        assert_dataset_produced_correct_results(
            test_result_files[0], reference_files[0], 'PNG'
        )

    print_success(
        'Conversion of MEaSUREs GeoTIFF to PNG specifying dimensions. Success'
    )
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

### Test of tiled outputs:

This test specifies a combination of scale size and scale extent that causes
HyBIG to tile the output imagery, while keeping the number of tiles small.

The region chosen is around Iceland: [-30, 60, -10, 70].

To trigger tiling, the extent and scale sizes must give a total number of cells greater than 67108864:

- The width of the area is 20 degrees, so a scale\_size of 0.001 yields a total image width of 20 / 0.001 = 20000 cells.
- The height is 10 degrees, and a scale\_size of 0.0029 triggers tiling while still leaving just one row of tiles: round(10 / 0.0029) = 3448 cells.
- Total cells: 20000 × 3448 = 68960000.

The expected output is 5 contiguous tiles in a single row, each covering part of the region around Iceland.

Each tile should be 4096 cells wide, except the last tile, which is only 3616 cells wide.

**Tile 0 (r00c00):**
* -30 ≤ longitude (degrees east) ≤ -25.904
* 60 ≤ latitude (degrees north) ≤ 70

**Tile 1 (r00c01):**
* -25.904 ≤ longitude (degrees east) ≤ -21.808
* 60 ≤ latitude (degrees north) ≤ 70

**Tile 2 (r00c02):**
* -21.808 ≤ longitude (degrees east) ≤ -17.712
* 60 ≤ latitude (degrees north) ≤ 70

**Tile 3 (r00c03):**
* -17.712 ≤ longitude (degrees east) ≤ -13.616
* 60 ≤ latitude (degrees north) ≤ 70

**Tile 4 (r00c04):**
* -13.616 ≤ longitude (degrees east) ≤ -10
* 60 ≤ latitude (degrees north) ≤ 70


In [None]:
if environment_information is not None:
    iceland_tiles = [
        f'{measures_basename}.r00c00',
        f'{measures_basename}.r00c01',
        f'{measures_basename}.r00c02',
        f'{measures_basename}.r00c03',
        f'{measures_basename}.r00c04',
    ]

    iceland_extent = [-30, 60, -10, 70]
    iceland_scale_size = [0.001, 0.0029]
    tiled_reference_dir = reference_dir / 'tiled'

    tiled_request = Request(
        collection=environment_information['measures_collection'],
        granule_id=environment_information['measures_granules'][0],
        scale_extent=iceland_extent,
        scale_size=iceland_scale_size,
        crs='EPSG:4326',
        format='image/png',
    )

    tiled_job_id = harmony_client.submit(tiled_request)
    harmony_client.wait_for_processing(tiled_job_id, show_progress=True)

    with TemporaryDirectory() as temp_dir:
        downloaded_tiled_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                tiled_job_id, overwrite=True, directory=temp_dir
            )
        ]

        # Perform the same checks for all tiles:
        for tile_basename in iceland_tiles:
            tile_reference_files = build_file_list(
                tile_basename, tiled_reference_dir, 'PNG'
            )
            tile_result_files = build_file_list(tile_basename, Path(temp_dir), 'PNG')

            for file_name in tile_result_files:
                assert file_name.exists(), f'File does not exist {file_name.resolve()}'

            print_success(f'All {tile_basename} test files generated')

            assert_dataset_produced_correct_results(
                tile_result_files[0], tile_reference_files[0], 'PNG'
            )

    print_success('Conversion of MEaSUREs GeoTIFF to tiled PNGs. Success')
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')

### Test that demonstrates variable selection for custom colour maps:

This test will run a request that specifies a variable for HyBIG to use. It is a bit of a quirky way to ensure HyBIG has access to the related URL for a custom colour map. The output should be in colour rather than greyscale.

PREFIRE granules also cover the whole Earth at sufficient resolution such that HyBIG will create tiled output, so there will be 15 tiles generated (3 rows, 5 columns).

This test primarily seeks to ensure the values in the output images use the colour map specified in the related URLs of the specified [UMM-Var record](https://cmr.uat.earthdata.nasa.gov/search/variables.umm_json?concept_id=V1270130943-LARC_CLOUD).
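
The 15 tile names follow a regular `rRRcCC` pattern, so they can also be generated rather than listed by hand. This is only a sketch of an alternative; `tile_basenames` is a hypothetical helper and the test cell below keeps its explicit list.

```python
from itertools import product


def tile_basenames(basename, rows, columns):
    """Return tile names in row-major order, e.g. '<basename>.r00c00'."""
    return [
        f'{basename}.r{row:02d}c{column:02d}'
        for row, column in product(range(rows), range(columns))
    ]


names = tile_basenames(
    'PREFIRE_SAT2_2B-FLX_S07_R00_20210721013413_03040.nc.G00', rows=3, columns=5
)
print(len(names), names[0], names[-1])
```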

In [None]:
if environment_information is not None:
    prefire_basename = 'PREFIRE_SAT2_2B-FLX_S07_R00_20210721013413_03040.nc.G00'

    prefire_tiles = [
        f'{prefire_basename}.r00c00',
        f'{prefire_basename}.r00c01',
        f'{prefire_basename}.r00c02',
        f'{prefire_basename}.r00c03',
        f'{prefire_basename}.r00c04',
        f'{prefire_basename}.r01c00',
        f'{prefire_basename}.r01c01',
        f'{prefire_basename}.r01c02',
        f'{prefire_basename}.r01c03',
        f'{prefire_basename}.r01c04',
        f'{prefire_basename}.r02c00',
        f'{prefire_basename}.r02c01',
        f'{prefire_basename}.r02c02',
        f'{prefire_basename}.r02c03',
        f'{prefire_basename}.r02c04',
    ]

    colour_map_reference_dir = reference_dir / 'colour_map'

    colour_map_request = Request(
        collection=environment_information['prefire_collection'],
        granule_id=environment_information['prefire_granules'][0],
        variables=[environment_information['prefire_variable']],
        format='image/png',
    )

    colour_map_job_id = harmony_client.submit(colour_map_request)
    harmony_client.wait_for_processing(colour_map_job_id, show_progress=True)

    with TemporaryDirectory() as temp_dir:
        downloaded_colour_map_outputs = [
            file_future.result()
            for file_future in harmony_client.download_all(
                colour_map_job_id, overwrite=True, directory=temp_dir
            )
        ]

        for tile_basename in prefire_tiles:
            tile_reference_files = build_file_list(
                tile_basename, colour_map_reference_dir, 'PNG'
            )
            tile_result_files = build_file_list(tile_basename, Path(temp_dir), 'PNG')

            for file_name in tile_result_files:
                assert file_name.exists(), f'File does not exist {file_name.resolve()}'

            print_success(f'All {tile_basename} test files generated')

            assert_dataset_produced_correct_results(
                tile_result_files[0], tile_reference_files[0], 'PNG'
            )

    print_success(
        'Conversion of PREFIRE GeoTIFF to PNG using custom colour map. Success'
    )
else:
    print('Skipping test: HyBIG regression tests not configured for this environment.')