# LAADS DAAC Geoloco Regression Tests

This notebook contains a suite of regression tests that compare output from the LAADS DAAC Geoloco Harmony service against reference data generated on premises.

Geoloco ideally operates on Level 3 and Level 4 data. Level 1B and Level 2 data can also be processed, but if no `proj4` string is supplied, the outputs are automatically reprojected to `Geographic` and regridded to the data's raster resolution.

Although geoloco can perform variable and location subsetting, reprojection, resampling and regridding, this regression test suite will focus on reprojection, resampling and regridding. 

## Prerequisites

The dependencies for this notebook are listed in `environment.yaml`. To test or install locally, create the papermill environment used in the automated regression testing suite:

`conda env create -f ./environment.yaml && conda activate papermill-geoloco`

A `.netrc` file must also be located in the test directory of this repository.
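
Before running the notebook, it can help to confirm that the `.netrc` file actually contains credentials. The sketch below uses Python's standard `netrc` module. The helper name `has_netrc_credentials` is hypothetical, and the host `uat.urs.earthdata.nasa.gov` is an assumption about which Earthdata Login host the UAT environment authenticates against; adjust it for the environment under test.

```python
import netrc
from pathlib import Path

# Assumed Earthdata Login host for the UAT environment (not stated in this notebook).
EDL_UAT_HOST = 'uat.urs.earthdata.nasa.gov'


def has_netrc_credentials(netrc_path, host):
    """Return True if netrc_path holds a login and password for host."""
    path = Path(netrc_path)
    if not path.is_file():
        return False
    try:
        entry = netrc.netrc(str(path)).authenticators(host)
    except netrc.NetrcParseError:
        return False
    # authenticators() returns (login, account, password), or None if no entry matches.
    return entry is not None and bool(entry[0]) and bool(entry[2])


# Hypothetical usage: check the .netrc file in the current (test) directory.
print(has_netrc_credentials('.netrc', EDL_UAT_HOST))
```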

In [None]:
from harmony import Client, Collection, Environment, Request

from utilities import (
    submit_and_download,
    remove_results_files,
    print_error,
    print_success,
    compare_dimensions,
    compare_data,
)

## Set Default Parameters

`papermill` requires default values for the parameters used in the workflow. In this case, the only parameter is `harmony_host_url`.

The following are the valid values:
- Production: https://harmony.earthdata.nasa.gov
- UAT: https://harmony.uat.earthdata.nasa.gov
- SIT: https://harmony.sit.earthdata.nasa.gov
- Local: http://localhost:3000

In [None]:
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'
# harmony_host_url = 'https://harmony.sit.earthdata.nasa.gov'
# harmony_host_url = 'http://localhost:3000'
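
When the notebook is executed through `papermill`, the default above can be overridden on the command line with the `-p` option. The sketch below only assembles such a command and checks the host against the valid values listed above. The helper `build_papermill_command` and both notebook file names are hypothetical placeholders, not names taken from this repository.

```python
import shlex

# Valid hosts, copied from the list above.
VALID_HARMONY_HOSTS = (
    'https://harmony.earthdata.nasa.gov',
    'https://harmony.uat.earthdata.nasa.gov',
    'https://harmony.sit.earthdata.nasa.gov',
    'http://localhost:3000',
)


def build_papermill_command(input_notebook, output_notebook, harmony_host_url):
    """Build a papermill command that overrides harmony_host_url."""
    if harmony_host_url not in VALID_HARMONY_HOSTS:
        raise ValueError(f'Unknown Harmony host: {harmony_host_url}')
    return [
        'papermill',
        input_notebook,
        output_notebook,
        '-p',
        'harmony_host_url',
        harmony_host_url,
    ]


# Hypothetical notebook names, for illustration only.
command = build_papermill_command(
    'Geoloco_Regression.ipynb',
    'Geoloco_Results.ipynb',
    'https://harmony.sit.earthdata.nasa.gov',
)
print(shlex.join(command))
```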

## Identify Harmony Environment

In [None]:
host_environment = {
    'http://localhost:3000': Environment.LOCAL,
    'https://harmony.sit.earthdata.nasa.gov': Environment.SIT,
    'https://harmony.uat.earthdata.nasa.gov': Environment.UAT,
    'https://harmony.earthdata.nasa.gov': Environment.PROD,
}


harmony_environment = host_environment.get(harmony_host_url)

if harmony_environment is not None:
    harmony_client = Client(env=harmony_environment)

## Setting up Collection Environment Variables

The cell below sets up the collection, granule and other necessary variables for each tested dataset. The datasets provided are in the `UAT` environment. There is one dataset each for Level 1B, Level 2 and Level 3:

- Level 1B: MOD021KM
- Level 2: MOD35_L2
- Level 3: MOD08_D3

Also provided are the `proj4` string for the Geographic transformation used for reprojection and the downscale sizes used for regridding.

In [None]:
mod021km_non_production_info = {
    'collection': Collection(id='C1256826282-LAADSCDUAT'),
    'granule_id': 'G1259320275-LAADSCDUAT',
    'variable': ['EV_250_Aggr1km_RefSB'],
    'downscale_size': [0.01802, 0.01802],
}

mod35l2_non_production_info = {
    'collection': Collection(id='C1257437479-LAADSCDUAT'),
    'granule_id': 'G1261599141-LAADSCDUAT',
    'variable': ['Cloud_Mask'],
    'downscale_size': [0.01802, 0.01802],
}

mod08d3_non_production_info = {
    'collection': Collection(id='C1257773477-LAADSCDUAT'),
    'granule_id': 'G1259320277-LAADSCDUAT',
    'variable': ['Aerosol_Optical_Depth_Land_Ocean_Mean'],
    'downscale_size': [2, 2],
}

geo_proj4_string = '+a=6378137.0 +b=6356752.3142451793 +no_defs +proj=latlong'

resampling_string = 'NN'

file_indicators = {
    'MOD021KM': 'EV_250_Aggr1km_RefSB_1.hdf',
    'MOD35_L2': 'Cloud_Mask_1.hdf',
    'MOD08_D3': 'Aerosol_Optical_Depth_Land_Ocean_Mean.hdf',
}

reference_data = {
    'MOD021KM': 'reference_data/MOD021KM.A2023001.0020.061.psrpcs_001701802061.EV_250_Aggr1km_RefSB_1.hdf',
    'MOD35_L2': 'reference_data/MOD35_L2.A2023001.0020.061.psrpcs_001701881013.Cloud_Mask_1.hdf',
    'MOD08_D3': 'reference_data/MOD08_D3.A2023001.061.psrpcs_001701881265.Aerosol_Optical_Depth_Land_Ocean_Mean.hdf',
}
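
To make the `proj4` string above easier to read, the sketch below splits it into its parameters. The helper `parse_proj4` is hypothetical and is not part of Geoloco or Harmony; it only illustrates that `+a` and `+b` give the semi-major and semi-minor axes of the ellipsoid, which here correspond to WGS84 (inverse flattening of about 298.257).

```python
def parse_proj4(proj4_string):
    """Split a proj4 string into a dict; flags such as +no_defs map to True."""
    parameters = {}
    for token in proj4_string.split():
        key, separator, value = token.lstrip('+').partition('=')
        parameters[key] = value if separator else True
    return parameters


geo_parameters = parse_proj4(
    '+a=6378137.0 +b=6356752.3142451793 +no_defs +proj=latlong'
)

# Derive the inverse flattening from the two ellipsoid axes.
semi_major = float(geo_parameters['a'])
semi_minor = float(geo_parameters['b'])
inverse_flattening = semi_major / (semi_major - semi_minor)

print(geo_parameters['proj'], round(inverse_flattening, 3))
```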

These selected collections and granules are only available in the UAT environment. To minimize the output, all requests will utilize variable subsetting.

In [None]:
mod021km_geoloco_env = {
    Environment.LOCAL: mod021km_non_production_info,
    Environment.UAT: mod021km_non_production_info,
    Environment.SIT: mod021km_non_production_info,
}
mod35l2_geoloco_env = {
    Environment.LOCAL: mod35l2_non_production_info,
    Environment.UAT: mod35l2_non_production_info,
    Environment.SIT: mod35l2_non_production_info,
}
mod08d3_geoloco_env = {
    Environment.LOCAL: mod08d3_non_production_info,
    Environment.UAT: mod08d3_non_production_info,
    Environment.SIT: mod08d3_non_production_info,
}

# Look up the settings for the current environment; `.get` returns None
# when Geoloco is not configured for that environment (e.g. production).
mod021km_geoloco_info = mod021km_geoloco_env.get(harmony_environment)
mod35l2_geoloco_info = mod35l2_geoloco_env.get(harmony_environment)
mod08d3_geoloco_info = mod08d3_geoloco_env.get(harmony_environment)

## Reprojection/Resampling/Regridding Regression Tests

In the cell below, reprojection is tested using a Geographic `proj4` string, a 2x resolution scale size, and nearest neighbor resampling for the Level 1B, Level 2 and Level 3 collections and granules. Variable subsetting considerably reduces the output size. For each collection, dimension sizes are compared between the reference data and the output data first; if they match, the data values are compared as well.

In [None]:
if (
    mod021km_geoloco_info is not None
    and mod35l2_geoloco_info is not None
    and mod08d3_geoloco_info is not None
):

    mod021km_request = Request(
        collection=mod021km_geoloco_info['collection'],
        granule_id=mod021km_geoloco_info['granule_id'],
        variables=mod021km_geoloco_info['variable'],
        scale_size=mod021km_geoloco_info['downscale_size'],
        crs=geo_proj4_string,
        interpolation=resampling_string,
    )

    mod35l2_request = Request(
        collection=mod35l2_geoloco_info['collection'],
        granule_id=mod35l2_geoloco_info['granule_id'],
        variables=mod35l2_geoloco_info['variable'],
        scale_size=mod35l2_geoloco_info['downscale_size'],
        crs=geo_proj4_string,
        interpolation=resampling_string,
    )

    mod08d3_request = Request(
        collection=mod08d3_geoloco_info['collection'],
        granule_id=mod08d3_geoloco_info['granule_id'],
        variables=mod08d3_geoloco_info['variable'],
        scale_size=mod08d3_geoloco_info['downscale_size'],
        crs=geo_proj4_string,
        interpolation=resampling_string,
    )

    mod021km_compare_file = submit_and_download(
        harmony_client, mod021km_request, file_indicators['MOD021KM']
    )
    mod35l2_compare_file = submit_and_download(
        harmony_client, mod35l2_request, file_indicators['MOD35_L2']
    )
    mod08d3_compare_file = submit_and_download(
        harmony_client, mod08d3_request, file_indicators['MOD08_D3']
    )

    mod021km_test = True
    mod35l2_test = True
    mod08d3_test = True

    if compare_dimensions(reference_data['MOD021KM'], mod021km_compare_file):
        if not compare_data(reference_data['MOD021KM'], mod021km_compare_file):
            print_error('MOD021KM data mismatch.')
            mod021km_test = False
    else:
        print_error('MOD021KM data dimension mismatch.')
        mod021km_test = False

    if compare_dimensions(reference_data['MOD35_L2'], mod35l2_compare_file):
        if not compare_data(reference_data['MOD35_L2'], mod35l2_compare_file):
            print_error('MOD35_L2 data mismatch.')
            mod35l2_test = False
    else:
        print_error('MOD35_L2 data dimension mismatch.')
        mod35l2_test = False

    if compare_dimensions(reference_data['MOD08_D3'], mod08d3_compare_file):
        if not compare_data(reference_data['MOD08_D3'], mod08d3_compare_file):
            print_error('MOD08_D3 data mismatch.')
            mod08d3_test = False
    else:
        print_error('MOD08_D3 data dimension mismatch.')
        mod08d3_test = False

    remove_results_files()

    geoloco_tests = mod021km_test and mod35l2_test and mod08d3_test

    if geoloco_tests:
        print_success('Geoloco Reprojection/Resampling/Regridding requests.')
    else:
        raise Exception('Geoloco test suite failed')

else:
    print(
        f'Geoloco is not configured for this environment: "{harmony_environment}" - skipping test.'
    )
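
The dimension-then-data check in the cell above is repeated once per collection. As a possible refactoring, the sketch below factors it into a loop. The helpers `check_output` and `run_checks` are hypothetical, and the comparison functions are passed in as arguments so the example stays self-contained; in the notebook they would be `compare_dimensions` and `compare_data` from `utilities`. The stub comparators and dictionaries here merely stand in for real HDF files.

```python
def check_output(name, reference, output, compare_dimensions, compare_data):
    """Return (passed, message) for one collection, checking dimensions first."""
    if not compare_dimensions(reference, output):
        return False, f'{name} data dimension mismatch.'
    if not compare_data(reference, output):
        return False, f'{name} data mismatch.'
    return True, f'{name} matches the reference data.'


def run_checks(cases, compare_dimensions, compare_data):
    """Run check_output for every case and collect the failure messages."""
    failures = []
    for name, (reference, output) in cases.items():
        passed, message = check_output(
            name, reference, output, compare_dimensions, compare_data
        )
        if not passed:
            failures.append(message)
    return failures


# Stub comparators operating on dicts that stand in for files (illustration only).
def stub_compare_dimensions(reference, output):
    return reference['shape'] == output['shape']


def stub_compare_data(reference, output):
    return reference['values'] == output['values']


cases = {
    'MOD021KM': ({'shape': (2,), 'values': [1, 2]}, {'shape': (2,), 'values': [1, 2]}),
    'MOD35_L2': ({'shape': (2,), 'values': [1, 2]}, {'shape': (3,), 'values': [1, 2, 3]}),
    'MOD08_D3': ({'shape': (2,), 'values': [1, 2]}, {'shape': (2,), 'values': [1, 9]}),
}

failures = run_checks(cases, stub_compare_dimensions, stub_compare_data)
for message in failures:
    print(message)
```

Collecting the failure messages in a list keeps the behaviour of the original cell, where every collection is checked before the suite is declared failed.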