# Filtering Regression Tests

This notebook runs a suite of regression tests against the Harmony Filtering Service.
These tests use a sample TEMPO NO₂ granule to verify that filtering logic works as expected and that output matches reference data in `reference_data/`.


In [None]:
# Default Harmony host. (Overridden if you set HARMONY_HOST_URL in the shell before running.)
harmony_host_url = 'https://harmony.uat.earthdata.nasa.gov'
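
The cell above only sets a default; the notebook does not show how the `HARMONY_HOST_URL` shell variable reaches it (in a papermill run, values like this are typically injected as parameters). As a minimal sketch, assuming you wanted to read the variable directly in Python instead, a hypothetical helper `resolve_harmony_host` could fall back to the UAT default:

```python
import os

DEFAULT_HARMONY_HOST_URL = 'https://harmony.uat.earthdata.nasa.gov'


def resolve_harmony_host(environ=os.environ):
    """Return HARMONY_HOST_URL from the environment, or the UAT default.

    Hypothetical helper for illustration; it is not part of this notebook.
    """
    # Treat an unset or empty variable as "use the default".
    return environ.get('HARMONY_HOST_URL') or DEFAULT_HARMONY_HOST_URL


harmony_host_url = resolve_harmony_host()
```

Passing the mapping as an argument keeps the helper easy to exercise without touching the real shell environment.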

## Prerequisites

- Create the `papermill-filtering` environment:
  ```bash
  conda env create -f ./environment.yaml && conda activate papermill-filtering
  ```
- Ensure a `.netrc` file containing your Earthdata Login credentials is present in this notebook's directory.
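
Before running the suite, it can help to confirm that the `.netrc` file actually contains an Earthdata Login entry. The sketch below uses Python's standard `netrc` module; the helper name `has_earthdata_login` is hypothetical, and the default host `uat.urs.earthdata.nasa.gov` is an assumption (the UAT Earthdata Login host), so adjust it for the environment you test against:

```python
import netrc
from pathlib import Path


def has_earthdata_login(netrc_path, host='uat.urs.earthdata.nasa.gov'):
    """Return True if the .netrc file at netrc_path has an entry for host.

    Hypothetical helper; the default host is an assumption, not something
    this notebook specifies.
    """
    path = Path(netrc_path)
    if not path.is_file():
        return False
    try:
        # authenticators() returns None when neither the host nor a
        # `default` entry is present in the file.
        entry = netrc.netrc(str(path)).authenticators(host)
    except netrc.NetrcParseError:
        # A malformed file is treated the same as a missing entry.
        return False
    return entry is not None
```

For example, `has_earthdata_login('.netrc')` checks a file in the current working directory.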


### Import required packages:


In [None]:
from harmony import Client, Collection, Environment, Request
from utilities import validate_filter_outputs

### Set up environment-dependent variables:

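Map each supported Harmony host (UAT, SIT, and local) to the collection, granule, and `Environment` used for testing. All three share the same non-production collection and granule.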

In [None]:
non_production_collection = {
    'filter_collection': Collection(id='C1262899964-LARC_CLOUD')
}
non_prod_granule_data = {
    'filter_granules': ['G1273455903-LARC_CLOUD']
}

collection_data = {
    'https://harmony.uat.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.UAT
    },
    'https://harmony.sit.earthdata.nasa.gov': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.SIT
    },
    'http://localhost:3000': {
        **non_production_collection,
        **non_prod_granule_data,
        'env': Environment.LOCAL
    }
}

environment_information = collection_data.get(harmony_host_url)

if environment_information is not None:
    harmony_client = Client(env=environment_information['env'])
    filter_collection = environment_information['filter_collection']
    filter_granule = environment_information['filter_granules'][0]
else:
    print(f'Skipping tests: filtering is not configured for environment: "{harmony_host_url}"')
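
The lookup above is an exact string match on the host URL, so any host that is not a key (production, for example, or a configured host written with a trailing slash) yields `None` and the tests are skipped. A minimal sketch of a more forgiving lookup follows; `lookup_environment` and `configured_hosts` are hypothetical names, and plain strings stand in for the harmony-py `Collection` and `Environment` objects so the example runs without the library:

```python
# Stand-in for the notebook's `collection_data`; plain strings replace the
# harmony-py objects so the sketch runs anywhere.
configured_hosts = {
    'https://harmony.uat.earthdata.nasa.gov': 'UAT',
    'https://harmony.sit.earthdata.nasa.gov': 'SIT',
    'http://localhost:3000': 'LOCAL',
}


def lookup_environment(host_url, table=configured_hosts):
    """Look up host_url, ignoring surrounding whitespace and trailing slashes.

    Returns None when the host is not configured, matching dict.get().
    """
    return table.get(host_url.strip().rstrip('/'))
```

With this helper, `lookup_environment('https://harmony.uat.earthdata.nasa.gov/')` resolves to the UAT entry, while an unconfigured host still returns `None` and triggers the skip path.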


## Test: Filtering request on TEMPO NO₂ `vertical_column_stratosphere` variable

Submit a single-granule request, wait for the job to complete, and validate the output files against `reference_data/`.

In [None]:
if environment_information is not None:
    # Build the request
    request = Request(
        collection=filter_collection,
        granule_id=[filter_granule]
    )
    print(harmony_client.request_as_curl(request))
    
    # Submit and wait for completion
    job_id = harmony_client.submit(request)
    harmony_client.wait_for_processing(job_id, show_progress=True)
    
    # Validate against reference_data/
    validate_filter_outputs(harmony_client, job_id)
else:
    print(f'Skipping test: filtering is not configured for environment: "{harmony_host_url}"')
