# CDAT Migration Regression Testing Notebook (`.png` files)

This notebook is used to perform regression testing between the development and
production versions of a diagnostic set.

## How to use

PREREQUISITE: The diagnostic set's netCDF stored in `.json` files in two directories
(dev and `main` branches).

1. Make a copy of this notebook under `auxiliary_tools/cdat_regression_testing/<DIR_NAME>`.
2. Run `mamba create -n cdat_regression_test -y -c conda-forge "python<3.12" xarray netcdf4 dask pandas matplotlib-base ipykernel`
3. Run `mamba activate cdat_regression_test`
4. Update `SET_DIR` and `SET_NAME` in the copy of your notebook.
5. Run all cells IN ORDER.


## Setup Code


In [1]:
import glob

from auxiliary_tools.cdat_regression_testing.utils import get_image_diffs

SET_NAME = "enso_diags"
SET_DIR = "663-enso-diags"

DEV_PATH = f"/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/{SET_DIR}/{SET_NAME}/**"
DEV_GLOB = sorted(glob.glob(DEV_PATH + "/*.png"))
DEV_NUM_FILES = len(DEV_GLOB)

MAIN_PATH = f"/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/{SET_NAME}/**"
MAIN_GLOB = sorted(glob.glob(MAIN_PATH + "/*.png"))
MAIN_NUM_FILES = len(MAIN_GLOB)

In [2]:
def _check_if_files_found():
    if DEV_NUM_FILES == 0 or MAIN_NUM_FILES == 0:
        raise IOError(
            "No files found at DEV_PATH and/or MAIN_PATH. "
            f"Please check {DEV_PATH} and {MAIN_PATH}."
        )


def _check_if_matching_filecount():
    if DEV_NUM_FILES != MAIN_NUM_FILES:
        raise IOError(
            "Number of files do not match at DEV_PATH and MAIN_PATH "
            f"({DEV_NUM_FILES} vs. {MAIN_NUM_FILES})."
        )

    print(f"Matching file count ({DEV_NUM_FILES} and {MAIN_NUM_FILES}).")


def _check_if_missing_files():
    missing_count = 0

    for fp_main in MAIN_GLOB:
        fp_dev = fp_main.replace(SET_DIR, "main")

        if fp_dev not in MAIN_GLOB:
            print(f"No production file found to compare with {fp_dev}!")
            missing_count += 1

    for fp_dev in DEV_GLOB:
        fp_main = fp_main.replace("main", SET_DIR)

        if fp_main not in DEV_GLOB:
            print(f"No development file found to compare with {fp_main}!")
            missing_count += 1

    print(f"Number of files missing: {missing_count}")

## 1. Check for matching and equal number of files


In [3]:
_check_if_files_found()

In [4]:
_check_if_missing_files()

Number of files missing: 0


In [5]:
_check_if_matching_filecount()

OSError: Number of files do not match at DEV_PATH and MAIN_PATH (24 vs. 12).

## 2 Compare the plots between branches

- Compare "ref" and "test" files
- "diff" files are ignored because getting relative diffs for these does not make sense (relative diff will be above tolerance)


In [None]:
MAIN_GLOB

['/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/FLNS-feedback/feedback-FLNS-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/FSNS-feedback/feedback-FSNS-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/LHFLX-feedback/feedback-LHFLX-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/LHFLX-response/regression-coefficient-lhflx-over-nino34.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/NET_FLUX_SRF-feedback/feedback-NET_FLUX_SRF-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/NET_FLUX_SRF-response/regression-coefficient-net_flux_srf-over-nino34.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/PRECT-response/regression-coefficient-prect-over-nino34.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/SHFLX-feedback/feedback-SHFLX-NINO3-TS-NINO3.png',
 '/global/cfs/c

In [6]:
DEV_GLOB

['/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FLNS-feedback/feedback-FLNS-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FLNS-feedback_diff/feedback-FLNS-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FSNS-feedback/feedback-FSNS-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FSNS-feedback_diff/feedback-FSNS-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/LHFLX-feedback/feedback-LHFLX-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/LHFLX-feedback_diff/feedback-LHFLX-NINO3-TS-NINO3.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/LHFLX-response/regression-coefficient-lhflx-over-nino34.png',
 '/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/LHFLX-response_diff/regre

In [7]:
dev_glob = [file for file in DEV_GLOB if "diff" not in file]
for main_path, dev_path in zip(MAIN_GLOB, dev_glob):
    print("Comparing:")
    print(f"    * {main_path}")
    print(f"    * {dev_path}")

    get_image_diffs(dev_path, main_path)

Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/FLNS-feedback/feedback-FLNS-NINO3-TS-NINO3.png
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FLNS-feedback/feedback-FLNS-NINO3-TS-NINO3.png
     * Difference path /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FLNS-feedback_diff/feedback-FLNS-NINO3-TS-NINO3.png
Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/FSNS-feedback/feedback-FSNS-NINO3-TS-NINO3.png
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FSNS-feedback/feedback-FSNS-NINO3-TS-NINO3.png
     * Difference path /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/enso_diags/FSNS-feedback_diff/feedback-FSNS-NINO3-TS-NINO3.png
Comparing:
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/main/enso_diags/LHFLX-feedback/feedback-LHFLX-NINO3-TS-NINO3.png
    * /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/663-enso-diags/e

### Results

- All plots are really close. The two extra latitude points for `"ccb"` in the CDAT code
  influence the diffs. Specifically, the regression-coefficient plots for xCDAT show a missing
  line at the bottom which is most likely due to the two extra latitude points not being included.
