# `precip-dot` data test `02`:
## Consistent confidence bounds in final data

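Verify that the confidence bounds in the final data (the product of the warped deltas and the Atlas 14 data) are consistent: at every valid grid cell, the lower bound must be below the estimate, and the estimate must be below the upper bound.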
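As a rough illustration of the consistency condition, the sketch below applies the same strict ordering check to small synthetic arrays. The helper name `bounds_consistent` and the numbers are made up for this example and are not part of the test itself.

```python
import numpy as np


def bounds_consistent(lower, estimate, upper):
    """Return True if lower < estimate < upper holds at every element."""
    return bool(np.all(lower < estimate) and np.all(estimate < upper))


# Synthetic values for illustration only (not taken from the real data)
pf = np.array([[1.0, 2.0], [3.0, 4.0]])
pf_lower = pf * 0.8
pf_upper = pf * 1.2

ok = bounds_consistent(pf_lower, pf, pf_upper)

# Raising one lower bound above its estimate breaks the ordering
bad_lower = pf_lower.copy()
bad_lower[0, 0] = 1.5
bad = bounds_consistent(bad_lower, pf, pf_upper)

print(ok, bad)
```

A single out-of-order cell is enough to make the check fail, which is the behaviour the test relies on.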

### inputs

Before running this notebook, set the `UNDIFF_DIR` environment variable to the path of the directory containing the "combined" data to test.
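One way to fail early when the variable is missing is sketched below. `resolve_data_dir` is a hypothetical helper, not something this notebook defines; it only assumes that `UNDIFF_DIR` should name an existing directory.

```python
import os


def resolve_data_dir(env=os.environ, var="UNDIFF_DIR"):
    """Return the directory named by `var`, or raise a descriptive error.

    Hypothetical helper for illustration; `env` is any mapping of
    environment variable names to values.
    """
    path = env.get(var)
    if not path:
        raise RuntimeError(f"{var} is not set; set it before running the notebook")
    if not os.path.isdir(path):
        raise RuntimeError(f"{var}={path!r} is not an existing directory")
    return path
```

Calling `resolve_data_dir()` in place of a bare `os.getenv("UNDIFF_DIR")` would turn an unset variable into a clear error message rather than a `TypeError` raised later by `os.path.join`.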

In [1]:
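def test_data_file(args):
    """
    Read one data file and test consistency of the confidence bounds/estimates.

    args is a (file path, valid_idx) tuple so the function can be used with Pool.map.
    Returns True if pf_lower < pf < pf_upper at every valid grid cell.
    """
    fp, valid_idx = args
    ds = xr.open_dataset(fp)
    # stack lower bound, estimate, and upper bound, keeping only valid grid cells
    arrs = np.array(
        [
            ds[var].values[:, valid_idx[:, 0], valid_idx[:, 1]]
            for var in ["pf_lower", "pf", "pf_upper"]
        ]
    )
    ds.close()
    # strict ordering: lower < estimate < upper
    result = np.all(arrs[0] < arrs[1]) and np.all(arrs[1] < arrs[2])
    return result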


def run_test(data_dir):
    """
    Run test on output data directory
    """
    print("Beginning test of consistent confidence bounds in final data.\n")
    # durations to read
    durations = [
        "60m",
        "2h",
        "3h",
        "6h",
        "12h",
        "24h",
        "2d",
        "3d",
        "4d",
        "7d",
        "10d",
        "20d",
        "30d",
        "45d",
        "60d",
    ]

    # template path
    data_fp = os.path.join(data_dir, "pcpt_{}_sum_wrf_{}_{}_undiff.nc")

    # test all locations for each future period
    gcms = ["GFDL-CM3", "NCAR-CCSM4"]
    periods = ["2020-2049", "2050-2079", "2080-2099"]
    results = []
    for gcm in gcms:
        
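        # get valid xy indices: non-NaN cells in the first slice (along the
        # leading dimension) of the 60m file for the first future period
        ds = xr.open_dataset(data_fp.format(gcm, "60m", "2020-2049"))
        arr = ds["pf"].values[0, :, :]
        ds.close()
        valid_idx = np.argwhere(~np.isnan(arr))
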
        
        for period in periods:
            # construct args for reading data in parallel
            args = [
                (data_fp.format(gcm, duration, period), valid_idx)
                for duration in durations
            ]

            print("Reading/testing data for {}, {}".format(gcm, period))
            # read from each duration in parallel
            # one worker per duration file
            with Pool(len(durations)) as p:
                out = p.map(test_data_file, args)

            results.append(np.all(out))
            print("{}, {} complete\n".format(gcm, period))

    final_result = np.all(results)
    # print results
    if final_result:
        print("\nTest result: PASS")
        print("No inconsistencies in estimates found.\n")
    else:
        print("\nTest result: FAIL\n")

    # return the overall result so callers can use it programmatically
    return final_result


In [2]:
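import datetime
import os
import time

import numpy as np
import xarray as xr
from multiprocessing import Pool

data_dir = os.getenv("UNDIFF_DIR")
if data_dir is None:
    raise RuntimeError("UNDIFF_DIR env var must be set before running this notebook")

tic = time.perf_counter()

_ = run_test(data_dir)

print("Elapsed time: {} m\n".format(round((time.perf_counter() - tic) / 60, 1)))

# timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
utc_time = datetime.datetime.now(datetime.timezone.utc)
print("Completion time of previous test: {}".format(utc_time.strftime("%Y-%m-%d %H:%M:%S")))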



Beginning test of consistent confidence bounds in final data.

Reading/testing data for GFDL-CM3, 2020-2049
GFDL-CM3, 2020-2049 complete

Reading/testing data for GFDL-CM3, 2050-2079
GFDL-CM3, 2050-2079 complete

Reading/testing data for GFDL-CM3, 2080-2099
GFDL-CM3, 2080-2099 complete

Reading/testing data for NCAR-CCSM4, 2020-2049
NCAR-CCSM4, 2020-2049 complete

Reading/testing data for NCAR-CCSM4, 2050-2079
NCAR-CCSM4, 2050-2079 complete

Reading/testing data for NCAR-CCSM4, 2080-2099
NCAR-CCSM4, 2080-2099 complete


Test result: PASS
No inconsistencies in estimates found.

Elapsed time: 7.9 m

Completion time of previous test: 2020-11-09 18:47:58
