# Example QARTOD Testing

#### Intro
QARTOD (Quality Assurance/Quality Control of Real-Time Oceanographic Data) is the effort by the broader oceanographic observing community to standardize the quality control of oceanographic data. Part of that standardization is the identification and recommendation of algorithms for testing the data returned by a sensor in order to evaluate data quality. Currently, OOI is implementing the gross range and climatology tests, which use either a three-standard-deviation threshold (gross range) or a monthly-varying range (climatology) determined using a two-cycle harmonic model. The available flags are: 1 (pass), 2 (not_evaluated), 3 (suspect_or_of_high_interest), 4 (fail), and 9 (missing_data).

The thresholds for the tests are calculated and saved in tables that are stored on GitHub for ingestion into OOINet. 
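
As a rough illustration of the three-standard-deviation idea behind the gross range thresholds, the sketch below derives a span from a handful of made-up values. The function name `suspect_span_from_history` and the sample values are hypothetical and are not part of the OOI tooling; the thresholds actually used by OOINet come from the GitHub tables.

```python
import statistics


def suspect_span_from_history(values, n_std=3.0):
    """Return (low, high) as the mean minus/plus n_std sample standard deviations."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return mean - n_std * std, mean + n_std * std


# Made-up observations, for illustration only
history = [10.0, 12.0, 11.0, 13.0, 9.0]
low, high = suspect_span_from_history(history)
print(f"suspect span: [{low:.2f}, {high:.2f}]")
```

Values outside such a span would be candidates for the "suspect" flag, while the wider fail span is set separately in the tables.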

#### Purpose
We are implementing QARTOD gross range and climatology tests on datasets that are either currently in production or are being tested for functionality on the Dev1 environment. This is an example notebook for (1) testing and verifying that the implemented QARTOD tests are performing as expected and (2) calculating some descriptive summary statistics of the returned QARTOD flags. Testing and verification involves running the QARTOD tests locally with the appropriate QARTOD values from the GitHub tables and comparing the results with what was returned with the dataset from OOINet. 

#### Objectives:

* Test Output: Run the tests locally, identify where the OOINet output differs from the locally run tests, and flag those data points to provide feedback to CI. This should be done for tests which are in production as well as those in the development environment. We are currently migrating 

#### Supporting tools
If you want to run this notebook as-is, you will need to clone my [ooinet repo](https://github.com/reedan88/OOINet) to your local machine and install it as a local dev repo (which adds it to your base path). You'll also need the [ooi_data_explorations repo](https://github.com/oceanobservatories/ooi-data-explorations). Lastly, you'll want to install the ```ioos_qc``` [Python package](https://github.com/ioos/ioos_qc). 


In [2]:
# Import libraries
import os
import re
import gc
import io
import ast
import pandas as pd
import numpy as np
import xarray as xr
import warnings
warnings.filterwarnings("ignore")

# Import dask
import dask
from dask.diagnostics import ProgressBar

Import the M2M tool

In [3]:
from ooinet import M2M
from ooinet.Instrument.common import process_file

Import the relevant ooi_data_explorations tools

In [4]:
from ooi_data_explorations.uncabled.process_metbk import metbk_datalogger
from ooi_data_explorations.combine_data import combine_datasets

---
## Request and load the data
Substitute ooinet-dev1-west.intra.oceanobservatories.org into the available API URLs.


In [5]:
for key in M2M.URLS:
    url = M2M.URLS.get(key)
    if "opendap" in url:
        M2M.URLS[key] = re.sub("opendap", "opendap-dev1-west.intra", url)
    else:
        M2M.URLS[key] = re.sub("ooinet","ooinet-dev1-west.intra", url)

Search the Dev1 server for available datasets

In [6]:
M2M.search_datasets(array="CP03ISPM")

Searching https://ooinet-dev1-west.intra.oceanobservatories.org/api/m2m/12576/sensor/inv/CP03ISPM


Unnamed: 0,array,node,instrument,refdes,url,deployments
0,CP03ISPM,WFP01,05-PARADK000,CP03ISPM-WFP01-05-PARADK000,https://ooinet-dev1-west.intra.oceanobservator...,"[1, 2, 3, 4, 5]"
1,CP03ISPM,WFP01,03-CTDPFK000,CP03ISPM-WFP01-03-CTDPFK000,https://ooinet-dev1-west.intra.oceanobservator...,"[1, 2, 3, 4, 5]"
2,CP03ISPM,WFP01,02-DOFSTK000,CP03ISPM-WFP01-02-DOFSTK000,https://ooinet-dev1-west.intra.oceanobservator...,"[1, 2, 3, 4, 5]"


Find the available datastreams for a given **refdes**

In [7]:
M2M.get_datastreams("CP03ISPM-WFP01-02-DOFSTK000")

Unnamed: 0,refdes,method,stream
0,CP03ISPM-WFP01-02-DOFSTK000,recovered_wfp,dofst_k_wfp_instrument_recovered
1,CP03ISPM-WFP01-02-DOFSTK000,recovered_wfp,dofst_k_wfp_metadata_recovered
2,CP03ISPM-WFP01-02-DOFSTK000,telemetered,dofst_k_wfp_instrument
3,CP03ISPM-WFP01-02-DOFSTK000,telemetered,dofst_k_wfp_metadata


In [8]:
# Setup parameters needed to request data
refdes = "CP03ISPM-WFP01-02-DOFSTK000"
method = "recovered_wfp"
stream = "dofst_k_wfp_instrument_recovered"

#### Development
We may also want to examine new QARTOD tests which are staged in the Dev-1 environment before they are moved to production. The development environment is at ooinet-dev1-west.intra.oceanobservatories.org. In order to access data on Dev-1, you need to be granted access and be connected to the CI-West VPN (vpn-west.oceanobservatories.org) at Oregon State.

The Dev-1 environment has no "goldcopy" equivalent THREDDS catalog. Instead, we have to make a normal data request and wait for the datasets to be assembled and made available for download.

In [11]:
# Define a generic preprocessing routine. Do NOT use any of the ooi_data_explorations
# "process_instrument" methods, so that we are comparing "apples-to-apples"
def preprocess(ds):
    ds = process_file(ds)
    return ds

In [12]:
# Request the data and get the url of the resulting THREDDS catalog
# (Dev-1 has no gold copy THREDDS catalog)
thredds_url = M2M.get_thredds_url(refdes, method, stream)

# Get the THREDDS catalog
thredds_catalog = M2M.get_thredds_catalog(thredds_url)

# Clean the THREDDS catalog
sensor_files, ancillary_files = M2M.clean_catalog(thredds_catalog, stream)

# Generate the urls to access and load the data (raw string avoids an invalid escape warning)
sensor_files = [re.sub(r"catalog\.html\?dataset=", M2M.URLS["dodsC"], file) for file in sensor_files]

# Load the data
with ProgressBar():
    data = xr.open_mfdataset(sensor_files, preprocess=preprocess, parallel=True)
    

Waiting for request to process
Waiting for request to process
Waiting for request to process
Waiting for request to process
Waiting for request to process
Waiting for request to process
[########################################] | 100% Completed | 3.14 ss


In [None]:
def swap_timestamps(ds):
    """
    Swaps the time dimension from the host (DCL) timestamp to the
    instrument's internal timestamp for the CTDBPs
    """
    # Nothing to swap if the dataset has no instrument timestamp
    if "internal_timestamp" not in ds.variables:
        return ds
    # Get the instrument timestamp and its attributes
    inst_time = ds.internal_timestamp.to_pandas()
    attrs = ds.internal_timestamp.attrs
    # Convert the time from integer seconds to datetime64
    inst_time = inst_time.apply(lambda x: np.datetime64(int(x), 's'))
    # Create a DataArray
    da = xr.DataArray(inst_time, attrs=attrs)
    ds['internal_timestamp'] = da
    # Make the instrument timestamp the dimension and keep the host time as a variable
    ds = ds.set_coords(["internal_timestamp"])
    ds = ds.swap_dims({"time":"internal_timestamp"})
    ds = ds.reset_coords("time")
    ds = ds.rename_vars({"time":"host_time"})
    ds["host_time"].attrs = {
        "long_name": "DCL Timestamp",
        "comment": ("The timestamp of the instrument data as recorded by the mooring data "
                    "concentration logger (DCL)")
    }
    ds = ds.rename({"internal_timestamp":"time"})
    return ds

#### Identify Test Parameters
Next, identify which parameters in the dataset have QARTOD applied to them. Sometimes the variable name in the dataset is different than the key that is used by OOINet to build the datasets. For those cases, we can check the attributes of the variable for the "ooinet_variable_name".

In [13]:
# Create a dictionary of key-value pairs of dataset variable name: OOINet variable name
test_parameters={}
for var in data.variables:
    if "qartod_results" in var:
        # Get the parameter name
        param = var.split("_qartod")[0]
        
        # Check if the parameter has an alternative ooinet_name
        if "ooinet_variable_name" in data[param].attrs:
            ooinet_name = data[param].attrs["ooinet_variable_name"]
        else:
            ooinet_name = param
        
        # Save the results in a dictionary
        test_parameters.update({
            param: ooinet_name
        })
# Print out the results
test_parameters

{'dofst_k_oxygen_l2': 'dofst_k_oxygen_l2'}

---
## Testing & Verification

To verify the results of the QARTOD tests being run by OOINet, we want to compare the QARTOD flags returned with the datasets against the results from running the tests locally using the same inputs. First, we have to parse out the separate test results from the ```qartod_executed``` variable. Then, we parse and load the appropriate GitHub tables. With the correct input tables, we can then run the different tests locally. Finally, we directly compare the locally-run results against what was returned with the dataset and identify any disagreements. 

#### Parse the QARTOD Executed
The ```qartod_executed``` variable for a given parameter contains the individual QARTOD test flags. For each datum, flags are listed in a string matching the order of the tests_executed attribute. Flags should be interpreted using the standard QARTOD mapping: \[1: pass, 2: not_evaluated, 3: suspect_or_of_high_interest, 4: fail, 9: missing_data\].

For verification, we first want to split out each test into its own separate variable, named using the following convention: {param}\_qartod\_{test_name}. For example, parsing out the gross range test results for the CTD parameter ```sea_water_practical_salinity``` from the qartod flags ```sea_water_practical_salinity_qartod_executed``` will return a variable ```sea_water_practical_salinity_qartod_gross_range_test``` with just the flags corresponding to the results of the gross range QARTOD test.
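
A toy version of this splitting, using plain Python lists rather than an xarray dataset, may make the convention easier to see. The helper `split_flags` and the flag values below are made up for illustration; only the idea of indexing each flag string by the position of the test in `tests_executed` comes from the description above.

```python
def split_flags(flag_strings, tests_executed):
    """Split per-datum flag strings into one list of flags per test."""
    test_names = [name.strip() for name in tests_executed.split(",")]
    return {
        name: [flags[i] for flags in flag_strings]
        for i, name in enumerate(test_names)
    }


# Made-up values: two tests executed, three data points
tests_executed = "gross_range_test, climatology_test"
flag_strings = ["11", "13", "41"]

per_test = split_flags(flag_strings, tests_executed)
print(per_test)
```

Note that the split flags are still single-character strings, which matters later when they are compared against integer flags computed locally.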

In [14]:
import io
import ast
import requests

def parse_qartod_executed(ds, parameters):
    """
    Parses the qartod tests for the given parameters into separate variables.
    
    Parameters
    ----------
    ds: xarray.Dataset
        The dataset downloaded from OOI with the QARTOD flags applied.
    parameters: list[str]
        The names of the parameters in the dataset for which to parse the QARTOD flags
        
    Returns
    -------
    ds: xarray.Dataset
        The dataset with the QARTOD tests for the given parameters split out
        into new separate data variables using the naming convention:
        {parameter}_qartod_{test_name}
    """
    # Force the parameters into a list if only a single string was passed
    if isinstance(parameters, str):
        parameters = [parameters]
    
    # Iterate through each parameter
    for param in parameters:
        # Generate the qartod executed name
        qartod_name = f"{param}_qartod_executed"
        
        if qartod_name not in ds.variables:
            continue
    
        # Get the test order before changing the variable type
        test_order = ds[qartod_name].attrs["tests_executed"].split(",")
    
        # Convert the flags to strings so they can be indexed by position
        ds[qartod_name] = ds[qartod_name].astype(str)
    
        # Iterate through the available tests and create separate variables with the results
        for test_index, test in enumerate(test_order):
            test_name = f"{param}_qartod_{test.strip()}"
            ds[test_name] = ds[qartod_name].str.get(test_index)

    return ds

In [15]:
# Put the test parameter names in the dataset into a list
parameters = [x for x in test_parameters.keys()]
parameters

['dofst_k_oxygen_l2']

In [16]:
# Parse all of the variables with QARTOD tests applied into separate tests
data = parse_qartod_executed(data, parameters)
data

[Output: `xarray.Dataset` display. Each variable is a dask array of shape (1182467,) held in a single chunk.]

#### Load & Parse the GitHub QARTOD Tables
We can grab the QARTOD tables with the test values straight from GitHub, which ensures we are using the same input and threshold values as OOINet. However, the QARTOD tables use the OOINet parameter name (the ```ooinet_variable_name``` attribute) instead of the dataset variable name. Thus, when loading the tables we need to make sure we are requesting the correct parameter name.

In [37]:
GITHUB_BASE_URL = "https://raw.githubusercontent.com/oceanobservatories/qc-lookup/master/qartod"

def load_gross_range_qartod_test_values(refdes, stream, ooinet_param):
    """
    Load the gross range QARTOD test values table from GitHub and filter
    it for the given reference designator, stream, and parameter
    """
    subsite, node, sensor = refdes.split("-", 2)
    sensor_type = sensor[3:8].lower()
    
    # GitHub url to the gross range table
    GROSS_RANGE_URL = f"{GITHUB_BASE_URL}/{sensor_type}/{sensor_type}_qartod_gross_range_test_values.csv"
    
    # Download the table, raising an error if the request failed
    download = requests.get(GROSS_RANGE_URL)
    download.raise_for_status()
    df = pd.read_csv(io.StringIO(download.content.decode('utf-8')))
    df["parameters"] = df["parameters"].apply(ast.literal_eval)
    df["qcConfig"] = df["qcConfig"].apply(ast.literal_eval)
        
    # Next, filter for the desired parameter
    mask = df["parameters"].apply(lambda x: x.get("inp") == ooinet_param)
    df = df[mask]
    
    # Now filter for the desired stream
    df = df[(df["subsite"] == subsite) & 
            (df["node"] == node) & 
            (df["sensor"] == sensor) &
            (df["stream"] == stream)]
    
    return df


def load_climatology_qartod_test_values(refdes, param):
    """
    Load the OOI climatology qartod test values table from GitHub
    
    Parameters
    ----------
    refdes: str
        The reference designator for the given sensor
    param: str
        The OOINet name of the parameter to load the test values for

    Note: the stream filter below uses the notebook-level variable ``stream``
    """
    
    subsite, node, sensor = refdes.split("-", 2)
    sensor_type = sensor[3:8].lower()
    
    # GitHub url to the climatology test tables
    CLIMATOLOGY_URL = f"{GITHUB_BASE_URL}/{sensor_type}/{sensor_type}_qartod_climatology_test_values.csv"

    # Download the climatology test values, raising an error if the request failed
    download = requests.get(CLIMATOLOGY_URL)
    download.raise_for_status()
    df = pd.read_csv(io.StringIO(download.content.decode('utf-8')))
    df["parameters"] = df["parameters"].apply(ast.literal_eval)
    # Next, filter for the desired parameter
    mask = df["parameters"].apply(lambda x: x.get("inp") == param)
    df = df[mask]

    # Now filter for the desired stream
    df = df[(df["subsite"] == subsite) & 
            (df["node"] == node) & 
            (df["sensor"] == sensor) &
            (df["stream"] == stream)]
    
    # Get the "zinp" as a check
    zinp = df["parameters"].values[0].get('zinp')
    
    # Get the name of the correct climatologyTable
    climatologyTable = df["climatologyTable"].values[0]

    # Construct the url to the climatologyTable
    CLIMATOLOGY_TABLE_URL = f"{GITHUB_BASE_URL}/{sensor_type}/{climatologyTable}"

    # Download the climatologyTable, raising an error if the request failed
    download = requests.get(CLIMATOLOGY_TABLE_URL)
    download.raise_for_status()
    df = pd.read_csv(io.StringIO(download.content.decode('utf-8')), index_col=0)
    df = df.applymap(ast.literal_eval)

    return df, zinp
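
To make the table parsing above more concrete, here is a minimal sketch that parses a single made-up table row with the standard library only. The column names mirror those used in the loaders above, but the row content and the threshold values are hypothetical and do not come from the real GitHub tables.

```python
import ast
import csv
import io

# Made-up table content, for illustration only
csv_text = '''subsite,node,sensor,stream,parameters,qcConfig
CP03ISPM,WFP01,02-DOFSTK000,dofst_k_wfp_instrument_recovered,"{'inp': 'dofst_k_oxygen_l2'}","{'qartod': {'gross_range_test': {'suspect_span': [50.0, 300.0], 'fail_span': [0.0, 500.0]}}}"
'''

# The sensor type used to build the table url comes from the reference designator
refdes = "CP03ISPM-WFP01-02-DOFSTK000"
subsite, node, sensor = refdes.split("-", 2)
sensor_type = sensor[3:8].lower()

# The dictionary-like columns are stored as strings and need to be evaluated
row = next(csv.DictReader(io.StringIO(csv_text)))
parameters = ast.literal_eval(row["parameters"])
qc_config = ast.literal_eval(row["qcConfig"])
spans = qc_config["qartod"]["gross_range_test"]

print(sensor_type, parameters["inp"], spans["suspect_span"], spans["fail_span"])
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice than `eval` for table cells like these.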

## Run Tests Locally
Next, we run the tests locally to get results that can be compared with the output returned by OOINet. This is done using the ```ioos_qc``` QARTOD package in conjunction with the ```qartod_test_values``` tables.

#### Gross Range Test

In [18]:
# Import the ioos_qc QARTOD package tests
from ioos_qc.qartod import gross_range_test, climatology_test, ClimatologyConfig

In [19]:
# Run through all of the parameters which had the QARTOD tests applied by OOINet and
# run the tests locally, saving the results in a dictionary
gross_range_results = {}
for param in test_parameters:
    # Get the ooinet name
    ooinet_name = test_parameters.get(param)
    
    # Load the gross_range_qartod_test_values from gitHub
    gross_range_qartod_test_values = load_gross_range_qartod_test_values(refdes, stream, ooinet_name)
    
    # Get the qcConfig object, the fail_span, and the suspect_span
    qcConfig = gross_range_qartod_test_values["qcConfig"].values[0]
    fail_span = qcConfig.get("qartod").get("gross_range_test").get("fail_span")
    suspect_span = qcConfig.get("qartod").get("gross_range_test").get("suspect_span")
    
    # Run the gross_range_test
    param_results = gross_range_test(
        inp = data[param].fillna(999999).values,
        fail_span = fail_span,
        suspect_span = suspect_span)
    
    # Save the results
    gross_range_results.update(
        {param: param_results}
    )

# Show the results
gross_range_results

{'dofst_k_oxygen_l2': masked_array(data=[1, 1, 1, ..., 1, 1, 1],
              mask=False,
        fill_value=999999,
             dtype=uint8)}
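
For intuition about what the flags mean, the sketch below is a simplified restatement of gross range logic in plain Python. It is an assumption-laden illustration and not the `ioos_qc` implementation: the function name `gross_range_flags` and the spans are hypothetical, and it only covers the basic fail, suspect, pass, and missing cases.

```python
import math


def gross_range_flags(values, fail_span, suspect_span):
    """Assign a QARTOD-style flag to each value based on two nested spans."""
    flags = []
    for value in values:
        if value is None or math.isnan(value):
            flags.append(9)  # missing_data
        elif value < fail_span[0] or value > fail_span[1]:
            flags.append(4)  # fail
        elif value < suspect_span[0] or value > suspect_span[1]:
            flags.append(3)  # suspect_or_of_high_interest
        else:
            flags.append(1)  # pass
    return flags


# Made-up values and spans, for illustration only
values = [150.0, 20.0, 600.0, float("nan")]
flags = gross_range_flags(values, fail_span=(0.0, 500.0), suspect_span=(50.0, 300.0))
print(flags)
```

In the notebook cell above, NaNs are replaced with 999999 before the test is run, so those points are evaluated against the spans like any other value.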

#### Climatology Test

First, we need to check that the dataset has a "depth" parameter which matches the "zinp" parameter in the climatology table. If it doesn't, then we need to add a dummy parameter that has the same size and shape as the data and is filled with dummy values of 1.

In [40]:
'depth' in data.variables

True

In [42]:
# Run through all of the parameters which had the QARTOD tests applied by OOINet and
# run the tests locally, saving the results in a dictionary
climatology_results = {}

for param in test_parameters:

    # Get the ooinet name
    ooinet_name = test_parameters.get(param)
    
    # Load the climatology_qartod_test_values from GitHub
    try:
        climatology_qartod_test_values, zinp = load_climatology_qartod_test_values(refdes, ooinet_name)
    except Exception:
        climatology_results.update({
            param: "Not implemented."
        })
        continue
    
    if climatology_qartod_test_values is None:
        climatology_results.update({
            param: "Not implemented."
        })
        continue

    # Check that the 'zinp' is in the dataset. If not, add a dummy variable and use it instead
    if zinp not in data.variables:
        zinp = "zinp"
        data[zinp] = (["time"], np.ones(data["time"].shape))

    # Load the gross_range_qartod_test_values from gitHub
    gross_range_qartod_test_values = load_gross_range_qartod_test_values(refdes, stream, ooinet_name)
    
    # Get the qcConfig object, the fail_span, and the suspect_span
    qcConfig = gross_range_qartod_test_values["qcConfig"].values[0]
    fail_span = qcConfig.get("qartod").get("gross_range_test").get("fail_span")
    
    # Initialize a climatology config object
    c = ClimatologyConfig()
    
    # Iterate through the pressure ranges
    for p_range in climatology_qartod_test_values.index:
        # Get the pressure range
        pmin, pmax = ast.literal_eval(p_range)

        # Convert the pressure range values into a dictionary
        p_values = climatology_qartod_test_values.loc[p_range].to_dict()

        # Check the pressure values. If [0, 0], then set the range [0, 5000]
        if pmax == 0:
            pmax = 5000

        for tspan in p_values.keys():
            # Get the time span
            tstart, tend = ast.literal_eval(tspan)

            # Get the values associated with the time span
            vmin, vmax = p_values.get(tspan)

            # Add the test to the climatology config object
            c.add(tspan=[tstart, tend],
                  vspan=[vmin, vmax],
                  fspan=[fail_span[0], fail_span[1]],
                  zspan=[pmin, pmax],
                  period="month")

    # Run the climatology test
    param_results = climatology_test(c,
                                     inp=data[param].fillna(999999),
                                     tinp=data["time"],
                                     zinp=data[zinp])
    
    # Append the results
    climatology_results.update({
        param: param_results
    })

In [43]:
climatology_results

{'dofst_k_oxygen_l2': masked_array(data=[1, 1, 1, ..., 1, 1, 1],
              mask=False,
        fill_value=999999,
             dtype=uint8)}
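
The idea of a monthly-varying range can be sketched without `ioos_qc` as a simple lookup by month. This is a simplified illustration under stated assumptions, not the library's algorithm: the function `climatology_flag` and the monthly spans are hypothetical, and depth ranges are ignored entirely.

```python
from datetime import datetime


def climatology_flag(value, when, monthly_spans, fail_span):
    """Flag a value against the span for its month, with a wider fail span."""
    vmin, vmax = monthly_spans[when.month]
    if value < fail_span[0] or value > fail_span[1]:
        return 4  # fail
    if value < vmin or value > vmax:
        return 3  # suspect_or_of_high_interest
    return 1  # pass


# Made-up monthly ranges, with a wider range in July
monthly_spans = {month: (100.0, 250.0) for month in range(1, 13)}
monthly_spans[7] = (120.0, 300.0)
fail_span = (0.0, 500.0)

july_flag = climatology_flag(280.0, datetime(2020, 7, 15), monthly_spans, fail_span)
january_flag = climatology_flag(280.0, datetime(2020, 1, 15), monthly_spans, fail_span)
print(july_flag, january_flag)
```

The same value passes in one month and is suspect in another, which is the behavior that distinguishes the climatology test from the gross range test.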

### Compare the results
Finally, we want to compare the outputs from the local test with what was returned in the dataset, looking for where they disagree. This will tell us if they are running as expected.

In [44]:
def run_comparison(ds, param, test, test_results):
    """
    Runs a comparison between the qartod results returned as part of the dataset
    and results calculated locally.
    """
    # Get the local test results and convert to string type for comparison
    local_results = test_results[param].astype(str)
    
    # Run comparison
    not_equal = np.where(ds[f"{param}_qartod_{test}_test"] != local_results)[0]
    
    if len(not_equal) == 0:
        return None
    else:
        return not_equal
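
The comparison above relies on both sets of flags having the same type, which is why the local results are converted to strings first. A small stand-alone sketch with made-up flags shows the same idea; the helper name `find_disagreements` is hypothetical.

```python
def find_disagreements(ooinet_flags, local_flags):
    """Return the indices where the dataset flags and local flags differ."""
    return [
        index
        for index, (remote, local) in enumerate(zip(ooinet_flags, local_flags))
        if remote != str(local)  # dataset flags are strings, local flags are integers
    ]


# Made-up flags: the second data point disagrees
ooinet_flags = ["1", "1", "4"]
local_flags = [1, 3, 4]

mismatches = find_disagreements(ooinet_flags, local_flags)
print(mismatches)
```

Without the `str` conversion every point would be reported as a disagreement, since `"1" != 1` in Python.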

### Run the comparisons

In [50]:
# Gross Range
gross_range_comparison = {}
for param in gross_range_results.keys():
    check = run_comparison(data, param, "gross_range", gross_range_results)
    gross_range_comparison.update({
        param: check
    })

gross_range_comparison

{'dofst_k_oxygen_l2': None}

In [51]:
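# Climatology
climatology_comparison = {}
for param in climatology_results.keys():
    # Skip parameters without a climatology test
    if isinstance(climatology_results[param], str):
        continue
    check = run_comparison(data, param, "climatology", climatology_results)
    climatology_comparison.update({
        param: check
    })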

climatology_comparison

{'dofst_k_oxygen_l2': None}
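
For the second purpose of this notebook, descriptive summary statistics of the returned flags, one possible approach is to count how often each flag occurs. The sketch below is a minimal example with made-up flags; the helper `summarize_flags` is hypothetical, and the flag names follow the standard QARTOD mapping listed earlier.

```python
from collections import Counter

FLAG_NAMES = {
    1: "pass",
    2: "not_evaluated",
    3: "suspect_or_of_high_interest",
    4: "fail",
    9: "missing_data",
}


def summarize_flags(flags):
    """Return the count and percentage of each flag present in the input."""
    counts = Counter(int(flag) for flag in flags)
    total = sum(counts.values())
    return {
        FLAG_NAMES[flag]: {"count": n, "percent": 100.0 * n / total}
        for flag, n in sorted(counts.items())
    }


# Made-up flags, given as strings like the parsed dataset variables
summary = summarize_flags(["1", "1", "1", "3"])
print(summary)
```

The same function could be applied to each parsed `{param}_qartod_{test_name}` variable after converting its values to a list.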