# Run QARTOD Test on Locally Saved Data

In this notebook we will load locally saved data from the external data folder, extract QARTOD test parameters from spreadsheets on the OOI GitHub, run the QARTOD climatology and gross range tests on the imported data, and save the test results to the interim data folder.

More information about the QARTOD tests is available on the [Integrated Ocean Observing System website](https://ioos.noaa.gov/project/qartod/), and the ioos_qc module is described in the [Python module documentation](https://ioos.github.io/ioos_qc/).

### Import modules for data manipulation

In [1]:
# Import libraries
import os
import requests
import re
import gc
import io
import ast
import pandas as pd
import numpy as np
import xarray as xr
import warnings
warnings.filterwarnings("ignore")
import sys
import glob

# Import dask tools and ProgressBar
import dask
from dask.diagnostics import ProgressBar

# Import qartod_testing project functions
from qartod_testing.qc_flag_statistics import \
    timeseries_dict_to_xarray, build_data_path, get_test_parameters, \
    get_deployment_ds
from qartod_testing.local_qc_test import run_qartod_gross_range, \
    run_qartod_climatology

### Load locally saved data

In [2]:
# Set reference designator, data stream, and method 
refdes = "GA03FLMB-RIS01-04-PHSENF000"        
method = "recovered_inst"
stream = "phsen_abcdef_instrument"

# Get site, node, and sensor info from deconstructed reference designator
[site, node, sensor] = refdes.split('-', 2)

In [3]:
# Build path to folder where data was saved
folder_path = os.path.join(os.path.abspath('../data/external'), method,
                           stream, refdes)

# Retrieve a list of netCDF files in this directory
file_paths = glob.glob(folder_path+'/*.nc')
file_paths.sort()

### Identify Test Parameters

Next, identify which parameters in the dataset have QARTOD applied to them. Sometimes the variable name in the dataset is different from the key that OOINet uses to build the datasets. In that case, we can check the attributes of the variable for the "alternate_parameter_name".
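
The attribute check described above can be sketched roughly as follows. This is only an illustration and not the project's `get_test_parameters` function: the helper `find_test_parameters`, the `_qartod_results` suffix used to spot tested variables, and the toy variable names are all assumptions. The helper takes a plain mapping of variable names to attribute dictionaries so that it runs without any data files.

```python
QARTOD_SUFFIX = "_qartod_results"  # assumed naming convention for QARTOD flag variables


def find_test_parameters(variable_attrs):
    """Map each QARTOD-tested variable to the parameter name used by OOINet.

    variable_attrs maps variable names to their attribute dictionaries.
    For an xarray Dataset `ds` it could be built with
    {name: var.attrs for name, var in ds.variables.items()}.
    """
    parameters = {}
    for name in variable_attrs:
        if not name.endswith(QARTOD_SUFFIX):
            continue
        # Strip the suffix to recover the name of the tested variable
        tested = name[: -len(QARTOD_SUFFIX)]
        attrs = variable_attrs.get(tested, {})
        # Fall back to the dataset name when no alternate name is recorded
        parameters[tested] = attrs.get("alternate_parameter_name", tested)
    return parameters


# Toy attributes standing in for a real dataset (all names are made up)
example_attrs = {
    "ph_seawater": {"alternate_parameter_name": "example_ph_parameter"},
    "ph_seawater_qartod_results": {},
    "temperature": {},
    "temperature_qartod_results": {},
    "pressure": {},
}
print(find_test_parameters(example_attrs))
```

Variables without a matching QARTOD flag variable, such as `pressure` here, are left out of the result.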

### Collect QARTOD test lookup tables from GitHub
We can grab the QARTOD tables with the test values straight from GitHub, which ensures we are using the same input and threshold values as OOINet. However, the QARTOD tables utilize the ```ooinet_parameter_name``` instead of the dataset variable name. Thus, when loading the tables we need to make sure we are requesting the correct parameter name.
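
As a rough sketch of requesting the correct parameter name, the example below looks up one row of a lookup table by its OOINet parameter name. The table text, the column names `parameters` and `qcConfig`, the threshold values, and the helper `lookup_gross_range` are assumptions made for illustration; a real table would be downloaded from GitHub rather than defined inline.

```python
import ast
import csv
import io

# Made-up table text; the column names and values are assumptions for illustration
TABLE_TEXT = '''parameters,qcConfig
"{'inp': 'example_ph_parameter'}","{'qartod': {'gross_range_test': {'suspect_span': [7.5, 8.5], 'fail_span': [6.9, 9.0]}}}"
"{'inp': 'example_temperature'}","{'qartod': {'gross_range_test': {'suspect_span': [2.0, 25.0], 'fail_span': [-5.0, 35.0]}}}"
'''


def lookup_gross_range(table_text, ooinet_parameter_name):
    """Return the gross range configuration for one OOINet parameter name."""
    for row in csv.DictReader(io.StringIO(table_text)):
        # Each cell holds a Python-literal dictionary stored as a string
        parameters = ast.literal_eval(row["parameters"])
        if parameters["inp"] == ooinet_parameter_name:
            config = ast.literal_eval(row["qcConfig"])
            return config["qartod"]["gross_range_test"]
    raise KeyError(f"No table entry for {ooinet_parameter_name!r}")


print(lookup_gross_range(TABLE_TEXT, "example_ph_parameter"))
```

Raising a `KeyError` when no row matches makes a wrong parameter name fail loudly instead of silently skipping the test.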

Note to Self: This section should probably be deleted altogether since importing lookup table values is done within the QARTOD test functions.

### Run QARTOD tests locally
Next, we run the QARTOD tests locally so that the results can be compared with the QARTOD test output from OOINet. This is done using the ```ioos_qc``` QARTOD package in conjunction with the ```qartod_test_values``` tables.
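
For orientation, the idea behind a gross range test can be sketched in plain Python. This is a simplified illustration and not the ```ioos_qc``` implementation: the function name `gross_range_flags`, the inclusive bounds, and the example spans are assumptions. The flag values follow the QARTOD convention of 1 for good, 3 for suspect, 4 for fail, and 9 for missing.

```python
import math

# QARTOD flag values
GOOD, SUSPECT, FAIL, MISSING = 1, 3, 4, 9


def gross_range_flags(values, fail_span, suspect_span):
    """Flag each value against a fail span and a narrower suspect span."""
    flags = []
    for value in values:
        if value is None or math.isnan(value):
            flags.append(MISSING)
        elif not fail_span[0] <= value <= fail_span[1]:
            # Outside the widest acceptable range
            flags.append(FAIL)
        elif not suspect_span[0] <= value <= suspect_span[1]:
            # Plausible, but outside the expected range
            flags.append(SUSPECT)
        else:
            flags.append(GOOD)
    return flags


# Made-up values and spans for illustration
print(gross_range_flags([8.1, 7.2, 12.0, float("nan")], (6.9, 9.0), (7.5, 8.5)))
# → [1, 3, 4, 9]
```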

In [4]:
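def save_local_qc_tests(file_paths, test_name):
    """
    Run a local QARTOD test on each deployment and save the results.

    Parameters
    ----------
    file_paths : list of str
        Paths to the netCDF files saved for one reference designator.
    test_name : str
        Name of the test to run, either "gross_range" or "climatology".

    Relies on the notebook-level variables refdes, method, and stream.
    """
    # Look up the test function by name instead of building a call with eval
    qartod_tests = {"gross_range": run_qartod_gross_range,
                    "climatology": run_qartod_climatology}
    run_test = qartod_tests[test_name]

    # Create copy of list of file paths for loading multi-file, single-deployment datasets
    paths_copy = file_paths.copy()

    while len(paths_copy) > 0:
        # Load data from a single deployment
        deploy_ds, deployment, paths_copy = get_deployment_ds(paths_copy)

        # Create a dictionary of key-value pairs of dataset variable name:alternate parameter name
        test_parameters = get_test_parameters(deploy_ds)

        # Skip deployments without test parameters so that no undefined or
        # stale results are written to disk
        if len(test_parameters) == 0:
            print(f'No QARTOD test parameters found for deployment {deployment}; skipping.')
            continue

        # Run local QARTOD test
        print(f'Running local QARTOD {test_name} test for deployment {deployment}.')
        qc_test_results = run_test(refdes, stream, test_parameters, deploy_ds)
        qc_test_results = timeseries_dict_to_xarray(qc_test_results, deploy_ds)
        # Add variable containing deployment number used in sorting and merging overlapping deployments
        qc_test_results['deployment'] = deploy_ds['deployment']

        # Build file name and directory for local QARTOD test results
        folder_path = os.path.join(os.path.abspath('../data/interim'), method, stream, refdes)
        os.makedirs(folder_path, exist_ok=True)
        test_results_path = os.path.join(folder_path, f"{test_name}_test-deployment00{deployment}.nc")

        # Save local test results
        qc_test_results.to_netcdf(test_results_path, mode='w')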
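
The loop above works through the file list one deployment at a time. As a rough picture of that grouping, the sketch below collects file paths by deployment number. It is not the project's `get_deployment_ds` function: the helper `group_by_deployment` and the assumed file-name pattern (the word "deployment" followed by a zero-padded number) are hypothetical.

```python
import os
import re
from collections import defaultdict

# Assumed file-name pattern: "deployment" followed by a zero-padded number
DEPLOYMENT_PATTERN = re.compile(r"deployment(\d+)")


def group_by_deployment(file_paths):
    """Group file paths by the deployment number found in each file name."""
    groups = defaultdict(list)
    for path in sorted(file_paths):
        # Match against the file name only, not the directories above it
        match = DEPLOYMENT_PATTERN.search(os.path.basename(path))
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        groups[int(match.group(1))].append(path)
    return dict(groups)


# Made-up file names for illustration
example_paths = ["deployment0002_a.nc", "deployment0001_b.nc", "deployment0001_a.nc"]
print(group_by_deployment(example_paths))
```

Each value in the returned dictionary is the sorted list of files that would be loaded together as one deployment.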

#### Gross Range Test

In [5]:
# Run local QARTOD gross range tests
save_local_qc_tests(file_paths, "gross_range")

Running local QARTOD gross_range test for deployment 01.
Running local QARTOD gross_range test for deployment 02.
Running local QARTOD gross_range test for deployment 03.


#### Climatology Test

In [6]:
# Run local QARTOD climatology tests
save_local_qc_tests(file_paths, "climatology")

Running local QARTOD climatology test for deployment 01.


IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices