# QARTOD - Single Test

This notebook shows the simplest use case for the IOOS QARTOD package - a single test performed on a timeseries loaded into a Pandas DataFrame. It shows how to define the test configuration and how the output is structured. At the end, there is an example of how to use the flags in data visualization. 

# Setup

In [None]:
# Other imports
import pandas as pd
import numpy as np
from datetime import datetime

from bokeh.layouts import gridplot
from bokeh.plotting import figure, show, output_file, output_notebook
output_notebook()

In [None]:
# Install QC library
!pip install git+https://github.com/ioos/ioos_qc.git
from ioos_qc.config import QcConfig
from ioos_qc import qartod

# Load data

Load data from a local .csv file into a Pandas DataFrame.

In [None]:
# Water level data
# For a fixed station in Kotzebue, AK (https://www.google.com/maps?q=66.895035,-162.566752)
filename = 'water_level_example.csv'
variable_name='sea_surface_height_above_sea_level'

# Load data
data = pd.read_csv(filename, parse_dates=['time'])
data.head()

# Call test method directly

You can call individual QARTOD tests directly, manually passing in data and parameters.

In [None]:
qc_results = qartod.spike_test(
    inp=data[variable_name],
    suspect_threshold=0.8,
    fail_threshold=3
)
print(qc_results)
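
To make the two thresholds concrete, here is a simplified pure-Python sketch of a three-point spike check. It is an illustration only, not the `ioos_qc` implementation: the helper name `simple_spike_flags` is hypothetical, and it assumes each interior point is compared against the average of its two neighbors, with the endpoints left unevaluated.

```python
# Simplified illustration of a three-point spike check.
# NOT the ioos_qc implementation; simple_spike_flags is a hypothetical helper.
GOOD, UNKNOWN, SUSPECT, FAIL = 1, 2, 3, 4  # QARTOD flag values


def simple_spike_flags(values, suspect_threshold, fail_threshold):
    """Flag each point by its distance from the mean of its two neighbors."""
    # Endpoints have no neighbor on one side, so they stay UNKNOWN.
    flags = [UNKNOWN] * len(values)
    for i in range(1, len(values) - 1):
        reference = (values[i - 1] + values[i + 1]) / 2
        diff = abs(values[i] - reference)
        if diff > fail_threshold:
            flags[i] = FAIL
        elif diff > suspect_threshold:
            flags[i] = SUSPECT
        else:
            flags[i] = GOOD
    return flags


example = [1.0, 1.1, 5.0, 1.2, 1.0, 2.2, 1.1]
print(simple_spike_flags(example, suspect_threshold=0.8, fail_threshold=3))
```

In this sketch the points next to a large spike can also exceed the suspect threshold, because the spike pulls their neighbor average away from them.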

# QC configuration and Running

While you can call qartod methods directly, we recommend using a `QcConfig` object instead. This object encapsulates the test method and parameters into a single dict or JSON object. This makes your configuration more understandable and portable.
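
As a rough illustration of the portability point, the sketch below round-trips a configuration dictionary through JSON using only the standard library. The dictionary mirrors the spike test parameters used in this notebook; the sketch does not call `QcConfig` and only shows that the plain-dict form survives serialization unchanged.

```python
import json

# Configuration for a single test, expressed as a plain nested dict.
qc_config = {
    "qartod": {
        "spike_test": {
            "suspect_threshold": 0.8,
            "fail_threshold": 3,
        }
    }
}

# Serialize to a JSON string (e.g. to store alongside the data or share it).
config_json = json.dumps(qc_config, indent=2)
print(config_json)

# Parse it back into a dict that can be used like the original.
restored = json.loads(config_json)
```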

The `QcConfig` object is a special configuration object that determines which tests are run and defines the configuration for each test. The object's `run()` function runs the appropriate tests and returns a resulting dictionary of flag values.

Descriptions of each test and its inputs can be found in the [ioos_qc.qartod module documentation](https://ioos.github.io/ioos_qc/api/ioos_qc.html#module-ioos_qc.qartod).

[QartodFlags](https://ioos.github.io/ioos_qc/api/ioos_qc.html#ioos_qc.qartod.QartodFlags) defines the flag meanings.
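
For quick reference, the sketch below mirrors the QARTOD flag values in a plain dictionary and translates an array of flags into readable labels. The names `FLAG_MEANINGS` and `describe_flags` are hypothetical helpers for this notebook, not part of `ioos_qc`; in real code the `QartodFlags` class linked above is the authoritative source.

```python
# Hand-written mirror of the QARTOD flag values, for illustration only.
# FLAG_MEANINGS and describe_flags are hypothetical names, not ioos_qc API.
FLAG_MEANINGS = {
    1: "GOOD",
    2: "UNKNOWN",
    3: "SUSPECT",
    4: "FAIL",
    9: "MISSING",
}


def describe_flags(flags):
    """Translate numeric flags into labels; unknown codes are marked as such."""
    return [FLAG_MEANINGS.get(int(flag), "UNRECOGNIZED") for flag in flags]


print(describe_flags([2, 1, 1, 3, 4, 2]))
```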


In [None]:
# The configuration object can be initialized using a dictionary or a YAML file. Here is one example:
# QC configuration dictionary for a single test: the spike test
qc_config = {
    'qartod': {
      "spike_test": {
        "suspect_threshold": 0.8,
        "fail_threshold": 3
      }
    }
}

# Instantiate the configuration object
qc = QcConfig(qc_config)
print(qc)


In [None]:
# run the test
qc_results = qc.run(
    inp=data[variable_name],
    tinp=data['time']  # can also use 'timestamp'
)
print(qc_results)

In [None]:
# These results can be visualized using Bokeh

# Set-up
title = "Water Level [MHHW] [m] : Kotzebue, AK"
time = data['time']
qc_test = qc_results['qartod']['spike_test']

# Create the figure object
p1 = figure(x_axis_type="datetime", title='Spike Test : ' + title)
p1.grid.grid_line_alpha=0.3
p1.xaxis.axis_label = 'Time'
p1.yaxis.axis_label = 'Spike Test Result'
p1.line(time, qc_test, color='blue')

# Display it
show(gridplot([[p1]], plot_width=800, plot_height=400))

# Alternative Configuration Method
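
The contents of `spike_test.yaml` are not shown in this notebook. As a sketch, and assuming the file simply mirrors the configuration dictionary used earlier, it could be created as follows. The helper `write_spike_config` is hypothetical, and the example writes to a temporary directory so that it does not overwrite an existing file.

```python
import tempfile
from pathlib import Path

# Assumed YAML equivalent of the spike test configuration dictionary.
SPIKE_TEST_YAML = """\
qartod:
  spike_test:
    suspect_threshold: 0.8
    fail_threshold: 3
"""


def write_spike_config(directory):
    """Write the assumed YAML config into `directory` and return its path."""
    path = Path(directory) / "spike_test.yaml"
    path.write_text(SPIKE_TEST_YAML)
    return path


# Write to a throwaway directory and show what was written.
with tempfile.TemporaryDirectory() as tmp:
    written = write_spike_config(tmp)
    print(written.read_text())
```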



In [None]:
# Here is the same example using a YAML file instead of a dictionary.
# The file holds the QC configuration for a single test: the spike test.
# Instantiate the configuration object from the file path
qc = QcConfig('./spike_test.yaml')

# run the test
qc_results = qc.run(
    inp=data[variable_name],
    tinp=data['time'],  # same time column that was parsed when loading the data
)
qc_results

In [None]:
# These results can be visualized using Bokeh

# Set-up
title = "Water Level [MHHW] [m] : Kotzebue, AK"
time = data['time']
qc_test = qc_results['qartod']['spike_test']

# Create the figure object
p1 = figure(x_axis_type="datetime", title='Spike Test : ' + title)
p1.grid.grid_line_alpha=0.3
p1.xaxis.axis_label = 'Time'
p1.yaxis.axis_label = 'Spike Test Result'
p1.line(time, qc_test, color='blue')

# Display it
show(gridplot([[p1]], plot_width=800, plot_height=400))

# Using the Flags

The array of flags can then be used to filter data or color plots.
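
Before plotting, it can help to see filtering on a small scale. The sketch below uses made-up observations and flags (not the Kotzebue data) to count how many points received each flag and to drop the failed ones. With a DataFrame, the same idea is usually expressed as a boolean mask on the flag array.

```python
from collections import Counter

# Made-up observations and matching QARTOD flags, for illustration only.
obs = [1.0, 1.1, 5.0, 1.2, 1.0]
flags = [2, 3, 4, 3, 1]

# Summarize how many points received each flag value.
flag_counts = Counter(flags)
print(flag_counts)

# Keep every observation that did not fail (flag 4 means FAIL).
kept = [value for value, flag in zip(obs, flags) if flag != 4]
print(kept)
```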

In [None]:
def plot_results(data, var_name, results, title, test_name):
    """ Plot timeseries of original data colored by quality flag
    
    Args:
        data: pd.DataFrame of original data including a time variable
        var_name: string name of the variable to plot
        results: Ordered Dictionary of qartod test results
        title: string to add to plot title
        test_name: name of the test to determine which flags to use
    """
    # Set-up
    time = data['time']
    obs = data[var_name]
    qc_test = results['qartod'][test_name]

    # Create a separate timeseries of each flag value
    qc_pass = np.ma.masked_where(qc_test != 1, obs)
    qc_suspect = np.ma.masked_where(qc_test != 3, obs)
    qc_fail = np.ma.masked_where(qc_test != 4, obs)
    qc_notrun = np.ma.masked_where(qc_test != 2, obs)

    # start the figure
    p1 = figure(x_axis_type="datetime", title=test_name + ' : ' + title)
    p1.grid.grid_line_alpha=0.3
    p1.xaxis.axis_label = 'Time'
    p1.yaxis.axis_label = 'Observation Value'

    # plot the data, and the data colored by flag
    # (legend_label replaces the deprecated `legend` keyword in newer Bokeh releases)
    p1.line(time, obs, legend_label='obs', color='#A6CEE3')
    p1.circle(time, qc_notrun, size=2, legend_label='qc not run', color='gray', alpha=0.2)
    p1.circle(time, qc_pass, size=4, legend_label='qc pass', color='green', alpha=0.5)
    p1.circle(time, qc_suspect, size=4, legend_label='qc suspect', color='orange', alpha=0.7)
    p1.circle(time, qc_fail, size=6, legend_label='qc fail', color='red', alpha=1.0)

    # show the plot
    show(gridplot([[p1]], plot_width=800, plot_height=400))


In [None]:
plot_results(data, variable_name, qc_results, title, 'spike_test')

In [None]:
# plot flag values again for comparison
show(gridplot([[p1]], plot_width=800, plot_height=400))