# [WIP] Anomaly Detection concept with a CTD profile
Comparing traditional QC with hard thresholds versus Anomaly Detection

## Objective:

Illustrate some QC procedures using CoTeDe.

An example of how to QC a CTD profile:

- Introduce the dataset that will be used

In [None]:
from bokeh.io import output_notebook, show
from bokeh.layouts import column, row
from bokeh.plotting import figure
import numpy as np
from scipy import stats

import cotede
from cotede.qc import ProfileQC
from cotede import qctests, datasets

In [None]:
output_notebook()

## Data
CoTeDe comes with some data to illustrate how it works and provide some examples.
In this tutorial we will use a PIRATA hydrographic CTD profile, i.e. actual measurements from the Atlantic Ocean.

Let's start by loading a sample dataset.

In [None]:
data = cotede.datasets.load_ctd()

print("There is a total of {} observations.\n".format(len(data["TEMP"])))
print("The variables are: ", data.keys())

This CTD was equipped with backup sensors to provide more robustness. Measurements from the secondary sensors are identified by a 2 at the end of the variable name. Let's focus here on the primary sensors.
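
As a small illustration of that naming convention, the sketch below separates primary from secondary variables. The list of names is only illustrative (an assumption); in the notebook the actual names come from `data.keys()`.

```python
# Illustrative variable names only (assumption); the real ones come from data.keys().
names = ["PRES", "TEMP", "TEMP2", "PSAL", "PSAL2"]

# Secondary sensors are the ones whose name ends with "2".
secondary = [n for n in names if n.endswith("2")]
primary = [n for n in names if not n.endswith("2")]

print("Primary:", primary)
print("Secondary:", secondary)
```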

Let's plot temperature and salinity.

In [None]:
p1 = figure(plot_width=420, plot_height=600)
p1.circle(data['TEMP'], -data['PRES'], size=8, line_color="seagreen", fill_color="mediumseagreen", fill_alpha=0.3)
p1.xaxis.axis_label = "Temperature [C]"
p1.yaxis.axis_label = "Depth [m]"

p2 = figure(plot_width=420, plot_height=600)
p2.y_range = p1.y_range
p2.circle(data['PSAL'], -data['PRES'], size=8, line_color="seagreen", fill_color="mediumseagreen", fill_alpha=0.3)
p2.xaxis.axis_label = "Salinity"
p2.yaxis.axis_label = "Depth [m]"

p = row(p1, p2)
show(p)

Considering the unusual magnitudes and variability near the bottom, there are clearly bad measurements on this profile.
Let's start with the most basic QC test and apply a global range to restrict to feasible values.

## Global Range: Check for Feasible Values
 Let's use the thresholds recommended by the GTSPP:
 - Temperature between -2 and 40 $^\circ$C
 - Salinity between 0 and 41

In [None]:
idx_valid = (data['TEMP'] >= -2) & (data['TEMP'] <= 40)

p1 = figure(plot_width=420, plot_height=600, title="Global Range Check (-2 <= T <= 40)")
p1.circle(data['TEMP'][idx_valid], -data['PRES'][idx_valid], size=8, line_color="seagreen", fill_color="mediumseagreen", fill_alpha=0.3)
p1.triangle(data['TEMP'][~idx_valid], -data['PRES'][~idx_valid], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p1.xaxis.axis_label = "Temperature [C]"
p1.yaxis.axis_label = "Depth [m]"


idx_valid = (data['PSAL'] >= 0) & (data['PSAL'] <= 41)

p2 = figure(plot_width=420, plot_height=600, title="Global Range Check (0 <= S <= 41)")
p2.y_range = p1.y_range
p2.circle(data['PSAL'][idx_valid], -data['PRES'][idx_valid], size=8, line_color="seagreen", fill_color="mediumseagreen", fill_alpha=0.3)
p2.triangle(data['PSAL'][~idx_valid], -data['PRES'][~idx_valid], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p2.xaxis.axis_label = "Practical Salinity"
p2.yaxis.axis_label = "Depth [m]"

p = row(p1, p2)
show(p)

Great, we already identified a fair number of bad measurements.
The global range test is simple and light, so there is no reason not to apply it always in normal conditions, but it is usually not enough.
We will need to apply more tests to capture more bad measurements.
Several QC tests are already implemented in CoTeDe, so you don't need to code them again.
For instance, the global range test is available as `qctests.GlobalRange` and we can use it like

In [None]:
y = qctests.GlobalRange(data, 'TEMP', cfg={"minval": -2, "maxval": 40})
y.flags
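
To make the idea behind those flags concrete, here is a minimal sketch of a global range check written with NumPy only. It is not CoTeDe's implementation: the function name `global_range_flags` and the sample values are hypothetical, and it only assigns flag 1 (good) or 4 (bad).

```python
import numpy as np

def global_range_flags(x, minval, maxval):
    """Flag 1 (good) for values inside [minval, maxval], 4 (bad) otherwise."""
    x = np.asarray(x, dtype=float)
    inside = (x >= minval) & (x <= maxval)
    return np.where(inside, 1, 4).astype("i1")

# Made-up temperatures, two of them clearly unfeasible
temperature = np.array([25.3, 12.1, -5.0, 55.2])
flags = global_range_flags(temperature, minval=-2, maxval=40)
print(flags)
```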

The procedure `GlobalRange` returns an object whose `flags` attribute holds the result of the test.

## More tests: GTSPP Spike and Gradient tests
OK, let's apply more tests beyond the global range.
Some common ones are the gradient and spike, and we could use CoTeDe to run that like

In [None]:
y_gradient = qctests.Gradient(data, 'TEMP', cfg={"threshold": 10})
y_gradient.flags

In [None]:
y_spike = qctests.Spike(data, 'TEMP', cfg={"threshold": 2.0})
y_spike.flags
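
For intuition on what the gradient test measures, the sketch below computes the gradient feature as commonly defined for GTSPP, i.e. how far each point departs from the mean of its two neighbors. The function and the sample profile are illustrative assumptions, not CoTeDe's code; the threshold of 10 is the one used in the cell above.

```python
import numpy as np

def gradient(x):
    """Departure of each point from the mean of its two neighbors."""
    x = np.asarray(x, dtype=float)
    y = np.full(x.shape, np.nan)  # first and last points cannot be evaluated
    y[1:-1] = np.abs(x[1:-1] - (x[:-2] + x[2:]) / 2.0)
    return y

# Made-up profile with one unrealistic jump at the third point
temperature = np.array([20.0, 19.5, 35.0, 18.5, 18.0])
g = gradient(temperature)
suspect = g > 10  # NaN compares as False, so the end points are not flagged
print(g)
print(suspect)
```

Note that the neighbors of the bad point also get a large gradient (8 here), which is one reason why the spike test subtracts the local trend.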

## GTSPP's Spike Check

In [None]:
def spike(x):
    """Spike check as defined by GTSPP
    """
    y = np.nan * x
    y[1:-1] = np.abs(x[1:-1] - (x[:-2] + x[2:]) / 2.0) - np.abs((x[2:] - x[:-2]) / 2.0)
    return y
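
As a worked example, the block below applies the same formula to a made-up five-point profile (the values are hypothetical) and compares the result with a threshold of 2. The function is repeated so the block runs on its own.

```python
import numpy as np

def spike(x):
    """Spike check as defined by GTSPP (same formula as above)."""
    x = np.asarray(x, dtype=float)
    y = np.full(x.shape, np.nan)
    y[1:-1] = np.abs(x[1:-1] - (x[:-2] + x[2:]) / 2.0) - np.abs((x[2:] - x[:-2]) / 2.0)
    return y

# Made-up profile: the third point jumps 5.5 degrees above its neighbors
temperature = np.array([20.0, 19.5, 25.0, 18.5, 18.0])
s = spike(temperature)
is_spike = np.absolute(s) > 2
print(s)
print(is_spike)
```

The first and last points have no neighbor on one side, so their spike value stays undefined (NaN).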


The spike check and many others are already implemented in CoTeDe, so let's use `cotede.qctests.spike()`.

GTSPP recommends a spike threshold equal to 2 $^\circ$C for temperature and 0.3 for salinity.

In [None]:
t_spike = qctests.spike(data["TEMP"])

idx_good = np.absolute(t_spike) <= 2
idx_bad = np.absolute(t_spike) > 2

p1 = figure(plot_width=420, plot_height=500)
p1.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p1.triangle(data['TEMP'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p1.xaxis.axis_label = "Temperature [C]"
p1.yaxis.axis_label = "Depth [m]"

p2 = figure(plot_width=420, plot_height=500)
p2.y_range = p1.y_range
p2.circle(t_spike[idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p2.triangle(t_spike[idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p2.xaxis.axis_label = "Spike(T)"
p2.yaxis.axis_label = "Depth [m]"


s_spike = qctests.spike(data["PSAL"])

# GTSPP recommended spike threshold for salinity is 0.3
idx_good = np.absolute(s_spike) <= 0.3
idx_bad = np.absolute(s_spike) > 0.3

p3 = figure(plot_width=420, plot_height=500)
p3.y_range = p1.y_range
p3.circle(data['PSAL'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p3.triangle(data['PSAL'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p3.xaxis.axis_label = "Salinity"
p3.yaxis.axis_label = "Depth [m]"

p4 = figure(plot_width=420, plot_height=500)
p4.y_range = p1.y_range
p4.circle(s_spike[idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p4.triangle(s_spike[idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p4.xaxis.axis_label = "Spike(S)"
p4.yaxis.axis_label = "Depth [m]"

p = column(row(p1, p2), row(p3, p4))
show(p)

## Using CoTeDe QC framework
CoTeDe automates many procedures for QC. Let's start using the standard procedure.

In [None]:
pqc = cotede.ProfileQC(data)

That's it, the primary and secondary sensors were evaluated. First, the same variables from the input are available in the output object.

In [None]:
print("Variables available in data: {}\n".format(data.keys()))
print("Variables available in pqc: {}\n".format(pqc.keys()))

In [None]:
print("Flags available for temperature {}\n".format(pqc.flags["TEMP"].keys()))
print("Flags available for salinity {}\n".format(pqc.flags["PSAL"].keys()))

The flags follow the IOC standard, thus 1 means good while 4 means bad.
0 is used when no QC was applied. For instance, the spike test is defined so that it depends on the previous and following measurements, thus the first and last data point of the array will always have a spike flag equal to 0.
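
The mapping from a test magnitude to these flags can be sketched as follows. This is a simplified illustration, not CoTeDe's code: the spike values are made up, and comparing the absolute value with the threshold follows what was done earlier in this notebook.

```python
import numpy as np

# Made-up spike magnitudes; the end points cannot be evaluated (NaN)
spike_feature = np.array([np.nan, 0.5, 5.5, -0.5, np.nan])
threshold = 2.0

flags = np.zeros(spike_feature.shape, dtype="i1")  # 0: no QC applied
evaluated = np.isfinite(spike_feature)
magnitude = np.absolute(spike_feature)
flags[evaluated & (magnitude <= threshold)] = 1  # good
flags[evaluated & (magnitude > threshold)] = 4   # bad
print(flags)
```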

Let's check the salinity with feasible values:

In [None]:
pqc.flags["PSAL"]["global_range"]

In [None]:
pqc.flags["PSAL"]["spike"]

Let's check the salinity measurements that are bad or probably bad according to the Global Range check, i.e. unfeasible values of salinity.

In [None]:
idx = pqc.flags["PSAL"]["global_range"] >= 3
pqc["PSAL"][idx]

The magnitudes of the tests are stored in features.

Let's check which features were saved for temperature,

In [None]:
print("Features for temperature: {}\n".format(pqc.features["TEMP"].keys()))

The flag "overall" is the maximum value among all other flags, as recommended by the IOC flagging system.
Therefore, if a measurement is flagged bad (flag=4) in a single test, it will get an overall flag 4.
Likewise, a measurement with overall flag 1 means that none of the applied tests raised any suspicion of a bad measurement.

In [None]:
pqc.flags["PSAL"]["overall"]
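
The rule for the overall flag described above can be sketched in a few lines. The flag arrays here are made up for illustration; in the notebook they would come from `pqc.flags`.

```python
import numpy as np

# Hypothetical flags from three tests for a five-point profile
flags = {
    "global_range": np.array([1, 1, 1, 4, 1], dtype="i1"),
    "gradient": np.array([0, 1, 1, 1, 0], dtype="i1"),
    "spike": np.array([0, 1, 4, 1, 0], dtype="i1"),
}

# The overall flag is the element-wise maximum among all tests
overall = np.max(np.stack(list(flags.values())), axis=0)
print(overall)
```

A single bad flag (4) in any test is enough to mark the measurement as bad, while a 0 (not evaluated) never overrides a result from another test.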

Let's plot the salinity and its respective normalized bias with respect to the WOA, using the overall flag to separate good from bad measurements.

In [None]:
idx_good = pqc.flags["PSAL"]["overall"] <= 2
idx_bad = pqc.flags["PSAL"]["overall"] >= 3

pressure = -pqc["PRES"]
salinity = pqc["PSAL"]
woa_normbias = pqc.features["PSAL"]["woa_normbias"]


p1 = figure(plot_width=420, plot_height=500)
p1.circle(salinity[idx_good], pressure[idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p1.triangle(salinity[idx_bad], pressure[idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p1.xaxis.axis_label = "Salinity"
p1.yaxis.axis_label = "Depth [m]"

p2 = figure(plot_width=420, plot_height=500)
p2.y_range = p1.y_range
p2.circle(woa_normbias[idx_good], pressure[idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p2.triangle(woa_normbias[idx_bad], pressure[idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p2.xaxis.axis_label = "WOA normalized bias"
p2.yaxis.axis_label = "Depth [m]"

p = row(p1, p2)
show(p)
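
The normalized bias plotted above is the difference between the measurement and the climatological mean, scaled by the climatological standard deviation. A minimal sketch follows; the salinity and climatology numbers are made up, and the threshold of 6 standard deviations is the one used elsewhere in this notebook.

```python
import numpy as np

# Made-up measurements and climatology (mean and standard deviation)
salinity = np.array([36.2, 35.9, 35.1, 39.0])
woa_mean = np.array([36.1, 35.8, 35.0, 34.9])
woa_std = np.array([0.20, 0.15, 0.10, 0.05])

# How many standard deviations each measurement is away from the climatology
woa_normbias = (salinity - woa_mean) / woa_std
suspect = np.absolute(woa_normbias) > 6
print(woa_normbias)
print(suspect)
```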

Let's look at the salinity with respect to the spike and the WOA normalized bias.
Near the bottom of the profile there are some bad salinity measurements, which are mostly identified by the spike test.
A few measurements aren't critically bad with respect to the spike or the climatology individually.
One of the goals of the Anomaly Detection is to combine multiple features into an overall decision, so that such measurements can still be identified.

In [None]:

idx_good = pqc.flags["PSAL"]["spike"] <= 2
idx_bad = pqc.flags["PSAL"]["spike"] >= 3

p1 = figure(plot_width=500, plot_height=600)
p1.circle(pqc.features["PSAL"]["spike"][idx_good], -pqc['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p1.triangle(pqc.features["PSAL"]["spike"][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)

p2 = figure(plot_width=500, plot_height=600)
p2.y_range = p1.y_range
p2.circle(pqc['PSAL'][idx_good], -pqc['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p2.line(pqc.features["PSAL"]["woa_mean"] - 6 * pqc.features["PSAL"]["woa_std"], -data['PRES'], line_width=4, line_color="orange", alpha=0.4)
p2.line(pqc.features["PSAL"]["woa_mean"] + 6 * pqc.features["PSAL"]["woa_std"], -data['PRES'], line_width=4, line_color="orange", alpha=0.4)
p2.triangle(data['PSAL'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)

p = row(p1, p2)
show(p)
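
To illustrate the idea of combining features, here is a rough sketch that is not CoTeDe's Anomaly Detection implementation. It assumes each feature is converted into an empirical tail probability (how often a magnitude at least that large occurs in the profile) and that the logarithms of those probabilities are summed. The helper `empirical_sf` and all values are hypothetical.

```python
import numpy as np

def empirical_sf(values):
    """Fraction of values with a magnitude at least as large as each value."""
    mag = np.absolute(np.asarray(values, dtype=float))
    return np.array([np.mean(mag >= m) for m in mag])

# Made-up features: the fourth point is unusual in both, but it is below
# typical hard thresholds (spike of 2 and normalized bias of 6)
spike = np.array([0.01, -0.02, 0.015, 0.9, -0.01, 0.02, 0.005, -0.015])
normbias = np.array([0.3, -0.5, 0.2, 2.8, 0.4, -0.1, 0.6, -0.2])

# Lower (more negative) score means a more unusual measurement
score = np.log(empirical_sf(spike)) + np.log(empirical_sf(normbias))
most_anomalous = int(np.argmin(score))
print(score.round(2))
print(most_anomalous)
```

No single feature would reject the fourth measurement with hard thresholds, but it stands out once both features are considered together.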

In [None]:
pqc.features["PSAL"]["woa_normbias"]

In [None]:
t_spike = pqc.features["TEMP"]["anomaly_detection"]

idx_good = np.absolute(t_spike) <= 2
idx_bad = np.absolute(t_spike) > 2

p1 = figure(plot_width=420, plot_height=500)
p1.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p1.triangle(data['TEMP'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p1.xaxis.axis_label = "Temperature [C]"
p1.yaxis.axis_label = "Depth [m]"

p2 = figure(plot_width=420, plot_height=500)
p2.y_range = p1.y_range
p2.circle(t_spike[idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p2.triangle(t_spike[idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p2.xaxis.axis_label = "Anomaly detection (T)"
p2.yaxis.axis_label = "Depth [m]"


s_spike = pqc.features["PSAL"]["woa_normbias"]

idx_good = np.absolute(s_spike) <= 2
idx_bad = np.absolute(s_spike) > 2

p3 = figure(plot_width=420, plot_height=500)
p3.y_range = p1.y_range
p3.circle(data['PSAL'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p3.triangle(data['PSAL'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p3.xaxis.axis_label = "Salinity"
p3.yaxis.axis_label = "Depth [m]"

p4 = figure(plot_width=420, plot_height=500)
p4.y_range = p1.y_range
p4.circle(s_spike[idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p4.triangle(s_spike[idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p4.xaxis.axis_label = "WOA normalized bias (S)"
p4.yaxis.axis_label = "Depth [m]"

p = column(row(p1, p2), row(p3, p4))
show(p)

In [None]:
print("Variables that were flagged: {}\n".format(pqc.flags.keys()))
print("Flags for temperature: {}\n".format(pqc.flags["TEMP"].keys()))

pqc.features.keys()
pqc.features["TEMP"].keys()


In [None]:
pqc.features["TEMP"]["anomaly_detection"]

In [None]:
from bokeh.models import ColumnDataSource, CustomJS, Slider

threshold = Slider(title="threshold", value=0.05, start=0.0, end=6.0, step=0.05, orientation="horizontal")


tmp = dict(
    depth=-pqc["PRES"],
    temp=pqc["PSAL"],
    temp_good=pqc["PSAL"].copy(),
    temp_bad=pqc["PSAL"].copy(),
    tukey53H=np.absolute(pqc.features["PSAL"]["spike"]),
    tukey53H_good=np.absolute(pqc.features["PSAL"]["spike"]),
    tukey53H_bad=np.absolute(pqc.features["PSAL"]["spike"]),    
)
idx = tmp["tukey53H"] > threshold.value
tmp["temp_good"][idx] = np.nan
tmp["temp_bad"][~idx] = np.nan
tmp["tukey53H_good"][idx] = np.nan
tmp["tukey53H_bad"][~idx] = np.nan


source = ColumnDataSource(data=tmp)


callback = CustomJS(args=dict(source=source), code="""
    var data = source.data;
    var f = cb_obj.value
    var temp = data['temp']
    var temp_good = data['temp_good']
    var temp_bad = data['temp_bad']
    var tukey53H = data['tukey53H']
    var tukey53H_good = data['tukey53H_good']
    var tukey53H_bad = data['tukey53H_bad']
    for (var i = 0; i < temp.length; i++) {
        if (tukey53H[i] > f) {
            temp_good[i] = "NaN"
            temp_bad[i] = temp[i]
            tukey53H_good[i] = "NaN"
            tukey53H_bad[i] = tukey53H[i]
        } else {
            temp_good[i] = temp[i]
            temp_bad[i] = "NaN"
            tukey53H_good[i] = tukey53H[i]
            tukey53H_bad[i] = "NaN"
        }
    }
    source.change.emit();
""")


threshold.js_on_change('value', callback)


p1 = figure(plot_width=420, plot_height=600)
p1.circle("temp_good", "depth", source=source, size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p1.triangle("temp_bad", "depth", source=source, size=8, line_color="red", fill_color="red", fill_alpha=0.3)

p2 = figure(plot_width=420, plot_height=600)
p2.y_range = p1.y_range
p2.circle("tukey53H_good", "depth", source=source, size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p2.triangle("tukey53H_bad", "depth", source=source, size=8, line_color="red", fill_color="red", fill_alpha=0.3)
# inputs = row(threshold)
#threshold = column(slider)


p = column(threshold, row(p1, p2))
show(p)

In [None]:
y_spike

In [None]:
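# y_spike is a qctests.Spike object; the spike magnitudes are stored in its features
spike_abs = np.absolute(y_spike.features["spike"])
spike_abs = spike_abs[np.isfinite(spike_abs)]

# Fraction of the measurements with a spike magnitude at least as large as each value
N = spike_abs.size
n_greater = np.array([(spike_abs >= t).sum() / N for t in spike_abs])

p = figure(plot_width=840, plot_height=400)
p.circle(spike_abs, n_greater, size=8, line_color="green", fill_color="green", fill_alpha=0.3)
show(p)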

In [None]:
from scipy.stats import exponweib

spike_scale = np.arange(0.0005, 0.2, 1e-3)
param = [1.078231, 0.512053, 0.0004, 0.002574]
tmp = exponweib.sf(spike_scale, *param[:-2], loc=param[-2], scale=param[-1])
p = figure(plot_width=840, plot_height=400)
# p.line(x_ref, pdf, line_color="orange", line_width=8, alpha=0.7, legend_label="PDF")
p.circle(spike_scale, tmp, size=8, line_color="green", fill_color="green", fill_alpha=0.3)
show(p)

In [None]:
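# Assumption: SF is the empirical survival function of the temperature spike magnitude,
# i.e. the fraction of measurements with a spike at least as large as each value.
spike_abs = np.absolute(y_spike.features["spike"])
N = np.isfinite(spike_abs).sum()
SF = np.array([(spike_abs >= t).sum() / N for t in spike_abs])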


p1 = figure(plot_width=420, plot_height=600)
p1.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p1.triangle(data['TEMP'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)

p2 = figure(plot_width=420, plot_height=600)
p2.circle(SF[idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p2.triangle(SF[idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
p = row(p1, p2)
show(p)

In [None]:
p1 = figure(plot_width=420, plot_height=600)
p1.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)

In [None]:
def draw_histogram(x, bins=50):
    """Plot a histogram

    Create a histogram from the output of numpy.histogram().
    We will create several histograms in this notebook, so let's save this as a
    function to reuse the code. Non-finite values are ignored.
    """
    x = x[np.isfinite(x)]
    hist, edges = np.histogram(x, density=True, bins=bins)

    #title = 'test'
    # p = figure(title=title, tools='', background_fill_color="#fafafa")
    p = figure(plot_width=750, plot_height=300,
        background_fill_color="#fafafa")
        # tools='', background_fill_color="#fafafa")
    p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
           fill_color="navy", line_color="white", alpha=0.5)
    # p.line(x, pdf, line_color="#ff8888", line_width=4, alpha=0.7, legend_label="PDF")
    # p.line(x, cdf, line_color="orange", line_width=2, alpha=0.7, legend_label="CDF")

    p.y_range.start = 0
    # p.legend.location = "center_right"
    # p.legend.background_fill_color = "#fefefe"
    p.xaxis.axis_label = 'x'
    p.yaxis.axis_label = 'Pr(x)'
    p.grid.grid_line_color="white"
    return p

In [None]:
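# Tukey 53H feature for temperature, obtained with the low level interface
y_tukey53H = qctests.Tukey53H(data, 'TEMP', cfg={'threshold': 6, "l": 12}).features["tukey53H"]

p = draw_histogram(y_tukey53H[idx_good], bins=50)
show(p)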

In [None]:
data.attrs

In [None]:
import oceansdb
WOADB = oceansdb.WOA()

woa = WOADB['TEMP'].extract(var=['mean', 'standard_deviation'], doy=data.attrs['datetime'], lat=data.attrs['LATITUDE'], lon=data.attrs['LONGITUDE'], depth=data['PRES'])

In [None]:
pqc.features["TEMP"]["woa_mean"] - 6 * pqc.features["TEMP"]["woa_std"]

In [None]:
pqc = cotede.ProfileQC(data)

In [None]:
pqc.flags['TEMP'].keys()

In [None]:
pqc.features['TEMP']


In [None]:
p = figure(plot_width=500, plot_height=600)
p.circle(y_tukey53H, -data['PRES'], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
show(p)

In [None]:
idx_valid

In [None]:
np.percentile(y_tukey53H[np.isfinite(y_tukey53H)], 25)

In [None]:
# Boolean mask (not the values), so it can be combined with idx_valid below
idx = np.absolute(y_tukey53H) < 6

p = draw_histogram(y_tukey53H[idx & idx_valid])
show(p)

In [None]:
idx = idx_valid & np.isfinite(y_tukey53H)

mu_estimated, sigma_estimated = stats.norm.fit(y_tukey53H[idx])

print("Estimated mean: {:.3f}, and standard deviation: {:.3f}".format(mu_estimated, sigma_estimated))

In [None]:
y_tukey53H_scaled = (y_tukey53H - mu_estimated) / sigma_estimated

In [None]:
p = figure(plot_width=500, plot_height=600)
p.circle(y_tukey53H_scaled, -data['PRES'], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
show(p)

In [None]:
cfg = {'TEMP': {'global_range': {'minval': -4, 'maxval': 45}}}

pqc = ProfileQC(data, cfg)

pqc.flags['TEMP']
pqc.flags['TEMP']['overall']

idx_good = pqc.flags['TEMP']['overall'] <= 2
idx_bad = pqc.flags['TEMP']['overall'] >= 3

p = figure(plot_width=500, plot_height=600)
p.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p.triangle(data['TEMP'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
show(p)

In [None]:
cfg['TEMP']['spike'] = {'threshold': 6}

pqc = ProfileQC(data, cfg)

pqc.flags['TEMP']
pqc.flags['TEMP']['overall']

idx_good = pqc.flags['TEMP']['overall'] <= 2
idx_bad = pqc.flags['TEMP']['overall'] >= 3

p = figure(plot_width=500, plot_height=600)
p.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p.triangle(data['TEMP'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
show(p)

In [None]:
cfg['TEMP']['woa_normbias'] = {'threshold': 6}


pqc = ProfileQC(data, cfg)

pqc.flags['TEMP']
pqc.flags['TEMP']['overall']

idx_good = pqc.flags['TEMP']['overall'] <= 2
idx_bad = pqc.flags['TEMP']['overall'] >= 3

p = figure(plot_width=500, plot_height=600)
p.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p.triangle(data['TEMP'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
show(p)

In [None]:
cfg['TEMP']['spike_depthconditional'] = {"pressure_threshold": 500, "shallow_max": 6.0, "deep_max": 2.0}

pqc = ProfileQC(data, cfg)

pqc.flags['TEMP']
pqc.flags['TEMP']['overall']

idx_good = pqc.flags['TEMP']['overall'] <= 2
idx_bad = pqc.flags['TEMP']['overall'] >= 3

p = figure(plot_width=500, plot_height=600)
p.circle(data['TEMP'][idx_good], -data['PRES'][idx_good], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
p.triangle(data['TEMP'][idx_bad], -data['PRES'][idx_bad], size=8, line_color="red", fill_color="red", fill_alpha=0.3)
show(p)

## The Easiest Way: High level
Let's evaluate this profile using EuroGOOS standard tests.

In [None]:
pqced = cotede.ProfileQCed(data, cfg='eurogoos')

In [None]:
p = figure(plot_width=500, plot_height=600)
p.circle(pqced['TEMP'], -pqced['PRES'], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
show(p)
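
Conceptually, a quality controlled profile like the one above can be thought of as the original data with the rejected measurements masked out. The sketch below shows that idea with a NumPy masked array. It assumes that flags 3 and 4 are the ones rejected; it is not a description of how `ProfileQCed` is implemented, and the values are made up.

```python
import numpy as np

# Made-up temperatures and their overall flags
temperature = np.array([25.3, 24.8, 55.2, 12.1])
overall = np.array([1, 1, 4, 2], dtype="i1")

# Mask measurements flagged as probably bad (3) or bad (4)
temperature_qced = np.ma.masked_where(overall >= 3, temperature)
print(temperature_qced)
print(temperature_qced.compressed())  # only the accepted values
```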

## QC with more control: "medium" level

In [None]:
pqc = cotede.ProfileQC(data, cfg='eurogoos')

In [None]:
pqc.keys()

In [None]:
pqc.flags["TEMP"]

In [None]:
data.keys()

### Low level

In [None]:
from cotede import qctests
y = qctests.GlobalRange(data, 'TEMP', cfg={'minval': -4, "maxval": 45 })
y.flags

In [None]:
y = qctests.Tukey53H(data, 'TEMP', cfg={'threshold': 6, "l": 12})
y.features["tukey53H"]
p = figure(plot_width=500, plot_height=600)
p.circle(y.features["tukey53H"], -data['PRES'], size=8, line_color="green", fill_color="green", fill_alpha=0.3)
show(p)