# Attenuated Signal Test Example

This notebook demonstrates how an attenuated signal test can be used to detect biofouling on an instrument.

The source data is derived from a historical USGS CTD station located at Lower Sand Island, OR in the Columbia River estuary.
The selected time period shows the tidal influence on salinity over a spring-neap cycle.
Near the end of the selected period, the range of salinity decreases, which corresponds with biofouling of the sensor.

The data was downloaded from the [Center for Coastal Margin Observation & Prediction (CMOP) Data Explorer](http://amb6400b.stccmop.org/ws/product/offeringplot_ctime.py?handlegaps=true&series=time,sandi.790.A.CTD.salt.PD0&width=8.54&height=2.92&starttime=2001-7-1%200:00&endtime=2001-09-5%2023:59).

### Setup

In [None]:
# Setup directories
from pathlib import Path
basedir = Path().absolute()
libdir = basedir.parent.parent.parent

# Other imports
import pandas as pd
import numpy as np
from datetime import datetime

import matplotlib.pyplot as plt

from bokeh.layouts import gridplot
from bokeh import plotting
plotting.output_notebook()

In [None]:
# # Install QC library
# !pip install git+https://github.com/ioos/ioos_qc.git

# Alternative installation (install specific branch):
!pip uninstall -y ioos_qc
!pip install git+https://github.com/ioos/ioos_qc.git@attenuated_signal_updates

# # Alternative installation (run with local updates):
# !pip uninstall -y ioos_qc
# import sys
# sys.path.append(str(libdir))
    
from ioos_qc.config import QcConfig
from ioos_qc import qartod

In [None]:
# Method to plot QC results using Bokeh
def plot_results(data, var_name, results, title, test_name):

    time = data.index
    obs = data[var_name]
    qc_test = results['qartod'][test_name]

    qc_pass = np.ma.masked_where(qc_test != 1, obs)
    qc_suspect = np.ma.masked_where(qc_test != 3, obs)
    qc_fail = np.ma.masked_where(qc_test != 4, obs)
    qc_notrun = np.ma.masked_where(qc_test != 2, obs)

    p1 = plotting.figure(x_axis_type="datetime", title=test_name + ' : ' + title)
    p1.grid.grid_line_alpha=0.3
    p1.xaxis.axis_label = 'Time'
    p1.yaxis.axis_label = 'Observation Value'

    p1.line(time, obs,  legend_label='obs', color='#A6CEE3')
    # Each flag category is drawn once; flag 2 is shown as "not evaluated"
    p1.circle(time, qc_pass, size=4, legend_label='qc pass', color='green', alpha=0.5)
    p1.circle(time, qc_suspect, size=4, legend_label='qc suspect', color='orange', alpha=0.7)
    p1.circle(time, qc_fail, size=6, legend_label='qc fail', color='red', alpha=1.0)
    p1.circle(time, qc_notrun, size=6, legend_label='qc not evaluated', color='gray', alpha=1.0)

    plotting.show(gridplot([[p1]], plot_width=800, plot_height=400))

## Run example
---

#### 1. Load data and perform exploratory analysis

In [None]:
# Historical salinity data from Lower Sand Island, Columbia River Estuary
# data: http://amb6400b.stccmop.org/ws/product/offeringplot_ctime.py?handlegaps=true&series=time,sandi.790.A.CTD.salt.PD0&width=8.54&height=2.92&starttime=2001-7-1 0:00&endtime=2001-09-5 23:59
# location same as sandi for CREOFS: https://tidesandcurrents.noaa.gov/ofs/ofs_station.shtml?stname=Lower%20Sand%20Island%20Light&ofs=cre&stnid=sandi&subdomain=ba
filename = basedir.joinpath('attenuated_salinity_example.csv')

data = pd.read_csv(filename, header=1, index_col='\'time PST\'', parse_dates=True)
data.plot()

In [None]:
var_name = data.columns[0]
data[var_name]

#### 2. Plot range and standard deviation of salinity over a lunar-day moving window

In [None]:
# lunar day: 24.8 hours, i.e. two M2 tidal cycles of about 12.42 hours each
time_delta = 3600 * 24.8
print(f'window: {time_delta}')

# start QC after one full lunar day of data
# - 24.8 hours of data at 2 minute intervals (744 observations)
min_periods = 24.8*60/2
# pandas uses the phrase "min_periods" to indicate the minimum number of observations
# - ioos_qc uses the phrase "min_obs"
print(f'min_obs: {min_periods}')
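
The window length and observation count above both follow from the sampling interval. A minimal sketch of that arithmetic, assuming the regular 2-minute sampling stated in the comment above; the helper name `obs_per_window` is hypothetical and not part of pandas or ioos_qc:

```python
def obs_per_window(window_hours, sample_minutes):
    """Number of regularly spaced observations that fit in one window."""
    # round() guards against floating-point noise before converting to int
    return int(round(window_hours * 60 / sample_minutes))

window_seconds = int(round(3600 * 24.8))  # one lunar day, in seconds
n_obs = obs_per_window(24.8, 2)           # samples per lunar day at 2-minute spacing
print(f"window: {window_seconds} s, observations per window: {n_obs}")
```

If the instrument sampled at a different interval, only `sample_minutes` would need to change to keep `min_obs` equal to one full window.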

In [None]:
# range check
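# lowercase 's' with an integer number of seconds avoids the deprecated 'S' alias in recent pandas
range_data = data.rolling(f'{int(time_delta)}s', min_periods=int(min_periods)).apply(np.ptp, raw=True)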
range_data.plot()

In [None]:
# check that the first min_periods - 1 values are NaN
# - a window needs min_periods observations before a value is computed
range_data.isna().sum()
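
The NaN count can be reproduced on synthetic data. The sketch below assumes a gap-free series sampled every 2 minutes (the station record may contain gaps, which would change the count) and uses a made-up sinusoid rather than the salinity file:

```python
import numpy as np
import pandas as pd

# Synthetic series: three days at 2-minute sampling (not the notebook's data)
index = pd.date_range("2001-07-01", periods=3 * 720, freq="2min")
values = np.sin(2 * np.pi * np.arange(index.size) / 372.6)
series = pd.Series(values, index=index)

min_periods = 744
rolled = series.rolling("89280s", min_periods=min_periods).apply(np.ptp, raw=True)

# The first window holding 744 observations ends at position 743,
# so positions 0..742 are NaN
n_nan = int(rolled.isna().sum())
print(n_nan)
```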

In [None]:
# stdev check
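# lowercase 's' with an integer number of seconds avoids the deprecated 'S' alias in recent pandas
stdev_data = data.rolling(f'{int(time_delta)}s', min_periods=int(min_periods)).apply(np.std, raw=True)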
stdev_data.plot()

#### 3. Run QC and plot results

##### Test range

The first 743 points (one fewer than `min_obs`) are marked as "NOT EVALUATED" because the window does not yet hold enough data to evaluate whether they pass or fail.

The range of the signal falls so quickly that no points are marked as "SUSPECT"; the flags change directly from "PASS" to "FAIL".
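
The flagging logic described above can be sketched with pandas alone. This is an illustrative re-implementation under stated assumptions, not the ioos_qc source: the function name `range_flags` is hypothetical, the flag values follow the ones used by `plot_results` in this notebook, and the input is a synthetic sinusoid whose amplitude drops after two days.

```python
import numpy as np
import pandas as pd

PASS, NOT_EVALUATED, SUSPECT, FAIL = 1, 2, 3, 4  # values used by plot_results

def range_flags(series, window, min_obs, suspect_threshold, fail_threshold):
    """Flag each point from the range (max - min) of its trailing window."""
    roll = series.rolling(window, min_periods=min_obs)
    rng = roll.max() - roll.min()
    flags = np.full(series.size, PASS, dtype="uint8")
    flags[(rng < suspect_threshold).to_numpy()] = SUSPECT
    flags[(rng < fail_threshold).to_numpy()] = FAIL
    flags[rng.isna().to_numpy()] = NOT_EVALUATED  # too few observations so far
    return flags

# Synthetic signal: amplitude 10 for two days, then attenuated to amplitude 2
index = pd.date_range("2001-07-01", periods=4 * 720, freq="2min")
t = np.arange(index.size)
amplitude = np.where(t < 2 * 720, 10.0, 2.0)
series = pd.Series(amplitude * np.sin(2 * np.pi * t / 372.6), index=index)

flags = range_flags(series, "89280s", 744,
                    suspect_threshold=17.25, fail_threshold=15)
print(flags[0], flags[743], flags[-1])
```

The thresholds are applied from least to most severe so that a point below both thresholds ends up as FAIL rather than SUSPECT.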

In [None]:
# QC configuration
# This configuration is used to call the corresponding method in the ioos_qc library
# See documentation for description of each test and its inputs: 
#   https://ioos.github.io/ioos_qc/api/ioos_qc.html#module-ioos_qc.qartod
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 17.25,
            "fail_threshold": 15,
            "test_period": int(time_delta),
            "min_obs": int(min_periods),
            "check_type": "range"
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"

plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

##### Test standard deviation

The "standard deviation" test flags data as "SUSPECT" where the "range" test did not.
This exemplifies the utility of using both tests in tandem.
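
One possible reason the two checks respond differently, shown here on synthetic data rather than the station record: the range depends only on the two most extreme values in the window, while the standard deviation reflects every sample. When attenuated samples begin to enter the window, the range can stay high as long as one earlier full-amplitude cycle remains, whereas the standard deviation starts to fall right away. The signal, amplitudes, and the 6-hour comparison point below are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic signal: amplitude 10 for two days, then attenuated to amplitude 2
index = pd.date_range("2001-07-01", periods=4 * 720, freq="2min")
t = np.arange(index.size)
amplitude = np.where(t < 2 * 720, 10.0, 2.0)
series = pd.Series(amplitude * np.sin(2 * np.pi * t / 372.6), index=index)

roll = series.rolling("89280s", min_periods=744)
rolling_range = roll.max() - roll.min()
rolling_std = roll.std(ddof=0)  # population std, the same default as np.std

before = 2 * 720 - 1     # last sample before attenuation begins
after = 2 * 720 + 180    # 6 hours (180 samples) into the attenuated signal

range_ratio = rolling_range.iloc[after] / rolling_range.iloc[before]
std_ratio = rolling_std.iloc[after] / rolling_std.iloc[before]
print(f"range retained: {range_ratio:.2f}, std retained: {std_ratio:.2f}")
```

In this sketch the range is essentially unchanged six hours in, while the standard deviation has already dropped, which gives a threshold on the standard deviation more room for an intermediate "SUSPECT" band.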

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": int(min_periods),
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

## Sensitivity Tests
---

### 1. Test results sensitivity to `min_obs`

The following plots demonstrate the sensitivity of the test results in the beginning of a time series to the selection of `min_obs`. 
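
The configurations in this section differ in a single parameter, so they can be generated rather than retyped. A minimal sketch, assuming the same keys used throughout this notebook; `make_config` is a hypothetical helper, not part of ioos_qc, and the default `test_period=89280` is `int(3600 * 24.8)` written out:

```python
def make_config(min_obs, suspect_threshold=5, fail_threshold=4.5,
                check_type="std", test_period=89280):
    """Build a config dict shaped like the ones used in this notebook."""
    return {
        "qartod": {
            "attenuated_signal_test": {
                "suspect_threshold": suspect_threshold,
                "fail_threshold": fail_threshold,
                "check_type": check_type,
                "test_period": test_period,
                "min_obs": min_obs,
            }
        }
    }

# One configuration per min_obs value explored in this section
configs = {n: make_config(min_obs=n) for n in (0, 10, 100, 744)}
print(configs[10]["qartod"]["attenuated_signal_test"])
```

Each dictionary could then be passed to `QcConfig` and run in a loop, in the same way as the individual cells that follow.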

### `min_obs`: 0

No observations are marked suspect at the beginning of the time series, but the first 744 observations (the size of the rolling window) are labeled "FAILING".
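
Statements like the one above can be checked numerically instead of being read off the plot. The sketch below uses a made-up flag array; the helpers `count_flags` and `leading_run` are hypothetical names, and in this notebook they would be applied to `qc_results['qartod']['attenuated_signal_test']`.

```python
import numpy as np

FLAG_NAMES = {1: "PASS", 2: "NOT EVALUATED", 3: "SUSPECT", 4: "FAIL"}

def count_flags(flags):
    """Return {flag name: count} for a QC flag array."""
    values, counts = np.unique(np.asarray(flags), return_counts=True)
    return {FLAG_NAMES.get(int(v), str(v)): int(c) for v, c in zip(values, counts)}

def leading_run(flags, value):
    """Length of the run of `value` at the start of the flag array."""
    mismatches = np.flatnonzero(np.asarray(flags) != value)
    return int(mismatches[0]) if mismatches.size else len(flags)

# Made-up flags for illustration only
example = np.array([2, 2, 2, 1, 1, 3, 4, 4])
summary = count_flags(example)
n_leading = leading_run(example, 2)
print(summary, n_leading)
```

Calling `leading_run(flags, 4)` on real results would report how many observations at the start of the series were flagged as failing.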

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": 0
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

### `min_obs`: 10

Only the first 10 observations are marked as NOT EVALUATED, but the remainder of the first 744 samples are labeled as "FAILING".

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": 10
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

### `min_obs`: 100

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": 100
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

### `min_obs`: 744

The first 744 samples are marked "NOT EVALUATED", but none are marked as "FAILING".

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": 744
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

### 2. Test results sensitivity to `suspect_threshold`

### `suspect_threshold`: 7

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 7,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": 744
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

### `suspect_threshold`: 6

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 6,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": 744
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

### `suspect_threshold`: 5

In [None]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_obs": 744
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')