# Changes in QARTOD Gross Range and Climatology Lookup Tables

The lookup tables for the gross range and climatology QARTOD tests on OOI CTDBP data were last updated on 12 Jan 2022, and the "suspect" ranges were calculated from all data available before the cutoff date of 31 Dec 2021. Recalculating these test ranges now yields different values, because some data recorded by the sensor before the cutoff date were not recovered and ingested until after the current lookup tables were published online.

Deployment 14 of the Coastal Endurance array was recovered on 17 Sept 2021, making it the last deployment recovered before the cutoff date, so this is the last date for which recovered instrument data was available for the original lookup tables. We can assume that deployment 14 data was included in the lookup table calculation, since it was ingested on 18 Oct 2021. Deployment 15 data was recovered 31 Mar 2022 and ingested xx xxx 2022, so any gross range or climatology test ranges recalculated after 2023 with the same cutoff date will include additional recovered data from 17 Sept to 31 Dec 2021.

(Note: We could change the cutoff date to 17 Sept 2021 when calling the script that recalculates the lookup tables, and check whether that reproduces the published lookup table values.)
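The effect of the cutoff date can be sketched on synthetic data. This is only an illustration: the temperature series is made up, `suspect_range` is a hypothetical helper, and it assumes the "suspect" limits are the mean plus or minus 3 standard deviations (the 99.7% coverage of a normal distribution). The actual lookup table script may treat non-normal data differently.

```python
import numpy as np
import pandas as pd


def suspect_range(series, cutoff, n_std=3):
    """Hypothetical helper: mean -/+ n_std standard deviations of all data up to cutoff."""
    subset = series.loc[:cutoff].dropna()
    center, spread = subset.mean(), subset.std()
    return center - n_std * spread, center + n_std * spread


# Synthetic daily temperature record with a seasonal cycle (NOT real CTDBP data)
rng = np.random.default_rng(0)
time = pd.date_range('2015-01-01', '2021-12-31', freq='D')
day_of_year = time.dayofyear.to_numpy()
temperature = pd.Series(
    11 + 3 * np.sin(2 * np.pi * (day_of_year - 120) / 365.25) + rng.normal(0, 0.5, time.size),
    index=time,
)

# Same record, two cutoff dates: the last recovery date and the published cutoff date
original = suspect_range(temperature, '2021-09-17')
extended = suspect_range(temperature, '2021-12-31')

# Number of daily samples that only the later cutoff includes
n_added = temperature.loc['2021-09-18':'2021-12-31'].size

print('cutoff 2021-09-17:', original)
print('cutoff 2021-12-31:', extended)
print('samples added by the later cutoff:', n_added)
```

If the ranges computed with the earlier cutoff match the published tables, that would support the explanation that late-ingested recovered data caused the difference.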

In [1]:
# Load python modules used in this notebook into the workspace
import ooi_data_explorations.qartod.qc_processing as qc_process
import ooi_data_explorations.common as ooi_tools
from ooi_data_explorations.common import load_gc_thredds
from ooi_data_explorations.uncabled.process_ctdbp import ctdbp_datalogger, ctdbp_instrument

from ooinet import M2M

import numpy as np
import pandas as pd
import xarray as xr
import dask

In [2]:
# set parameters for a particular sensor
site = 'CE01ISSM'
node = 'SBD17'
sensor = '06-CTDBPC000'

refdes = '-'.join([site,node,sensor])

In [3]:
# Checking description of sensor with vocab connected to the reference designator above
vocab = M2M.get_vocab(refdes)
vocab

|   | @class | vocabId | refdes | instrument | tocL1 | tocL2 | tocL3 | manufacturer | model | mindepth | maxdepth |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | .VocabRecord | 34 | CE01ISSM-SBD17-06-CTDBPC000 | CTD | Coastal Endurance | Oregon Inshore Surface Mooring | Surface Buoy | Sea-Bird | SBE 16plusV2 | 1.0 | 1.0 |


In [10]:
# View deployment information 
# Below we can see the dates that instruments were recovered in relation to the date that lookup tables were updated and cutoff date.
deployments = M2M.get_deployments(refdes)
deployments.loc[10:12]

|   | deploymentNumber | uid | assetId | latitude | longitude | depth | deployStart | deployEnd | deployCruise | recoverCruise |
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 14 | CGINS-CTDBPC-50152 | 3417 | 44.65973 | -124.09492 | 1.0 | 2021-03-31 14:29:00 | 2021-09-17 15:51:00 | SKQ202104S | |
| 11 | 15 | CGINS-CTDBPC-50055 | 1485 | 44.65708 | -124.09447 | 1.0 | 2021-09-17 14:07:00 | 2022-03-31 00:06:00 | TN394 | |
| 12 | 16 | CGINS-CTDBPC-50152 | 3417 | 44.65958 | -124.095 | 1.0 | 2022-03-31 15:51:00 | 2022-10-01 16:21:00 | SKQ202205S | |


### Finding dates of data ingestion

In [None]:
# Load metadata about the sensor to check available variables and attributes in the dataset
metadata = M2M.get_metadata(refdes)
metadata # Note that ingestion_timestamp exists as a variable for the data streams telemetered, recovered_host, and recovered_inst

The next two cells request and download data from the end of deployment 14. I temporarily edited the `m2m_collect()` function in the source code to keep the `ingestion_timestamp` variable in the dataset instead of dropping it. Most of the time we won't need the ingestion timestamp, but in this case I needed to check which dates of recovered data were included in the calculation of the published lookup tables.

In [None]:
# download recovered_host data on last day of deployment 14
tag = '.*CTDBP.*\\.nc$'
method = 'recovered_host'
stream = 'ctdbp_cdef_dcl_instrument_recovered'
req_host = ooi_tools.m2m_request(site, node, sensor, method, stream, start='2021-09-17T00:00:00.000Z', stop='2021-09-17T14:00:00.000Z')
data_host = ooi_tools.m2m_collect(req_host, tag)
data_host
# added data ingestion date in notes on OOI Data Explorations

In [None]:
# download recovered_inst data on last day of deployment 14
tag = '.*CTDBP.*\\.nc$'
method = 'recovered_inst'
stream = 'ctdbp_cdef_instrument_recovered'
req_inst = ooi_tools.m2m_request(site, node, sensor, method, stream, start='2021-09-17T00:00:00.000Z', stop='2021-09-17T14:00:00.000Z')
data_inst = ooi_tools.m2m_collect(req_inst, tag)
data_inst
# added data ingestion date in notes on OOI Data Explorations
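Once `ingestion_timestamp` is kept in the dataset, the check against the lookup table publication date could look like the sketch below. The timestamp values here are made up to fall on the 18 Oct 2021 ingestion date noted above, and the unit (milliseconds since 1970-01-01) is an assumption that should be confirmed against the variable's attributes in the metadata.

```python
import pandas as pd

# Hypothetical ingestion timestamps standing in for data_host['ingestion_timestamp'].
# ASSUMPTION: values are milliseconds since 1970-01-01; check the variable's units attribute.
first = pd.Timestamp('2021-10-18T11:20:00')
ingestion_ms = pd.Series(
    [(first + pd.Timedelta(minutes=10 * i)).value // 10**6 for i in range(3)]
)

# Convert to datetimes and reduce to the unique calendar dates of ingestion
ingested = pd.to_datetime(ingestion_ms, unit='ms')
ingestion_dates = ingested.dt.normalize().unique()

# Data ingested before the tables were published (12 Jan 2022) could have been used in them
published = pd.Timestamp('2022-01-12')
in_original_tables = bool((ingested < published).all())

print('ingestion dates:', [str(pd.Timestamp(d).date()) for d in ingestion_dates])
print('ingested before lookup tables were published:', in_original_tables)
```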

### Plotting temperature data from Sept to Dec 2021

Adding just 3 months of recovered data raised the lower limit of the temperature gross range test by 45% of its original value for this reference designator. We are looking for anything significant in the time series that would explain such a jump in the lower end of the range that covers 99.7% of the temperature measurements.
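For reference, the percent change quoted above is the difference between the recalculated and original limits, relative to the original limit. The sketch below uses placeholder limits chosen only to illustrate the arithmetic; they are not taken from the lookup tables, and `percent_change` is a hypothetical helper.

```python
def percent_change(original, updated):
    """Change from original to updated, as a percentage of the original value."""
    if original == 0:
        raise ValueError('percent change is undefined for an original value of zero')
    return (updated - original) / abs(original) * 100.0


# Placeholder limits in degrees C (hypothetical, NOT values from the lookup tables)
old_lower = 4.0
new_lower = 5.8

change = percent_change(old_lower, new_lower)
print(f'lower limit changed by {change:.1f}% of its original value')
```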

In [3]:
# Combine datasets from all 3 streams to plot the same timeseries that QARTOD test ranges are calculated from

# Import additional functions and modules for processing data and plotting
from ooi_data_explorations.common import get_annotations, add_annotation_qc_flags
from ooi_data_explorations.qartod.endurance.qartod_ce_ctdbp import combine_delivery_methods
import matplotlib.pyplot as plt

data = combine_delivery_methods(site, node, sensor)

# Use annotations to ignore any data that would be dropped before calculating test ranges
# get the current system annotations for the sensor
annotations = get_annotations(site, node, sensor)
annotations = pd.DataFrame(annotations)
if not annotations.empty:
    annotations = annotations.drop(columns=['@class'])
    annotations['beginDate'] = pd.to_datetime(annotations.beginDT, unit='ms').dt.strftime('%Y-%m-%dT%H:%M:%S')
    annotations['endDate'] = pd.to_datetime(annotations.endDT, unit='ms').dt.strftime('%Y-%m-%dT%H:%M:%S')

    # create an annotation-based quality flag
    data = add_annotation_qc_flags(data, annotations)

    # clean up the data, masking values that the annotations mark as fail (flag value 4);
    # limiting the data by date is handled separately by the time slice in a later cell
    data = data.where(data.rollup_annotations_qc_results != 4)

Downloading 237 data file(s) from the OOI Gold Copy THREDSS catalog
Downloading and Processing Data Files: 100%|██████████| 237/237 [02:47<00:00,  1.41it/s]
Downloading 16 data file(s) from the OOI Gold Copy THREDSS catalog
Downloading and Processing Data Files: 100%|██████████| 16/16 [00:12<00:00,  1.27it/s]
Downloading 16 data file(s) from the OOI Gold Copy THREDSS catalog
Downloading and Processing Data Files: 100%|██████████| 16/16 [00:17<00:00,  1.07s/it]


In [4]:
# Limit data to Sept through Dec 2021

start_DT = '2021-09-01T00:00:00'
end_DT = '2022-01-01T00:00:00'

data = data.sel(time=slice(start_DT, end_DT))
data
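One detail worth keeping in mind with label-based time slices: both endpoints are included, so a sample stamped exactly `2022-01-01T00:00:00` would fall inside the selection. The sketch below shows this with a plain pandas series on a synthetic hourly index, on the assumption that the dataset's time coordinate is backed by the same kind of pandas datetime index.

```python
import pandas as pd

# Synthetic hourly index spanning the end of the cutoff day (not real data)
time = pd.date_range('2021-12-31T22:00:00', '2022-01-01T02:00:00', freq=pd.Timedelta(hours=1))
series = pd.Series(range(time.size), index=time)

start_DT = '2021-09-01T00:00:00'
end_DT = '2022-01-01T00:00:00'

# Label-based slice: includes the sample stamped exactly at end_DT
inclusive = series.loc[start_DT:end_DT]

# Boolean mask: keeps only samples strictly before end_DT
exclusive = series[(series.index >= pd.Timestamp(start_DT)) & (series.index < pd.Timestamp(end_DT))]

print('last sample, label slice: ', inclusive.index[-1])
print('last sample, strict mask: ', exclusive.index[-1])
```

For data sampled every few hours the difference is at most one record, so it is unlikely to matter for the test ranges, but it is useful to know when comparing record counts.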

In [6]:
# Save netcdf file to the ooidata directory in the user's home directory
import os
# from ooi_data_explorations.common import ENCODINGS

# set up the output directory (~/ooidata/m2m/<site>), creating it if it does not exist
out_path = os.path.join(os.path.expanduser('~'), 'ooidata/m2m/', site.lower())
out_path = os.path.abspath(out_path)
os.makedirs(out_path, exist_ok=True)

# save the Sept through Dec 2021 subset of the data to a NetCDF file for further processing
data_file = '-'.join([site, node, sensor]) + '-2021.nc'
data.to_netcdf(os.path.join(out_path, data_file), mode='w', format='NETCDF4', engine='h5netcdf')