This notebook explores the amlr04-20231128 data.

Specifically, we explore differences between the glider-measured depth and the CTD-calculated depth.

In [1]:
import os
import numpy as np
# import pandas as pd
import xarray as xr

from esdglider import gcp, glider, utils

deployment_name = "amlr04-20231128"
mode = "delayed"


# Standard paths and bucket mount
bucket_name = "amlr-gliders-deployments-dev"
deployments_path = f"/home/sam_woodman_noaa_gov/{bucket_name}"
config_path = "/home/sam_woodman_noaa_gov/glider-lab/deployment-configs"

gcp.gcs_mount_bucket(bucket_name, deployments_path, ro=False)
deployment_info = {
    "deploymentyaml": os.path.join(config_path, f"{deployment_name}.yml"), 
    "mode": mode, 
}
paths = glider.get_path_deployment(deployment_info, deployments_path)

dir_ts = paths["tsdir"]
path_raw = os.path.join(dir_ts, f"{deployment_name}-{mode}-raw.nc")
path_sci = os.path.join(dir_ts, f"{deployment_name}-{mode}-sci.nc")
path_eng = os.path.join(dir_ts, f"{deployment_name}-{mode}-eng.nc")

In [2]:
ds_raw = xr.load_dataset(path_raw)
df_raw = ds_raw.to_pandas()
# display(ds_raw)

# ds_eng = xr.load_dataset(path_eng)
# df_eng = ds_eng.to_pandas()
# display(ds_eng)

# ds_sci = xr.load_dataset(path_sci)
# df_sci = ds_sci.to_pandas()
# display(ds_sci)
ds_raw

## Depth

In [3]:
ds_depth = utils.check_depth(ds_raw)
ds_depth

The max absolute difference between the glider measured depth and depth calculated from the CTD is greater than 5m
count    2.380670e+06
mean     2.517927e+00
std      2.080823e+00
min      0.000000e+00
25%      7.586188e-01
50%      2.021850e+00
75%      3.887984e+00
max      4.749512e+01
dtype: float64
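
The internals of `utils.check_depth` are not shown in this notebook. Judging only from the output column names (`depth_measured_interp`, `depth_diff`, `depth_diff_abs`), a comparison of this kind could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the esdglider implementation: the helper name `depth_diff_sketch`, the time-weighted interpolation, and the sign convention of `depth_diff` are all hypothetical.

```python
import numpy as np
import pandas as pd


def depth_diff_sketch(df):
    """Hypothetical stand-in for a measured-vs-CTD depth comparison."""
    out = pd.DataFrame(index=df.index)
    out["depth_measured"] = df["depth"]
    out["depth_ctd"] = df["depth_ctd"]
    # Fill the measured depth onto every timestamp, weighting by elapsed time
    out["depth_measured_interp"] = df["depth"].interpolate(method="time")
    # Assumed sign convention: interpolated measured depth minus CTD depth
    out["depth_diff"] = out["depth_measured_interp"] - out["depth_ctd"]
    out["depth_diff_abs"] = out["depth_diff"].abs()
    return out


# Toy data: measured depth and CTD depth are recorded at different timestamps
times = pd.to_datetime(
    ["2024-01-01 00:00:00", "2024-01-01 00:00:02", "2024-01-01 00:00:04"]
)
toy = pd.DataFrame(
    {"depth": [10.0, np.nan, 14.0], "depth_ctd": [np.nan, 11.5, np.nan]},
    index=times,
)
toy_depth = depth_diff_sketch(toy)
print(toy_depth)
```

On the toy data, the difference is only defined at the one timestamp that has a CTD depth; calling `toy_depth["depth_diff_abs"].describe()` would then give a summary of the same shape as the one printed above.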


In [4]:
df_depth = ds_depth.to_pandas()
df_depth.sort_values(by="depth_diff_abs", ascending=False)

time,depth_measured,depth_ctd,depth_measured_interp,depth_diff,depth_diff_abs
2024-01-12 23:29:43.481414912,,119.044063,166.539181,47.495118,47.495118
2023-12-25 20:07:53.574768128,,8.189968,44.399474,36.209505,36.209505
2023-11-29 06:10:54.020294144,,14.389169,44.350166,29.960997,29.960997
2023-11-29 15:33:10.920379648,,14.022768,42.623433,28.600665,28.600665
2023-11-30 02:49:44.635254016,,11.339107,37.819621,26.480515,26.480515
...,...,...,...,...,...
2024-01-27 12:13:54.840911872,0.019411,,0.019411,,
2024-01-27 12:14:01.140991232,0.000000,,0.000000,,
2024-01-27 12:14:05.608306944,0.241251,,0.241251,,
2024-01-27 12:14:11.806548992,0.019411,,0.019411,,


We can see some pretty big depth differences, including many depth differences of ~10 m at depths of 800+ m. These likely indicate an issue (or at least a difference) with the glider depth sensor or the CTD pressure sensor, as was observed in the amlr03-20231128 deployment.

However, upon inspection, most of these big depth differences are not actually differences between m_depth and sci_water_pressure. Instead, they occur because the glider science computer appears to have errored during a dive and needed to reboot. After talking with Tony, the likely cause in most cases was the glidercam.

When the computer reboots during a dive, its first point carries all of the science values from the last measured point; these values don't reset until the second point recorded by the glider. For instance (open the below dataset in Dataviewer):

In [5]:
ds = utils.data_var_reorder(ds_raw, ["depth", "depth_ctd"])
# Window around one reboot event on 2023-12-25
dt = "2023-12-25"
ds_sub = ds.sel(time=slice(f"{dt} 20:05", f"{dt} 20:08:20"))
ds_sub
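
To make the carry-over pattern concrete, here is a toy illustration with made-up numbers (not data from this deployment). The science record stops while the glider keeps diving, and the first science record after the reboot repeats the last pre-reboot value. The exact-equality check and the one-minute gap threshold are assumptions chosen for illustration only; they are not the criteria used in the processing script.

```python
import numpy as np
import pandas as pd

# Toy dive: the glider depth keeps increasing, but the science record
# has a gap (reboot) and then repeats the last pre-reboot CTD depth once.
times = pd.date_range("2023-12-25 20:05:00", periods=6, freq="30s")
toy = pd.DataFrame(
    {
        "depth": [5.0, 8.0, 20.0, 32.0, 44.0, 47.0],
        "depth_ctd": [5.1, 8.2, np.nan, np.nan, 8.2, 46.8],
    },
    index=times,
)

# Keep only the timestamps that have a science (CTD) value
sci = toy["depth_ctd"].dropna()
# Elapsed time between consecutive science records
gap = sci.index.to_series().diff()
# Assumed heuristic: an unchanged value following a long gap is stale
stale = (sci.diff() == 0) & (gap > pd.Timedelta("1min"))
stale_times = sci.index[stale.to_numpy()]
print(stale_times)
```

In this toy case the stale record sits at 44 m of measured depth while still reporting a CTD depth of 8.2 m, which is the kind of mismatch seen at the top of the sorted table.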

Thus, in the processing script we decided to remove all of these instances. The following bounds were determined after lots of investigation. They may not catch instances where this happened and the glider traveled less than 6 meters. However, using a depth difference threshold of less than 6 m led to false positives, and we are confident this captures the major offenders.

In [10]:
df_curr = df_depth[(df_depth.depth_diff > 6) & (df_depth.depth_ctd < 300)]
df_curr

time,depth_measured,depth_ctd,depth_measured_interp,depth_diff,depth_diff_abs
2023-11-28 22:10:49.251434240,,11.814444,26.533388,14.718945,14.718945
2023-11-29 06:10:54.020294144,,14.389169,44.350166,29.960997,29.960997
2023-11-29 15:33:10.920379648,,14.022768,42.623433,28.600665,28.600665
2023-11-29 23:54:31.435516416,,12.646281,25.143453,12.497171,12.497171
2023-11-30 02:49:44.635254016,,11.339107,37.819621,26.480515,26.480515
...,...,...,...,...,...
2024-01-12 09:53:31.151947008,,57.649596,42.670502,14.979095,14.979095
2024-01-12 15:47:25.311431936,,295.215734,285.858462,9.357272,9.357272
2024-01-12 16:05:05.044250368,,142.322021,136.320477,6.001544,6.001544
2024-01-12 19:16:56.936248832,,296.323329,287.160093,9.163236,9.163236
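
The processing script itself is not shown here. As a rough sketch of how the same bounds could be applied and the flagged rows removed, the example below uses a small made-up table in place of `df_depth`. The names `toy_depth`, `bad_times`, and `toy_clean` are hypothetical, and dropping rows with `DataFrame.drop` is an assumed approach rather than a description of the actual script.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df_depth with made-up values
times = pd.to_datetime(
    [
        "2023-11-29 06:10:54",
        "2023-11-29 06:20:00",
        "2023-11-29 07:30:00",
        "2023-11-29 08:00:00",
    ]
)
toy_depth = pd.DataFrame(
    {
        "depth_ctd": [14.4, 120.0, 850.0, 50.0],
        "depth_diff": [29.9, 2.0, 10.0, np.nan],
    },
    index=times,
)

# Same bounds as the cell above: a large difference at shallow CTD depth.
# The ~10 m offsets at 800+ m are deliberately left alone by depth_ctd < 300.
mask = (toy_depth["depth_diff"] > 6) & (toy_depth["depth_ctd"] < 300)
bad_times = toy_depth.index[mask.to_numpy()]

# ISO strings are convenient to paste into another script
bad_time_strings = [t.isoformat() for t in bad_times]

# Remove the flagged rows
toy_clean = toy_depth.drop(index=bad_times)
print(bad_time_strings)
print(toy_clean)
```

Rows with a missing `depth_diff` compare as False and are therefore kept, so only timestamps that have both a CTD depth and an interpolated measured depth can be flagged.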


In the next cell we print the offending timestamps in a way that makes it easy to copy them into the amlr04-20231128-delayed processing script.

In [14]:
df_curr.index.values

array(['2023-11-28T22:10:49.251434240', '2023-11-29T06:10:54.020294144',
       '2023-11-29T15:33:10.920379648', '2023-11-29T23:54:31.435516416',
       '2023-11-30T02:49:44.635254016', '2023-11-30T12:26:01.345428480',
       '2023-11-30T13:57:39.851592960', '2023-11-30T17:13:05.463623168',
       '2023-11-30T22:02:17.223785472', '2023-12-01T03:16:39.983062784',
       '2023-12-01T09:13:32.987426816', '2023-12-01T19:01:31.347961344',
       '2023-12-01T23:08:49.339874304', '2023-12-02T06:36:10.718902528',
       '2023-12-02T14:17:18.784057600', '2023-12-02T16:16:14.320037888',
       '2023-12-02T17:47:19.856201216', '2023-12-03T05:34:44.405487104',
       '2023-12-03T16:05:46.832885760', '2023-12-03T17:55:15.642852864',
       '2023-12-03T19:14:24.122589184', '2023-12-04T09:00:58.429901056',
       '2023-12-04T12:44:58.327667200', '2023-12-05T02:49:18.434295552',
       '2023-12-05T13:11:56.089538560', '2023-12-05T18:05:32.308868352',
       '2023-12-05T23:48:32.127563520', '2023-12-07