# Download Data

### Purpose
This Jupyter notebook highlights various methods for accessing and downloading data from the Ocean Observatories Initiative (OOI).

In [1]:
# Import libraries
import os, shutil, sys, time, re, requests, csv, datetime, pytz
import yaml
import pandas as pd
import numpy as np
import netCDF4 as nc
import xarray as xr
import warnings
warnings.filterwarnings("ignore")

In [2]:
# Import the OOINet M2M tool
sys.path.append("/home/andrew/Documents/OOI-CGSN/ooinet/ooinet/")
from m2m import M2M

In [3]:
sys.path.append("..")
from utils import *

In [4]:
import matplotlib.pyplot as plt
%matplotlib inline

#### Set OOINet API access
To access and download data from OOINet, you need an OOINet API username and access token. Both can be found on your profile after logging in to OOINet. For security, your username and access token should NOT be stored in this notebook or Python script. Instead, store them in a YAML file named user_info.yaml, which the cell below loads from the parent directory (`../user_info.yaml`) using the keys `apiname` and `apikey`.

In [5]:
# Use a context manager so the file handle is closed after loading
with open("../user_info.yaml") as f:
    userinfo = yaml.safe_load(f)
username = userinfo["apiname"]
token = userinfo["apikey"]
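
A missing or empty key in the YAML file otherwise only surfaces later as a failed request. Below is a minimal sketch of a sanity check on the loaded credentials, assuming the `apiname` and `apikey` keys used above. The helper name `check_userinfo` is hypothetical and not part of OOINet.

```python
def check_userinfo(userinfo):
    """Return (username, token), or raise if a required key is missing or empty."""
    required = ("apiname", "apikey")  # keys read by the cell above
    if not isinstance(userinfo, dict):
        raise TypeError("user_info.yaml should contain a mapping of keys to values")
    missing = [key for key in required if not userinfo.get(key)]
    if missing:
        raise KeyError(f"user_info.yaml is missing: {', '.join(missing)}")
    return userinfo["apiname"], userinfo["apikey"]


# Placeholder values for illustration only; never commit real credentials
example_userinfo = {"apiname": "EXAMPLE-NAME", "apikey": "EXAMPLE-TOKEN"}
checked_name, checked_token = check_userinfo(example_userinfo)
```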

#### Connect to OOINet

In [6]:
OOINet = M2M(username, token)

---
## Datasets
The ```Download_Data``` notebook should be run first. If all the datasets for a given instrument have already been identified, simply load the identified data streams from local storage:

gold_copy = 'http://thredds.dataexplorer.oceanobservatories.org/thredds/catalog/ooigoldcopy/public/'

In [None]:
instruments = OOINet.search_datasets(array="GI03FLMA", instrument="PHSEN")

In [None]:
instruments

In [None]:
instruments.to_csv("../data/PCO2W_instruments.csv")

In [None]:
OOINet.URLS

In [7]:
refdes = "GI01SUMO-SBD12-04-PCO2AA000"

---
## Metadata 
The metadata contains the following key pieces of information for each reference designator: **method**, **stream**, **particleKey**, and **count**. The method and stream are necessary for identifying and loading the relevant dataset. The particleKey tells us which data variables in the dataset we should calculate the QARTOD parameters for. The count tells us which dataset (recovered instrument, recovered host, or telemetered) contains the most data and likely has the best record for calculating the QARTOD tables.

In [8]:
metadata = OOINet.get_metadata(refdes)
metadata

Unnamed: 0,pdId,particleKey,type,shape,units,fillValue,stream,unsigned,method,count,beginTime,endTime,refdes
0,PD7,time,DOUBLE,SCALAR,seconds since 1900-01-01,-9999999,pco2a_a_dcl_instrument_air,False,telemetered,208968,2014-09-10T19:20:34.719Z,2021-11-02T17:02:21.215Z,GI01SUMO-SBD12-04-PCO2AA000
1,PD10,port_timestamp,DOUBLE,SCALAR,seconds since 1900-01-01,-9999999,pco2a_a_dcl_instrument_air,False,telemetered,208968,2014-09-10T19:20:34.719Z,2021-11-02T17:02:21.215Z,GI01SUMO-SBD12-04-PCO2AA000
2,PD11,driver_timestamp,DOUBLE,SCALAR,seconds since 1900-01-01,-9999999,pco2a_a_dcl_instrument_air,False,telemetered,208968,2014-09-10T19:20:34.719Z,2021-11-02T17:02:21.215Z,GI01SUMO-SBD12-04-PCO2AA000
3,PD12,internal_timestamp,DOUBLE,SCALAR,seconds since 1900-01-01,-9999999,pco2a_a_dcl_instrument_air,False,telemetered,208968,2014-09-10T19:20:34.719Z,2021-11-02T17:02:21.215Z,GI01SUMO-SBD12-04-PCO2AA000
4,PD16,preferred_timestamp,STRING,SCALAR,1,empty,pco2a_a_dcl_instrument_air,False,telemetered,208968,2014-09-10T19:20:34.719Z,2021-11-02T17:02:21.215Z,GI01SUMO-SBD12-04-PCO2AA000
...,...,...,...,...,...,...,...,...,...,...,...,...,...
75,PD1000,irga_detector_temperature,FLOAT,SCALAR,ºC,-9999999,pco2a_a_dcl_instrument_water_recovered,False,recovered_host,204871,2014-09-10T19:20:22.741Z,2021-08-19T05:59:29.675Z,GI01SUMO-SBD12-04-PCO2AA000
76,PD1001,irga_source_temperature,FLOAT,SCALAR,ºC,-9999999,pco2a_a_dcl_instrument_water_recovered,False,recovered_host,204871,2014-09-10T19:20:22.741Z,2021-08-19T05:59:29.675Z,GI01SUMO-SBD12-04-PCO2AA000
77,PD1003,partial_pressure_co2_ssw,FLOAT,FUNCTION,µatm,-9999999,pco2a_a_dcl_instrument_water_recovered,False,recovered_host,204871,2014-09-10T19:20:22.741Z,2021-08-19T05:59:29.675Z,GI01SUMO-SBD12-04-PCO2AA000
78,PD2605,dcl_controller_timestamp,STRING,SCALAR,1,empty,pco2a_a_dcl_instrument_water_recovered,False,recovered_host,204871,2014-09-10T19:20:22.741Z,2021-08-19T05:59:29.675Z,GI01SUMO-SBD12-04-PCO2AA000


#### Sensor Parameters
Each instrument returns multiple parameters containing a variety of low-level instrument output and metadata. However, we are interested in science-relevant parameters for calculating the relevant QARTOD test limits. We can identify the science parameters based on the preload database, which designates the science parameters with a "data level" of L1 or L2. 

Consequently, we go through several steps to identify the relevant parameters. First, we query the preload database with the relevant metadata for a reference designator. Then, we filter the metadata for the science-relevant data streams.

In [9]:
data_levels = OOINet.get_parameter_data_levels(metadata)
data_levels

{'PD10': None,
 'PD1000': None,
 'PD1001': None,
 'PD1002': 1,
 'PD1003': 1,
 'PD11': None,
 'PD12': None,
 'PD16': None,
 'PD2605': None,
 'PD2805': 2,
 'PD7': None,
 'PD842': None,
 'PD863': None,
 'PD93': None,
 'PD992': None,
 'PD993': None,
 'PD994': 0,
 'PD995': 0,
 'PD996': None,
 'PD997': None,
 'PD998': None,
 'PD999': 0}

Filter the metadata based on the data levels for **L1** & **L2** data

In [10]:
def filter_parameter_ids(pdId, pid_dict):
    # Keep only parameters with a data level of L1 or L2 (level > 0)
    data_level = pid_dict.get(pdId)
    return data_level is not None and data_level > 0

In [11]:
mask = metadata["pdId"].apply(lambda x: filter_parameter_ids(x, data_levels))
metadata = metadata[mask]
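
As a quick check of the data-level filter, here is a sketch in plain Python using a subset of the data levels printed above. The helper name `is_science_parameter` is hypothetical; it applies the same rule (a level that is set and greater than 0).

```python
# Subset of the data levels printed above (None means no data level assigned)
sample_levels = {"PD7": None, "PD994": 0, "PD1002": 1, "PD1003": 1, "PD2805": 2}


def is_science_parameter(pd_id, levels):
    """True when the parameter has a data level greater than 0 (L1 or L2 here)."""
    level = levels.get(pd_id)
    return level is not None and level > 0


science_ids = sorted(
    pd_id for pd_id in sample_levels if is_science_parameter(pd_id, sample_levels)
)
print(science_ids)  # ['PD1002', 'PD1003', 'PD2805']
```

Parameter IDs that are absent from the dictionary are treated the same as those with no data level and are dropped.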

Group by reference designator, method, and stream to get the unique values for each data stream

In [12]:
metadata = metadata.groupby(by=["refdes","method","stream"]).agg(lambda x: pd.unique(x.values.ravel()).tolist())
metadata = metadata.reset_index()
metadata = metadata.applymap(lambda x: x[0] if len(x) == 1 else x)
metadata.head()

Unnamed: 0,refdes,method,stream,pdId,particleKey,type,shape,units,fillValue,unsigned,count,beginTime,endTime
0,GI01SUMO-SBD12-04-PCO2AA000,recovered_host,pco2a_a_dcl_instrument_air_recovered,"[PD1002, PD2805]","[partial_pressure_co2_atm, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,204868,2014-09-10T19:20:34.719Z,2021-08-19T06:01:42.499Z
1,GI01SUMO-SBD12-04-PCO2AA000,recovered_host,pco2a_a_dcl_instrument_water_recovered,"[PD1003, PD2805]","[partial_pressure_co2_ssw, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,204871,2014-09-10T19:20:22.741Z,2021-08-19T05:59:29.675Z
2,GI01SUMO-SBD12-04-PCO2AA000,telemetered,pco2a_a_dcl_instrument_air,"[PD1002, PD2805]","[partial_pressure_co2_atm, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,208968,2014-09-10T19:20:34.719Z,2021-11-02T17:02:21.215Z
3,GI01SUMO-SBD12-04-PCO2AA000,telemetered,pco2a_a_dcl_instrument_water,"[PD1003, PD2805]","[partial_pressure_co2_ssw, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,208850,2014-09-10T19:20:22.741Z,2021-11-02T17:00:08.378Z


This returns all of the methods and streams which have scientific data. For PCO2W and PHSEN we want to drop the entries which have "blank" in them.

In [13]:
mask = metadata["stream"].apply(lambda x: "blank" not in x)
metadata = metadata[mask]
metadata

Unnamed: 0,refdes,method,stream,pdId,particleKey,type,shape,units,fillValue,unsigned,count,beginTime,endTime
0,GI01SUMO-SBD12-04-PCO2AA000,recovered_host,pco2a_a_dcl_instrument_air_recovered,"[PD1002, PD2805]","[partial_pressure_co2_atm, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,204868,2014-09-10T19:20:34.719Z,2021-08-19T06:01:42.499Z
1,GI01SUMO-SBD12-04-PCO2AA000,recovered_host,pco2a_a_dcl_instrument_water_recovered,"[PD1003, PD2805]","[partial_pressure_co2_ssw, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,204871,2014-09-10T19:20:22.741Z,2021-08-19T05:59:29.675Z
2,GI01SUMO-SBD12-04-PCO2AA000,telemetered,pco2a_a_dcl_instrument_air,"[PD1002, PD2805]","[partial_pressure_co2_atm, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,208968,2014-09-10T19:20:34.719Z,2021-11-02T17:02:21.215Z
3,GI01SUMO-SBD12-04-PCO2AA000,telemetered,pco2a_a_dcl_instrument_water,"[PD1003, PD2805]","[partial_pressure_co2_ssw, pco2_co2flux]",FLOAT,FUNCTION,"[µatm, mol m-2 s-1]",-9999999,False,208850,2014-09-10T19:20:22.741Z,2021-11-02T17:00:08.378Z
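
The count column in the table above can be used to pick the record with the most data. A minimal sketch in plain Python, with the method, stream, and count values copied from the table; `best_record` is a hypothetical helper, and restricting the choice to the water streams is an assumption made for this example.

```python
# (method, stream, count) triples copied from the metadata table above
records = [
    ("recovered_host", "pco2a_a_dcl_instrument_air_recovered", 204868),
    ("recovered_host", "pco2a_a_dcl_instrument_water_recovered", 204871),
    ("telemetered", "pco2a_a_dcl_instrument_air", 208968),
    ("telemetered", "pco2a_a_dcl_instrument_water", 208850),
]


def best_record(candidates):
    """Return the (method, stream, count) triple with the largest count."""
    return max(candidates, key=lambda record: record[2])


# Assumption for this example: compare only the water streams
water_records = [record for record in records if "air" not in record[1]]
best_method, best_stream, best_count = best_record(water_records)
print(best_method, best_stream, best_count)
```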


### Download Data
Now, for each available data stream, request the data from the OOINet THREDDS server and download it to a local directory.

In [25]:
for row in metadata.index:
    # Get the method and stream
    method, stream = metadata.loc[row, "method"], metadata.loc[row, "stream"]

    # Skip the air-measurement streams; only the water streams are downloaded
    if "air" in stream:
        continue

    # Get the THREDDS url
    thredds_url = OOINet.get_thredds_url(refdes, method, stream)

    # Get the catalog
    catalog = OOINet.get_thredds_catalog(thredds_url)

    # Remove unwanted datasets from the catalog by building a new list
    # (removing items while iterating over the same list can skip entries)
    catalog = [dataset for dataset in catalog if "blank" not in dataset]

    # Create a directory to save the data
    save_dir = f"../data/{refdes}/{method}/"
    os.makedirs(save_dir, exist_ok=True)

    # Download the files to the save directory
    OOINet.download_netCDF_files(catalog, save_dir)

Waiting for GI01SUMO-SBD12-04-PCO2AA000-recovered_host-pco2a_a_dcl_instrument_water_recovered to process.
Waiting: 100%|████████████████████████████████| 400/400 [01:03<00:00,  6.33it/s]
Downloading files to ../data/GI01SUMO-SBD12-04-PCO2AA000/recovered_host/
[########################################] | 100% Completed | 12.5s
Waiting for GI01SUMO-SBD12-04-PCO2AA000-telemetered-pco2a_a_dcl_instrument_water to process.
Waiting: 100%|████████████████████████████████| 400/400 [01:24<00:00,  4.73it/s]
Downloading files to ../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/
[########################################] | 100% Completed | 15.4s
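
When filtering catalog entries, note that removing items from a list while iterating over that same list can skip elements. A minimal sketch with hypothetical file names shows the difference between removing in place and building a new list:

```python
# Hypothetical catalog entries; two adjacent names contain "blank"
example_catalog = ["a_blank_1.nc", "a_blank_2.nc", "a_water_1.nc"]

# Removing during iteration: the loop skips the element that slides into
# the slot just vacated, so the second "blank" entry survives
mutated = list(example_catalog)
for name in mutated:
    if "blank" in name:
        mutated.remove(name)

# Building a new list keeps every non-blank entry and nothing else
filtered = [name for name in example_catalog if "blank" not in name]

print(mutated)   # ['a_blank_2.nc', 'a_water_1.nc']
print(filtered)  # ['a_water_1.nc']
```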


### Try merging datasets between the telemetered, recovered_host, and recovered_inst

In [26]:
refdes = "GI01SUMO-SBD12-04-PCO2AA000"

In [34]:
# Telemetered data sets
telemetered_files = os.listdir(f"../data/{refdes}/telemetered")
telemetered_files = sorted([f"../data/{refdes}/telemetered/" + f for f in telemetered_files if "metbk" not in f])
telemetered_files

['../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/deployment0001_GI01SUMO-SBD12-04-PCO2AA000-telemetered-pco2a_a_dcl_instrument_water_20140910T192022.741000-20140918T142036.980000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/deployment0001_GI01SUMO-SBD12-04-PCO2AA000-telemetered-pco2a_a_dcl_instrument_water_20150624T152540.021000-20150624T152545.665000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/deployment0002_GI01SUMO-SBD12-04-PCO2AA000-telemetered-pco2a_a_dcl_instrument_water_20150815T193035.605000-20150906T232950.444000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/deployment0002_GI01SUMO-SBD12-04-PCO2AA000-telemetered-pco2a_a_dcl_instrument_water_20150907T002945.095000-20151004T232912.626000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/deployment0002_GI01SUMO-SBD12-04-PCO2AA000-telemetered-pco2a_a_dcl_instrument_water_20151005T002906.578000-20151101T232833.858000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/deployment0002_GI01SUMO-S
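
Judging from the listing above, each file name appears to encode the deployment number and the start and end time of the data it holds. Below is a sketch of pulling those fields out, assuming that naming pattern holds for every file. `parse_filename` is a hypothetical helper, not part of OOINet.

```python
import re
from datetime import datetime

# Assumed pattern: deployment<NNNN>_<...>_<start>-<end>.nc
PATTERN = re.compile(r"deployment(\d+)_.*_(\d{8}T\d{6}\.\d+)-(\d{8}T\d{6}\.\d+)\.nc$")


def parse_filename(path):
    """Return (deployment number, start time, end time) parsed from a file name."""
    match = PATTERN.search(path)
    if match is None:
        raise ValueError(f"Unrecognized file name: {path}")
    fmt = "%Y%m%dT%H%M%S.%f"
    return (
        int(match.group(1)),
        datetime.strptime(match.group(2), fmt),
        datetime.strptime(match.group(3), fmt),
    )


# First file from the listing above
example_path = (
    "../data/GI01SUMO-SBD12-04-PCO2AA000/telemetered/"
    "deployment0001_GI01SUMO-SBD12-04-PCO2AA000-telemetered-"
    "pco2a_a_dcl_instrument_water_20140910T192022.741000-20140918T142036.980000.nc"
)
deployment, start, end = parse_filename(example_path)
print(deployment, start, end)
```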

In [35]:
recovered_host_files = os.listdir(f"../data/{refdes}/recovered_host")
recovered_host_files = sorted([f"../data/{refdes}/recovered_host/" + f for f in recovered_host_files if "metbk" not in f])
recovered_host_files

['../data/GI01SUMO-SBD12-04-PCO2AA000/recovered_host/deployment0001_GI01SUMO-SBD12-04-PCO2AA000-recovered_host-pco2a_a_dcl_instrument_water_recovered_20140910T192022.741000-20140924T152037.978000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/recovered_host/deployment0002_GI01SUMO-SBD12-04-PCO2AA000-recovered_host-pco2a_a_dcl_instrument_water_recovered_20150815T193035.605000-20150906T232950.444000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/recovered_host/deployment0002_GI01SUMO-SBD12-04-PCO2AA000-recovered_host-pco2a_a_dcl_instrument_water_recovered_20150907T002945.095000-20151004T232912.626000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/recovered_host/deployment0002_GI01SUMO-SBD12-04-PCO2AA000-recovered_host-pco2a_a_dcl_instrument_water_recovered_20151005T002906.578000-20151101T232833.858000.nc',
 '../data/GI01SUMO-SBD12-04-PCO2AA000/recovered_host/deployment0002_GI01SUMO-SBD12-04-PCO2AA000-recovered_host-pco2a_a_dcl_instrument_water_recovered_20151102T002828.498000-20151129T232756.992000.

In [36]:
recovered_inst_files = os.listdir(f"../data/{refdes}/recovered_inst")
recovered_inst_files = sorted([f"../data/{refdes}/recovered_inst/" + f for f in recovered_inst_files if "metbk" not in f])
recovered_inst_files

FileNotFoundError: [Errno 2] No such file or directory: '../data/GI01SUMO-SBD12-04-PCO2AA000/recovered_inst'
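
Not every reference designator has all three delivery methods, so a directory such as `recovered_inst` may not exist. One way to handle this is a small listing helper that returns an empty list for a missing directory. This is a sketch: `list_netcdf_files` is a hypothetical helper, and the `.nc` suffix check is an added assumption that the original cells do not make.

```python
import os
import tempfile


def list_netcdf_files(directory, exclude="metbk"):
    """Return sorted .nc paths in directory, or [] if the directory is absent."""
    if not os.path.isdir(directory):
        return []
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".nc") and exclude not in name
    )


# Demonstration in a throwaway directory rather than ../data
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.nc", "a.nc", "a_metbk.nc", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(path) for path in list_netcdf_files(tmp)]
    missing = list_netcdf_files(os.path.join(tmp, "recovered_inst"))

print(found)    # ['a.nc', 'b.nc']
print(missing)  # []
```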

In [37]:
from dask.diagnostics import ProgressBar

In [38]:
def open_datasets(datasets, refdes):
    
    OOINet.REFDES = refdes
    
    # Check the files and remove any which are malformed
    netCDF_files = OOINet._check_files(datasets)
    
    # Load the datasets into a concatenated xarray DataSet
    with ProgressBar():
        print("\n"+f"Loading netCDF_files for {OOINet.REFDES}:")
        ds = xr.open_mfdataset(netCDF_files, preprocess=OOINet._preprocess, parallel=True)
        
    # Add in the English name of the dataset
    refdes = "-".join(ds.attrs["id"].split("-")[:4])
    vocab = OOINet.get_vocab(refdes)
    ds.attrs["Location_name"] = " ".join((vocab["tocL1"].iloc[0],
                                          vocab["tocL2"].iloc[0],
                                          vocab["tocL3"].iloc[0]))    

    return ds
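
The reference designator is recovered inside `open_datasets` by keeping the first four hyphen-separated fields of the dataset `id` attribute. A small sketch of that string operation, using a hypothetical `id` value modeled on the file names listed earlier (the real attribute value may differ):

```python
def refdes_from_id(dataset_id):
    """Join the first four hyphen-separated fields of a dataset id."""
    return "-".join(dataset_id.split("-")[:4])


# Hypothetical id string modeled on the file names listed earlier
example_id = "GI01SUMO-SBD12-04-PCO2AA000-telemetered-pco2a_a_dcl_instrument_water"
print(refdes_from_id(example_id))  # GI01SUMO-SBD12-04-PCO2AA000
```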

In [39]:
tele_data = open_datasets(telemetered_files, refdes)
host_data = open_datasets(recovered_host_files, refdes)
#inst_data = open_datasets(recovered_inst_files, refdes)

Checking and removing bad files: 
[########################################] | 100% Completed |  1.2s

Loading netCDF_files for GI01SUMO-SBD12-04-PCO2AA000:
[########################################] | 100% Completed |  6.4s
Checking and removing bad files: 
[########################################] | 100% Completed |  1.7s

Loading netCDF_files for GI01SUMO-SBD12-04-PCO2AA000:
[########################################] | 100% Completed |  6.1s


In [None]:
#tele_data = phsen_instrument(tele_data)
#host_data = phsen_instrument(host_data)
#inst_data = phsen_instrument(inst_data)

In [None]:
def combine_datasets(tele_data, host_data, inst_data):
    """Combine the telemetered, recovered_host, and recovered_inst datasets."""

In [40]:
# Need to make sure each dataset has the same variables
for var in tele_data.variables:
    if var not in host_data.variables:
        host_data[var] = tele_data[var].broadcast_like(host_data["time"])
        
for var in host_data.variables:
    if var not in tele_data.variables:
        tele_data[var] = host_data[var].broadcast_like(tele_data["time"])

In [41]:
# Merge the telemetered dataset and host_dataset
tele_host = tele_data.combine_first(host_data)
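
To make the precedence in `combine_first` concrete, here is a minimal sketch on toy data. The variable name `pco2` and the integer time values are made up for illustration. Values from the calling dataset are kept wherever they are present; the argument only fills missing values and time stamps the caller lacks.

```python
import numpy as np
import xarray as xr

# Toy stand-ins for the telemetered and recovered_host datasets
tele = xr.Dataset({"pco2": ("time", [1.0, np.nan, 3.0])}, coords={"time": [0, 1, 2]})
host = xr.Dataset({"pco2": ("time", [10.0, 20.0, 40.0])}, coords={"time": [0, 1, 3]})

# Values from tele win; host fills the NaN and the time stamp tele lacks
merged = tele.combine_first(host)

print(merged["time"].values.tolist())  # [0, 1, 2, 3]
print(merged["pco2"].values.tolist())  # [1.0, 20.0, 3.0, 40.0]
```

Swapping the call to `host.combine_first(tele)` would instead give the recovered_host values priority.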

In [None]:
for var in tele_host.variables:
    if var not in inst_data.variables:
        inst_data[var] = tele_host[var].broadcast_like(inst_data["time"])

for var in inst_data.variables:
    if var not in tele_host.variables:
        tele_host[var] = inst_data[var].broadcast_like(tele_host["time"])
        
# Concatenate
data = inst_data.combine_first(tele_host)

In [42]:
data = tele_host
data

[Output omitted: rendered xarray Dataset. Each data variable is a Dask array of shape (234083,) stored in 57 chunks with chunk shape (6050,); dtypes are float32, float64, or object.]

In [59]:
data = data.drop_vars(["dcl_controller_timestamp", "date_time_string", "internal_timestamp"])

In [63]:
data.to_netcdf(f"../data/{refdes}_combined.nc", engine="h5netcdf")

In [61]:
data.close()

In [None]:
os.listdir("../data/")

In [None]:
# Download the annotations for each reference designator
refdes = "GI01SUMO-RID16-05-PCO2WB000"
annotations = OOINet.get_annotations(refdes)
annotations

In [None]:
annotations.to_csv(f"../data/{refdes}_annotations.csv")