## A more detailed overview of this notebook

This notebook began as a comparison of two chlorophyll records: profiler measurements
made near the surface nine times per day, and surface observations made by the
MODIS satellite once every eight days. Its scope has expanded considerably from there.


One such expansion is to consider other sources of data. For example, we have
a snapshot of the global ocean called GLODAP. After inspecting it on a global
scale we turn to a comparison of vertical profiles through the water column, 
specifically of salinity and temperature. We want to compare GLODAP profiles, as somewhat 
*static* snapshots, with ongoing active profile measurements from ARGO drifters.


The Regional Cabled Array (RCA)
is an observatory stretching across the sea floor from the coast of Oregon 500 km out to
Axial Seamount. This observatory includes two types of profilers that rise and fall through
the water column: deep profilers that ascend from the sea floor, and shallow profilers 
that rest on platforms at 200 meters depth and ascend to within a few meters of the surface.


We begin the RCA work focused on the shallow profiler as this is where the highest
concentration of chlorophyll is found.


Some vocabulary used in the rest of this notebook:

* Regional Cabled Array (RCA): A cabled observatory on the sea floor off the coast of Oregon
* Site: A location in the RCA
* Platform: A mechanical structure -- static or mobile -- that resides at a site.
* Instrument: An electronic device carrying one or more sensors
* Sensor: A device that measures some aspect of the ocean like pH or temperature
* Stream: A stream of data produced by a sensor as part of an instrument located on a platform at a site in the RCA


This notebook describes a Python package called **yodapy** used to obtain stream data.


Here we use the traditional data path model:


* search for data
* order data
* download data
* analyze data


We prefer a newer approach where data are already in place on the public cloud and the model reduces to:

* analyze data


Since that is our end goal, some of the data for this project will be set
in place in advance (not yet done as of 3-20). 


Back to our process here: Once the data are in place we say that **yodapy** has finished its task.
We then turn to analysis using Python and particularly **XArray**. 

## Purpose of the **yodapy** Python package

`yodapy` is a contraction of **Y**our **O**cean **DA**ta **PY**thon library. It was written 
by Don Setiawan to facilitate working with **OOI** data in general (not just profiler data).


Before `yodapy` was written, the process of finding and ordering data for OOI 
was a bit *involved*. `yodapy` was developed to make this process more 
*programmable* and to provide search capability without requiring precise terms like 
`RS01SBPS-SF01A-3D-SPKIRA101-streamed-spkir_data_record`.  



This notebook uses `yodapy` to search for, identify, order and download data, all in Python. 
A future extension of `yodapy` will make this process even simpler, referencing data that
are already in place on the public cloud. Rather than order and download data you simply
start working with it using the `XArray` idiom. 

<BR>

> **Takeaway 1: This notebook reviews `yodapy` specific to Regional Cabled Array (RCA) 
data but the pattern of use is relevant to data from other OOI segments.**

<BR>

> **Takeaway 2: A heads-up on authenticating: The OOI system requires you to *authenticate* your identity.
You do this by registering your email at their website. Registration is unrestricted and free,
and it takes only a couple of minutes. `yodapy` helps you manage your resulting
credentials so once this is set up you are authenticated automatically.**

## Notebook features

Come back and re-write this (and never index your own book)

### Section on GLODAP and ARGO

### Regional Cabled Array and MODIS

- OOI / RCA data orders with `yodapy`
- working with `xarray` `DataArrays` and `Datasets` 
- plotting with `matplotlib`
  - line and scatter plots, multiple y-axes, labels, marker type and size
  - profiler curtain plots: time - depth - chlorophyll (as color) 
  - animation of time series data
  - interactivity
  - color bars
  - intrinsic plotting from DataArrays (with modifiers)

### Ordering and pulling ARGO data from the Coriolis system

## Data management

This Jupyter notebook resides in 
a sub-directory of the User's home directory `~`. It is bundled 
as an open source 
[GitHub repository](https://github.com/robfatland/chlorophyll)
(abbreviated 'repo') managed with the `git` utility. 
The repo is not intended for large data volumes.  


The data must reside *elsewhere* in the 
working environment, i.e. not within the repo directory.
I use `~/data/` with sub-directories to organize data content outside
of the `~/chlorophyll` directory. 
Each data source (MODIS, GLODAP, ARGO, RCA, ...) gets a dedicated sub-directory in `~/data`.
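
The layout described above can be set up in a few lines. This is a minimal sketch, assuming lowercase sub-directory names (`modis`, `glodap`, `argo`, `rca`). The helper name `make_data_dirs` and those directory names are illustrative and are not defined elsewhere in this notebook.

```python
from pathlib import Path

# Illustrative names only: the text above says each data source gets
# a dedicated sub-directory, but does not fix the spelling.
DATA_SOURCES = ('modis', 'glodap', 'argo', 'rca')

def make_data_dirs(data_root, sources=DATA_SOURCES):
    """Create data_root/<source>/ for each source and return the paths."""
    root = Path(data_root).expanduser()           # expands a leading '~'
    paths = {}
    for name in sources:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)   # harmless if it already exists
        paths[name] = path
    return paths
```

Calling `make_data_dirs('~/data')` would create the four sub-directories outside the repo directory. Running it a second time changes nothing.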


`xarray` has a wildcard multi-file open utility: `xr.open_mfdataset("Identifier*.nc")`.
This maps multiple NetCDF files to a single Dataset. 


The RCA data are ordered using a less convenient dimension, namely
observation number `obs`. This is just an ordinal integer 1, 2, 3, ...
The code in this notebook modifies this to use dimension `time`.

## Obtain Regional Cabled Array data using `yodapy`

As noted above the `yodapy` library enables Python-based access to OOI data. In this case we will focus
on the Regional Cabled Array (RCA) and particularly on the shallow profiler found at the site 
**Oregon Slope Base**. This site is at the base of the continental slope in about 3000 meters of water.
The shallow profiler rises and falls nine times per day through the upper 200 meters of the water column.


### OOI data access back-story


Ordering data from **OOI** requires you to pre-register (free, using your email address). Registration provides 
the credentials you use when placing a data order. Orders typically take a few minutes for the OOI
servers to assemble, after which you receive an email with a download link. You download the data to local storage,
read the files into memory and proceed from there: a very labor-intensive process.


### How `yodapy` helps


[`yodapy`](http://github.com/cormorack/yodapy) helps you automate OOI data access at each step. 
It sets up a credentials directory within your home directory,
which helps you avoid accidentally pushing your credentials to GitHub where they would be public. `yodapy` 
lets you create a Python object of type `OOI` that includes methods for finding sensor data of interest; 
for ordering time-bounded datasets for those sensors; for downloading these data; and for attaching them to a data 
structure (an `xarray` `Dataset`) for further analysis. Once your data are present as a 
`Dataset`, `yodapy` has completed its job. 


A cell further down installs `yodapy`. Run it each time you start up this notebook server unless your installation
of the `yodapy` library persists. 


### Getting OOI credentials


To get data from OOI you first create a User account as follows:


- Visit the [OOI website](https://ooinet.oceanobservatories.org/#)
- On the login menu (upper right) select **Register**
- Fill out the New User Registration Form
- Once you have your login credentials: Log in
- The 'Login' menu at the upper right should now show your User name; it is also a dropdown menu
  - Use this menu to select User Profile
- At the bottom of your User Profile page you should find **API Username** and **API Token**
  - These two strings make up your authentication credentials
  - Keep them somewhere safe
  - Notice that the **Refresh API Token** button permits you to regenerate them whenever you like


Use your OOI API Token with `yodapy` as described further down to automate your authentication process.
If this works as intended you can safely use OOI and not have to worry about cutting and pasting these
token strings every time you want to get data access.

## Install `yodapy` if needed

In [1]:
# mini-source control: Last copied 29-SEP-2020: to tilt*, chlorophyll*, rca*, argo*
#                      last revised 09-OCT-2020
import os, sys, time, glob

from IPython.display import clear_output             # use inside loop with clear_output(wait = True) followed by print(i)
import warnings                                      # use with warnings.filterwarnings('ignore') or 'once'

home_dir = os.getenv("HOME")
this_dir = home_dir + '/chlorophyll/'
data_dir = '/data/'
data1_dir = '/data1'

from matplotlib import pyplot as plt
from matplotlib import colors as mplcolors
import numpy as np, pandas as pd, xarray as xr
from numpy import datetime64 as dt64, timedelta64 as td64

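def doy(theDatetime):
    """Day of year (1, 2, ..., 365, [366]) for a numpy datetime64."""
    jan1 = dt64(str(theDatetime)[0:4] + '-01-01')
    return 1 + int((theDatetime - jan1) / td64(1, 'D'))

def dt64_from_doy(year, doy):
    """Inverse of doy(): the datetime64 for a given year and day of year."""
    return dt64(str(year) + '-01-01') + td64(doy-1, 'D')

def day_of_month_to_string(d):
    """Two-character, zero-padded day of month."""
    return str(d) if d > 9 else '0' + str(d)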

print('\nJupyter Notebook running Python {}'.format(sys.version_info[0]))


Jupyter Notebook running Python 3
the data directory is /data/ 



In [5]:
# Ensure that the latest build of yodapy is installed directly from GitHub
!pip install git+https://github.com/cormorack/yodapy.git -q     # -q cuts the stdout clutter

# This import verifies that yodapy is installed (it raises ImportError otherwise)
from yodapy.utils.creds import set_credentials_file

## One time only: Configure OOI credentials using `yodapy`

Only the first time through here: Carefully follow the instructions in the Python cell below.
You are (temporarily) telling `yodapy` what your `OOI username` and `token` are. 
`yodapy` creates a hidden sub-directory of your home directory (`~/.yodapy`)
that contains these credentials in a text file. As long as you are not publishing
your home directory someplace public your credentials will be hidden away.


#### 'Why am I doing this *credentials* business?'


When you use `yodapy` to order data from OOI it will use this 'hidden away' copy
of your credentials to convince OOI your order is legitimate.  

In [2]:
# Run the set_credentials_file() line below to create authentication credentials for the OOI data system.
# Do this by **carefully** substituting your actual credentials in the username and token strings
# in that line of code:


if False: 
    
    set_credentials_file(data_source='ooi', username='OOIAPI-XXXXXXXXXXXXXX', token='XXXXXXXXXXXX')


# To run it: Change 'if False:' to 'if True:' and run the cell, just the one line above.
# Once it runs: Change it back to 'if False:' and delete your credentials. You can obscure them with XXXXX as they are seen now.
# After you obscure your credentials: Be sure not to run this code again, as that would replace your
#   stored credentials with the placeholder strings.
#
# You can verify this worked by examining the .credentials file in ~/.yodapy. The credentials should match. Notice that 
#   this (slightly hidden) directory is directly connected to your home directory; whereas this IPython notebook 
#   is presumably in a distinct directory; so there should be no chance of a GitHub push sending your 
#   credentials to GitHub. 
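
The verification step mentioned in the comments above can be scripted. This is a minimal sketch that only checks whether `~/.yodapy/.credentials` exists and is non-empty. It does not read or validate the contents, since the file format is not described here. The helper names are hypothetical.

```python
from pathlib import Path

def yodapy_credentials_path(home=None):
    """Path of the credentials file named in the comments above."""
    base = Path(home) if home is not None else Path.home()
    return base / '.yodapy' / '.credentials'

def credentials_present(home=None):
    """True if the credentials file exists and is not empty."""
    path = yodapy_credentials_path(home)
    return path.is_file() and path.stat().st_size > 0
```

`credentials_present()` with no argument looks in your own home directory. The optional `home` argument is there only so the check can be tried against a scratch directory.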

## Regional Cabled Array data for 2019

* 3 sites: OSB, AXB, OOE for Oregon Slope Base, Axial Base, Oregon Offshore Endurance
* 3 Platforms: Shallow and Deep profilers plus shallow platform (fixed at 200m depth)
* Large collection of instruments, each with one or more sensors
  * CTD + Dissolved Oxygen
  * PAR, Spectral Irradiance, Spectrophotometer (attenuation / absorbance), Fluorometers 
  * Nitrate, pH, pCO2
  * Ocean velocity measurement

## Initialize the `OOI()` object

In [4]:
from yodapy.datasources import OOI

# uncomment this...
# ooi = OOI()

# use this "no-underscore" version of directory to see the primary methods for the OOI() object
# dirnou(ooi)

# uncomment and run this to see all the components or segments of OOI available
# ooi.sites
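
The `dirnou()` helper referred to in the cell above is not defined in this part of the notebook. A plausible one-line version is sketched here, assuming "no-underscore" means "skip names that begin with an underscore". The original helper may differ.

```python
def dirnou(obj):
    """dir() without names that start with an underscore (assumed behavior)."""
    return [name for name in dir(obj) if not name.startswith('_')]
```

Applied to an `OOI()` object this would list its public methods and attributes, such as `search` and `instruments`, without the Python internals.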

In [None]:
# yodapy
# We can explore these methods and attributes further. Note that yodapy has a series of 
# attributes that begin with 'cava_'. 'cava' is shorthand for "cabled array value add", 
#   a project at the University of Washington School of Oceanography supporting cabled array
#   data validation and use in ocean research.
# help(ooi.cava_sites)
ooi.cava_sites

In [None]:
# yodapy
print('\n\n\n')
ooi.cava_parameters

## `ooi.search()` first example

We will begin using `yodapy` proper to narrow down a data search. 


### What resources are available?


Specifically what are the names of sites served by the Regional Cabled Array? 
We begin with a broad search giving only the keyword `region`. 
Then we narrow the search by adding the keywords `site`, `node`, and `instrument` 
to arrive at individual *instruments* or *sensors*. These search results are used to order 
datasets with a specified time range. 


This first example is the broad search. 

In [None]:
# ooi.search(region='endurance')
ooi.search(region='cabled')

In [None]:
# Attribute 'sites' gives broad results as a table of arrays, sites, descriptions, lat/lon: Across all of OOI (62 rows)
ooi.sites    

In [None]:
# Narrow result: Within the Cabled Array region only (118 rows, 6 named columns)
print(ooi.instruments)

## Using `yodapy` to refine OOI searches

The `OOI()` object provided by `yodapy` starts out in a very *broad view* search state. 
It sees the entire OOI project at the level of the observatory sites, by name 
Endurance, Pioneer, Argentine Basin, Cabled Array, Irminger Sea, Station Papa and possibly
others I'm forgetting. 


The `ooi.search()` method uses keywords (`keyword='string-value'`) to narrow this view. 
In this way when the view is narrowed to a single instrument we can use the `ooi.request_data()`
method to order data from that instrument. 

## `ooi.search()` second example and notes on search configuration

We narrow the search using keywords `site`, `node` and `instrument`. 
The `ooi.instruments` result from above provides the vocabulary to use for keyword arguments: 

- `site` keyword is taken from the `site_name` column
  - for example `Oregon Slope Base Seafloor` suggests using `oregon slope base` as the keyword value
- `node` keyword is taken from the `infrastructure_name` column
  - for example `Shallow Profiler (SF01A)` suggests keyword `shallow profiler` (notice these are not case-sensitive)
- `instrument` keyword is taken from the `instrument_name` column
  - for example `3-Wavelength Fluorometer` suggests keyword `fluorometer`
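
To make the narrowing behavior concrete, here is a toy stand-in that runs without `yodapy` or credentials. It is not `yodapy` code and says nothing about how `ooi.search()` is implemented. The rows are illustrative, assembled from names that appear in this notebook, and `toy_search` is a hypothetical helper. It assumes a keyword matches when its value is a case-insensitive substring of the corresponding column.

```python
# Illustrative rows using the three column names listed above.
ROWS = [
    {'site_name': 'Oregon Slope Base Shallow Profiler Mooring',
     'infrastructure_name': 'Shallow Profiler (SF01A)',
     'instrument_name': '3-Wavelength Fluorometer'},
    {'site_name': 'Oregon Slope Base Shallow Profiler Mooring',
     'infrastructure_name': '200m Platform',
     'instrument_name': '2-Wavelength Fluorometer'},
    {'site_name': 'Oregon Slope Base Shallow Profiler Mooring',
     'infrastructure_name': 'Shallow Profiler (SF01A)',
     'instrument_name': 'CTD'},
    {'site_name': 'Axial Base Shallow Profiler Mooring',
     'infrastructure_name': 'Shallow Profiler (SF03A)',
     'instrument_name': '3-Wavelength Fluorometer'},
]

# Which table column each search keyword is compared against.
COLUMN_FOR_KEYWORD = {'site': 'site_name',
                      'node': 'infrastructure_name',
                      'instrument': 'instrument_name'}

def toy_search(rows, **keywords):
    """Keep rows whose columns contain every keyword value (case-insensitive)."""
    result = list(rows)
    for keyword, value in keywords.items():
        column = COLUMN_FOR_KEYWORD[keyword]
        result = [row for row in result if value.lower() in row[column].lower()]
    return result

broad = toy_search(ROWS, site='oregon slope base')
narrow = toy_search(ROWS, site='oregon slope base', node='shallow profiler',
                    instrument='fluorometer')
```

Each added keyword can only shrink the result, which is the same pattern used with `ooi.search()`: add keywords until the table is down to the one instrument you want.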
  


Once the narrow search runs we look at the `ooi.instruments` attribute to see how narrow the results are.
This prints as a table where -- as in example one -- the results are sorted into *one instrument per row*.
This can confirm whether the objective of narrowing the search down to a single instrument was met.


We run the `.data_availability()` method. This gives two outputs: A **table** and below that a 
**time series graphic**.  The table lists each instrument as a separate column. These columns are 
then transposed for the time series graphic: One row of boxes for each instrument. 


***Detail: The green `.data_availability()` chart may fail to render in some cases. Re-running the cell might help.***

In [6]:
# region='cabled' or 'endurance'
# site='slope' or 'slope base deep' or 'oregon offshore' or ... 
# node='platform' or 'shallow profiler' or ...
# instrument='2-Wavelength' or 'ctd' or 'fluorometer' or ...
# ooi.search(region='endurance', site='oregon offshore', node='shallow profiler', instrument='fluorometer')

# ooi.data_availability()

# Taking out the instrument keyword we have: 
# ooi.search(region='cabled', site='slope', node='shallow profiler')
# This produces (with a simple ooi.instruments attribute call) a list of the following:
#   - 3-Wavelength fluorometer (flort: got it for OSB SP 2019)
#   - CTD (ctdpf: got it for OSB SP 2019)
#   - Photosynthetically Available Radiation (parad: got it for OSB SP 2019)
#   - pH (phsen: got it for OSB SP 2019)
#   - Spectral Irradiance (spkir: got)
#   - Spectrophotometer (optaa: got)
#   - NOT YET: Single Point Velocity Meter (velpt: )
#   - Nitrate (nutnr: Got both nutnr_a_sample and nutnr_a_dark_sample)
#   - pCO2 water (two streams: pco2w_a_sami_data_record and pco2w_b (no data past 2018; placed 2018 data

# instrument                  2014    2015    2016    2017    2018    2019    2020
#
# Oregon Slope Base
# SP flort 3-wavelength                                                !
# SP ctdpf                                                             !
# SP parad                                                             !
# SP phsen                                                             !
# SP spkir                                                             !
# SP optaa                                                             !
# SP velpt                                                             !
# SP nutnr_a, nutnr_a_dark                                             !
# SP pco2w_a_sami                                                      !
# SP pco2w_b_sami                                               !      NA
# 200m ctdpf                                                           !
# 200m flort                                                           !
# 200m phsen                                                           !
# 200m do_stable                                                       !
# DP ctdpf wfp                                  !      NA       NA     NA
# DP ctdpf inst                                                        !
# DP acm (VEL3D) inst                                                  !
# DP flcdrdt inst fluorometer                                          !
#
# Axial Base
# SP flort                                                             !
# SP ctdpf                                                             !
# SP parad                                                             !
# SP phsen                                                             !
# SP spkir                          ?
# SP optaa                                                             !
# SP velpt                          ?
# SP nutnr_a, nutnr_a_dark          ?
# SP pco2w_a_sami                                                      !
# SP pco2w_b_sami                   ?
# 200m ctdpf                                                           !
# 200m flort                                                           !
# 200m phsen                                                           !
# 200m do_stable                                                       !
# DP ctdpf wfp                                                         !
# DP ctdpf inst                                                        !
# DP acm (VEL3D) inst               ?
# DP flcdrdt inst CDOM fluorometer                                     !
# DP fl????? inst 2-Wav fluorometer ?
# DP dissolved oxygen                                                  !
# 
# filename anatomy
# deployment0005                            or 0006 etc
#   _RS03AXPS                               site: AX is Axial, SB is slope base
#   -SF03A                                  platform: SF is shallow profiler, DP is deep profiler, PC is 200m platform 
#   -3B                                     number + letter: unknown
#   -OPTAAD301                              6-letter instrument + 'A'/'D' + 30X/10X
#   -streamed                               'streamed' or 'recovered_inst' or 'recovered_wfp'
#   -optaa_sample                           instrument designator, sometimes 'dpc_xxxxx_instrument_recovered'
#   _20191004T073957.414490                 datetime start
#   -20191014T220233.907019                 datetime end
# .nc                                       NetCDF file
#
# run this to see fluorometers available at Oregon Offshore (without using the 'node' keyword)
# 
#   filters endurance + oregon offshore + fluorometer turn up 7 hits...
#     2 are Oregon Offshore Surface Mooring: 3 wavelength... of future interest in expanding the MODIS connection
#     2 are Oregon Offshore deep profiler CDOM fluorometer
#     2 are Oregon Offshore deep profiler 2 wavelength...    of future interest also (not sure if this is on the RCA)
#     1 is Oregon Offshore shallow profiler 3 wavelength     *** Current interest: RCA MODIS connect ***
#
# ooi.search(region='endurance', site='oregon offshore', instrument='fluorometer')
# ooi.instruments

# ooi.data_availability()

# This ooi.search() call: 
# 
# ooi.search(region='cabled', instrument='fluorometer') 
# 
# produces 12 hits. Here is the breakdown; where results suggest site and node search keywords. 
#  Note that Deep Profiler sites have degeneracy in 'recovered_inst' versus 'recovered_wfp' (appear twice)
# 
#     - (4) Axial Base Deep Profiler Mooring (CDOM Fluorometer,  2-Wavelength Fluorometer)
#     - (4) Oregon Slope Base Deep Profiler Mooring (CDOM Fluorometer, 2-Wavelength Fluorometer)
#     - (1) Oregon Slope Base Shallow Profiler Mooring (200m Platform; 2-Wavelength Fluorometer)
#     - (1) Oregon Slope Base Shallow Profiler Mooring (Shallow Profiler; 3-Wavelength Fluorometer)
#     - (1) Axial Base Shallow Profiler Mooring (200m Platform; 2-Wavelength Fluorometer)
#     - (1) Axial Base Shallow Profiler Mooring (Shallow Profiler; 3-Wavelength Fluorometer)

# Resulting searches: Choose one of these...
# ooi.search(region='cabled', site='oregon slope base', node='shallow', instrument='fluorometer')
# ooi.search(region='cabled', site='oregon slope base', node='200m', instrument='fluorometer')
# ooi.search(region='cabled', site='axial base', node='shallow', instrument='fluorometer')
# ooi.search(region='cabled', site='axial base', node='200m', instrument='fluorometer')

# ...and run...
# ooi.data_availability()
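
The "filename anatomy" notes in the cell above can be turned into a small parser, which helps when sorting downloaded files by site, platform or time range. This is a sketch under the assumption that every file name follows that anatomy. The function name `parse_rca_filename` and the field names are hypothetical, and the example name is assembled from the pieces listed in the notes. The two-character field whose meaning the notes list as unknown is simply called `code`.

```python
import re
from datetime import datetime

# Assumption: file names follow the "filename anatomy" notes above.
FILENAME_PATTERN = re.compile(
    r'^deployment(?P<deployment>\d+)'
    r'_(?P<site>[A-Z0-9]+)'            # e.g. RS03AXPS
    r'-(?P<platform>[A-Z0-9]+)'        # e.g. SF03A
    r'-(?P<code>[A-Z0-9]+)'            # e.g. 3B (meaning not documented here)
    r'-(?P<instrument>[A-Z0-9]+)'      # e.g. OPTAAD301
    r'-(?P<delivery>[a-z_]+)'          # streamed, recovered_inst, recovered_wfp
    r'-(?P<stream>[a-z0-9_]+?)'        # e.g. optaa_sample
    r'_(?P<start>\d{8}T\d{6}\.\d+)'
    r'-(?P<end>\d{8}T\d{6}\.\d+)'
    r'\.nc$'
)

def parse_rca_filename(filename):
    """Split a file name (no directory part) into its named fields."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError('unexpected file name: ' + filename)
    fields = match.groupdict()
    fields['deployment'] = int(fields['deployment'])
    for key in ('start', 'end'):
        fields[key] = datetime.strptime(fields[key], '%Y%m%dT%H%M%S.%f')
    return fields

example = ('deployment0005_RS03AXPS-SF03A-3B-OPTAAD301-streamed-optaa_sample'
           '_20191004T073957.414490-20191014T220233.907019.nc')
fields = parse_rca_filename(example)
```

Here `fields['site']` is `'RS03AXPS'` and `fields['stream']` is `'optaa_sample'`. Names that do not fit the pattern raise `ValueError` rather than being guessed at.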

## Final `yodapy` section: Obtain data

Three useful Python/yodapy cells follow. The first is run iteratively to refine a 
search. The second and third are run consecutively to place an order and download it
when it is ready.


> ***Strong suggestion: After the data arrive, move them to an out-of-repo data directory***
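
The move itself can be scripted. This is a minimal sketch with a hypothetical helper, `move_netcdfs`. Where the downloaded files land depends on how `yodapy` was run, so the download directory is left as a parameter rather than assumed.

```python
import shutil
from pathlib import Path

def move_netcdfs(download_dir, dest_dir, pattern='*.nc'):
    """Move files matching pattern from download_dir to dest_dir; return the new paths."""
    dest = Path(dest_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(Path(download_dir).expanduser().glob(pattern)):
        target = dest / src.name
        if target.exists():                    # never overwrite existing data silently
            raise FileExistsError(str(target))
        shutil.move(str(src), str(target))
        moved.append(target)
    return moved
```

For example `move_netcdfs('.', '/data/rca/2019')` would move every NetCDF file in the current directory to the location that the multi-file open further down reads from.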


In [None]:
###################################
# 
# Set up a data order
#
# Use this cell (possibly multiple times) to narrow your search. Then use the following two cells to 
#   order and retrieve this data. Remember that the last thing you used for .search() is stored inside
#   the ooi object as its 'state'. This makes the ordering of the data in the next cell much simpler
#   because you only need to specify the time range. 
#
# What keywords does a Python method take?
#
# Below the ooi.search() method is provided with keywords; but how to discover what these are?
# Answer: Enter ?ooi.search to read the document string.
#
# Instructions for this cell
#
# Notice below there is an ooi.search() method call. This uses a sequence of keywords with values 
#   to narrow the focus of your search. The next line of code 'ooi.instruments' will print the 
#   results of this search as a table. In the simplest case, if your resulting table is exactly one
#   row then you will be ordering data from just that instrument stream. 
# 
# A more advanced approach is to order multiple data streams at once; but this is not described here.
#
# First step: Run the search with only 'region', 'site', and 'node' keywords. Do not include  
#   'instrument' or 'stream' keywords. This gives a results table with multiple instruments (rows).
#   Example: ooi.search(region='cabled', site='slope', node='shallow profiler')
#
# Second step: Refine this search by including 'instrument' or 'stream' keywords. Give corresponding
#   values from the results table to get a new results table with just one row for the instrument 
#   you are interested in. 
#   Example: ooi.search(region='cabled', site='slope', node='shallow profiler', stream='velpt')
#
# At this point the search parameters in the OOI() object can be used for a focused data order.
# Place this order using the subsequent two cells.

# run `?ooi.search` to see available keywords
ooi.search(region='endurance', site='offshore', node='200', stream = 'optode, do_stable, phsen,pco2')
ooi.instruments
# ooi.data_availability()

In [None]:
%%time

# 2019 CTD (9 months are available ) required ~4 mins. Other sensors closer to 20 minutes

# Assume the above cell narrowed the search results to a single instrument. Assume also that we 
#   are interested in calendar year 2019 (see begin_date and end_date below). We now use the ooi 
#   object to generate a data request.
#
# .request_data() generates a data request
# .to_xarray() polls the OOI system until the order completes; this will take a couple of minutes
#
begin_date = '2019-01-01'
end_date = '2020-01-01'
ooi.request_data(begin_date=begin_date, end_date=end_date)
ds = ooi.to_xarray()
len(ds)

In [None]:
%%time

# run this to download the data (possibly multiple files) from a completed data request from above
#   one year can take between 4 and 10 minutes
# 
filenamelist = ooi.download_netcdfs()
len(filenamelist)

In [None]:
ooi.raw()

In [None]:
type(ooi.raw())

## Two problems with this data

- The data order tends to yield multiple files that are contiguous in time. For example
the first might run June 1 to June 27 and the second might run June 28 to July 10. We
would like to treat them as a single Dataset. Fortunately this is built into the
`xarray` package as a function: `xr.open_mfdataset()`. Here the `mf` abbreviates *multiple files*. 


```
ds=xr.open_mfdataset(...filename description string including wildcard...)
```


- The data are ordered by a dimension called `obs` for *observation number*. This runs
`1, 2, 3, ...` for each data *file*. The coordinate `time` is available as a dependent
coordinate; but when combining multiple files into a single dataset we do not want
`obs = 1, 2, 3, ..., 7010, 7011, 1, 2, 3, ...` with repeated observation numbers. We simply
want everything indexed by a `time` dimension that increases monotonically as all the
data are combined. For this we use the `xarray` Dataset `.swap_dims()` method, which is
passed a small dictionary that specifies the swap. 


```
ds = ds.swap_dims({'obs':'time'})
```

These two commands are orchestrated together by means of a *preprocessor* function.
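
The effect of the swap can be seen without any files. This is a sketch using two small synthetic Datasets as stand-ins for two downloaded files. The variable name `chlor` and the helper names are made up for illustration. With real files, `xr.open_mfdataset()` accepts a `preprocess` function that is applied to each file before combining. Here `xr.concat()` plays the combining role in memory.

```python
import numpy as np
import xarray as xr

def make_file_like_dataset(start, n):
    """Synthetic stand-in for one file: dimension 'obs', dependent coordinate 'time'."""
    time = np.datetime64(start) + np.arange(n) * np.timedelta64(1, 'h')
    return xr.Dataset(
        data_vars={'chlor': ('obs', np.arange(n, dtype=float))},
        coords={'obs': np.arange(1, n + 1), 'time': ('obs', time)},
    )

def swap_obs_for_time(ds):
    """The per-file preprocessing step: make 'time' the dimension instead of 'obs'."""
    return ds.swap_dims({'obs': 'time'})

# Two "files" that are contiguous in time; obs restarts at 1 in each.
parts = [make_file_like_dataset('2019-06-01T00', 3),
         make_file_like_dataset('2019-06-01T03', 4)]

combined = xr.concat([swap_obs_for_time(p) for p in parts], dim='time')
```

After the swap, `combined` has a single `time` dimension of length 7 that increases monotonically. The observation numbers are carried along as an ordinary coordinate on `time`, and their restarting at 1 no longer matters.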



## Save streamlined datasets

The following cell opens "multi-file" datasets. For each of dimensions, coordinates, data variables 
and attributes it uses a short 'good stuff' list to preserve important information and drop 
everything else. It then writes these simplified Datasets as NetCDF files.  

In [None]:
%%time

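def load_and_save_streamlined(source, output, keep_dims, keep_coords, keep_data_vars, keep_attrs):
    """Open a multi-file dataset, keep only the listed items, write the result to NetCDF."""

    def lass_preprocessor(fds):                 # per-file datasets have dimension 'obs'
        return fds.swap_dims({'obs':'time'})    #   ...so we pre-swap that for time

    # combine='by_coords' orders the files by their 'time' coordinate. Recent xarray versions raise
    #   an error if concat_dim is also passed with this combine mode, so it is omitted here.
    ds = xr.open_mfdataset(data_dir + source, preprocess = lass_preprocessor, combine='by_coords')
    for key in list(ds.dims): 
        if key not in keep_dims: ds = ds.drop_dims(key)
    for key in list(ds.coords): 
        if key not in keep_coords: ds = ds.drop_vars(key)       # drop_vars() replaces the deprecated drop()
    for key in list(ds.data_vars): 
        if key not in keep_data_vars: ds = ds.drop_vars(key)
    attrs_dict = ds.attrs.copy()                # iterate over a copy while popping from the original
    for key in attrs_dict: 
        if key not in keep_attrs: ds.attrs.pop(key)
    ds.to_netcdf(data_dir + output)
    return ds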

strRoot = 'rca/2019/depl*'
strSite = 'SBPS*'
strPlatform = 'SF*'

# particular to phsen the pH sensor
ds_phsen = load_and_save_streamlined(strRoot + strSite + strPlatform + 'phsen*.nc', 
                                    'rca/simpler/osb_sp_phsen_2019.nc', 
                                    ['time'],
                                    ['time', 'int_ctd_pressure'],
                                    ['ph_seawater'],
                                    ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])

ds_nutnr_a_dark = load_and_save_streamlined(strRoot + strSite + strPlatform + 'nutnr_a_dark*.nc', 
                                    'rca/simpler/osb_sp_nutnr_a_dark_2019.nc', 
                                    ['time', 'wavelength'],
                                    ['time', 'int_ctd_pressure'],
                                    ['nitrate_concentration'],
                                    ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])

ds_nutnr_a_sample = load_and_save_streamlined(strRoot + strSite + strPlatform + 'nutnr_a_sample*.nc', 
                                    'rca/simpler/osb_sp_nutnr_a_sample_2019.nc', 
                                    ['time', 'wavelength'],
                                    ['time', 'int_ctd_pressure'],
                                    ['nitrate_concentration'],
                                    ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])


# ds_ctdpf = load_and_save_streamlined('rca/2019/depl*ctdpf*.nc', 'rca/simpler/osb_sp_ctdpf_2019.nc', 
#                                     ['time'],
#                                     ['time', 'seawater_pressure'],
#                                     ['seawater_temperature', 'practical_salinity', 
#                                      'corrected_dissolved_oxygen', 'density'],
#                                     ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])

# ds_parad = load_and_save_streamlined('rca/2019/depl*parad*.nc', 'rca/simpler/osb_sp_parad_2019.nc', 
#                                     ['time'],
#                                     ['time', 'int_ctd_pressure'],
#                                     ['par_counts_output'],
#                                     ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])

# ds_spkir = load_and_save_streamlined('rca/2019/depl*spkir*.nc', 'rca/simpler/osb_sp_spkir_2019.nc', 
#                                     ['time', 'spectra'],
#                                     ['time', 'int_ctd_pressure'],
#                                     ['spkir_downwelling_vector'],
#                                     ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])

# ds_optaa = load_and_save_streamlined('rca/2019/depl*optaa*.nc', 'rca/simpler/osb_sp_optaa_2019.nc', 
#                                     ['time', 'wavelength'],
#                                     ['time', 'wavelength', 'int_ctd_pressure'],
#                                     ['beam_attenuation', 'optical_absorption'],
#                                     ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])

# ds_flort = load_and_save_streamlined('rca/2019/depl*flort*.nc', 'rca/simpler/osb_sp_flort_2019.nc', 
#                                      ['time'],
#                                      ['time', 'int_ctd_pressure'],
#                                      ['fluorometric_chlorophyll_a', 'fluorometric_cdom', 
#                                       'total_volume_scattering_coefficient', 'seawater_scattering_coefficient', 
#                                       'optical_backscatter'],
#                                      ['node', 'id', 'geospatial_lat_min', 'geospatial_lon_min'])