# June 2018 OOI Biology Workshop Data Validation Report


This notebook examines the shelf and off-shelf profilers on the Endurance Array off Newport, Oregon, USA. The goal is to compare the profiler data with independent measurements collected by the Peterson group, based at HMSC in Newport, Oregon. I will compare temperature, salinity, and dissolved oxygen measurements. The Peterson group conducts bi-weekly cruises at 7 stations along the Newport Line, from 1 to 25 miles offshore. The shelf station for the Newport Line sampling (NH-5) (44.6517, -124.1770) and the OOI mooring (44.641, -124.3022) are separated by 5 nm. The off-shelf site operated by OOI has a shallow profiler mooring whose data we can compare with individual CTD casts. Again, the Peterson group station (NH-25) is inshore of the OOI sampling station.  ![alt text](http://oceanobservatories.org/wp-content/uploads/2018/03/EA_WAOR_2018_labels.png)

## Using asynchronous data request Oregon Shelf Surface Profiler Mooring and NH-05 CTD Cast

Now that we have our profiler of interest, let's take a look at the instruments on the OOI Data Team Portal to get an idea of what we are working with. We want data from reference designator CE02SHSP-SP001-01-DOSTAJ000. Important lesson: auxiliary sensors, such as DO sensors and fluorometers, carry CTD-interpolated pressure, temperature, and salinity. So to compare these physical parameters to independent CTD casts, I only need to request data from the dissolved oxygen sensor. Looking at the data portal, data availability is sparse, so I decided to focus on September 2016, specifically Sept 6th. Now it's time to request the available data.
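
As a small aside, the reference designator can be split into the site, node, and instrument pieces that the API URLs expect. This is only a sketch: `split_refdes` is a hypothetical helper name, not part of any OOI library, and it assumes the designator always has the site-node-instrument layout shown above.

```python
def split_refdes(refdes):
    """Split an OOI reference designator into (site, node, instrument)."""
    # The instrument part contains its own hyphen, so split only twice.
    site, node, instrument = refdes.split('-', 2)
    return site, node, instrument

site, node, instrument = split_refdes('CE02SHSP-SP001-01-DOSTAJ000')
print(site, node, instrument)
```

The three values match the ones typed by hand in the next cell.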

In [None]:
# Setup Instrument Variables
site = 'CE02SHSP'
node = 'SP001'
instrument = '01-DOSTAJ000'
method = 'recovered_cspp'
stream = 'dosta_abcdjm_cspp_instrument_recovered'

In [None]:
# Setup the Python processing environment 
import requests
import datetime
import pandas as pd

In [None]:
# API Information
USERNAME ='OOIAPI-Y4VVWHNQL1983S'
TOKEN= 'Q9LA7YR8PRQSGK'
DATA_API = 'https://ooinet.oceanobservatories.org/api/m2m/12576/sensor/inv'
VOCAB_API = 'https://ooinet.oceanobservatories.org/api/m2m/12586/vocab/inv'
ASSET_API = 'https://ooinet.oceanobservatories.org/api/m2m/12587'

In [None]:
whos

In [None]:
# Specify some functions to convert timestamps
ntp_epoch = datetime.datetime(1900, 1, 1)
unix_epoch = datetime.datetime(1970, 1, 1)
ntp_delta = (unix_epoch - ntp_epoch).total_seconds()

def ntp_seconds_to_datetime(ntp_seconds):
    return datetime.datetime.utcfromtimestamp(ntp_seconds - ntp_delta).replace(microsecond=0)
  
def convert_time(ms):
  # Convert milliseconds since the Unix epoch to a UTC datetime; pass None through
  if ms is not None:
    return datetime.datetime.utcfromtimestamp(ms/1000)
  else:
    return None
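
A quick sanity check on the NTP conversion may be useful here. The sketch below assumes Python 3 and uses a hypothetical helper, `ntp_seconds_to_utc`, that returns a timezone-aware datetime instead of the naive one used in this notebook. It also confirms that the offset derived from the two epochs equals the well-known constant of 2208988800 seconds.

```python
import datetime

NTP_TO_UNIX_SECONDS = 2208988800  # seconds between 1900-01-01 and 1970-01-01

def ntp_seconds_to_utc(ntp_seconds):
    """Convert NTP seconds (since 1900-01-01) to a timezone-aware UTC datetime."""
    return datetime.datetime.fromtimestamp(
        ntp_seconds - NTP_TO_UNIX_SECONDS, tz=datetime.timezone.utc)

# The hard-coded offset should match the one derived from the two epochs.
derived = (datetime.datetime(1970, 1, 1)
           - datetime.datetime(1900, 1, 1)).total_seconds()
print(derived == NTP_TO_UNIX_SECONDS)
print(ntp_seconds_to_utc(NTP_TO_UNIX_SECONDS))
```

Timezone-aware datetimes make it harder to mix up UTC and local time by accident when comparing profiler times with cruise logs.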

In [None]:
# Setup the API request url
data_request_url ='/'.join((VOCAB_API,site,node,instrument))
print(data_request_url)

# Grab the information from the server
r = requests.get(data_request_url, auth=(USERNAME, TOKEN))
data = r.json()
data

Looks like we have the right information!

In [None]:
# Setup the API request url
data_request_url = ASSET_API + '/events/deployment/query'
params = {
  'beginDT':'2016-01-01T00:00:00.000Z',
  'endDT':'2016-12-25T00:00:00.000Z',
  'refdes':site+'-'+node+'-'+instrument,   
}

# Grab the information from the server
r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
data = r.json()

# Build one row per deployment (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame([{
      'deployment': d['deploymentNumber'],
      'start': convert_time(d['eventStartTime']),
      'stop': convert_time(d['eventStopTime']),
      'latitude': d['location']['latitude'],
      'longitude': d['location']['longitude'],
      'sensor': d['sensor']['uid'],
      'asset_id': d['sensor']['assetId'],
    } for d in data])
df

These are the deployments between Jan. 2016 and Dec. 2016.

Let's take a look at annotations, but for a longer time period.

In [None]:
ANNO_API = 'https://ooinet.oceanobservatories.org/api/m2m/12580/anno/find'
params = {
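  # Milliseconds since the Unix epoch, computed in UTC ('%s' is platform- and timezone-dependent)
  'beginDT':int((datetime.datetime(2016,1,1) - unix_epoch).total_seconds())*1000,
  'endDT':int((datetime.datetime(2017,12,25) - unix_epoch).total_seconds())*1000,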
  'refdes':site+'-'+node+'-'+instrument,
}

r = requests.get(ANNO_API, params=params, auth=(USERNAME, TOKEN))
data = r.json()

# Build one row per annotation (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame([{
    'annotation': d['annotation'],
    'start': convert_time(d['beginDT']),
    'stop': convert_time(d['endDT']),
    'site': d['subsite'],
    'node': d['node'],
    'sensor': d['sensor'],
    'id': d['id']
  } for d in data])
pd.set_option('display.max_colwidth', None) # Show the full annotation text
df

Time to import more packages to plot and format data.

In [None]:
import matplotlib.pyplot as plt

!pip install netCDF4
import netCDF4 as nc

!pip install xarray
import xarray as xr

!pip install cmocean
import cmocean

In an effort to try different ways of requesting data, I used the Data Portal to request data for the DO sensor from August 26, 2016 to October 2, 2016. I copied the URL from my email.

In [None]:
data_url = 'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/zemans-oregonstate-edu/20180621T190500-CE02SHSP-SP001-01-DOSTAJ000-recovered_cspp-dosta_abcdjm_cspp_instrument_recovered/deployment0003_CE02SHSP-SP001-01-DOSTAJ000-recovered_cspp-dosta_abcdjm_cspp_instrument_recovered_20160823T182628.916000-20161002T143541.103000.nc'
ds = xr.open_dataset(data_url)

# Swap the dimensions
ds = ds.swap_dims({'obs': 'time'})
ds

We loaded in the data!

In [None]:
#Simple plot of pressure over Sept 2016
ds['int_ctd_pressure'].plot()

Here is a quick plot of pressure. Those are some large dbar readings considering this profiler is in 80 meters of water! Let's display the pressure variable to confirm that its units are dbar.

In [None]:
ds['int_ctd_pressure']


Units are in dbar. Large spikes are outliers? 
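
One simple way to deal with spikes like these is a gross range check that blanks out physically implausible values. The sketch below is an illustration only: `mask_out_of_range` is a hypothetical helper, the sample values are made up, and the 0 to 80 dbar limits are an assumption based on the roughly 80 m water depth mentioned above.

```python
import math

def mask_out_of_range(values, lower, upper):
    """Return a copy of values with anything outside [lower, upper] set to NaN."""
    return [v if lower <= v <= upper else math.nan for v in values]

# Made-up pressures (dbar): two realistic readings and two spikes.
example_pressure = [2.5, 41.0, 950.0, -3.0]
print(mask_out_of_range(example_pressure, 0.0, 80.0))
```

On the xarray dataset itself, something like `ds.where((ds['int_ctd_pressure'] >= 0) & (ds['int_ctd_pressure'] <= 80))` should have a similar effect, masking every variable wherever the pressure is out of range.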

In [None]:
# Extract the values
dtime = ds['time'].values
pressure = ds['int_ctd_pressure'].values
temperature = ds['temperature'].values
salinity = ds['salinity'].values
oxygen=ds['dissolved_oxygen'].values


In [None]:
# Scatterplot of Temperature
fig,ax1 = plt.subplots(1,1,figsize=(16,4))
sc1 = ax1.scatter(dtime, pressure,c=temperature, cmap='RdYlBu_r') # Add s=2 to set the markersize
ax1.set_xlim(dtime[0],dtime[-1]) # Set the time limits to match the dataset
cbar = fig.colorbar(sc1, ax=ax1, orientation='vertical')
cbar.ax.set_ylabel(r'Temperature ($^\circ$C)') # Raw string keeps the LaTeX backslash intact
ax1.set_ylabel('Pressure (dbar)')
ax1.set_title('Endurance Shelf Profiler');
ax1.set_ylim([0, 80]) # Clip the pressure outliers
ax1.invert_yaxis() # Invert y axis once, after setting the limits


Trying out a scatterplot of temperature from the end of Aug. through Sept. 2016. There are interesting 'hot spots' at the surface with temperatures ~15 degrees. This was during the Warm Blob, so the temperatures make sense. I also compared them briefly to NDBC Buoy 46050, located off Newport, OR.

In [None]:
plt.hist(salinity)

Seeing large outliers in salinity.

In [None]:
plt.hist(oxygen)

Also see some negative dissolved oxygen measurements. 

In [None]:
print("Max value element salinity : {}".format(max(salinity)))
print("Max value element oxygen : {}".format(max(oxygen)))
print("Min value element oxygen : {}".format(min(oxygen)))
print("Min value element salinity : {}".format(min(salinity)))

In [None]:
!pip install cmocean
import cmocean

Profiles of Temperature, salinity, and oxygen from Sept. 2016.

In [None]:
#Temperature, Salinity & Oxygen
fig, (ax1,ax2,ax3) = plt.subplots(3,1, sharex=True, sharey=True, figsize=(16,12))
sc1 = ax1.scatter(dtime, pressure, c=temperature, cmap=cmocean.cm.thermal) 
sc2 = ax2.scatter(dtime, pressure, c=salinity,cmap=cmocean.cm.haline,vmin=30,vmax=35) 
sc3 = ax3.scatter(dtime, pressure, c=oxygen, cmap='Blues',vmin=0,vmax=300)

# Because the X and Y axes are shared, we only have to set limits once
ax1.set_ylim([0, 80])
ax1.invert_yaxis() # Invert y axis
ax1.set_xlim(dtime[0],dtime[-1]) # Set the time limits to match the dataset

# Add the colorbars
cbar = fig.colorbar(sc1, ax=ax1, orientation='vertical')
cbar.ax.set_ylabel('Temperature ($^\circ$C)')
cbar = fig.colorbar(sc2, ax=ax2,orientation='vertical')
cbar.ax.set_ylabel('Salinity')
cbar = fig.colorbar(sc3, ax=ax3, orientation='vertical')
cbar.ax.set_ylabel('Oxygen (%s)' % ds['dissolved_oxygen'].units) # Units of the variable actually plotted
cbar.update_ticks()
cbar.formatter.set_useOffset(False)

# Add labels & titles
ax1.set_ylabel('Pressure (dbar)')
ax2.set_ylabel('Pressure (dbar)')
ax3.set_ylabel('Pressure (dbar)')

fig.suptitle('Shelf Profiler Endurance Array')
fig.subplots_adjust(top=0.95);


In [None]:
import matplotlib.pyplot as plt
import numpy as np

Alright, let's take a look at a specific day.

In [None]:
a = ds.sel(time=slice('2016-09-06', '2016-09-07'))
a

In [None]:
print(ds.time.size)
print(a.time.size)

In [None]:
# A quickplot
a['int_ctd_pressure'].plot();

In [None]:
# Extract a full up or down cast
ds2 = ds.sel(time=slice('2016-09-06 10:35:00', '2016-09-06 23:45:00'))
ds2['int_ctd_pressure'].plot();


In [None]:
# Now let's create some vertical profile plots
fig, (ax1,ax2,ax3) = plt.subplots(1, 3, sharey=True)

ax1.plot(ds2.temperature, ds2.int_ctd_pressure, 'b.', markersize=3)
ax2.plot(ds2.salinity, ds2.int_ctd_pressure, 'b.', markersize=3)
ax3.plot(ds2.dissolved_oxygen,ds2.int_ctd_pressure,'.b',markersize=3)


ax1.set_xlabel('Temperature ($^\circ$C)')
ax2.set_xlabel('Salinity')
ax3.set_xlabel('Oxygen (%s)' % ds['dissolved_oxygen'].units)

ax1.set_ylabel('Pressure (dbar)')

ax1.invert_yaxis()
fig.suptitle('Endurance Shelf Profiler Sept 6 2016')
# ax1.set_ylim(475,0)



Profiles look okay. Again, seeing the high temperatures at the surface with some stratification in the upper water column. There is some wonky salinity data with peaks and dips. 


**Now, let's take a look at some Peterson Lab data. I queried our database to find an NH-05 trip that happened on Sept 6, 2016. Now, how do we upload our .csv file into Google Colab? There appears to be a function that can read a .csv file from your Google Drive, but it was not working. Instead, I used Sage's university website to host the file so the notebook could load it and we could begin playing around with the CTD data.**

In [None]:
import pandas as pd

p = pd.read_csv('https://marine.rutgers.edu/~sage/OOI_Data_Workshops/NHLineData.csv')

In [None]:
p.head()

The Peterson group uses ml/L when they process oxygen data. OOI dissolved oxygen units are umol kg-1. We need to convert units in order to compare the casts. 

O2 [micromole/kg] = O2 [micromole/L] / ρ

O2 [micromole/L] = 44.6596 × O2 [ml/L]

Here, ρ is the potential density of water [kg/L] at zero pressure and at the potential
temperature (e.g., 1.0269 kg/L; e.g., UNESCO, 1983). The value of 44.6596 is derived from
the molar volume of the oxygen gas, 22.3916 L/mole, at standard temperature and pressure
(0°C, 1 atmosphere; e.g., García and Gordon, 1992).
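
The two formulas above can be wrapped in one small function. This is a minimal sketch: the function name is hypothetical, and the default density of 1.0269 kg/L is just the example value quoted above, not a value computed from the actual cast (the next cell uses a rounded 1.02 instead).

```python
MOLAR_VOLUME_O2 = 22.3916  # L/mol at STP, as quoted above
UMOL_PER_ML = 1000.0 / MOLAR_VOLUME_O2  # about 44.66 umol of O2 per ml

def ml_per_l_to_umol_per_kg(o2_ml_per_l, density_kg_per_l=1.0269):
    """Convert dissolved oxygen from ml/L to umol/kg."""
    # ml/L -> umol/L, then divide by density (kg/L) to get umol/kg
    return o2_ml_per_l * UMOL_PER_ML / density_kg_per_l

print(round(ml_per_l_to_umol_per_kg(5.0), 1))
```

For a tighter comparison, the density could be computed from each sample's temperature and salinity rather than held constant.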

In [None]:

p['OIIOxygen'] = p['Oxygen'].astype(float)*44.66/1.02
p.head()

Pulling out all measurements from NH-05.

In [None]:
NH5=p[p.Station == 'NH05']
NH5

In [None]:
# Now let's create some profile plots from the Newport Line CTD at NH-05


peterpressure = NH5['Pressure'].values
petertemperature = NH5['Temperature'].values
petersalinity = NH5['Salinity'].values
peteroxygen=NH5['OIIOxygen'].values

fig, (ax1,ax2,ax3) = plt.subplots(1, 3, sharey=True)

ax1.plot(petertemperature, peterpressure, 'b.', markersize=10)
ax2.plot(petersalinity, peterpressure, 'b.', markersize=10)
ax3.plot(peteroxygen,peterpressure,'b.',markersize=10)


ax1.set_xlabel('Temperature ($^\circ$C)')
ax2.set_xlabel('Salinity')
ax3.set_xlabel('Oxygen')

ax1.set_ylabel('Pressure (dbar)')

fig.suptitle('NH-05 CTD Cast Sept 6, 2016')

ax1.invert_yaxis()

And now, let's put the OOI profiler cast with the NH-05 CTD cast

In [None]:
fig, (ax1, ax2,ax3) = plt.subplots(1, 3, sharey=True)

ax1.plot(petertemperature,peterpressure,'b')
ax1.plot(ds2.temperature,ds2.int_ctd_pressure,'r.')
ax1.set_xlabel('Temperature (C)')
ax1.set_ylabel('Pressure (dbar)')



ax2.plot(petersalinity,peterpressure,'b',label='Cruise CTD')
ax2.plot(ds2.salinity,ds2.int_ctd_pressure,'r.',label='Endurance Profiler')
ax2.set_xlabel('Salinity (psu)')

ax3.plot(peteroxygen,peterpressure,'.b')
ax3.plot(ds2.dissolved_oxygen,ds2.int_ctd_pressure,'r.')
ax3.set_xlabel('Oxygen (%s)' % ds['dissolved_oxygen'].units)


ax1.invert_yaxis()

fig.suptitle('Endurance Shelf Compared with Newport Line CTD')
fig.subplots_adjust(top=0.9)

legend = ax2.legend(loc='lower right', shadow=True, fontsize='small')




Conclusion? 

It looks like the cruise data (blue line) and the OOI profiler (red dots) track each other pretty well. There is some discrepancy at the surface. The dissolved oxygen profiles have an obvious offset from each other, with a maximum offset of ~100 umol kg-1. One important question: do we trust the oxygen sensor on the Peterson group CTD?
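
To put a number on an offset like this, one option is to interpolate the profiler values onto the CTD pressures and subtract. The sketch below is dependency-free and uses made-up numbers purely to show the call pattern. `interp_to` is a hypothetical helper that assumes the inputs contain no NaNs, and `numpy.interp` would do a similar job on sorted arrays.

```python
from bisect import bisect_left

def interp_to(pressure_grid, pressure, values):
    """Linearly interpolate values(pressure) onto pressure_grid.

    Grid points outside the sampled pressure range return None.
    """
    pairs = sorted(zip(pressure, values))
    xs = [p for p, _ in pairs]
    ys = [v for _, v in pairs]
    out = []
    for x in pressure_grid:
        if not xs or x < xs[0] or x > xs[-1]:
            out.append(None)  # no extrapolation
            continue
        i = bisect_left(xs, x)
        if xs[i] == x:
            out.append(ys[i])
        else:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            out.append(ys[i - 1] + frac * (ys[i] - ys[i - 1]))
    return out

# Made-up numbers, not real casts.
ctd_pressure = [10.0, 20.0, 90.0]
ctd_oxygen = [200.0, 100.0, 50.0]
profiler_pressure = [5.0, 15.0, 25.0]
profiler_oxygen = [300.0, 200.0, 100.0]

on_ctd = interp_to(ctd_pressure, profiler_pressure, profiler_oxygen)
offsets = [p - c for p, c in zip(on_ctd, ctd_oxygen) if p is not None]
print(offsets)
```

With the real data, `peterpressure` and `peteroxygen` would play the role of the CTD lists and the `ds2` variables the role of the profiler lists.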

## Using Synchronous data request API Oregon Shelf Surface Profiler Mooring and NH-05 CTD Cast

Just to play around with different ways to request data, this section looks at the same shelf profiler dissolved oxygen sensor (CE02SHSP-SP001-01-DOSTAJ000), but uses the JSON response, which is handy for getting a quick and dirty look at instruments.

In [None]:
# Setup the Python processing environment 
import requests
import datetime
import pandas as pd

In [None]:
# API Information
USERNAME ='OOIAPI-Y4VVWHNQL1983S'
TOKEN= 'Q9LA7YR8PRQSGK'
DATA_API = 'https://ooinet.oceanobservatories.org/api/m2m/12576/sensor/inv'
VOCAB_API = 'https://ooinet.oceanobservatories.org/api/m2m/12586/vocab/inv'
ASSET_API = 'https://ooinet.oceanobservatories.org/api/m2m/12587'

Instrument information for the Endurance Shelf Profiler. We request some JSON to see what each of the parameters is called.

In [None]:
# Instrument Information
site = 'CE02SHSP'
node = 'SP001'
instrument = '01-DOSTAJ000'
method = 'recovered_cspp'
stream = 'dosta_abcdjm_cspp_instrument_recovered'

data_request_url ='/'.join((DATA_API,site,node,instrument,method,stream))

params = {
  'beginDT':'2016-09-06T00:00:00.000Z',
  'endDT':'2016-09-07T00:00:00.000Z',
  'limit':50000 
}

In [None]:
# Grab the data
r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
data = r.json()

In [None]:
len(data)

In [None]:
# Instrument Information
site = 'CE02SHSP'
node = 'SP001'
instrument = '01-DOSTAJ000'
method = 'recovered_cspp'
stream = 'dosta_abcdjm_cspp_instrument_recovered'

data_request_url ='/'.join((DATA_API,site,node,instrument,method,stream))

params = {
  'beginDT':'2016-09-06T00:00:00.000Z',
  'endDT':'2016-09-07T00:00:00.000Z',
  'limit':20000 
}

In [None]:
# Grab the data
r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
data = r.json()

In [None]:
len(data)

Notice that I made the same request twice with different limit params: 50,000 for the first and 20,000 for the second. When I requested more data points, the returned dataset was smaller, based on len(data). This seems to be related to how often and when the instrument samples. It's good practice to look up the sampling frequency of an instrument.

In [None]:
data[0]

Let's call the selected parameters: pressure, salinity, temperature, and dissolved oxygen.

In [None]:
# Selected Instruments to Plot
instruments = [
  ['CE02SHSP','SP001','01-DOSTAJ000','recovered_cspp','dosta_abcdjm_cspp_instrument_recovered','int_ctd_pressure'],
  ['CE02SHSP','SP001','01-DOSTAJ000','recovered_cspp','dosta_abcdjm_cspp_instrument_recovered','ctdpf_j_cspp_instrument_recovered-salinity'],
  ['CE02SHSP','SP001','01-DOSTAJ000','recovered_cspp','dosta_abcdjm_cspp_instrument_recovered','ctdpf_j_cspp_instrument_recovered-temperature'],
    ['CE02SHSP','SP001','01-DOSTAJ000','recovered_cspp','dosta_abcdjm_cspp_instrument_recovered','dissolved_oxygen'],
]

Grabbing the data for all of Sept. 2016. 

In [None]:
# Specify additional parameters for the API request 
params = {
  'beginDT':'2016-09-01T00:00:00.000Z',
  'endDT':'2016-09-30T00:00:00.000Z',
  'limit':10000,   
}

In [None]:
# Grab the data for each instrument
out = []
for jj in range(len(instruments)):
  data_request_url ='/'.join((DATA_API,instruments[jj][0],instruments[jj][1],instruments[jj][2],instruments[jj][3],instruments[jj][4]))
  r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
  data = r.json()
  print(instruments[jj]) 
  print(len(data))
  time = []
  values = []
  for i in range(len(data)):
    time.append(ntp_seconds_to_datetime(data[i]['time']))
    values.append(data[i][instruments[jj][5]])
  out.append({'time':time,'value':values});

In [None]:
# Time Processing Routines 
ntp_epoch = datetime.datetime(1900, 1, 1)
unix_epoch = datetime.datetime(1970, 1, 1)
ntp_delta = (unix_epoch - ntp_epoch).total_seconds()

def ntp_seconds_to_datetime(ntp_seconds):
    return datetime.datetime.utcfromtimestamp(ntp_seconds - ntp_delta).replace(microsecond=0)


In [None]:
import matplotlib.pyplot as plt

!pip install netCDF4
import netCDF4 as nc

!pip install xarray
import xarray as xr

!pip install cmocean
import cmocean

In [None]:
# Plot the data
fig,axs = plt.subplots(len(out), sharex=True, sharey=False, figsize=(8,10))

for jj in range(len(out)):
  axs[jj].scatter(out[jj]['time'], out[jj]['value'], marker='.')
  #axs[jj].set(ylabel=instruments[jj][5])
  #axs[jj].set_title('-'.join(instruments[jj][0:3]))
  axs[jj].text(.98, .9, instruments[jj][-1], horizontalalignment='right', verticalalignment='top', transform=axs[jj].transAxes) # Label each panel with its parameter name
  
plt.xlim(datetime.date(2016,9,1),datetime.date(2016,9,25))
plt.xticks(rotation=30)

axs[0].set_ylim(0,100)
axs[1].set_ylim(0,40)
axs[2].set_ylim(0,30)
axs[3].set_ylim(0,500)

axs[0].set_ylabel('Pressure')
axs[1].set_ylabel('Salinity')
axs[2].set_ylabel('Temperature')
axs[3].set_ylabel('DO')

axs[0].set_title('Endurance Shelf Profiler')




The data points for an entire month look sparse. Notice also that the len(data) is low compared to the next set of plots which are looking at a single day in Sept. 2016. 



In [None]:
import matplotlib.pyplot as plt
import numpy as np

Renaming our parameters

In [None]:
time=out[0]['time']
pressure=out[0]['value']
temperature=out[2]['value']
salinity=out[1]['value']
oxygen=out[3]['value']



In [None]:
# Scatterplot of Temperature
fig,ax1 = plt.subplots(1,1,figsize=(16,4))
sc1 = ax1.scatter(time, pressure, c=temperature, cmap='RdYlBu_r') # Add s=2 to set the markersize
ax1.invert_yaxis() # Invert y axis
#ax1.set_xlim(time[0],time[-1]) # Set the time limits to match the dataset
cbar = fig.colorbar(sc1, ax=ax1, orientation='vertical')
cbar.ax.set_ylabel('Temperature ($^\circ$C)')
ax1.set_ylabel('Pressure (dbar)')
ax1.set_title('Shelf Profiler');





In [None]:
#Temperature, Salinity, Oxygen 
fig, (ax1,ax2,ax3) = plt.subplots(3,1, sharex=True, sharey=True,figsize=(16,12)) #sharey=True
sc1 = ax1.scatter(time, pressure, c=temperature, cmap=cmocean.cm.thermal) 
sc2 = ax2.scatter(time, pressure, c=salinity, cmap=cmocean.cm.haline) 
sc3 = ax3.scatter(time, pressure, c=oxygen, cmap=cmocean.cm.oxy)
# Because the X and Y axes are shared, we only have to set limits once
ax1.invert_yaxis() # Invert y axis
#ax1.set_xlim(time[0],time[-1]) # Set the time limits to match the dataset

# Add the colorbars
cbar = fig.colorbar(sc1, ax=ax1, orientation='vertical')
cbar.ax.set_ylabel('Temperature ($^\circ$C)')
cbar = fig.colorbar(sc2, ax=ax2, orientation='vertical')
cbar.ax.set_ylabel('Salinity')
cbar = fig.colorbar(sc3, ax=ax3, orientation='vertical')
cbar.ax.set_ylabel('Oxygen')
cbar.update_ticks()
cbar.formatter.set_useOffset(False)


# Add labels & titles
ax1.set_ylabel('Pressure (dbar)')
ax2.set_ylabel('Pressure (dbar)')


fig.suptitle('Endurance Shelf Profiler')
fig.subplots_adjust(top=0.95);



Let's create vertical profiles for all of September 2016.

In [None]:
fig,(ax1,ax2,ax3)=plt.subplots(1,3,sharey=True)

ax1.plot(temperature,pressure,'r.')
ax1.invert_yaxis()
ax2.plot(salinity,pressure,'r.')
ax3.plot(oxygen,pressure,'r.')

ax1.set_xlabel('Temperature ($^\circ$C)')
ax2.set_xlabel('Salinity')
ax3.set_xlabel('Oxygen (%s)' % ds['dissolved_oxygen'].units)


With help from the Data Team, we slice by datetime to find a single vertical profile. JSON requests are not the best way to find a single up or down cast.

In [None]:

starttime=datetime.datetime(2016, 9, 6, 0, 15, 41)
endtime=datetime.datetime(2016, 9, 6, 12, 15, 41)

# Keep only the timestamps that fall inside the window
x=[t for t in out[0]['time'] if starttime<=t<endtime]
print(x[0],x[-1])
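
Another way to isolate single casts, rather than hand-picking time windows, is to split the pressure series wherever the direction of travel reverses. The function below is a rough sketch with a hypothetical name and made-up pressures. It assumes the samples are in time order, and real profiler data would probably need smoothing or a minimum step size so that small pressure jitter does not fragment a cast.

```python
def split_casts(pressure, min_points=3):
    """Return (start, stop) index pairs for runs where pressure moves one way.

    A run ends when the sign of the pressure change flips. Runs with fewer
    than min_points samples are dropped as noise. Stop indices are exclusive.
    """
    casts = []
    start = 0
    direction = 0  # +1 descending (pressure rising), -1 ascending, 0 unknown
    for i in range(1, len(pressure)):
        step = pressure[i] - pressure[i - 1]
        sign = (step > 0) - (step < 0)
        if sign == 0:
            continue  # flat segment, keep the current direction
        if direction == 0:
            direction = sign
        elif sign != direction:
            if i - start >= min_points:
                casts.append((start, i))
            start = i - 1  # the turning point belongs to both casts
            direction = sign
    if len(pressure) - start >= min_points:
        casts.append((start, len(pressure)))
    return casts

# Made-up pressures (dbar): one ascent followed by one descent.
example = [70, 50, 30, 10, 2, 20, 40, 60]
print(split_casts(example))
```

Each index pair can then be used to slice the time, temperature, salinity, and oxygen lists so that a single up or down cast is plotted on its own.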

Instead, let's just take one day from September, the day that corresponds to the Newport Line sampling at NH-05: Sept. 6th.

In [None]:
# Specify additional parameters for the API request 
params = {
  'beginDT':'2016-09-06T00:10:00.000Z',
  'endDT':'2016-09-07T00:00:00.000Z',
  'limit':10000,   
}

In [None]:
# Grab the data for each instrument
out = []
for jj in range(len(instruments)):
  data_request_url ='/'.join((DATA_API,instruments[jj][0],instruments[jj][1],instruments[jj][2],instruments[jj][3],instruments[jj][4]))
  r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
  data = r.json()
  print(instruments[jj]) 
  print(len(data))
  time = []
  values = []
  for i in range(len(data)):
    time.append(ntp_seconds_to_datetime(data[i]['time']))
    values.append(data[i][instruments[jj][5]])
  out.append({'time':time,'value':values});

In [None]:
# Time Processing Routines 
ntp_epoch = datetime.datetime(1900, 1, 1)
unix_epoch = datetime.datetime(1970, 1, 1)
ntp_delta = (unix_epoch - ntp_epoch).total_seconds()

def ntp_seconds_to_datetime(ntp_seconds):
    return datetime.datetime.utcfromtimestamp(ntp_seconds - ntp_delta).replace(microsecond=0)


In [None]:
time=out[0]['time']
pressure=out[0]['value']
temperature=out[2]['value']
salinity=out[1]['value']
oxygen=out[3]['value']

In [None]:
#Temperature, Salinity, Oxygen 
fig, (ax1,ax2,ax3) = plt.subplots(3,1, sharex=True, sharey=True,figsize=(16,12)) #sharey=True
sc1 = ax1.scatter(time, pressure, c=temperature, cmap=cmocean.cm.thermal) 
sc2 = ax2.scatter(time, pressure, c=salinity, cmap=cmocean.cm.haline) 
sc3 = ax3.scatter(time, pressure, c=oxygen, cmap=cmocean.cm.oxy)
# Because the X and Y axes are shared, we only have to set limits once
ax1.invert_yaxis() # Invert y axis
#ax1.set_xlim(time[0],time[-1]) # Set the time limits to match the dataset

# Add the colorbars
cbar = fig.colorbar(sc1, ax=ax1, orientation='vertical')
cbar.ax.set_ylabel('Temperature ($^\circ$C)')
cbar = fig.colorbar(sc2, ax=ax2, orientation='vertical')
cbar.ax.set_ylabel('Salinity')
cbar = fig.colorbar(sc3, ax=ax3, orientation='vertical')
cbar.ax.set_ylabel('Oxygen')
cbar.update_ticks()
cbar.formatter.set_useOffset(False)


# Add labels & titles
ax1.set_ylabel('Pressure (dbar)')
ax2.set_ylabel('Pressure (dbar)')


fig.suptitle('Endurance Shelf Profiler')
fig.subplots_adjust(top=0.95);


In [None]:
fig,(ax1,ax2,ax3)=plt.subplots(1,3,sharey=True)

ax1.plot(temperature,pressure,'r.')
ax1.invert_yaxis()
ax2.plot(salinity,pressure,'r.')
ax3.plot(oxygen,pressure,'r.')

ax1.set_xlabel('Temperature ($^\circ$C)')
ax2.set_xlabel('Salinity')
ax3.set_xlabel('Oxygen (%s)' % ds['dissolved_oxygen'].units)

Something weird with salinity at depth.

Let's bring in the NH-05 data

In [None]:
import pandas as pd

p = pd.read_csv('https://marine.rutgers.edu/~sage/OOI_Data_Workshops/NHLineData.csv')

In [None]:

p['OIIOxygen'] = p['Oxygen'].astype(float)*44.66/1.02
p.head()

In [None]:
NH5=p[p.Station == 'NH05']
NH5

In [None]:
# Now let's create some profile plots from the Newport Line CTD at NH-05

peterpressure = NH5['Pressure'].values
petertemperature = NH5['Temperature'].values
petersalinity = NH5['Salinity'].values
peteroxygen=NH5['OIIOxygen'].values

fig, (ax1,ax2,ax3) = plt.subplots(1, 3, sharey=True)

ax1.plot(petertemperature, peterpressure, 'b.', markersize=10)
ax2.plot(petersalinity, peterpressure, 'b.', markersize=10)
ax3.plot(peteroxygen,peterpressure,'b.',markersize=10)


ax1.set_xlabel('Temperature ($^\circ$C)')
ax2.set_xlabel('Salinity')
ax3.set_xlabel('Oxygen')

ax1.set_ylabel('Pressure (dbar)')

ax1.invert_yaxis()

In [None]:
fig, (ax1, ax2,ax3) = plt.subplots(1, 3, sharey=True)

ax1.plot(petertemperature,peterpressure,'b')
ax1.plot(temperature,pressure,'r.')
ax1.set_xlabel('Temperature (C)')
ax1.set_ylabel('Pressure (dbar)')



ax2.plot(petersalinity,peterpressure,'b',label='Cruise CTD')
ax2.plot(salinity,pressure,'r.',label='Endurance Profiler')
ax2.set_xlabel('Salinity (psu)')

ax3.plot(peteroxygen,peterpressure,'.b')
ax3.plot(oxygen,pressure,'.r')
ax3.set_xlabel('Oxygen')

ax1.invert_yaxis()

fig.suptitle('Endurance Shelf Compared with Newport Line CTD')
fig.subplots_adjust(top=0.9)

legend = ax2.legend(loc='lower right', shadow=True, fontsize='small')


Again, we see the high overlap between the temperature and salinity, but the offset in dissolved oxygen.

## Looking at Oregon Offshelf Shallow Profiler Mooring and NH-25 CTD Cast

We will use the OOI Synchronous Data Request API to pull temperature, salinity, and dissolved oxygen data from the Shallow Profiler (CE04OSPS-SF01B-2A-CTDPFA107). We will focus on the data collected on Sept. 20, 2016. That date was picked because NH-25 was sampled on only a few occasions in 2016.

In [None]:
# Setup the Python processing environment 
import requests
import datetime
import pandas as pd

In [None]:
import matplotlib.pyplot as plt
import numpy as np

In [None]:
# API Information
USERNAME ='OOIAPI-Y4VVWHNQL1983S'
TOKEN= 'Q9LA7YR8PRQSGK'
DATA_API = 'https://ooinet.oceanobservatories.org/api/m2m/12576/sensor/inv'
VOCAB_API = 'https://ooinet.oceanobservatories.org/api/m2m/12586/vocab/inv'
ASSET_API = 'https://ooinet.oceanobservatories.org/api/m2m/12587'

In [None]:
# Time Processing Routines 
ntp_epoch = datetime.datetime(1900, 1, 1)
unix_epoch = datetime.datetime(1970, 1, 1)
ntp_delta = (unix_epoch - ntp_epoch).total_seconds()

def ntp_seconds_to_datetime(ntp_seconds):
    return datetime.datetime.utcfromtimestamp(ntp_seconds - ntp_delta).replace(microsecond=0)

In [None]:
# Instrument Information
site = 'CE04OSPS'
node = 'SF01B'
instrument = '2A-CTDPFA107'
method = 'streamed'
stream = 'ctdpf_sbe43_sample'

data_request_url ='/'.join((DATA_API,site,node,instrument,method,stream))

In [None]:
# Specify additional parameters for the API request 
params = {
  'beginDT':'2016-09-06T00:10:00.000Z',
  'endDT':'2016-09-07T00:00:00.000Z',
  'limit':10000,   
}

In [None]:
# Grab the data
r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
data = r.json()
data

In [None]:
instruments=[
    ['CE04OSPS','SF01B','2A-CTDPFA107','streamed','ctdpf_sbe43_sample','seawater_pressure'],
    ['CE04OSPS','SF01B','2A-CTDPFA107','streamed','ctdpf_sbe43_sample','seawater_temperature'],
    ['CE04OSPS','SF01B','2A-CTDPFA107','streamed','ctdpf_sbe43_sample','practical_salinity'],
    ['CE04OSPS','SF01B','2A-CTDPFA107','streamed','ctdpf_sbe43_sample','corrected_dissolved_oxygen'],
]

In [None]:
# Specify additional parameters for the API request 
params = {
  'beginDT':'2016-09-20T00:00:00.000Z',
  'endDT':'2016-09-21T00:00:00.000Z',
  'limit':10000,   
}

In [None]:
# Grab the data for each instrument
out = []
for jj in range(len(instruments)):
  data_request_url ='/'.join((DATA_API,instruments[jj][0],instruments[jj][1],instruments[jj][2],instruments[jj][3],instruments[jj][4]))
  r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
  data = r.json()
  print(instruments[jj]) 
  print(len(data))
  time = []
  values = []
  for i in range(len(data)):
    time.append(ntp_seconds_to_datetime(data[i]['time']))
    values.append(data[i][instruments[jj][5]])
  out.append({'time':time,'value':values});

In [None]:
time=out[0]['time']
pressure=out[0]['value']
temperature=out[1]['value']
salinity=out[2]['value']
oxygen=out[3]['value']


In [None]:
whos

In [None]:
!pip install cmocean
import cmocean

Looking at profile of temperature, salinity, and oxygen for Sept. 20, 2016.

In [None]:
#Temperature, Salinity, Oxygen
fig, (ax1,ax2,ax3) = plt.subplots(3,1, sharex=True, sharey=True,figsize=(16,12)) #sharey=True
sc1 = ax1.scatter(time, pressure, c=temperature, cmap=cmocean.cm.thermal) 
sc2 = ax2.scatter(time, pressure, c=salinity, cmap=cmocean.cm.haline) 
sc3 = ax3.scatter(time, pressure, c=oxygen, cmap=cmocean.cm.oxy)
# Because the X and Y axes are shared, we only have to set limits once
ax1.invert_yaxis() # Invert y axis
ax1.set_xlim(time[0],time[-1]) # Set the time limits to match the dataset

# Add the colorbars
cbar = fig.colorbar(sc1, ax=ax1, orientation='vertical')
cbar.ax.set_ylabel('Temperature ($^\circ$C)')
cbar = fig.colorbar(sc2, ax=ax2, orientation='vertical')
cbar.ax.set_ylabel('Salinity')
cbar = fig.colorbar(sc3, ax=ax3, orientation='vertical')
cbar.ax.set_ylabel('Oxygen')
cbar.update_ticks()
cbar.formatter.set_useOffset(False)


# Add labels & titles
ax1.set_ylabel('Pressure (dbar)')
ax2.set_ylabel('Pressure (dbar)')


fig.suptitle('Endurance Off-Shelf Profiler')
fig.subplots_adjust(top=0.95);

Dissolved oxygen looks a little high (400 umol kg-1) for an off-shelf station in September. Let's plot some vertical profiles of the same parameters.

In [None]:
fig,(ax1,ax2,ax3)=plt.subplots(1,3,sharey=True)

ax1.plot(temperature,pressure,'r.')
ax1.invert_yaxis()
ax1.set_xlabel('Temperature (C)')
ax1.set_ylabel('Pressure (dbar)')
ax2.plot(salinity,pressure,'r.')
ax2.set_xlabel('Salinity')
ax3.plot(oxygen,pressure,'r.')
ax3.set_xlabel('Dissolved Oxygen')

Now, let's bring in the data from NH-25 on Sept. 20, 2016.

NH-25 Newport Line CTD Data.

In [None]:
import pandas as pd

p = pd.read_csv('https://marine.rutgers.edu/~sage/OOI_Data_Workshops/NHLineData.csv')

In [None]:
p['OIIOxygen'] = p['Oxygen'].astype(float)*44.66/1.02 # ml/L -> umol/kg; divide by density as in the NH-05 conversion
p.head()

In [None]:
NH25=p[p.Station == 'NH25']
NH25


In [None]:
# Now let's create some profile plots from the Newport Line CTD at NH-25

peterpressure = NH25['Pressure'].values
petertemperature = NH25['Temperature'].values
petersalinity = NH25['Salinity'].values
peteroxygen=NH25['OIIOxygen'].values

fig, (ax1,ax2,ax3) = plt.subplots(1, 3, sharey=True)

ax1.plot(petertemperature, peterpressure, 'b.', markersize=10)
ax2.plot(petersalinity, peterpressure, 'b.', markersize=10)
ax3.plot(peteroxygen,peterpressure,'b.',markersize=10)


ax1.set_xlabel('Temperature ($^\circ$C)')
ax2.set_xlabel('Salinity')
ax3.set_xlabel('Oxygen')

ax1.set_ylabel('Pressure (dbar)')

ax1.invert_yaxis()

fig.suptitle('NH-25')

In [None]:
fig, (ax1, ax2,ax3) = plt.subplots(1, 3, sharey=True)

ax1.plot(petertemperature,peterpressure,'b')
ax1.plot(temperature,pressure,'r.')
ax1.set_xlabel('Temperature (C)')
ax1.set_ylabel('Pressure (dbar)')



ax2.plot(petersalinity,peterpressure,'b',label='Cruise CTD')
ax2.plot(salinity,pressure,'r.',label='Endurance Profiler')
ax2.set_xlabel('Salinity (psu)')

ax3.plot(peteroxygen,peterpressure,'.b')
ax3.plot(oxygen,pressure,'.r')
ax3.set_xlabel('Oxygen')

ax1.invert_yaxis()

fig.suptitle('Endurance Off-Shelf Shallow Profiler Compared with NH-25 CTD')
fig.subplots_adjust(top=0.9)

legend = ax2.legend(loc='lower right', shadow=True, fontsize='small')

Again, we see some good overlap in temperature and salinity. Noticeably, the profiler stops at 200 m. The oxygen is more troubling: the profiler is consistently higher than the NH-25 CTD.

Luckily, we have a surface mooring near the off-shelf profiler. Let's take a look at the dissolved oxygen at the surface mooring for September 2016.

## Offshelf Surface Mooring and Dissolved Oxygen

Let's make a quick synchronous request (quick and dirty!) and look at the dissolved oxygen values at the surface mooring (deployed at 7 m) near the off-shore profiler on the Endurance Array. Are the values similar?

In [None]:
# Setup the Python processing environment 
import requests
import datetime
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# API Information
USERNAME ='OOIAPI-Y4VVWHNQL1983S'
TOKEN= 'Q9LA7YR8PRQSGK'
DATA_API = 'https://ooinet.oceanobservatories.org/api/m2m/12576/sensor/inv'
VOCAB_API = 'https://ooinet.oceanobservatories.org/api/m2m/12586/vocab/inv'
ASSET_API = 'https://ooinet.oceanobservatories.org/api/m2m/12587'

In [None]:
# First, we need to add some more Python libraries
import requests
import datetime
import time

In [None]:
# Instrument Information
site = 'CE04OSSM'
node = 'RID27'
instrument = '04-DOSTAD000'
method = 'recovered_host'
stream = 'dosta_abcdjm_dcl_instrument_recovered'

data_request_url ='/'.join((DATA_API,site,node,instrument,method,stream))



params = {
  'beginDT':'2016-09-01T00:00:00.000Z',
  'endDT':'2016-09-30T00:00:00.000Z',
  'limit':1000,   
}

In [None]:
# Grab the data
r = requests.get(data_request_url, params=params, auth=(USERNAME, TOKEN))
data = r.json()

In [None]:
data[0]

In [None]:
# Time Processing Routine
ntp_epoch = datetime.datetime(1900, 1, 1)
unix_epoch = datetime.datetime(1970, 1, 1)
ntp_delta = (unix_epoch - ntp_epoch).total_seconds()

def ntp_seconds_to_datetime(ntp_seconds):
    return datetime.datetime.utcfromtimestamp(ntp_seconds - ntp_delta).replace(microsecond=0)

In [None]:
# Process the data
time = []
oxygen= []
pressure=[]
for i in range(len(data)):
  time.append(ntp_seconds_to_datetime(data[i]['time']))
  oxygen.append(data[i]['dissolved_oxygen'])
  


In [None]:
import numpy as np 

In [None]:
plt.plot(time, oxygen, 'r.', label='Oxygen') # plot_date is deprecated in recent matplotlib; plot handles datetimes
plt.xlabel('Date')
plt.ylabel('Oxygen')


In [None]:
np.mean(oxygen)

Looking at a quick plot of the surface mooring time series, we see a peak in dissolved oxygen on Sept 16, 2016, and a mean of 239 umol kg-1. These numbers are much lower than the surface measurements from the off-shore shallow profiler. Is something amiss with the profiler oxygen readings?
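
For a like-for-like comparison with the 7 m mooring sensor, the profiler oxygen could be averaged over a narrow pressure band near the same depth. The sketch below uses a hypothetical helper and made-up values, and the 5 to 9 dbar band is an assumption chosen only to bracket 7 m. Note that `oxygen` and `pressure` were overwritten by the mooring cells above, so the profiler lists would need to be saved under different names first.

```python
import math

def mean_in_pressure_band(values, pressure, p_min, p_max):
    """Mean of values whose pressure lies in [p_min, p_max], skipping missing data."""
    picked = [v for v, p in zip(values, pressure)
              if v is not None and not math.isnan(v) and p_min <= p <= p_max]
    return sum(picked) / len(picked) if picked else math.nan

# Made-up profiler samples: oxygen (umol/kg) and pressure (dbar).
example_oxygen = [400.0, 380.0, 250.0, math.nan]
example_pressure = [5.0, 8.0, 150.0, 7.0]
print(mean_in_pressure_band(example_oxygen, example_pressure, 5.0, 9.0))
```

The result can be set next to `np.mean(oxygen)` from the mooring to see how large the near-surface difference really is.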

## Conclusions

Some general conclusions and thoughts on future data quality work:

1. Certain biological parameters should be examined in detail and values should be quality controlled based on region and season. 
2. Co-located data needs to have its own quality control. In our case, Winkler titrations could be used for DO values to compare with our own CTD.
3. When requesting data, make sure to examine annotations and any data gaps. Think about which way you want to access the data. 
4. Examine other biological parameters at these instruments and compare to cruise data.
5. The Endurance Array will complement work done by the Peterson group, especially considering we can only collect physical data when the weather permits. The instruments are out there when we can't be!

In [None]:
print ('k bye')